Why Harvard and the Smithsonian teamed up to crowdsource a century of astronomical history


Several afternoons a week, a Pennsylvania retiree named Albert Lamperti boots up his computer and spends about three hours scrutinizing decades-old documents and inputting their contents into an Excel sheet. Lamperti took an early retirement in 2009 from Temple University’s medical school, where he taught anatomy to first-year students, “so I could have fun with all this stuff.” That “stuff” is amateur astronomy, a field that has fascinated him since 1984, when he and his oldest son purchased a telescope and spent the summer observing things like the rings of Saturn and the moons of Jupiter. As a kid, Lamperti worked a paper route and used his early savings to spend $29.95 on an Edmund Telescope, a purchase that went mostly unused until he finally sold it as a teenager. This second time around, however, he was hooked, and it wasn’t long before he joined a local astronomy club his wife saw advertised in the newspaper. Over the intervening decades, Lamperti told me in a phone interview, he would go on to make over 5,000 astronomical observations. “From 1984 on, I’ve had that as a very nice diversion from work. Even though you come home at night very tired from observing, at least you feel that you’ve accomplished something, and it took your mind off any stresses from your day.”

Lamperti belongs to the Delaware Valley Amateur Astronomers, a hobbyist group that falls under the umbrella of the Astronomical League, and it was through this group that he first heard about an ambitious new project headed up by the Harvard College Observatory, an astronomical research facility that has existed for over 150 years. So in January he and several other amateur astronomers hopped on a conference call with David Sliski, who was then a curatorial assistant at Harvard (he recently took a job as a telescope operator at the University of Pennsylvania), to learn about the plan to transcribe logbooks for nearly half a million photographic plates of the night sky that had been taken over the course of a hundred years.

The project, called Digital Access to a Sky Century at Harvard (DASCH for short), is actually a collaboration between the Harvard College Observatory and the Smithsonian Institution, the latter of which has embarked on a much larger endeavor to crowdsource the transcription of millions of pages of archival material. The DC-based network of museums launched a beta version of its crowdsourcing platform in June 2013, and over the next year about 1,000 volunteers transcribed 13,000 pages of documents. Since the platform emerged from beta in August and opened up to the public, that volunteer list has swelled to over 4,000. Though the Smithsonian isn’t the first archival institution to begin digitizing its collected works (everyone from the New York Public Library to the British Museum has launched similar initiatives), its scope of work is arguably the most massive: there are 137 million objects spread across the Smithsonian’s 19 museums.

But in a world where Google has the technology to scan millions of library books and provide searchable text, why can’t the Smithsonian do the same? “Basically these documents are not able to be recognized by [optical character recognition] text readers,” said Sarah Sulick, a public affairs specialist at the Smithsonian. “You feed it through a scanner and it can read all the script on the page, but for these handwritten documents it just can’t do that. We actually need a human for this to happen.”

The Smithsonian and Harvard College Observatory, Sliski told me, have had a longstanding collaboration ever since the Smithsonian physically moved its own observatory from the National Mall in the 1950s up to Cambridge, Massachusetts. The Harvard–Smithsonian Center for Astrophysics is a joint venture between the two and has become one of the most venerable astrophysical institutions in the world. For their most recent collaboration, the Smithsonian has offered up its transcription program and thousands of volunteers to tackle the arduous task of transcribing the handwritten logbooks for hundreds of thousands of photographic plates.

What are these plates? Well, they used to be the method by which all photographs were recorded, whereby a photographer would use a glass plate with a silver gelatin emulsion on the back to capture an image. By the 20th century, most photographers had moved on to more agile forms of film, but the plates were still used by astronomers who were taking pictures of the stars because they were less likely to bend or deteriorate with age. “In 1850, the director of the Observatory, William Cranch Bond, took the first photograph of a star, Vega,” said Sliski. “Later in the 1880s, the fourth director, Edward Pickering, launched a program to photograph the entire sky. He started in Cambridge, Massachusetts but quickly went west to Mt. Wilson, then south to Peru. Later the observatory would move to South Africa, New Zealand, and even Jamaica.” This was significant because, from about the 1890s onward, astronomers now had a complete picture of the night sky in both the northern and southern hemispheres. It provided tremendous value for those wanting to study patterns within the universe over a period of time. “What this collection allows you to do is look at the entire night sky for 100 years. If today you took the entire GDP of every country and invested it in astronomy, which would obviously never happen, you still couldn’t reproduce the results here, because there’s an element of time you can’t buy. There’s a uniqueness in that you can look for objects [in the universe] which occur on very unusual time scales.”

For instance, if a black hole accretes matter from a nearby star and goes nova, it will increase in brightness by a factor of ~100 for a few weeks, an event that will only happen every 50 to 100 years for a particular object. “It’s been impossible to map the occurrence rate of that because we’ve only seen two or three,” said Sliski. “We know it exists, but we need to survey the entire sky for 50 to 100 years to count these things and see how frequently they’re happening.”

There’s a significant amount of human history attached to these plates, which is both a blessing and a curse because it causes tensions between archivists and scientists. A museum curator views the plates, which often include notations and other writings scribbled onto them by astronomers, as historical artifacts that must be preserved as is. A scientist, on the other hand, sees the plate as just one piece of data, and those annotations and scribbles get in the way of analyzing that data. “You have to wipe away all the handwriting,” explained Sliski. “Otherwise it’ll look like stars when you digitize it at a very high resolution.”

Whose handwriting would be wiped away? As it turns out, some fairly important people in the history of astronomy.

Edward Charles Pickering, the director of the Harvard Observatory in the late 19th and early 20th centuries, grew so frustrated with the quality of clerical and computational work from his male staff that he ended up hiring his maid, Williamina Fleming, who he felt could do a better job. It turned out that she could, and when Pickering later received donation money to hire more staff, he placed Fleming in charge of a group of women who were later referred to as “Pickering’s Harem,” an unfortunate name. Most of the women performed this work, much of it monotonous and grueling, for between 25 and 35 cents an hour, half what a man would get paid for the same tasks. In fact, that was their appeal.


These “human computers,” as they’re often called now, provided much of the foundation on which some of the largest discoveries in astronomy have been made. It was Henrietta Leavitt’s discovery of the relationship between the brightness of certain variable stars and the period over which they pulsate, for instance, that Edwin Hubble employed to measure the distances to other galaxies. “That moment in time transitions our views of the cosmos of it being a galaxy with millions of stars to a universe of hundreds of millions of galaxies, each with hundreds of millions of stars, and then we later determined it to be hundreds of billions of galaxies each with hundreds of billions of stars,” said Sliski. “That conceptual change is the direct result of what these women were doing.”


Edward Charles Pickering with his “human computers”

So the scientists and the historians reached a compromise: Harvard can wipe the plates clean, thereby allowing it to scan them and aggregate their data, but first it must take high resolution photographs of the plates with the writings intact. This way, museums can still consult the pre-wiped plates and the scribblings they once contained. So that’s what Sliski had been doing the past three years: taking about 400 plates off the shelf a day, photographing them at such a high resolution that a single plate requires a gigabyte worth of data, wiping them, scanning them, and then placing them back on the shelf.
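To get a feel for the scale of that routine, the round figures quoted in this article (about 400 plates a day, roughly a gigabyte per plate, nearly half a million plates in all) can be multiplied out. What follows is only a back-of-the-envelope sketch: the plate total is rounded up to 500,000, and the figure of 250 working days per year is an assumption made for illustration, not something reported by Sliski or the project.

```python
# Back-of-the-envelope estimate using the article's round figures.
PLATES_TOTAL = 500_000        # "nearly half a million" plates, rounded up
PLATES_PER_DAY = 400          # plates photographed, wiped, and scanned per day
GB_PER_PLATE = 1              # roughly one gigabyte per high-resolution photograph
WORKING_DAYS_PER_YEAR = 250   # assumption for illustration, not from the article


def estimate(plates_total, plates_per_day, gb_per_plate, days_per_year):
    """Return (daily data in GB, total data in TB, working days, years)."""
    daily_gb = plates_per_day * gb_per_plate
    total_tb = plates_total * gb_per_plate / 1000  # decimal terabytes
    working_days = plates_total / plates_per_day
    years = working_days / days_per_year
    return daily_gb, total_tb, working_days, years


daily_gb, total_tb, working_days, years = estimate(
    PLATES_TOTAL, PLATES_PER_DAY, GB_PER_PLATE, WORKING_DAYS_PER_YEAR
)
print(f"{daily_gb} GB per day, about {total_tb:.0f} TB in total")
print(f"{working_days:.0f} working days, or about {years:.0f} years")
```

On those assumptions, the photographs alone would run to hundreds of terabytes, and a single person working at that pace would need several years of working days to get through the whole collection.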

The plates, in aggregate, allow Harvard to utilize algorithms and other pattern-seeking technology to search for notable astronomical phenomena stretching back into the 1800s. But before it can do that it must first perform a task that requires the help of the Smithsonian transcription center and hundreds of volunteers like Albert Lamperti: transcribe the accompanying logbooks. “If you scan a plate we can measure the position of the stars, and we know which stars they are, but all this is irrelevant without knowing the date and time of the exposure because the whole point of this is to measure how the position and brightness changes in time,” explained Sliski.
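Sliski’s point can be made concrete with a small sketch. Everything in it is hypothetical: the plate IDs, dates, and brightness values are invented for illustration, and the layout is not the actual DASCH data format. It shows only that a scanned measurement becomes a point on a timeline once a transcribed logbook entry supplies its date.

```python
from datetime import date

# Hypothetical transcribed logbook: plate ID -> exposure date.
logbook = {
    "A100": date(1901, 3, 14),
    "A205": date(1923, 7, 2),
    "A317": date(1950, 11, 20),
}

# Hypothetical scan results for one star: (plate ID, brightness in magnitudes,
# where a smaller number means a brighter star). The list is in shelf order,
# which says nothing about when each plate was exposed.
scan_measurements = [
    ("A317", 12.1),
    ("A100", 12.0),
    ("A999", 11.8),  # no logbook entry transcribed yet
    ("A205", 7.2),
]


def build_light_curve(measurements, logbook):
    """Pair each measurement with its exposure date and sort chronologically.

    Plates with no logbook entry are returned separately, because their
    measurements cannot be placed on the timeline.
    """
    dated, undated = [], []
    for plate_id, magnitude in measurements:
        if plate_id in logbook:
            dated.append((logbook[plate_id], magnitude))
        else:
            undated.append(plate_id)
    dated.sort(key=lambda pair: pair[0])
    return dated, undated


light_curve, undated = build_light_curve(scan_measurements, logbook)
for exposure_date, magnitude in light_curve:
    print(exposure_date.isoformat(), magnitude)
print("Cannot be dated:", undated)
```

In this made-up example, the brightening on the middle plate only shows up as an event once the three dated measurements are put in order, and the plate without a logbook entry contributes nothing until someone transcribes its page.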

That’s where Lamperti comes in. Several days a week he’ll log onto the Smithsonian transcription website and visit the astrophysics section. Once there he’s shown a bevy of partially completed logbooks (each logbook is 100 pages, with up to 20 plates recorded to a page), and when he’s editing a particular page nobody else can access it (so as to avoid repeat work). For a particular page, he’s shown a high resolution scan of it that he can zoom in and out of, and there are 13 columns he has to transcribe into an Excel sheet below the image. “Usually I work on it in the afternoon for two or three hours,” he said. “But I work on it for no longer than three hours, because once you get tired you make mistakes.”

I asked Lamperti what he gets out of the experience. Isn’t the act of transcription rote and grueling? He recounted for me a brief summary of the history of these plates, how they were used by astronomers to make some of the most exciting discoveries about our universe. “To see those actual plates from the late 1800s and early 1900s, and to see that old writing in the logbooks, knowing what that photographing led to later on, I feel like I’m part of the astronomical history here in the United States.”

Once the logbooks are transcribed and reviewed, Sliski would upload all the data and images to an open database that will allow scientists and the public to study them. And the amount of information and potential these plates have is immense. According to Sliski, “99.99 percent of the data has never been looked at by anybody. Back in the day, people would look at maybe 100 stars on the plate, not the 50,000 when we can now see when we scan them.”

When the logbook transcriptions are completed and the rest of the data is made available, the project’s contribution to astronomy and science will be widely recognized around the world. “In astronomy, the field surveys itself every 10 years and asks what’s the number one priority,” said Sliski. “For the decadal survey for 2010, the biggest priority was time domain astrophysics.” And when the decadal survey is conducted again in 2020, the National Research Council of the National Academy of Sciences will likely look back at the previous 10 years and recognize the discoveries that were only made possible by a group of amateur astronomers, many of whom, like Albert Lamperti, logged in day in and day out to record a tiny sliver of our universe’s history, giving us a glimpse into the cosmos that made the mysteries of why we’re here and where we’re going just a little less mysterious. And like the human computers who laid the foundations for astronomical discovery in the 20th century, the contribution of these volunteers will not soon be forgotten.



Images via Popular Science and Harvard