Making Numbers Legible

What do you do with numbers? I mean this in the context of writing, not research. How do you incorporate quantitative evidence into your writing in a way that makes it legible for your readers? I’ve been thinking more and more about this as I write my dissertation, which examines the role of the nineteenth-century Post in the American West. Much like today, the Post was massive. Its sheer size was part of what made it so important. And I find myself using the size of the Post to help answer the curmudgeonly “so what?” question that stalks the mental corridors of graduate students. On a very basic level, the Post mattered because so many Americans sent so many letters through such a large network operated by so many people. Answering the “so what?” question means that I have to incorporate numbers into my writing. But numbers are tricky.

Let’s begin with the amount of mail that moved through the U.S. Post. In 1880 Americans sent 1,053,252,876 letters. That number is barely legible for most readers. I mean this in two ways. In a mechanical sense we HATE having to actually read so many digits. A more conceptual problem is that this big of a number doesn’t mean all that much. If I change 1,053,252,876 to 1,253,252,876, would it lead you, the reader, to a fundamentally different conclusion about the size of the U.S. Post? I doubt it, even though the difference of 200 million letters is a pretty substantial one. And if instead of adding 200 million letters I subtract 200 million letters – 1,053,252,876 down to 853,252,876 – the reader’s perception is more likely to change. But this is only because the number shed one of its digits and crossed the magic cognitive threshold from “billion” to “million.” It’s not because of an inherent understanding of what those huge numbers actually mean.

Actual and perceived differences between 853,252,876 vs. 1,053,252,876 vs. 1,253,252,876

One strategy to make a number like 1,053,252,876 legible is by reduction: to turn large numbers into much smaller ones. If we spread out those billion letters across the population over the age of ten, the average American sent roughly twenty-eight letters over the course of 1880, or one every thirteen days. A ten-digit monstrosity turns into something the reader can relate to. After all, it’s easier to picture writing a letter every two weeks than it is to picture a mountain of one billion letters. Numbers, especially big ones, are easier to digest when they’re reduced to a more personal scale.

1,053,252,876 letters / 36,761,607 Americans over the age of ten = 28.65 letters / person
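The same reduction can be written out as a few lines of Python. This is only a sketch of the arithmetic, using the two figures quoted above; the variable names are mine.

```python
# Figures quoted in the text above.
letters_1880 = 1_053_252_876
population_over_ten = 36_761_607

# Reduce the national total to a per-person figure.
letters_per_person = letters_1880 / population_over_ten

# 1880 was a leap year, so spread the letters across 366 days.
days_between_letters = 366 / letters_per_person

print(f"{letters_per_person:.2f} letters per person")  # 28.65 letters per person
print(f"one letter every {days_between_letters:.1f} days")
```

The second figure comes out a little under thirteen days, which is where the "letter every two weeks" image comes from.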

A second way to make numbers legible is by comparison. The most direct counterpart to the U.S. Post was the telegraph industry. Put simply, the telegraph is a lot sexier than the Post, and nineteenth-century Americans and modern historians alike have lionized the technology. A typical account goes something like this: “News no longer traveled at the excruciatingly slow pace of ships, horses, feet, or trains. It now moved at 670 million miles per hour.” In essence, “the telegraph liberated information.” But the telegraph only liberated information if you could afford to pay for it. In 1880 the cost of sending a telegram through Western Union from San Francisco to New York was $2.50, or 125 times the two-cent price of mailing a letter. Not surprisingly, Americans sent roughly 35 times as many letters as telegrams. The enormous size of the Post was in part a product of how cheap it was to use.

Cost of Telegram vs. Letter, San Francisco to New York (1880)

This points to a third strategy to make numbers legible: visualization. In the above case the chart acts as a rhetorical device. I’m less concerned with the reader being able to precisely measure the difference between $2.50 and $0.02 than I am with driving home the point that the telegraph was really, really expensive and the U.S. Post was really, really cheap. A more substantive comparison can be made by looking at the size of the Post Office Department’s workforce. In 1880 it employed an army of 56,421 postmasters, clerks, and contractors to process and transport the mail. Just how large was this workforce? In fact, the “postal army” was more than twice the size of the actual U.S. Army. Fifteen years removed from the Civil War there were now more postmasters than soldiers in American society. Readers are a lot better at visually comparing different bars than they are at doing mental arithmetic with large, unwieldy numbers.

PostOffice_Military

Almost as important as the sheer size of the U.S. Post was its geographic reach. Most postal employees worked in one of 43,012 post offices scattered across the United States. A liberal postal policy meant that almost any community could successfully petition the department for a new post office. Wherever people moved, a post office followed close on their heels. This resulted in a sprawling network that stretched from one corner of the country to the other. But what did the nation’s largest spatial network actually look like?

1880_PostOffices

Mapping 43,012 post offices gives the reader an instant sense for both the size and scope of the U.S. Post. The map serves an illustrative purpose rather than an argumentative one. I’m not offering interpretations of the network or even pointing out particular patterns. It’s simply a way for the reader to wrap their minds around the basic geography of such a vast spatial system. But the map is also a useful cautionary tale about visualizing numbers. If anything, the map undersells the size and extent of the Post. It may seem like a whole lot of data, but it’s actually missing around ten thousand post offices, or 22% of the total number that existed in 1880. Some of those offices were so obscure or had such a short existence that I wasn’t able to automatically find their locations. And these missing post offices aren’t evenly distributed: about 99% of Oregon’s post offices appear on the map compared to only 47% of Alabama’s.

Disclaimers aside, compare the map to a sentence I wrote earlier: “Most postal employees worked in one of 43,012 post offices scattered across the United States.” In that context the specific number 43,012 doesn’t make much of a difference – it could just as well be 38,519 or 51,933 – and therefore doesn’t contribute all that much weight to my broader point that the Post was ubiquitous in the nineteenth-century United States. A map of 43,012 post offices is much more effective at demonstrating my point. The map also has one additional advantage: it beckons the reader to not only appreciate the size and extent of the network, but to ask questions about its clusters and lines and blank spaces.* A map can spark curiosity and act as an invitation to keep reading. This kind of active engagement is a hallmark of good writing and one that’s hard to achieve using numbers alone. The first step is to make numbers legible. The second is to make them interesting.

* Most obviously: what’s going on with Oklahoma? Two things. Mostly it’s a data artifact – the geolocating program I wrote doesn’t handle Oklahoma locations very well, so I was only able to locate 19 out of 95 post offices. I’m planning to fix this problem at some point. But even if every post office appeared on the map, Oklahoma would still look barren compared to its neighbors. This is because Oklahoma was still Indian Territory in 1880. Mail service didn’t necessarily stop at its borders but postal coverage effectively fell off a cliff; in 1880 Indian Territory had fewer post offices than any other state/territory besides Wyoming. The dearth of post offices is especially telling given the ubiquity of the U.S. Post in the rest of the country, showing how the administrative status of the territory and decades of federal Indian policy directly shaped communications geography.

Still Playing Catch-Up

As I was flipping through the February 2014 issue of the American Historical Review I was encouraged to see that the American historical profession’s flagship journal seems to be doing a pretty decent job of publishing the impressive work of female historians. Three out of its four main articles were written by women and four out of the five books in its “Featured Reviews” section were also by women. That’s encouraging. But what about the rest of the February issue? Figuring out how many women are among the 176 contributors for this single issue is a lot harder. And what about not just this issue, but all five issues it publishes annually? And what about not just this year, but every year since its inception in 1895?

Looking at gender representation in the American Historical Review is exactly the kind of historical project that lends itself well to digital analysis. Collecting individual author information from 120 years of publication history would take an enormous amount of tedious labor. Fortunately the information is already online. I wrote a Python script to scrape the table-of-contents from every AHR issue and then, with the help of Bridget Baird, began to process all of this text to try and extract the books that were reviewed in the AHR, their authors, and the names of the people reviewing them. The data was something of a nightmare, but we were eventually able to get everything we wanted: around 60,000 books, authors, and reviewers. The challenge turned to: was there a way to automatically identify the gender of all of these different people? Especially for a dataset that spanned more than a hundred years, we needed a way to take into account potential changes in naming conventions. A historian named Leslie who was born before 1950 was likely to be a man, but if that same Leslie was born after 1950 the person was likely to be a woman. Bridget’s solution was for us to write a program that relies on a database of names from the Social Security Administration dating back to 1880 to account for these changes. This approach is not without problems. It only includes American names while subtly reinforcing an insidious gender binary framework. Nevertheless, it does contribute a useful new digital humanities methodology and one that we are planning to explore with Lincoln Mullen in more depth.
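To make the idea concrete, here is a minimal Python sketch of a year-aware name lookup. It is not the program Bridget and I wrote. The table of counts, the `guess_gender` function, and the 75% threshold are all assumptions invented for illustration, and the counts are made-up numbers rather than real Social Security Administration data.

```python
# Hypothetical counts of births per (name, year) and sex, loosely in the shape
# of the Social Security Administration baby-name files. The numbers are
# invented for illustration and are NOT real SSA data.
TOY_COUNTS = {
    ("leslie", 1900): {"F": 10, "M": 90},
    ("leslie", 1930): {"F": 40, "M": 60},
    ("leslie", 1960): {"F": 80, "M": 20},
}


def guess_gender(name, birth_years, counts=TOY_COUNTS, threshold=0.75):
    """Guess gender from a first name and a range of plausible birth years.

    Returns "F", "M", or "Unknown" when the name is missing or too ambiguous.
    """
    totals = {"F": 0, "M": 0}
    for year in birth_years:
        # Add up the counts for every year in the plausible range.
        for sex, n in counts.get((name.lower(), year), {}).items():
            totals[sex] += n

    total = totals["F"] + totals["M"]
    if total == 0:
        return "Unknown"  # e.g. initials, or names absent from the table

    share_female = totals["F"] / total
    if share_female >= threshold:
        return "F"
    if share_female <= 1 - threshold:
        return "M"
    return "Unknown"


# An author plausibly born between 1895 and 1934 versus one born around 1960.
print(guess_gender("Leslie", range(1895, 1935)))
print(guess_gender("Leslie", range(1955, 1965)))
```

With these toy counts the earlier Leslie is classed as male and the later one as female, while a name that never appears in the table falls into the "Unknown" bucket.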

This might come as a real shock, but the American Historical Review didn’t feature very many women for much of its publication history. Over the first eighty years of the AHR’s existence there were rarely more than a handful of books written by female authors in any given issue – women made up less than 10% of all reviewed authors through the 1970s. But things began to change in the late 1970s, when female authors began a steady ascent in the AHR’s reviews. By the end of the 1980s women’s books had nearly doubled in the journal. By the twenty-first century there were three times as many women as there had been in the 1970s.

Gender of book authors (as a percent of all authors) in the American Historical Review between 1895 and 2013. The number of authors categorized as “Unknown” in the early years stems from the widespread use of initials (ex. K. T. Drew). Most of these authors were likely men, but we’ve erred on the safe side in categorizing them as Unknown. In the later years, many of the “Unknowns” stem from non-U.S. names.

But other numbers paint a less rosy picture. Lincoln Mullen’s recent work on history dissertations showed a similarly steady upwards trajectory in the number of female-authored history dissertations since 1950. Although it has plateaued in recent years, women have very nearly closed the gap in terms of newly completed history dissertations. But the glass ceiling remains stubbornly low in terms of what happens from that point onwards. In book reviews published in the AHR male authors continue to outnumber female authors by a factor of nearly 2 to 1. Whereas there is now a gap of around 3-5% separating the proportion of male and female dissertation authors, that gap jumps to 25-35% in terms of the proportion of male and female book authors being reviewed in the American Historical Review.

Gender of dissertation authors and of book authors in the American Historical Review. Note: The above chart only looks at authors whose gender was successfully identified by the program. It is also something of an apples-to-oranges comparison given that Lincoln and I were using slightly different methods, but it gives a rough sense for the gap between dissertations and the AHR.

On the reviewer side of the equation, things aren’t much better. There are still more than twice as many male reviewers as female reviewers in the AHR. But gender inflects this relationship in less direct ways. In particular, we can look at the gender dynamics of who reviews whom. About three times as many men write reviews of male-authored books as do women. In the case of female-authored books, there are slightly more male reviewers than female reviewers but the ratio is much closer to 50/50. In short, women are much more likely to write reviews of other women. And while men still write reviews of the majority of female-authored books, they tend to gravitate towards male authors – who are, of course, already over-represented in the AHR.

Gender of reviewers for male-authored books. Note: The above chart only looks at authors and reviewers whose gender was successfully identified by the program.

Gender of reviewers for female-authored books. Note: The above chart only looks at authors and reviewers whose gender was successfully identified by the program.
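The two charts are essentially a cross-tabulation of author gender against reviewer gender. A rough Python sketch of that tally follows. The list of reviews is invented for illustration and the helper name `reviewer_share` is hypothetical, so the shares it prints say nothing about the real AHR data.

```python
from collections import Counter

# Invented (author_gender, reviewer_gender) pairs, one per review.
# These are placeholders, not the AHR data.
reviews = [
    ("M", "M"), ("M", "M"), ("M", "M"), ("M", "F"),
    ("F", "M"), ("F", "F"),
    ("M", "Unknown"),  # dropped below, as in the charts
]

# Keep only reviews where both genders were identified.
identified = {"M", "F"}
known = [(a, r) for a, r in reviews if a in identified and r in identified]
counts = Counter(known)


def reviewer_share(author_gender, reviewer_gender):
    """Share of reviews of books by one gender written by reviewers of another."""
    total = sum(n for (a, _), n in counts.items() if a == author_gender)
    return counts[(author_gender, reviewer_gender)] / total


print(reviewer_share("M", "M"))  # share of male-authored books reviewed by men
print(reviewer_share("F", "F"))  # share of female-authored books reviewed by women
```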

Bridget and I were also able to extract the subjects used by the AHR to categorize their reviews. Although these conventions changed quite a bit over time, I took a stab at aggregating them into some broad categories for the past forty years. Essentially, I wanted to find out the gender representation within different historical fields. As you can see in the chart below, the proportion of men and women is not the same for all fields. Caribbean/Latin American history has had something approaching equal representation for the past decade-and-a-half. In both African history and Ancient/Medieval history female historians made some quite dramatic gains during the late-nineties and aughts. The guiltiest parties, however, are also the two subject categories that publish the most book reviews: Modern/Early Modern Europe and the United States/Canada. Both of them have made steady progress but still hover at around two-thirds male.

The different subjects are sorted left-to-right by the number of reviews in the AHR. Again, please note that the above chart only looks at authors whose gender was successfully identified by the program.

Women are now producing history dissertations at nearly the same rate as men, but the flagship journal of the American historical profession has yet to catch up. There are, of course, a lot of factors at play. This gap might reflect a substantial time-lag as a younger, more evenly-balanced generation gradually moves its way through the ranks even as an older, male-skewed generation continues to publish monographs. It might reflect biases in the wider publishing industry, or the fact that female historians continue to bear a disproportionate amount of the time-burden of caring for families. That the AHR continues to publish far more reviews of male authors than female authors is depressing, but unfortunately not surprising given the systemic inequalities that continue to exist across the profession.

The County Problem in the West

Happy GIS Day! Below is a version of a lightning talk I’m giving today at Stanford’s GIS Day.

Historians of the American West have a county problem. It’s primarily one of geographic size: counties in the West are really, really big. A “List of the Largest Counties in the United States” might as well be titled “Counties in the Western United States (and a few others)” – you have to go all the way to #30 before you find one that falls east of the 100th meridian. The problem this poses to historians is that a lot of historical data was captured at a county level, including the U.S. Census.

San Bernardino County

San Bernardino County is famous for this – the nation’s largest county by geographic area, it includes the densely populated urban sprawl of the greater Los Angeles metropolis along with vast swathes of the uninhabited Mojave Desert. Assigning a single count of anything to San Bernardino County is to teeter on geographic absurdity. But, for nineteenth-century population counts in the national census, that’s all we’ve got.

TheWest_1871_Population-01-01

Here’s a basic map of population figures from the 1870 census. You can see some general patterns: central California is by far the most heavily populated area, with some moderate settlement around Los Angeles, Portland, Salt Lake City, and Santa Fe. But for anything more detailed, it’s not terribly useful. What if there was a way to get a more fine-grained look at settlement patterns in these gigantic western counties? This is where my work on the postal system comes in. There was a post office in (almost) every nineteenth-century American town. And because the department kept records for all of these offices – the name of the office, its county and state, and the date it was established or discontinued – a post office becomes a useful proxy to study patterns over time and space. I assembled this data for a single year (1871) and then wrote a program to geocode each office, or to identify its location by looking it up in a large database of known place-names. I then supplemented it with the salaries of postmasters at each office for 1871. From there, I could finally put it all onto a map:
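In its simplest form, geocoding by lookup looks something like the Python sketch below. This is an illustration of the general technique rather than my actual program: the two-entry gazetteer, its rough coordinates, the `geocode` function, and the office named "Obscure Gulch" are all stand-ins I made up for the example.

```python
# Hypothetical gazetteer keyed on (name, county, state). The coordinates are
# rough placeholders for illustration, not authoritative locations.
GAZETTEER = {
    ("cheyenne", "laramie", "wy"): (41.14, -104.82),
    ("salt lake city", "salt lake", "ut"): (40.76, -111.89),
}


def geocode(name, county, state, gazetteer=GAZETTEER):
    """Return (latitude, longitude) for an exact match, or None if not found."""
    key = (name.strip().lower(), county.strip().lower(), state.strip().lower())
    return gazetteer.get(key)


offices = [
    ("Cheyenne", "Laramie", "WY"),
    ("Obscure Gulch", "Laramie", "WY"),  # invented office, absent from the gazetteer
]

# Look up every office and set aside the ones that need to be found by hand.
located = {office: geocode(*office) for office in offices}
unmatched = [office for office, coords in located.items() if coords is None]

print(located)
print(unmatched)
```

Matching on county and state as well as name is a design choice: it keeps two towns with the same name in different states from collapsing into one point, at the cost of missing offices whose county boundaries later changed.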

TheWest_1871_PostOffices

The result is a much more detailed regional geography than that of the U.S. Census. Look at Wyoming in both maps. In 1870, the territory was divided into five giant rectangular counties, all of them containing fewer than 5,000 people. But its distribution of post offices paints a different picture: rather than vertical units, it consisted largely of a single horizontal stripe along its southern border.

Wyoming_census-02   Wyoming_postoffices-02

Similarly, our view of Utah changes from a population core of Salt Lake City to a line of settlement running down the center of the territory, with a cluster in the southwestern corner completely obscured in the census map.

Utah_census-01   Utah_postoffices-01

Post offices can also reveal transportation patterns: witness the clear skeletal arc of a stage-line that ran from the Oregon/Washington border southeast to Boise, Idaho.

Dalles_Boise

Connections that didn’t mirror the geographic unit of a state or county tended to get lost in the census. One instance of this was the major cross-border corridor running from central Colorado into New Mexico. A map of post offices illustrates its size and shape; the 1870 census map can only gesture vaguely at both.

ColoradoNewMexico_census-02   ColoradoNewMexico_postoffices-02

The following question, of course, should be asked of my (and any) map: what’s missing? Well, for one, a few dozen post offices. This speaks to the challenges of geocoding more than 1,300 historical post offices, many of which might have only been in existence for a single year or two. I used a database of more than 2 million U.S. place-names and wrote a program that tried to account for messy data (spelling variations, altered state or county boundaries, etc.). The program found locations for about 90% of post offices, while the remaining offices I had to locate by hand. Not surprisingly, they were missing from the database for a reason: these post offices were extremely obscure. Finding them entailed searching through county histories, genealogy message boards, and ghost town websites – a process that is simply not scalable beyond a single year. By 1880, the number of post offices in the West had doubled. By 1890, it had doubled again. I could conceivably spend years trying to locate all of these offices. So, what are the implications of incomplete data? Is automated, 90% accuracy “good enough”?
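One common way to absorb spelling variations is approximate string matching. The sketch below uses `difflib.get_close_matches` from Python's standard library. It is an assumption about how such a step might work, not a description of my program, and the place-names, the `fuzzy_lookup` helper, and the 0.85 cutoff are chosen purely for illustration.

```python
from difflib import get_close_matches

# Illustrative list of known place-names standing in for a real gazetteer.
known_places = ["Silver City", "Boise City", "Idaho City"]


def fuzzy_lookup(office_name, candidates, cutoff=0.85):
    """Return the closest known place-name, or None if nothing is close enough."""
    matches = get_close_matches(office_name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(fuzzy_lookup("Boise Citty", known_places))  # a misspelled record
print(fuzzy_lookup("Nowhere", known_places))      # nothing similar enough
```

The cutoff is the judgment call. Set it too low and obscure offices get pinned to the wrong town; set it too high and more of them land in the pile that has to be located by hand.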

What else is missing? Differentiation. The salary of a postmaster partially addresses this problem, as the department used a formula to determine compensation based partially on the amount of business an office conducted. But it was not perfectly proportional. If it was, the map would be one giant circle covering everything: San Francisco conducted more business than any other office by several orders of magnitude. As it is, the map downplays urban centers while highlighting tiny rural offices. A post office operates in a kind of binary schema: no office, no people (well, at least very few). If there was an office, there were people there. We just don’t know how many. The map isn’t perfect, but it does start to tackle the county problem in the West.

*Note: You can download a CSV file containing post offices, postmaster salaries, and latitude/longitude coordinates here.*

Who Picked Up The Check?

Adventures in Data Exploration

In November 2012 the United States Postal Service reported a staggering deficit of $15.9 billion. For the historian, this raises the question: was it always this bad? Others have penned far more nuanced answers to this question, but my starting point is a lot less sophisticated: a table of yearly expenses and income.

US Postal Department Surplus (Gray) or Deficit (Red) by Year

So, was the postal department always in such terrible fiscal shape? No, not at first. But from the 1840s onward, putting aside the 1990s and early 2000s, deficits were the norm. The next question: What was the geography of deficits? Which states paid more than others? Essentially, who picked up the check?

Every year the Postmaster General issued a report containing a table of receipts and expenditures broken down by state. Let’s take a look at 1871:

1871 Annual Report of the Postmaster General – Receipts and Expenditures

Because it’s only one table, I manually transcribed the columns into a spreadsheet. At this point, I could turn to ArcGIS to start analyzing the data, maybe merging the table with a shapefile of state boundaries provided by NHGIS. But ArcGIS is a relatively high-powered tool better geared for sophisticated geospatial analysis. What I’m doing doesn’t require all that much horsepower. And, in fact, quantitative spatial relationships (ex. measurements of distance or area) aren’t all that important for answering the questions I’ve posed. There are a number of different software packages for exploring data, but Tableau provides a quick-and-dirty, drag-and-drop interface. In keeping with the nature of data exploration, I’ve purposefully left the following visualizations rough around the edges. Below is a bar graph, for instance, showing the surplus or deficit of each state, grouped into rough geographic regions:

Postal Surplus or Deficit by State – 1871

Or, in map form:

Postal Surplus (Black) or Deficit (Red) by State – 1871

Between the map and the bar graph, it’s immediately apparent that:
a) Most states ran a deficit in 1871
b) The Northeast was the only region that emerged with a surplus

So who picked up the check? States with large urban, literate populations: New York, Pennsylvania, Massachusetts, Illinois. Who skipped out on the bill? The South and the West. But these are absolute figures. Maybe Texas and California simply spent more money than Arizona and Idaho because they had more people. So let’s normalize our data by analyzing it on a per-capita basis, using census data from 1870.

Postal Surplus or Deficit per Person by State – 1871
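The normalization itself is simple arithmetic: subtract expenditures from receipts and divide by population. Here is a hedged Python sketch. The two states and all of their figures are invented placeholders, not values transcribed from the 1871 report or the 1870 census, and `balance_per_capita` is a name I made up for the example.

```python
# Hypothetical figures for two made-up states. These are NOT transcribed from
# the 1871 Annual Report or the 1870 census.
states = {
    "State A": {"receipts": 500_000, "expenditures": 400_000, "population": 1_000_000},
    "State B": {"receipts": 20_000, "expenditures": 120_000, "population": 10_000},
}


def balance_per_capita(row):
    """Surplus (positive) or deficit (negative) per resident, in dollars."""
    return (row["receipts"] - row["expenditures"]) / row["population"]


per_capita = {name: balance_per_capita(row) for name, row in states.items()}

for name, dollars in per_capita.items():
    print(f"{name}: {dollars:+.2f} per person")
```

In this toy example both states are off by the same absolute amount of $100,000, but the small state's deficit per person is a hundred times larger than the big state's surplus per person. That is the kind of pattern absolute figures hide.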

The South and the West may have both skipped out on the bill, but it was the West that ordered prime rib and lobster before it left the table. Relative to the number of its inhabitants, western states bled the system dry. A new question emerges: how? What was causing this extreme imbalance of receipts and expenditures in the West? Were westerners simply not paying into the system?

Postal Receipts and Expenditures per Person by Region – 1871

Actually, no. The story was a bit more complicated. On a per-capita basis, westerners were paying slightly more money into the system than any other region. The problem was that providing service to each of those westerners cost substantially more than in any other region: $38 per person, or roughly 4-5 times the cost of service in the east. For all of its lore of rugged individualism and a mistrust of big government, the West received the most bloated government “hand-out” of any region in the country. This point has been driven home by a generation of “New Western” historians who demonstrated the region’s dependence on the federal government, ranging from massive railroad subsidies to the U.S. Army’s forcible removal of Indians and the opening of their lands to western settlers. Add the postal service to that long list of federal largesse in the West.

But what made mail service in the West so expensive? The original 1871 table further breaks down expenses by category (postmaster salaries, equipment, buildings, etc.). Some more mucking around in the data reveals a particular kind of expense that dominated the western mail system: transportation.

Transportation Expenses per Person by State (State surplus in black, deficit in red) – 1871

High transport costs were partially a function of population density. Many western states like Idaho or Montana consisted of small, isolated communities connected by long mail routes. But there’s more to the story. Beginning in the 1870s, a series of scandals wracked the postal department over its “star” routes (designated as any non-steamboat, non-railroad mail route). A handful of “star” route carriers routinely inflated their contracts and defrauded the government of millions of dollars. These scandals culminated in the criminal trial of high-level postal officials, contractors, and a former United States Senator. In 1881, the New York Times printed a list of the ninety-three routes under investigation for fraud. Every single one of these routes lay west of the Mississippi.

Annual Cost of “Star” Routes Under Investigation for Fraud – 1881 (Locations of Route Start/End Termini)

The rest of the country wasn’t just subsidizing the West. It was subsidizing a regional communications system steeped in fraud and corruption. The original question – “Who picked up the check?” – leads to a final cliffhanger: why did all of these frauds occur in the West?

A Dissertation’s Infancy: The Geography of the Post

A history PhD can be thought of as a collection of overlapping areas: coursework, teaching, qualifying exams, and the dissertation itself. The first three are fairly structured. You have syllabi, reading lists, papers, classes, deadlines. The fourth? Not so much. Once you’re advanced to candidacy there’s a sense of finally being cut loose. Go forth, conquer the archive, and return triumphantly to pen a groundbreaking dissertation. It’s exhilarating, empowering, and also terrifying as hell. I’ve been swimming through the initial research stage of the dissertation for the past several months and thought it would be a good time to articulate what, exactly, I’m trying to find. Note: if you are less interested in American history and more interested in maps and visualizations, I would skip to the end.

The Elevator Speech

I’m studying communications networks in the late nineteenth-century American West by mapping the geography of the U.S. postal system.*

The Elevator-Stuck-Between-Floors Speech

From the end of the Civil War until the end of the nineteenth century the U.S. Post steadily expanded into a vast communications network that spanned the continent. By the turn of the century the department was one of the largest organizational units in the world. More than 200,000 postmasters, clerks, and carriers were involved in shuttling billions of pounds of material between 75,000 offices at the cost of more than $100 million a year. As a spatial network the post followed a particular geography. And nowhere was this more apparent than in the West, where the region’s miners, ranchers, settlers, and farmers led their lives on the network’s periphery. My dissertation aims to uncover the geography of the post on its western periphery: where it spread, how it operated, and its role in shaping the space and place of the region.

My project rests on the interplay between center and periphery. The postal network hinged on the relationship between its bureaucratic center in Washington, DC and the thousands of communities that constituted the nodes of that network. In the case of the West, this relationship was a contentious one. Departmental bureaucrats found themselves buffeted with demands to rein in ballooning deficits. Yet they were also required by law to provide service to every corner of the country, no matter how expensive. And few regions were costlier than the West, where a sparsely settled population scattered across a huge area was constantly rearranged by the boom-and-bust cycles of the late nineteenth century. From the top-down perspective of the network’s center, providing service in the West was a major headache. From the bottom-up perspective of westerners the post was one of the bedrocks of society. For most, it was the only affordable and accessible form of long-distance communication. In a region marked by transience and instability, local post offices were the main conduits for contact with the wider world. Western communities loudly petitioned their Congressmen and the department for more offices, better post roads, and speedier service. In doing so, they redefined the shape and contours of both the network and the wider geography of the region.

The post offers an important entry point into some of the major forces shaping American society in the late nineteenth century. First, it helped define the role of the federal government. On a day-to-day basis, for many Americans the post was the federal government. Articulating the geographic size and scale of the postal system will offer a corrective to persistent caricatures of the nineteenth-century federal government as weak and decentralized. More specifically, a generation of “New Western” historians have articulated the omnipresent role of the state in the West. Analyzing the relationship between center and periphery through the post’s geography provides a means of mapping the reach of federal power in the region. With the postal system as a proxy for state presence, I can begin to answer questions such as: where and how quickly did the state penetrate the West? How closely did it follow on the heels of settler migration, railroad development, or mining industries? Finally, the post was deeply enmeshed in a system of political patronage, with postmasterships disbursed as spoils of office. What was the relationship between a communications network and the geography of regional and national politics?

Second, the post rested on an often contentious marriage between the public and private spheres. Western agrarian groups upheld the post as a model public monopoly. Nevertheless, private hands guided the system’s day-to-day operations on its periphery. Payments to mail-carrying railroad companies became the department’s single largest expenditure, and it doled out millions of dollars each year to private contractors to carry the mail in rural areas. This private/public marriage came with costs – in the early 1880s, for instance, the department was rocked by corruption scandals when it discovered that rural private contractors had paid kickbacks to department officials in exchange for lavish carrying contracts. How did this uneasy alliance of public and private alter the geography of the network? And how did the department’s need to extend service in the rural West reframe wider debates over monopoly, competition, and the nation’s political economy?

Getting Off The History Elevator

That’s the idea, at least. Rather than delve into even greater detail on historiography or sources, I’ll skip to a topic probably more relevant for readers who aren’t U.S. historians: methodology. Digital tools will be the primary way in which I explore the themes outlined above. Most obviously, I’m going to map the postal network. This entails creating a spatial database of post offices, routes, and timetables. Unsurprisingly, that process will be incredibly labor intensive: scanning and georeferencing postal route maps, or transcribing handwritten microfilmed records into a database of thousands of geocoded offices. But once I’ve constructed the database, there are any number of ways to interrogate it.

To demonstrate, I’ll start with lower-hanging fruit. The Postmaster General issued an annual report providing (among other information) data on how many offices were established and discontinued in each state. These numbers are fairly straightforward to put into a table and throw onto a map. Doing so provides a top-down view of the system from the perspective of a bureaucrat in Washington, D.C. For instance, by looking at the number of post offices discontinued each year it’s possible to see the wrenching reverberations of the Civil War as the postal system struggled to reintegrate southern states into its network in 1867:

Post Offices Discontinued By State, 1867
(Source: Annual Report of the Postmaster General, 1867)

The West, meanwhile, was arguably the system’s most unstable region. As measured by the percentage of its total offices that were either established or discontinued each year, states such as New Mexico, Colorado, and Montana were continually building and dismantling new nodes in the network.

Post Offices Established or Discontinued as a Percentage of Total Post Offices in State, 1882
(Source: Annual Report of the Postmaster General, 1882)
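The instability measure behind this map is simple arithmetic: offices established plus offices discontinued in a year, divided by the state's total offices. Here is a minimal sketch in Python of how that calculation might look. The column names (`state`, `total`, `established`, `discontinued`) are assumptions about how such a table could be laid out, and the rows are made-up placeholders, not figures from the Annual Report.

```python
import csv
import io

# Placeholder rows for illustration only; these are NOT figures from the
# Annual Report of the Postmaster General. Column names are assumptions.
SAMPLE_CSV = """state,total,established,discontinued
Colorado,400,80,40
New Mexico,200,30,30
Ohio,2500,100,50
"""


def churn_percentage(established, discontinued, total):
    """Offices established or discontinued as a percentage of total offices."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (established + discontinued) / total


def churn_by_state(csv_text):
    """Return {state: churn percentage} for every row in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["state"]: churn_percentage(
            int(row["established"]),
            int(row["discontinued"]),
            int(row["total"]),
        )
        for row in reader
    }


if __name__ == "__main__":
    rates = churn_by_state(SAMPLE_CSV)
    # List the most unstable states first.
    for state, pct in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{state}: {pct:.1f}%")
```

Dividing by each state's total is what makes a sparsely served territory comparable to a state with thousands of offices; raw counts alone would bury the West under the older, denser networks back East.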

Of course, the broad brush strokes of national, year-by-year data only provide a generalized snapshot of the system. I plan on drilling down to far more detail by charting where and when specific post offices were established and discontinued. This will provide a much more fine-grained (both spatially and temporally) view of how the system evolved. Geographer Derek Watkins has employed exactly this approach:

Screenshot from Derek Watkins, “Posted: U.S. Growth Visualized Through Post Offices” (25 September 2011)

Derek’s map demonstrates the power of data visualization: it is compelling, interactive, and conveys an enormous amount of information far more effectively than text alone. Unfortunately, it also relies on an incomplete dataset. Derek scraped the USPS Postmaster Finder, which the USPS built as a tool for genealogists to look up postmaster ancestors. The USPS historian adds to it on an ad-hoc basis depending on specific requests by genealogists. In a conversation with me, she estimated that it encompasses only 10-15% of post offices, and there is no record of what has been completed and what remains to be done. Derek has, however, created a robust data visualization infrastructure. In a wonderful demonstration of generosity, he has sent me the code behind the visualization. Rather than spending hours duplicating Derek’s design work, I’ll be able to plug my own, more complete, post office data into a beautiful existing interface.

Derek’s generosity brings me back to my ongoing personal commitment to scholarly sharing. I plan on making the dissertation process as open as possible from start to finish. Specifically, the data and information I collect have broad potential for applications beyond my own project. As the backbone of the nation’s communications infrastructure, the postal system provides rich geographic context for any number of other historical inquiries. Cameron Ormsby, a researcher in Stanford’s Spatial History Lab, has already used post office data I collected as a proxy for measuring community development in order to analyze the impact of land speculation and railroad construction in Fresno and Tulare counties.

To kick things off, I’ve posted the state-level data I referenced above on my website as a series of CSV files. I also used Tableau Public to generate a quick-and-dirty way for people to interact with and explore the data in map form. This is an initial step in sharing data and I hope to refine the process as I go. Similarly, I plan on occasionally blogging about the project as it develops. Rather than narrowly focusing on the history of the U.S. Post, my goal (at least for now) is to use my topic as a launchpad to write about broader themes: research and writing advice, discussions of digital methodology, or data and visualization releases.

*By far the most common response I’ve received so far: “Like the Pony Express?” Interestingly, the Pony Express was a temporary experiment that only existed for about eighteen months in 1860-1861. In terms of mail carried, cost, and time in existence, it was a tiny blip within the postal department’s operations. Yet it has come to occupy a lofty position in America’s historical memory and encapsulates a remarkable number of the contradictions and mythologies of the West.

Pilgrims, Cowboys, and Loneliness

The provocative title of Stephen Marche’s Atlantic article, “Is Facebook Making Us Lonely?” invites immediate skepticism as the latest iteration in the sub-genre of technological alarmism about the internet. Like much of this literature, Marche’s writing is far more thoughtful and measured than his simplistic title would indicate. He admits, for instance, that “Loneliness is certainly not something that Facebook or Twitter or any of the lesser forms of social media is doing to us. We are doing it to ourselves.” He also makes the interesting point that Facebook requires a relentless and exhausting performative dance on a digital stage. But he also makes some problematic claims. A range of responses have critiqued Marche’s use of studies and statistics, but what caught my eye was Marche’s use of history. In one passage, worth quoting at length, Marche writes:

Loneliness is at the American core, a by-product of a long-standing national appetite for independence: The Pilgrims who left Europe willingly abandoned the bonds and strictures of a society that could not accept their right to be different. They did not seek out loneliness, but they accepted it as the price of their autonomy. The cowboys who set off to explore a seemingly endless frontier likewise traded away personal ties in favor of pride and self-respect. The ultimate American icon is the astronaut: Who is more heroic, or more alone? The price of self-determination and self-reliance has often been loneliness. But Americans have always been willing to pay that price.

Self-invention is only half of the American story, however. The drive for isolation has always been in tension with the impulse to cluster in communities that cling and suffocate. The Pilgrims, while fomenting spiritual rebellion, also enforced ferocious cohesion. The Salem witch trials, in hindsight, read like attempts to impose solidarity—as do the McCarthy hearings. The history of the United States is like the famous parable of the porcupines in the cold, from Schopenhauer’s Studies in Pessimism—the ones who huddle together for warmth and shuffle away in pain, always separating and congregating.

I always get annoyed when historians mount their high horses to harrumph about how Americans don’t know anything about history. But indulge me for one paragraph while I do just that. There are two major problems with Marche’s use of history here. First, it’s inaccurate. There’s a big difference between “loneliness,” “independence,” “self-determination,” and “self-reliance,” but Marche seems to conflate them all. The Pilgrims were more about religious reform than religious independence, and leaving one place for another place doesn’t make you lonely. Or alone. Or independent. Or self-reliant. As Marche himself admits, they also pursued their “spiritual rebellion” in an intensely communal manner.

Then there’s the cowboys. Oh boy. A generation of “New Western Historians” have pretty conclusively dispelled the idea of the self-reliant, independent wrangler. Cowboys were always deeply reliant on others: the federal government to remove Plains Indians and enforce ranching and riparian rights, or a host of merchants, storekeepers, and meat-packers that inextricably tied them to national and international markets. And I don’t understand what “traded away personal ties in favor of pride and self-respect” even means.

Image courtesy of Yale Collection of Western Americana, Beinecke Rare Book and Manuscript Collection

My problem is less with the accuracy of Marche’s history than with how he uses it. I don’t expect an article in the Atlantic to delve into the historiographical intricacies of the Puritans or the problematic nature of Frederick Jackson Turner’s frontier thesis. What Marche is talking about is American mythology, not some “core” of the American character or “actual” history. If he had made this distinction clearer, it would have been a quite relevant and important point. Independence, self-reliance, self-determination: these are cherished ideals that undergird many of the stories Americans tell themselves about their past. And it’s fascinating to think about how these ideals interact with the separate (but related) reality of both loneliness and community in a present-day context.

Alexis de Tocqueville tackled this paradox between individualism and communalism nearly two centuries ago in Democracy in America. The French political thinker toured America in 1831 and wrote an expansive account of American institutions, history, society, and character. A major theme running through Democracy in America was the tension between the individualism produced by a society based on equality and the institutions and associations based on communal life. De Tocqueville argued that social equality had the downside of producing immensely self-centered people. In true de Tocqueville fashion, he penned one passage that has a ring of timelessness to it – Marche could have used it word-for-word in his characterization of present-day loneliness:

The first thing that strikes the observation is an innumerable multitude of men, all equal and alike, incessantly endeavoring to procure the petty and paltry pleasures with which they glut their lives. Each of them, living apart, is as a stranger to the fate of all the rest; his children and his private friends constitute to him the whole of mankind. As for the rest of his fellow citizens, he is close to them, but he does not see them; he touches them, but he does not feel them; he exists only in himself and for himself alone; and if his kindred still remain to him, he may be said at any rate to have lost his country.

But de Tocqueville goes on to describe how American society during the Jacksonian era combated the effects of isolation brought about by social equality, perhaps most importantly through associational life: “In no country in the world has the principle of association been more successfully used or applied to a greater multitude of objects than in America.” Americans in the 1820s and 1830s loved forming groups: political parties, religious sects, reform movements. This was the age of Joseph Smith and Mormonism, massive evangelical revivals, temperance movements, and the American Anti-Slavery Society. So what does it say that one of the most famous historical observers of American society highlighted the intense communalism of that society? My point is not that de Tocqueville was right or wrong, it’s that Americans and critics of American society have always wrestled with the balance between communalism and individualism.

A lack of historicity is my major problem with “Is Facebook Making Us Lonely?”. Marche uses history as a vague, unexamined point of departure for the present, oftentimes veering into the trope of a lost “Golden Age.” He cites some studies demonstrating, for instance, that the number of households with one inhabitant has increased from 1950, or that the number of personal confidants decreased from the 1980s to the present. Although Eric Klinenberg thoughtfully disputes Marche’s claim that “various studies have shown loneliness rising drastically over a very short period of recent history,” I’m less concerned with the accuracy of Marche’s claims than his treatment of history itself.

There’s a tendency when writing critiques of present-day society to make a direct implication that things are fundamentally new and are changing for the worse. And this tendency seems to be even more prevalent in diatribes against technology, which operate under an often-unexamined assumption that technology X (the telegraph, the automobile, the Internet, social media) has irrevocably reshaped our world. It’s useful to talk about the effects of technological changes: there are many ways in which Facebook and social media have, in fact, fundamentally changed our society. But too often these articles assume that any and every change is a) something fundamentally new, and b) directly attributable to the technology itself. Marche neatly encapsulates this lack of historicity in two sentences: “Nostalgia for the good old days of disconnection would not just be pointless, it would be hypocritical and ungrateful. But the very magic of the new machines, the efficiency and elegance with which they serve us, obscures what isn’t being served: everything that matters.”

Facebook isn’t magic and the “good old days of disconnection” only exist in our historical imagination. Not only do cowboys have an American Professional Rodeo Association, the group has its own Facebook page. As de Tocqueville reminds us, we’ve wrestled with the contradictions between loneliness, individualism, and communalism for a long, long time. What Facebook has done is change some of the channels and format of these tensions. Like any technology, it needs to be more thoughtfully placed in its historical context. History is not a golden age or a black box or a passive point of departure for a completely new paradigm. Critics of Facebook or Twitter or whatever new technology will be undermining the “American core” in twenty years should do a better job of keeping this in mind.

Surviving Quals, Part II: The Grind

*This is part two of a series on preparing, studying for, and taking qualifying exams in a history PhD program. See Part I here. After taking my exams in December 2011, I decided to collect my thoughts on the process. The following advice is based on my own experience of taking Stanford’s qualifying oral exams for United States history. The format was a two-hour oral exam, with four faculty members testing four different fields: three standard American history fields (Colonial, Nineteenth Century, and Twentieth Century) and one specialty field (in my case, Spatial and Digital History). Bear in mind that other programs have different purposes, formats, and requirements.*

The Grind

“Preparing for quals is a full-time job, but there is no reason to put in overtime.” This was one of the best pieces of advice I received when I was asking fellow graduate students about the process. More so than perhaps any other facet of graduate school, studying for quals should be managed like a job. This is for two reasons: to keep pace and to keep sane.

Keep Pace

Quals can be thought of as a simple math problem with two main variables. One variable is the total number of books you need to read. The other is how much time you have to read them. If you have an exam date already set, work backwards to figure out how many books you need to read each week. If you have more control over scheduling the date of the exam, work forwards. Using a baseline of around 3-4 hours for each book, determine how many total hours you will need to read them. In either case, it’s crucial to factor in additional time for things like basic chronology, reviewing material, and meetings with professors (roughly 30-40 hours per field, in my case). Schedule in other commitments, weekends, vacations, or time off depending on your schedule. Finally, add in an additional 2-3 week buffer before the exam. This gives you crucial time to synthesize all of the material and, worst case scenario, a surplus buffer of time to dip into if you get behind on your reading schedule. Add it all up and you’ll get a rough sense for what your pace needs to be. In my case, I ended up having to read roughly 8-9 books a week, with around eight hours of additional preparation each week.
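The arithmetic above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a planner. The function name and parameters are hypothetical, the defaults are midpoints of the ranges mentioned above (3.5 hours per book, 35 extra hours per field, a 3-week buffer, four fields), and the example numbers (a 200-book list, 26 weeks until the exam, 3 weeks off) are invented for the example rather than taken from my actual schedule.

```python
def weekly_pace(total_books, weeks_until_exam, buffer_weeks=3, weeks_off=0,
                hours_per_book=3.5, extra_hours_per_field=35, fields=4):
    """Estimate a weekly reading pace when working backwards from an exam date.

    All defaults are assumptions for illustration; adjust them to your
    own list, fields, and calendar.
    """
    # Weeks actually available for reading, after the pre-exam buffer
    # and any vacations or other time off.
    reading_weeks = weeks_until_exam - buffer_weeks - weeks_off
    if reading_weeks <= 0:
        raise ValueError("no reading weeks left after buffer and time off")

    books_per_week = total_books / reading_weeks
    reading_hours_per_week = books_per_week * hours_per_book
    # Chronology, review, and meetings with professors, spread evenly.
    extra_hours_per_week = extra_hours_per_field * fields / reading_weeks

    return {
        "reading_weeks": reading_weeks,
        "books_per_week": books_per_week,
        "reading_hours_per_week": reading_hours_per_week,
        "extra_hours_per_week": extra_hours_per_week,
    }


if __name__ == "__main__":
    # Hypothetical example: 200 books, exam in 26 weeks, 3 weeks off.
    plan = weekly_pace(200, 26, buffer_weeks=3, weeks_off=3)
    for name, value in plan.items():
        print(f"{name}: {value:.1f}")
```

If you control the exam date instead, the same function works in reverse: try a few values of `weeks_until_exam` until the weekly totals look like a full-time job without the overtime.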

Once you’ve figured out what your pace is, you need to keep track of your progress. I ended up creating a spreadsheet with all of my books and estimates for how much time I’d need on each book (usually 3-4 hours for a normal monograph, several more hours for a synthetic tome like Daniel Walker Howe’s What Hath God Wrought). This gave me a running tally of my progress and how much still remained – unsurprisingly, this was a daunting list in the beginning. But checking off books became a daily ritual that lent an all-important sense of moving forward. Having a schedule also gives you added structure for an experience that can otherwise be dangerously unaccountable. There are days when you will be tired, distracted, or just sick and tired of turning pages. These are the days when lack of daily accountability becomes a problem. Putting off a book one morning might seem trivial at the time, but it adds up quickly. Having a schedule forces you to keep working. It might not be pretty, you might not retain as much from that particular book, but knowing that you have to get through it to reach your “quota” for the week allows you to keep grinding.

Keep Sane

Treating quals-studying like a job that you clock into and out of also helps to keep your sanity. Just reading and reading for hours every day is an isolating and tiring experience in a way that taking classes, teaching, or even research is not. It’s easy to get lost in the world of endless books, and while this can be rewarding in its own peculiar way it’s also not sustainable. Set a daily reading schedule and try to stick with it. By working consistently at the same times each day it will be much easier for you to “leave” your job. When you’re done for the day, actually be done for the day. I found studying for quals to be draining in a very different way from other aspects of graduate school. Whereas I have no problem answering emails from students at night or thinking about research while I cook dinner, it was much more exhausting to think about the two books I had read that day for quals. If possible, try to take at least one day off a week where you don’t touch a book. And all of the other rules about work/life balance apply: have a social life, exercise, think and talk about things other than history. Clock in, clock out.

Learn How to Not Read

Arguably the most important skill in studying for quals is learning how to not read. When you have to read two books a day, you don’t actually read them. You gut them. Graduate school has likely forced you to begin to do this already, but it will soon become a standard rather than an exception. For inspiration, read Larry Cebula’s “How to Read a Book in One Hour.” Although you will be spending more time on each book, the same general principles apply. Below is my own system for reading a book for quals.

1. Use a template. After much debate I ended up using Evernote as my note-taking medium. I created a basic template that I would use to create a new note for each book. This not only saves time but allows you to remember information more systematically. Finally, taking notes digitally also allows for a more robust catalog and search functionality, especially via tagging systems. By tagging summaries of books with their different subjects, I could quickly pull up, say, all the books on my 19th-century reading list having to do with slavery.

*Download my empty template in Evernote format or as an HTML file, or see an example of a completed note.*

Screenshot of note-taking in Evernote, with Tags and Searches highlighted

2. Use book reviews. Read 2-3 reviews of the book and take notes on them. If possible, try to find a mix of shorter (1-2 page) synopses and lengthier (5-10 page) reviews. You will quickly learn which journals are best for your particular field – in US History, for instance, Reviews in American History offers much more detailed reviews that oftentimes place the books within a broader historiographic context. I would usually pair one of these longer reviews with two shorter ones. By reading several different reviews you can usually glean what the “consensus” is on the book’s major themes and contributions and be on the look-out for these while reading.

3. Be an active reader. I’m aware people have different styles. But for quals, I found the best way to take notes was to sit at a desk with my computer and take notes on every chapter as I went. Whereas in classes I had often read books lying on a couch and used marginalia and underlining, I’ve since soured on this approach. Actively taking notes while you read is less enjoyable, but forces you to synthesize as you go. It’s easy to underline an important sentence without actually understanding it. Paraphrasing forces you to actually get what you read. As for content, start with a careful, word-by-word reading of the introduction and take detailed notes. Then move much more quickly through the book’s chapters, skimming and trying to pull out what’s most important.

Quals tend to privilege arguments over thematic content: few people are going to ask for the specific evidence an author used to support their argument in a particular chapter. However, jotting a sentence down that describes the general setting, actors, and subject of the chapter, separate from its argumentative thrust, allows you to recall it better in the future. It’s important to take notes on both arguments and content. Finally, move fast. Flip past pages that are simply listing additional evidence for an argument. Although these are often the most enjoyable parts of history books they are, unfortunately, tangential to why you’re reading the book. Unless the book was particularly long or particularly important, I tried to cap the reading part of the note-taking process at around three hours.

4. Synthesize. This is crucial. After reading every book I forced myself to take 20-30 minutes and write a careful two- to three-paragraph summary of the book. This is much harder than simply taking notes because it forces you to distill a book into its barest bones. Perhaps not surprisingly, it’s difficult to write a summary of a book you don’t understand or remember, so doing this also makes sure you actually processed what the author was trying to do (or forces you to at least take a stab at it). As a supplement to this, as I was reading the book I would write major themes or concepts in a bullet list. Once I got to the end, I would go back and decide which of these were actually major themes or concepts and which ended up being auxiliary. The important themes gave me a basic skeleton from which I could then write a more elaborate summary. These write-ups proved invaluable. When you’re reading two books a day, even a book you read two weeks ago can dissolve into a distant memory. These summaries give you a fast and efficient means of recalling what the book was about. Finally, go back and revise them as you read other books. Oftentimes you don’t understand the broader significance of an author’s argument until you’re able to place it in a larger historiographic context.

*See an example of a full note here. Also see my full listing of book summaries for my US history fields.*

5. Talk it out. This is probably the hardest step, especially in the beginning of the process. But it’s central to studying for quals. There is something about having to verbally articulate an answer that forces you to understand it in a way that simply writing answers or notes does not. Additionally, one of the most challenging parts of quals is to move beyond simply being able to regurgitate a specific author’s argument and move towards higher-level synthesis. It’s one thing to be able to answer: “What is Bernard Bailyn’s interpretation of the American Revolution?” or even “What are three different interpretations of the American Revolution?” It’s much harder to answer, “Was the American Revolution actually revolutionary?” Answering these higher-level questions out loud is hard, but it is a skill at which you can and will get better. Once again, rely on your fellow graduate students, particularly ones who have already taken their exams. Have them ask you practice questions, pretend you are in an actual exam, and give formal answers (rather than the easier route of making it conversational, as in “Well, I’d probably say something about…”). Practice your own answers, but also ask other students for clarifications about topics or books you don’t understand. Do this as early as possible and keep doing it throughout the process. I found it the most useful way to prepare for the exam itself.

6. Go back to the basics. My grasp of the more factual side of American history was surprisingly weak going into the process. It’s easy to spend all of your time learning about historiography and interpretations, but you need a factual framework to build off. Particularly important episodes demand a solid grounding in chronology – for example, the lead-up to the American Revolution or the Civil War. Memorize things like changing geography, presidential administrations, dynastic reigns, economic depressions, major legal cases, etc. Some books, like those in the Oxford History of the United States series, offer more nuts-and-bolts information than others. Be aware of which books these are and read them in more detail, writing separate notes on basic chronology or events in addition to your notes on the more interpretive side of the book.