It’s the context, stupid

On July 9, 2013, a conference took place in the Jewish Museum in Berlin entitled Public History of the Holocaust - Historical Research in the Digital Age. One of the issues brought up in the closing forum discussion was the loss of context when working with online digital archives and libraries, a point made by Stefanie Schüler-Springorum, who used the example of newspaper research to illustrate it. Having often used this example myself to illustrate the methodological challenges of doing history in the digital age, I was very happy to hear it being addressed in the forum. For loss of context, or loss of awareness of context, when working with digital resources is a key issue that is in dire need of more discussion by historians, whether they describe themselves as digital or not.

Newspaper research is a particularly good example of the issue, as anyone who has worked with newspapers can confirm. When I worked on my PhD, I spent weeks browsing microfilms of Yiddish newspapers looking for articles on Jewish volunteers who fought in the International Brigades during the Spanish Civil War. More specifically, I collected all articles published on that topic in two Yiddish newspapers published in Paris in the late 1930s, Naye Prese and Parizer Haynt. This process of browsing the newspapers, or leafing through them, automatically provides a researcher with the context in which articles on a particular topic should be seen, the context being the totality of the newspaper and its coverage. It allows you to get an idea of the importance of the topic at hand for a newspaper. Moreover, it provides clues as to the ‘weight’ of an article: its size, the page on which it is printed, its position on the page and its layout, which together determine its visual prominence and thus its possible impact and reception. It also provides clues as to how the topic at hand is related to other topics a newspaper might discuss.

In the case of my own research, articles on Jewish volunteers featured prominently in one newspaper in particular, the communist daily Naye Prese. As it turned out, they were closely related to the newspaper’s campaign for a Popular Front among Jewish migrants in France and Paris, which also featured prominently on its pages. Given that the International Brigades were an important propaganda tool for the Communist International’s post-1935 Popular Front strategy, this is not very surprising. What it means, though, is that articles on Jewish volunteers were discursively related to coverage of the Popular Front and of the political struggles going on among Jewish migrants in France and Paris at the time. It follows, then, that one can only properly determine why the newspaper wrote about Jewish volunteers in the way it did by considering the newspaper and its coverage as a whole.

When using digitized newspapers, this process can change quite dramatically. Now it should be said that digital newspaper archives come in various shapes and sizes. Many are completely text-searchable, some offer only non-OCR’ed materials and some offer a combination of both. An example of the latter, also used by Schüler-Springorum, is Compactmemory. The change I am talking about takes place when using full-text searchable newspapers, where a search will yield a list of results in seconds. To return to my example: instead of spending weeks compiling a list of articles on Jewish volunteers, this could be done in minutes with some smart searching (the reality is that the particular newspapers I worked with are not digitized and Yiddish cannot, yet, be easily OCR’ed, but you get the point).

Obviously, this saves huge amounts of time compared to the ‘traditional’ method. Yet context gets lost, as a researcher is immediately transported to the micro-level: that of the specific articles being looked for. The browsing process described above, which automatically yields crucial contextual information, awareness and insight, as well as a sense of the weight of the search results, is completely bypassed. Instead, when viewing search results, a researcher has to make a deliberate effort to zoom out and find out in what context an article should be seen, what the weight of the article is, and so on. The bigger the set of results, the bigger this problem becomes, and the bigger the risk of ending up with decontextualized analyses.

All of this is not to suggest that we should go back to old-fashioned newspaper research: the solution cannot be scepticism and wariness of new technological possibilities. One solution lies simply in educating historians and making them aware of how their engagement with and awareness of context can change when using online resources. But this is not just about education: technology can and should help to address the issue as well. Interfaces also come in various forms, and some newspaper archives present search results in such a way that it is at least clear what place an article has on a page, to name just one contextual element.

 

Example: search result for the query "Jiddisch" using the online Dutch newspaper archive of the Royal Library in The Hague.

 

Indeed, in the aforementioned forum discussion in Berlin, Yossi Matias from Google Israel countered Schüler-Springorum’s claim by insisting that the danger of loss of context can be prevented by paying attention to how information is presented. This is a valid point, yet the current reality is that there are many newspaper archives online whose interfaces are not conducive to contextualized historical analysis. And most historians grossly underestimate the importance of interfaces, if they reflect at all on what it means to use digital resources. But interfaces matter, and the question is how we can design them in such a way that they allow for complex querying of data while simultaneously accounting for, or emphasizing, an awareness of context. I would argue that in creating digital archives and libraries it should become normal for historians to be part of interface design discussions and decisions. And I would love to hear other thoughts on the issues raised above as well.

Of course, designing better interfaces for historical research does not address the issue of how to find discursive relations between clusters of articles, but that might be material for another post.