
Growing Data, Changing Journalism

An Explorative Inquiry Into The Rise Of Data Journalism

8 July 2011

Eric R. Alberts (3485595) Coding Culture (200600075) MA New Media & Digital Culture Mirko T. Schäfer & Nikos Overheul


2010-2011


1 Introduction
Hal Varian, chief economist at Google, stated in a 2009 interview with McKinsey Quarterly that in the next ten years statisticians will have the sexiest job around. He motivated this statement by arguing that "the ability to take data, to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it [...] is going to be a hugely important skill in the next decades" (Manyika 2009). When looking at Varian's employer, it is obvious why he came to this conclusion. Google is a company that benefits greatly from vast amounts of data: every day, enormous amounts of data are collected as a by-product of user interactions with Google's services, and from this data new economic value is created. Google, however, is but one example of how important large data sets have become during the last decade. In an attempt to map the current explosion of data and the challenges that derive from it, The Economist published a special report in February 2010. In this special issue, Joe Hellerstein, a computer scientist at the University of California, is quoted calling the current age "the industrial revolution of data" (Cukier 2010, 3). The Economist continues by stating that "[t]he effect is being felt everywhere, from business to science, from government to the arts" (ibid.), giving this phenomenon the label of "big data".

In this paper an exploratory inquiry is conducted into one specific area in which big data is becoming ever more important: journalism. Several interesting news reports and cases have already emerged that ensued directly from delving through large quantities of data. A prime example is the large-scale investigation into British politicians' expenses, which invigorated the debate on government expenditure. It is likely that in the near future similar data-driven news reports will see the light of day, as large data sets and the sophisticated tools that allow journalists to make sense of this data are becoming more and more dispersed over the Internet. Access to and use of databases are no longer the preserve of IT specialists or investigative journalists conducting expensive research. According to the European Journalism Centre (EJC), which organized a roundtable conference on data journalism in Amsterdam last year, "[d]eveloping the know-how to use the available data more efficiently, to understand it, communicate and generate stories based on it, could be a huge opportunity to breathe new life into journalism" (Data-driven Journalism 2010, 6). The role of the reporter can even be expanded or changed to that of sense-maker by digging deep into data, making journalism more socially relevant (ibid.). This paper critically analyses these
opportunities but also discusses the challenges that derive from data journalism. By critically examining matters such as algorithms, data visualisation and participatory journalism, this paper makes an effort to further contextualise this new phenomenon.

The inquiry into the properties of data journalism will be conducted on two levels. First, this paper discusses data journalism on a material level, as databases are becoming valuable new sources for journalists. The issue of materiality can be placed within an already active debate on how the materiality of books is transforming in the digital age: companies like Google and Amazon are digitizing books into data objects, bringing on changes to our relationship with books and databases. Materiality in this paper also relates to the use of software tools and the choice of specific algorithms to structure these large amounts of data. In addition, visualisation is becoming a widely accepted and applied manner of comprehending vast and abstract data, but not without decisions that have consequences for the displayed data. Second, data journalism will be discussed on a social level, as news organisations are actively involving their readers in analysing large data sets. The practice of crowdsourcing, for example, has the potential to change traditional producer-consumer relationships within journalism. The inquiry into the properties of data journalism on a material and social level will be related to various recently published examples of data journalism.

An inquiry into the opportunities and challenges that derive from data journalism, however, cannot take place without first giving a theoretical framework. This paper will therefore first embed data journalism in a larger context, describing how the rise of this new phenomenon can be seen within an already changing world of journalism shaped by the processes of media convergence and participatory culture. This paper is intended to offer insight into a new journalistic practice, which is drawing more and more attention as digital information rapidly expands and becomes widely available and publicly accessible. It contextualizes data journalism and critically engages with its assumptions and consequences in an attempt to offer a nuanced overview of this emerging branch in the already changing field of journalism in the digital age, which, according to Adam Westbrook, author of The Next Generation Journalist, is "one of the big potential growth areas in the future of journalism" (qtd. in Data-driven Journalism 2010, 3).


2 Theoretical framework of data journalism


As stated above, before discussing the opportunities and challenges it is important to first get a grip on what data journalism is and how this practice relates to journalism in general and to the trends in today's digital culture. Because data journalism is only now taking shape, it has not been fully theorised yet. There is, however, extensive theory to be found on participatory journalism, the blurring of traditional relationships between news media and consumers of media. In this chapter, data journalism is understood in terms of participatory journalism, but with a much more prominent technological aspect: its reliance on databases. Data journalism as a practice blends technology and culture in such a way that it is not feasible to separate the technological aspects from their social context. Before this argument is further elaborated, it is, for the sake of contextualizing, necessary to first give an overview of the technological and cultural processes from which participatory journalism and, thus, data journalism have emerged.

We live in a world that is witnessing a revolution in information technology, a converging set of technologies that is penetrating all domains of human activity (Castells 2000). This revolution, however, is not just a technological process amplified through digitization, or what Nicholas Negroponte referred to as the transformation of atoms into bits (Negroponte 1995). The digital age is also becoming a unified environment in which computer hardware and software "define possibilities for action and conditions of expression" (Rieder & Schäfer 2008, 2). The new human condition is characterized by what Henry Jenkins refers to as convergence culture, enabling new forms of participation and collaboration (Jenkins 2006, 245). According to Jenkins, "[c]onvergence is both a top-down corporate-driven process and a bottom-up consumer-driven process" (Jenkins 2004, 37) and is taking place on a global scale in acts of media production and consumption. Previously set borders between making media and using media, but also between media industries, continue to blur. This idea of convergence, the blurring of boundaries set by the conditions of digitization (Jenkins 2006, 11), was initially enthusiastically welcomed in the field of journalism by authors like Dan Gillmor (2004), who named it grassroots or participatory journalism. The proliferation of networked communication technologies enables people to launch independent news organisations as a direct response to what were perceived as shortcomings in mainstream news coverage (Deuze 2008). A few of these alternative websites produced by amateurs/citizens are Indymedia, Wikinews and OhmyNews, the latter being an alternative to the
highly conservative mainstream press in South Korea (Kahney 2003). Studies on these cases show how citizen media offer interesting bottom-up alternatives to conventional top-down practices of news making (Paulussen & Ugille 2008, 26). At first glance these success stories back the more utopian beliefs of Dan Gillmor: if this trend were to continue, independent participatory journalism might be able to replace top-down news media and its traditional news media-user relationships. Closer analysis of participatory journalism, which is still a rather ill-defined term (Hermida 2008), shows, however, that "it is fair to say that the impact of weblogs and citizen media on traditional, professional journalism has thus far been rather limited" (Paulussen & Ugille 2008, 26). This is partially due to the prevailing tendency among journalists to see themselves as the defining actors in the process of making news (Heinonen 2011). The main conclusion of a study on the development of participatory journalism on a global scale, conducted by David Domingo et al., is that professional newsrooms appear to be rather reluctant to open up the news production process to the active involvement of citizens (Domingo et al. 2008). The primary question posed by researchers such as Wilson Lowrey (2005), namely whether participatory journalism is in a way substituting professional journalism, is thus losing relevance. Instead, the focal point of research on participatory journalism has shifted towards how mainstream news media are adopting citizen contributions in the process of news production (Paulussen & Ugille 2008, 26).

In 2003 J. D. Lasica, senior editor of the Online Journalism Review, also stated that "readers want to be part of the news process" (Lasica 2003, 74). But Lasica supplemented this statement by noting that "instead of looking at participatory journalism and traditional journalism as rivals for readers' eyeballs, we should recognize that we're entering an era in which they complement each other, intersect with each other, play off one another" (73). He continued by stating that "we are starting to see a mixture of commentary and analysis from grassroots as ordinary people find their voices and contribute to the media mix. Blogs won't replace traditional news media, but they will supplement them in important ways" (74). Although Lasica wrote this essay almost ten years ago, we can now see that traditional news media indeed continue to dominate the news media landscape and are becoming ever more capable of harvesting the potential of an active audience. For instance, in the Netherlands the news organisation NRC integrates a successful weblog with the physical newspaper NRC Next, and the Dutch public news organisation NOS invites young people to contribute to news on its website NOS op 3 (formerly known as NOS Headlines).


Mark Deuze, a Dutch professor of communication sciences who has shed light on the professional identity of journalists in the context of convergence culture, makes an observation similar to J. D. Lasica's. According to Deuze, convergence culture-based participatory journalism is best understood as "some kind of co-creative, commons-based news platform" that is produced when a professional media organisation (top-down) partners with or deliberately taps into the emerging participatory media culture online (bottom-up) (Deuze 2008, 109). Furthermore, participatory journalism is very much under construction (ibid.). The convergence of top-down and bottom-up journalism is "a work in progress with more or less traditional makers and users of news cautiously embracing its potential which embrace is not without problems both for the producers and consumers" (ibid.). Mark Deuze and J. D. Lasica offer a far more nuanced consideration of participatory journalism in the context of convergence culture. This contrasts with early utopian-like considerations, which were initially all too easily taken for granted (Domingo 2008, 680).

Early studies on participatory journalism have also been criticized for their underlying technological determinism (Paulussen & Ugille 2008, 28). Changes in journalism were explained as caused by technological developments influencing the work of journalists from the outside (Deuze 2008, 110). Pablo Boczkowski underscores the limitations of a sole focus on the effects of new technologies by showing that although technologies do produce effects, they can only be understood in the dynamics of technology adoption processes (Boczkowski 2004, 208). Technology must be seen in terms of its implementation, and therefore in terms of how it extends and amplifies previous ways of doing things (Deuze 2008, 110). Changes occurring in the field of journalism are therefore better understood as a mutual shaping of technological and social developments rather than as the effects of technological processes (Paulussen & Ugille 2008, 28).

At the beginning of this chapter I stated that data journalism as a practice blends technology and culture in such a way that it is not feasible to separate the technological aspects from their social context. With the use of the theory on participatory journalism I would like to argue that data journalism is the gradual outcome of a converging culture, which introduces "a constantly changing mix of features, contexts, processes and ideas into the work of individual news workers" (Deuze 2008, 112). This means that convergence culture in this particular context is neither merely technologically (Negroponte 1995) nor solely socially driven. Data journalism should rather be seen, in line with Paulussen and Ugille, as an outcome of the mutual shaping of technological and social developments.


Critical analysis, however, shows that convergence is a slow and problematic process and that its true effects are rather limited. Independent weblogs have not replaced news corporations, and professional journalists retain control over the news story. Extending Paulussen and Ugille's line of thought, I would therefore argue that convergence regarding participatory and data journalism is taking place horizontally (between technological and social aspects) rather than vertically (between top and bottom, between professionals and amateurs). In other words, if convergence culture is generally seen as the process of blurring borders, then the borders regarding data journalism are blurring between the technological and social contexts. The process of convergence in the case of data journalism should be captured by a lens "that emphasizes actors' agency as much as technology's capabilities" (Boczkowski 2004, 210).

At the end of this chapter I would like to extend this lens metaphor of Boczkowski's somewhat further. I believe that this lens can be viewed in terms of Actor-Network Theory (Latour 1999), in which human and non-human actants combine to form hybrid actors. When applying this view to data journalism we do in fact see that, in general, the technologies, the data and the software tools are "responsible for larger parts of the action chains", rendering actions "intrinsically hybrid" (Rieder & Schäfer 2008, 161). The digital environment of the database, together with the software tools that enable access to and structure this environment, "define possibilities for action and conditions of expression" (160). According to Rieder and Schäfer, software is responsible for "extending [...] the role that technology plays in the everyday practices that make up modern life" (161). In other words, through the lens of Actor-Network Theory data journalism can be seen as a network consisting of linkages between technological, social and cultural actors, making data journalism a hybrid practice.

3 Exploring the core properties of data journalism


In the context of data journalism as a hybrid practice, a network of human and non-human actors, we can now explore some of the core elements of data journalism through a number of case studies. This chapter tries to flesh out the elements of data journalism by largely following the chain of value creation (see Illustration 1 below). This chain consists of four properties: raw data, structuring or filtering data, visualising data, and storytelling. To close this chapter, and supplementary to these four elements, the aspect of participation, for which the
theoretical background was given in the previous chapter, will be discussed. It will become clear that data journalism seems to open itself up to participatory possibilities in specific ways.

3.1 Properties of data
It is a truism that the amount of digital data is currently growing faster than anything else. A 2008 study by the market research firm International Data Corporation (IDC) revealed that around 1,200 exabytes (1 exabyte is 1 million terabytes) of digital data were produced that year (Cukier 2010, 5). The majority of this data consists of photos, logs, phone calls and other database-to-database information, of which "only 5% of the information [...] is structured, meaning it comes in a standard format of words and numbers that can be read by computers" (ibid.). Data and information are epistemologically different, as information is made up of a collection of data, but data and information are increasingly difficult to tell apart. Raw data is interwoven with today's algorithms and powerful computers, which can reveal new insights that would previously have remained hidden (Cukier 2010, 3-4).

Gannett, the holding company behind the newspapers USA Today and The Indianapolis Star, has been a leader in the area of database applications. Gannett realised early on that data should be a driving force in online journalism, for a number of reasons. First, data is evergreen content, so its value to users does not end after twenty-four hours. Second, because of its sheer size, data is best delivered in a medium without space constraints; the data is much more valuable if it is accessible and searchable at the user's convenience. Third, Gannett realised that data is much more suited to interactive media than to, say, print. Data lends itself to research and interaction, not so much to passive activities like reading or viewing (Gordon 2007). Supplementary to Gannett's list of data properties relevant to journalism, data in general can be transmitted and shared in the form of text, sound or images without tangible loss. Because of its freedom from physical constraints, however, data is also easy to manipulate: with a simple click, large amounts of (personal) information can be copied or permanently deleted (Rieder & Schäfer 2008, 163).

3.2 Structuring data
In 2002 Wiebke Loosen, assistant lecturer at the Institute of Journalism and Communications at the University of Hamburg, concluded that the abundance of information on the Internet, in terms of its storage, management, multiple use and unlimited possibilities, is challenging journalism regarding its own processes of
rationalizing information (Loosen 2002, 5). The biggest challenge for businesses, governments and journalists alike lies in structuring these vast amounts of data. When structured, data is a potential goldmine. Google is probably one of the most obvious examples of a company that knows how to generate economic value from large amounts of data. This is largely the reason why companies like Google and Amazon choose to also transform physical objects into data objects. Books, for instance, are being scanned so that various statistical properties can be analysed for other purposes. Bernhard Rieder calls this "computational potential", or the value of the data of millions of scanned books. According to Rieder, the book in the age of the database adds a contemporary wave of new embedded practices and logistics of what we read and how we read it (Yudin 2011). In Rieder's view, three new practices emerge when books are translated into data objects. First, the whole text can be statistically projected, which allows various explorations of the catalogue's content. Second, books can be connected with other books through data, and books can also be connected to other data, such as the Internet or Google Scholar. Third, user gestures and practices, such as tagging, clicking, number of reads, sales and reviews, can be captured through the use of digital books. In the latter case, user data can be used to create navigational experiences and opportunities, leading to the personalization of reading. In other words, Google and Amazon, with their systems to digitize books, transform books into information, and then "unbind and rebind it again as an interactive, social and semantic interface" (Yudin 2011).

The transformation of physical books into data objects by Google and Amazon paves the road to structure information and generate new value from it. In the field of journalism a similar underlying motive can be found when we look at the recent investigation by many news organisations into the emails of the former governor of Alaska, Sarah Palin [i]. The state of Alaska released the emails following a two-and-a-half-year freedom of information process. The emails date from her inauguration as governor in 2006 through to 2008 and were released in printed form to the news organisations. The emails had to be digitized in order to successfully structure them. This is also the case with the documents of British politicians' expenses [ii]. On The Guardian's specially made homepage it says the newspaper has 458,832 pages of documents in its possession, of which 234,877 pages are yet to be analysed. All of the nearly 460 thousand pages of receipts and claim forms were uploaded onto The Guardian's servers as images, which could then be structured in the form of tagging. Yet tagging alone is inadequate to distil a news story from the data.
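
To make the role of tagging more concrete, the following minimal sketch (in Python) illustrates the kind of first-pass structure that tagging can give to a pile of scanned pages. It is a hypothetical illustration, not a description of The Guardian's actual system; the file names, tags and helper function are invented.

from collections import defaultdict

# Each scanned page starts out as little more than an image file with an ID.
pages = [
    {"page_id": 1, "image": "receipt_0001.jpg", "tags": []},
    {"page_id": 2, "image": "receipt_0002.jpg", "tags": []},
    {"page_id": 3, "image": "claim_0003.jpg", "tags": []},
]

def tag_page(page, *tags):
    """Attach free-form tags (added by a reader or a journalist) to a page."""
    page["tags"].extend(tags)

# Tagging adds a thin first layer of structure on top of the raw images.
tag_page(pages[0], "travel", "interesting")
tag_page(pages[1], "second home", "interesting")
tag_page(pages[2], "travel")

# Tags make the collection searchable by theme ...
pages_by_tag = defaultdict(list)
for page in pages:
    for tag in page["tags"]:
        pages_by_tag[tag].append(page["page_id"])

print(pages_by_tag["interesting"])  # -> [1, 2]

# ... but a story still needs extracted fields (who claimed what, when and
# for how much), which tagging alone does not provide.

The sketch also makes the limitation visible: tags support search and grouping, but the amounts, names and dates needed for a story still have to be extracted as proper fields.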


3.3 Visualising data
Mirko Lorenz, EJC member and project leader of the Data Driven Journalism initiative (DDJ), states that raw data needs to be transformed into something meaningful. As a result the value to the public grows, especially when complex facts are boiled down into a clear story that people can easily understand and remember (Data-driven Journalism 2010, 12). As Illustration 1 shows, besides structuring or filtering the raw data, visualisation plays an important role in generating value as well.


Illustration 1: Data-driven journalism as a process.
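
The step from data to visual representation that Illustration 1 places at the heart of the process can be illustrated with a minimal sketch. The Python example below maps a handful of invented budget figures onto horizontal bars whose length encodes spending; the categories and numbers are hypothetical, and the script assumes the matplotlib plotting library is installed.

import matplotlib.pyplot as plt

# Hypothetical budget figures (in billions); invented for illustration only.
categories = ["Defence", "Health", "Education", "Transport"]
spending = [120.5, 98.3, 64.1, 22.7]

fig, ax = plt.subplots()
ax.barh(categories, spending)        # spatial variable: bar length encodes spending
ax.set_xlabel("Spending (billions, hypothetical)")
ax.set_title("Mapping discrete data onto a visual representation")
fig.tight_layout()
fig.savefig("budget_sketch.png")     # or plt.show() in an interactive session

This is, in miniature, the reduction discussed later in this section: four rich budget categories are collapsed into four bar lengths in the hope of making their relative size immediately visible.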

According to mash-up artist Tony Hirst, an important thing to remember about data is that it can be used to tell stories, and that it may hide a great many patterns. Some of these patterns are self-evident if the data is visualised appropriately (Townend 2009). For data journalism, visualisation is an important link in the chain of value creation. An example of how raw data can be visualised and contribute to journalism is The New York Times' visualisation of President Obama's 2011 budget proposal and how it is spent [iii]. The interactive squares on the newspaper's website immediately show how the Obama administration planned to spend its budget and how each part of the budget relates to the other parts. Another example is a visualisation by David McCandless for The Guardian, depicting the emergency budget proposal by British Chancellor George Osborne [iv]. McCandless also did a visualisation of the data
gathered from opinion polls during the 2010 general election in Britain [v]. Another visualisation that has received a lot of attention is the so-called Homicide Map by The LA Times, showing Los Angeles County homicide victims [vi]. This Google Maps mash-up shows clusters of homicides based on the number of homicides in an area; when the reader clicks on a specific homicide, he or she is automatically referred to the LA Times article reporting on the murder. The British MP expenses are, of course, also being visualised by The Guardian, offering a clear-cut overview of what the newspaper has found so far. It is likely that The Guardian will do the same as more data about the Sarah Palin emails trickles in.

The importance of visualisation within data journalism raises the question of what visualisation exactly is and what risks it brings along. Lev Manovich defines information visualisation as "a mapping between discrete data and a visual representation" (Manovich 2010, 2). He does, however, state that this definition does not cover all aspects of information visualisation, such as "the distinctions between static, dynamic (i.e. animated) and interactive visualization [sic]" (ibid.). While these differences are very important, I would like to follow Manovich in his argument that the core idea of visualisation has not changed when we switched from pencils to computers (Manovich 2010, 5). So whether the visualisation is static or interactive, the core idea still revolves around mapping some properties of the data into a visual representation (ibid.). With the use of present-day software it is possible to generate visualisations of much larger data sets than previously possible. As stated above, this does not mean that, at its core, visualisation has changed over the last three hundred years.

Manovich defines two key principles underlying commonplace information visualisations: reduction and space. Reduction includes the use of graphical primitives, such as points and lines, to reveal patterns and structures in the data. The price being paid for this extreme schematization is the loss of 99% of what is specific about each object in order to represent only 1%, in the hope of revealing patterns across this 1% of the objects' characteristics (Manovich 2010, 5-6). The use of spatial variables, such as position, size and shape, is the other core element typical of information visualisation. These spatial variables have long been preferred over other symbols such as color, tone and transparency. Edward R. Tufte's book Visual Explanations (1997) describes a case that exemplifies how reduction and spatial preferences in visualisations can be problematic. The case concerns the cholera epidemic in London in 1854 and shows how the choice of different intervals to display the data gathered by Dr. John Snow gives very different
representations of this data. Had Snow chosen a different interval, or had he not been so aware of the data and as thorough in his logical thinking, he might never have discovered the origin of the epidemic. This case also shows how popular journalism's choice to aggregate or over-compress data can lead to misleading graphical representations. In their article "How Not to Lie with Visualization", Bernice Rogowitz and Lloyd Treinish demonstrate how different representations of an MRI scan of a human head can influence the interpretation of the data. They argue that "[i]n order to accurately represent the structure in the data, it is important to understand the relationship between data structure and visual representation" (Rogowitz & Treinish 1995, 4). They conclude by stating that although non-experts can nowadays create meaningful representations of their data, it is still not easy enough, because the visual effects are not well understood by the user (Rogowitz & Treinish 1995, 14).

Lev Manovich, however, emphasises that new visualisation techniques and projects developed since the middle of the 1990s seem to no longer strictly take data that is not visual and map it into a visual domain (Manovich 2010, 11). According to Manovich, the development of computers and the progress in their media capacities have made it possible to visualise data without reduction: "While graphical reduction will continue to be used, this no longer [sic] the only possible method" (23). This new method of visualisation, or direct visualisation, can be exemplified by the tag cloud. The tag cloud is an example of a reorganisation of data into a new representation that preserves its original form: text remains text (12). A good example of a tag cloud used in journalism is the word cloud by John Schwenkler, at the time a graduate student in philosophy at the University of California, which was published in The Boston Globe [vii]. The cloud revealed that the official weblog of John McCain, the Republican candidate for the presidency, used the word "Obama" more often than any other word, even more often than Obama's own official blog did. With the use of direct visualisation, patterns in the data can be highlighted without having to reduce or spatially arrange the data with abstract graphical elements. In the case of information visualisation, however, direct visualisation is still not as common as it is in scientific, medical and geo-visualisation. During the 1990s and 2000s the speed and processing power of personal computers progressively increased, but information visualisation continued to depend on static vector graphics. Only very recently have sophisticated tools appeared that allow for interactive construction of direct visualisations. Manovich concludes that the ability to show artefacts in full detail is crucial to the humanities, as it helps
the researcher "to understand meaning and/or cause behind the pattern she may observe, as well as discover additional patterns" (23). One can say that this ability is crucial to journalism as well. Visualisation is a key element for revealing patterns in raw or structured data and making them understandable for a large audience. For journalists and their employers it is, for the sake of objectively informing their audience, crucial that these visualisations display the actual facts. As Manovich has shown, however, information visualisation is not the same as scientific visualisation, and it has a long history of reducing data to graphical primitives and specific spatial preferences. Incorrect visualisations, which give a distorted view of the actual data, could have large-scale negative consequences. Direct visualisation, as introduced by Lev Manovich, seems to offer a solution to this problem: it is now possible to visualise large quantities of data without reduction, and the software tools that make these direct visualisations possible are rapidly dispersing across the Internet. For instance, ManyEyes, Tableau, Yahoo Pipes, the University of Amsterdam, Open Calais and, of course, Google offer (free) tools for data visualisation, paving the road for objective data journalism.

3.4 Storytelling with data
A large part of the EJC roundtable conference in Amsterdam focused on how to tell stories with data. Surprisingly, none of the speakers really questioned whether data necessarily needs to tell a story at all. Adrian Holovaty, a pioneer in data-driven journalism with a background in both journalism and computer programming, does question that, suggesting newspapers need to make an important shift and stop the story-centric worldview (Holovaty 2006). Holovaty claims the daily processes of journalists are, in practical terms, inefficient, wasting too much of the powerful raw data at the root of the stories. Instead, news should be oriented toward computers, in the hope that journalists and data will meet in the middle (Kiss 2008). If so, structured data remains structured and no longer has to be deconstructed for the purpose of writing a traditional news story. From his experience as a journalist, Holovaty knows that newspaper organisations traditionally already collect lots of information that is relentlessly structured. It just takes somebody to start storing it in a structured format (ibid.).

Holovaty's argument is best understood through examples. For instance, Faces of the Fallen is a public and searchable database of all the U.S. service members who died in Operation Iraqi Freedom and Operation Enduring Freedom [viii]. Reporters at The Washington Post were already keeping a detailed
database of the deceased service members, but this data mostly sat around unused. In two weeks' time Holovaty and his co-workers built the data into a powerful tool for the public; it became a catalyst for further reporting and was used by activist groups to protest against the war (Kiss 2008). Holovaty also created a public and searchable database named Everyblock, which made it possible to find crimes committed in the city of Chicago [ix]. The data comes from CLEARMap, the crime-mapping website of the Chicago Police Department, and includes information on where and when each crime occurred, thereby again putting available but unused data to work.

Although Holovaty's manifesto for computer-oriented journalism has inspired many, including the founder of the Pulitzer Prize-winning website Politifact [x], which compares political statements with actual facts (Waite 2007), there are examples where news organisations use data more in the classical journalistic tradition. Examples are the news stories based on the Afghanistan War Logs [xi], which were made available by the independent organisation WikiLeaks to several news organisations. The documents have meanwhile all been structured and are available through news organisations' websites. The New York Times, however, primarily uses this data to produce regular news stories. Reporters Cynthia O'Murchu and Carola Hoyos of The Financial Times seem to have stayed somewhat closer to Holovaty's view, as they produced several interactive graphics, including an interactive chart on oil and gas chief executives and their salaries [xii]. In turn, the graphics serve as the basis for traditional (follow-up) news stories. These examples show that there are different views among large news organisations when it comes to implementing and using data. Holovaty-like data journalism is praised and sometimes pursued, but also often questioned. Given that Holovaty-like examples are quite scarce, it is fair to say that data mainly stands in service of the news story.

3.5 Participatory aspects of data journalism
Chapter 2 elaborated on the potential change of traditional news media-user relationships under the process of convergence, the blurring of boundaries set by the conditions of digitization. Whether this potential is called citizen, grassroots or participatory journalism, it all boils down to the emergence of bottom-up initiatives as a counterweight to the large top-down news organisations. Comparison between early writings on participatory journalism and its current status reveals that participatory journalism should not be regarded as a replacement of top-down news organisations but rather as collaboration between these organisations and their audiences. News organisations have learned and continue to learn to optimally
utilise new media affordances and to tap into the desire of readers to be part of the news-making process. Data journalism can be considered an outcome of this utilisation and a testing ground for further collaboration between news organisations and consumers. In the specific case of data journalism, citizens are not replacing journalists, but they are adding to the chain of value creation as they channel raw material, such as documents, videos or photos, and help journalists tackle the problem of structuring vast amounts of data.

Crowdsourcing, a term coined by Jeff Howe in an article for Wired, is something with which news organisations are increasingly experimenting and can best be described as tapping into the latent talent of the crowd (Howe 2006) or using the crowd as an "investigative ancillary force" (Howe 2009, xxiv). For instance, in April 2009 The New York Times issued a press release in which it invited its readers to comb through the full schedules of Timothy F. Geithner from his time as president of the Federal Reserve Bank of New York [xiii]. In February of that year The Huffington Post had called upon its readers to help dig through the U.S. Senate stimulus bill [xiv]. Other prime examples are, again, the British politicians' expenses scandal and, most recently, the investigation of the Sarah Palin emails. In all of these examples news organisations use the combined analytical strength of their audience with the aim of generating stories out of large data sets. News organisations use their audience for investigative work, to sift through piles of documentation. The journalist's role in this process is to collate and analyse the findings, making the journalist the central point of direction.

As Alfred Hermida points out, there are also examples of crowdsourcing without central direction (Hermida 2010). One of these is the Kenyan open-source platform Ushahidi [xv], which was founded in 2008 by a group of bloggers who wanted to respond to the wave of ethnic violence sweeping the country in the wake of elections (Bunting 2011). Ushahidi's next project, Huduma (Swahili for "service"), will use crowdsourcing in Kenya to monitor the effectiveness of services such as health and education (ibid.). Hermida also refers to the social network Twitter, which allows crowdsourcing to happen in a distributed, asynchronous manner, with individuals acting independently yet collectively at the same time (Hermida 2010). An example where "the mass collaboration of total strangers on the web" (ibid.) worked was when the multinational Trafigura obtained a legal injunction banning The Guardian from reporting on the alleged dumping of toxic waste off the shores of Ivory Coast. Trafigura became a trending topic on Twitter as the issue was widely discussed, and in less than 24 hours Trafigura backed down (ibid.).
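
The crowdsourcing workflow described above, splitting a large document set among volunteer readers and letting the journalist collate their findings, can be sketched in a few lines of Python. This is a toy simulation under invented assumptions (100 pages, four readers, random flagging), not any newsroom's actual system.

import random

pages = list(range(1, 101))                 # 100 hypothetical document pages
volunteers = ["reader_a", "reader_b", "reader_c", "reader_d"]

# Distribute the pages round-robin so each volunteer gets a manageable share.
assignments = {name: pages[i::len(volunteers)]
               for i, name in enumerate(volunteers)}

# Simulate volunteers flagging the pages they consider newsworthy.
random.seed(1)
flags = {name: {page for page in assigned if random.random() < 0.1}
         for name, assigned in assignments.items()}

# The journalist collates the findings: which pages deserve a closer look?
flagged = sorted(set().union(*flags.values()))
print(f"{len(flagged)} of {len(pages)} pages flagged for follow-up:", flagged)

In practice the interesting editorial work starts after this step, when the journalist verifies the flagged pages and decides which of them actually carry a story.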


The latter example shows how crowdsourcing can be beneficial for journalism in ways other than investigative work, but it does not necessarily apply to data journalism specifically. Looking at the data-driven examples given above, audiences are primarily used to contribute to the structuring of information. The Guardian does, however, also outsource visualisation tasks: in a specially made group on the photo community Flickr, users can post graphical translations of the large data sets that can be downloaded from The Guardian's Datastore [xvi]. The question remains, though, whether crowdsourcing data and data visualisations means that data journalism is intrinsically participatory. News organisations are increasingly adding so-called data desks to the newsroom as an extension of the editorial office. Eric Ulken, formerly of The LA Times, published an article in which he describes the process of assembling such a data desk. According to Ulken, the data desk can be seen as a cross-functional team of journalists responsible for collecting, analysing and presenting data online and in print (Ulken 2008). Furthermore, the report of the EJC roundtable conference shows that data journalism can profit greatly from applying the know-how of graphic designers and IT specialists (Data-driven Journalism 2010). Adding multiple professional disciplines to the data desk may, however, also imply that participation of the public in the process of creating news stories is just as likely to stagnate.

4 Conclusion
Whether data journalism is intrinsically participatory is one of many questions still open for debate concerning a new form of journalism that is slowly taking shape under continuously changing conditions, in a world that relies increasingly on technology and digital information. This paper has tried to give context to this new phenomenon and has explored its core properties using a variety of examples. At this point, however, it is too early to tell what the implications of data journalism will be. The assumption that a website such as Politifact, which checks whether U.S. politicians' statements are based on facts, will increase the public's trust in, say, journalism, politics or democracy is yet to be proven. For instance, researchers at the University of Michigan found in a series of studies that misinformed people who were exposed to corrected facts in news stories rarely changed their minds; political partisans in particular became even more strongly set in their beliefs. Facts can make misinformation even stronger (Keohane 2010). Instead of focusing on possible implications, this paper has tried to place data journalism within a broader context and has tried to flesh out its core properties in order to further comprehend this new phenomenon. The theoretical framework
tells us that data journalism can be placed against the background of a field of journalism in which traditional borders have continued to blur over the last decade. Under the conditions of digitization, readers have also become users who are able to add value in the process of news making. This is, however, a slow process, and contrary to the ideas posed at the beginning of this century, participatory journalism has not yet been able to erode the power of large news organisations. This does not mean the voice of the public is not being heard. The dispersion of increasingly sophisticated and free-to-use software and data sets enables people to contribute to journalism in a new way. Top-down journalism is in some respects meeting bottom-up, grassroots journalism, but this remains a work in progress that often raises more questions than it answers.

Against the backdrop of convergence culture I have argued that data journalism can be regarded as the outcome of the mutual shaping of technological and social developments. Besides the vertical top-down-meets-bottom-up axis, convergence is also taking place on a horizontal axis, between the technological and social contexts of journalism. Technological aspects are becoming inseparably intertwined with social aspects, as reporters come to rely on databases as fertile soil for the creation of news stories. In terms of Actor-Network Theory, these human and non-human actants combine to form hybrid actors. In general the technologies, the data and the software tools are responsible for larger parts of the action chains, rendering actions intrinsically hybrid. Data journalism can therefore be regarded as a hybrid practice.

Exploration of the core properties of data journalism underlines its relation with data and shows the path journalists have to take in order to distil a story from data. On the one hand, structuring and visualising data can be crucial links in getting from raw data to story. Sophisticated software tools make it easier than ever to structure large quantities of data and to visualise data without reducing away crucial information. On the other hand, these links are not part of a fixed chain, nor are they the only road to deriving news stories from data. Moreover, journalist and computer specialist Adrian Holovaty argues that nowadays making sense of complicated data for an audience is just as important as telling a story. Whatever path journalists choose to walk, whether it is through visualisations, telling stories, crowdsourcing or building databases, it almost goes without saying that the future of journalism lies in analysing big data. This is a standpoint shared by Sir Tim Berners-Lee, inventor of the World Wide Web. According to Berners-Lee, the responsibility lies with journalists to hold governments, or anyone else,
accountable, as information is increasingly made available on the Internet (Arthur 2010). How long it will take before the interdisciplinary data desk, with computer specialists, graphic designers and journalists working together, becomes a full-grown and respected part of the editorial office remains to be seen. Whatever the implications may be, as databases keep on growing, culture and technology keep on converging and audiences keep on participating, there will be a role for data journalism out there, somewhere.

References
Arthur, Charles. "Analysing Data is the Future for Journalists, Says Tim Berners-Lee." The Guardian 22 Nov. 2010. 3 Jul. 2011 <http://www.guardian.co.uk/media/2010/nov/22/data-analysis-tim-bernerslee>.
Boczkowski, Pablo J. "The Processes of Adopting Multimedia and Interactivity in Three Online Newsrooms." Journal of Communication 54.2 (2004): 197-213.
Bunting, Madeleine. "Crowdsourcing Put to Good Use in Africa." The Guardian 19 May 2011. 3 Jun. 2011 <http://www.guardian.co.uk/globaldevelopment/poverty-matters/2011/may/19/crowdsourcing-good-use-inafrica>.
Castells, Manuel. The Information Age: Economy, Society and Culture. 3 vols. Malden, MA: Blackwell, 2000 (first published 1996).
Cukier, Kenneth N. "Data, Data Everywhere." The Economist Special Report 27 Feb. 2010: 3-18.
Data-driven Journalism: What is there to Learn? Amsterdam: European Journalism Centre, 2010.
Deuze, Mark. "The Professional Identity of Journalists in the Context of Convergence Culture." Observatorio Journal 7 (2008): 103-117.
Domingo, David. "Interactivity in the Daily Routines of Online Newsrooms: Dealing with an Uncomfortable Myth." Journal of Computer-Mediated Communication 13.3 (2008): 680-704.
Domingo, David, et al. "Participatory Journalism Practices in the Media and Beyond: An International Comparative Study of Initiatives in Online Newspapers." Journalism Practice 2.3 (2008): 326-342.
Gillmor, Dan. We the Media: Grassroots Journalism by the People, for the People. Sebastopol, CA: O'Reilly Media, 2004.
Gordon, Rich. "Data as Journalism, Journalism as Data." Readership Institute 14 Nov. 2007. 3 Jul. 2011 <http://getsmart.readership.org/2007/11/data-asjournalism-journalism-as-data.html>.
Heinonen, Ari. "The Journalist's Relationship with Users: New Dimensions to Conventional Roles." Participatory Journalism: Guarding Open Gates at Online Newspapers. Eds. Jane B. Singer et al. Malden, MA: Wiley-Blackwell, 2011.
Hermida, Alfred. "How the MSM is Tackling Participatory Journalism." Reportr.net 24 May 2008. 3 Jul. 2011 <http://www.reportr.net/2008/05/24/how-themsm-is-tackling-participatory-journalism/>.
Hermida, Alfred. "The Impact of Crowdsourcing on Journalism." Reportr.net 15 Oct. 2010. 3 Jun. 2011 <http://www.reportr.net/2010/10/15/impactcrowdsourcing-journalism/>.
Holovaty, Adrian. "A Fundamental Way Newspaper Sites Need to Change." Holovaty.com 6 Sep. 2006. 3 Jul. 2011 <http://www.holovaty.com/writing/fundamental-change/>.
Howe, Jeff. "The Rise of Crowdsourcing." Wired 14 Jun. 2006. 3 Jul. 2011 <http://www.wired.com/wired/archive/14.06/crowds.html>.
Howe, Jeff. Crowdsourcing: Why the Power of the Crowd is Driving the Future of Business. New York: Three Rivers Press, 2009.
Jenkins, Henry. "The Cultural Logic of Media Convergence." International Journal of Cultural Studies 7.1 (2004): 33-43.
Jenkins, Henry. Convergence Culture: Where Old and New Media Collide. New York: New York UP, 2006.
Kahney, Leander. "Citizen Reporters Make the News." Wired 17 May 2003. 3 Jul. 2011 <http://www.wired.com/culture/lifestyle/news/2003/05/58856>.
Keohane, Joe. "How Facts Backfire." Boston.com 11 Jul. 2010. 3 Jul. 2011 <http://articles.boston.com/2010-07-11/bostonglobe/29324096_1_factsmisinformation-beliefs>.
Kiss, Jemima. "Future of Journalism: Adrian Holovaty's Vision for Data-friendly Journalists." The Guardian 6 Jun. 2008. 3 Jul. 2011 <http://www.guardian.co.uk/media/pda/2008/jun/06/futureofjournalismadrianh>.
Lasica, J. D. "Blogs and Journalism Need Each Other." Nieman Reports 57 (2003): 70-74.
Latour, Bruno. Pandora's Hope: Essays on the Reality of Science Studies. Cambridge, MA: Harvard UP, 1999.
Loosen, Wiebke. "The Second-Level Digital Divide of the Web and Its Impact on Journalism." First Monday 7.8, 5 Aug. 2002. 3 Jul. 2011 <http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/977/898>.
Lowrey, Wilson, and William Anderson. "The Journalist Behind the Curtain: Participatory Functions on the Internet and their Impact on Perceptions of the Work of Journalism." Journal of Computer-Mediated Communication 10.3 (2005).
Manovich, Lev. "What is Visualization?" Manovich.net 25 Oct. 2010. 3 Jul. 2011 <http://manovich.net/2010/10/25/new-article-what-is-visualization/>.
Manyika, James. "Hal Varian on How the Web Challenges Managers." McKinsey Quarterly Jan. 2009. 3 Jul. 2011 <http://www.mckinseyquarterly.com/Hal_Varian_on_how_the_Web_challenges_managers_2286>.
Negroponte, Nicholas P. Being Digital. New York: Vintage Books, 1995.
Paulussen, Steve, and Pieter Ugille. "User Generated Content in the Newsroom: Professional and Organisational Constraints on Participatory Journalism." Westminster Papers in Communication and Culture 5.2 (2008): 24-41.
Rieder, Bernhard, and Mirko Tobias Schäfer. "Beyond Engineering: Software Design as Bridge over the Culture/Technology Dichotomy." Philosophy and Design. Eds. Pieter E. Vermaas et al. Springer, 2008.
Rogowitz, Bernice E., and Lloyd A. Treinish. "How Not to Lie with Visualization." IBM Research 1995. 3 Jul. 2011 <http://www.research.ibm.com/dx/proceedings/pravda/truevis.htm>.
Townend, Judith. "#DataJourn Part 2: Q&A with Data Juggler Tony Hirst." Journalism.co.uk 8 Apr. 2009. 3 Jul. 2011 <http://blogs.journalism.co.uk/editors/2009/04/08/datajourn-part-2-qawith-data-juggler-tony-hirst/>.
Tufte, Edward R. Visual Explanations: Images and Quantities, Evidence and Narrative. Cheshire, Connecticut: Graphics Press, 1997.
Ulken, Eric. "Building the Data Desk: Lessons from the L.A. Times." The Online Journalism Review 21 Nov. 2008. 3 Jul. 2011 <http://www.ojr.org/ojr/people/eulken/200811/1581/>.
Waite, Matt. "Announcing Politifact." Matt Waite 22 Aug. 2007. 3 Jul. 2011 <http://www.mattwaite.com/posts/2007/aug/22/announcing-politifact/>.
Yudin, Ekaterina. "Bernhard Rieder: 81,498 Words: The Book as Data Object." The Unbound Book 21 May 2011. 3 Jul. 2011 <http://eboekenstad.nl/unbound/index.php/bernhard-rieder-81498-words-the-book-asdata-object/>.

Examples of data journalism used in this paper

[i] The Guardian - Crowdsourcing the Sarah Palin emails <http://www.guardian.co.uk/world/datablog/2011/jun/10/crowdsource-sarahpalin-emails>.
[ii] The Guardian - British MP expenses <http://mps-expenses.guardian.co.uk/>.
[iii] The New York Times - Obama's budget and how it is spent <http://www.nytimes.com/interactive/2010/02/01/us/budget.html>.
[iv] The Guardian - Emergency budget proposal 2010 <http://www.guardian.co.uk/news/datablog/interactive/2010/jun/22/budget2010-information-beautiful-blog>.
[v] The Guardian - General election opinion polls 2010 <http://www.guardian.co.uk/news/datablog/2010/may/06/general-election2010-opinion-polls-information-beautiful>.
[vi] The LA Times - The Homicide Report <http://projects.latimes.com/homicide/map/>.
[vii] The Boston Globe - Portrait of the candidate as a pile of words <http://www.boston.com/bostonglobe/ideas/articles/2008/08/03/portrait_of_the_candidate_as_a_pile_of_words>.
[viii] The Washington Post - Faces of the fallen <http://projects.washingtonpost.com/fallen/>.
[ix] Everyblock - Make your block a better place <http://www.everyblock.com/>.
[x] Politifact - Sorting out the truth in politics <http://www.politifact.com/>.
[xi] The New York Times - The Afghan war logs <http://www.nytimes.com/interactive/world/war-logs.html>.
[xii] The Financial Times - Oil and gas chief executives <http://www.ft.com/intl/cms/s/0/190f9e7c-bd8d-11de-9f6a00144feab49a.html#axzz1R4GLjaUS>.
[xiii] The New York Times - The schedules of Timothy F. Geithner <http://economix.blogs.nytimes.com/2009/04/26/geithner-day-by-day/>.
[xiv] The Huffington Post - The Senate stimulus bill <http://www.huffingtonpost.com/2009/02/08/senate-stimulus-billfull_n_163144.html>.
[xv] Ushahidi - Information collection, visualization and interactive mapping <http://www.ushahidi.com/>.
[xvi] The Guardian - Data store on Flickr <http://www.flickr.com/groups/guardiandatastore/>.
