On Friday, Dan Conover published a post explaining his vision for the place of long-form narrative relative to other story packages, structured data in particular. His first point is that narrative, the go-to format for classically trained journalists, is often a poor fit for the information at hand:
I already understood that stories are the way people make sense of their lives, but it was during 2004-05 that I began to see how journalistic narrative was distorting the way we viewed the world.
Dan’s self-described “money quote” for the piece is this:
Today’s revolution isn’t about killing narrative, but about inventing box scores for actions that don’t take place in ballparks.
Narrative is more obviously capable of distorting our understanding of the world, but structured data has a similar effect. Sabermetrics can’t tell you how players’ personal lives affect their teams’ performances because someone judged that such details should not be included in box scores.
In comparison with prose, “structured” data really means data with a simpler, more granular structure. An essay is as much a data structure as any set of box scores or any relational database. A simple, granular data structure is attractive compared to prose narrative for two reasons:
- Information in simple granular structures can afford to be more concise because the context of that information is provided by the structure, rather than by more content.
- Simple granular structure facilitates algorithmic automation: it’s easier to make a computer do the boring parts of interpretation, like searching or calculating confidence distributions.
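Both points can be seen in a minimal Python sketch. Everything in it is an assumption made for illustration: the `BattingLine` fields, the player names and the numbers are invented, not taken from any real box score. The field names supply the context, so each row stays terse, and a computer can search and summarize the rows without reading any prose.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    # The field names carry the context that prose would have to spell out.
    player: str
    at_bats: int
    hits: int


# Hypothetical rows, invented for illustration only.
box_score = [
    BattingLine("Player A", 4, 2),
    BattingLine("Player B", 3, 0),
    BattingLine("Player C", 5, 1),
]


def players_with_hits(lines):
    """Search: return the names of players who recorded at least one hit."""
    return [line.player for line in lines if line.hits > 0]


def team_average(lines):
    """Calculate: total hits divided by total at-bats across all rows."""
    total_at_bats = sum(line.at_bats for line in lines)
    total_hits = sum(line.hits for line in lines)
    return total_hits / total_at_bats if total_at_bats else 0.0


print(players_with_hits(box_score))
print(team_average(box_score))
```

Notice what the structure leaves out: there is no field for anything that happened off the field, so no query against these rows can ever surface it.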
The implication of Dan’s first thesis, or at least the way he worded it in his long-form version, is that structured data doesn’t distort the way we view the world. It does. I don’t think Dan actually believes in the objectivity of data, but it’s a conversation worth having. As Dan says, narrative tends to impose a conflict and a resolution on a story, but any data structure will impose something on the facts. Structure is as structure does, whether narrative, database or otherwise. The trick is to choose a structure that represents what we decide is valuable to know. Journalism itself has a built-in bias towards covering information that meets the criteria of newsworthiness:
- Timing
- Impact
- Proximity
- Prominence
- Human Interest
- Exception
- Conflict
- Whatever floats your editor’s boat
The difference I want to highlight is that, with structured data, the judgement of what is and isn’t important is no longer made on a case-by-case basis. A reporter writing a traditional article can decide that one eyewitness quote is more important than another. A database may as well store them all and paint a much bigger picture, but another database may only store the official police report. The judgements are in the structure, and are then inherited by any observation of reality you record in that structure.
I’m not calling that a good or a bad thing, only something to be mindful of.
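The point about judgements living in the structure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Observation` type, the two store classes, the source labels and the sample text are all hypothetical, invented here rather than drawn from any real newsroom system.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    source: str  # hypothetical labels: "eyewitness" or "police_report"
    text: str


@dataclass
class FullRecordStore:
    """Keeps every observation it is handed."""
    records: list = field(default_factory=list)

    def record(self, obs):
        self.records.append(obs)


@dataclass
class OfficialOnlyStore:
    """Keeps only the official report; the editorial judgement is built in."""
    records: list = field(default_factory=list)

    def record(self, obs):
        if obs.source == "police_report":
            self.records.append(obs)


# Invented sample observations of the same event.
observations = [
    Observation("eyewitness", "First witness account."),
    Observation("eyewitness", "Second witness account."),
    Observation("police_report", "Official summary."),
]

full, official = FullRecordStore(), OfficialOnlyStore()
for obs in observations:
    full.record(obs)
    official.record(obs)

print(len(full.records), len(official.records))
```

Both stores see the same reality, but nobody decides per story what to keep. That decision was made once, when the `record` method was written, and every later observation inherits it.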