Excusez-moi, monsieur. Je suis une
journaliste
… Où est la bibliothèque?

As journalists, we know getting to the heart of a good story comes down to asking the right questions. But it can be pretty difficult to know what to ask when you’re speaking the wrong language.

Here at Medill Washington, we’ve been learning to speak the language of data by looking at spreadsheets and databases from the government and its agencies. And we’ve seen that endless rows and columns of numbers can turn into trends, anomalies and really good stories by getting the hang of how to speak to those spreadsheets in their own language.

In our Digital Frameworks for Reporting class, we’ve been using SQL, or Structured Query Language, to help us talk to data and get answers to our questions.

A database manager can help parse the data much better (and handle a lot more information) than a tool like Microsoft Excel. Not only that, it is much faster and less prone to error than trying to mash the numbers by hand.

Once you get a hang of how to structure your requests to ask questions with SQL, it’s pretty easy, so don’t be afraid of digging in. This kind of training can only help as you navigate the avalanche of information that comes from government agencies.

At Medill, we’re using a Firefox add-on called SQLite.  It’s small, fast and free; in other words, it’s perfect for a journalist on deadline.

In this post, I’ll be trying out some very basic ways to look at data using SQL. The idea is that understanding how information is structured helps a reporter be organized when looking for trends.

How to do it

If you want to follow along,  download the SQLite Manager Firefox Add-on here. Once installed, SQLite Manager will live in your “Tools” menu in Firefox.

Next, download a small set of data here. You can save the data, called votes.csv, and import it into SQLite Manager. You’ll want to replace the automatic column names with the headings: date, result, total_yes, total_no and total in the import wizard.

Open in the manager, it should look like this:

Now, we’re going to take a stab at asking the data some questions. We can give our database commands by typing in the Execute SQL tab.

First, we need to tell the database what information we want and where we want it from. We’ll use two commands called: SELECT and FROM.

We can choose to select all the data in the table using the * symbol. Our database is called “votes,” so we would type:

SELECT * FROM votes

Then press the “RUN SQL” button.

It should look like this:

If we want to look at just a piece of the data and not the entire set, we should replace the * symbol with the name of the column.

Try SELECT results FROM votes to see just the “results” column or SELECT total FROM votes to see the “total” column alone.

To get more specific information, we can use the WHERE clause, in conjunction with the SELECT statement. Maybe we want to see how many members showed up to cast their vote when a bill passed.

That means we want to see the “total” column in the “votes” table only when the “result” is pass.

We could type in:

SELECT total FROM votes WHERE result=”pass”

The database displays the “total” votes where the result was pass. In addition to = (equal), > (greater-than) or < (less-than) signs work with the WHERE clause, too.

The last function we’ll try is COUNT. This can be helpful when you want to know how many objects in your data meet a certain criteria, but you don’t want to see the objects themselves.

Let’s ask the database how many times a vote has resulted in a “fail” during our time period. Start by placing the COUNT command after our SELECT statement. We want to count in the “result” column, where the result is “fail” or:

SELECT COUNT(result) FROM votes WHERE result=”fail”

We could have easily counted by hand in our Browse & Search menu and see that two votes have failed. But think how difficult it would be if there were hundreds of fields of data. By learning how to make queries in SQL, you can pull information from a database much more quickly and with greater accuracy. Seeing that structure and understanding how to see trends are helpful tools for reporting.

If you want to see more SQL tutorials, take a look at SQLzoo.net. Happy hunting!

Kellen Henry is an environment and energy reporter for Medill Washington. Set to graduate from Northwestern at the end of the year, Henry has written for Marketwatch, McClatchy and The (Nashville) Tennessean.  She wrote this opinion piece for Washington Reporting 2.0., an occasional column about the craft of reporting.