It occurred to me recently that though I spend my life teaching people how to find the information they need, I only rarely speak about the fact that information in itself is valueless. Perhaps I should not just take it for granted that my students understand that data is inert and neutral until it is acted on by human intelligence.

The first act of intelligence, of course, is asking a question and deciding what kinds of information will help to answer it. The census can tell you far more than you could possibly wish to know about how many people live in the US, how many of us have indoor plumbing, how many are living out of wedlock with a sex partner, how many work in services or manufacturing, and the like. These millions of bits of data are meaningless until you ask a specific question--for instance, how many Americans of Polish extraction are there, and where are they located?

Even this information is valueless without a larger purpose behind the question. The woman who asked this question might be trying to track down members of her family in the old country who are rumored to have come to America. Or it may be she has her great-grandmother's recipe for pierogi, and wants to make and sell them. Perhaps she wants to raise money for a cause dear to Polish-Americans, or start a magazine for them, or find a mate of Polish ancestry. For each of these purposes, she would need a slightly different set of data--it is her purpose that gives the inert data value.

The census is already there waiting to be mined, but often the data only exists in the first place because somebody asked a question. When scientists started noticing that frog populations were disappearing or suffering from gross deformities, their first question was, "Is this impression accurate?" Their second was, "Is it just the frogs in my part of the country?" They then did systematic counts, analyzed the nature of the deformities, and shared this data with other scientists who were compiling similar data on frogs in their own regions.

Having discovered this was a widespread problem, they then began to test the environment to see what the common factor might be between the disappearing frogs of Minnesota and those of South Carolina. Again, they shared their data with other scientists, ruling out various possible explanations, and focusing their search on the remaining theories.

Scientists and scholars bring to the data not only a question, but a method for testing its validity. Knowing that information does NOT speak for itself, and that too often it tells scientists what they want to believe, they protect themselves from error with careful method. They propose a hypothesis and set up a testing situation that controls every element except the one being tested for. They use control groups, mathematical models, and statistical sampling methods.

Another way minds give value to data is by connecting it to other apparently unrelated data. The enormous value of works like Origin of Species and Silent Spring lay in their examinations of large amounts of data, and their proposed theories to account for the data. Since those theories were not just explanations of a dead past, but of ongoing processes, the theories were testable, and became the foundation for new questions, new collections of data, and new theory and knowledge.

Whether information is used honestly or dishonestly is entirely up to the user. There is a difference between the question "Help me find analyses of whether there is bias in the media," and the question "Help me find information about the liberal bias in the media." The first implies a willingness to seek truth by following the information wherever it leads. The second implies that you already know the truth, thank you, and all you need is a little bit of evidence to back your position up. Researchers of the first question may be surprised by the data, but not those researching the second--they will never admit the existence of evidence that contradicts their truth.

Another thing I probably don't tell my students often enough is not to trust the data too much. You always have to test it, because it may be tainted. It may have been gathered by people who are not so much interested in objective truth as in proving a point. That's why it's up to you to ask questions like: Who are the researchers? Are they reputable scholars? Are they impartial or biased? How valid is their research method? If they give you survey data, do they tell you the survey method?

You also need to ask about the data that got left out. Information may be neutral, but some of it is inherently more meaningful than the rest. During the O.J. Simpson trial, TV newscasters had the choice of telling us about Marcia Clark's hairdo and babysitting problems, or about the telecommunications bill that was handing over huge chunks of the public airwaves to the networks free of charge AND gutting the First Amendment at the same time. The fact that they chose to tell us about Marcia Clark is revealing.

You see, we can use our data to reveal important truths, or to distract us from them.

We use our historical data to tell ourselves stories about who we are and what we value. The information that matches our stories becomes our cherished history--outnumbered men facing certain death at the Alamo but refusing to surrender, Patrick Henry demanding liberty or death, the founding fathers pledging their lives, their fortunes and their sacred honor. The information that doesn't match, that seems like a sick joke in the context of our national self-congratulation, kind of disappears from the textbooks.

So what I should probably tell my students more often is this: I will find the information for you--that is my job. But it is up to you to understand that it is not the data that guides you through the everyday, but rather the meaning you assign it. It is your job to approach it honestly, to question it, distrust it, test it. It is up to you to understand the truth of T.S. Eliot's words:
Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?

