International Development Research Centre (IDRC) Canada     
12. Reporting Information About Studies of Information
Charles T. Meadow

A common occurrence in information science is that research papers do not make use of prior data compiled by other people. Prior theory, yes; actual quantitative data, no. One of the problems all of us have who study and try to measure the impact of information is the multiple definitions of the very word, "information." Because the definition varies so much, measurements based on it vary also. We all know the Shannon definition,
$H = -\sum_i p_i \log p_i$, where $p_i$ is the probability of occurrence of a given symbol (Shannon and Weaver 1949). A variation on this is that information is that which reduces uncertainty. But the reality is that it is hardly possible to measure the extent to which uncertainty is lessened in a human being on a topic of great social and economic importance to that person. We might be able to make such measurements in a highly controlled laboratory study where, unfortunately, the relationship between the variables measured and the real world is unknown. But measuring the contribution of a library or consultant to a particular decision is a different matter altogether.
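To make the Shannon measure concrete, here is a minimal sketch in Python. The function name `shannon_entropy` and the choice of base-2 logarithms (which gives the result in bits) are assumptions made for illustration; the text above does not fix a logarithm base.

```python
import math


def shannon_entropy(probabilities):
    """Return H = -sum(p * log2(p)) in bits.

    Base 2 is an assumption made for this sketch. Symbols with
    probability zero contribute nothing to the sum.
    """
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Two equally likely symbols carry more uncertainty than a lopsided pair.
fair_coin = shannon_entropy([0.5, 0.5])
biased_coin = shannon_entropy([0.9, 0.1])
```

The measure depends only on symbol probabilities, which is why it says nothing about how much a message reduces a particular person's uncertainty on a topic that matters to that person.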

The first problem seems to be the definition of the key word, information. A second is that we do not always report the definitions of the variables we measure, or the circumstances of the measurement, with enough precision to enable others to use them.

Defining Information

It is very hard to expunge the more common meaning of information as the printed, and occasionally spoken, word. When we read about information and development, we are usually reading about the provision of information, i.e., documents, databases, or formal advice. We typically do not read about the process of converting this data–information into knowledge–information. One reason is the difficulty of measuring that process in any context, let alone that of a developing economy and social system. Another is that the conversion process, itself, is not enough. We must also measure the willingness to try to find information and the skill at doing it.

Probably all of us who call ourselves information scientists have at one time or another made the distinction between data and information. For most of us, it is roughly what Thorngate (this volume) called "out there" (data) vs "in here" (information); that is, "information" is what each individual internally interprets the data to mean.

I am one of those who has published a formal definition (Meadow 1992) together with distinctions among such related terms as data, knowledge, intelligence, and wisdom. Yet, I often find myself, as well as my colleagues, using the word information very casually, ignoring my own definitions.

In most of the material I have read on the subject of information and development, information tends to be used to indicate documents or relatively formal, orally conveyed data, which is not the usual definition. There are, of course, exceptions. But, in general, even in research reports, the reader does not know for certain which meaning of information is intended, and this limits the exchange of data among ourselves about information because we are never completely sure what is being measured, or under what circumstances. The problem is not limited to the development research community; it is simply that we are the latest group to attempt to quantify the impact of information transfer.

A large part of the literature on the evaluation of information systems, whether or not in the development context, is concerned with the distribution of packages of data as yet unconverted to information by their recipients. This can mean printed books or reports in a library, records in a computer-stored database, or advice conveyed orally. "Unconverted" here means that the intended user has not necessarily evaluated the information in terms of its contribution to that person's own work or thought.

Most evaluation studies end with an assessment of "relevance," often defined as topicality or subject-relatedness. Another definition is utility or value, usually based on a questioner's immediate reaction to a document upon reading it. If I am a farmer concerned with crop loss caused by some parasite, and I find a document that describes the phenomenon and seems to offer a solution, but in actual trials does not enable me to get rid of the pest, the document may be relevant but will have no impact.

Another aspect of information is the intended recipient's willingness to seek it and skill in doing so. In assessing impact, it is as important to find people who did not visit the information centre as those who did. It is as important to know what the person looked for as what he or she found. It is as important to know what the actual consequence of finding the information was, as how relevant the person said it was.

These are not trivial issues. In my own impact assessment project (Meadow and Spiteri, this volume), we wish to interview people who are thinking about starting a new business, as well as some who have already done so. The "done so" part is relatively easy. Businesses must register with the government and we can have access to the records.

Ideally, we would like to reach those with the idea as soon as they have it, before they start asking other people for information and advice because we would like to track the advice and information seeking and its effect. The earliest point at which we know we can find such people is when they approach a government office whose purpose is to advise those starting a new business. But, these people have already followed a rational path in seeking such advice. What happened to those who did not follow this path? We do not and will not know.

One method likely to find entrepreneurs before they approach the government or a bank is to do a survey of the population at large. The cost of such a survey to find a small group of people is prohibitive. Another way may be to advertise for people thinking of starting a new business and willing to help a research project. But this approach gives us a self-selected sample. Hence, of necessity, we deal with somewhat biased data, making their use by the next researcher questionable.

My own basic field has been information retrieval. When evaluating an information retrieval system it has generally been the custom to ask the person who has a question to evaluate the outcome. This can be done by asking if the retrieved informational items (typically, but not necessarily, documents) are relevant to the question. Research people in this field generally recognize but rarely actually take into account that the question asked at the retrieval facility might not be the one the user really was interested in.

Belkin (1980) was the first to articulate this phenomenon, in drawing a distinction between what he called the anomalous state of knowledge (ASK) and a question. The somewhat peculiar expression, ASK, reflects the fact that the person may not even know the exact question and may only be vaguely aware that something is missing.

If the prospective information user does not know the true question, then what question should be asked at a library? Sometimes, even when the true question is known, the question brought to a library (the information need statement) may reflect a feeling that the information service person assigned to help cannot understand the real question or that the asker simply lacks the ability to articulate it.

Either case is likely to lead to inadequate results even though the information sought may be present and the user able to understand and act upon it. Eventually, however, a question is asked and some information items are retrieved. The meaning of "relevance" of these information items can be, as noted earlier, topicality (Is it on the right subject?) or value/utility (How useful is this item to me?). But this judgment is almost invariably made at the time of retrieval, not after the retrieved item has been put to actual or attempted use. Did the lawyer win the case? Did the physician cure the patient? Did the securities trader make a profit? One reason, of course, why we do not wait until a use is made is that it may be impossible to attribute such a real-life outcome to retrieval or nonretrieval of a single document.

It is my belief that it is not impossible to attribute an influence of the retrieval of a document to a subsequent real-life event, but developing the method would require some original research. Put more positively, using a broader measure, it is possible to assess the contribution of retrieved information to user actions, over a number of actions. That is, we are not likely to be able to assess the contribution of a particular document to the winning of a case at law, but we can assess the contribution of the process of recovery of information to winning cases, over a number of cases. In the context of development, this means that if we follow a number of users of an information facility as well as a matched sample of nonusers, it might be possible to assess the contribution of the facility to their actions.
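One very simple way to picture this aggregate comparison is sketched below in Python. It is an illustration only, not a method proposed in this chapter: the function names are hypothetical, the outcome values are invented for the example, and a real study would need a proper matching procedure and a significance test.

```python
def success_rate(outcomes):
    """Fraction of successful actions, with outcomes coded 1 (success) or 0."""
    if not outcomes:
        raise ValueError("at least one outcome is required")
    return sum(outcomes) / len(outcomes)


def facility_contribution(user_outcomes, nonuser_outcomes):
    """Difference in success rate between facility users and matched nonusers.

    A crude descriptive estimate over many actions; it attributes nothing
    to any single document or any single action.
    """
    return success_rate(user_outcomes) - success_rate(nonuser_outcomes)


# Invented outcomes, for illustration only.
users = [1, 1, 0, 1]
matched_nonusers = [1, 0, 0, 1]
difference = facility_contribution(users, matched_nonusers)
```

The point of the sketch is the unit of analysis: the comparison is made across a number of actions by two groups, not for one retrieved item.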

Another side of our difficulty, as scholars of the impact of information, is that it is difficult for us to exchange data among ourselves. If "A" does a study of information use in one country and "B" a different study in another country it might still be useful to exchange data, to establish, for example, a series of data points showing the tendency to use a library as a function of user characteristics, or the rate of user-determined success in finding information as a function of training in searching for it. But we rarely see publications in this field in which one researcher has used data taken from another. If the two do not use the same definitions of the user characteristics or of success in finding information, or of "information" itself, comparison of data is not possible and duplication of research is often necessary.

What is needed is a standardization of the variables used. In physics, it is possible to share or compare data taken by different experimenters because the definitions of the variables used or observed are standardized and well understood. A well-known example is the graph of thermal conductivity of tungsten as a function of temperature (Ho et al. 1974), reproduced in Tufte (1983). A clear, smooth line fitted to the data shows how thermal conductivity varies with temperature, as a composite of the work of many people. Many observations fall off the smooth line. Points falling off the line may indicate experimental error or that other factors may bear on the measurement. I believe it would be possible and certainly desirable to prepare similar composites for measurements of system performance, user performance, even just measurement of relevance under varying conditions.

Achieving the standardization I suggest is no small task. We could approach it by creating a standardized way to describe data. For example, and returning to the question of the meaning of "information," if a person reports a measurement of user satisfaction with retrieved information it would help others if the report explicitly stated the characteristics of the user, the manner or circumstances of collecting the data, the meaning of information used, and the scale of the satisfaction measure. We would need, and I believe we could construct, a thesaurus of descriptors for such data elements as user characteristics and, of course, information.

Here is an example. Assume we have a study of effect of training on performance of users with an information retrieval system. The users are all drawn from the same population, business managers with no previous computer experience and an average of 12 years of formal education.

One-half of the group is given training program "A," the other program "B." Each person in each group is given the same set of questions to try to answer. A subject matter expert, familiar with the database in use, judges the outcome, rating each document retrieved as to relevance.

The judge is also asked to estimate how many relevant records were not retrieved, and to base the relevance rating on topicality, i.e., on whether retrieved records appear to be on the subject of the question, rather than rating on utility or how valuable the individual user finds the record. A binary relevance scale is used: documents are deemed relevant or not relevant.

The variables of this example are:

User characteristics

Education (years)
Occupation (Code from a standard classification schedule)
Computer experience (years)

Outcome

Relevance
Definition used: topicality
Values possible: 0,1 (0 = not relevant)
For each question, number of documents retrieved, $n_q$
For each question, number relevant, $r_q$
For each question, number assumed relevant missed or not retrieved, $m_q$
For each question, Precision = $r_q / n_q$
For each question, Recall = $r_q / (r_q + m_q)$

Treatment

Trained by method (describe)
Questions assigned by experimenter. (List questions used)
Qualifications of relevance judge(s)
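The variables listed above could be recorded in a fixed structure so that another researcher knows exactly what each number means. The following Python sketch is one possible layout, not a proposed standard: the class and field names, the occupation code, and the counts in the example are all hypothetical. Only the precision and recall formulas, the user group description, and the binary topicality scale come from the example above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class QuestionOutcome:
    retrieved: int           # documents retrieved for this question
    relevant_retrieved: int  # retrieved documents judged relevant
    relevant_missed: int     # judge's estimate of relevant documents not retrieved

    def __post_init__(self):
        if min(self.retrieved, self.relevant_retrieved, self.relevant_missed) < 0:
            raise ValueError("counts must be non-negative")
        if self.relevant_retrieved > self.retrieved:
            raise ValueError("relevant retrieved cannot exceed retrieved")

    def precision(self) -> Optional[float]:
        """Relevant retrieved / retrieved; None if nothing was retrieved."""
        return self.relevant_retrieved / self.retrieved if self.retrieved else None

    def recall(self) -> Optional[float]:
        """Relevant retrieved / (relevant retrieved + relevant missed)."""
        total = self.relevant_retrieved + self.relevant_missed
        return self.relevant_retrieved / total if total else None


@dataclass(frozen=True)
class StudyRecord:
    mean_education_years: float
    occupation_code: str                # from a standard classification schedule
    computer_experience_years: float
    relevance_definition: str           # e.g. "topicality" or "utility"
    relevance_scale: Tuple[int, ...]    # permitted values, e.g. (0, 1)
    training_method: str
    questions: List[str]
    judge_qualifications: str
    outcomes: List[QuestionOutcome]


# Counts, occupation code, and question text are invented for illustration.
example = QuestionOutcome(retrieved=20, relevant_retrieved=8, relevant_missed=12)
record = StudyRecord(
    mean_education_years=12.0,
    occupation_code="HYPOTHETICAL-MANAGER-CODE",
    computer_experience_years=0.0,
    relevance_definition="topicality",
    relevance_scale=(0, 1),
    training_method="A",
    questions=["(question text as assigned by the experimenter)"],
    judge_qualifications="subject matter expert familiar with the database",
    outcomes=[example],
)
```

Keeping the relevance definition and scale in the same record as the counts means a reported precision or recall value never travels without the conditions under which it was measured.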

If we do not have all this information recorded in a standard manner (e.g., use of some standard occupation classification), then another researcher cannot use this data because it is not known exactly what the reported measurements mean. If a second researcher wanted to use the data in this hypothetical case, that person would typically not be sure about such questions as: Which definition of relevance did the judges use? Can it reasonably be assumed that the judges' findings are the same as those that would have been rendered by the searchers, i.e., can judges' results be directly compared with user results? Were effects of order of presentation of records to judges considered? How should data based on a binary relevance scale be compared with data based on a scale of 1–5?
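On the last of these questions, one possible convention (an assumption for illustration, not an answer given in this chapter) is to collapse the graded scale to a binary one at a stated cut-off and to report that cut-off with the data. The Python sketch below uses a hypothetical function name and an arbitrary default threshold of 3.

```python
def graded_to_binary(rating, threshold=3, scale_min=1, scale_max=5):
    """Map a graded relevance rating to 0 or 1.

    Ratings at or above `threshold` count as relevant. The threshold is a
    choice made by whoever combines the data and must be reported with it.
    """
    if not scale_min <= rating <= scale_max:
        raise ValueError("rating outside the declared scale")
    return 1 if rating >= threshold else 0


# Invented ratings on a 1-5 scale, for illustration only.
graded = [1, 3, 5, 2, 4]
binary = [graded_to_binary(r) for r in graded]
stricter = [graded_to_binary(r, threshold=4) for r in graded]
```

Moving the threshold changes which documents count as relevant, so two data sets converted with different cut-offs are still not comparable. This is the same reporting problem in another form.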

In summary, we need much more precise and standard definitions of the variables we report. If a researcher could reasonably count on terms always meaning essentially the same thing when used to describe a measure, we should be able to share data and to use others' data in our own work, without repeating the earlier work.

References

  • Belkin, N.J. 1980. Anomalous states of knowledge as a basis for information retrieval. Canadian Journal of Information Science, 5(May 80), 133–143.
  • Ho, C.Y.; Powell, R.W.; Liley, P.E. 1974. Thermal conductivity of the elements: A comprehensive review, supplement no. 1. Journal of Physical and Chemical Reference Data, 3, I–692.
  • Meadow, C.T. 1992. Text information retrieval systems. Academic Press, San Diego, CA, USA.
  • Shannon, C.E.; Weaver, W. 1949. The mathematical theory of communication. University of Illinois Press, Urbana, IL, USA.
  • Tufte, E.R. 1983. The visual display of quantitative information. Graphics Press, Cheshire, CT, USA. 150 pp.

Charles Meadow is professor emeritus in the Faculty of Information Studies, University of Toronto, 140 George St, Toronto ON, Canada M5S 1A1.





