"There are hundreds of Web sites dedicated to ancient Egypt; … Some
of these sites contain valuable information, but an astonishing number
are dedicated to bizarre theories about ancient Egypt, making it difficult
for the uninformed browser to disentangle fantasy from reality."
Stille, Alexander. 1997. "Perils of the Sphinx." New Yorker (February 10), p. 60.
"The problem is not too little information but too much, vast chunks
of it incomplete, misleading, or inaccurate, and not only in the medical
arena. The Net - and especially the Web - has the potential to become the
world's largest vanity press. It is a medium in which anyone with a computer
can serve simultaneously as author, editor, and publisher and can fill
any or all of these roles anonymously if he or she so chooses. In such
an environment, novices and savvy Internet users alike can have trouble
distinguishing the wheat from the chaff, the useful from the harmful. …
At first glance, science and snake oil may not always look all that different
on the Net. Those seeking to promote informed, intelligent discussion often
sit byte by byte with those whose sole purpose is to advance a political
point of view or make a fast buck."
Silberg, William M., George D. Lundberg, and Robert A.
Musacchio. 1997. "Assessing, Controlling, and Assuring the Quality of Medical
Information on the Internet." JAMA 277(15), p. 1244.
Discussion Topic: Can you find any other examples of inaccurate information on the Internet?
Note: Actually, as the quotes above suggest, there are really two separate problems here. First, an information source (e.g., a book, a newspaper, a web site) may include inaccurate information. Let's call this the accuracy problem. Second, it may be difficult for people to distinguish the accurate information from the inaccurate information. Let's call this the verifiability problem.
Why is inaccurate information a problem?
Note: After all, works of fiction describe events that never happened. As a result, they contain all sorts of inaccurate information. However, no one is too worried about the fact that most public libraries contain shelves and shelves of fiction.
Inaccurate information is a problem because it can keep people from acquiring knowledge (or even just true beliefs) about the subjects that are of interest to them.
There are two different ways in which inaccurate information can keep people from acquiring knowledge. First, and this is the worry that most of us have about the Internet, people may end up being too credulous. That is, they may believe a lot of inaccurate information that seems to be accurate. In this case, not only do they fail to acquire true beliefs, but they actually end up acquiring false beliefs. Second, people may end up being too skeptical. That is, as a result of worrying about being misled by inaccurate information that seems to be accurate, they may fail to believe some accurate information.
Note: We can now see why there are really two problems instead of just one. On the one hand, if we did not have the accuracy problem, then it would not matter if people could distinguish accurate information from inaccurate information. People could just believe whatever information they find because it would all be true. On the other hand, if we did not have the verifiability problem, it would not matter how much inaccurate information was out there. People would be able to identify it and just ignore it.
So, inaccurate information can keep people from acquiring knowledge. But why is that a problem?
It is a serious problem for at least two reasons. First, it is often harmful to hold false beliefs. For instance, it can cost you money if you believe that a product is better than it really is. It can even cost you your life if you believe that a medical treatment is more effective than it really is.
Note: There really is a significant social cost to inaccurate information on the Internet. For example, people have lost significant sums of money as a result of trusting information that they read on the Internet (see, e.g., "Internet Securities Fraud: Old Trick, New Medium"). In addition, there seem to be cases where people have been harmed by inaccurate health information on the Internet (see, e.g., "Some Evidence Exists that the Internet Does Harm Health"). And that is really frightening. These sorts of cases suggest how important it is to address the problem of inaccurate information.
Second, even if people only end up being too skeptical, they may fail to believe accurate information that it would have been beneficial for them to believe.
Note: If Internet users end up being too skeptical, it means that the Internet is just not living up to its potential to quickly provide people with useful information.
Discussion Topic: Can you find any other examples where people have been harmed by inaccurate information on the Internet?
A Brief Aside about Acquiring Knowledge
In order for an individual to acquire knowledge from the Internet, several things have to happen. First, the individual has to have access to the Internet. Second, the information that she is interested in has to be available on the Internet. Third, she has to be able to locate the information if it is available. Fourth, she has to be able to comprehend the information once she finds it. Fifth, the information has to be accurate. Sixth, she has to be able to verify that the information is accurate. Making improvements in any of these six areas will facilitate the acquisition of knowledge. Library and information scientists tend to focus on the first and third areas. In this course, however, we will be focussing on the fifth area, accuracy, and the sixth area, verifiability.
Note: These are not necessarily the only things that have to happen. What other things (if any) have to happen in order for someone to acquire knowledge from the Internet?
Is there really a serious problem of inaccurate information on the Internet?
Cerf, Ferrell, Hecht and many others make a good case that there is likely to be a lot of inaccurate information on the Internet. However, is there really a significant amount of inaccurate information on the Internet? In particular, is there any empirical evidence that there is a lot of inaccurate information on the Internet?
Fairly recently, researchers (e.g., Connell/Tipple and Impicciatore et al.) have tried to measure the amount of inaccurate information on certain subsets of the web. Connell/Tipple, for example, looked at the accuracy of answers to "ready-reference" questions on the web. (Ready-reference questions are, of course, questions like "What is the height of the Eiffel Tower?") They found that about 25% of the answers to their questions were "either mostly wrong or completely wrong" (p. 366).
Note: Connell/Tipple also found that a searcher only had about a 27% chance of finding an accurate answer to his or her ready-reference question. However, this statistic conflates two issues (viz., the locatability problem and the accuracy problem). The chances of finding an accurate answer are so low because the chances of not finding an answer at all are so high (about 64%). As I noted above, being able to find the information that you seek is an important factor in knowledge acquisition. However, in this course, we are going to focus on the accuracy problem (and the verifiability problem). Thus, the appropriate statistic from the Connell/Tipple article is that, once a searcher has found an answer to his or her ready-reference question, there is about a 75% chance that the answer will be accurate and about a 25% chance that the answer will be inaccurate.
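The arithmetic behind the 75% figure can be checked with a short calculation. The sketch below is in Python, and its function and variable names are illustrative; they do not come from the Connell/Tipple article. It assumes that every search either turns up some answer or turns up none, so that the chance of finding an answer is 1 minus the chance of finding none. It uses only the two rounded figures quoted above.

```python
# Figures quoted in the note above, written as fractions.
P_ACCURATE_ANSWER = 0.27  # searcher ends up with an accurate answer
P_NO_ANSWER = 0.64        # searcher finds no answer at all


def accuracy_given_answer(p_accurate: float, p_no_answer: float) -> float:
    """Chance that an answer is accurate, given that some answer was found.

    Assumes each search ends in exactly one of three ways:
    no answer, an accurate answer, or an inaccurate answer.
    """
    p_any_answer = 1.0 - p_no_answer
    if p_any_answer <= 0.0:
        raise ValueError("no searches found an answer, so the ratio is undefined")
    return p_accurate / p_any_answer


p = accuracy_given_answer(P_ACCURATE_ANSWER, P_NO_ANSWER)
print(f"accurate, given an answer:   {p:.0%}")
print(f"inaccurate, given an answer: {1.0 - p:.0%}")
```

Dividing by the chance of finding any answer removes the locatability problem from the statistic, which leaves a number that reflects accuracy alone.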
In their much cited article on the "Reliability of Health Information for the Public on the World Wide Web", Impicciatore et al. looked at the accuracy of consumer health information on the web. In particular, they looked at the accuracy of information about treating fever in children. They too found quite a bit of inaccurate information. In addition, they found quite a bit of incomplete information.
Now, even if a web site provides Internet users with incomplete information, it does not necessarily provide them with inaccurate information per se. Even so, Internet users can end up with false beliefs as a result of incomplete information just as they can end up with false beliefs as a result of inaccurate information. In this particular case, the risk is that an Internet user will end up with the false belief that he or she knows everything that he or she needs to know about treating his or her child's fever. Since our ultimate concern is with false beliefs and their bad effects, this is just as bad an outcome.
Note: Getting incomplete information is, of course, very different from getting no answer at all to your question.
Finally, Connell/Tipple and Impicciatore et al. have both established that there is indeed inaccurate information on the Internet. However, these two studies measured the accuracy of information within two fairly limited domains. As a result, it is rather difficult to estimate the amount of inaccurate information in other domains or the amount of inaccurate information on the Internet overall. This is because the exact amount of inaccurate information may be very different for different subject areas.
What is the cause of the problem of inaccurate information?
I ask this question because it is often useful to know what the cause of a problem is when you are trying to find a solution. The standard diagnosis is that the problem is due to the fact that almost anyone can publish almost anything on the Internet. (This is almost certainly not the only cause, but it is probably the main cause.) In other words, the Internet lacks the kinds of editorial filters that make most other sources of information (e.g., newspapers, television news, encyclopedias, and scientific journals) more reliable.
This is not to say that any other sources of information are perfectly reliable. (As we have noted above, you can get inaccurate information from almost any source of information.) This is just to explain why the problem is so much worse with the Internet.
It should also be pointed out, however, that the fact that almost anyone can publish almost anything is also one of the great advantages of the Internet. For example, this allows people a very large degree of free expression. In addition, it has certain epistemic benefits. Accurate information that might not have made it through the filters on other sources of information can be posted quickly and easily on the Internet. (This sort of point will come up again when we discuss Mill.) Even so, the fact that almost anyone can publish almost anything clearly has certain epistemic costs as well.
Note: "epistemic" is an adjective that I will use frequently in this course. It simply means "having to do with knowledge."
How can we deal with the problem of inaccurate information?
This is precisely the question that we will be trying to answer for the rest of the course. In particular, we will look at how people should go about verifying the accuracy of information (i.e., how they can distinguish accurate from inaccurate information). Also, we will look at what information professionals can do to make this task easier.
Note: We will especially want to find solutions to the problem of inaccurate information on the Internet that reduce the epistemic costs of this problem without interfering with the many benefits of the Internet.
If you have questions for me about the content of the course, post a message to the WebCT discussion forums or send me a message directly via WebCT mail. (I prefer that you not use my regular email account for questions about the course.) In addition, if you are going to be in Tucson, you can come to my office hours or set up an appointment.
Note: Information about using WebCT is available at http://www.sir.arizona.edu/resources/computing.html#WebCT. If you have trouble with WebCT, send Samanthi Hewakapuge (samanthi@email.arizona.edu) an email message explaining exactly what is happening.
Note: In order to stay up-to-date on discussions and announcements, you should check into WebCT every day or so.
I have a couple of small requests with regard to the WebCT discussion forums. The WebCT forums will be our main mode of communication in this course. In order to keep this communication more or less organized, I will set up different forums for different purposes. For example, in addition to a forum for each lecture, I will also set up a "Greetings" forum for you to describe who you are and why you are taking this particular course. So, my first request is that you try to direct your comments to the appropriate forum. My second request is that you make use of the "Quote" feature. I have found that it is usually much easier to follow discussions on WebCT forums when people use the "Quote" feature to quote (at least parts of) the message that they are responding to.
In addition to the forums, there is also WebCT mail. Please use WebCT mail (instead of the forums) for any personal correspondence.
Finally, WebCT includes chat rooms. I am not going to schedule any official chat sessions. (Being kind of methodical in composing postings, I tend to prefer the asynchronous discussion forums.) However, the chat rooms are available to you. In particular, they may come in handy for coming up with, and working on, the group presentations (see below). In addition, if anyone wants to organize a chat session on some selected topic from the course, I would be happy to attend.
Unless otherwise noted, assignments and exams will be due at midnight Tucson time. I don't plan to start grading them at midnight; I just want to be sure that I have them in my hands when I get up the following morning. By the way, Tucson is always on Mountain Standard Time (MST).
Note: In most cases, the assignments will be about the material discussed in the preceding lecture. In a few cases, the assignments may be about the material that will be discussed in the next week’s lecture. In those cases, I want to get your take on the readings before I give my take.
The "Group Presentation" requires each of you to participate in creating an online presentation. The presentations will take place during the last few weeks of the semester. We will treat the presentations like any other virtual lecture. For instance, I will set up a forum for each presentation, presentations will be required reading, etc. You will create your presentation in collaboration with three other members of the class. You can choose your own partners. Each group will sign up for a specific date to put their presentation online.
Note: I do understand that group presentations, especially in a virtual course, present certain difficulties. Let me know if you are having any problems.
The "Individual Project" essentially requires each of you (individually) to create a web site that contains verifiable information. This project will be due on Tuesday, December 3.
Note: Both the group presentation and the individual project will require you to create and publish web pages. If you do not have much experience in this area, you should give yourself plenty of time to work out the technical details.
In the meantime, you should start reading the articles listed under "Inaccurate Information on the Internet" on the list of readings.
This document was last modified on September 4, 2002.