TMCnet News

Tracking down the Wikipedia prankster

[December 15, 2005]

(ZDNet News) On Sunday, the New York Times published a story naming the author of a controversial and false Wikipedia article about the longtime journalist John Seigenthaler Sr. In the Wikipedia article, the then-anonymous author wrote that Seigenthaler, once an assistant to Robert Kennedy, may have been involved in his assassination, as well as that of his brother, President John F. Kennedy.

The article stayed on Wikipedia--the free, open-access encyclopedia--for four months before Seigenthaler finally got the service's founder, Jimmy Wales, to agree to take it down. Shortly afterward, Seigenthaler published a scathing Op-Ed piece in USA Today, attacking Wikipedia's accountability and credibility.

In the days that followed, a San Antonio, Texas, book indexer named Daniel Brandt set out to find the article's author. Brandt, who had had his own problems with a faulty Wikipedia biography, also runs Wikipedia Watch, a sometimes paranoid, sometimes rational Web site that seeks to keep the project honest. He also runs Google Watch, a similar site about the search leader.

Following clues about the IP address of the computer used to post the Seigenthaler article, Brandt set out to find its author and in the process demonstrate some of Wikipedia's core problems. Over the course of two days of sleuthing, Brandt traced the IP address to a small courier service in Nashville, Tenn., and within hours, the culprit, Brian Chase, confessed directly to Seigenthaler.

CNET News.com recently tracked down Brandt and picked his brain about why he got involved in the search for Chase and why he thinks Wikipedia is flawed.

Q: So tell me about how you actually tracked down Brian Chase.

Brandt: All I had was the IP address and the date and timestamp, and the various databases said it was a BellSouth DSL account in Nashville. I started playing with the search engines and using different tools to try to see if I could find out more about that IP address.
They wouldn't respond to trace router pings, which means that they were blocked at a firewall, probably at BellSouth. But very strangely, there was a server on the IP address. You almost never see that, since at most companies, your browsers and your servers are on different IP addresses. Only a very small company that didn't know what it was doing would have that kind of arrangement. I put in the IP address directly, and then it comes back and said, "Welcome to Rush Delivery." It didn't occur to me for about 30 minutes that maybe that was the name of a business in Nashville. Sure enough they had a one-page Web site. So the next day I sent them a fax.

Q: So after they gave you the runaround, what happened?

The next night, I got the idea of sending a phony e-mail, I mean an e-mail under a phony name, phony account. When they responded, sure enough, the originating IP address matched the one that was in Seigenthaler's column.

Q: I see.

I called Seigenthaler and I said I have proof that the IP address (was the same). We still didn't know Brian's name at that point, but the very next day some guy named Brian Chase walks into Seigenthaler's offices at Vanderbilt University and delivers the confessional letter.

Q: So sum up why you got involved.

Well, I was really sympathetic with the position that Seigenthaler found himself in. The thing most people don't understand about Wikipedia is that sometimes they get into trouble because the press notices something about Wikipedia, like two months ago there was the article on Jane Fonda and another one on Bill Gates, which even Jimmy Wales admitted were just of abysmal quality. And when something like that happens, they circle the wagons and they come in and they clean up the article and it happens really fast.
The problem is that people don't realize that for every article they do that with, there could be hundreds of articles that they haven't noticed that someone started and are just sitting there that could have been vandalized like Seigenthaler's bio was.

Q: Say a little more.

The whole model is basically flawed. I think Wikipedia is valuable on certain respects on technical articles where the facts tend to get resolved on the basis of ones and zeroes or if something is true or not. But when you get into stuff like biographies and particularly biographies of living people, the quality is much more iffy, the potential for libel is much greater and the controls just are not appropriate.

Q: You've said Wikipedia's influence is unearned. What do you mean?

The media has been hyping Wikipedia for almost a year now and their traffic has doubled every few months. A lot of surfers think that Wikipedia is really great because it works on stuff where you can't find readily available information anywhere else. If you put search terms in Google and it says there's 550,000 results and most of page one and page two are spam sites, you can see where Wikipedia is attractive. And it has a very good reputation.

But the thing is, when you look at the structure it is so easy for Wikipedia to get into trouble on this stuff. The internal architecture of how they develop articles is flawed. Even if the article is in good shape, there is no guarantee that between now when it's in good shape and five minutes later when you check it that someone hasn't come in and vandalized it. It's an invitation for disaster, and the Seigenthaler situation was bound to happen sooner or later.

Q: You've said that the new changes to Wikipedia's authoring rules will actually result in more problems rather than less. Can you elaborate?

The problem is that anyone who registers and logs in enjoys a very special privilege on Wikipedia, which is that their IP address is hidden and in the history section for articles, you just see a login name and a time stamp. The problem with that is that there is no way for ordinary mortals to associate the login name with an IP address.
You have to use a very special tool, called CheckUser, which is only in use by half a dozen top administrators. And even then, the logs that are used to compile the information are only kept for one to four weeks, from what I can tell. So even if you're one of the privileged six, or Jimmy Wales himself, if too much time has passed, you have lost that information because the logs have been deleted. And so it's going to make the situation worse.

In Seigenthaler's case, if Brian Chase had spent the extra 20 seconds registering, they wouldn't have had any record at all of his IP address. And even though Jimmy would have been eager to accommodate someone like Seigenthaler just to protect Wikipedia, he would not be in a position to because they don't keep logs that long.

Q: So if it was up to you, what would be the proper way for Wikipedia to handle authorship and editing?

No. 1, I would eliminate all edits that are not logged in. No. 2, next to the log-in name, I would have the IP address displayed on all the edits, no exceptions. No. 3, to get an account, which means to get a log-in name, they have to give a valid e-mail address. So then you would have three things, the login name, an IP address and a valid e-mail address, that make you less anonymous.

Then I would go and take all the biographical articles on living persons and take them out of the publicly editable Wikipedia and put them in a sandbox that's only open to registered users. That keeps out all spiders and scrapers. And then you work on all these biographies and get them up to snuff and then put them back in the main Wikipedia for public access but lock them so they cannot be edited. If you need to add more information, you go through the process again.

Q: Right.
I know that's a drastic change in ideology because Wikipedia's ideology says that the more tweaks you get from the masses, the better and better the article gets and that quantity leads to improved quality irrevocably. Their position is that the Seigenthaler thing just slipped through the crack. Well, I don't buy that because they don't know how many other Seigenthaler situations are lurking out there.

Q: You've said that Wikipedia is a potential menace for anyone who values privacy. But what about Google, and the rest of the Web and the tools you used to track down Rush Delivery?

I'm talking specifically about how Wikipedia's criteria for whether someone merits a biography has an extremely low standard. For example, there's a page on Brian Chase, and I don't feel comfortable about that. Right now, newspapers should print his name because it's topical. But a few months from now, his name will sort of disappear from the Internet because newspapers don't rate that high on the search engines, and it's only up in Google news for a month.

But Wikipedia articles rank very, very highly on all search engines and Brian Chase will shoot right up to the top with the Wikipedia. And when this poor guy is trying to send out his resume, and he never gets called back from interviews, how do you know that the people aren't Googling him when they get his resume and saying, 'Well, he did this thing.' The permanence becomes invasion of privacy even more so than getting your name in the newspaper.
