Google and the moral imperative

By evolvingthoughts on December 2, 2006.

Google prides itself on being an ethical company. "Do no harm" is their motto, I believe (although some Chinese dissidents may dispute this). But what happens when an honest site is hacked and porn links are included on their index page?

Google delists and deindexes that site immediately, with neither warning nor notification of what is wrong, that's what. What site are we talking about? The award-winning anticreationist site, The Talk Origins Archive.

Wesley Elsberry, the admin, gives the story of trying to get Google to reindex us here. But the site has been subjected, along with the Panda's Thumb, to a series of Denial of Service attacks recently as well.

Despite a few pathetic attempts to argue against the wealth of information on this site, particularly Mark Isaak's Index to Creationist Claims, Douglas Theobald's 29+ Evidences for Macroevolution, and Jim Foley's Hominid Evolution FAQ, the information given here is overwhelming and useful against creationist attempts to get creationism and ID taught in schools. So it seems that some sleazy little twerps on the creationist or ID side decided that information and honest debate were not working for them, and set out to stop it. And the best way is to prevent the information getting out. Such is the "honesty" of "critical evaluation". We'll be watching this closely in the future.

[Disclosure: I am a member of the foundation that supports the Archive, and also a contributor to some lesser FAQs there. And proud of it.]

Late note: Google, via Matt Cutts, the head of the webspam team at the company, gives his side of the story here. He apparently tried to contact us, although with spam filters on emails, I am not sure if it was received. So I must retract the implication that Google failed in its duty here, although I am still of the opinion that there is a concerted attack on the Archive by other means. It also looks, from the links Matt gives, that this was a nasty spam attack unrelated to the content of the Archive or the antievolution "debate".

That said, I can't resist a snarky comment. Google here says that webmasters should be reading their webmaster forum to keep up with this. I know Google is the main search engine, but really. If emails were sent, that is enough, but why should the onus be on webmasters to read the forums of every service provider, ISP and so forth on the web? I think that is indeed a tad arrogant. But this is just me being snarky.

Actually, I'd be surprised if it really were some creationist conspiracy to take down the archives. Yeah, perhaps it was somebody with a chip on their shoulder. But any high-traffic, high-profile site on the web is a target for that kind of hacking regardless of content, and ones that don't have full-time paid sysadmins to watch them are going to have a slower response time and thus (perhaps) be more tempting targets.

-Rob

Well, given the DOS attacks, and the fact that the Archive is such a useful resource in fighting creationism, I would think the presumption is that it was deliberate (the hack occurred only on the index page, suggesting it was a deliberate attempt to get the Archive off Google, which has in the past returned the Archive as a top ranked hit on these matters). But it is always possible that it was your basic pron-redirection attack.

It's the same mindset that justifies blocking Planned Parenthood sites. The end justifies the means, I guess.

The thing is, these kinds of cyberattacks accomplish nothing. It is pathetic. The information is not gone, it is merely harder to get to for a while. And they risk drawing attention to the very thing they are trying to hide.

Well, if a site is hacked and much of its content turns into porn links, then the redirection of porn searches to the site is a form of massive Google-bombing - and the only surefire way to stop that is to de-index the site immediately. This may sound cruel, but in the long run it does the 'net and the website a big favor. And at no time has Google borne the burden of scrubbing a site of its hack and providing the proof that it is safe for re-indexing. The admins of the talk origins archive bear the burden of cleaning their own site and proving that their domain will not pollute the Google-net. So of course it was hard.

Aerik,

I've never objected to Google keeping their index clean. Their policy of refusing to tell webmasters what or where the problem lies, though, privileges cheaters, who know what they did wrong, over honest webmasters whose sites get cracked, who don't have that information.

Do try to keep up.

Myself, I'm not saying that it necessarily was one of our conceptual opponents doing the cracking. As Rob points out, the TOA is a big target for crackers of all sorts. However, one reply on the thread on Google Webmaster Help about this shows exactly how this news was received by one of the antievolution advocates:

Thank God they [AIG and ICR -WRE] are indexed and thank God you are not! :-)

He's probably not alone.

Well, if a site is hacked and much of its content turns into porn links, then the redirection of porn searches to the site is a form of massive Google-bombing - and the only surefire way to stop that is to de-index the site immediately.

I should add that the case in question is not the one that Aerik paints. One file out of 5,000+ files on the TOA archive got cracked. There was a block of links hidden invisibly to browsers at the bottom of the page, of comparable length to blocks of spam I see added to weblog comments all the time, maybe a couple of kilobytes total, IIRC. Google has no obligation to index any site that it doesn't choose to, certainly. But they likewise have no claim to moral superiority when their policies make life easier for the cheaters and harder for the honest victims of Internet skullduggery.

Of course, talkorigins.org is still vulnerable to the following attack:

1) Post an article to talk.origins with 10,000 porn links.
2) Using two sock puppets (or confederates), nominate and second this post for POTM.
3) When voting time comes around, use a network of owned PCs to post to t.o, casting 10,000 votes for your article.
4) Presto! Your collection of links is enshrined in the POTM archive.

One thing that I find interesting about computer security is the extent to which side effects matter. In this case, the act of adding a thousand porn links to the site has the primary effect of advertising a porn site. But it also has the side effect of removing the attacked site from Google. So if the attacker's real goal is to remove the site from Google, then a collection of porn links is the way to do it. Google's trying to Do the Right Thing (avoiding porn googlebombing) has opened up an avenue for a different type of attack (delisting).

Similarly, let's say that your workplace has a security policy that if you enter your password incorrectly ten times in a row, then your account is locked, and you can't log in even if you provide the correct password. This prevents brute-force attacks, but makes it trivial to mount a denial of service attack, to prevent a legitimate user from logging in.
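The lockout trade-off described above can be sketched in a few lines. This is a toy model under stated assumptions, not any real system's login code: the `Account` class, `try_login`, and `lock_out_victim` are hypothetical names, and the password is compared in plain text only to keep the example short (a real system would compare hashes).

```python
from dataclasses import dataclass

MAX_FAILURES = 10  # threshold taken from the workplace policy described above


@dataclass
class Account:
    """Toy account record (hypothetical; real systems store password hashes)."""
    password: str
    failures: int = 0
    locked: bool = False


def try_login(account: Account, attempt: str) -> bool:
    """Return True only if the account is unlocked and the attempt matches."""
    if account.locked:
        # Once locked, even the correct password is refused.
        return False
    if attempt == account.password:
        account.failures = 0  # a success resets the consecutive-failure count
        return True
    account.failures += 1
    if account.failures >= MAX_FAILURES:
        account.locked = True
    return False


def lock_out_victim(account: Account) -> None:
    """The side-effect attack: no guessing skill needed, just ten wrong tries."""
    for _ in range(MAX_FAILURES):
        try_login(account, "deliberately wrong")


if __name__ == "__main__":
    victim = Account(password="s3cret")
    print("before attack:", try_login(victim, "s3cret"))
    lock_out_victim(victim)
    print("after attack:", try_login(victim, "s3cret"))
```

The policy does stop brute-force guessing, since an attacker gets at most ten tries. But the same counter lets anyone who knows the account name lock the legitimate user out without ever learning the password.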

There's probably an analogy with evolution and spandrels here. Teleological thinkers make the mistake of thinking that biological structures have a purpose, asking "what is a wing for?", when the real question is "what can you do with a wing?". Finches, ostriches, and penguins have all come up with different valid answers to that question.

Uncommon Descent got delisted in September:-

http://www.uncommondescent.com/archives/1764

and

http://www.uncommondescent.com/archives/1759

Look at the bright side: acceptance of Creationism among pR0n users may go down.

Following on from my post a couple above...

As will be seen by following the link:

http://www.uncommondescent.com/archives/1764

I posted a comment to DaveScot's blog entry on the subject of UD being delisted; referencing Wesley's blog entry.

I'd suggest readers take careful note of DaveScot's response.

He then produced the following blog entry:

http://www.uncommondescent.com/archives/1830

Some might suggest that that entry is a little slanted, since it makes no mention of Wesley's blog entry, or his explanation that the site had been cracked. I tried to respond suggesting that in the interests of full disclosure DaveScot should perhaps have referenced Wesley's blog - but the post never got through the filters...

In unrelated news, after the above happened, I have apparently been banned from UD under a "three strikes" rule - the third strike being wilfully and with malice aforethought misstating the moral of the tale of the boy who cried wolf:

http://www.uncommondescent.com/archives/1820

I have to say, I thought that was my fourth strike, but can only, on a quick skim, identify two previous strikes:

http://www.uncommondescent.com/archives/1800 (post 18)

http://www.uncommondescent.com/archives/1829 (post 7)

The same thing happened with an ESL resource I use often. For a week or so it was filtered out by my school's software, and finally I figured out why: http://www.gaijinpot.com/bb/showthread.php?t=22145

This is the explanation the (rightful) proprietor posted on the thread:

"Hi,

I'm Chris from Boggle's World. I am replying to this thread to let you know what has happened and what is happening and do some damage control. On May 11th the registrar account for Boggle's World was hacked and the whois info was changed. Once the whois info was changed, the hacker was able to transfer the domain to himself and serve up the porn. The change went into effect exactly 1 week after the transfer (according to ICANN's domain transfer rules). As soon as I woke up and found out, I contacted a lawyer and the two registrars involved. The registrars are now working to get the domain back and have shut down the adult content so now nothing displays on Boggle's World. The prognosis is good for now but it's the weekend so nothing will be done till Monday. If a decision is made on Monday or Tuesday, I will get possession of the domain until 1 week later (as per ICANN's policies). I'm sorry that some of you have had to be exposed to that unwittingly. But please understand that I was devastated when I found out and am working hard to correct it. This was a hack job pure and simple (no expired domain kind of stuff or anything like that)."

Sorry for the long post, but I want you to know you're not the only one who's had it happen. And that it goes to promote an adult hook-up site (while that may not be all that bad in itself) shows what a cheap ploy it is.

--J

Please note the late note in the body of the text. Google tell a different story. I apologised to Google at the blog of the webspam head.

Hey John,

I actually work with a search engine optimisation company in Scotland and have to deal with this sort of thing all the time. Looks like Matt's reply was pretty comprehensive and I'm sure you'll be on top of things by now but if you're still having problems let me know and I can ask around the office for suggestions.

This is the reply I made to Matt Cutts over on his blog:

Matt,

I think that the message you show as a warning is excellent. It clearly states what is wrong, with enough information to permit a webmaster to locate the problem.

I only wish that I had actually received it.

Before I made my complaint, I checked my incoming email. There was no sign there of an attempt to contact me from Google. Lunarpages.com, where the TOA is hosted, forwards email to my account.

This morning, I learned of this post, so I re-checked my steps. No, still nothing in my incoming mail. I looked for strings from within the warning, to see if the text came through without an obvious "google" connection. No luck on that, either.

I rely upon the Lunarpages email forwarding, but given this post, maybe I was wrong to do so. I logged into the domain's Lunarpages webmail interface for the first time. I searched for anything with "google.com" in the from field. I searched for strings from within the warning message quoted above. Still nothing.

Bummer. That just leaves examining the SMTP records on my local email account. Google's message should have been relayed by Lunarpages, so I looked for that in the SMTP logs. Still nothing.

My SMTP logs, BTW, do show rejects for hosts like wr-out-0708.google.com, which is apparently blacklisted at spamcop.net. The following shows rejects on the 28th with a "google.com" domain. I haven't checked these for spoofs, but I assume that's what's up with these:

Nov 28 02:14:49 [...][63343]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.250, relay=wr-out-0708.google.com [64.233.184.250], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.250

Nov 28 11:39:34 [...][84035]: ruleset=check_relay, arg1=py-out-1314.google.com, arg2=64.233.166.175, relay=py-out-1314.google.com [64.233.166.175], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.166.175

Nov 28 11:39:38 [...][84034]: ruleset=check_relay, arg1=py-out-1314.google.com, arg2=64.233.166.175, relay=py-out-1314.google.com [64.233.166.175], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.166.175

Nov 28 12:18:47 [...][85490]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.244, relay=wr-out-0708.google.com [64.233.184.244], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.244

Nov 28 12:23:01 [...][85593]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.240, relay=wr-out-0708.google.com [64.233.184.240], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.240

Nov 28 12:23:01 [...][85592]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.250, relay=wr-out-0708.google.com [64.233.184.250], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.250

Nov 28 18:24:09 [...][96244]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.250, relay=wr-out-0708.google.com [64.233.184.250], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.250

Nov 28 18:30:36 [...][96470]: ruleset=check_relay, arg1=wr-out-0708.google.com, arg2=64.233.184.250, relay=wr-out-0708.google.com [64.233.184.250], reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.250
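Reject lines like those above can be tallied by relay host and IP with a short script. This is a minimal sketch, assuming every line follows the sendmail-style `relay=HOST [IP], reject=CODE` layout seen in the excerpt. The function name `count_rejects` and the regular expression are illustrative and were not part of the original log analysis.

```python
import re
from collections import Counter

# Matches the "relay=HOST [IP], reject=CODE" portion of a check_relay line.
# The layout is assumed from the log excerpt above.
REJECT_RE = re.compile(
    r"relay=(?P<host>\S+) \[(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\], "
    r"reject=(?P<code>\d{3})"
)


def count_rejects(lines):
    """Count rejected connections per (relay host, IP) pair."""
    counts = Counter()
    for line in lines:
        match = REJECT_RE.search(line)
        if match:
            counts[(match.group("host"), match.group("ip"))] += 1
    return counts


# Two lines copied from the excerpt above, elisions left as they appear.
SAMPLE = [
    "Nov 28 02:14:49 [...][63343]: ruleset=check_relay, "
    "arg1=wr-out-0708.google.com, arg2=64.233.184.250, "
    "relay=wr-out-0708.google.com [64.233.184.250], "
    "reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.184.250",
    "Nov 28 11:39:34 [...][84035]: ruleset=check_relay, "
    "arg1=py-out-1314.google.com, arg2=64.233.166.175, "
    "relay=py-out-1314.google.com [64.233.166.175], "
    "reject=550 5.7.1 Rejected see: http://spamcop.net/bl.shtml?64.233.166.175",
]

if __name__ == "__main__":
    for (host, ip), n in count_rejects(SAMPLE).most_common():
        print(host, ip, n)
```

A tally like this shows which relays were being refused. It cannot tell a spoofed host from a genuine one, and it cannot show whether any particular rejected message was the warning in question.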

It would be ironic, though, if the warning message that could have short-circuited this whole affair was blocked because of spam filtering.

As for entitlement, I don't think that I was out of bounds given the information I had to work with. Google certainly isn't responsible for fixing the bad stuff that is on my site. I never said it was. Having a third party mess with the site caused the problem in the first place. Having tried to work with Google once the problem became known to me resulted in... nothing. Not until I complained about what happened.

I do feel a bit better to know that Google made an attempt at contact before de-indexing our site. And it is good to know that the site is scheduled for re-indexing within a couple of days, rather than the couple of weeks mentioned on the Webmaster Help Group. I wish Matt and the rest of the folks at Google success in making the process better in the future. If, when I did claim the TOA site via Google Webmaster Tools on Dec. 1, the text of that warning that you quote above had been waiting for me, I would have had no complaint to make. It seems to me that if Google is willing to send that level of information via email, then making it available to the verified owner of a site via the Webmaster Tools interface should not be a problem, either.

And, Adam, neither Google nor you can tell whether the problem is a deliberate cheat or an honest person victimized from the problem itself. Google is entirely correct to protect their index by pulling sites that are not in compliance with their guidelines. I never said otherwise. My complaint was what Google's policy of obscuring the de-indexing decision in the aftermath created, which is a situation in which cheaters have an advantage over honest webmasters, since the cheaters have knowledge of where in their pages the bad stuff lies, and the honest webmaster does not have that knowledge. Whether or not you may accept that I qualify as an honest webmaster, the policy as it currently stands obviously puts honest webmasters at a clear disadvantage.

Hmmmmmmmmm......... given the more than even chance that some Fundamentalist(s) is/are behind this, don't you think it's kind of skewed for them to use (and therefore spread) pornography as their weapon? So much for "good ends don't justify evil means."

p.s. Google's actual motto is "Don't be evil."

I tried to cut-and-paste my comment on Matt Cutts' weblog here, but it didn't seem to take. I also have it on my weblog:

http://austringer.net/wp/?p=443#comment-68303

Argh -- even with your update, it appears you still don't get it. Not the webmaster forum, but webmaster tools -- it's a tool they provide you to help monitor what they know about your site and for you to know if all is well.

You might argue you shouldn't have to use that either, but if Google is so important to you (which it is), I don't think it's too much to ask that you take some responsibility in monitoring what they're doing with your site.

Matt Cutts is taking my complaint and using it in a constructive way.

http://www.mattcutts.com/blog/how-google-handles-hacked-sites/#comment-…

Kudos to Matt.

"Joe Blow",

Actually, my suggestion that the content of the warning email should be made available through Google Webmaster Tools for verified site owners is being taken seriously. Check the thread at Matt Cutts' site. At least, that's the way I'm reading it.

So, how did the hacker get at the pages originally? I thought t.o. was running on a unix server.

The TOA is a virtual-hosted account at Lunarpages.com, on a Linux server.

We don't have raw logs going back to the date of the cracking.

We can't be completely sure of how the cracking got done. We are doing what we can to try to keep the site secure.

Just a quick pedantic thing I noticed. Google's motto is not "Do no harm" but "Don't be evil". Either way it conspicuously removes any obligation to actually be/do GOOD. I suppose since they went public the motto should be changed to "Don't be evil unless this absence of evil in any way negatively affects the profits of Google shareholders, in which case evil is just dandy-o." Or maybe I'm just cynical from dealing with them for the past five years :-)
