Adam Weiss

Digital Media Strategist | Podcaster | Science Communicator

Founder/CEO of AppDemoVideos.com


reCAPTCHA: A Spam Filter With a Purpose

June 20, 2007 By Adam

CAPTCHA: Completely Automated Public Turing Test to Tell Computers and Humans Apart

That’s quite a mouthful, but even if you’ve never heard of a CAPTCHA before, I can guarantee you’ve used one. Here’s a CAPTCHA image I swiped from Yahoo’s email sign-up page:

Yahoo CAPTCHA
Yup, those terrible things! I don’t know about you, but I have a hard time solving many of the CAPTCHAs I come across. They are designed to stop spammers from writing programs that race through the web bulk-posting to blogs or snapping up millions of new Gmail accounts. However, when designing an image that is hard for a computer program to read, you often end up with something that is hard for legitimate humans to read as well. This annoys people while wasting their time – not a good combination.

Luckily, there’s a new solution to both of those problems: reCAPTCHA. This variation on the CAPTCHA uses whole English words as “human tests,” making it far easier to read than your average Yahoo or Ticketmaster CAPTCHA. In addition – and this is the really cool part – reCAPTCHA is ultimately not a waste of time. Sure, you still have to take a few seconds to type letters in a box, but those letters put the magnificent computing power of your brain to work for the benefit of humanity. That’s because one of the two words you type in is an unknown word from a library digitization project.

From the reCAPTCHA.net website:

To archive human knowledge and to make information more accessible to the world, multiple projects are currently digitizing physical books that were written before the computer age. The book pages are being photographically scanned, and then, to make them searchable, transformed into text using “Optical Character Recognition” (OCR). The transformation into text is useful because scanning a book produces images, which are difficult to store on small devices, expensive to download, and cannot be searched. The problem is that OCR is not perfect.

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA.

reCAPTCHA Example

So, with one word (the one with the known answer) proving you’re human and the other word digitizing books, this variation on CAPTCHA is a win-win for everyone. In fact, if a spammer does come up with a way to break these CAPTCHAs, it only removes half of the benefit: they may have gotten into the website, but they contributed to a digital library in the process.
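
For the curious, here is a rough Python sketch of how that two-word check could work. To be clear, this is my own illustration and not reCAPTCHA's actual code: the class name, the method names, and the rule that three matching answers settle an unknown word are all assumptions made up for the example.

```python
from collections import Counter, defaultdict

# Everything here is hypothetical; it is not the reCAPTCHA service's real API.
VOTES_NEEDED = 3  # assumed number of matching human answers needed to accept a word


class TwoWordChallenge:
    def __init__(self):
        # Maps an unknown word's id to a tally of what humans typed for it.
        self.votes = defaultdict(Counter)

    def check(self, control_answer, control_guess, unknown_id, unknown_guess):
        """Pass the user only if the known (control) word is typed correctly.

        When the control word is right, the user's reading of the unknown
        word is recorded as one vote toward its transcription.
        """
        if control_guess.strip().lower() != control_answer.lower():
            return False
        self.votes[unknown_id][unknown_guess.strip().lower()] += 1
        return True

    def transcription(self, unknown_id):
        """Return the agreed reading once enough humans agree, else None."""
        tally = self.votes.get(unknown_id)
        if not tally:
            return None
        word, count = tally.most_common(1)[0]
        return word if count >= VOTES_NEEDED else None
```

The point of the design is that only the control word decides whether you get in, while the unknown word's answer is simply collected and compared against what other people typed.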

Check it out and grab one of their plugins to implement it on your site if you have a spam problem (they even have a tool for obfuscating email addresses that is useful to almost anyone who uses the web). It’s free, and it’s cool tech for a good cause.

Filed Under: Enhance Your Website, Fight Spam
