Computers catching up to Captcha - Action News
Science

Computers catching up to Captcha

Captchas, the squiggly-letter images used on some websites to distinguish between a human and a computer, are getting harder to decipher as machines get better at figuring them out, says technology columnist Dan Misener.

Squiggly-letter test used to tell humans, computers apart getting harder as machines catch on

Tell me if this sounds familiar: you're online, ready to buy some concert tickets or to sign up for a new email account. But before you're allowed to proceed, you have to prove you're a human being by deciphering a mess of distorted, squiggly letters and numbers, then typing them into a text box. This is what's called a Captcha, or completely automated public Turing test to tell computers and humans apart.

For a while now, I've had a sneaking suspicion that Captchas are getting harder. Increasingly, I'm left wondering, is that an uppercase "x" or a lowercase "x"? An "o" or a zero? A "q" or an "o" with a squiggle through it? Sometimes, even though I'm 100 per cent sure I've typed exactly the right thing, the computer disagrees with me.

For months, I thought I was alone in this frustration. I worried that my increasing inability to pass these tests suggested that I'm not entirely human.

Then last week, I opened an email message from a colleague that read: "You know those distorted letters we have to type to pass security tests online? I notice they're getting more and more distorted."

According to Luis von Ahn, one of the computer scientists who coined the term "Captcha," the tests are getting harder.

If you can't make out this dizzying array of letters, you've failed the completely automated public Turing test to tell computers and humans apart, known as a Captcha.

"The thing about Captchas is that many people do their own implementations," he told me. "Over time, some of these implementations have gotten a lot harder, because the really easy ones, essentially the undistorted ones, can be broken by bots."

Traditionally, identifying squiggly, distorted letters has been difficult for computers but comparatively easy for humans. But computers are getting better and better at it, and easy Captchas aren't as effective as they once were.

Still, von Ahn says his own implementation of Captchas, called reCaptcha, isn't getting any harder.

"It's still the case, as it was three or four years ago, that a person who submits a solution [to reCaptcha] is going to be correct 96 per cent of the time," von Ahn said. "That number remains the same."

Using Captchas to digitize books

Digitizing books involves photographically scanning the work and rendering it into text using Optical Character Recognition (OCR), but OCR can't always identify every word. Programs like reCaptcha allow the words that can't be identified by OCR to be turned into a Captcha image that is then given to a user to solve as part of a regular security check on various websites. The mystery word is given to the user in conjunction with another word for which the answer is already known. The user is asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the mystery word, too. The same mystery word is then given to several other users to verify that the initial answer was correct.

Source: ReCaptcha
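The two-word check described in the sidebar can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not reCaptcha's actual code: the class name `MysteryWordPool`, the method names, the `scan-17` identifier and the `votes_needed` threshold are all hypothetical, and the sidebar says only that "several" users verify a word without giving a number.

```python
from collections import Counter


def normalize(word):
    """Compare answers without regard to case or stray spaces (an assumption)."""
    return word.strip().lower()


class MysteryWordPool:
    """Hypothetical tally of user answers for words that OCR could not read."""

    def __init__(self, votes_needed=3):
        # Assumed threshold: how many matching answers count as verified.
        self.votes_needed = votes_needed
        # Maps a mystery word's ID to a Counter of the answers given for it.
        self.votes = {}

    def submit(self, control_answer, control_truth, mystery_id, mystery_answer):
        """Return True if the user passes the security check.

        The user passes only by solving the word whose answer is already
        known. Only then is the answer for the mystery word recorded.
        """
        if normalize(control_answer) != normalize(control_truth):
            return False
        tally = self.votes.setdefault(mystery_id, Counter())
        tally[normalize(mystery_answer)] += 1
        return True

    def resolved(self, mystery_id):
        """Return the agreed reading of a mystery word, or None if unverified."""
        tally = self.votes.get(mystery_id)
        if not tally:
            return None
        answer, count = tally.most_common(1)[0]
        return answer if count >= self.votes_needed else None


# Example: two users who solve the known word agree on the mystery word.
pool = MysteryWordPool(votes_needed=2)
pool.submit("morning", "morning", "scan-17", "overlooks")
pool.submit("Morning", "morning", "scan-17", "overlooks")
print(pool.resolved("scan-17"))
```

A user who gets the known word wrong fails the check and contributes nothing to the tally, which is why a bot typing random guesses cannot easily pollute the digitized text.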

ReCaptcha, which was acquired by Google in 2009, generates more than 100 million Captcha images a day for various websites for free. The Captcha images it provides are also used to help decipher words that can't be identified during the process of digitizing printed material (see sidebar).

Computers are getting better at solving Captchas because devising automated ways of bypassing the test is potentially lucrative. Imagine that you're an email spammer. Wouldn't it be great if you could automatically sign up for hundreds or thousands of bogus email accounts? Or, imagine you're a ticket scalper. Wouldn't it be terrific if you could write a computer program to automatically buy all the tickets for a concert? Captchas can help keep spammers and scalpers at bay.

Because there's a lot of money to be made, software developers are actively writing code they say can crack Captchas, code that, von Ahn says, sells for $10,000. Von Ahn said he has even seen ticket scalpers advertise software they say can break reCaptcha for as much as $50,000.

According to von Ahn, it's simply a matter of time before software will rival humans at solving Captchas, but it could take decades.

A screen grab of a nearly indecipherable Captcha image used to distinguish between humans and computers.

In the meantime, as easier, ineffective systems are phased out, people will continue to be frustrated by some Captchas. And as frustrating as Captchas are for the average user, they can be even more frustrating for people who are visually impaired or use screen-reader software. Some Captcha implementations include an audio alternative, but accessibility will continue to be an issue.

Regardless of how they are implemented, Captchas are all built around the idea of creating a task that's hard for computers and easy for humans. As computers get better and better at reading squiggly letters, we may be asked to prove our humanity by performing other types of tasks.

For instance, computers are still very bad at determining the contents of a photograph. It's difficult for software to tell the difference between a photo of a cat and a photo of a dog. Microsoft Research built a Captcha system called ASIRRA (Animal Species Image Recognition for Restricting Access) based on this idea. Companies like Solve Media and NuCaptcha have put their own twists on Captchas that require users to enter words from a text or video advertisement.

Some sites that use Captchas are making them harder to decipher because computers are getting better at figuring them out.

If von Ahn is right and computers will eventually be able to reliably solve text-based Captchas, that's not necessarily a bad thing. Though Captcha-busting technology could be used by spammers or ticket scalpers, it could also help decipher hard-to-read parts of digitized books or identify skewed and distorted text in photographs.

So, the next time you're confounded by a mess of squiggly, distorted letters, don't be too hard on yourself. Maybe it's the Captcha's fault.

As von Ahn told me, "Sometimes, they're really bad. Sometimes, they are so hard to read that I can't read them myself."

Comforting words from one of the people responsible for some of those squiggly letters.