reCAPTCHA: Quietly Contributing to Humanity
  • 上方文Q
  • February 18, 2013, 11:18

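If you spend any time online, you probably run into CAPTCHAs almost every day. But did you know that each time you impatiently type in a barely legible word, you are doing more than blocking spam registrations and spam comments? You are also quietly making a small contribution to human civilization.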

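CAPTCHA has a long full name: "Completely Automated Public Turing test to tell Computers and Humans Apart." Google's reCAPTCHA is widely used on overseas download sites. It is free to sign up for, and it is the one quietly doing the contributing. @JimmyLiye translated and adapted an article from Google that tells this seemingly unlikely little story.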


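The story starts with OCR. OCR stands for optical character recognition: put simply, it scans text stored as an image and recognizes it as characters. The problem is that the technology's accuracy currently leaves a lot to be desired, and its output is often riddled with errors. Take a look: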

[Image: OCR example]

One day, a machine scanned a book in order to turn it into an electronic edition:

[Image: scanned book page]

The output looked like this, which is just about readable:

The Hreckinridge' and Lane Democrats, having taken courage at the recent eastern advises, are [xxxxxxxxxx] energetically for the campaign: Several prominent Democrats who at first favoredDonoLea, are coming out. for the other aide, apparently under the [xxxxxxxx] of Federal [xxxxxxxxx]. An address to the National Democracy of ,1ifornia, urging the party to supportHaeeslipslDas, has recently been published, which manifestlybss strengthened that aide of the [xxxxxxxxx]: It is signed by 65 Democrats, many of whom occupy respectab e and prominent positions in the party, 22 of them are Federal office-holders,[xxxxx] more are recipients of Federal patronage, and the others represent a mass of politicians giving the document [xxxx][xxxxxx] mTheDcu8las Democrats are also active The Irish and German vote will mostly go with ths# branch of the party, but it is[xxxxxxxxx] to [xxxxxxxx] [xxxxx] [xxxx] [xx] the stronger. Thus far 17 IT newspapers have declared for DonGres, 13 for Base$- IaaIDGS and 9 remain non-committal, with even chances of going either way. Under these circumstances the Republicans entertain not unjustifiable hopes that the Democratic divisions may be so equal,- ly balanced as to give the State [xx] LIaCOLV.Same very [xxxxxxx] Bell and Everett meetings have been held in different parts of the State, bat thus far that party does not exhibit much rank sad ale air en.

Then there is this one, where the original print quality is rather poor:

[Image: poor-quality scanned page]

Faced with this, the computer was stumped and spat out a jumble of gibberish:

‘ letz-1- rrk fit: 1′ . on its to Vc ,rt, cann into tlm yc H_ tcr,la, .n. ‘l l; , arc ti:( h of thc 1″,ats that to ltc rc: ,;. , I; ., l: rel!;n. tani., , ./olio, IJuteilu, . 1!’i./_ ;lr”n. Iiam! Jr.r. F’l,nr_.Z.._%i;;, ,, : rt-Irn: am/ tf.rri.:, t?m steamer as a tr nW r. Uu ,tin;t, c ac?1 1″,at firm/ a t;nn, accor.liu; to .t rn. ‘Cl.w r. wu ru lm:nui MistinW /y in u;th, -. ink ;:,k as to “what w ax 1111, :111(I vle:iR a of ;: (,am( into, mnr r-, tm if tlm wo r( uu.i n:’ of t?u : la?:Iv. \ ‘c : ol in thc , ucr:atic , , Tlau :; will h:aw tu-li.r \. ’1′Im yap?tts Il ,,n an,/ I, ,rr:l. r, (,t tf,is r:ity, start witli it, with lu:rtic: ol \ 1- e:l.k.

Can you read that? I certainly can't.

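One of the goals of reCAPTCHA is to change this situation. The diagram below explains how it works: 1. scan the book; 2. extract the words that OCR cannot recognize; 3. distort them further and add random lines to improve security; 4. build a CAPTCHA from two words for the user to identify.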

[Image: reCAPTCHA workflow diagram]
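Steps 2 and 4 of that workflow can be sketched in a few lines of Python. This is only an illustration, not Google's implementation: the `ScannedWord` and `Challenge` types, the confidence scores, the `0.5` threshold, and the image ids are all hypothetical, and the image distortion in step 3 is left out.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ScannedWord:
    image_id: str      # hypothetical id of the cropped word image
    ocr_text: str      # what the OCR engine guessed
    confidence: float  # assumed OCR confidence in [0, 1]


@dataclass(frozen=True)
class Challenge:
    control_id: str       # word whose correct reading is already known
    unknown_id: str       # word the OCR engine could not read
    display_order: tuple  # image ids in the order shown to the user


def extract_unreadable(words, threshold=0.5):
    """Step 2: keep the words the OCR engine is unsure about."""
    return [w for w in words if w.confidence < threshold]


def make_challenge(unknown, control_ids, rng):
    """Step 4: pair one unknown word with one known control word.

    The order is shuffled so the user cannot tell which word is which.
    """
    control_id = rng.choice(control_ids)
    order = [control_id, unknown.image_id]
    rng.shuffle(order)
    return Challenge(control_id, unknown.image_id, tuple(order))


# Illustrative data only; the confidence values are made up.
words = [
    ScannedWord("w1", "Democrats", 0.97),
    ScannedWord("w2", "Hreckinridge", 0.31),
    ScannedWord("w3", "campaign", 0.92),
]
unreadable = extract_unreadable(words)
challenge = make_challenge(unreadable[0], ["c1", "c2"], random.Random(0))
```

Passing in a seeded `random.Random` keeps the sketch reproducible; a real service would presumably draw from an unpredictable source so the pairing cannot be guessed.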

With its help, the text from the second image becomes mostly readable (although a few small errors remain):

The New-York State yacht Squadron, on its annual cruise to Newport came into the harbor yesterday afternoon. The following are the names of the boats that came to anchor here: Jessie, gera loliv erelun Annie, Mannering, Julia, Bonita, Magic wut, Rambler, floumblie, Henrietta, Sea-Drift and Maria, with the steamer America as a tender. On anchoring each boat fired a gun, according to custom. The reports were heard distinctly in the city, causing considerable inquiry as to "what was up", and quite a number of sanguine individuals came into our office to inquire if the guns were not annunciatory signals of the successful laying of the Atlantic Cable. We invariably replied in the negative. The squadron will leave to-day for Newport. The yachts Washington and buub r of this city, start with it, with parties of New Haven people.

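Some of you may ask: if even the machine cannot read the word, how does it judge whether you typed it correctly? Good question, and Google's solution is a clever one: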

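Of the two words in the CAPTCHA, one is known: it has already been verified by humans. The other is unknown: the machine could not read it. When you type the known word correctly, the system assumes by default that your answer for the other word is correct too. This way, every time you enter a CAPTCHA, you add one more word to humanity's store of knowledge.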
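The check described above might look roughly like this in Python. It is a sketch under assumptions, not the service's real code: the function name `check_answer`, the case-insensitive comparison, and the `votes` tally are all introduced here for illustration. The article only says that the second answer is assumed to be correct once the known word matches.

```python
from collections import Counter


def check_answer(control_truth, control_answer, unknown_answer, votes):
    """Pass the user only if the known (control) word is typed correctly.

    When it is, the answer for the unknown word is trusted and recorded.
    The tally in `votes` is an assumption made for this sketch.
    """
    if control_answer.strip().lower() != control_truth.strip().lower():
        return False  # control word wrong: reject and record nothing
    votes[unknown_answer.strip().lower()] += 1
    return True


votes = Counter()
first = check_answer("yacht", "Yacht", "Squadron", votes)   # control word matches
second = check_answer("yacht", "yatch", "Sqadron", votes)   # control word misspelled
```

Only the first call records a transcription for the unknown word; the second is rejected before its answer is stored.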

 

This is an original article by 快科技, which retains the copyright to its images and text. If you reproduce it, please credit the source: 快科技
