ReCaptcha

From EduTech Wiki

Cs Portal > List of citizen science projects > reCaptcha - (2013/09/10)



IDENTIFICATION

Participant's homepage
  • Infrastructure:
  • Developed with:
Start date :
  • Beta start date : N/A
  • End date : Still open.
Subject

Description: reCAPTCHA is a user-dialogue system originally developed by Luis von Ahn, Ben Maurer, Colin McMillen, David Abraham and Manuel Blum at Carnegie Mellon University's main Pittsburgh campus, and acquired by Google in September 2009. Like the CAPTCHA interface, reCAPTCHA asks users to enter words seen in distorted text images onscreen. By presenting two words, it both protects websites from bots attempting to access restricted areas and helps digitize the text of books. The reCAPTCHA service supplies subscribing websites with images of words that optical character recognition (OCR) software has been unable to read. The subscribing websites (whose purposes are generally unrelated to the book digitization project) present these images for humans to decipher as CAPTCHA words, as part of their normal validation procedures. They then return the results to the reCAPTCHA service, which sends the results to the digitization projects.

Purpose: reCAPTCHA has worked on digitizing the archives of The New York Times and books from Google Books. As of 2012, thirty years of The New York Times had been digitized and the project planned to have completed the remaining years by the end of 2013. (Wikipedia, retrieved July 2013)

Research question:

TEAM

MAIN TEAM LOCATION
Loading map...

Project team page:
  • Leader: Google
  • Institution:
  • Partner institutions:
  • Contact:

USER TASKS

CONTRIBUTION TYPE: data interpretation
PARTICIPATION TYPOLOGY: crowdsourcing


GAMING GENRE: NONE
GAMING ELEMENTS: NONE

COMPUTING
THINKING
SENSING
GAMING

Tasks description: Scanned text is analyzed by two different optical character recognition (OCR) programs. Their outputs are aligned with each other by standard string-matching algorithms and compared both with each other and with an English dictionary. Any word that the two OCR programs decipher differently, or that is not in the English dictionary, is marked as "suspicious" and converted into a CAPTCHA. The suspicious word is displayed, out of context, along with a control word that is already known. If the human types the control word correctly, the system accepts the response to the suspicious word as probably valid. (Wikipedia, retrieved July 2013)

Interaction with objects:

Interface

  • Data type to manipulate: pictures
  • interface enjoyment:
  • Interface usability:

GUIDANCE

GUIDANCE
  • Tutorial: Somewhat
  • Peer to peer guidance: Somewhat
  • Training sequence: Somewhat
FEEDBACK ON
  • Individual performance: x
  • Collective performance: x
  • Research progress: Somewhat

Feedback and guidance description

COMMUNITY

COMMUNITY TOOLS
  • Communication: website
  • Social Network: N/A
  • Member profiles: no
  • Member profile elements:
NEWS & EVENTS
  • Main news site:
  • Frequency of project news updates: N/A
  • Type of events:
  • Frequency of events :

Community description

  • Community size (volunteers based)
  • Role:
  • Interaction form:
  • Has official community manager(s): no
  • Has team work: N/A
  • Other:
  • Community led additions:


Other information

1 PROJECT

Url: http://www.google.com/recaptcha
Start date:
End date: Still open


2 TEAM

Official team page:
Leader: Google




3 PROJECT DEFINITION


3.1 Subject

Engineering and technology > (other)

3.2 Description

reCAPTCHA is a user-dialogue system originally developed by Luis von Ahn, Ben Maurer, Colin McMillen, David Abraham and Manuel Blum at Carnegie Mellon University's main Pittsburgh campus, and acquired by Google in September 2009. Like the CAPTCHA interface, reCAPTCHA asks users to enter words seen in distorted text images onscreen. By presenting two words, it both protects websites from bots attempting to access restricted areas and helps digitize the text of books. The reCAPTCHA service supplies subscribing websites with images of words that optical character recognition (OCR) software has been unable to read. The subscribing websites (whose purposes are generally unrelated to the book digitization project) present these images for humans to decipher as CAPTCHA words, as part of their normal validation procedures. They then return the results to the reCAPTCHA service, which sends the results to the digitization projects.

3.3 Purpose

reCAPTCHA has worked on digitizing the archives of The New York Times and books from Google Books. As of 2012, thirty years of The New York Times had been digitized and the project planned to have completed the remaining years by the end of 2013. (Wikipedia, retrieved July 2013)


4 ABOUT PARTICIPANT TASKS


4.1 Tasks description

Scanned text is analyzed by two different optical character recognition (OCR) programs. Their outputs are aligned with each other by standard string-matching algorithms and compared both with each other and with an English dictionary. Any word that the two OCR programs decipher differently, or that is not in the English dictionary, is marked as "suspicious" and converted into a CAPTCHA. The suspicious word is displayed, out of context, along with a control word that is already known. If the human types the control word correctly, the system accepts the response to the suspicious word as probably valid. (Wikipedia, retrieved July 2013)
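
The flagging and acceptance steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the reCAPTCHA implementation: the function names `find_suspicious_words` and `accept_response` are hypothetical, Python's `difflib.SequenceMatcher` stands in for the unspecified "standard string-matching algorithms", the dictionary is a toy word set, and the case-insensitive comparison of the control word is an assumption made for the example.

```python
from difflib import SequenceMatcher


def find_suspicious_words(ocr_a, ocr_b, dictionary):
    """Flag words the two OCR outputs disagree on or that are not in the dictionary.

    ocr_a and ocr_b are lists of words produced by two (hypothetical) OCR programs.
    Returns a list of (readings_a, readings_b) pairs, each a tuple of words.
    """
    suspicious = []
    # Align the two word sequences (assumed stand-in for the string-matching step).
    matcher = SequenceMatcher(a=ocr_a, b=ocr_b, autojunk=False)
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag == "equal":
            # Both programs agree; flag only words missing from the dictionary.
            for word in ocr_a[a_start:a_end]:
                if word.lower() not in dictionary:
                    suspicious.append(((word,), (word,)))
        else:
            # The programs disagree here; keep both readings for a human to resolve.
            suspicious.append(
                (tuple(ocr_a[a_start:a_end]), tuple(ocr_b[b_start:b_end]))
            )
    return suspicious


def accept_response(typed_control, typed_unknown, known_control):
    """Return the typed reading of the unknown word if the control word matches.

    Returns None when the control word is wrong, so the response is discarded.
    Case-insensitive matching is an assumption of this sketch.
    """
    if typed_control.strip().lower() == known_control.lower():
        return typed_unknown.strip()
    return None


if __name__ == "__main__":
    words = {"the", "quick", "brown", "fox"}
    first = "the quick brovvn fox jumpz".split()
    second = "the quick brown fox jumpz".split()
    print(find_suspicious_words(first, second, words))
    print(accept_response("morning", "brown", "morning"))
```

In this sketch a response is only "probably valid"; a real system would presumably need several agreeing responses before trusting a reading, which is not modelled here.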

4.2 .

Grey typology
  • Computing: NO
  • Thinking: NO
  • Sensing: NO
  • Gaming: NO
Participation typology (crowdsourcing, distributed intelligence, participatory science, extreme citizen science, science outreach): crowdsourcing
Contribution type (data collection, data analysis, data interpretation): data interpretation
Gaming
  • Genre:
  • Gaming elements:
Interface
  • Data type to manipulate: pictures
  • Interface enjoyment:
  • Interface usability:
  • Member profiles: no
  • Member profile elements:


5 ABOUT GUIDANCE AND FEEDBACK


Guidance
Tutorial and documentation: SOMEWHAT
Training sequence: SOMEWHAT
Peer to peer guidance: SOMEWHAT

Feedback on
Individual performance: NO
Collective performance: NO
Research progress: SOMEWHAT


6 COMMUNITY


Tools

Communication: website
Social Network: N/A

News & Events

Main news site:
Frequency of project news updates: N/A
Type of events:
Frequency of events:

Community description

Community size (volunteers based):
Role:
Interaction form:
Has official community manager(s): no
Has team work: N/A

Other information about community:
Community led additions:

7 OTHER PROJECT INFORMATION






Bibliography



reCAPTCHA: Human-Based Character Recognition via Web Security Measures. Luis von Ahn, Benjamin Maurer, Colin McMillen, David Abraham, Manuel Blum (2008)

http://dx.doi.org/10.1126/science.1160379
Luis von Ahn, Benjamin Maurer, Colin McMillen, David Abraham and Manuel Blum. 2008. "reCAPTCHA: Human-Based Character Recognition via Web Security Measures" Science 12 September 2008: Vol. 321 no. 589


High Transcription Accuracy.

http://www.google.com/recaptcha/digitizing
(At Google)


Developer's Guide.

https://developers.google.com/recaptcha/intro
(At Google)


Labeling Images with a Computer Game. Luis von Ahn, Laura Dabbish (2005)

http://www.cs.cmu.edu/~biglou/captcha_cacm.pdf
Luis von Ahn and Laura Dabbish. Labeling Images with a Computer Game. ACM Conf. on Human Factors in Computing Systems, CHI 2004. pp 319-326.