reCaptcha - (2013/09/10)
THIS PAGE DESCRIBES A CITIZEN SCIENCE PROJECT
Start date :
- Beta start date : N/A
- End date : Still open.
⇳ Description
reCAPTCHA is a user-dialogue system originally developed by Luis von Ahn, Ben Maurer, Colin McMillen, David Abraham and Manuel Blum at Carnegie Mellon University's main Pittsburgh campus, and acquired by Google in September 2009. Like the CAPTCHA interface, reCAPTCHA asks users to enter words seen in distorted text images onscreen. By presenting two words it both protects websites from bots attempting to access restricted areas and helps digitize the text of books. The reCAPTCHA service supplies subscribing websites with images of words that optical character recognition (OCR) software has been unable to read. The subscribing websites (whose purposes are generally unrelated to the book digitization project) present these images for humans to decipher as CAPTCHA words, as part of their normal validation procedures. They then return the results to the reCAPTCHA service, which sends the results to the digitization projects.
➠ Purpose
reCAPTCHA has worked on digitizing the archives of The New York Times and books from Google Books. As of 2012, thirty years of The New York Times had been digitized and the project planned to have completed the remaining years by the end of 2013.
(Wikipedia, retrieved July 2013)
? Research question
MAIN TEAM LOCATION
Project team page
Leader: Google
Institution:
Partner institutions:
Contact:
CONTRIBUTION TYPE: data interpretation
PARTICIPATION TYPOLOGY: crowdsourcing
GAMING GENRE: NONE
GAMING ELEMENTS: NONE
◉ Tasks description
Scanned text is subjected to analysis by two different optical character recognition programs. Their respective outputs are then aligned with each other by standard string-matching algorithms and compared both to each other and to an English dictionary. Any word that is deciphered differently by both OCR programs or that is not in the English dictionary is marked as "suspicious" and converted into a CAPTCHA. The suspicious word is displayed, out of context, along with a control word already known. The system assumes that if the human types the control word correctly, then the response to the questionable word is accepted as probably valid.
(Wikipedia, retrieved July 2013)
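The flagging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual implementation: the function name `find_suspicious_words`, the sample strings, and the tiny dictionary are all hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for the unspecified "standard string-matching algorithms".

```python
from difflib import SequenceMatcher


def find_suspicious_words(ocr_a, ocr_b, dictionary):
    """Return words from ocr_a that the two OCR outputs disagree on,
    or that are missing from the dictionary."""
    words_a = ocr_a.split()
    words_b = ocr_b.split()
    # Align the two word sequences (assumed stand-in for the real aligner).
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    suspicious = []
    for tag, a_start, a_end, _b_start, _b_end in matcher.get_opcodes():
        # Words that exist only in ocr_b have an empty range here
        # and are not reported by this simplified sketch.
        for word in words_a[a_start:a_end]:
            disagreement = tag != "equal"
            unknown = word.lower() not in dictionary
            if disagreement or unknown:
                suspicious.append(word)
    return suspicious


# Hypothetical sample data, not taken from the project.
ocr_a = "the quick brovvn fox jumps"
ocr_b = "the quick brown fox jumps"
dictionary = {"the", "quick", "brown", "fox"}
suspicious = find_suspicious_words(ocr_a, ocr_b, dictionary)
print(suspicious)
```

In this sample, one word is flagged because the two OCR outputs disagree on it, and another because both outputs agree but the word is absent from the small dictionary.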
⤯ Interaction with objects
▣ Interface
- Data type to manipulate: pictures
- interface enjoyment:
- Interface usability:
GUIDANCE
- Tutorial: Somewhat
- Peer to peer guidance: Somewhat
- Training sequence: Somewhat
FEEDBACK ON
- Individual performance: x
- Collective performance: x
- Research progress: Somewhat
❂ Feedback and guidance description
COMMUNITY TOOLS
- Communication: website
- Social Network: N/A
- Member profiles: no
- Member profile elements:
NEWS & EVENTS
- Main news site:
- Frequency of project news updates: N/A
- Type of events:
- Frequency of events :
⏣ Community description
- Community size (volunteers based)
- Role:
- Interaction form:
- Has official community manager(s): no
- Has team work: N/A
- Other:
- Community led additions:
Other information
PROJECT
Url: http://www.google.com/recaptcha
Start date:
End date: Still open
TEAM
Official team page:
Leader: Google
PROJECT DEFINITION
Subject
Engineering and technology > (other)
Description
reCAPTCHA is a user-dialogue system originally developed by Luis von Ahn, Ben Maurer, Colin McMillen, David Abraham and Manuel Blum at Carnegie Mellon University's main Pittsburgh campus, and acquired by Google in September 2009. Like the CAPTCHA interface, reCAPTCHA asks users to enter words seen in distorted text images onscreen. By presenting two words it both protects websites from bots attempting to access restricted areas and helps digitize the text of books. The reCAPTCHA service supplies subscribing websites with images of words that optical character recognition (OCR) software has been unable to read. The subscribing websites (whose purposes are generally unrelated to the book digitization project) present these images for humans to decipher as CAPTCHA words, as part of their normal validation procedures. They then return the results to the reCAPTCHA service, which sends the results to the digitization projects.
Purpose.
reCAPTCHA has worked on digitizing the archives of The New York Times and books from Google Books. As of 2012, thirty years of The New York Times had been digitized and the project planned to have completed the remaining years by the end of 2013.
(Wikipedia, retrieved July 2013)
ABOUT PARTICIPANT TASKS
Tasks description.
Scanned text is subjected to analysis by two different optical character recognition programs. Their respective outputs are then aligned with each other by standard string-matching algorithms and compared both to each other and to an English dictionary. Any word that is deciphered differently by both OCR programs or that is not in the English dictionary is marked as "suspicious" and converted into a CAPTCHA. The suspicious word is displayed, out of context, along with a control word already known. The system assumes that if the human types the control word correctly, then the response to the questionable word is accepted as probably valid.
(Wikipedia, retrieved July 2013)
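The control-word rule in this description can be illustrated with a minimal Python sketch. It rests on assumptions: the function name `accept_response` and the case-insensitive, whitespace-trimmed comparison are hypothetical choices made for the example, since the source does not say how answers are compared.

```python
def accept_response(typed_control, typed_unknown, known_control):
    """Return the user's reading of the unknown word if the control
    word was typed correctly, otherwise None."""
    # Assumed comparison rule: ignore case and surrounding whitespace.
    if typed_control.strip().lower() == known_control.strip().lower():
        # The answer to the suspicious word is treated as probably valid.
        return typed_unknown.strip()
    # Control word failed, so the reading of the unknown word is discarded.
    return None


# Hypothetical responses, for illustration only.
accepted = accept_response("Morning", "brown", "morning")
rejected = accept_response("mourning", "brown", "morning")
print(accepted, rejected)
```

The sketch only decides whether a single response is kept. How kept responses are later combined is not described in the text above and is left out.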
Grey typology
- Computing: NO
- Thinking: NO
- Sensing: NO
- Gaming: NO
Participation typology
- Crowdsourcing: ☑
- Distributed intelligence: ☐
- Participatory science: ☐
- Extreme citizen science: ☐
- Science outreach: ☐
Contribution type
- Data collection: ☐
- Data analysis: ☐
- Data interpretation: ☑
Gaming
- Genre:
- Gaming elements:
Interface
- Data type to manipulate: pictures
- Interface enjoyment:
- Interface usability:
- Member profiles: no
- Member profile elements:
ABOUT GUIDANCE AND FEEDBACK
Guidance
- Tutorial and documentation: SOMEWHAT
- Training sequence: SOMEWHAT
- Peer to peer guidance: SOMEWHAT
Feedback on
- Individual performance: NO
- Collective performance: NO
- Research progress: SOMEWHAT
Tools
- Communication: website
- Social Network: N/A
News & Events
- Main news site:
- Frequency of project news updates: N/A
- Type of events:
- Frequency of events:
Community description
- Community size (volunteers based):
- Role:
- Interaction form:
- Has official community manager(s): no
- Has team work: N/A
- Other information about community:
- Community led additions:
OTHER PROJECT INFORMATION
- Completion level: Low
- Participant retributions: none
Bibliography
reCAPTCHA: Human-Based Character Recognition via Web Security Measures. Luis von Ahn, Benjamin Maurer, Colin McMillen, David Abraham, Manuel Blum (2008)
- http://dx.doi.org/10.1126/science.1160379
- Luis von Ahn, Benjamin Maurer, Colin McMillen, David Abraham and Manuel Blum. 2008. "reCAPTCHA: Human-Based Character Recognition via Web Security Measures" Science 12 September 2008: Vol. 321 no. 589
Labeling Images with a Computer Game. Luis von Ahn, Laura Dabbish (2005)
- http://www.cs.cmu.edu/~biglou/captcha_cacm.pdf
- Luis von Ahn and Laura Dabbish. Labeling Images with a Computer Game. ACM Conf. on Human Factors in Computing Systems, CHI 2004. pp 319-326.