LightSide: Difference between revisions
Line 39: | Line 39: | ||
'''Short how to''' | '''Short how to''' | ||
LightSide is based on machine learning algorithms that can learn to tag text from examples. | |||
Below is a longer quote from the [http://ankara.lti.cs.cmu.edu/side/LightSide_Researchers_Manual.pdf Manual] (Feb 2014): | Below is a longer, slightly modified quote from the [http://ankara.lti.cs.cmu.edu/side/LightSide_Researchers_Manual.pdf Manual] (Feb 2014): | ||
{{quotation| LightSide is divided into a series of six tabs following the entire process of machine learning. In the first, Extract Features, training documents are converted into feature tables. Next, in Restructure Plugins, we have built several tools which allow users to manually adjust the resulting feature tables. | {{quotation| LightSide is divided into a series of six tabs following the entire process of machine learning. In the first, '''Extract Features''', training documents are converted into feature tables. Next, in '''Restructure Plugins''', we have built several tools which allow users to manually adjust the resulting feature tables. In '''Build Model''', the third tab, modern algorithms are used to discover latent patterns in that feature table. The classifier that results is able to reproduce human annotation.}} | ||
In Build Model, the third tab, modern algorithms are used to discover latent patterns in that feature table. The classifier that results is able to reproduce | |||
human annotation. | |||
The next three tabs allow users to explore those trained models and use them to annotate new data. The fourth tab, Explore Results, offers error analysis tools that allow researchers to understand what their models do well and why they fail in some cases. The fifth, Compare Results, allows users to look at specific differences between two different trained models to understand both gaps | {{quotation|The next three tabs allow users to explore those trained models and use them to annotate new data. The fourth tab, '''Explore Results''', offers error analysis tools that allow researchers to understand what their models do well and why they fail in some cases. The fifth, '''Compare Results''', allows users to look at specific differences between two different trained models to understand both gaps in performance as a whole and individually. The final tab, '''Predict Labels''', allows us to use the resulting trained models to annotate new data that no humans have labeled.}} | ||
in performance as a whole and individually. The final tab, Predict Labels, allows us to use the resulting trained models to annotate new data that no humans have labeled. | |||
The simplest workflow, for those with basic machine learning needs, comes from the first and third tabs. In each case we progress from an input data structure to an output data structure: | {{quotation|The simplest workflow, for those with basic machine learning needs, comes from the first and third tabs. In each case we progress from an input data structure to an output data structure: | ||
''Documents → Extract Features → Feature Table → Build Model → Trained Model''}} | ''Documents → Extract Features → Feature Table → Build Model → Trained Model''}} | ||
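The ''Documents → Feature Table → Trained Model'' progression quoted above can be illustrated outside of LightSide with a minimal Python sketch. This is not LightSide's own code or API: the function names (<code>extract_features</code>, <code>build_model</code>, <code>predict_label</code>), the toy documents, and the choice of unigram counts with a naive Bayes classifier are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict
import math


def extract_features(documents):
    """Documents -> feature table: one row of unigram counts per document."""
    return [Counter(text.lower().split()) for text in documents]


def build_model(feature_table, labels):
    """Feature table -> trained model (multinomial naive Bayes counts)."""
    vocab = set()
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter(labels)      # label -> number of documents
    for row, label in zip(feature_table, labels):
        word_counts[label].update(row)
        vocab.update(row)
    return {"vocab": vocab, "word_counts": word_counts, "label_counts": label_counts}


def predict_label(model, text):
    """Trained model + new document -> most probable label."""
    row = extract_features([text])[0]
    total_docs = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["label_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for word, n in row.items():
            if word in model["vocab"]:  # words never seen in training are ignored
                # add-one (Laplace) smoothing avoids log(0)
                score += n * math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-labeled training documents (toy data, not from the manual)
documents = ["great fun movie", "boring dull movie", "fun and great", "dull and boring"]
labels = ["pos", "neg", "pos", "neg"]

feature_table = extract_features(documents)  # step 1: Extract Features
model = build_model(feature_table, labels)   # step 3: Build Model
```

Calling <code>predict_label(model, "great fun")</code> then plays the role of the Predict Labels tab: the trained model annotates a document that no human has labeled.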