MeTA: Difference between revisions
{{Data mining and learning analytics tools
|field_name=MeTA: ModErn Text Analysis
|field_developers=ChengXiang Zhai et al., University of Illinois at Urbana-Champaign
|field_description=Features:
* text tokenization, including deep semantic features like parse trees
* inverted and forward indexes with compression and various caching strategies
* a collection of ranking functions for searching the indexes
* topic models
* classification algorithms
* graph algorithms
* language models
* CRF implementation (POS-tagging, shallow parsing)
* wrappers for liblinear and libsvm (including libsvm dataset parsers)
* UTF8 support for analysis on various languages
* multithreaded algorithms
|field_analysis_orientation=General analysis
|field_mining_tool_type=Text mining
Another design philosophy of MeTA is to facilitate education and research experiments with various algorithms. In this respect, it is similar to Indri/Lemur in its emphasis on modularity and extensibility, achieved through object-oriented design. A selected subset of modules can be flexibly configured, which makes it easy to design course assignments or to experiment with a few selected algorithms in focused research projects. For example, it has been used successfully in a MOOC on Text Retrieval and Search Engines, where over one thousand Coursera learners used the toolkit to complete a large programming assignment. It will be used again to support programming assignments in another upcoming MOOC on Text Mining and Analytics.
== Bibliography ==
Massung, S., Geigle, C., & Zhai, C. (2016). META: A Unified Toolkit for Text Retrieval and Analysis. ACL 2016, 91. https://www.aclweb.org/anthology/P/P16/P16-4.pdf#page=103
Zhai, C. (2011). Beyond Search: Statistical Topic Models for Text Analysis (slides). https://meta-toolkit.org/sigir-keynote-zhai.pdf