Gensim
This page was last edited on: 2016/12/23
The completion level of this page is: Low
SHORT DESCRIPTION
Quote from the about page (12/2016): Gensim started off as a collection of various Python scripts for the Czech Digital Mathematics Library dml.cz in 2008, where it served to generate a short list of the most similar articles to a given article (gensim = “generate similar”). I also wanted to try these fancy “Latent Semantic Methods”, but the libraries that realized the necessary computation were not much fun to work with.
By now, gensim is—to my knowledge—the most robust, efficient and hassle-free piece of software to realize unsupervised semantic modelling from plain text. It stands in contrast to brittle homework-assignment-implementations that do not scale on one hand, and robust java-esque projects that take forever just to run “hello world”.
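The "generate similar" idea in the quote above, listing the articles most similar to a given one, can be illustrated with a small sketch. It deliberately uses only the Python standard library and is not Gensim's API or its implementation: the function names (`tfidf_vectors`, `cosine`, `most_similar`), the whitespace tokenizer, the TF-IDF weighting, and the four sample articles are all assumptions made for illustration. Gensim provides its own, far more scalable, building blocks for the same task.

```python
import math
from collections import Counter

# Hypothetical sample corpus, invented for this illustration.
ARTICLES = [
    "latent semantic analysis of mathematical articles",
    "semantic analysis for digital mathematics libraries",
    "recipes for baking sourdough bread at home",
    "topic modelling of large text corpora",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    return [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(docs, query_index, top_n=3):
    """Rank the other documents by similarity to docs[query_index]."""
    vectors = tfidf_vectors(docs)
    query = vectors[query_index]
    scores = [
        (i, cosine(query, vec))
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


if __name__ == "__main__":
    for index, score in most_similar(ARTICLES, query_index=0):
        print(f"{score:.3f}  {ARTICLES[index]}")
```

For the first sample article, the second article ranks highest because it shares the terms "semantic" and "analysis", while the bread recipe shares no terms and scores zero. Latent semantic methods, as mentioned in the quote, go one step further by projecting such vectors into a lower-dimensional topic space before comparing them.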
TOOL CHARACTERISTICS
Tool orientation | Data mining type | Usability |
---|---|---|
This tool is designed for general-purpose analysis. | This tool is designed for text mining. | Usability not rated by the authors of this page. |
Data import format | Data export format |
---|---|
Not specified | Not specified |
Tool objective(s) in the field of Learning Sciences | |
☑ Analysis & Visualisation of data |
☑ Providing feedback for supporting instructors: |
Can perform data extraction of type:
Can perform data transformation of type:
Can perform data analysis of type:
Can perform data visualisation of type:
(These visualisations can be interactive and updated in "real time")
ABOUT USER
Tool is suitable for: | ||||
Students/Learners/Consumers:☑ | Teachers/Tutors/Managers:☑ | Researchers:☑ | Organisations/Institutions/Firms:☑ | Others:☑ |
Required skills: | |||
Statistics: MEDIUM | Programming: N/A | System administration: N/A | Data mining models: MEDIUM |
OTHER TOOL INFORMATION
@inproceedings{rehurek_lrec,
  title = {{Software Framework for Topic Modelling with Large Corpora}},
  author = {Radim {\v R}eh{\r u}{\v r}ek and Petr Sojka},
  booktitle = {{Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks}},
  pages = {45--50},
  year = 2010,
  month = May,
  day = 22,
  publisher = {ELRA},
  address = {Valletta, Malta},
  note = {\url{http://is.muni.cz/publication/884893/en}},
  language = {English}
}