Search by property

This page provides a simple browsing interface for finding entities described by a property and a named value. Other available search interfaces include the page property search, and the ask query builder.

A list of all pages that have the property "Has description" with the value "Online tools to assist in the conversion of JSON to CSV.". Because there were only a few results, nearby values are also displayed.

Showing below up to 57 results starting with #1.

List of results

    • Orange  + (Open source data visualization and analysis for novices and experts. Data mining through visual programming or Python scripting. Components for machine learning. Add-ons for bioinformatics and text mining. Packed with features for data analytics. Various add-ons like [[Orange Textable]] expand the functionality of this software.)
    • OpenSesame  + (OpenSesame is a graphical, open-source experiment builder for the social sciences. It sports a modern and intuitive user interface that allows you to build complex experiments with a minimum of effort. With OpenSesame you can create a wide range of experiments. The plug-in framework and [[Python]] scripting allow you to incorporate external devices, such as eye trackers, response boxes, and parallel port devices, into your experiment. OpenSesame is freely available under the General Public Licence.)
    • Piwik  + (Piwik is an open source web analytics platform. Piwik displays reports regarding the geographic location of visits, the source of visits (i.e. whether they came from a website, directly, or something else), the technical capabilities of visitors (browser, screen size, operating system, etc.), what the visitors did (pages they viewed, actions they took, how they left), the time of visits and more. In addition to these reports, Piwik provides some other features that can help users analyze the data Piwik accumulates: Annotations, the ability to save notes (such as one's analysis of data) and attach them to dates in the past; Transitions, a feature similar to click-path features that shows how visitors navigate a website, but that displays navigation information for only one page at a time; Goals, the ability to set goals for actions that visitors should take (such as visiting a page or buying a product), with Piwik tracking how many visits result in those actions; E-commerce, the ability to track whether and how much people spend on a website; Page Overlay, a feature that displays analytics data overlaid on top of a website; Row Evolution, a feature that displays how metrics change over time within a report; and Custom Variables, the ability to attach data, like a user name, to visit data.)
    • QDA Miner  + (QDA Miner is a qualitative "mixed methods" data analysis package. There are two versions: a [http://provalisresearch.com/products/qualitative-data-analysis-software/freeware/ free QDA Miner Lite] version and an expensive commercial version. Quote from the official [http://provalisresearch.com/products/qualitative-data-analysis-software/ product page]: “QDA Miner is an easy-to-use qualitative data analysis software package for coding, annotating, retrieving and analyzing small and large collections of documents and images. QDA Miner qualitative data analysis tool may be used to analyze interview or focus group transcripts, legal documents, journal articles, speeches, even entire books, as well as drawings, photographs, paintings, and other types of visual documents. Its seamless integration with SimStat, a statistical data analysis tool, and [[WordStat]], a quantitative content analysis and text mining module, gives you unprecedented flexibility for analyzing text and relating its content to structured information including numerical and categorical data.”)
    • WordSmith  + (Quotation from the [http://www.lexically.net/downloads/version6/HTML/index.html?getting_started.htm getting started page] (11/2014): “WordSmith Tools is an integrated suite of programs for looking at how words behave in texts. You will be able to use the tools to find out how words are used in your own texts, or those of others. The WordList tool lets you see a list of all the words or word-clusters in a text, set out in alphabetical or frequency order. The concordancer, Concord, gives you a chance to see any word or phrase in context -- so that you can see what sort of company it keeps. With KeyWords you can find the key words in a text. The tools have been used by Oxford University Press for their own lexicographic work in preparing dictionaries, by language teachers and students, and by researchers investigating language patterns in lots of different languages in many countries world-wide.”)
    • KNOT  + (Quote from the [http://interlinkinc.net/KNOT.html software home page] (11/2014): The Knowledge Network Organizing Tool (KNOT) is built around the Pathfinder network generation algorithm. There are also several other components (see below). Pathfinder algorithms take estimates of the proximities between pairs of items as input and define a network representation of the items. The network (a PFNET) consists of the items as nodes and a set of links (which may be either directed or undirected for symmetrical or non-symmetrical proximity estimates) connecting pairs of the nodes. The set of links is determined by patterns of proximities in the data and parameters of Pathfinder algorithms. For details on the method and its applications see R. Schvaneveldt (Editor), Pathfinder Associative Networks: Studies in Knowledge Organization. Norwood, NJ: Ablex, 1990. The Pathfinder software includes several programs and utilities to facilitate Pathfinder network analyses of proximity data. The system is oriented around producing pictures of the solutions, but representations of networks and other information are also available in the form of text files which can be used with other software. The positions of nodes for displays are computed using an algorithm described by Kamada and Kawai (1989, Information Processing Letters, 31, 7-15).)
    • Orange Textable  + (Quote from the [http://langtech.ch/textable Textable] (Oct. 2, 2014): Orange Textable is an open-source software tool for building data tables on the basis of raw text sources. Look at the following example to see it in typical action. Orange Textable offers the following features: text data import from keyboard, files, or urls; systematic recoding; segmentation and annotation of various text units; extract and exploit XML-encoded annotations; automatic, random, and arbitrary selection of unit subsets; unit context examination using concordance and collocation tables; frequency and complexity measures; recoded text data and table export.)
    • Gensim  + (Quote from the [http://radimrehurek.com/gensim/about.html about page] (12/2016): Gensim started off as a collection of various Python scripts for the Czech Digital Mathematics Library dml.cz in 2008, where it served to generate a short list of the most similar articles to a given article (gensim = “generate similar”). I also wanted to try these fancy “Latent Semantic Methods”, but the libraries that realized the necessary computation were not much fun to work with. By now, gensim is, to my knowledge, the most robust, efficient and hassle-free piece of software to realize unsupervised semantic modelling from plain text. It stands in contrast to brittle homework-assignment-implementations that do not scale on one hand, and robust java-esque projects that take forever just to run “hello world”.)
    • Textalyser  + (Quote from the [http://textalyser.net/ home page]: “Welcome to the online text analysis tool, the detailed statistics of your text, perfect for translators (quoting), for webmasters (ranking) or for normal users, to know the subject of a text. Now with new features as the analysis of words groups, finding out the keyword density, analyse the prominence of word or expressions.”)
    • Bitext  + (Quote from the [http://www.bitext.com/ home page] (11/2014): Bitext provides B2B multilingual semantic engines with “documentably” the highest accuracy in the market. Bitext works for companies in two main markets: Text Analytics (Concept and Entity Extraction, Sentiment Analysis) for Social CRM, Enterprise Feedback Management or Voice of the Customer; and in Natural Language Interfaces for Search Engines.)
    • Juxta  + (Quote from the [http://www.juxtasoftware.org/about/ About Page] (11/2014): “Juxta is an open-source tool for comparing and collating multiple witnesses to a single textual work. Originally designed to aid scholars and editors examine the history of a text from manuscript to print versions, Juxta offers a number of possibilities for humanities computing and textual scholarship. [...] As a standalone desktop application, Juxta allows users to complete many of the necessary operations of textual criticism on digital texts (TXT and XML). With this software, you can add or remove witnesses to a comparison set, switch the base text at will. Once you’ve collated a comparison, Juxta also offers several kinds of analytic visualizations. By default, it displays a heat map of all textual variants and allows the user to locate, at the level of any textual unit, all witness variations from the base text. Users can switch to a side by side collation view, which gives a split frame comparison of a base text with a witness text. A histogram of Juxta collations is particularly useful for long documents; this visualization displays the density of all variation from the base text and serves as a useful finding aid for specific variants.”)
    • ALA-Reader  + (Quote from the [http://www.personal.psu.edu/rbc4/score.htm software home page] (11/2014): Here is a software tool that can translate written text summaries directly into proximity files (prx) that can be analyzed by [http://interlinkinc.net/KNOT.html Pathfinder KNOT]. It also generates text proposition files that can be imported by [http://cmap.ihmc.us/download/ CMAP Tools] to automatically form concept maps from the text. It should be of use to researchers who want to visualize "text" for various instructional and research-related reasons. Also it should work with different languages. ALA-Reader contains a rudimentary scoring system. Essentially, this tool converts the written summary into a cognitive map and then scores the cognitive map using an approach that we developed for scoring concept maps. The "score" produced is percent agreement with an expert referent. As I narrow down what algorithms work, then I plan to release updated versions periodically.)
    • Lexico  + (Quote from the [http://www.tal.univ-paris3.fr/lexico/index-gb.htm Home page]: “Lexico3 is the 2001 edition of the Lexico software, first published in 1990. Functions present from the first version (segmentation, concordances, breakdown in graphic form, characteristic elements and factorial analyses of repeated forms and segments) were maintained and for the most part significantly improved. The Lexico series is unique in that it allows the user to maintain control over the entire lexicometric process from initial segmentation to the publication of final results. Beyond identification of graphic forms, the software allows for study of the identification of more complex units composed of form sequences: repeated segments, pairs of forms in co-occurrences, etc which are less ambiguous than the graphic forms that make them up.” A free version is available for "personal work" at the bottom of [http://www.tal.univ-paris3.fr/lexico/download.htm this page].)
    • Wordcruncher  + (Quote from the [http://www.wordcruncher.com/index.html home page]: “WordCruncher is a free eBook reader with research tools to help students and scholars study important texts. You can look for specific references, search for words or phrases, follow cross-reference hyperlinks, and enlarge images. You can copy and paste text, add bookmarks, highlight text, and make searchable notes. Additional study aids include complex searches, word frequencies, word frequency distributions, synchronized windows to compare translations, word tags, and various text analysis reports (e.g., collocation, vocabulary dispersion, vocabulary usage).”)
    • Netlytic  + (Quote from the [https://netlytic.org/ home page] (11/2014): Netlytic is a cloud-based text and social networks analyzer that can automatically summarize and discover social networks from online conversations on social media sites.)
    • Meaning Cloud  + (Quote from the home page: “Textalytics is a text analysis engine that extracts meaningful elements from any type of content and structures it, so that you can easily process and manage it. Textalytics features a set of high-level web services, adaptable to the characteristics of every type of business, which can be flexibly integrated into your processes and applications.”)
    • Lexos  + (Quote from the home page: “This web-based tool enables you to "scrub" (clean) your unicode text(s), cut a text(s) into various size chunks, manage chunks and chunk sets, tokenize with character- or word- Ngrams or TF-IDF weighting, and choose from a suite of analysis tools for investigating those texts. Functionality includes building dendrograms, making graphs of rolling averages of word frequencies or ratios of words or letters, and playing with visualizations of word frequencies including word clouds and bubble visualizations. To facilitate subsequent text mining analyses beyond the scope of this site, users can also transpose and download their matricies of word counts or relative proportions as comma- or tab-separated files (.csv, .tsv).”)
    • KoRpus  + (Quote: “koRpus is an R package i originally wrote to measure similarities/differences between texts. over time it grew into what it is now, a hopefully versatile tool to analyze text material in various ways, with an emphasis on scientific research, including readability and lexical diversity features.”)
    • OpenRefine  + (Quote: OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; extending it with web services; and linking it to databases like Freebase. ([http://openrefine.org/], Oct. 2, 2014))
    • Stanford NLP toolkits  + (Quote: The Stanford NLP Group makes parts of our Natural Language Processing software available to everyone. These are statistical NLP toolkits for various major computational linguistics problems. They can be incorporated into applications with human language technology needs. ([http://nlp.stanford.edu/software/index.shtml]))
    • Tweet NLP  + (Quote: We provide a tokenizer, a part-of-speech tagger, hierarchical word clusters, and a dependency parser for tweets, along with annotated corpora and web-based annotation tools.)
    • TOko  + (Quoted from the tOko homepage (Oct. 2014): tOKo is an open source tool for text analysis and browsing a corpus of documents. It implements a wide variety of text analysis and browsing functions in an interactive user interface. An important application area of tOKo is ontology development. It supports both ontology construction from a corpus, as well as relating the ontology back to a corpus (for example by highlighting concepts from the ontology in a document). Another application area is community research. Here the objective is to analyse the exchange of information, for example in a community forum or through a collection of interconnected weblogs.)
    • Redash  + (Quotes from the [https://redash.io/help/aboutrd/aboutrd.html#whats_redash FAQ]: Redash is an open source tool for teams to query, visualize and collaborate. Redash is quick to setup and works with any data source you might need so you can query from anywhere in no time. [..] Redash was built to allow fast and easy access to billions of records, that we process and collect using Amazon Redshift (“petabyte scale data warehouse” that “speaks” PostgreSQL). Today Redash has support for querying multiple databases, including: Redshift, Google BigQuery, Google Spreadsheets, PostgreSQL, MySQL, Graphite, Axibase Time Series Database and custom scripts. Main features: Query editor - enjoy all the latest standards like auto-complete and snippets. Share both your results and queries to support an open and data driven approach within the organization. Visualization - once you have your dataset, select one of our /9 types of visualizations/ for your query. You can also export or embed it anywhere. Dashboard - combine several visualizations into a topic targeted dashboard. Alerts - get notified via email, Slack, Hipchat or a webhook when your query's results need attention. API - anything you can do with the UI, you can do with the API. Easily connect results to other systems or automate your workflows.)
    • R  + (R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. R is available as Free Software for data manipulation, calculation and graphical display. It includes an effective data handling and storage facility; a suite of operators for calculations on arrays, in particular matrices; a large, coherent, integrated collection of intermediate tools for data analysis; graphical facilities for data analysis and display either on-screen or on hardcopy; and a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities. R can be considered as an environment within which statistical techniques are implemented. R can be extended via packages. For example, try [http://rqda.r-forge.r-project.org/ RQDA] and [http://cran.at.r-project.org/web/views/NaturalLanguageProcessing.html CRAN Task View: Natural Language Processing].)
    • RapidAnalytics  + (RapidAnalytics is an open source server for data mining and business analytics. It is based on the data mining solution RapidMiner and includes ETL, data mining, reporting, dashboards in a single server solution.)
    • RapidMiner Studio  + (RapidMiner is a world-leading open-source system for data mining. It is available as a stand-alone application for data analysis and as a data mining engine for the integration into own products. '''RapidMiner is now RapidMiner Studio''' and RapidAnalytics is now called RapidMiner Server. In a few words, RapidMiner Studio is a "downloadable GUI for machine learning, data mining, text mining, predictive analytics and business analytics". It can also be used (for most purposes) in batch mode (command line mode). [[User:Camacab0|Camacab0]] ([[User talk:Camacab0|talk]]))
    • Beestar insight  + (Real-time system that automatically collects student engagement and attendance & provides analytics tools and dashboards for students, teachers & management. See: [http://startups.fm/2013/03/25/5-pressing-educational-problems-beestars-location-intelligence-platform-solves.html 5 pressing educational problems Beestar’s Location Intelligence Platform solves])
    • SQ-ALL  + (Run SQL queries on APIs, JSON / XML / RSS feeds, Web pages (tables), EVERYTHING!)
    • SAM  + (SAM includes a set of visualizations of learner activities to increase awareness and to support self-reflection. These are implemented as widgets in the ROLE project)
    • SATO  + (SATO is a multi-purpose text mining tool, e.g. it includes concordancing, lexical inventorying, annotation and categorization. It allows you to mark up text with variables for further analysis. SATO is a web-based text analysis tool using a command line language. So far, only a French interface exists. A commercial version exists, i.e. you can buy a license to install the same system on your own server.)
    • SNAPP  + (SNAPP essentially serves as a diagnostic instrument, allowing teaching staff to evaluate student behavioral patterns against learning activity design objectives and intervene as required in a timely manner.)
    • Scrapy  + (Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing. Features: Scrapy was designed with simplicity in mind, by providing the features you need without getting in your way. Just write the rules to extract the data from web pages and let Scrapy crawl the entire web site for you. Scrapy is used in production crawlers to completely scrape more than 500 retailer sites daily, all in one server. Scrapy was designed with extensibility in mind and so it provides several mechanisms to plug new code without having to touch the framework core. Scrapy is completely written in [[Python]] and runs on Linux, Windows, Mac and BSD. Scrapy comes with lots of functionality built in. Check this section of the documentation for a list of them. Scrapy is extensively documented and has a comprehensive test suite with very good code coverage.)
    • Semantic Drilldown  + (Semantic Drilldown is an extension to Semantic MediaWiki (SMW) that provides a page for drilling down through a site's data, using categories and filters on semantic properties. The list of pages in each top-level category can be viewed, and for each such category, filters can be created that cover a specific semantic property. If filters exist for a category, users can click on the different possible values for those filters, narrowing the set of results, and thus drill down through the data.)
    • Semantic Forms Inputs  + (Semantic Forms Inputs is an extension to MediaWiki that provides additional input types for Semantic MediaWikis that use the Semantic Forms extension.)
    • Page Forms  + (Semantic Forms is an extension to MediaWiki that allows users to add, edit and query data using forms. It is heavily tied in with the Semantic MediaWiki extension, and is meant to be used for structured data that has semantic markup.)
    • Semantic Maps  + (Semantic Maps is an extension to Semantic MediaWiki (SMW) that adds semantic capabilities to the Maps extension and adds the datatype Geographic coordinate.)
    • Semantic MediaWiki  + (Semantic MediaWiki is an extension for managing structured data in your wiki and for querying that data to create dynamic representations: tables, timelines, maps, lists, etc.)
    • Semantic Result Formats  + (Semantic Result Formats (SRF) is a MediaWiki extension, used in conjunction with the Semantic MediaWiki extension, that bundles a number of further result formats for SMW's inline queries. The individual formats can be added to the installation independently...)
    • Taporware  + (TAPoRware is a set of text analysis tools that enables users to perform text analysis on HTML, XML and plain text files, using documents from the users' machine or on the web. There are five families of tools: for HTML, XML, Text, Other and Beta. A list is included below in the free text section.)
    • Tableau Public  + (Tableau software helps people communicate data through an innovation called VizQL, a visual query language that converts drag-and-drop actions into data queries, allowing users to quickly find and share insights in their data. With Tableau, “data workers” first connect to data stored in files, cubes, databases, warehouses, Hadoop technologies, and even some cloud sources like Google Analytics. They then interact with the Tableau user interface to simultaneously query the data and view the results in charts, graphs, and maps that can be arranged together on dashboards. ([http://shop.oreilly.com/product/0636920030942.do Jones, 2014]: 15) Basically, one has to install a desktop application (Win/Mac) and create a visualization. The result then can be published either on their public server or on your own server (commercial).)
    • Tabula  + (Tabula is a free, open source tool that allows you to easily take data out of PDF files and into Excel, database programs, and web applications. Tabula allows users to upload their documents, indicate the position of the tables they want and extract the data right into a comma-separated values (CSV) or tab-separated values (TSV) file, or just copy the text as CSV to a clipboard. Tabula can repeat the operation on several pages or documents.)
    • TextSTAT  + (TextSTAT is a simple text analysis program. Its main functionality is concordance. Quote from the [http://neon.niederlandistik.fu-berlin.de/en/textstat/ home page] (11/2014): “TextSTAT is a simple programme for the analysis of texts. It reads plain text files (in different encodings) and HTML files (directly from the internet) and it produces word frequency lists and concordances from these files. This version includes a web-spider which reads as many pages as you want from a particular website and puts them in a TextSTAT-corpus. The new news-reader, too, puts news messages in a TextSTAT-readable corpus file. TextSTAT reads MS Word and OpenOffice files. No conversion needed, just add the files to your corpus...”)
    • Apache OpenNLP  + (The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron based machine learning.)
    • Dragon ToolKit  + (The Dragon Toolkit is a Java-based development package for academic use in information retrieval (IR) and text mining (TM, including text classification, text clustering, text summarization, and topic modeling). It is tailored for researchers who work on large-scale IR and TM and prefer Java programming. Moreover, unlike Lucene and Lemur, it provides built-in support for semantic-based IR and TM. The Dragon Toolkit seamlessly integrates a set of NLP tools, which enable the toolkit to index text collections with various representation schemes including words, phrases, ontology-based concepts and relationships. ([dragon.ischool.drexel.edu/], retrieved March 2014))
    • Learning Analytics Enriched Rubric  + (The Learning Analytics Enriched Rubric (LA e-Rubric) is an advanced grading method used for criteria-based assessment. As a rubric, it consists of a set of criteria. For each criterion, several descriptive levels are provided. A numerical grade is assigned to each of these levels. An enriched rubric contains some criteria and related grading levels that are associated with data from the analysis of learners’ interaction and learning behavior in a Moodle course, such as the number of posted messages, the number of times learning material was accessed, assignment grades and so on. Using learning analytics from log data that concern collaborative interactions, past grading performance and inquiries of course resources, the LA e-Rubric can automatically calculate the score of the various levels per criterion. The total rubric score is calculated as the sum of the scores for each criterion.)
    • Mediawiki  + (The first version of the software was deployed to serve the needs of the free content Wikipedia encyclopedia in 2002. It has been deployed since then in tens of thousands of other websites for all sorts of purposes.)
    • Semilar  + (The goal of the SEMantic simILARity software toolkit (SEMILAR; pronounced the same way as the word 'similar') is to promote productive, fair, and rigorous research advancements in the area of semantic similarity. The kit is available as application software or as a Java API. As of March 2014, the GUI-based SEMILAR application is only available to a limited number of users who commit to helping improve the usability of the interface. The Java library (API), however, can be downloaded. SEMILAR comes with various similarity methods based on Wordnet, Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), BLEU, Meteor, Pointwise Mutual Information (PMI), dependency-based methods, optimized methods based on Quadratic Assignment, etc. The similarity methods work at different granularities: word to word, sentence to sentence, or bigger texts. Some methods have their own variations which, coupled with parameter settings and your selection of preprocessing steps, could result in a huge space of possible instances of the same basic method.)
    • Mediawiki collection extension installation  + (This extension makes it possible to collect a number of pages. Collections can be edited, persisted and optionally retrieved as PDF, ODF or DocBook (XML))
    • Tropes  + (Tropes is free text-analysis (text mining) software. Its features include the ability to carry out stylistic, syntactic and semantic analyses and to present the results in graph and table form. Tropes can yield information about a text such as stylistic/rhetorical analyses (argumentative, enunciative, descriptive or narrative style). It can also identify different word categories (verbs, connectors, personal pronouns, modalities, qualifying adjectives), conduct thematic analyses (reference fields), and detect discursive/chronological structures.)
    • Apache UIMA  + (Unstructured Information Management Architecture (UIMA) is a component framework to analyze unstructured content such as text, audio and video. It was originally developed by IBM. UIMA enables applications to be decomposed into components, for example “language identification” => “language specific segmentation” => “sentence boundary detection”. Each component implements interfaces defined by the framework and provides self-describing metadata via XML descriptor files. UIMA also provides capabilities to wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes.)
    • Voyant Tools  + (Voyeur is a web-based text analysis environment. It is designed to be user-friendly, flexible and powerful. Voyeur is part of Hermeneuti.ca, a collaborative project to develop and theorize text analysis tools and text analysis rhetoric. [http://hermeneuti.ca/voyeur/ Voyeur Tools: See Through Your Texts] (retrieved 3/2014). In Voyeur, you can: use texts in a variety of formats including plain text, HTML, XML, PDF, RTF and MS Word; use texts from different locations, including URLs and uploaded files; perform lexical analysis including the study of frequency and distribution data, and in particular export data into other tools (as XML, tab separated values, etc.); embed live tools into remote web sites that can accompany or complement your own content.)
    • Web-harvest  + (Web-Harvest is an open source web data extraction tool written in Java. It offers a way to collect desired Web pages and extract useful data from them. In order to do that, it leverages well established techniques and technologies for text/XML manipulation such as XSLT, XQuery and Regular Expressions. Web-Harvest mainly focuses on HTML/XML based web sites, which still make up the vast majority of Web content. On the other hand, it can easily be supplemented by custom Java libraries in order to augment its extraction capabilities.)
    • Weka  + (Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes. Weka 3.7 (still beta in Oct. 2014) includes a package system that allows adding functionality without recompiling the system. As of summer 2014, most people seem to use this developer version. Weka is a very popular free data mining tool that includes advanced text mining features.)
    • Gephy  + (Welcome to Gephi! Gephi is open-source software for visualizing and analysing large network graphs. Gephi uses a 3D render engine to display graphs in real-time and speed up the exploration. You can use it to explore, analyse, spatialise, filter, clusterize, manipulate and export all types of graphs.)
    • WordStat  + (WordStat is commercial text-mining and content analysis software. It is integrated with the [[QDA Miner]] and SimStat products from the same company. Quote from the official [http://provalisresearch.com/products/content-analysis-software/product page]: “WordStat is a flexible and easy-to-use text analysis software – whether you need text mining tools for fast extraction of themes and trends, or careful and precise measurement with state-of-the-art quantitative content analysis tools. WordStat’s seamless integration with SimStat – our statistical data analysis tool – and QDA Miner – our qualitative data analysis software – gives you unprecedented flexibility for analyzing text and relating its content to structured information, including numerical and categorical data.”)
    • Tm  + (The tm package provides a framework for text mining applications within R. It offers functionality for managing text documents, abstracts the process of document manipulation and eases the usage of heterogeneous text formats in R. The package provides native support for reading in several classic file formats such as plain text, PDFs, or XML files. There is also a plug-in mechanism to handle additional file formats. The data structures and algorithms can be extended to fit custom demands.)