Quote: <span style="background-color:#eeeeee" class="citation">“koRpus is an R package i originally wrote to measure similarities/differences between texts. over time it grew into what it is now, a hopefully versatile tool to analyze text material in various ways, with an emphasis on scientific research, including readability and lexical diversity features.”</span>  +
LOCO-Analyst is an educational tool aimed at providing teachers with feedback on the relevant aspects of the learning process taking place in a web-based learning environment, and thus helps them improve the content and the structure of their web-based courses. LOCO-Analyst aims at providing teachers with feedback regarding: *all kinds of activities their students performed and/or took part in during the learning process, *the usage and the comprehensibility of the learning content they had prepared and deployed in the LCMS, *contextualized social interactions among students (i.e., social networking) in the virtual learning environment.  +
The Learning Analytics Enriched Rubric (LA e-Rubric) is an advanced grading method used for criteria-based assessment. As a rubric, it consists of a set of criteria. For each criterion, several descriptive levels are provided, and a numerical grade is assigned to each of these levels. An enriched rubric contains some criteria and related grading levels that are associated with data from the analysis of learners’ interaction and learning behavior in a Moodle course, such as the number of posted messages, the number of times learning material was accessed, assignment grades and so on. Using learning analytics from log data that concern collaborative interactions, past grading performance and inquiries of course resources, the LA e-Rubric can automatically calculate the score of the various levels per criterion. The total rubric score is calculated as the sum of the scores for each criterion.  +
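The scoring rule described for the LA e-Rubric can be illustrated with a minimal Python sketch. The criterion names, thresholds and grade values below are hypothetical and chosen only for illustration. They are not taken from the Moodle plugin, whose actual configuration and matching rules may differ.

```python
from dataclasses import dataclass


@dataclass
class Level:
    """One descriptive level of a criterion (hypothetical structure)."""
    description: str
    min_value: float  # smallest analytics value that reaches this level
    score: float      # numerical grade assigned to the level


@dataclass
class Criterion:
    name: str
    levels: list  # list of Level objects


def criterion_score(criterion, observed):
    """Return the score of the highest level whose threshold is met."""
    reached = [lv for lv in criterion.levels if observed >= lv.min_value]
    if not reached:
        return 0.0
    return max(reached, key=lambda lv: lv.min_value).score


def rubric_score(criteria, analytics):
    """Total rubric score: the sum of the scores for each criterion."""
    return sum(criterion_score(c, analytics.get(c.name, 0)) for c in criteria)


# Hypothetical rubric with two enriched criteria.
criteria = [
    Criterion("forum_posts", [
        Level("no participation", 0, 0),
        Level("some participation", 3, 5),
        Level("active participation", 10, 10),
    ]),
    Criterion("resource_views", [
        Level("rarely consulted", 0, 0),
        Level("regularly consulted", 5, 10),
    ]),
]

# Hypothetical values extracted from course log data for one learner.
analytics = {"forum_posts": 4, "resource_views": 7}

total = rubric_score(criteria, analytics)
print(total)
```

In this sketch each criterion picks the highest level whose threshold the observed value reaches, and the rubric total is the plain sum of those per-criterion scores.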
Quote from the home page: <span style="background-color:#eeeeee" class="citation">“Lexico3 is the 2001 edition of the Lexico software, first published in 1990. Functions present from the first version (segmentation, concordances, breakdown in graphic form, characteristic elements and factorial analyses of repeated forms and segments) were maintained and for the most part significantly improved. The Lexico series is unique in that it allows the user to maintain control over the entire lexicometric process from initial segmentation to the publication of final results. Beyond identification of graphic forms, the software allows for study of the identification of more complex units composed of form sequences: repeated segments, pairs of forms in co-occurrences, etc which are less ambiguous than the graphic forms that make them up.”</span> A free version is available for "personal work".  +
Quote from the home page: <span style="background-color:#eeeeee" class="citation">“This web-based tool enables you to "scrub" (clean) your unicode text(s), cut a text(s) into various size chunks, manage chunks and chunk sets, tokenize with character- or word- Ngrams or TF-IDF weighting, and choose from a suite of analysis tools for investigating those texts. Functionality includes building dendrograms, making graphs of rolling averages of word frequencies or ratios of words or letters, and playing with visualizations of word frequencies including word clouds and bubble visualizations. To facilitate subsequent text mining analyses beyond the scope of this site, users can also transpose and download their matricies of word counts or relative proportions as comma- or tab-separated files (.csv, .tsv).”</span>  +
<span style="background-color:#eeeeee" class="citation">“The open-source LightSide platform, including the machine-learning and feature-extraction core as well as the researcher's workbench UI, has been and continues to be funded in part through Carnegie Mellon University, in particular by grants from the National Science Foundation and the Office of Naval Research.”</span> (LightSide home page, Sept. 2014).  +
LingPipe is a tool kit for processing text using computational linguistics. LingPipe is used to do tasks like: * Find the names of people, organizations or locations in news * Automatically classify Twitter search results into categories * Suggest correct spellings of queries The free and open source version requires that the processed data and any linked software be freely available. There are other versions.  +
Log Parser is a flexible command-line utility that was initially written by Gabriele Giuseppini, a Microsoft employee, to automate tests for IIS logging. It was intended for use with the Windows operating system, and was included with the IIS 6.0 Resource Kit Tools. By default, Log Parser works like a "data processing pipeline": it takes an SQL expression on the command line and outputs the lines containing matches for that expression. (From Wikipedia) Microsoft describes Log Parser as a powerful, versatile tool that provides universal query access to text-based data such as log files, XML files and CSV files, as well as key data sources on the Windows operating system such as the Event Log, the Registry, the file system, and Active Directory. The results of the input query can be custom-formatted in text-based output, or they can be persisted to more specialized targets like SQL, SYSLOG, or a chart.  +
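The idea of running SQL over log data can be illustrated without Log Parser itself. The sketch below is not Log Parser and does not use its query syntax. It is an analogy that loads a few invented, CSV-formatted log lines into an in-memory SQLite database with Python's standard library and runs an SQL query over them. The column names, the table name `log` and the sample data are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Invented sample log in CSV form (stands in for a real log file).
SAMPLE_LOG = """date,uri,status
2014-09-01,/index.html,200
2014-09-01,/missing.html,404
2014-09-02,/index.html,200
2014-09-02,/old-page.html,404
2014-09-02,/missing.html,404
"""


def query_log(csv_text, sql):
    """Load CSV log text into an in-memory table named 'log' and run sql."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE log (date TEXT, uri TEXT, status INTEGER)")
        conn.executemany(
            "INSERT INTO log VALUES (:date, :uri, :status)", rows
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Count "not found" responses per URI, most frequent first.
result = query_log(
    SAMPLE_LOG,
    "SELECT uri, COUNT(*) AS hits FROM log "
    "WHERE status = 404 GROUP BY uri ORDER BY hits DESC, uri",
)
for uri, hits in result:
    print(uri, hits)
```

The same pattern (parse the input into rows, expose them as a table, let the SQL expression select and aggregate) is what makes a query-based approach convenient for ad hoc log analysis.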
Log Parser Studio (LPS) is a graphical user interface (GUI) that functions as a front-end to [[Log Parser|Log Parser 2.2]] and provides a ‘Query Library’ for managing all the queries and scripts that one builds up over time. LPS can house all queries in a central location and allows you to edit, create and save queries. You can search for queries using free-text search, and export and import both libraries and queries in different formats, which allows for easy collaboration as well as storing separate libraries for different protocols.  +
MAXQDA is a mixed methods research tool. There are two versions: * MAXQDA includes more classical QDA functionality (e.g. the features that can be found in Atlas or Nvivo) + data management/import tools * MAXQDAplus contains the quantitative MAXDictio tool. According to Wikipedia (Oct. 2013), <span style="background-color:#eeeeee" class="citation">“MAXQDA is a software program designed for computer-assisted qualitative and mixed methods data, text and multimedia analysis in academic, scientific, and business institutions. It is the successor of winMAX, which was first made available in 1989.”</span>  +
Maps is a MediaWiki extension that provides the ability to visualize geographic data with dynamic, JavaScript-based mapping APIs. It has built-in support for geocoding, displaying maps, displaying markers, adding pop-ups, and more.  +
Features: * text tokenization, including deep semantic features like parse trees * inverted and forward indexes with compression and various caching strategies * a collection of ranking functions for searching the indexes * topic models * classification algorithms * graph algorithms * language models * CRF implementation (POS-tagging, shallow parsing) * wrappers for liblinear and libsvm (including libsvm dataset parsers) * UTF8 support for analysis on various languages * multithreaded algorithms  +
Quote from the home page: <span style="background-color:#eeeeee" class="citation">“Textalytics is a text analysis engine that extracts meaningful elements from any type of content and structures it, so that you can easily process and manage it. Textalytics features a set of high-level web services — adaptable to the characteristics of every type of business — which can be flexibly integrated into your processes and applications.”</span>  +
The first version of the software was deployed to serve the needs of the free-content encyclopedia Wikipedia in 2002. It has since been deployed on tens of thousands of other websites for all sorts of purposes.  +
This extension makes it possible to collect a number of pages. Collections can be edited, persisted and optionally retrieved as PDF, ODF or DocBook (XML).  +
Commercial software for extracting specific information. Using a point-and-click interface, Mozenda lets users extract specific information and images from websites. Mozenda is composed of an "Agent Builder" and a web console. The Mozenda Web Console can run the agents created in the Agent Builder and lets users organize, manage, view, export and publish information. All agents are run on highly optimized harvesting servers in Mozenda's data centers.  +
NaCTeM has developed a number of high-quality text mining tools for the UK academic community. However, at least some seem to be available to all for ''non-commercial purposes''. NaCTeM's tools and services offer benefits to a wide range of users, e.g. reduction in time and effort for finding and linking pertinent information from large-scale textual resources, and customised solutions in semantic data analysis (Our Aims and Objectives, retrieved March 2014). NaCTeM tools are available in different ways. For basic tools, web services exist. Others require download and sometimes configuration/installation.  +
NetDraw is a free Windows program for visualizing social network data. NetDraw is also included in UCINET, a fairly cheap commercial SNA program developed by the same company.  +
NetMiner is an application for exploratory analysis and visualization of large network data based on social network analysis (SNA). It can be used for general research and teaching in social networks. This tool allows researchers to explore their network data visually and interactively, and helps them detect underlying patterns and structures of the network. It features data transformation, network analysis, statistics, visualization of network data, charts, and a programming language based on the [[Python]] scripting language.  +
Quote from the home page (11/2014): Netlytic is a cloud-based text and social networks analyzer that can automatically summarize and discover social networks from online conversations on social media sites.  +
Neural Designer is a data mining application intended for professional data scientists. It uses neural networks, which are mathematical models of brain function that can be trained to perform tasks such as function regression, pattern recognition, time series prediction or auto-association. The software provides a graphical user interface using a wizard approach consisting of a sequence of pages. It allows you to run the tasks and to obtain comprehensive results as a report in an easy way. Neural Designer is designed for performance: it is developed in C++, has been subjected to code optimization techniques and makes use of parallel processing, with the aim of analyzing larger data sets in less time.  +
Quote: OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; extending it with web services; and linking it to databases like Freebase. (Oct. 2, 2014)  +
OpenSesame is a graphical, open-source experiment builder for the social sciences. It sports a modern and intuitive user interface that allows you to build complex experiments with a minimum of effort. With OpenSesame you can create a wide range of experiments. The plug-in framework and [[Python]] scripting allow you to incorporate external devices, such as eye trackers, response boxes, and parallel port devices, into your experiment. OpenSesame is freely available under the General Public Licence.  +
Open source data visualization and analysis for novices and experts. Data mining through visual programming or Python scripting. Components for machine learning. Add-ons for bioinformatics and text mining. Packed with features for data analytics. Various add-ons like [[Orange Textable]] expand the functionality of this software.  +
Quote from Textable (Oct. 2, 2014): Orange Textable is an open-source software tool for building data tables on the basis of raw text sources. Orange Textable offers the following features: * text data import from keyboard, files, or URLs * systematic recoding * segmentation and annotation of various text units * extract and exploit XML-encoded annotations * automatic, random, and arbitrary selection of unit subsets * unit context examination using concordance and collocation tables * frequency and complexity measures * recoded text data and table export  +