Property:Has description

This is a property of type Text.

Usage: 87

Showing 50 pages using this property.

A

ALA-Reader +

Quote from the [http://www.personal.psu.edu/rbc4/score.htm software home page] (11/2014): Here is a software tool that can translate written text summaries directly into proximity files (prx) that can be analyzed by [http://interlinkinc.net/KNOT.html Pathfinder KNOT]. It also generates text proposition files that can be imported by [http://cmap.ihmc.us/download/ CMAP Tools] to automatically form concept maps from the text. It should be of use to researchers who want to visualize "text" for various instructional and research-related reasons. Also it should work with different languages. ALA-Reader contains a rudimentary scoring system. Essentially, this tool converts the written summary into a cognitive map and then scores the cognitive map using an approach that we developed for scoring concept maps. The "score" produced is percent agreement with an expert referent. As I narrow down what algorithms work, then I plan to release updated versions periodically. +

AntConc +

AntConc is a freeware concordance program for Windows, Macintosh OS X, and Linux. The software includes seven tools: Concordance Tool: shows search results in a 'KWIC' (KeyWord In Context) format. Concordance Plot Tool: shows search results plotted as a 'barcode' format. This allows you to see the position where search results appear in target texts. File View Tool: This tool shows the text of individual files. This allows you to investigate in more detail the results generated in other tools of AntConc. Clusters/N-Grams: shows clusters based on the search condition. In effect it summarizes the results generated in the Concordance Tool or Concordance Plot Tool. The N-Grams Tool, on the other hand, scans the entire corpus for 'N' (e.g. 1 word, 2 words, …) length clusters. This allows you to find common expressions in a corpus. Collocates: shows the collocates of a search term. This allows you to investigate non-sequential patterns in language. Word List: counts all the words in the corpus and presents them in an ordered list. This allows you to quickly find which words are the most frequent in a corpus. Keyword List: shows which words are unusually frequent (or infrequent) in the corpus in comparison with the words in a reference corpus. This allows you to identify characteristic words in the corpus, for example, as part of a genre or ESP study. +
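The KWIC and N-gram ideas described for AntConc can be illustrated with a minimal Python sketch. This is not AntConc's implementation; the function names `tokenize`, `kwic` and `ngrams`, the simple regex tokenizer and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def kwic(tokens, term, width=3):
    """Return (left context, keyword, right context) for every hit of `term`."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == term:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


def ngrams(tokens, n):
    """Count every run of n consecutive tokens in the token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical sample text, not taken from any real corpus.
tokens = tokenize("The cat sat on the mat. The cat slept.")
for left, keyword, right in kwic(tokens, "cat", width=2):
    print(f"{left:>12} | {keyword} | {right}")
print(ngrams(tokens, 2).most_common(1))
```

The keyword is aligned in a centre column, which is what makes repeated patterns to its left and right easy to spot; the bigram count shows how a "common expression" surfaces from simple counting.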

Apache Mahout +

According to the [http://mahout.apache.org/ home page] (Oct 1, 2014): The Apache Mahout™ project's goal is to build a scalable machine learning library. Quote: “Currently Mahout supports mainly three use cases: Recommendation mining takes users' behavior and from that tries to find items users might like. Clustering takes e.g. text documents and groups them into groups of topically related documents. Classification learns from existing categorized documents what documents of a specific category look like and is able to assign unlabelled documents to the (hopefully) correct category.” +

Apache OpenNLP +

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron based machine learning. +

Apache Superset +

Features
* A rich set of data visualizations
* An easy-to-use interface for exploring and visualizing data
* Create and share dashboards
* Enterprise-ready authentication with integration with major authentication providers (database, OpenID, LDAP, OAuth & REMOTE_USER through Flask AppBuilder)
* An extensible, high-granularity security/permission model allowing intricate rules on who can access individual features and the dataset
* A simple semantic layer, allowing users to control how data sources are displayed in the UI by defining which fields should show up in which drop-down and which aggregation and function metrics are made available to the user
* Integration with most SQL-speaking RDBMS through SQLAlchemy
* Deep integration with Druid.io
This project was originally named Panoramix, was renamed to Caravel in March 2016, and is currently named Superset as of November 2016 +

Apache UIMA +

Unstructured Information Management Architecture (UIMA) is a component framework to analyze unstructured content such as text, audio and video. It was originally developed by IBM. UIMA enables applications to be decomposed into components, for example “language identification” => “language specific segmentation” => “sentence boundary detection” => Each component implements interfaces defined by the framework and provides self-describing metadata via XML descriptor files. It also provides capabilities to wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes. +

B

Beestar insight +

Real-time system that automatically collects student engagement and attendance data and provides analytics tools and dashboards for students, teachers and management. See: [http://startups.fm/2013/03/25/5-pressing-educational-problems-beestars-location-intelligence-platform-solves.html 5 pressing educational problems Beestar’s Location Intelligence Platform solves] +

Bitext +

Quote from the [http://www.bitext.com/ home page] (11/2014): Bitext provides B2B multilingual semantic engines with “documentably” the highest accuracy in the market. Bitext works for companies in two main markets: Text Analytics (Concept and Entity Extraction, Sentiment Analysis) for Social CRM, Enterprise Feedback Management or Voice of the Customer; and in Natural Language Interfaces for Search Engines. +

C

Cytoscape +

Cytoscape is an open source software platform for visualizing molecular interaction networks and biological pathways and integrating these networks with annotations, gene expression profiles and other state data. Although Cytoscape was originally designed for biological research, now it is a general platform for complex network analysis and visualization. Cytoscape core distribution provides a basic set of features for data integration, analysis, and visualization. Additional features are available as Apps (formerly called Plugins). Apps are available for network and molecular profiling analyses, new layouts, additional file format support, scripting, and connection with databases. They may be developed by anyone using the Cytoscape open API based on Java™ technology and App community development is encouraged. Most of the Apps are freely available from Cytoscape App Store. See [http://apps.cytoscape.org/ Cytoscape App store] +

D

DataMelt +

'''DataMelt is a software environment for numeric calculations, statistics and data analysis''' DataMelt, or DMelt, is an environment for numeric computation, data analysis and data visualization. DMelt is designed for analysis of large data volumes ("big data"), data mining, statistical analyses and math computations. The program can be used in many areas, such as natural sciences, engineering, modeling and analysis of financial markets. DMelt is a computational platform: It can be used with different programming languages on different operating systems. Unlike other statistical programs, DataMelt is not limited to a single programming language. Data analysis and statistical computations can be done using high-level scripting languages (Python/Jython, Groovy, etc.), as well as a lower-level language, such as Java. It incorporates many open-source Java packages into a coherent interface using the concept of dynamic scripting. DMelt creates high-quality vector-graphics images (SVG, EPS, PDF etc.) that can be included in LaTeX and other text-processing systems. The program runs on Windows, Linux, and Mac OS. +

DocuBurst +

According to the official [http://vialab.science.uoit.ca/docuburst/help.php help page] (3/2014), DocuBurst is an online document visualization tool, and can be used for:
* Uploading your own text documents
* Creating interactive visual summaries of documents
* Exploring keywords to uncover document themes or topics
* Investigating intra-document word patterns, such as character relationships
* Comparing documents
* Commenting, annotating and sharing visualizations with others +

Dragon ToolKit +

The Dragon Toolkit is a Java-based development package for academic use in information retrieval (IR) and text mining (TM, including text classification, text clustering, text summarization, and topic modeling). It is tailored for researchers who work on large-scale IR and TM and prefer Java programming. Moreover, unlike Lucene and Lemur, it provides built-in support for semantic-based IR and TM. The Dragon Toolkit seamlessly integrates a set of NLP tools, which enable the toolkit to index text collections with various representation schemes including words, phrases, ontology-based concepts and relationships. ([dragon.ischool.drexel.edu/], retrieved March 2014) +

G

General architecture for text engineering +

GATE is over 15 years old and is in active use for all types of computational task involving human language. GATE excels at text analysis of all shapes and sizes. From large corporations to small startups, from €multi-million research consortia to undergraduate projects, our user community is the largest and most diverse of any system of this type, and is spread across all but one of the continents ([http://gate.ac.uk/overview.html GATE: a full-lifecycle open source solution for text processing]) +

Gensim +

Quote from the [http://radimrehurek.com/gensim/about.html about page] (12/2016): Gensim started off as a collection of various Python scripts for the Czech Digital Mathematics Library dml.cz in 2008, where it served to generate a short list of the most similar articles to a given article (gensim = “generate similar”). I also wanted to try these fancy “Latent Semantic Methods”, but the libraries that realized the necessary computation were not much fun to work with. By now, gensim is—to my knowledge—the most robust, efficient and hassle-free piece of software to realize unsupervised semantic modelling from plain text. It stands in contrast to brittle homework-assignment-implementations that do not scale on one hand, and robust java-esque projects that take forever just to run “hello world”. +

Gephy +

Welcome to Gephi! Gephi is an open-source software for visualizing and analysing large network graphs. Gephi uses a 3D render engine to display graphs in real-time and speed up the exploration. You can use it to explore, analyse, spatialise, filter, clusterize, manipulate and export all types of graphs. +

Gismo +

GISMO is a graphical interactive monitoring tool that provides visualization of students' activities in online courses to instructors. Instructors can examine various aspects of distance students, such as attendance at courses, reading of materials, and submission of assignments. Users of Moodle may benefit from GISMO for their teaching activities. Compared with the standard reports provided by Moodle (which basically allow teachers to see if an individual student has viewed a specific resource or participated in a specific activity on a specific day), GISMO provides comprehensive visualizations that give an overview of the whole class, not only a specific student or a particular resource. GISMO is available for Moodle 1.9.x and Moodle 2.x. +

I

IBM Many Eyes +

A website where you can visualise data such as numbers, text and geographic information. You can create a range of visualisations including unusual ones such as “treemaps” and “phrase nets”. All the charts made in Many Eyes are interactive, so you can change what data is shown and how it is displayed. Many Eyes is also an online community where users can create topical groups to organise, share and discuss data visualisations. You can sign up to receive notifications when there are new visualisations or data on topics you are interested in. You can only use Many Eyes if your data and the visualisations can be shared publicly on the Internet. +

IBM Many Eyes v2 +

A website where you can visualise data such as numbers, text and geographic information. You can create a range of visualisations including unusual ones such as “treemaps” and “phrase nets”. All the charts made in Many Eyes are interactive, so you can change what data is shown and how it is displayed. Many Eyes is also an online community where users can create topical groups to organise, share and discuss data visualisations. You can sign up to receive notifications when there are new visualisations or data on topics you are interested in. You can only use Many Eyes if your data and the visualisations can be shared publicly on the Internet. +

Iramuteq +

IRaMuTeQ stands for "Interface de R pour les Analyses Multidimensionnelles de Textes et de Questionnaires", in English, "interface of R for multi-dimensional text and questionnaire analysis". IRaMuTeQ is built on top of [[R]]. As of Oct 2014, there is only a French interface, but the software can deal with English texts. +

J

JSON-CSV Converter +

Online tools to assist in the conversion of JSON to CSV. +
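The kind of conversion such tools perform can be sketched with Python's standard `json` and `csv` modules. This is a minimal illustration, not the converter listed here; the function name `json_to_csv` and the assumption that the input is a JSON array of flat objects are hypothetical choices made for the example.

```python
import csv
import io
import json


def json_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)
    # Use the union of all keys, in first-seen order, as the header row.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    # Keys missing from a record are written as empty cells (restval).
    writer.writerows(records)
    return buffer.getvalue()


print(json_to_csv('[{"name": "Gephi", "type": "network"}, {"name": "KEEL"}]'))
```

Nested objects or arrays would need a flattening step first; the sketch leaves that out on purpose.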

Juxta +

Quote from the [http://www.juxtasoftware.org/about/ About Page] (11/2014): “Juxta is an open-source tool for comparing and collating multiple witnesses to a single textual work. Originally designed to aid scholars and editors examine the history of a text from manuscript to print versions, Juxta offers a number of possibilities for humanities computing and textual scholarship. [...] As a standalone desktop application, Juxta allows users to complete many of the necessary operations of textual criticism on digital texts (TXT and XML). With this software, you can add or remove witnesses to a comparison set, switch the base text at will. Once you’ve collated a comparison, Juxta also offers several kinds of analytic visualizations. By default, it displays a heat map of all textual variants and allows the user to locate — at the level of any textual unit — all witness variations from the base text. Users can switch to a side by side collation view, which gives a split frame comparison of a base text with a witness text. A histogram of Juxta collations is particularly useful for long documents; this visualization displays the density of all variation from the base text and serves as a useful finding aid for specific variants.” +

K

KEEL +

KEEL (Knowledge Extraction based on Evolutionary Learning) is an open source (GPLv3) Java software tool which empowers the user to assess the behavior of evolutionary learning and Soft Computing based techniques for different kinds of data mining (DM) problems: regression, classification, clustering, pattern mining and so on. See a complete description on the [http://sci2s.ugr.es/keel/description.php KEEL website] +

KH Coder +

KH Coder is an application for quantitative content analysis, text mining or corpus linguistics. It can handle Japanese, English, French, German, Italian, Portuguese and Spanish language data. Given raw texts as input, it offers search and statistical analysis functions such as KWIC, collocation statistics, co-occurrence networks, self-organizing maps, multidimensional scaling, cluster analysis and correspondence analysis. +

KNOT +

Quote from the [http://interlinkinc.net/KNOT.html software home page] (11/2014): <div style="padding:2px;border-style:dotted;border-width:thin;margin-left:1em;margin-right:1em;margin-top:0.5ex;margin-bottom:0.5ex;"> The Knowledge Network Organizing Tool (KNOT) is built around the Pathfinder network generation algorithm. There are also several other components (see below). Pathfinder algorithms take estimates of the proximities between pairs of items as input and define a network representation of the items. The network (a PFNET) consists of the items as nodes and a set of links (which may be either directed or undirected for symmetrical or non-symmetrical proximity estimates) connecting pairs of the nodes. The set of links is determined by patterns of proximities in the data and parameters of Pathfinder algorithms. For details on the method and its applications see R. Schvaneveldt (Editor), Pathfinder Associative Networks: Studies in Knowledge Organization. Norwood, NJ: Ablex, 1990. The Pathfinder software includes several programs and utilities to facilitate Pathfinder network analyses of proximity data. The system is oriented around producing pictures of the solutions, but representations of networks and other information are also available in the form of text files which can be used with other software. The positions of nodes for displays are computed using an algorithm described by Kamada and Kawai (1989, Information Processing Letters, 31, 7-15).</div> +
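How proximities determine the set of links can be sketched in Python for one common parameter choice. This is not the KNOT code: it assumes symmetric proximities expressed as distances (smaller means closer) and the setting usually written PFNET(r = ∞, q = n − 1), where a path is as long as its longest link. The function name `pathfinder_links` and the sample matrix are hypothetical.

```python
import math


def pathfinder_links(dist):
    """Return the links of PFNET(r=inf, q=n-1) for a symmetric distance matrix.

    A direct link i-j survives only if no indirect path between i and j has a
    smaller longest step than the direct distance.
    """
    n = len(dist)
    # minimax[i][j]: smallest achievable longest step over all paths i -> j,
    # computed with a Floyd-Warshall style update using max instead of sum.
    minimax = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(minimax[i][k], minimax[k][j])
                if via_k < minimax[i][j]:
                    minimax[i][j] = via_k
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.isfinite(dist[i][j]) and dist[i][j] <= minimax[i][j]
    ]


# Hypothetical distances between three items A, B, C (indices 0, 1, 2).
distances = [
    [0, 1, 5],
    [1, 0, 2],
    [5, 2, 0],
]
print(pathfinder_links(distances))
```

In the sample, the direct A-C link (distance 5) is dropped because the path A-B-C never takes a step longer than 2. Other values of the Pathfinder parameters change which links survive.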

Knime +

KNIME is a user-friendly graphical workbench for the entire analysis process: data access, data transformation, initial investigation, powerful predictive analytics, visualisation and reporting. The open integration platform provides over 1000 modules (nodes). The open source version [http://www.knime.org/knime claims to implement] a very rich platform: “The KNIME Analytics Platform incorporates hundreds of processing nodes for data I/O, preprocessing and cleansing, modeling, analysis and data mining as well as various interactive views, such as scatter plots, parallel coordinates and others. It integrates all of the analysis modules of the well known [[Weka]] data mining environment and additional plugins allow R-scripts to be run, offering access to a vast library of statistical routines.” +

KoRpus +

Quote: “koRpus is an R package i originally wrote to measure similarities/differences between texts. over time it grew into what it is now, a hopefully versatile tool to analyze text material in various ways, with an emphasis on scientific research, including readability and lexical diversity features.” +

L

LOCO-Analyst +

LOCO-Analyst is an educational tool aimed at providing teachers with feedback on the relevant aspects of the learning process taking place in a web-based learning environment, and thus helps them improve the content and the structure of their web-based courses. LOCO-Analyst aims at providing teachers with feedback regarding:
* all kinds of activities their students performed and/or took part in during the learning process,
* the usage and the comprehensibility of the learning content they had prepared and deployed in the LCMS,
* contextualized social interactions among students (i.e., social networking) in the virtual learning environment. +

Learning Analytics Enriched Rubric +

The Learning Analytics Enriched Rubric (LA e-Rubric) is an advanced grading method used for criteria-based assessment. As a rubric, it consists of a set of criteria. For each criterion, several descriptive levels are provided. A numerical grade is assigned to each of these levels. An enriched rubric contains some criteria and related grading levels that are associated with data from the analysis of learners’ interaction and learning behavior in a Moodle course, such as the number of posted messages, the number of times learning material was accessed, assignment grades and so on. Using learning analytics from log data that concern collaborative interactions, past grading performance and inquiries of course resources, the LA e-Rubric can automatically calculate the score of the various levels per criterion. The total rubric score is calculated as the sum of the scores for each criterion. +
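The scoring rule described for the LA e-Rubric (one score per criterion, total as their sum) can be sketched in Python. This is not the plugin's code: the threshold scheme used to pick a level from log data, the criterion names and all numbers are assumptions made for illustration.

```python
def criterion_score(levels, observed):
    """Return the score of the highest level whose minimum `observed` reaches.

    `levels` is a list of (minimum_value, score) pairs; this threshold
    scheme is an assumption, not the documented LA e-Rubric behavior.
    """
    score = 0
    for minimum, level_score in sorted(levels):
        if observed >= minimum:
            score = level_score
    return score


def total_score(rubric, observed):
    """Sum the per-criterion scores, as the rubric description states."""
    return sum(
        criterion_score(levels, observed[name]) for name, levels in rubric.items()
    )


# Hypothetical rubric: two criteria fed by (made-up) Moodle log counts.
rubric = {
    "forum_posts": [(0, 0), (3, 5), (10, 10)],
    "resource_views": [(0, 0), (5, 5), (20, 10)],
}
observed = {"forum_posts": 4, "resource_views": 25}
print(total_score(rubric, observed))
```

With these numbers, four forum posts reach the middle level (5 points) and 25 resource views reach the top level (10 points), so the total is 15.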

Lexico +

Quote from the [http://www.tal.univ-paris3.fr/lexico/index-gb.htm Home page]: “Lexico3 is the 2001 edition of the Lexico software, first published in 1990. Functions present from the first version (segmentation, concordances, breakdown in graphic form, characteristic elements and factorial analyses of repeated forms and segments) were maintained and for the most part significantly improved. The Lexico series is unique in that it allows the user to maintain control over the entire lexicometric process from initial segmentation to the publication of final results. Beyond identification of graphic forms, the software allows for study of the identification of more complex units composed of form sequences: repeated segments, pairs of forms in co-occurrences, etc which are less ambiguous than the graphic forms that make them up.” A free version is available for "personal work", bottom of [http://www.tal.univ-paris3.fr/lexico/download.htm this page] +

Lexos +

Quote from the home page: “This web-based tool enables you to "scrub" (clean) your unicode text(s), cut a text(s) into various size chunks, manage chunks and chunk sets, tokenize with character- or word- Ngrams or TF-IDF weighting, and choose from a suite of analysis tools for investigating those texts. Functionality includes building dendrograms, making graphs of rolling averages of word frequencies or ratios of words or letters, and playing with visualizations of word frequencies including word clouds and bubble visualizations. To facilitate subsequent text mining analyses beyond the scope of this site, users can also transpose and download their matrices of word counts or relative proportions as comma- or tab-separated files (.csv, .tsv).” +

LightSide +

“The open-source LightSide platform, including the machine-learning and feature-extraction core as well as the researcher's workbench UI, has been and continues to be funded in part through Carnegie Mellon University, in particular by grants from the National Science Foundation and the Office of Naval Research.” ([http://ankara.lti.cs.cmu.edu/side/ LightSide home page], sept. 2014). +

Lingpipe +

LingPipe is a tool kit for processing text using computational linguistics. LingPipe is used to do tasks like:
* Find the names of people, organizations or locations in news
* Automatically classify Twitter search results into categories
* Suggest correct spellings of queries
The free and open source version requires that processed data and linked software be freely available. There are other versions. +

Log Parser +

Log Parser is a flexible command line utility that was initially written by Gabriele Giuseppini, a Microsoft employee, to automate tests for IIS logging. It was intended for use with the Windows operating system, and was included with the IIS 6.0 Resource Kit Tools. By default, Log Parser works like a "data processing pipeline": it takes an SQL expression on the command line and outputs the lines containing matches for the SQL expression. (From Wikipedia) Microsoft describes Log Parser as a powerful, versatile tool that provides universal query access to text-based data such as log files, XML files and CSV files, as well as key data sources on the Windows operating system such as the Event Log, the Registry, the file system, and Active Directory. The results of the input query can be custom-formatted in text based output, or they can be persisted to more specialty targets like SQL, SYSLOG, or a chart. +
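The "SQL over text-based data" idea can be illustrated with a small Python analogue. This is not Log Parser and does not use its command-line syntax; it is a sketch built on Python's standard `csv` and `sqlite3` modules, and the function name `query_log`, the table name `log` and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3


def query_log(csv_text, sql):
    """Load CSV log text into an in-memory table named `log` and run `sql`."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, records = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    # Column names come from the CSV header; values are stored as text.
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE log ({columns})")
    conn.executemany(f"INSERT INTO log VALUES ({placeholders})", records)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


# Hypothetical access-log extract with a header row.
sample = "status,path\n200,/index.html\n404,/missing\n200,/about\n"
print(
    query_log(
        sample,
        "SELECT status, COUNT(*) FROM log GROUP BY status ORDER BY status",
    )
)
```

The pipeline shape is the same as the one described above: text records go in, an SQL expression selects and aggregates them, and result rows come out.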

Log Parser Studio +

Log Parser Studio is a graphical user interface (GUI) that functions as a front-end to [[Log Parser|Log Parser 2.2]] and provides a ‘Query Library’ to manage all queries and scripts that one builds up over time. Log Parser Studio (LPS) can house all queries in a central location and lets you edit, create and save queries. You can search for queries using free text search as well as export and import both libraries and queries in different formats, allowing for easy collaboration as well as storing multiple types of separate libraries for different protocols. +

M

MAXQDA +

MAXQDA is a mixed methods research tool. There are two versions:
* MAXQDA includes more classical QDA functionality (e.g. the ones that can be found in Atlas or Nvivo) + data management/import tools
* MAXQDAplus contains the quantitative MAXDictio tool.
According to [http://en.wikipedia.org/wiki/MAXQDA Wikipedia] (Oct 2013), “MAXQDA is a software program designed for computer-assisted qualitative and mixed methods data, text and multimedia analysis in academic, scientific, and business institutions. It is the successor of winMAX, which was first made available in 1989.” +

Maps (MediaWiki extension) +

Maps is a MediaWiki extension that provides the ability to visualize geographic data with dynamic, JavaScript-based mapping APIs. It has built-in support for geocoding, displaying maps, displaying markers, adding pop-ups, and more. +

MeTA +

Features:
* text tokenization, including deep semantic features like parse trees
* inverted and forward indexes with compression and various caching strategies
* a collection of ranking functions for searching the indexes
* topic models
* classification algorithms
* graph algorithms
* language models
* CRF implementation (POS-tagging, shallow parsing)
* wrappers for liblinear and libsvm (including libsvm dataset parsers)
* UTF8 support for analysis on various languages
* multithreaded algorithms +

Meaning Cloud +

Quote from the home page: “Textalytics is a text analysis engine that extracts meaningful elements from any type of content and structures it, so that you can easily process and manage it. Textalytics features a set of high-level web services — adaptable to the characteristics of every type of business — which can be flexibly integrated into your processes and applications.” +

Mediawiki +

The first version of the software was deployed to serve the needs of the free content Wikipedia encyclopedia in 2002. It has been deployed since then on tens of thousands of other websites for all sorts of purposes. +

Mediawiki collection extension installation +

This extension makes it possible to collect a number of pages. Collections can be edited, persisted and optionally retrieved as PDF, ODF or DocBook (XML) +

Mozenda +

Commercial software for extracting specific information. Using a point-and-click interface, Mozenda lets you extract specific information and images from websites. Mozenda is composed of an "Agent Builder" and a web console. The Mozenda Web Console can run the Agent created in the Agent Builder and lets you organize, manage, view, export and publish information. All agents are run on highly optimized harvesting servers in Mozenda's Data Centers. +

N

Nactem software tools +

NaCTeM has developed a number of high-quality text mining tools for the UK academic community. However, at least some seem to be available to all for ''non commercial purposes'' ([http://www.nactem.ac.uk/terms_conditions.php]). NaCTeM's tools and services offer benefits to a wide range of users, e.g. reduction in time and effort for finding and linking pertinent information from large scale textual resources and customised solutions in semantic data analysis. ([http://www.nactem.ac.uk/aims.php Our Aims and Objectives], retrieved March 2014). NaCTeM tools are available in different ways. For basic tools, web services exist. Others require download and sometimes configuration/installation. +

NetDraw +

NetDraw is a free Windows program for visualizing social network data. NetDraw is also included in [https://sites.google.com/site/ucinetsoftware/home UCINET], a fairly cheap commercial SNA program developed by the same company. +

NetMiner +

NetMiner is an application for exploratory analysis and visualization of large network data based on SNA (Social Network Analysis). It can be used for general research and teaching in social networks. This tool allows researchers to explore their network data visually and interactively, and helps them detect underlying patterns and structures of the network. It features data transformation, network analysis, statistics, visualization of network data, charts, and a programming language based on the [[Python]] script language. +

Netlytic +

Quote from the [https://netlytic.org/ home page] (11/2014): Netlytic is a cloud-based text and social networks analyzer that can automatically summarize and discover social networks from online conversations on social media sites. +

Neural Designer +

Neural Designer is a data mining application intended for professional data scientists. It uses neural networks, which are mathematical models of the brain function that can be trained in order to perform tasks such as function regression, pattern recognition, time series prediction or auto-association. The software provides a graphical user interface using a wizard approach consisting of a sequence of pages. It allows you to run the tasks and to obtain comprehensive results as a report in an easy way. Neural Designer stands out in terms of performance. Indeed, it is developed using C++, has been subjected to code optimization techniques and makes use of parallel processing. It can analyze bigger data sets in less time. +

O

OpenRefine +

Quote: OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; extending it with web services; and linking it to databases like Freebase. ([http://openrefine.org/ ], oct 2. 2014) +

OpenSesame +

OpenSesame is a graphical, open-source experiment builder for the social sciences. It sports a modern and intuitive user interface that allows you to build complex experiments with a minimum of effort. With OpenSesame you can create a wide range of experiments. The plug-in framework and [[Python]] scripting allow you to incorporate external devices, such as eye trackers, response boxes, and parallel port devices, into your experiment. OpenSesame is freely available under the General Public Licence. +

Orange +

Open source data visualization and analysis for novices and experts. Data mining through visual programming or Python scripting. Components for machine learning. Add-ons for bioinformatics and text mining. Packed with features for data analytics. Various add-ons like [[Orange Textable]] expand the functionality of this software. +

Orange Textable +

Quote from the [http://langtech.ch/textable Textable] (Oct. 2, 2014): Orange Textable is an open-source software tool for building data tables on the basis of raw text sources. Look at the following example to see it in typical action. Orange Textable offers the following features:
* text data import from keyboard, files, or urls
* systematic recoding
* segmentation and annotation of various text units
* extract and exploit XML-encoded annotations
* automatic, random, and arbitrary selection of unit subsets
* unit context examination using concordance and collocation tables
* frequency and complexity measures
* recoded text data and table export +
