Property:Has description

The educational technology and digital learning wiki

This is a property of type Text.

Showing 50 pages using this property.
A
Unstructured Information Management Architecture (UIMA) is a component framework for analyzing unstructured content such as text, audio and video. It was originally developed by IBM. UIMA enables applications to be decomposed into components, for example “language identification” => “language specific segmentation” => “sentence boundary detection”. Each component implements interfaces defined by the framework and provides self-describing metadata via XML descriptor files. UIMA also provides capabilities to wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes.  +
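The decomposition into chained components described above can be illustrated with a minimal sketch. This is not the UIMA API (UIMA is a Java framework driven by XML descriptors); the function names, the dictionary-based document and the toy heuristics are all assumptions made for illustration.

```python
def identify_language(doc):
    """Toy language identification: a hypothetical stand-in for a real component."""
    padded = " " + doc["text"].lower() + " "
    doc["language"] = "en" if " the " in padded else "unknown"
    return doc


def split_sentences(doc):
    """Toy sentence boundary detection: split on full stops only."""
    doc["sentences"] = [s.strip() for s in doc["text"].split(".") if s.strip()]
    return doc


def run_pipeline(doc, components):
    """Pass the document through each component in order."""
    for component in components:
        doc = component(doc)
    return doc


result = run_pipeline(
    {"text": "The cat sat. The dog ran."},
    [identify_language, split_sentences],
)
print(result["language"], result["sentences"])
```

Each step only reads and annotates the shared document, which is what makes it possible to swap or reorder components independently.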
B
Real-time system that automatically collects student engagement and attendance data and provides analytics tools and dashboards for students, teachers and management. See: [http://startups.fm/2013/03/25/5-pressing-educational-problems-beestars-location-intelligence-platform-solves.html 5 pressing educational problems Beestar’s Location Intelligence Platform solves]  +
Quote from the [http://www.bitext.com/ home page] (11/2014): Bitext provides B2B multilingual semantic engines with “documentably” the highest accuracy in the market. Bitext works for companies in two main markets: Text Analytics (Concept and Entity Extraction, Sentiment Analysis) for Social CRM, Enterprise Feedback Management or Voice of the Customer; and in Natural Language Interfaces for Search Engines.  +
C
Cytoscape is an open source software platform for visualizing molecular interaction networks and biological pathways and integrating these networks with annotations, gene expression profiles and other state data. Although Cytoscape was originally designed for biological research, now it is a general platform for complex network analysis and visualization. Cytoscape core distribution provides a basic set of features for data integration, analysis, and visualization. Additional features are available as Apps (formerly called Plugins). Apps are available for network and molecular profiling analyses, new layouts, additional file format support, scripting, and connection with databases. They may be developed by anyone using the Cytoscape open API based on Java™ technology and App community development is encouraged. Most of the Apps are freely available from Cytoscape App Store. See [http://apps.cytoscape.org/ Cytoscape App store]  +
D
'''DataMelt is a software environment for numeric calculations, statistics and data analysis.''' DataMelt, or DMelt, is an environment for numeric computation, data analysis and data visualization. DMelt is designed for analysis of large data volumes ("big data"), data mining, statistical analyses and math computations. The program can be used in many areas, such as natural sciences, engineering, modeling and analysis of financial markets. DMelt is a computational platform: it can be used with different programming languages on different operating systems. Unlike other statistical programs, DataMelt is not limited to a single programming language. Data analysis and statistical computations can be done using high-level scripting languages (Python/Jython, Groovy, etc.), as well as a lower-level language such as Java. It incorporates many open-source Java packages into a coherent interface using the concept of dynamic scripting. DMelt creates high-quality vector-graphics images (SVG, EPS, PDF etc.) that can be included in LaTeX and other text-processing systems. The program runs on Windows, Linux and Mac OS.  +
According to the official [http://vialab.science.uoit.ca/docuburst/help.php help page] (3/2014), DocuBurst is an online document visualization tool, and can be used for: * Uploading your own text documents * Creating interactive visual summaries of documents * Exploring keywords to uncover document themes or topics * Investigating intra-document word patterns, such as character relationships * Comparing documents * Commenting, annotating and sharing visualizations with others  +
The Dragon Toolkit is a Java-based development package for academic use in information retrieval (IR) and text mining (TM, including text classification, text clustering, text summarization, and topic modeling). It is tailored for researchers who work on large-scale IR and TM and prefer Java programming. Moreover, different from Lucene and Lemur, it provides built-in supports for semantic-based IR and TM. The dragon toolkit seamlessly integrates a set of NLP tools, which enable the toolkit to index text collections with various representation schemes including words, phrases, ontology-based concepts and relationships. ([dragon.ischool.drexel.edu/], retrieved March 2014)  +
G
GATE is over 15 years old and is in active use for all types of computational task involving human language. GATE excels at text analysis of all shapes and sizes. From large corporations to small startups, from €multi-million research consortia to undergraduate projects, our user community is the largest and most diverse of any system of this type, and is spread across all but one of the continents ([http://gate.ac.uk/overview.html GATE: a full-lifecycle open source solution for text processing])  +
Quote from the [http://radimrehurek.com/gensim/about.html about page] (12/2016): Gensim started off as a collection of various Python scripts for the Czech Digital Mathematics Library dml.cz in 2008, where it served to generate a short list of the most similar articles to a given article (gensim = “generate similar”). I also wanted to try these fancy “Latent Semantic Methods”, but the libraries that realized the necessary computation were not much fun to work with. By now, gensim is—to my knowledge—the most robust, efficient and hassle-free piece of software to realize unsupervised semantic modelling from plain text. It stands in contrast to brittle homework-assignment-implementations that do not scale on one hand, and robust java-esque projects that take forever just to run “hello world”.  +
Welcome to Gephi! Gephi is open-source software for visualizing and analysing large network graphs. Gephi uses a 3D render engine to display graphs in real-time and speed up the exploration. You can use it to explore, analyse, spatialise, filter, clusterize, manipulate and export all types of graphs.  +
GISMO is a graphical interactive monitoring tool that provides instructors with visualizations of students' activities in online courses. Instructors can examine various aspects of distance students, such as attendance at courses, reading of materials, and submission of assignments. Users of Moodle may benefit from GISMO for their teaching activities. Compared with the standard reports provided by Moodle (which basically allow teachers to see if an individual student has viewed a specific resource or participated in a specific activity on a specific day), GISMO provides comprehensive visualizations that give an overview of the whole class, not only a specific student or a particular resource. GISMO is available for Moodle 1.9.x and Moodle 2.x.  +
I
A website where you can visualise data such as numbers, text and geographic information. You can create a range of visualisations including unusual ones such as “treemaps” and “phrase nets”. All the charts made in Many Eyes are interactive, so you can change what data is shown and how it is displayed. Many Eyes is also an online community where users can create topical groups to organise, share and discuss data visualisations. You can sign up to receive notifications when there are new visualisations or data on topics you are interested in. You can only use Many Eyes if your data and the visualisations can be shared publicly on the Internet.  +
IRaMuTeQ stands for "Interface de R pour les Analyses Multidimensionnelles de Textes et de Questionnaires", in English, "interface of R for multi-dimensional text and questionnaire analysis". IRaMuTeQ is built on top of [[R]]. As of Oct 2014, there is only a French interface, but the software can deal with English texts.  +
J
Online tools to assist in the conversion of JSON to CSV.  +
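As a rough illustration of what such a conversion involves, here is a minimal sketch using only the Python standard library. It assumes the input is a JSON array of flat objects; nested structures would need flattening first. The function name `json_to_csv` and the sample data are hypothetical.

```python
import csv
import io
import json


def json_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)
    # Collect column names in first-seen order across all records.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record become empty cells
    return buffer.getvalue()


sample = '[{"name": "Ada", "score": 9}, {"name": "Bob", "city": "Geneva"}]'
csv_text = json_to_csv(sample)
print(csv_text)
```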
Quote from the [http://www.juxtasoftware.org/about/ About Page] (11/2014): <span style="background-color:#eeeeee" class="citation">“Juxta is an open-source tool for comparing and collating multiple witnesses to a single textual work. Originally designed to aid scholars and editors examine the history of a text from manuscript to print versions, Juxta offers a number of possibilities for humanities computing and textual scholarship. [...] As a standalone desktop application, Juxta allows users to complete many of the necessary operations of textual criticism on digital texts (TXT and XML). With this software, you can add or remove witnesses to a comparison set, switch the base text at will. Once you’ve collated a comparison, Juxta also offers several kinds of analytic visualizations. By default, it displays a heat map of all textual variants and allows the user to locate — at the level of any textual unit — all witness variations from the base text. Users can switch to a side by side collation view, which gives a split frame comparison of a base text with a witness text. A histogram of Juxta collations is particularly useful for long documents; this visualization displays the density of all variation from the base text and serves as a useful finding aid for specific variants.”</span>  +
K
KEEL (Knowledge Extraction based on Evolutionary Learning) is an open source (GPLv3) Java software tool which empowers the user to assess the behavior of evolutionary learning and Soft Computing based techniques for different kinds of data mining (DM) problems: regression, classification, clustering, pattern mining and so on. See a complete description on the [http://sci2s.ugr.es/keel/description.php KEEL website]  +
KH Coder is an application for quantitative content analysis, text mining or corpus linguistics. It can handle Japanese, English, French, German, Italian, Portuguese and Spanish language data. Given raw texts as input, it offers search and statistical analysis functions such as KWIC (keyword in context), collocation statistics, co-occurrence networks, self-organizing maps, multidimensional scaling, cluster analysis and correspondence analysis.  +
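To make the KWIC (keyword in context) idea concrete, the following sketch lists each occurrence of a keyword with a few words of context on either side. It is a generic illustration, not KH Coder's implementation; the function name `kwic`, the simple regex tokenizer and the sample text are assumptions.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for every hit."""
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


sample_text = "Text mining finds patterns in text. Mining text at scale needs tools."
kwic_lines = kwic(sample_text, "text", width=2)
for left, word, right in kwic_lines:
    # Right-align the left context so the keyword lines up in one column.
    print(f"{left:>15} | {word} | {right}")
```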
Quote from the [http://interlinkinc.net/KNOT.html software home page] (11/2014): <div style="padding:2px;border-style:dotted;border-width:thin;margin-left:1em;margin-right:1em;margin-top:0.5ex;margin-bottom:0.5ex;"> The Knowledge Network Organizing Tool (KNOT) is built around the Pathfinder network generation algorithm. There are also several other components (see below). Pathfinder algorithms take estimates of the proximities between pairs of items as input and define a network representation of the items. The network (a PFNET) consists of the items as nodes and a set of links (which may be either directed or undirected for symmetrical or non-symmetrical proximity estimates) connecting pairs of the nodes. The set of links is determined by patterns of proximities in the data and parameters of Pathfinder algorithms. For details on the method and its applications see R. Schvaneveldt (Editor), Pathfinder Associative Networks: Studies in Knowledge Organization. Norwood, NJ: Ablex, 1990. The Pathfinder software includes several programs and utilities to facilitate Pathfinder network analyses of proximity data. The system is oriented around producing pictures of the solutions, but representations of networks and other information are also available in the form of text files which can be used with other software. The positions of nodes for displays are computed using an algorithm described by Kamada and Kawai (1989, Information Processing Letters, 31, 7-15).</div>  +
KNIME is a user-friendly graphical workbench for the entire analysis process: data access, data transformation, initial investigation, powerful predictive analytics, visualisation and reporting. The open integration platform provides over 1000 modules (nodes). The open source version [http://www.knime.org/knime claims to implement] a very rich platform: <span style="background-color:#eeeeee" class="citation">“The KNIME Analytics Platform incorporates hundreds of processing nodes for data I/O, preprocessing and cleansing, modeling, analysis and data mining as well as various interactive views, such as scatter plots, parallel coordinates and others. It integrates all of the analysis modules of the well known [[Weka]] data mining environment and additional plugins allow R-scripts to be run, offering access to a vast library of statistical routines.”</span>  +
Quote: <span style="background-color:#eeeeee" class="citation">“koRpus is an R package i originally wrote to measure similarities/differences between texts. over time it grew into what it is now, a hopefully versatile tool to analyze text material in various ways, with an emphasis on scientific research, including readability and lexical diversity features.”</span>  +
L
LOCO-Analyst is an educational tool aimed at providing teachers with feedback on the relevant aspects of the learning process taking place in a web-based learning environment, and thus helps them improve the content and the structure of their web-based courses. LOCO-Analyst aims at providing teachers with feedback regarding: *all kinds of activities their students performed and/or took part in during the learning process, *the usage and the comprehensibility of the learning content they had prepared and deployed in the LCMS, *contextualized social interactions among students (i.e., social networking) in the virtual learning environment.  +
The Learning Analytics Enriched Rubric (LA e-Rubric) is an advanced grading method used for criteria-based assessment. As a rubric, it consists of a set of criteria. For each criterion, several descriptive levels are provided. A numerical grade is assigned to each of these levels. An enriched rubric contains some criteria and related grading levels that are associated with data from the analysis of learners’ interaction and learning behavior in a Moodle course, such as the number of posted messages, the number of times learning material was accessed, assignment grades and so on. Using learning analytics from log data that concern collaborative interactions, past grading performance and inquiries of course resources, the LA e-Rubric can automatically calculate the score of the various levels per criterion. The total rubric score is calculated as the sum of the scores for each criterion.  +
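The scoring rule described above (pick a level per criterion from interaction data, then sum the level grades) can be sketched as follows. This is not the Moodle plugin's code: the threshold-based level selection, the criterion names and all numbers are assumptions chosen for illustration.

```python
def level_score(value, levels):
    """Return the grade of the highest level whose minimum is reached.

    `levels` is a list of (minimum_value, grade) pairs; this threshold
    scheme is an assumption, not the plugin's actual rule.
    """
    score = 0
    for minimum, grade in sorted(levels):
        if value >= minimum:
            score = grade
    return score


def rubric_total(rubric, learner_data):
    """Total rubric score: the sum of the per-criterion scores."""
    return sum(
        level_score(learner_data.get(criterion, 0), levels)
        for criterion, levels in rubric.items()
    )


# Hypothetical rubric: two criteria, three levels each.
rubric = {
    "forum_posts": [(0, 0), (3, 5), (10, 10)],
    "resource_views": [(0, 0), (5, 5), (20, 10)],
}
learner = {"forum_posts": 4, "resource_views": 25}
total = rubric_total(rubric, learner)
print(total)
```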
Quote from the [http://www.tal.univ-paris3.fr/lexico/index-gb.htm Home page]: <span style="background-color:#eeeeee" class="citation">“Lexico3 is the 2001 edition of the Lexico software, first published in 1990. Functions present from the first version (segmentation, concordances, breakdown in graphic form, characteristic elements and factorial analyses of repeated forms and segments) were maintained and for the most part significantly improved. The Lexico series is unique in that it allows the user to maintain control over the entire lexicometric process from initial segmentation to the publication of final results. Beyond identification of graphic forms, the software allows for study of the identification of more complex units composed of form sequences: repeated segments, pairs of forms in co-occurrences, etc which are less ambiguous than the graphic forms that make them up.”</span> A free version is available for "personal work", bottom of [http://www.tal.univ-paris3.fr/lexico/download.htm this page]  +
Quote from the home page: <span style="background-color:#eeeeee" class="citation">“This web-based tool enables you to "scrub" (clean) your unicode text(s), cut a text(s) into various size chunks, manage chunks and chunk sets, tokenize with character- or word- Ngrams or TF-IDF weighting, and choose from a suite of analysis tools for investigating those texts. Functionality includes building dendrograms, making graphs of rolling averages of word frequencies or ratios of words or letters, and playing with visualizations of word frequencies including word clouds and bubble visualizations. To facilitate subsequent text mining analyses beyond the scope of this site, users can also transpose and download their matricies of word counts or relative proportions as comma- or tab-separated files (.csv, .tsv).”</span>  +
<span style="background-color:#eeeeee" class="citation">“The open-source LightSide platform, including the machine-learning and feature-extraction core as well as the researcher's workbench UI, has been and continues to be funded in part through Carnegie Mellon University, in particular by grants from the National Science Foundation and the Office of Naval Research.”</span> ([http://ankara.lti.cs.cmu.edu/side/ LightSide home page], sept. 2014).  +
LingPipe is a tool kit for processing text using computational linguistics. LingPipe is used to do tasks like: * Find the names of people, organizations or locations in news * Automatically classify Twitter search results into categories * Suggest correct spellings of queries The free and open source version requires that the data processed and any linked software be freely available. There are other versions.  +
Log Parser is a flexible command line utility that was initially written by Gabriele Giuseppini, a Microsoft employee, to automate tests for IIS logging. It was intended for use with the Windows operating system, and was included with the IIS 6.0 Resource Kit Tools. The default behavior of logparser works like a "data processing pipeline", by taking an SQL expression on the command line, and outputting the lines containing matches for the SQL expression. (From wikipedia) Microsoft describes Logparser as a powerful, versatile tool that provides universal query access to text-based data such as log files, XML files and CSV files, as well as key data sources on the Windows operating system such as the Event Log, the Registry, the file system, and Active Directory. The results of the input query can be custom-formatted in text based output, or they can be persisted to more specialty targets like SQL, SYSLOG, or a chart.  +
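The idea of running an SQL expression over log lines can be illustrated without Log Parser itself. The sketch below loads a few lines into an in-memory SQLite table and queries them; it does not use Log Parser's command line or SQL dialect, and the log format, table name and column names are invented for the example.

```python
import sqlite3

# Hypothetical, simplified log lines: date, method, path, status.
log_lines = [
    "2014-03-01 GET /index.html 200",
    "2014-03-01 GET /missing.html 404",
    "2014-03-02 GET /index.html 200",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (day TEXT, method TEXT, path TEXT, status INTEGER)")
# Split each line on whitespace and insert one row per line.
conn.executemany("INSERT INTO log VALUES (?, ?, ?, ?)", [line.split() for line in log_lines])

# Count hits per path, most requested first.
query = "SELECT path, COUNT(*) AS hits FROM log GROUP BY path ORDER BY hits DESC, path"
results = conn.execute(query).fetchall()
conn.close()
print(results)
```

The same pattern (parse, load, query) applies to other text-based sources; only the parsing step changes.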
Log Parser Studio is a graphical user interface (GUI) that functions as a front-end to [[Log Parser|Log Parser 2.2]] and provides a ‘Query Library’ to manage all the queries and scripts that one builds up over time. Log Parser Studio (LPS) can house all queries in a central location and allows you to edit, create and save queries. You can search for queries using free text search, and export and import both libraries and queries in different formats, which allows for easy collaboration as well as storing separate libraries for different protocols.  +
M
MAXQDA is a mixed methods research tool. There are two versions: * MAXQDA includes more classical QDA functionality (e.g. the ones that can be found in Atlas or Nvivo) + data management/import tools * MAXQDAplus contains the quantitative MAXDictio tool. According to [http://en.wikipedia.org/wiki/MAXQDA Wikipedia] (oct 2013), <span style="background-color:#eeeeee" class="citation">“MAXQDA is a software program designed for computer-assisted qualitative and mixed methods data, text and multimedia analysis in academic, scientific, and business institutions. It is the successor of winMAX, which was first made available in 1989.”</span>  +
Maps is a MediaWiki extension that provides the ability to visualize geographic data with dynamic, JavaScript-based mapping APIs. It has built-in support for geocoding, displaying maps, displaying markers, adding pop-ups, and more.  +
Features: * text tokenization, including deep semantic features like parse trees * inverted and forward indexes with compression and various caching strategies * a collection of ranking functions for searching the indexes * topic models * classification algorithms * graph algorithms * language models * CRF implementation (POS-tagging, shallow parsing) * wrappers for liblinear and libsvm (including libsvm dataset parsers) * UTF8 support for analysis on various languages * multithreaded algorithms  +
Quote from the home page: <span style="background-color:#eeeeee" class="citation">“Textalytics is a text analysis engine that extracts meaningful elements from any type of content and structures it, so that you can easily process and manage it. Textalytics features a set of high-level web services — adaptable to the characteristics of every type of business — which can be flexibly integrated into your processes and applications.”</span>  +
The first version of the software was deployed to serve the needs of the free content Wikipedia encyclopedia in 2002. It has since been deployed on tens of thousands of other websites for all sorts of purposes.  +
This extension makes it possible to collect a number of pages. Collections can be edited, persisted and optionally retrieved as PDF, ODF or DocBook (XML).  +
Commercial software for extracting specific information. Using a point-and-click interface, Mozenda lets users extract specific information and images from websites. Mozenda is composed of an "Agent Builder" and a web console. The Mozenda Web Console can run the Agent created in the Agent Builder and lets users organize, manage, view, export and publish information. All agents are run on highly optimized harvesting servers in Mozenda's Data Centers.  +
N
NaCTeM has developed a number of high-quality text mining tools for the UK academic community. However, at least some seem to be available to all for ''non commercial purposes'' ([http://www.nactem.ac.uk/terms_conditions.php]). NaCTeM's tools and services offer benefits to a wide range of users, e.g. reduction in time and effort for finding and linking pertinent information from large scale textual resources and customised solutions in semantic data analysis. ([http://www.nactem.ac.uk/aims.php Our Aims and Objectives], retrieved March 2014). NaCTeM tools are available in different ways. For basic tools, web services exist. Others require download and sometimes configuration/installation.  +
NetDraw is a free Windows program for visualizing social network data. NetDraw is also included in [https://sites.google.com/site/ucinetsoftware/home UCINET], a fairly cheap commercial SNA program developed by the same company.  +
NetMiner is an application for exploratory analysis and visualization of large network data based on SNA (Social Network Analysis). It can be used for general research and teaching in social networks. This tool allows researchers to explore their network data visually and interactively, and helps them detect underlying patterns and structures of the network. It features data transformation, network analysis, statistics, visualization of network data, charts, and a programming language based on the [[Python]] script language.  +
Quote from the [https://netlytic.org/ home page] (11/2014): Netlytic is a cloud-based text and social networks analyzer that can automatically summarize and discover social networks from online conversations on social media sites.  +
Neural Designer is a data mining application intended for professional data scientists. It uses neural networks, which are mathematical models of the brain function that can be trained in order to perform tasks such as function regression, pattern recognition, time series prediction or auto-association. The software provides a graphical user interface using a wizard approach consisting of a sequence of pages. It allows you to run the tasks and to obtain comprehensive results as a report in an easy way. Neural Designer stands out in terms of performance: it is developed in C++, has been subjected to code optimization techniques and makes use of parallel processing, so it can analyze bigger data sets in less time.  +
O
Quote: OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; extending it with web services; and linking it to databases like Freebase. ([http://openrefine.org/ ], oct 2. 2014)  +
OpenSesame is a graphical, open-source experiment builder for the social sciences. It sports a modern and intuitive user interface that allows you to build complex experiments with a minimum of effort. With OpenSesame you can create a wide range of experiments. The plug-in framework and [[Python]] scripting allow you to incorporate external devices, such as eye trackers, response boxes, and parallel port devices, into your experiment. OpenSesame is freely available under the General Public Licence.  +
Open source data visualization and analysis for novices and experts. Data mining through visual programming or Python scripting. Components for machine learning. Add-ons for bioinformatics and text mining. Packed with features for data analytics. Various add-ons like [[Orange Textable]] expand the functionality of this software.  +
Quote from [http://langtech.ch/textable Textable] (Oct. 2, 2014): Orange Textable is an open-source software tool for building data tables on the basis of raw text sources. Orange Textable offers the following features: * text data import from keyboard, files, or urls * systematic recoding * segmentation and annotation of various text units * extract and exploit XML-encoded annotations * automatic, random, and arbitrary selection of unit subsets * unit context examination using concordance and collocation tables * frequency and complexity measures * recoded text data and table export  +
P
Semantic Forms is an extension to MediaWiki that allows users to add, edit and query data using forms. It is heavily tied in with the Semantic MediaWiki extension, and is meant to be used for structured data that has semantic markup.  +
<span style="background-color:#eeeeee" class="citation">“Pajek (Slovene word for Spider) is a program, for Windows, for analysis and visualization of large networks. It is freely available, for noncommercial use, at its download page. See also a reference manual for Pajek (in PDF). The development of Pajek is traced in its History. See also an overview of Pajek's background and development.”</span> ([http://pajek.imfm.si/doku.php?id=pajek Pajek], sept. 22, 2014) Pajek includes six data structures (e.g. network, permutation, cluster, ...) and about 15 algorithms using these structures (e.g. partitions, decompositions, paths, flows, ...).  +
Piwik is an open source web analytics platform. Piwik displays reports regarding the geographic location of visits, the source of visits (i.e. whether they came from a website, directly, or something else), the technical capabilities of visitors (browser, screen size, operating system, etc.), what the visitors did (pages they viewed, actions they took, how they left), the time of visits and more. In addition to these reports, Piwik provides some other features that can help users analyze the data Piwik accumulates, such as: *Annotations — the ability to save notes (such as one's analysis of data) and attach them to dates in the past. *Transitions — a feature similar to Click path-like features that allows one to see how visitors navigate a website, but different in that it only displays navigation information for one page at a time. *Goals — the ability to set goals for actions it is desired for visitors to take (such as visiting a page or buying a product). Piwik will track how many visits result in those actions being taken. *E-commerce — the ability to track if and how much people spend on a website. *Page Overlay — a feature that displays analytics data overlaid on top of a website. *Row Evolution — a feature that displays how metrics change over time within a report. *Custom Variables — the ability to attach data, like a user name, to visit data.  +
Q
QDA Miner is a qualitative "mixed methods" data analysis package. There are two versions: * A [http://provalisresearch.com/products/qualitative-data-analysis-software/freeware/ free QDA Miner Lite] version * An expensive commercial version Quote from the official [http://provalisresearch.com/products/qualitative-data-analysis-software/ product page]: <span style="background-color:#eeeeee" class="citation">“QDA Miner is an easy-to-use qualitative data analysis software package for coding, annotating, retrieving and analyzing small and large collections of documents and images. QDA Miner qualitative data analysis tool may be used to analyze interview or focus group transcripts, legal documents, journal articles, speeches, even entire books, as well as drawings, photographs, paintings, and other types of visual documents. Its seamless integration with SimStat, a statistical data analysis tool, and [[WordStat]], a quantitative content analysis and text mining module, gives you unprecedented flexibility for analyzing text and relating its content to structured information including numerical and categorical data.”</span>  +
R
R +
R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. R is available as Free Software for data manipulation, calculation and graphical display. It includes *an effective data handling and storage facility, *a suite of operators for calculations on arrays, in particular matrices, *a large, coherent, integrated collection of intermediate tools for data analysis, *graphical facilities for data analysis and display either on-screen or on hardcopy, and *a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities. R can be considered as an environment within which statistical techniques are implemented. R can be extended via packages. For example, try: * [http://rqda.r-forge.r-project.org/ RQDA] * [http://cran.at.r-project.org/web/views/NaturalLanguageProcessing.html CRAN Task View: Natural Language Processing]  +