Property:Has description

This is a property of type Text.


Showing 37 pages using this property.

P

Page Forms +

Semantic Forms is an extension to MediaWiki that allows users to add, edit and query data using forms. It is heavily tied in with the Semantic MediaWiki extension, and is meant to be used for structured data that has semantic markup. +

Pajek +

“Pajek (Slovene word for Spider) is a program, for Windows, for analysis and visualization of large networks. It is freely available, for noncommercial use, at its download page. See also a reference manual for Pajek (in PDF). The development of Pajek is traced in its History. See also an overview of Pajek's background and development.” ([http://pajek.imfm.si/doku.php?id=pajek Pajek], Sept. 22, 2014) Pajek includes six data structures (e.g. network, permutation, cluster, ...) and about 15 algorithms using these structures (e.g. partitions, decompositions, paths, flows, ...). +

Piwik +

Piwik is an open source web analytics platform. Piwik displays reports regarding the geographic location of visits, the source of visits (i.e. whether they came from a website, directly, or something else), the technical capabilities of visitors (browser, screen size, operating system, etc.), what the visitors did (pages they viewed, actions they took, how they left), the time of visits and more. In addition to these reports, Piwik provides some other features that can help users analyze the data Piwik accumulates, such as:
* Annotations: the ability to save notes (such as one's analysis of data) and attach them to dates in the past.
* Transitions: a feature similar to Click path-like features that allows one to see how visitors navigate a website, but different in that it only displays navigation information for one page at a time.
* Goals: the ability to set goals for actions it is desired for visitors to take (such as visiting a page or buying a product). Piwik will track how many visits result in those actions being taken.
* E-commerce: the ability to track if and how much people spend on a website.
* Page Overlay: a feature that displays analytics data overlaid on top of a website.
* Row Evolution: a feature that displays how metrics change over time within a report.
* Custom Variables: the ability to attach data, like a user name, to visit data. +

Q

QDA Miner +

QDA Miner is a qualitative "mixed methods" data analysis package. There are two versions:
* A [http://provalisresearch.com/products/qualitative-data-analysis-software/freeware/ free QDA Miner Lite] version
* An expensive commercial version
Quote from the official [http://provalisresearch.com/products/qualitative-data-analysis-software/ product page]: “QDA Miner is an easy-to-use qualitative data analysis software package for coding, annotating, retrieving and analyzing small and large collections of documents and images. QDA Miner qualitative data analysis tool may be used to analyze interview or focus group transcripts, legal documents, journal articles, speeches, even entire books, as well as drawings, photographs, paintings, and other types of visual documents. Its seamless integration with SimStat, a statistical data analysis tool, and [[WordStat]], a quantitative content analysis and text mining module, gives you unprecedented flexibility for analyzing text and relating its content to structured information including numerical and categorical data.” +

R

R +

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. R is available as Free Software for data manipulation, calculation and graphical display. It includes *an effective data handling and storage facility, *a suite of operators for calculations on arrays, in particular matrices, *a large, coherent, integrated collection of intermediate tools for data analysis, *graphical facilities for data analysis and display either on-screen or on hardcopy, and *a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities. R can be considered as an environment within which statistical techniques are implemented. R can be extended via packages. For example, try: * [http://rqda.r-forge.r-project.org/ RQDA] * [http://cran.at.r-project.org/web/views/NaturalLanguageProcessing.html CRAN Task View: Natural Language Processing] +

RapidAnalytics +

RapidAnalytics is an open source server for data mining and business analytics. It is based on the data mining solution RapidMiner and includes ETL, data mining, reporting, dashboards in a single server solution. +

RapidMiner Studio +

RapidMiner is a world-leading open-source system for data mining. It is available as a stand-alone application for data analysis and as a data mining engine for integration into your own products. '''RapidMiner is now RapidMiner Studio''' and RapidAnalytics is now called RapidMiner Server. In a few words, RapidMiner Studio is a "downloadable GUI for machine learning, data mining, text mining, predictive analytics and business analytics". It can also be used (for most purposes) in batch mode (command line mode). +

Redash +

Quotes from the [https://redash.io/help/aboutrd/aboutrd.html#whats_redash FAQ]: Redash is an open source tool for teams to query, visualize and collaborate. Redash is quick to setup and works with any data source you might need so you can query from anywhere in no time. [..] Redash was built to allow fast and easy access to billions of records, that we process and collect using Amazon Redshift (“petabyte scale data warehouse” that “speaks” PostgreSQL). Today Redash has support for querying multiple databases, including: Redshift, Google BigQuery, Google Spreadsheets, PostgreSQL, MySQL, Graphite, Axibase Time Series Database and custom scripts. Main features:
* Query editor - enjoy all the latest standards like auto-complete and snippets. Share both your results and queries to support an open and data driven approach within the organization.
* Visualization - once you have your dataset, select one of our 9 types of visualizations for your query. You can also export or embed it anywhere.
* Dashboard - combine several visualizations into a topic targeted dashboard.
* Alerts - get notified via email, Slack, Hipchat or a webhook when your query's results need attention.
* API - anything you can do with the UI, you can do with the API. Easily connect results to other systems or automate your workflows. +

S

SAM +

SAM includes a set of visualizations of learner activities to increase awareness and to support self-reflection. These are implemented as widgets in the ROLE project +

SATO +

SATO is a multi-purpose text mining tool: it includes concordancing, lexical inventories, annotation and categorization. It allows users to mark up text with variables for further analysis. SATO is a web-based text analysis tool using a command line language. So far, only a French interface exists. A commercial version also exists, i.e. you can buy a license to install the same system on your own server. +

SNAPP +

SNAPP essentially serves as a diagnostic instrument, allowing teaching staff to evaluate student behavioral patterns against learning activity design objectives and intervene as required in a timely manner. +

SQ-ALL +

Run SQL queries on APIs, JSON / XML / RSS feeds, Web pages (tables), EVERYTHING! +

Scrapy +

Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing. Features:
* Scrapy was designed with simplicity in mind, by providing the features you need without getting in your way
* Just write the rules to extract the data from web pages and let Scrapy crawl the entire web site for you
* Scrapy is used in production crawlers to completely scrape more than 500 retailer sites daily, all in one server
* Scrapy was designed with extensibility in mind and so it provides several mechanisms to plug new code without having to touch the framework core
* Scrapy is completely written in [[Python]] and runs on Linux, Windows, Mac and BSD
* Scrapy comes with lots of functionality built in. Check this section of the documentation for a list of them.
* Scrapy is extensively documented and has a comprehensive test suite with very good code coverage +

Semantic Drilldown +

Semantic Drilldown is an extension to Semantic MediaWiki (SMW) that provides a page for drilling down through a site's data, using categories and filters on semantic properties. The list of pages in each top-level category can be viewed, and for each such category, filters can be created that cover a specific semantic property. If filters exist for a category, users can click on the different possible values for those filters, narrowing the set of results, and thus drill down through the data. +
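The drill-down idea described above can be illustrated with a minimal Python sketch. It is not Semantic Drilldown's implementation or API: the sample pages, the property names (`license`, `platform`) and the helper functions `filter_values` and `drill_down` are all hypothetical and only show how clicking successive filter values narrows a result set.

```python
from collections import Counter

# Hypothetical sample data: each page has a category and some semantic property values.
PAGES = [
    {"title": "R", "category": "Software", "license": "Free", "platform": "Cross-platform"},
    {"title": "Weka", "category": "Software", "license": "Free", "platform": "Cross-platform"},
    {"title": "WordStat", "category": "Software", "license": "Commercial", "platform": "Windows"},
    {"title": "Pajek", "category": "Software", "license": "Free", "platform": "Windows"},
]


def filter_values(pages, prop):
    """Count the values one property takes among the pages still in the result set."""
    return Counter(page[prop] for page in pages if prop in page)


def drill_down(pages, **selected):
    """Keep only the pages whose properties match every selected filter value."""
    return [p for p in pages if all(p.get(k) == v for k, v in selected.items())]


# Start from a top-level category, then apply one filter value at a time.
software = drill_down(PAGES, category="Software")
free = drill_down(software, license="Free")
free_windows = drill_down(free, platform="Windows")

print(filter_values(software, "license"))
print([p["title"] for p in free_windows])
```

Each call to `drill_down` corresponds to one click on a filter value, and `filter_values` shows which values remain available for the next click.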

Semantic Forms Inputs +

Semantic Forms Inputs is an extension to MediaWiki that provides additional input types for Semantic MediaWikis that use the Semantic Forms extension. +

Semantic Maps +

Semantic Maps is an extension to Semantic MediaWiki (SMW) that adds semantic capabilities to the Maps extension and adds the datatype Geographic coordinate. +

Semantic MediaWiki +

Semantic MediaWiki is an extension for managing structured data in your wiki and for querying that data to create dynamic representations: tables, timelines, maps, lists, etc. +

Semantic Result Formats +

Semantic Result Formats (SRF) is a MediaWiki extension, used in conjunction with the Semantic MediaWiki extension, that bundles a number of further result formats for SMW's inline queries. The individual formats can be added to the installation independently... +

Semilar +

The goal of the SEMantic simILARity software toolkit (SEMILAR; pronounced the same way as the word 'similar') is to promote productive, fair, and rigorous research advancements in the area of semantic similarity. The kit is available as application software or as a Java API. As of March 2014, the GUI-based SEMILAR application is only available to a limited number of users who commit to help improve the usability of the interface. The Java library (API), however, can be downloaded. SEMILAR comes with various similarity methods based on WordNet, Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), BLEU, Meteor, Pointwise Mutual Information (PMI), dependency-based methods, optimized methods based on Quadratic Assignment, etc. The similarity methods work at different granularities: word to word, sentence to sentence, or bigger texts. Some methods have their own variations which, coupled with parameter settings and your selection of preprocessing steps, could result in a huge space of possible instances of the same basic method. +

Stanford NLP toolkits +

Quote: The Stanford NLP Group makes parts of our Natural Language Processing software available to everyone. These are statistical NLP toolkits for various major computational linguistics problems. They can be incorporated into applications with human language technology needs. ([http://nlp.stanford.edu/software/index.shtml]) +

T

TOko +

Quoted from the tOko homepage (oct 2014) * tOKo is an open source tool for text analysis and browsing a corpus of documents. It implements a wide variety of text analysis and browsing functions in an interactive user interface. * An important application area of tOKo is ontology development. It supports both ontology construction from a corpus, as well as relating the ontology back to a corpus (for example by highlighting concepts from the ontology in a document). * Another application area is community research. Here the objective is to analyse the exchange of information, for example in a community forum or through a collection of interconnected weblogs. +

Tableau Public +

Tableau software helps people communicate data through an innovation called VizQL, a visual query language that converts drag-and-drop actions into data queries, allowing users to quickly find and share insights in their data. With Tableau, “data workers” first connect to data stored in files, cubes, databases, warehouses, Hadoop technologies, and even some cloud sources like Google Analytics. They then interact with the Tableau user interface to simultaneously query the data and view the results in charts, graphs, and maps that can be arranged together on dashboards. ([http://shop.oreilly.com/product/0636920030942.do Jones, 2014]: 15) Basically, one has to install a desktop application (Win/Mac) and create a visualization. The result then can be published either on their public server or on your own server (commercial). +

Tabula +

Tabula is a free, open source tool that allows you to easily take data out of PDF files and into Excel, database programs, and web applications. Tabula allows users to upload their documents, indicate the position of the tables they want and extract the data right into a Comma Separated Values (CSV) or Tab Separated Values (TSV) file, or just copy the text as CSV to the clipboard. Tabula can repeat the operation on several pages or documents. +

Tangara +

'''Quotes''' from the [http://eric.univ-lyon2.fr/~ricco/tanagra/index.html official home page] (10/2014):
* TANAGRA is a free DATA MINING software for academic and research purposes. It proposes several data mining methods from exploratory data analysis, statistical learning, machine learning and databases area.
* The main purpose of Tanagra project is to give researchers and students an easy-to-use data mining software, conforming to the present norms of the software development in this domain (especially in the design of its GUI and the way to use it), and allowing to analyse either real or synthetic data.
* The second purpose of TANAGRA is to propose to researchers an architecture allowing them to easily add their own data mining methods, to compare their performances. TANAGRA acts more as an experimental platform in order to let them go to the essential of their work, dispensing them to deal with the unpleasant part in the programmation of this kind of tools : the data management.
* The third and last purpose, in direction of novice developers, consists in diffusing a possible methodology for building this kind of software. They should take advantage of free access to source code, to look how this sort of software is built, the problems to avoid, the main steps of the project, and which tools and code libraries to use for. In this way, Tanagra can be considered as a pedagogical tool for learning programming techniques.
According to its author, Tanagra can be compared to [[Weka]]: in comparison, it has an easier-to-use interface, but less functionality. +

Taporware +

TAPoRware is a set of text analysis tools that enables users to perform text analysis on HTML, XML and plain text files, using documents from the users' machine or on the web. There are five families of tools: for HTML, XML, Text, Other and Beta. A list is included below in the free text section. +

Test tool +

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Curabitur euismod molestie suscipit. Quisque metus libero, vulputate sed consectetur elementum, molestie id mi. Aliquam tristique diam metus, eget tincidunt tortor aliquet sit amet. Vestibulum ac velit id lacus blandit hendrerit eu nec risus. Donec ac elementum nisi. Interdum et malesuada fames ac ante ipsum primis in faucibus. Nulla nec ipsum felis. Vestibulum neque diam, laoreet in mollis eget, vulputate at erat. Donec quis semper est, in condimentum quam. Pellentesque pulvinar semper est, ac condimentum massa adipiscing ut. Sed pharetra ligula et posuere vulputate. Morbi ullamcrper auctor varius. Nulla eget nibh at ipsum convallis faucibus. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Sed sed turpis sagittis, viverra libero ac, lacinia ligula. +

TextSTAT +

TextSTAT is a simple text analysis program. Its main functionality is concordance. Quote from the [http://neon.niederlandistik.fu-berlin.de/en/textstat/ home page] (11/2014): “TextSTAT is a simple programme for the analysis of texts. It reads plain text files (in different encodings) and HTML files (directly from the internet) and it produces word frequency lists and concordances from these files. This version includes a web-spider which reads as many pages as you want from a particular website and puts them in a TextSTAT-corpus. The new news-reader, too, puts news messages in a TextSTAT-readable corpus file. TextSTAT reads MS Word and OpenOffice files. No conversion needed, just add the files to your corpus...” +
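To make the two outputs mentioned above concrete, here is a minimal Python sketch of a word frequency list and a keyword-in-context concordance. It does not reproduce TextSTAT's code or options: the tokenization rule, the function names `tokenize`, `word_frequencies` and `concordance`, and the sample sentence are assumptions chosen for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_frequencies(text):
    """Map each word to the number of times it occurs in the text."""
    return Counter(tokenize(text))


def concordance(text, keyword, width=2):
    """Return every occurrence of keyword with `width` words of context per side."""
    tokens = tokenize(text)
    keyword = keyword.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Hypothetical one-line corpus.
SAMPLE = "The cat sat on the mat. The dog saw the cat."

print(word_frequencies(SAMPLE).most_common(3))
for left, word, right in concordance(SAMPLE, "cat"):
    print(f"{left:>10} [{word}] {right}")
```

A real corpus tool would also handle encodings, punctuation conventions and multiple files; the sketch only shows the core counting and windowing steps.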

Textalyser +

Quote from the [http://textalyser.net/ home page]: “Welcome to the online text analysis tool, the detailed statistics of your text, perfect for translators (quoting), for webmasters (ranking) or for normal users, to know the subject of a text. Now with new features as the analysis of words groups, finding out the keyword density, analyse the prominence of word or expressions.” +
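As a rough illustration of two of the statistics named in the quote, the following Python sketch computes a keyword density and counts word groups. How Textalyser itself defines these measures is not stated here, so the definitions used (density as the keyword's share of all tokens, word groups as adjacent word pairs) and the names `tokens_of`, `keyword_density` and `word_groups` are assumptions.

```python
import re
from collections import Counter


def tokens_of(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_density(text, keyword):
    """Assumed definition: percentage of all tokens that equal the keyword."""
    tokens = tokens_of(text)
    if not tokens:
        return 0.0
    return 100.0 * tokens.count(keyword.lower()) / len(tokens)


def word_groups(text, size=2):
    """Count groups of `size` adjacent words (n-grams) across the whole text."""
    tokens = tokens_of(text)
    return Counter(
        " ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)
    )


# Hypothetical sample text.
TEXT = "Text mining tools analyse text. Text mining is fun."

print(round(keyword_density(TEXT, "text"), 1))
print(word_groups(TEXT).most_common(1))
```

Note that this simple version lets word groups span sentence boundaries; a stricter variant would split the text into sentences first.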

Tm +

The tm package provides a framework for text mining applications within R. It offers functionality for managing text documents, abstracts the process of document manipulation and eases the usage of heterogeneous text formats in R. The package provides native support for reading in several classic file formats such as plain text, PDFs, or XML files. There is also a plug-in mechanism to handle additional file formats. The data structures and algorithms can be extended to fit custom demands. +

Tropes +

Tropes is free text-analysis (text mining) software. Its features include the ability to carry out stylistic, syntactic and semantic analyses and to present the results in graph and table form. Tropes can yield information about a text such as stylistic/rhetorical analyses (argumentative, enunciative, descriptive or narrative style). It can also identify different word categories (verbs, connectors, personal pronouns, modalities, qualifying adjectives), conduct thematic analyses (reference fields), and detect discursive/chronological structures. +

Tweet NLP +

Quote: We provide a tokenizer, a part-of-speech tagger, hierarchical word clusters, and a dependency parser for tweets, along with annotated corpora and web-based annotation tools. +

V

Voyant Tools +

Voyeur is a web-based text analysis environment. It is designed to be user-friendly, flexible and powerful. Voyeur is part of Hermeneuti.ca, a collaborative project to develop and theorize text analysis tools and text analysis rhetoric. [http://hermeneuti.ca/voyeur/ Voyeur Tools: See Through Your Texts] (retrieved 3/2014). In Voyeur, you can
* use texts in a variety of formats including plain text, HTML, XML, PDF, RTF and MS Word
* use texts from different locations, including URLs and uploaded files
* perform lexical analysis including the study of frequency and distribution data; in particular export data into other tools (as XML, tab separated values, etc.)
* embed live tools into remote web sites that can accompany or complement your own content +

W

Web-harvest +

Web-Harvest is an open source web data extraction tool written in Java. It offers a way to collect desired Web pages and extract useful data from them. In order to do that, it leverages well established techniques and technologies for text/XML manipulation such as XSLT, XQuery and Regular Expressions. Web-Harvest mainly focuses on HTML/XML based web sites, which still make up the vast majority of Web content. On the other hand, it could be easily supplemented by custom Java libraries in order to augment its extraction capabilities. +

Weka +

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes. Weka 3.7 (still beta in Oct. 2014) includes a package system that allows adding functionality without recompiling the system. As of summer 2014, most people seem to use this developer version. Weka is a very popular free data mining tool that includes advanced text mining features. +

WordSmith +

Quotation from the [http://www.lexically.net/downloads/version6/HTML/index.html?getting_started.htm getting started page] (11/2014): “ WordSmith Tools is an integrated suite of programs for looking at how words behave in texts. You will be able to use the tools to find out how words are used in your own texts, or those of others. The WordList tool lets you see a list of all the words or word-clusters in a text, set out in alphabetical or frequency order. The concordancer, Concord, gives you a chance to see any word or phrase in context -- so that you can see what sort of company it keeps. With KeyWords you can find the key words in a text. The tools have been used by Oxford University Press for their own lexicographic work in preparing dictionaries, by language teachers and students, and by researchers investigating language patterns in lots of different languages in many countries world-wide.” +

WordStat +

WordStat is commercial text-mining and content analysis software. It is integrated with the [[QDA Miner]] and SimStat products from the same company. Quote from the official [http://provalisresearch.com/products/content-analysis-software/ product page]: “WordStat is a flexible and easy-to-use text analysis software – whether you need text mining tools for fast extraction of themes and trends, or careful and precise measurement with state-of-the-art quantitative content analysis tools. WordStat’s seamless integration with SimStat – our statistical data analysis tool – and QDA Miner – our qualitative data analysis software – gives you unprecedented flexibility for analyzing text and relating its content to structured information, including numerical and categorical data.” +

Wordcruncher +

Quote from the [http://www.wordcruncher.com/index.html home page]: “WordCruncher is a free eBook reader with research tools to help students and scholars study important texts. * You can look for specific references, search for words or phrases, follow cross-reference hyperlinks, and enlarge images. * You can copy and paste text, add bookmarks, highlight text, and make searchable notes. * Additional study aids include complex searches, word frequencies, word frequency distributions, synchronized windows to compare translations, word tags, and various text analysis reports (e.g., collocation, vocabulary dispersion, vocabulary usage). ” +
