DataMelt: Difference between revisions

The educational technology and digital learning wiki
No edit summary
No edit summary
 
(11 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{Data mining and learning analytics tools
{{Data mining and learning analytics tools
|field_logo=Scavis logo.jpg
|field_logo=Dm logo125px.png
|field_screenshot=Scavis.png
|field_screenshot=Datamelt 2dhisto.jpeg
|field_name=Computation and Visualization Environment
|field_name=Computation and Visualization Environment
|field_developers=DataMelt community. Led by S.Chekanov
|field_developers=DataMelt community. Led by S.Chekanov
|field_website=http://jwork.org/dmelt/
|field_website=https://datamelt.org/
|field_data_tool_type=Application software
|field_data_tool_type=Application software
|field_plugin_of=
|field_license_type=Free&Open source
|field_license_type=Free&Open source
|field_free_software_licence=GPL / GNU for non commercial use. Commercial-friendly license is available
|field_free_software_licence=GPL / GNU for non commercial use. Commercial-friendly license is available
|field_last_release=2018/05/10
|field_last_release=2019/02/02
|field_last_version=2.2
|field_last_version=2.4
|field_description='''DataMelt is a software environment for numeric calculations, statistics and data analysis '''
|field_description='''DataMelt is a software environment for numeric calculations, statistics and data analysis '''


Line 19: Line 18:


DMelt creates high-quality vector-graphics images (SVG, EPS, PDF etc.) that can be included in LaTeX and other text-processing systems.
DMelt creates high-quality vector-graphics images (SVG, EPS, PDF etc.) that can be included in LaTeX and other text-processing systems.
The program runs on Windows, Linux, Mac OS.


==History==
DataMelt has its roots in [[particle physics]], where data mining is a primary task. It was created in 2005 as the [[jHepWork]] project and was initially written for data analysis in [[particle physics]]<ref>
HEP data analysis using jHepWork and Java, arXiv:0809.0840v2, ANL-HEP-CP-08-53 preprint. CERN preprint, [https://arxiv.org/abs/0809.0840/ arXiv:0809.0840v2]
</ref> using the [[Java (programming language)|Java]] software concept for the [[International linear collider|International Linear Collider]] project developed at [[SLAC]]. Later versions of jHepWork were modified for general public use (by scientists, engineers and students, for educational purposes) since the International Linear Collider project has stalled. In 2013, jHepWork was renamed to DataMelt and became a general-purpose community-supported project.
The main source of the reference is the book "Scientific Data analysis using Jython Scripting and Java"
<ref>
Scientific Data analysis using Jython Scripting and Java. Book. By S.V.Chekanov, Springer-Verlag,  ISBN 978-1-84996-286-5, [https://www.springer.com/us/book/9781849962865]
</ref>
which discusses  data-analysis methods using [[Java (programming language)|Java]] and [[Jython]] scripting.
Later it was also discussed in the  German Java SPEKTRUM journal
<ref>
DataMelt – Werkbank für technisch-wissenschaftliche Berechnungen und Visualisierungen mit Java und Jython. by Rohe Klaus.  Java SPEKTRUM. (in German) volume 5 (2013) 26-28 [https://www.sigs-datacom.de/fachzeitschriften/javaspektrum/archiv/artikelansicht/artikel-titel/integrationsspektrum-scavis-werkbank-fuer-technisch-wissenschaftliche-berechnungen-und-visualis.html]
</ref>.
The string "HEP" in the project name "jHepWork" abbreviates "High-Energy Physics". Because of its wide popularity outside this area of physics, the project was renamed to [[SCaViS]] ('''S'''cientific '''C'''omputation and '''Vis'''ualization Environment). It existed under that name for 3 years before it was renamed to DataMelt (or, in short, DMelt).


DataMelt is hosted by the jWork.ORG portal<ref>  jWork.ORG Community Portal focused on Java scientific software. [http://jwork.org/main/]</ref>
|field_analysis_orientation=General analysis
|field_data_manipulation_type=Data transformation, Data analysis, Data visualisation
|field_data_transformation_capabilities=Mathematical transformation of data for analysis
|field_analysis_type=Basic statistics and data summarization, Data mining methods and algorithms
|field_visualisation_type=Sequential Graphic, Chart/Diagram, Map
|field_tool_usability=rather easy to use
|field_end_user_type=Students/Learners/Consumers, Teachers/Tutors/Managers, Developers/Designers, Researchers, Organisations/Institutions/Firms
|field_statistics_level=Advanced
|field_programming_level=Basic
|field_system_engineering_level=None
|field_data_mining_models_level=Medium
|field_completion_level=Medium
|field_last_edition=2018/05/23
}}


==Supported platforms==
==History==
DataMelt has its roots in particle physics, where data mining is a primary task. It was created as the jHepWork project in 2005 <ref>HEP data analysis using jHepWork and Java, arXiv:0809.0840v2, ANL-HEP-CP-08-53 preprint. CERN preprint, [https://arxiv.org/abs/0809.0840/ arXiv:0809.0840v2]
</ref> using the Java software concept for the International Linear Collider project developed at SLAC. Later versions of jHepWork were modified for general public use (by scientists, engineers and students, for educational purposes) since the International Linear Collider project has stalled. In 2013, jHepWork was renamed to SCaVis. In 2015 it became a community-supported project
and was renamed to DataMelt.


DataMelt runs on Windows, Linux, Mac and the [[Android (operating system)|Android]] platforms. The package for the Android is called AWork.
DataMelt is hosted by the jWork.ORG portal<ref>jWork.ORG Community Portal focused on Java scientific software. [http://jwork.org/main/].</ref>.


==Documentation==
==Documentation==


DataMelt is extensively documented. In 2018, the web page of this project contained about 600 examples written in Jython, Java, Groovy, JRuby, covering a number of fields, from general mathematics to data mining and data visualization. The Java API documentation includes the description of more than 40,000 Java classes. In addition,
In 2018, the DataMelt web page hosted about 600 examples written in Jython, Java, Groovy and JRuby, covering a number of fields, from general mathematics to data mining and data visualization. The Java API documentation describes more than 40,000 Java classes. The DataMelt documentation is subject to certain restrictions for the general public because of the proprietary nature of the documentation project.
there is a wiki documentation. The documentation is subject to certain restrictions for the general public because of the proprietary nature of the documentation project.




==License terms==
jHepWork was described in the book "Scientific Data analysis using Jython Scripting and Java"
 
<ref>
DataMelt is distributed under a [[Freemium]] license.
Scientific Data analysis using Jython Scripting and Java. Book. By S.V.Chekanov, Springer-Verlag, ISBN 978-1-84996-286-5, [https://www.springer.com/us/book/9781849962865]</ref>. Later it was also discussed in the  German Java SPEKTRUM journal
The core source code of the numerical and graphical libraries is licensed under the [[GNU General Public License]]. The interactive development environment (IDE) used by DataMelt has some restrictions on commercial use, since the language files, documentation files, examples, installer, code-assist databases and interactive help are licensed under a Creative Commons license. Full members of the DataMelt project have several benefits, such as a license for commercial use, access to the source repository, an extended help system, a user script repository and access to the complete documentation.
 
The commercial licenses cannot apply to source code that was imported or contributed<ref>{{cite web|url=http://jwork.org/wiki/DMelt:Dev/Contributions|title=Contributed Packages (DataMelt Manual)}}</ref> to DataMelt from other authors.
 
==Examples==
 
===Jython scripts===
 
Here is an example of how to show 2D bar graphs by reading a CSV file downloaded from the [[World Bank]] web site.
 
<syntaxhighlight lang="python">
from jhplot.io.csv import *
from java.io import *
from jhplot import *
 
d = {}
reader = CSVReader(FileReader("ny.gdp.pcap.cd_Indicator_en_csv_v2.csv"))
while True:
    nextLine = reader.readNext()
    if nextLine is None:  # end of file
        break
    xlen = len(nextLine)
    if xlen < 50:  # skip short header and metadata lines
        continue
    d[nextLine[0]] = float(nextLine[xlen-2]) # key=country, value=GDP
 
c1 = HChart("2013",800,400)
#c1.setGTitle("2013 Gross domestic product  per capita")
c1.visible()
c1.setChartBar()
c1.setNameY("current US $")
c1.setNameX("")
c1.setName("2013 Gross domestic product  per capita")
 
name1 = "Data Source: World Development Indicators"
 
set_value = lambda name: c1.valueBar(d[name], name, name1)
 
set_value(name="Russia")
set_value(name="Poland")
set_value(name="Romania")
set_value(name="Bulgaria")
set_value(name="Belarus")
set_value(name="Ukraine")
c1.update()
</syntaxhighlight>
 
The execution of this script plots a bar chart in a separate window. The image can be saved in a number of formats.
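
The row-filtering step of the script above can also be tried without DataMelt. The following is a minimal sketch in plain Python that uses the standard <code>csv</code> module instead of <code>jhplot.io.csv</code>. The function name <code>last_reported_values</code>, the parameter <code>min_columns</code> and the sample rows are hypothetical and serve only as an illustration. Unlike the script above, the sketch skips rows whose value cell is empty instead of raising an error.

```python
import csv
import io


def last_reported_values(csv_text, min_columns=50):
    """Map the first column of each long row to its second-to-last column.

    Rows shorter than min_columns are treated as header or metadata
    lines and skipped, as in the Jython script above.
    """
    values = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < min_columns:
            continue  # header or metadata line
        try:
            values[row[0]] = float(row[-2])
        except ValueError:
            continue  # empty or non-numeric cell (assumption: skip it)
    return values


# Made-up sample data, not taken from the World Bank file.
sample = (
    "Data Source,World Development Indicators\n"
    "Poland,POL,13000.5,\n"
    "Ukraine,UKR,,\n"
)
print(last_reported_values(sample, min_columns=4))
```

With the real World Bank file, the default <code>min_columns=50</code> corresponds to the <code>xlen < 50</code> test in the script above.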
 
Here is another simple example which illustrates how to fill a 2D histogram and display it on a canvas. The script also exports the figure to a file in the EPS format.
This script illustrates how to glue and mix native Java classes (from the package java.util) and DataMelt classes (the package jhplot) inside a script written using the Python syntax.
 
<syntaxhighlight lang="python">
from java.util import Random
from jhplot import *
 
c1 = HPlot3D("Canvas") # create an interactive canvas
c1.setGTitle("Global title")
c1.setNameX("X")
c1.setNameY("Y")
c1.visible()
c1.setAutoRange()
 
h1 = H2D("2D histogram", 25, -3.0, 3.0, 25, -3.0, 3.0)
rand = Random()
for i in range(200):
    h1.fill(rand.nextGaussian(), rand.nextGaussian())
c1.draw(h1)
c1.export("jhplot3d.eps") # export to EPS Vector Graphics
</syntaxhighlight>
 
This script can be run either in the DataMelt IDE or with a stand-alone Jython after adding the DataMelt libraries to the classpath. The output is shown below:
 
[[File:Histograms_in_3D_showing_random_Gaussian_data.png|thumb|center|222px|3D histogram]]
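
To make clear what filling a 2D histogram means, here is a minimal sketch in plain Python that counts points in a regular grid. It does not use DataMelt and is not the implementation of the <code>H2D</code> class. The function name <code>fill_2d_histogram</code> is hypothetical, and dropping out-of-range points is an assumption of this sketch.

```python
import random


def fill_2d_histogram(points, nbins, lo, hi):
    """Count (x, y) points in an nbins x nbins grid over [lo, hi) on both axes."""
    width = (hi - lo) / float(nbins)
    counts = [[0] * nbins for _ in range(nbins)]
    for x, y in points:
        if lo <= x < hi and lo <= y < hi:
            # Clamp the index to guard against floating-point rounding at the upper edge.
            ix = min(int((x - lo) / width), nbins - 1)
            iy = min(int((y - lo) / width), nbins - 1)
            counts[ix][iy] += 1
        # Points outside the range are ignored (assumption of this sketch).
    return counts


# Same binning as in the script above: 25 x 25 bins over [-3, 3), 200 Gaussian pairs.
rng = random.Random(1)
points = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
counts = fill_2d_histogram(points, 25, -3.0, 3.0)
print(sum(sum(row) for row in counts))  # number of points that fell inside the range
```

Each cell of <code>counts</code> corresponds to the height of one bar in a 3D histogram plot such as the one shown above.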
 
===Groovy scripts===
The same example can also be coded using the [[Groovy (programming language)|Groovy]] programming language which is supported by DataMelt.
 
<syntaxhighlight lang="groovy">
import java.util.Random
import jhplot.*
 
c1 = new HPlot3D("Canvas")  //  create an interactive canvas
c1.setGTitle("Global title")
c1.setNameX("X")
c1.setNameY("Y")
c1.visible()
c1.setAutoRange()
 
h1 = new H2D("2D histogram",25,-3.0, 3.0,25,-3.0, 3.0)
rand = new Random()
(1..200).each { // or (0..<200).each {
    // or, in Java style: for (i = 0; i < 200; i++) {
    // if the loop index is needed, it is available as "it" inside the closure:
    // (0..<200).each { println "step: ${it+1}" }
    h1.fill(rand.nextGaussian(), rand.nextGaussian())
}
c1.draw(h1)
c1.export("jhplot3d.eps") // export to EPS Vector Graphics
</syntaxhighlight>
 
[[Groovy (programming language)|Groovy]] is better integrated with Java and can be about three times
faster than Jython for long loops over primitives.
 
==Reviews==
 
DataMelt and its earlier versions, SCaVis (2013-2015) and jHepWork (2005-2013), which are still available from the [http://jwork.org/dmelt/index.php?id=previous DataMelt archive repository], are described in these articles:
<ref>Data Analysis and Data Mining Using Java, Jython and jHepWork Blog. 2010. Oracle.com. [https://community.oracle.com/docs/DOC-982931]
</ref>
<ref>
<ref>
DataMelt – Werkbank für technisch-wissenschaftliche Berechnungen und Visualisierungen mit Java und Jython. by Rohe Klaus.  Java SPEKTRUM. (in German) volume 5 (2013) 26-28 [https://www.sigs-datacom.de/fachzeitschriften/javaspektrum/archiv/artikelansicht/artikel-titel/integrationsspektrum-scavis-werkbank-fuer-technisch-wissenschaftliche-berechnungen-und-visualis.html]
DataMelt – Werkbank für technisch-wissenschaftliche Berechnungen und Visualisierungen mit Java und Jython. by Rohe Klaus.  Java SPEKTRUM. (in German) volume 5 (2013) 26-28 [https://www.sigs-datacom.de/fachzeitschriften/javaspektrum/archiv/artikelansicht/artikel-titel/integrationsspektrum-scavis-werkbank-fuer-technisch-wissenschaftliche-berechnungen-und-visualis.html]
</ref>
</ref>.
<ref>HEP data analysis using jHepWork and Java. Proceedings of the HERA-LHC workshops (2007-2008), DESY-CERN [https://arxiv.org/abs/0809.0840]
More recently, the DataMelt program was described in the book <ref>Numeric Computation and Statistical Data Analysis on the Java Platform (Book). By  S.V.Chekanov, Springer, (2016) ISBN 978-3-319-28531-3, 700 pages, [https://www.springer.com/gp/book/9783319285290]</ref>. According to Springer International, this book was among the top 25% most downloaded books in 2016 and 2017 in the category "Advanced Information and Knowledge Processing".
</ref>
<ref>Suitability analysis of data mining tools and methods. [https://is.muni.cz/th/255695/fi_b/]. S.Kovac, Bachelor's thesis (in English), jHepWork is reviewed on page 39-42, Masaryk University.
</ref>
The program was compared with other similar frameworks in these resources
<ref>
A Review: Comparative Study of Diverse Collection  of Data Mining Tools. By  S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila.  International Journal of Computer, Control, Quantum and Information Engineering. 2014; 8(6). 7.
</ref>
<ref>jHepWork – full-featured multiplatform data-analysis framework. [http://blog.dreamcss.com/frameworks/jhepwork-multiplatform-data-analysis-framework/]
</ref>
<ref>Java Applications: Weka, Svnkit, Jhepwork, Fiji, Memoranda, Livegraph, Dirsync Pro, Moneydance, Rachota Timetracker, Data Mapping Engine. de LLC Books (2010) (Redactor) [https://www.amazon.es/Java-Applications-Memoranda-Moneydance-Timetracker/dp/1156999391]
</ref>
<ref>A Study of Tools, Techniques, and Trends for Big Data Analytics. By R.Shireesha et al.  (2016) International Journal of Advance Computing Technique and Applications (IJACTA), ISSN : 2321-4546, Vol 4, Issue 1 [http://www.ijacta.com/index.php/ojs/article/download/41/33]
</ref>
<ref>Comparison of Various Tools for Data Mining. By P.Kaur et al. IJERT ISSN: 2278-0181 Vol. 3 Issue 10 (2010) [https://www.ijert.org/download/11405/comparison-of-various-tools-for-data-mining]
</ref>
<ref>Advanced Web and Network Technologies, and Applications. By Heng Tao Shen et al. Springer Science & Business Media - 2006-01-09
</ref>
<ref>SCaVis 2.1. Review by Pete Daniel  (2014) [https://www.download3k.com/Home-Education/Science/Download-SCaVis.html]
</ref>
.


DataMelt (2015-) is a new development of the jHepWork and SCaVis programs.
Comparisons of DataMelt with other similar packages for statistical and numeric analysis are given in these resources
Comparisons of DataMelt with other popular packages for statistical and numeric analysis are given in these resources
<ref>Comparative Analysis of Information Extraction Techniques for Data Mining, by Amit Verma et al. Indian Journal of Science and Technology, Vol 9, March 2016 [http://www.indjst.org/index.php/indjst/article/download/80464/67992]
<ref>Comparative Analysis of Information Extraction Techniques for Data Mining, by Amit Verma et al. Indian Journal of Science and Technology, Vol 9, March 2016 [http://www.indjst.org/index.php/indjst/article/download/80464/67992]
</ref>
</ref>
Line 197: Line 68:
</ref>
</ref>
.
.
==Popularity==
jHepWork and SCaVis/DataMelt are part of the software library of the National Institutes of Health Library
<ref>Data Sciences Workstation: SCaVis. By Lisa Federer. National Institutes of Health Library [http://nihlibrary.campusguides.com/c.php?g=102650&p=672758]</ref>,
Mathematical support of Institute for Nuclear Research of Russian academy of Sciences<ref>The DataForge, Sector for Mathematical Support of Institute for Nuclear Research of Russian academy of Sciences [http://www.inr.ru/~nozik/dataforge/misc.html]</ref> and others.
On a commercial site, DataMelt is provided as a service on Amazon EC2 clouds by the Miri Infotech IT Solution Provider company
<ref>Miri Infotech. A Complete IT Solution Provider. [http://www.miritech.com/products/aws/datamelt.aspx  DataMelt deployment]</ref>.
It is difficult to judge how many users use DataMelt since download information from the main resource [http://jwork.org/dmelt] is not available.
[[Sourceforge|Sourceforge]], which provides an alternative download option, quotes 300 monthly downloads [https://sourceforge.net/projects/dmelt/] (May 2018).
One estimate can be made from the popularity of the book
<ref>Numeric Computation and Statistical Data Analysis on the Java Platform (Book). S.V.Chekanov, Springer, (2016)
ISBN 978-3-319-28531-3, 700 pages, [https://www.springer.com/gp/book/9783319285290]</ref>
which is an introduction to the DataMelt program. According to
Springer International, this book was among the top 25% most downloaded books in 2016 and 2017 in the category "Advanced Information and Knowledge Processing"
<ref>Springer Book Performance Report [http://jwork.org/dmelt/data_dm/reports/BookPerformanceReport2017.pdf]</ref>.
From the publication of the book until May 2018, Springer recorded 26k chapter downloads<ref>Springer download Statistics of the book "Numeric Computation and Statistical Data Analysis on the Java Platform" 2016 [http://www.bookmetrix.com/detail/book/1b56c407-cc52-49ee-9547-5ec300e18498#downloads]</ref>, about 1500 per chapter.
The previous book describing jHepWork had a similar popularity <ref>Springer download Statistics of the book "Scientific Data Analysis using Jython Scripting and Java" [http://www.bookmetrix.com/detail/book/0d0c46b5-edee-4145-b112-4167b6739c11#downloads]</ref>.
Bookmetrix estimates 140 readers of the DataMelt book.
== References ==
|field_analysis_orientation=General analysis
|field_data_analysis_objective=
|field_data_manipulation_type=Data transformation, Data analysis, Data visualisation
|field_import_format=
|field_export_format=
|field_extraction_type=
|field_data_transformation_capabilities=Mathematical transformation of data for analysis
|field_analysis_type=Data mining methods and algorithms, Basic statistics and data summarization
|field_visualisation_type=Sequential Graphic, Chart/Diagram, Map
|field_tool_usability=
|field_statistics_level=Advanced
|field_programming_level=Basic
|field_system_engineering_level=None
|field_data_mining_models_level=Medium
|field_completion_level=Medium
|field_last_edition=2014/02/26
}}

Latest revision as of 23:27, 15 March 2020

Dm logo125px.png


Computation and Visualization Environment 2.4 (2019/02/02)

Datamelt 2dhisto.jpeg

Developed by: DataMelt community. Led by S.Chekanov
License: GPL / GNU for non commercial use. Commercial-friendly license is available
Web page : Tool homepage
Tool type : Application software

Tool.png

The last edition of this page was on: 2018/05/23

The Completion level of this page is : Medium


SHORT DESCRIPTION

DataMelt is a software environment for numeric calculations, statistics and data analysis


DataMelt, or DMelt, is an environment for numeric computation, data analysis and data visualization. DMelt is designed for analysis of large data volumes ("big data"), data mining, statistical analyses and math computations. The program can be used in many areas, such as natural sciences, engineering, modeling and analysis of financial markets.

DMelt is a computational platform: it can be used with different programming languages on different operating systems. Unlike other statistical programs, DataMelt is not limited to a single programming language. Data analysis and statistical computations can be done using high-level scripting languages (Python/Jython, Groovy, etc.), as well as a lower-level language, such as Java. It incorporates many open-source Java packages into a coherent interface using the concept of dynamic scripting.

DMelt creates high-quality vector-graphics images (SVG, EPS, PDF etc.) that can be included in LaTeX and other text-processing systems. The program runs on Windows, Linux and Mac OS.


TOOL CHARACTERISTICS

Usability

Authors of this page consider that this tool is rather easy to use.

Tool orientation

This tool is designed for general purpose analysis.

Data mining type

This tool is made for '.

Manipulation type

This tool is designed for Data transformation, Data analysis, Data visualisation.

IMPORT FORMAT :

EXPORT FORMAT :


Tool objective(s) in the field of Learning Sciences

Analysis & Visualisation of data
Predicting student performance
Student modelling
Social Network Analysis (SNA)
Constructing courseware

Providing feedback for supporting instructors:
Recommendations for students
Grouping students:
Developing concept maps:
Planning/scheduling/monitoring
Experimentation/observation

Tool can perform:

  • Data extraction of type:
  • Transformation of type: Mathematical transformation of data for analysis
  • Data analysis of type: Basic statistics and data summarization, Data mining methods and algorithms
  • Data visualisation of type: Sequential Graphic, Chart/Diagram, Map (These visualisations can be interactive and updated in "real time")



ABOUT USERS

Tool is suitable for:

Students/Learners/Consumers
Teachers/Tutors/Managers
Researchers
Developers/Designers
Organisations/Institutions/Firms
Others

Required skills:

STATISTICS: Advanced

PROGRAMMING: Basic

SYSTEM ADMINISTRATION: None

DATA MINING MODELS: Medium



FREE TEXT


Tool version : Computation and Visualization Environment 2.4 (2019/02/02)
Developed by : DataMelt community. Led by S.Chekanov
Tool Web page : https://datamelt.org/
Tool type : Application software
License : GPL / GNU for non commercial use. Commercial-friendly license is available

TOOL CHARACTERISTICS


Tool orientation: This tool is designed for general purpose analysis.
Data mining type: This tool is designed for .
Usability: Authors of this page consider that this tool is rather easy to use.
Data import format: .
Data export format: .
Tool objective(s) in the field of Learning Sciences

☑ Analysis & Visualisation of data
☑ Predicting student performance
☑ Student modelling
☑ Social Network Analysis (SNA)
☑ Constructing courseware

☑ Providing feedback for supporting instructors:
☑ Recommendations for students
☑ Grouping students:
☑ Developing concept maps:
☑ Planning/scheduling/monitoring
Experimentation/observation

Can perform data extraction of type:


Can perform data transformation of type:
Mathematical transformation of data for analysis


Can perform data analysis of type:
Basic statistics and data summarization, Data mining methods and algorithms


Can perform data visualisation of type:
Sequential Graphic, Chart/Diagram, Map (These visualisations can be interactive and updated in "real time")


ABOUT USER


Tool is suitable for:
☑ Students/Learners/Consumers
☑ Teachers/Tutors/Managers
☑ Researchers
☑ Organisations/Institutions/Firms
☑ Others

Required skills:
Statistics: ADVANCED
Programming: BASIC
System administration: NONE
Data mining models: MEDIUM


History

DataMelt has its roots in particle physics, where data mining is a primary task. It was created as the jHepWork project in 2005 [1] using the Java software concept for the International Linear Collider project developed at SLAC. Later versions of jHepWork were modified for general public use (by scientists, engineers and students, for educational purposes) since the International Linear Collider project has stalled. In 2013, jHepWork was renamed to SCaVis. In 2015 it became a community-supported project and was renamed to DataMelt.

DataMelt is hosted by the jWork.ORG portal[2].

Documentation

In 2018, the DataMelt web page hosted about 600 examples written in Jython, Java, Groovy and JRuby, covering a number of fields, from general mathematics to data mining and data visualization. The Java API documentation describes more than 40,000 Java classes. The DataMelt documentation is subject to certain restrictions for the general public because of the proprietary nature of the documentation project.


jHepWork was described in the book "Scientific Data analysis using Jython Scripting and Java" [3]. Later it was also discussed in the German Java SPEKTRUM journal [4]. More recently, the DataMelt program was described in the book [5]. According to Springer International, this book was among the top 25% most downloaded books in 2016 and 2017 in the category "Advanced Information and Knowledge Processing".

Comparisons of DataMelt with other similar packages for statistical and numeric analysis are given in these resources [6] [7] [8] [9] [10] .

  1. HEP data analysis using jHepWork and Java. ANL-HEP-CP-08-53 preprint, CERN preprint, arXiv:0809.0840v2, https://arxiv.org/abs/0809.0840/
  2. jWork.ORG Community Portal focused on Java scientific software. http://jwork.org/main/
  3. Scientific Data analysis using Jython Scripting and Java. Book. By S.V.Chekanov, Springer-Verlag, ISBN 978-1-84996-286-5, https://www.springer.com/us/book/9781849962865
  4. DataMelt – Werkbank für technisch-wissenschaftliche Berechnungen und Visualisierungen mit Java und Jython. By Rohe Klaus. Java SPEKTRUM (in German), volume 5 (2013) 26-28, https://www.sigs-datacom.de/fachzeitschriften/javaspektrum/archiv/artikelansicht/artikel-titel/integrationsspektrum-scavis-werkbank-fuer-technisch-wissenschaftliche-berechnungen-und-visualis.html
  5. Numeric Computation and Statistical Data Analysis on the Java Platform (Book). By S.V.Chekanov, Springer (2016), ISBN 978-3-319-28531-3, 700 pages, https://www.springer.com/gp/book/9783319285290
  6. Comparative Analysis of Information Extraction Techniques for Data Mining. By Amit Verma et al. Indian Journal of Science and Technology, Vol 9, March 2016, http://www.indjst.org/index.php/indjst/article/download/80464/67992
  7. Evaluation and comparison of open source software suites for data mining and knowledge discovery. By A.H. Altalhi et al. Wiley Online Library (2017)
  8. Brief Review of Educational Applications Using Data Mining and Machine Learning. By A. Berenice Urbina Nájera, Jorge de la Calleja Mora. Revista Electrónica de Investigación Educativa (Redie), ISSN 1607-4041, 19(4), 84-96
  9. Analysis of Data Using Data Mining tool Orange. By Maqsud S.Kukasvadiya et al. IJEDR, Volume 5, Issue 2 (2017), ISSN: 2321-9939
  10. Big Data - A Survey of Big Data Technologies. By P.Dhavalchandra, M.Jignasu, R.Amit. International Journal of Science and Technology, Volume 2, p45-50 (2016)