Web-harvest
Web-harvest 2.0 (2010/02/17)
Developed by: Web-Harvest Team
License: BSD license (original version)
Web page: http://web-harvest.sourceforge.net/
Tool type :
This page was last edited on: 2014/02/26
The completion level of this page is: Low
SHORT DESCRIPTION
Web-Harvest is an open-source web data extraction tool written in Java. It offers a way to collect desired web pages and extract useful data from them. To do that, it leverages well-established techniques and technologies for text/XML manipulation, such as XSLT, XQuery and regular expressions. Web-Harvest mainly focuses on HTML/XML-based web sites, which still make up the vast majority of Web content. It can also be easily supplemented with custom Java libraries to augment its extraction capabilities.
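The two-step idea behind this description (narrow a page down with XSLT, then pick values out with a regular expression) can be sketched with plain JDK classes. This is only an illustration under stated assumptions: it does not use Web-Harvest's own API or configuration format, and the class name `ExtractionSketch`, the sample page and the stylesheet are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative only: plain JDK classes, not Web-Harvest's own API.
class ExtractionSketch {

    // Hypothetical sample page (well-formed XHTML) standing in for a downloaded document.
    static final String SAMPLE_PAGE =
        "<html><body><ul>"
        + "<li class=\"item\">Apples: 3</li>"
        + "<li class=\"item\">Pears: 5</li>"
        + "<li class=\"note\">Updated daily</li>"
        + "</ul></body></html>";

    // XSLT 1.0 stylesheet that emits the text of each <li class="item">, one per line.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"//li[@class='item']\">"
        + "<xsl:value-of select=\".\"/><xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Step 1: reduce the page to the lines of interest with XSLT.
    static String selectItems(String page) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(page)), new StreamResult(out));
        return out.toString();
    }

    // Step 2: pull name/number pairs out of the text with a regular expression.
    static List<String> parsePairs(String text) {
        List<String> pairs = new ArrayList<>();
        Matcher m = Pattern.compile("(\\w+): (\\d+)").matcher(text);
        while (m.find()) {
            pairs.add(m.group(1) + "=" + m.group(2));
        }
        return pairs;
    }

    // Chains both steps: XSLT selection followed by regex parsing.
    static List<String> extract(String page) throws Exception {
        return parsePairs(selectItems(page));
    }

    public static void main(String[] args) throws Exception {
        for (String pair : extract(SAMPLE_PAGE)) {
            System.out.println(pair);
        }
    }
}
```

The sketch assumes well-formed input. Real HTML pages are often not well-formed XML and would need a cleaning step before an XSLT processor can read them.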
TOOL CHARACTERISTICS
Tool orientation | Data mining type | Usability |
---|---|---|
This tool is designed for general purpose analysis. | Not specified. | Not specified. |
Data import format | Data export format |
---|---|
Not specified. | Not specified. |
Tool objective(s) in the field of Learning Sciences | |
☑ Analysis & Visualisation of data |
☑ Providing feedback for supporting instructors |
Can perform data extraction of type:
Can perform data transformation of type:
Can perform data analysis of type:
Can perform data visualisation of type:
(These visualisations can be interactive and updated in "real time")
ABOUT USER
Tool is suitable for: |
Students/Learners/Consumers: ☑ | Teachers/Tutors/Managers: ☑ | Researchers: ☑ | Organisations/Institutions/Firms: ☑ | Others: ☑ |
Required skills: |
Statistics: | Programming: | System administration: | Data mining models: |
OTHER TOOL INFORMATION
Name: Web-harvest
License: BSD license (original version), free and open source
Developed by: Web-Harvest Team
Release date: 2010/02/17
Version: 2.0
Web page: http://web-harvest.sourceforge.net/
Tool orientation: General analysis
Data extraction
Completion level: Low