Educational data mining: Difference between revisions

The educational technology and digital learning wiki
Revision as of 23:01, 16 January 2014

Draft


Introduction

“Educational Data Mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings which they learn in. Whether educational data is taken from students' use of interactive learning environments, computer-supported collaborative learning, or administrative data from schools and universities, it often has multiple levels of meaningful hierarchy, which often need to be determined by properties in the data itself, rather than in advance. Issues of time, sequence, and context also play important roles in the study of educational data.” (Educational Data Mining Society home page, retrieved Jan 17, 2014)

“Educational Data Mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings in which they learn.” (JEDM - Journal of Educational Data Mining, retrieved Jan 17, 2014)

See also:

  • learning analytics

Educational data mining is rooted in general data mining, but it has its own specifics:

“Data mining, also called Knowledge Discovery in Databases (KDD), is the field of discovering novel and potentially useful information from large amounts of data [Witten and Frank 1999]. It has been proposed that educational data mining methods are often different from standard data mining methods, due to the need to explicitly account for (and the opportunities to exploit) the multi-level hierarchy and non-independence in educational data [Baker in press]. For this reason, it is increasingly common to see the use of models drawn from the psychometrics literature in educational data mining publications [Barnes 2005; Desmarais and Pu 2005; Pavlik et al. 2008].” (Baker & Yacef, 2009)

Data mining vs. learning analytics

According to Baker and Siemens (2013), “Siemens and Baker (2012) noted that the two communities have considerable overlap (in terms of both research and researchers), and that the two communities strongly believe in conducting research that has applications that benefit learners as well as informing and enhancing the learning sciences.”

Data mining components

According to Baker & Yacef (2009), Romero & Ventura (2007) identify the following types of educational data mining:

  1. Statistics and visualization
  2. Web mining
    1. Clustering, classification, and outlier detection
    2. Association rule mining and sequential pattern mining
    3. Text mining

Baker & Yacef (2009) then summarize a new typology defined in Baker (2010):

  1. Prediction
    • Classification
    • Regression
    • Density estimation
  2. Clustering
  3. Relationship mining
    • Association rule mining
    • Correlation mining
    • Sequential pattern mining
    • Causal data mining
  4. Distillation of data for human judgment
  5. Discovery with models
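
As an illustration of one entry in the typology above, association rule mining under "Relationship mining", the following is a minimal sketch in Python. The `sessions` data, the function names, and the thresholds are assumptions made up for this example; they do not come from Baker & Yacef (2009) or from any particular EDM tool. The sketch only computes support and confidence for single-item rules, which is the core of the technique but not a full algorithm such as Apriori.

```python
from itertools import combinations

# Hypothetical data: the set of resources each student used in one course week.
sessions = [
    {"forum", "quiz", "video"},
    {"quiz", "video"},
    {"forum", "quiz"},
    {"video"},
    {"forum", "quiz", "video"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) over the transactions."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / base

def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """List single-item rules (lhs -> rhs) that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            c = confidence({lhs}, {rhs}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules

rules = pair_rules(sessions)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}  support={s:.2f}  confidence={c:.2f}")
```

With this toy data, every student who used the forum also took the quiz, so the rule forum -> quiz has confidence 1.0. Whether such a rule is pedagogically interesting is a separate question that the thresholds alone cannot answer.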

Data mining relies on several types of sources:

  • Log files (if you have access)
  • Analytics databases filled with data from client-side JavaScript code (user actions such as entering a page can be recorded and the user can be traced through cookies)
  • Web page contents
  • Database contents
  • Productions (other than web sites), e.g. word processing documents
  • ...
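
To make the first source type concrete, here is a minimal sketch of turning raw log lines into per-user action counts and navigation paths. The log format (timestamp, user id, action, resource, separated by spaces) is an assumption invented for this example; real server or LMS logs differ and usually need a format-specific parser.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical log format (an assumption, not a real server format):
#   <ISO timestamp> <user id> <action> <resource>
raw_log = """\
2014-01-16T09:00:05 u1 view /course/intro
2014-01-16T09:02:40 u2 view /course/intro
2014-01-16T09:03:10 u1 post /forum/week1
2014-01-16T09:10:00 u1 view /quiz/1
malformed line
2014-01-16T09:12:30 u2 view /quiz/1
"""

def parse_line(line):
    """Return (timestamp, user, action, resource), or None if the line does not fit."""
    parts = line.split()
    if len(parts) != 4:
        return None
    stamp, user, action, resource = parts
    try:
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None
    return when, user, action, resource

def summarize(log_text):
    """Count actions per user and collect each user's resources in time order."""
    counts = defaultdict(Counter)
    paths = defaultdict(list)
    events = [e for e in map(parse_line, log_text.splitlines()) if e]
    for when, user, action, resource in sorted(events):
        counts[user][action] += 1
        paths[user].append(resource)
    return counts, paths

counts, paths = summarize(raw_log)
for user in sorted(counts):
    print(user, dict(counts[user]), paths[user])
```

Lines that do not match the assumed format are skipped silently here. For real analyses it is usually safer to count and inspect the rejected lines, since they may indicate a logging problem.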

Types of analytics that can be obtained

  • quality of text
  • richness of content
  • content (with respect to some benchmark text)
  • similarity of content (among productions)
  • etc. (this list needs to be completed)
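
The two content-comparison items in the list above (content with respect to a benchmark text, and similarity among productions) could be approximated as follows. This is a sketch under strong assumptions: it uses plain word counts and cosine similarity, which is one common choice but not a method prescribed by this article, and the sample texts and names are invented.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical student productions and benchmark text.
productions = {
    "student_a": "Data mining finds patterns in large data sets.",
    "student_b": "Data mining finds patterns in educational data.",
    "student_c": "The weather was nice during the field trip.",
}
benchmark = "Data mining discovers useful patterns in large amounts of data."

vectors = {name: bag_of_words(text) for name, text in productions.items()}
benchmark_vector = bag_of_words(benchmark)

# Similarity of each production to the benchmark text.
to_benchmark = {n: cosine_similarity(v, benchmark_vector) for n, v in vectors.items()}

# Similarity among productions.
a_vs_b = cosine_similarity(vectors["student_a"], vectors["student_b"])
a_vs_c = cosine_similarity(vectors["student_a"], vectors["student_c"])

print(to_benchmark)
print(a_vs_b, a_vs_c)
```

Word counts ignore word order, synonyms, and meaning, so a score of this kind is at best a rough signal and should not be read as a measure of text quality.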

Software

-... to do ....

Links

Journals

  • JEDM - Journal of Educational Data Mining

Organizations and communities

People

  • Ryan Baker (Ryan Shaun Joazeiro de Baker). Includes many interesting online (or draft) EDM publications.

Bibliography

  • Baepler, P., & Murdoch, C. J. (2010). Academic Analytics and Data Mining in Higher Education. International Journal for the Scholarship of Teaching and Learning, 4(2).
  • Baker, R.S.J.d. (2010) Data Mining for Education. In McGaw, B., Peterson, P., Baker, E. (Eds.) International Encyclopedia of Education (3rd edition), vol. 7, pp. 112-118. Oxford, UK: Elsevier. Draft PDF
  • Baker, R.S.J.d. (2010) Mining Data for Student Models. In Nkambou, R., Mizoguchi, R., & Bourdeau, J. (Eds.) Advances in Intelligent Tutoring Systems, pp. 323-338. Secaucus, NJ: Springer. PDF Reprint
  • Baker, R.S.J.d., Inventado, P.S. (in press) Educational Data Mining and Learning Analytics. To appear in J.A. Larusson, B. White (Eds.) Learning Analytics: From Research to Practice. Berlin, Germany: Springer. preprint draft pdf
  • Baker, R., Siemens, G. (in press) Educational data mining and learning analytics. To appear in Sawyer, K. (Ed.) Cambridge Handbook of the Learning Sciences: 2nd Edition preprint draft pdf
  • Baker, R.S.J.d. (2013) Learning, Schooling, and Data Analytics. Handbook on Innovations in Learning for States, Districts, and Schools, pp.179-190. Philadelphia, PA: Center on Innovations in Learning. pdf
  • Baker, R.S.J.d., Yacef, K. (2009) The State of Educational Data Mining in 2009: A Review and Future Visions. Journal of Educational Data Mining, 1 (1), 3-17. PDF
    • This is a frequently cited article
  • Baker, R.S.J.d., de Carvalho, A. M. J. A. (2008) Labeling Student Behavior Faster and More Precisely with Text Replays. Proceedings of the 1st International Conference on Educational Data Mining, 38-47.
  • Gobert, J.D., Sao Pedro, M., Raziuddin, J., Baker, R. (2013) From Log Files to Assessment Metrics: Measuring Students' Science Inquiry Skills Using Educational Data Mining. Journal of the Learning Sciences, 22 (4), 521-563 [official pdf]
  • Gobert, J.D., Sao Pedro, M.A., Baker, R.S.J.d., Toto, E., Montalvo, O. (2012) Leveraging Educational Data Mining for Real-time Performance Assessment of Scientific Inquiry Skills within Microworlds. Journal of Educational Data Mining, 4 (1), 111-143 [pdf]
  • Hübscher, R. & Puntambekar, S. (2008). Integrating knowledge gained from data mining with pedagogical knowledge. In Proceedings of the 1st International Conference on Educational Data Mining (EDM2008), 97–106. (PDF)
  • Hübscher, R., Puntambekar, S., & Nye, A. H. (2007). Domain specific interactive data mining. In Proceedings of Workshop on Data Mining for User Modeling, 11th International Conference on User Modeling, Corfu, Greece, 81–90. (PDF)
  • Koedinger, K.R., Baker, R.S.J.d., Cunningham, K., Skogsholm, A., Leber, B., Stamper, J. (2010) A Data Repository for the EDM community: The PSLC DataShop. In Romero, C., Ventura, S., Pechenizkiy, M., Baker, R.S.J.d. (Eds.) Handbook of Educational Data Mining. Boca Raton, FL: CRC Press, pp. 43-56.
  • Macfadyen, L. P., & Sorenson, P. (2010). Using LiMS (the Learner Interaction Monitoring System) to track online learner engagement and evaluate course design. In Proceedings of the 3rd International Conference on Educational Data Mining (pp. 301–302), Pittsburgh, USA.
  • Macfadyen, L. P., & Dawson, S. (2010). Mining LMS data to develop an “early warning” system for educators: a proof of concept. Computers & Education, 54(2), 588–599.
    • This is an often cited text defining how EDM could be directly useful to educators.
  • Merceron, A. and K. Yacef (2005b). Educational data mining: a case study. In C. K. Looi, G. McCalla, B. Bredeweg, and J. Breuker, editors, Proceedings of the 12th Conference on Artificial Intelligence in Education, pages 467-474, Amsterdam, The Netherlands, 2005. IOS Press.
  • Romero, C., Ventura, S., & Garcia, E. (2008). Data mining in course management systems: Moodle case study and tutorial. Computers & Education, 51(1), 368-384.
  • Romero, C., Ventura, S. (in press) Educational Data Mining: A Review of the State-of-the-Art. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews. PDF
  • Romero, C., & Ventura, S. (2007). Educational data mining: A survey from 1995 to 2005. Expert Systems with Applications, 33(1), 135-146.
  • Merceron, A. and K. Yacef. Interestingness measures for association rules in educational data. In Proceedings of Educational Data Mining Conference, pages 57-66, 2008.
  • Perera, D.; J. Kay, K. Yacef, and I. Koprinska. Mining learners' traces from an online collaboration tool. In Proceedings of Educational Data Mining workshop, pages 60-69, 2007. HTML
  • Southavilay, V., K. Yacef, and R. A. Calvo. Process mining to support students’ collaborative writing (best student paper award). In Educational Data Mining conference proceedings, pages 257-266, 2010.
  • Southavilay, V., K. Yacef, and R. A. Calvo. Analysis of collaborative writing processes using hidden Markov models and semantic heuristics. Submitted to Workshop on Semantic Aspects in Data Mining (SADM) at ICDM2010, 2010.
  • Winne, P.H., Baker, R.S.J.d. (2013) The Potentials of Educational Data Mining for Researching Metacognition, Motivation, and Self-Regulated Learning. Journal of Educational Data Mining, 5 (1), 1-8. [pdf]