Educational data mining

The educational technology and digital learning wiki
Revision as of 20:18, 16 January 2014 by Daniel K. Schneider



Data mining vs. learning analytics

Data mining components

According to Baker & Yacef (2009), Romero & Ventura (2007) identify the following types of educational data mining:

  1. Statistics and visualization
  2. Web mining
    1. Clustering, classification, and outlier detection
    2. Association rule mining and sequential pattern mining
    3. Text mining

Baker & Yacef (2009) then summarize a new typology defined in Baker (2010):

  1. Prediction
    • Classification
    • Regression
    • Density estimation
  2. Clustering
  3. Relationship mining
    • Association rule mining
    • Correlation mining
    • Sequential pattern mining
    • Causal data mining
  4. Distillation of data for human judgment
  5. Discovery with models
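To make one entry of this typology concrete, the following is a minimal sketch of association rule mining (listed under relationship mining). The toy data, the function names, and the thresholds are assumptions chosen for illustration; they do not come from Baker & Yacef (2009). Each record is the set of resources a hypothetical student opened, and a rule "a -> b" is kept when its support and confidence pass the thresholds.

```python
from itertools import combinations

# Hypothetical toy data: the set of resources each student opened.
sessions = [
    {"forum", "quiz", "video"},
    {"forum", "quiz"},
    {"video"},
    {"forum", "quiz", "wiki"},
    {"quiz", "wiki"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / base

def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Enumerate single-item rules lhs -> rhs that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            c = confidence({lhs}, {rhs}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules

rules = pair_rules(sessions)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}  support={s:.2f}  confidence={c:.2f}")
```

This brute-force enumeration only considers rules between single items; real tools typically use an algorithm such as Apriori to handle larger itemsets efficiently.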

Data mining relies on several types of sources:

  • Log files (if you have access)
  • Analytics databases populated by client-side JavaScript code, which can record user actions such as entering a page and trace users through cookies
  • Web page contents
  • Database contents
  • Productions other than websites, e.g. word-processing documents
  • ...
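As an illustration of working with the first source type, here is a minimal sketch that turns raw log lines into per-user sessions. The log format (ISO timestamp, user id, page, separated by whitespace), the sample lines, and the 30-minute inactivity threshold are all assumptions made for this example; real server logs will differ and need their own parser.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log lines: ISO timestamp, user id, requested page.
LOG = """\
2014-01-16T20:01:10 u1 /course/intro
2014-01-16T20:02:45 u2 /course/intro
2014-01-16T20:03:05 u1 /course/quiz1
2014-01-16T21:30:00 u1 /course/forum
"""

def parse_log(text):
    """Yield (timestamp, user, page) tuples, skipping blank or malformed lines."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        stamp, user, page = parts
        yield datetime.fromisoformat(stamp), user, page

def sessions_by_user(events, gap_minutes=30):
    """Group each user's page views into sessions split by inactivity gaps."""
    per_user = defaultdict(list)
    for stamp, user, page in sorted(events):
        per_user[user].append((stamp, page))
    result = {}
    for user, views in per_user.items():
        user_sessions = [[views[0][1]]]
        for (prev, _), (cur, page) in zip(views, views[1:]):
            # Start a new session when the pause exceeds the threshold.
            if (cur - prev).total_seconds() > gap_minutes * 60:
                user_sessions.append([])
            user_sessions[-1].append(page)
        result[user] = user_sessions
    return result

all_sessions = sessions_by_user(parse_log(LOG))
for user, user_sessions in all_sessions.items():
    print(user, user_sessions)
```

The resulting page sequences are the kind of input that sequential pattern mining or clustering could then operate on.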

Types of analytics that can be obtained

  • quality of text
  • richness of content
  • content (with respect to some benchmark text)
  • similarity of content (among productions)
  • etc. (this list needs to be completed)
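Two of these measures can be sketched with the standard library alone. The code below is an assumption-laden illustration, not a recommended metric: it uses the type-token ratio as a crude proxy for richness, and cosine similarity over word counts for comparing a production with a benchmark text or with another production. The tokenizer, function names, and sample texts are hypothetical.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(text):
    """Distinct words divided by total words, a rough proxy for lexical richness."""
    words = tokens(text)
    return len(set(words)) / len(words) if words else 0.0

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two word-count vectors."""
    a, b = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Hypothetical benchmark text and two student productions.
benchmark = "Clustering groups similar students without predefined labels."
essay_1 = "Clustering groups students that are similar, without using labels."
essay_2 = "The weather was nice and we went for a walk."

print(cosine_similarity(benchmark, essay_1))
print(cosine_similarity(benchmark, essay_2))
print(type_token_ratio(essay_1))
```

Word-count measures like these ignore word order and meaning, so they are best read as a first approximation rather than as a judgment of quality.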



  • JEDM - Journal of Educational Data Mining

Organizations and communities



  • Baepler, P., & Murdoch, C. J. (2010). Academic Analytics and Data Mining in Higher Education. International Journal for the Scholarship of Teaching and Learning, 4(2).
  • Baker, R.S.J.d. (2010) Data Mining for Education. In McGaw, B., Peterson, P., Baker, E. (Eds.) International Encyclopedia of Education (3rd edition), vol. 7, pp. 112-118. Oxford, UK: Elsevier. Draft PDF
  • Baker, R.S.J.d. (2010) Mining Data for Student Models. In Nkambou, R., Mizoguchi, R., & Bourdeau, J. (Eds.) Advances in Intelligent Tutoring Systems, pp. 323-338. Secaucus, NJ: Springer. PDF Reprint
  • Baker, R.S.J.d., Inventado, P.S. (in press) Educational Data Mining and Learning Analytics. To appear in J.A. Larusson, B. White (Eds.) Learning Analytics: From Research to Practice. Berlin, Germany: Springer. preprint draft pdf

  • Baker, R., Siemens, G. (in press) Educational data mining and learning analytics. To appear in Sawyer, K. (Ed.) Cambridge Handbook of the Learning Sciences: 2nd Edition. preprint draft pdf

  • Baker, R.S.J.d. (2013) Learning, Schooling, and Data Analytics. Handbook on Innovations in Learning for States, Districts, and Schools, pp.179-190. Philadelphia, PA: Center on Innovations in Learning. pdf
  • Baker, R.S.J.d., Yacef, K. (2009) The State of Educational Data Mining in 2009: A Review and Future Visions. Journal of Educational Data Mining, 1 (1), 3-17. PDF
    • This is a frequently cited article
  • Baker, R.S.J.d., de Carvalho, A. M. J. A. (2008) Labeling Student Behavior Faster and More Precisely with Text Replays. Proceedings of the 1st International Conference on Educational Data Mining, 38-47.
  • Gobert, J.D., Sao Pedro, M., Raziuddin, J., Baker, R. (2013) From Log Files to Assessment Metrics: Measuring Students' Science Inquiry Skills Using Educational Data Mining. Journal of the Learning Sciences, 22 (4), 521-563 [official pdf]
  • Gobert, J.D., Sao Pedro, M.A., Baker, R.S.J.d., Toto, E., Montalvo, O. (2012) Leveraging Educational Data Mining for Real-time Performance Assessment of Scientific Inquiry Skills within Microworlds. Journal of Educational Data Mining, 4 (1), 111-143 [pdf]
  • Hübscher, R. & Puntambekar, S. (2008). Integrating knowledge gained from data mining with pedagogical knowledge. In Proceedings of the 1st International Conference on Educational Data Mining (EDM2008), 97–106. (PDF)
  • Hübscher, R., Puntambekar, S., & Nye, A. H. (2007). Domain specific interactive data mining. In Proceedings of Workshop on Data Mining for User Modeling, 11th International Conference on User Modeling, Corfu, Greece, 81–90. (PDF)
  • Koedinger, K.R., Baker, R.S.J.d., Cunningham, K., Skogsholm, A., Leber, B., Stamper, J. (2010) A Data Repository for the EDM community: The PSLC DataShop. In Romero, C., Ventura, S., Pechenizkiy, M., Baker, R.S.J.d. (Eds.) Handbook of Educational Data Mining. Boca Raton, FL: CRC Press, pp. 43-56.
  • Macfadyen, L. P., & Sorenson, P. (2010). Using LiMS (the Learner Interaction Monitoring System) to track online learner engagement and evaluate course design. In Proceedings of the 3rd International Conference on Educational Data Mining (pp. 301–302), Pittsburgh, USA.
  • Macfadyen, L. P., & Dawson, S. (2010). Mining LMS data to develop an “early warning” system for educators: a proof of concept. Computers & Education, 54(2), 588–599.
    • This is an often cited text defining how EDM could be directly useful to educators.
  • Merceron, A. and K. Yacef (2005b). Educational data mining: a case study. In C. K. Looi, G. McCalla, B. Bredeweg, and J. Breuker, editors, Proceedings of the 12th Conference on Artificial Intelligence in Education, pages 467-474, Amsterdam, The Netherlands, 2005. IOS Press.
  • Romero, C., Ventura, S.N., & Garcia, E. (2008). Data mining in course management systems: Moodle case study and tutorial. Computers & Education, 51(1), 368-384.
  • Romero, C., Ventura, S. (in press) Educational Data Mining: A Review of the State-of-the-Art. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews. PDF
  • Romero, C., & Ventura, S. (2007). Educational data mining: A survey from 1995 to 2005. Expert Systems with Applications, 33(1), 135-146.
  • Merceron, A. and K. Yacef. Interestingness measures for association rules in educational data. In Proceedings of Educational Data Mining Conference, pages 57-66, 2008.
  • Perera, D., J. Kay, K. Yacef, and I. Koprinska. Mining learners' traces from an online collaboration tool. In Proceedings of Educational Data Mining workshop, pages 60-69, 2007. HTML
  • Southavilay, V., K. Yacef, and R. A. Calvo. Process mining to support students’ collaborative writing (best student paper award). In Educational Data Mining conference proceedings, pages 257-266, 2010.
  • Southavilay, V., K. Yacef, and R. A. Calvo. Analysis of collaborative writing processes using hidden Markov models and semantic heuristics. Submitted to Workshop on Semantic Aspects in Data Mining (SADM) at ICDM2010, 2010.
  • Winne, P.H., Baker, R.S.J.d. (2013) The Potentials of Educational Data Mining for Researching Metacognition, Motivation, and Self-Regulated Learning. Journal of Educational Data Mining, 5 (1), 1-8. [pdf]