Author(s):
- Molly Q. Feldman
- Ji Yong Cho
- Monica Ong
- Sumit Gulwani
- Zoran Popović
- Erik Andersen
Abstract:
K-8 mathematics students must learn many procedures, such as addition and subtraction. Students frequently learn “buggy” variations of these procedures, which ideally could be identified automatically. This is challenging because there are many possible variations that reflect deep compositions of procedural thought. Existing approaches for K-8 math use manually specified variations, which do not scale to new math algorithms or previously unseen misconceptions. Our system examines students’ answers and infers how they incorrectly combine basic skills into complex procedures. We evaluate this approach on data from approximately 300 students. Our system replicates 86% of the answers that contain clear systematic mistakes (13%). Investigating further, we found that 77% at least partially replicate a known misconception, with 53% matching exactly. We also present data from 29 participants showing that our system can demonstrate inferred incorrect procedures to an educator as successfully as a human expert.
Document: https://dl.acm.org/doi/10.1145/3173574.3173838