Language Learning and AI: 7 lessons from 70 years (#5)

5. Recording learner behavior and student modeling

The intelligent tutoring systems in ICALL stored this knowledge in a student model (Schulze, 2012). Student modeling (e.g., Bull, 1993, 1994, 2000; Mabbott & Bull, 2004; McCalla, 1992; Michaud & McCoy, 2000; Schulze, 2008; Self, 1974; Tsiriga & Virvou, 2003) is a challenging endeavor: student data needs to be recorded and structured into a student profile, from which inferences can then be drawn to construct a student model over time. The model holds structured information about prior learning, learner beliefs, strategies, and preferences, and language beliefs. Basically, it models the information teachers have about their students through both student records and their own experience. Such information helps to tailor instructional sequences, guidance and help, and corrective feedback to the individual learner so that they become relevant and most effective. GenAIs have LLMs, which contain an enormous amount of information about language and languages (Wolfram, 2023, February 14); their knowledge of the learner, however, is often non-existent or serendipitous at best. Currently, in the context of language education and especially of previous research in ICALL and student modeling in general, the lack of a student model means that GenAIs can be neither treated nor employed as an intelligent tutoring system (ITS), because an ITS consists of a knowledge base, a student model, and a pedagogical module (Wikipedia contributors, 2024, December 20), which together imitate the behavior of a human tutor and provide individualized tutoring.
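
The step from recorded student data to a profile and then to inferred estimates can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a description of how any of the cited systems were built: the class names `Observation`, `StudentProfile`, and `StudentModel`, the skill labels, the 0.6 threshold, and the simple proportion-correct estimate are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Observation:
    """One recorded learner response (hypothetical record format)."""
    skill: str      # e.g. "subject-verb agreement"
    correct: bool


@dataclass
class StudentProfile:
    """Raw, structured record of what the learner did and prefers."""
    learner_id: str
    preferences: dict = field(default_factory=dict)
    observations: list = field(default_factory=list)

    def record(self, skill: str, correct: bool) -> None:
        self.observations.append(Observation(skill, correct))


class StudentModel:
    """Inferences drawn from the profile (here: proportion correct per skill)."""

    def __init__(self, profile: StudentProfile):
        self.profile = profile

    def mastery(self) -> dict:
        totals = defaultdict(lambda: [0, 0])  # skill -> [correct, attempts]
        for obs in self.profile.observations:
            totals[obs.skill][0] += int(obs.correct)
            totals[obs.skill][1] += 1
        return {skill: c / n for skill, (c, n) in totals.items()}

    def needs_practice(self, threshold: float = 0.6) -> list:
        """Skills whose estimated mastery falls below an assumed threshold."""
        return sorted(s for s, m in self.mastery().items() if m < threshold)


profile = StudentProfile("learner-1", preferences={"feedback": "metalinguistic"})
profile.record("subject-verb agreement", False)
profile.record("subject-verb agreement", True)
profile.record("subject-verb agreement", False)
profile.record("word order", True)
model = StudentModel(profile)
print(model.needs_practice())
```

A tutoring component could consult `needs_practice()` to choose the next exercise or the kind of feedback to give; a GenAI chat without such a record has nothing comparable to consult.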

My inspiration for this title came from the book
Snyder, T. (2017). On tyranny: Twenty lessons from the twentieth century. Tim Duggan Books.

I am sharing these early drafts of a book chapter I published in
Yijen Wang, Antonie Alm, & Gilbert Dizon (Eds.) (2025),
Insights into AI and language teaching and learning. Castledown Publishers.

https://doi.org/10.29140/9781763711600-02.

Thus far, I have given a historical introduction and talked about the necessary exposure to authentic language, communication in context, interaction in language learning with GenAI, and appropriate error correction and contingent feedback. The following describes the basis for lesson #5.

To be continued …

References

Bull, S. (1993). Towards User/System Collaboration in Developing a Student Model for Intelligent Computer-Assisted Language Learning. Computer Assisted Language Learning, 8, 3-8.

Bull, S. (1994). Student modeling for second language acquisition. Computers and Education, 23(1-2), 13-20.

Bull, S. (2000). ‘Do It Yourself’ Student Models for Collaborative Student Modelling and Peer Interaction. In B. P. Goettl, H. M. Halff, C. Redfield Luckhardt, & V. J. Shute (Eds.), Intelligent Tutoring Systems. 4th International Conference, ITS ’98, San Antonio, Texas, USA, August 16-19, 1998 Proceedings (pp. 176-185). Springer Verlag.

Mabbott, A., & Bull, S. (2004). Alternative Views on Knowledge: Presentation of Open Learner Models. In J. C. Lester, R. M. Vicari, & F. Paraguacu (Eds.), Intelligent Tutoring Systems: 7th International Conference (pp. 689-698). Springer-Verlag.

McCalla, G. I. (1992). The Centrality of Student Modelling to Intelligent Tutoring Systems. In E. Costa (Ed.), New Directions for Intelligent Tutoring Systems (pp. 107-131). Springer Verlag.

Michaud, L. N., & McCoy, K. F. (2000). Supporting Intelligent Tutoring in CALL by Modeling the User’s Grammar. In Proceedings of the Thirteenth Annual International Florida Artificial Intelligence Research Symposium, May 22-24, 2000, Orlando, Florida (pp. 50-54). AAAI Press.

Schulze, M. (2008). Modeling SLA Processes Using NLP. In C. Chapelle, Y.-R. Chung, & J. Xu (Eds.), Towards Adaptive CALL: Natural Language Processing for Diagnostic Assessment. (pp. 149-166). Iowa State University. https://apling.engl.iastate.edu/wp-content/uploads/sites/221/2015/05/5thTSLL2007_proceedings.pdf

Schulze, M. (2012). Learner modeling. In C. A. Chapelle (Ed.), The Encyclopaedia of Applied Linguistics. 10 volumes (pp. online n.p.). Wiley-Blackwell.

Self, J. A. (1974). Student Models in Computer-Aided Instruction. International Journal of Man-Machine Studies, 6, 261-276.

Tsiriga, V., & Virvou, M. (2003). Modelling the Student to Individualise Tutoring in a Web-Based ICALL. International Journal of Continuing Engineering Education and Life-Long Learning, 13(3-4), 350-365.

Wikipedia contributors. (2024, December 20). Intelligent tutoring system. In Wikipedia, The Free Encyclopedia.

Wolfram, S. (2023, February 14). What is ChatGPT doing … and why does it work? Stephen Wolfram Writings. https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work

Language Learning and AI: 7 lessons from 70 years (#4)

4. Appropriate error correction and contingent feedback

Rather than focusing on engaging the learner in communicative interaction, learning with ICALL systems was often based on the assumption that corrective feedback on learner language is of great importance. ICALL research, particularly in the 1990s and early 2000s, focused on corrective feedback and relied on the three steps of traditional error analysis: effective recognition, description, and explanation (Heift & Schulze, 2007, chapter 3: Error analysis and description). Error analysis (Corder, 1974) was the main approach in second-language acquisition research in the 1960s and 70s. Its contributions to language education, applied linguistics, and ICALL are manifold and have influenced language teaching to this day. Although error correction is still a part of common teaching practice today, applied linguistics research has shifted the focus away from deficits in the learner’s language to operationalizing and encouraging their abilities (compare the National Council of State Supervisors for Languages (NCSSFL) and the American Council on the Teaching of Foreign Languages (ACTFL) Can-Do Statements, which were introduced in 2013). This has changed the perspective on corrective feedback in language education. More nuances were introduced, and ICALL also began to look at providing help and guidance to learners through text augmentation by, for example, enriching a reading text with linked online glossaries and information on morphological paradigms (e.g., Amaral & Meurers, 2011; Wood, 2011). Text augmentation appears to be as yet underexplored in the research on GenAI and language education since late 2022.

We are on to lesson 4. Part 0 gave a historical introduction. Lesson 1 focused on the necessary exposure to authentic language and whether this can be done with GenAI. Lesson 2 looked at communication in context, which is central in language learning. With lesson 3, we turned to the role of interaction in language learning with GenAI.

How does ICALL with its symbolic NLP compare to GenAI with its LLMs and ANNs when it comes to language feedback and guidance? The texts GenAI produces are mostly well-formed, especially if the text’s language is English or one of the other languages in which many texts on the internet are written (Schulze, 2025). So, how suitable would a GenAI be for appropriate error correction and contingent feedback? In an ICALL system, a fragment of the grammar of the language being learnt would be described with rules and items, using a formal grammar, in the expert model and parser. This computational grammar could ‘understand’ the linguistically well-formed words, phrases, and sentences that were covered by the rules of the expert model. To be able to parse student errors, the expert model needed to be adapted. Errors were captured either in an error grammar, that is, in buggy rules that were parallel to the rules covering error-free linguistic units, or in relaxed constraints (Dini & Malnati, 1993). An example of a buggy rule and its error-free counterpart in German is (in pseudo-code for legibility):

default rule(subject-verb agreement) := 
        if subject(NUMBER) = verb(NUMBER)
        and
        subject(PERSON) = verb(PERSON)
        then 
        parse successfully and move on
        else 
        buggy rule(subject-verb agreement)

buggy rule(subject-verb agreement)  := 
       if subject(NUMBER_S) <> verb(NUMBER_V) 
       then 
       give feedback("The subject is in", [subject(NUMBER_S)],
          ". You need to choose a verb ending that indicates",
          [subject(NUMBER_S)], ", too. The verb in your sentence is in",
          [verb(NUMBER_V)]) 
       else next
       then 
       if subject(PERSON_S)  <> verb(PERSON_V)
       then 
       give feedback("The subject is in", [subject(PERSON_S)],
          ". You need to choose a verb ending that indicates",
          [subject(PERSON_S)], ", too. The verb in your sentence is in",
          [verb(PERSON_V)])
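
For readers who prefer something they can run, the same idea can be approximated in Python. This is a loose sketch and not the formalism an actual ICALL parser would have used: the feature dictionaries, the function names `default_rule`, `buggy_rule`, and `parse_agreement`, and the example "ich gehst" are assumptions made for illustration only.

```python
def default_rule(subject: dict, verb: dict) -> bool:
    """Error-free rule: subject and verb agree in number and person."""
    return (subject["number"] == verb["number"]
            and subject["person"] == verb["person"])


def buggy_rule(subject: dict, verb: dict) -> list:
    """Anticipated-error rule: one feedback message per mismatching feature."""
    feedback = []
    for feature in ("number", "person"):
        if subject[feature] != verb[feature]:
            feedback.append(
                f"The subject is in {subject[feature]}. You need to choose "
                f"a verb ending that indicates {subject[feature]}, too. "
                f"The verb in your sentence is in {verb[feature]}."
            )
    return feedback


def parse_agreement(subject: dict, verb: dict) -> list:
    """Try the default rule first; fall back to the buggy rule on failure."""
    if default_rule(subject, verb):
        return []  # parsed successfully, nothing to report
    return buggy_rule(subject, verb)


# "ich gehst": first-person singular subject, second-person singular verb.
ich = {"number": "singular", "person": "first person"}
gehst = {"number": "singular", "person": "second person"}
for message in parse_agreement(ich, gehst):
    print(message)
```

The sketch only covers the two mismatches that were written into it; any other error type would need a further hand-written rule.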

Buggy rules required a high level of error anticipation because, to cover an error, a particular buggy rule needed to be written. Since buggy rules are deterministic, they were robust in the feedback they provided whenever they were sufficient to parse the student input. Relaxed constraints reached a slightly wider coverage and required less error anticipation, because the constraint that, for example, the subject and finite verb of a German sentence need to agree in number and person is relaxed. This means that the sentence is parsed successfully whether or not subject and verb agree, with one constraint fewer:

relaxed rule(subject-verb agreement) := 
         subject(NUMBER_S) and verb(NUMBER_V) 
         and subject(PERSON_S) and verb(PERSON_V)
         if NUMBER_S = NUMBER_V
         then next
         else 
         give feedback("The subject is in", [subject(NUMBER_S)],
             ". You need to choose a verb ending that indicates",
             [subject(NUMBER_S)], ", too. The verb in your sentence is in",
             [verb(NUMBER_V)])
         then 
         if PERSON_S = PERSON_V
         then next
         else 
         give feedback("The subject is in", [subject(PERSON_S)],
             ". You need to choose a verb ending that indicates",
             [subject(PERSON_S)], ", too. The verb in your sentence is in",
             [verb(PERSON_V)])
         then parse successfully and move on
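
The difference between a strict and a relaxed constraint can also be shown in a small runnable Python sketch. It is an illustration under assumptions rather than a reconstruction of any published system: the `Constraint` type, its `relaxed` flag, the function `parse_with_constraints`, and the example "wir gehst" are all hypothetical.

```python
from typing import NamedTuple


class Constraint(NamedTuple):
    feature: str   # feature that subject and verb are expected to share
    relaxed: bool  # True: a violation is reported but does not block the parse


def parse_with_constraints(subject: dict, verb: dict, constraints: list) -> tuple:
    """Return (parsed, feedback) for a subject-verb pair."""
    parsed, feedback = True, []
    for constraint in constraints:
        s_value = subject[constraint.feature]
        v_value = verb[constraint.feature]
        if s_value == v_value:
            continue  # constraint satisfied
        if constraint.relaxed:
            feedback.append(
                f"The subject is in {s_value}. You need to choose a verb "
                f"ending that indicates {s_value}, too. The verb in your "
                f"sentence is in {v_value}."
            )
        else:
            parsed = False  # a strict constraint rejects the input
    return parsed, feedback


AGREEMENT = [Constraint("number", relaxed=True), Constraint("person", relaxed=True)]

# "wir gehst": first-person plural subject, second-person singular verb.
wir = {"number": "plural", "person": "first person"}
gehst = {"number": "singular", "person": "second person"}
parsed, feedback = parse_with_constraints(wir, gehst, AGREEMENT)
print(parsed)
for message in feedback:
    print(message)
```

With both constraints relaxed, the ill-formed input still parses and yields two feedback messages; setting `relaxed=False` would reject the sentence without any diagnosis.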

This pseudo-code illustration shows how labor-intensive the coding of symbolic NLP for ICALL, with its focus on error correction and feedback, was. The gaps in the computational lexica and grammars, together with the additional parsing challenges that arose from having to cover the errors learners make, meant that even the few ICALL systems that were used by students had limited coverage (e.g., Heift, 2010; Nagata, 2002).

Coverage is not a problem for GenAI, as we saw above. However, LLMs and multidimensional ANNs were not intended to provide corrective feedback to language learners. The error correction of GenAI can best be illustrated with the automatic correction of spelling errors. The prompt “Tell me please what the capitel of germany is.” with its two spelling errors yields the following result: “The capital of Germany is **Berlin** …” (Microsoft Copilot, 2025, January 17, my emphasis). For languages with LLMs, the automatic error correction in the natural language understanding is accurate and comprehensive, as can be seen from the answer in the example. However, the feedback on such errors, which is only given when specifically requested, is all too often flawed in parts or incomplete. Stated in brief, GenAIs are good at error correction but limited in providing appropriate corrective feedback. Many teachers and language learners have at least anecdotal evidence that the metalinguistic explanations are not suitable for language learners and that errors are often underreported or over-flagged. This is understandable if one considers that GenAIs work with probabilistic patterns in the LLM for their error correction, diagnosis, and (metalinguistic) feedback. This often works for the correction but is shaky at best for diagnosis and feedback. The computational linguist and ICALL researcher Detmar Meurers (2024) argued in this context that assuming a GenAI is a suitable language teacher is worse than asking a speaker of that language to start teaching systematic language classes. His argument was also based on the fact that a GenAI has no ‘knowledge’ of the prior learning history, language abilities and beliefs, and the general profile of the learner.

To be continued …

References

Amaral, L., & Meurers, W. D. (2011). On using Intelligent Computer-Assisted Language Learning in real-life foreign language teaching and learning. ReCALL, 23(1), 4-24.

Corder, P. (1974). Error Analysis. In J. P. B. Allen & P. Corder (Eds.), The Edinburgh Course in Applied Linguistics. Volume 3 – Techniques in Applied Linguistics (pp. 122-131). Oxford University Press.

Dini, L., & Malnati, G. (1993). Weak Constraints and Preference Rules. In P. Bennett & P. Paggio (Eds.), Preference in Eurotra (pp. 75-90). Commission of the European Communities.

Heift, T. (2010). Developing an Intelligent Tutor. CALICO Journal, 27(3), 443-459.

Heift, T., & Schulze, M. (2007). Errors and Intelligence in CALL. Parsers and Pedagogues. Routledge.

Meurers, D. (2024). #3×07 – Intelligente Tutorielle Systeme (mit Prof. Dr. Detmar Meurers). In Auftrag:Aufbruch. Der Podcast des Forum Bildung Digitalisierung. https://auftrag-aufbruch.podigee.io/30-intelligente-tutorielle-systeme-mit-detmar-meurers

Microsoft Copilot. (2025, January 17). Tell me please what the capitel of germany is. Microsoft Copilot.

Nagata, N. (2002). BANZAI: An Application of Natural Language Processing to Web-Based Language Learning. CALICO Journal, 19(3), 583-599.

Schulze, M. (2025). The impact of artificial intelligence (AI) on CALL pedagogies. In Lee McCallum & Dara Tafazoli (eds) The Palgrave Encyclopedia of Computer-Assisted Language Learning. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-031-51447-0_7-1.

Wood, P. (2011). Computer assisted reading in German as a foreign language. Developing and testing an NLP-based application. CALICO Journal, 28(3), 662-676.

Language Learning and AI: 7 lessons from 70 years (#3)

3. Varied interaction in language-learning tasks

The human-machine conversation often works because we are used to adhering, even if the machine cannot and does not, to Grice’s four maxims of conversation (Grice, 1975): quantity (be informative), quality (be truthful), relation (be relevant), and manner (be clear). Interaction in dialog works because readers look at the mathematically compiled output of the GenAI and assume that it is informative, truthful, and relevant. Because it generates linguistically accurate and plausible text, the GenAI also appears to be clear. Communicative interaction proceeds successfully as long as the human reader does not detect that the machine output is not truthful or factually accurate because of, for example, hallucinations (Nananukul & Kejriwal, 2024) or errors, or that it is not relevant because an ambiguity was misinterpreted (e.g., when asked about bats, giving information about the mammal rather than the intended piece of sports equipment).

And here is lesson #3 of a short series. Part 0 gives a historical introduction. Lesson #1 focuses on the necessary exposure to authentic language and whether this can be done with GenAI. Lesson #2 looked at communication in context, which is central in language learning. And now we are turning to the role of interaction in language learning with GenAI.

Despite these hurdles, GenAIs have become interesting verbal interactants in language education. ICALL systems, by contrast, provided only limited interaction, mainly because of their limited language coverage (see above). Systems with or without AI worked with branching trees and canned text, for example in Quandary, a program of the Hot Potatoes suite (Arneil & Holmes, n.d.), which does not have NLP built in. Other systems were more like chatbots whose conversation was limited to one topic or topic area (e.g., Underwood, 1982). Such CALL chatbots were inspired by Weizenbaum’s Eliza (for his reflection see Weizenbaum (1976)) and SHRDLU (Winograd, 1971) and often relied on regular expressions (Computer Science Field Guide, n.d.) and keyword searches. More sophisticated NLP was employed in the interactive games Spion (Sanders & Sanders, 1995) and Kommissar (DeSmedt, 1995). Measured against these early examples of a learner interacting directly with a machine that has some AI capabilities, especially a level of NLP, GenAI has opened a door to many more complex and comprehensive verbal interactions and role plays in a variety of languages.
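
To give an impression of how little machinery a chatbot built on regular expressions and keyword searches needs, here is a minimal Python sketch in the spirit of Eliza. It does not reproduce Eliza or any of the CALL systems cited above: the patterns, the canned responses, and the function name `respond` are invented for illustration.

```python
import re

# Hypothetical pattern/response pairs, tried in order; the last is a catch-all.
RULES = [
    # Echo part of the learner's own words back as a question.
    (re.compile(r"\bI am ([^.!?]+)", re.IGNORECASE), "Why are you {0}?"),
    # Keyword search: any mention of a hotel or room triggers canned text.
    (re.compile(r"\b(hotel|room)\b", re.IGNORECASE),
     "How many nights would you like to stay?"),
    (re.compile(r".*"), "Please tell me more."),
]


def respond(utterance: str) -> str:
    """Return the canned response of the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Unused captured groups are simply ignored by str.format.
            return template.format(*match.groups())
    return "Please tell me more."


print(respond("I am tired."))
print(respond("I need a room for tonight"))
print(respond("Hello"))
```

Anything outside the anticipated patterns falls through to the catch-all response, which is the limited coverage of such systems in miniature.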

Of course, language learning tasks (see Willis (1996) for an early introduction to the now commonly applied Task-based Language Teaching) are not only rooted in conversations and role plays. GenAI can also generate model answers for different task components or be employed for brainstorming first ideas in the pre-task steps, for example. This was impossible with the ICALL systems based on symbolic NLP and (limited) expert systems. A discussion of the affordances and challenges of this powerful generation of (partial) task outcomes and components by either the student or the teacher is beyond the confines of this chapter, but it is an area within the application of GenAI in language education that is in urgent need of discussion. This agentive collaboration in dialog, with its possible scaffolding and student guidance, can support learning, but it can also hinder or even prevent it.

To be continued …

References

Arneil, S., & Holmes, M. (n.d.). Quandary. Retrieved January 17 from https://hcmc.uvic.ca/project/quandary/

Computer Science Field Guide. (n.d.). Regular expressions – Formal Languages. Retrieved January 27 from https://www.csfieldguide.org.nz/en/chapters/formal-languages/regular-expressions/

DeSmedt, W. H. (1995). Herr Kommissar: An ICALL Conversation Simulator for Intermediate German. In V. M. Holland, J. D. Kaplan, & M. R. Sams (Eds.), Intelligent Language Tutors: Theory Shaping Technology (pp. 153-174). Lawrence Erlbaum Associates.

Grice, H. P. (1975). Logic and Conversation. In D. Cole & J. Morgan (Eds.), Syntax and Semantics: Speech Acts (pp. 41-58). Academic Press.

Nananukul, N., & Kejriwal, M. (2024). HALO: an ontology for representing and categorizing hallucinations in large language models. Proc. SPIE 13058, Disruptive Technologies in Information Sciences VIII, 130580B (6 June 2024),

Sanders, R. H., & Sanders, A. F. (1995). History of an AI Spy Game: Spion. In V. M. Holland, J. D. Kaplan, & M. R. Sams (Eds.), Intelligent Language Tutors: Theory Shaping Technology (pp. 141-151). Lawrence Erlbaum Associates.

Underwood, J. H. (1982). Simulated Conversation as CAI Strategy. Foreign Language Annals, 15, 209-212.

Weizenbaum, J. (1976). Computer Power and Human Reason: From Judgment To Calculation. W. H. Freeman.

Willis, J. R. (1996). A framework for task-based learning. Longman.

Winograd, T. (1971). Procedures as a representation for data in a computer program for understanding natural language. https://hci.stanford.edu/winograd/shrdlu/AITR-235.pdf