Language Learning and AI: 7 lessons from 70 years (#4)

4. Appropriate error correction and contingent feedback

Rather than focusing on engaging the learner in communicative interaction, learning with ICALL systems was often based on the assumption that corrective feedback on learner language is of great importance. ICALL research, particularly in the 1990s and early 2000s, focused on corrective feedback and relied on the three steps of traditional error analysis: recognition, description, and explanation (Heift & Schulze, 2007, chapter 3: Error analysis and description). Error analysis (Corder, 1974) was the main approach in second-language acquisition research in the 1960s and 70s. Its contributions to language education, applied linguistics, and ICALL are manifold and continue to shape language teaching to this day. Although error correction is still part of common teaching practice today, applied linguistics research has shifted the focus away from deficits in the learner’s language to operationalizing and encouraging their abilities (compare the Can-Do Statements introduced in 2013 by the National Council of State Supervisors for Languages (NCSSFL) and the American Council on the Teaching of Foreign Languages (ACTFL)). This has changed the perspective on corrective feedback in language education. More nuanced approaches were introduced, and ICALL, too, began to look at providing help and guidance to learners through text augmentation, for example, by enriching a reading text with linked online glossaries and information on morphological paradigms (e.g., Amaral & Meurers, 2011; Wood, 2011). Text augmentation appears to remain underexplored in GenAI and language education research since late 2022.

My inspiration for this title came from the book  
Snyder, T. (2017). On tyranny: Twenty lessons from the twentieth century. Tim Duggan Books.

I am sharing these early drafts of a book chapter I published in:
Yijen Wang, Antonie Alm, & Gilbert Dizon (Eds.) (2025). Insights into AI and language teaching and learning. Castledown Publishers. https://doi.org/10.29140/9781763711600-02
We are onto lesson 4. Part 0 gives a historical introduction. Lesson 1 focused on the necessary exposure to authentic language and whether this can be provided with GenAI. Lesson 2 looked at communication in context, which is central to language learning. Lesson 3 turned to the role of interaction in language learning with GenAI.

How does ICALL, with its symbolic NLP, compare to GenAI, with its LLMs and ANNs, when it comes to language feedback and guidance? The texts GenAI produces are mostly well-formed, especially if the text’s language is English or one of the other languages in which many texts on the internet are written (Schulze, 2025). So, how suitable would a GenAI be for appropriate error correction and contingent feedback? In an ICALL system, a fragment of the grammar of the learnt language would be described with rules and items, using a formal grammar, in the expert model and parser. This computational grammar could ‘understand’ the linguistically well-formed words, phrases, and sentences that were covered by the rules of the expert model. To be able to parse student errors, the expert model needed to be adapted. Errors were captured in an error grammar, that is, buggy rules running parallel to the rules that covered error-free linguistic units, or in relaxed constraints (Dini & Malnati, 1993). An example of a buggy rule and its error-free counterpart for German subject-verb agreement is (in pseudo-code for legibility):

default rule(subject-verb agreement) := 
        if subject(NUMBER) = verb(NUMBER)
        and
        subject(PERSON) = verb(PERSON)
        then 
        parse successfully and move on
        else 
        buggy rule(subject-verb agreement)

buggy rule(subject-verb agreement) :=
       if subject(NUMBER_S) <> verb(NUMBER_V)
       then
       give feedback("The subject is in ", [subject(NUMBER_S)],
           ". You need to choose a verb ending that indicates ",
           [subject(NUMBER_S)], ", too. The verb in your ",
           "sentence is in ", [verb(NUMBER_V)])
       else next
       then
       if subject(PERSON_S) <> verb(PERSON_V)
       then
       give feedback("The subject is in ", [subject(PERSON_S)],
           ". You need to choose a verb ending that indicates ",
           [subject(PERSON_S)], ", too. The verb in your sentence ",
           "is in ", [verb(PERSON_V)])

Buggy rules required a high level of error anticipation, because to cover an error, a particular buggy rule needed to be written. Since buggy rules are deterministic, they were robust in the feedback they provided whenever they sufficed to parse the student input. Relaxed constraints achieved slightly wider coverage and required less error anticipation, because a constraint, for example, that the subject and finite verb of a German sentence need to agree in number and person, was relaxed. This means that, whether or not subject and verb agree, the sentence is parsed successfully with one constraint fewer:

relaxed rule(subject-verb agreement) :=
         subject(NUMBER_S) and verb(NUMBER_V)
         and subject(PERSON_S) and verb(PERSON_V)
         if NUMBER_S = NUMBER_V
         then next
         else
         give feedback("The subject is in ",
             [subject(NUMBER_S)], ". You need to choose ",
             "a verb ending that indicates ",
             [subject(NUMBER_S)], ", too. The verb in your ",
             "sentence is in ", [verb(NUMBER_V)])
         then
         if PERSON_S = PERSON_V
         then next
         else
         give feedback("The subject is in ",
             [subject(PERSON_S)],
             ". You need to choose a verb ending that indicates ",
             [subject(PERSON_S)], ", too. The verb in ",
             "your sentence is in ", [verb(PERSON_V)])
         then parse successfully and move on

This pseudo-code illustration shows how labor-intensive the coding of symbolic NLP for ICALL, with its focus on error correction and feedback, was. The lack of coverage of the computational lexica and grammars, and the additional parsing challenges introduced by extending parser coverage to the errors learners make, meant that even the few ICALL systems that were used by students had limited coverage (e.g., Heift, 2010; Nagata, 2002).
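To make the rule-based approach concrete, the agreement check sketched in the pseudo-code above can be condensed into a short runnable illustration. This is my own minimal sketch in Python, not code from any actual ICALL system; the `Constituent` class and its hand-assigned `number` and `person` features are assumptions for the example, standing in for the feature structures a real parser would build.

```python
from dataclasses import dataclass

@dataclass
class Constituent:
    """A parsed constituent with hand-assigned agreement features."""
    text: str
    number: str  # "singular" or "plural"
    person: str  # "1st", "2nd", or "3rd"

def check_agreement(subject: Constituent, verb: Constituent) -> list:
    """Relaxed-constraint style check: the sentence always 'parses';
    violated agreement constraints yield feedback messages instead
    of a parse failure."""
    feedback = []
    if subject.number != verb.number:
        feedback.append(
            f"The subject is in {subject.number}. You need to choose "
            f"a verb ending that indicates {subject.number}, too. "
            f"The verb in your sentence is in {verb.number}."
        )
    if subject.person != verb.person:
        feedback.append(
            f"The subject is in {subject.person} person. You need to "
            f"choose a verb ending that indicates {subject.person} "
            f"person, too. The verb in your sentence is in "
            f"{verb.person} person."
        )
    return feedback  # empty list: constraints satisfied, move on

# German "du gehst" (correct) vs. "du geht" (person agreement error)
ok = check_agreement(Constituent("du", "singular", "2nd"),
                     Constituent("gehst", "singular", "2nd"))
err = check_agreement(Constituent("du", "singular", "2nd"),
                      Constituent("geht", "singular", "3rd"))
```

Even this toy version shows the scaling problem: every feature (number, person, gender, case, …) and every lexical item must be encoded by hand before the checker can say anything at all about a learner sentence.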

Coverage is not a problem for GenAI, as we saw above. However, LLMs and multidimensional ANNs were not intended to provide corrective feedback to language learners. The error correction of GenAI can be illustrated best with the automatic correction of spelling errors. The prompt “Tell me please what the capitel of germany is.” with its two spelling errors yields the following result: “The capital of Germany is **Berlin** …” (Microsoft Copilot, 2025, January 17, my emphasis). For languages that are well represented in LLMs, the automatic error correction in natural language understanding is accurate and comprehensive, as can be seen from the answer in the example. However, feedback on such errors, which is only given when specifically requested, is all too often flawed in parts or incomplete. In brief, GenAIs are good at error correction but limited in providing appropriate corrective feedback. Many teachers and language learners have at least anecdotal evidence that the metalinguistic explanations are not suitable for language learners and that errors are often underreported or over-flagged. This is understandable if one considers that GenAIs work with probabilistic patterns in the LLM for their error correction, diagnosis, and (metalinguistic) feedback. This often works for the correction, but is shaky at best for diagnosis and feedback. The computational linguist and ICALL researcher Detmar Meurers (2024) argued in this context that assuming a GenAI is a suitable language teacher is worse than asking a speaker of that language to start teaching systematic language classes. His argument was also based on the fact that a GenAI has no ‘knowledge’ of the learner’s prior learning history, language abilities and beliefs, or general profile.

To be continued …

References

Amaral, L., & Meurers, W. D. (2011). On using Intelligent Computer-Assisted Language Learning in real-life foreign language teaching and learning. ReCALL, 23(1), 4-24.

Corder, S. P. (1974). Error Analysis. In J. P. B. Allen & S. P. Corder (Eds.), The Edinburgh Course in Applied Linguistics. Volume 3 – Techniques in Applied Linguistics (pp. 122-131). Oxford University Press.

Dini, L., & Malnati, G. (1993). Weak Constraints and Preference Rules. In P. Bennett & P. Paggio (Eds.), Preference in Eurotra (pp. 75-90). Commission of the European Communities.

Heift, T. (2010). Developing an Intelligent Tutor. CALICO Journal, 27(3), 443-459.

Heift, T., & Schulze, M. (2007). Errors and Intelligence in CALL. Parsers and Pedagogues. Routledge.

Meurers, D. (2024). #3×07 – Intelligente Tutorielle Systeme (mit Prof. Dr. Detmar Meurers) In Auftrag:Aufbruch. Der Podcast des Forum Bildung Digitalisierung. https://auftrag-aufbruch.podigee.io/30-intelligente-tutorielle-systeme-mit-detmar-meurers

Microsoft Copilot. (2025, January 17). Tell me please what the capitel of germany is. Microsoft Copilot.

Nagata, N. (2002). BANZAI: An Application of Natural Language Processing to Web-Based Language Learning. CALICO Journal, 19(3), 583-599.

Schulze, M. (2025). The impact of artificial intelligence (AI) on CALL pedagogies. In L. McCallum & D. Tafazoli (Eds.), The Palgrave Encyclopedia of Computer-Assisted Language Learning. Palgrave Macmillan. https://doi.org/10.1007/978-3-031-51447-0_7-1

Wood, P. (2011). Computer assisted reading in German as a foreign language. Developing and testing an NLP-based application. CALICO Journal, 28(3), 662-676.