Assessment and GenAI

Often assessment focuses on the product and not the process; the submitted essay gets graded and not the process of writing it. Students can generate these material results of an assessed activity – a short essay, presentation, poster, or audio or video recording – more quickly and with a higher degree of linguistic accuracy than they can create them without the easily available GenAI tools (Fawzi, 2023). The same is even more likely for the translation of sentences in quizzes, for example. Entering into a whole dialog (as described above), however, puts more emphasis on the writing process.

Given this high level of form accuracy in generated and translated texts, it is also difficult to make the linguistic accuracy of the product, the text, the sole or even the main criterion for assessment, as is often done by language teachers. A lower level of accuracy could simply mean that the student did not use GenAI or other (inappropriate) tools (Bowen & Watson, 2024, pp. 148ff.). In assessment, a changed focus and different strategies are necessary already at the stage of course or lesson planning. When assessing the proficiency development of students, test tasks need to be designed such that the process – of writing, for example – can be assessed rather than the product. Multiple sketches and drafts need to be submitted and assessed to provide different windows on the (writing) process. Students can also be enabled and encouraged to produce written or spoken language spontaneously to ensure equity in assessment, so that they learn and retain the Kulturtechniken (cultural techniques) necessary for producing meaningful and creative texts.

This blog post is an excerpt from the manuscript for Schulze, Mathias (2025). The impact of artificial intelligence (AI) on CALL pedagogies. In Lee McCallum & Dara Tafazoli (eds) The Palgrave Encyclopedia of Computer-Assisted Language Learning. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-031-51447-0_7-1. 
In 2024, I wrote this encyclopedia entry as my first attempt at gaining a better understanding of what was going on after GenAI burst into Language Education.

When assessing students’ written or spoken language proficiency, the three components of proficiency – complexity, accuracy, fluency – need to be assessed in a balanced way. For example, an increased textual complexity – a diverse range of vocabulary and grammatical constructions, including some or many sophisticated items, such as less frequently used words and constructions and longer sentences – is often an indicator of the student’s successful second-language development. In this context, appropriate task or activity design – instructing the learner to use the newly learned vocabulary and grammatical constructions – will make the task for the student clearer and will make it less likely that inappropriate tools can or will be used.

In general, more room should be given to approaches in assessment that do not focus on one final product but on accompanying the learning process. This way GenAI tools can be used during an appropriate phase of the activity, e.g., brainstorming ideas or looking up alternative constructions and synonyms. Here approaches such as Dynamic Assessment (Lantolf, 2009) and Integrated Performance Assessment (Adair-Hauck et al., 2006) help reduce the negative impact that GenAI tools can have on the viability and fairness of assessment procedures. This is particularly important in CALL, because the students are working in a digital environment already and access to GenAI tools is easy and thus tempting for many. The focus on the (learning) quality of the process rather than the product has the important advantage that it encourages creativity and risk-taking because the incremental assessment procedure has productive, non-threatening feedback and repair loops as its integral parts. This way creativity in writing can be rewarded and spelling, lexical, and grammatical mistakes do not have to be penalized heavily; some of the pressures of assessment are thus mitigated if not eliminated.

References

Adair-Hauck, B., Glisan, E. W., Koda, K., Swender, E. B., & Sandrock, P. (2006). The integrated performance assessment (IPA): Connecting assessment to instruction and learning. Foreign Language Annals, 39, 359–382.

Bowen, J. A., & Watson, C. E. (2024). Teaching with AI: A practical guide to a new era of human learning. Johns Hopkins University Press.

Fawzi, H. (2023). A Bleeding Edge or a Cutting Edge? A Systematic Review of ChatGPT and English as a Second and/or Foreign Language Learners’ Writing Abilities. In Conference Proceedings. WorldCALL 2023. CALL in Critical Times. (Chiang Mai, Thailand) (pp. 7-15). The International Academic Forum.

Lantolf, J. P. (2009). Dynamic assessment: The dialectic integration of instruction and assessment. Language Teaching, 42(3), 355–368. https://doi.org/10.1017/S0261444808005569

Timothy Snyder on the Holocaust as History and Warning

When I served as the director of the Waterloo Centre for German Studies, I started the Jacob and Wilhelm Grimm Lecture, the annual flagship event of the Centre. For the last lecture that I organized – with a lot of help from many colleagues – we were fortunate to be able to host the historian Timothy Snyder in early 2017. He had just published his Facebook post “20 lessons from the 20th century” but not yet his book On Tyranny, which was derived from that Facebook post.

I have dug up a couple of older videos on the internet. Some of them have to do with my current thinking about AI; this one and a few others are related to my research but not to AI and language and learning. The connection, as almost always, is language … broadly conceived.

The inspiration for the title of my book chapter "ICALL and AI: Seven lessons from seventy years", which I published in Yijen Wang, Antonie Alm, & Gilbert Dizon (Eds.) (2025), Insights into AI and language teaching and learning (Castledown Publishers), came from Snyder, T. (2017). On tyranny: Twenty lessons from the twentieth century. Tim Duggan Books.

Learner motivation and GenAI

Language teachers know that learners need to obtain and then to maintain a level of motivation in their second-language learning and language use. Motivation has become more important in the context of GenAI, as the example of recent advances in machine translation shows. Commonly available machine translation tools are now similar to GenAI-based chatbots, because they are also based on large language models. The generation of a text in another language or the translation of a text from language A to language B with a GenAI tool is literally just a prompt and/or a click away. Of course, fast translation tools can be very helpful in many situations. Yet, the habitual and exclusive use of machine translation has the potential to reduce communication to the exchange of forms, as it certainly will not lead to a negotiation of meaning. The simple lookup of an answer to a second-language teaching prompt or the generation of a text to produce a learning task result is at least a missed practice opportunity and is much less likely to lead to learning.

Language learning is a long and, for many, difficult process. With the advent of GenAI, teachers need to be able to explain – even more so than before – why it is worthwhile to learn a language, while tools enable computer users to mimic the command of another language. In language learning, students can have the experience that language is more than an intricate assembly of linguistic forms that follow some system of accuracy rules. If students are able to look underneath and beyond, and understand, these language forms – this competence Kramsch (2006) calls symbolic competence – then they have obtained entry to another speech community. Computational tools, on the other hand, can only provide access to sets of linguistic forms or character strings, also in other languages (see the Chinese Room Argument (Searle, 1980)). Teachers need to be able to motivate their students such that they choose the longer yet more fruitful path of language practice and learning rather than that of the rapid generation of plausible texts.

This blog post is an excerpt from the manuscript for Schulze, Mathias (2025). The impact of artificial intelligence (AI) on CALL pedagogies. In Lee McCallum & Dara Tafazoli (eds) The Palgrave Encyclopedia of Computer-Assisted Language Learning. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-031-51447-0_7-1. 
In 2024, I wrote this encyclopedia entry as my first attempt at gaining a better understanding of what was going on after GenAI burst into Language Education.

References

Kramsch, C. (2006). From communicative competence to symbolic competence. Modern Language Journal, 90(2), 249–252.

Searle, J. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3(3), 417–457.