Adding word sense awareness to computer-assisted language learning methods: a tailor-made word sense disambiguation method for Spanish as a foreign language
Submitted: 2023-11-28 | Accepted: 2025-03-10 | Published: 2025-07-25
Copyright (c) 2025 Revista de Lingüística y Lenguas Aplicadas

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Keywords:
natural language processing, word sense disambiguation, Computer-Assisted Language Learning, Spanish as a foreign language
Abstract:
Word sense awareness has not yet been implemented in most Computer-Assisted Language Learning (CALL) environments, nor in computer-readable pedagogical resources such as graded word lists. The current study aims to help fill this gap by presenting a word sense disambiguation (WSD) method that relies on a tailor-made sense inventory, exploits readily available large language models, and requires only a limited number of prototypical example sentences as manually curated data. The methodology is evaluated on a set of 74 lexically ambiguous items, with a Spanish for specific purposes course as the target setting. With weighted F1 scores of up to 0.8995, the WSD method shows potential for application in real-life CALL scenarios.
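For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a weighted F1 score is conventionally computed: the F1 score of each sense is weighted by the share of gold instances carrying that sense. The abstract does not specify the exact computation used in the study, so this illustrates the standard definition only; the function name `weighted_f1` and the toy sense labels are hypothetical.

```python
from collections import Counter


def weighted_f1(gold, predicted):
    """Support-weighted average of per-sense F1 scores (standard definition)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one instance is required")

    support = Counter(gold)  # number of gold instances per sense
    total = len(gold)
    score = 0.0
    for sense, n_gold in support.items():
        # True positives: instances where gold and prediction both equal this sense.
        tp = sum(1 for g, p in zip(gold, predicted) if g == sense and p == sense)
        n_pred = sum(1 for p in predicted if p == sense)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each sense contributes in proportion to its share of gold instances.
        score += f1 * n_gold / total
    return score


# Hypothetical toy data: two senses of an ambiguous item (not from the study).
gold = ["bank", "bank", "bank", "bench"]
predicted = ["bank", "bank", "bench", "bench"]
print(round(weighted_f1(gold, predicted), 4))
```

Because of the weighting, frequent senses dominate the score, which suits sense distributions that are typically skewed towards one dominant meaning.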