URPP Language and Space

Ximena Gutierrez-Vasques, Dr.

  • Postdoctoral researcher
  • TextGroup

I joined the URPP Language and Space in September 2019. My research interests cover Natural Language Processing, quantitative linguistics, under-resourced languages, and multilingual NLP.

I am currently working on measuring linguistic complexity at the morphological level, using text corpora and information-theoretic methods. I collaborate on the project "Non-randomness in Morphological Diversity: A Computational Approach Based on Multilingual Corpora".

 

*Updated email address: ximena.gutierrezvasques@uzh.ch

Publications

2022

Tanja Samardzic, Ximena Gutierrez-Vasques, Rob van der Goot, Max Müller-Eberstein, Olga Pelloni and Barbara Plank. On Language Spaces, Scales and Cross-Lingual Transfer of UD Parsers. CoNLL, 2022

Kann, K., Ebrahimi, A., Mager, M., Oncevay, A., Ortega, J. E., Rios, A., Fan, A., Chiruzzo, L., Ramos, R., Meza Ruiz, I. V., Mager, E., Chaudhary, V., Neubig, G., Palmer, A., & Vu, N. T. (2022). AmericasNLI: Machine translation and natural language inference systems for Indigenous languages of the Americas. Frontiers in Artificial Intelligence, 5. https://doi.org/10.3389/frai.2022.99566

Bentz, Christian, Gutierrez-Vasques, Ximena, Sozinova, Olga and Samardžić, Tanja. "Complexity trade-offs and equi-complexity in natural languages: a meta-analysis" Linguistics Vanguard, 2022. https://doi.org/10.1515/lingvan-2021-0054

Adran Israel Lerma Mayer, Ximena Gutierrez-Vasques, Ernesto Priani Saiso, Hannu Salmi. Underlying Sentiments in 1867: A Study of News Flows on the Execution of Emperor Maximilian I of Mexico in Digitized Newspaper Corpora. Digital Humanities Quarterly (DHQ)

Moran, S., Bentz, C., Gutierrez-Vasques, X., Sozinova, O., & Samardzic, T. TeDDi Sample: Text Data Diversity Sample for Language Comparison and Multilingual NLP. LREC 2022

Book chapter: “Relación tipo-token para contrastar la complejidad morfológica del español-náhuatl”. Ámbitos morfológicos: Descripciones y métodos. UNAM, May 2022

2021

Gutierrez-Vasques, X., Bentz, C., Sozinova, O., & Samardzic, T. (2021). From characters to words: the turning point of BPE merges. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Ruzsics, T., Sozinova, O., Gutierrez-Vasques, X., & Samardzic, T. (2021). Interpretability for Morphological Inflection: from Character-level Predictions to Subword-level Rules. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Mager, M., Oncevay, A., Ebrahimi, A., Ortega, J., Gonzales, A. R., Fan, A., ... & Kann, K. (2021). Findings of the AmericasNLP 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, NAACL.

 

Martínez, D. B., Mijangos, V., & Gutierrez-Vasques, X. (2021). Automatic Interlinear Glossing for Otomi language. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, NAACL

2020

Gutierrez-Vasques, X., & Mijangos, V. (2020). Productivity and Predictability for Measuring Morphological Complexity. Entropy, 22(1), 48.

2019

Gutierrez-Vasques, X., Medina-Urrea, A., & Sierra, G. (2019). Morphological segmentation for extracting Spanish-Nahuatl bilingual lexicon. Procesamiento del Lenguaje Natural, 63, 41-48.

2018

Ximena Gutierrez-Vasques and Victor Mijangos. (2018). Comparing morphological complexity of Spanish, Otomi and Nahuatl. In Proceedings of the Workshop on Linguistic Complexity and Natural Language Processing. Association for Computational Linguistics, Santa Fe, New Mexico, pages 30–37.

Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra, and Ivan Meza. (2018). Challenges of language technologies for the indigenous languages of the Americas. Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018).

Ximena Gutierrez Vasques. “Corpus paralelo español-náhuatl y su uso en las tecnologías del lenguaje humano” (Book chapter). In Galina Russell, Isabel; Peña Pimentel, Miriam; Priani Saisó, Ernesto; Barrón Tovar, José Francisco; Domínguez Herbón, David; Álvarez Sánchez, Adriana (Coords), Humanidades digitales: lengua, texto, patrimonio y datos. México, Bonilla Artigas Editores. 2018.

2017

Gutierrez-Vasques, X., & Mijangos, V. (2017). Low-resource bilingual lexicon extraction using graph based word embeddings. arXiv preprint arXiv:1710.02569.

2016

Gutierrez-Vasques, X., Sierra, G., & Pompa, I. H. (2016, May). Axolotl: a web accessible parallel corpus for Spanish-Nahuatl. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16) (pp. 4210-4214).

2015

Gutierrez-Vasques, X. (2015). Bilingual lexicon extraction for a distant language pair using a small parallel corpus. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop (pp. 154-160).

Education

2014-2018

National Autonomous University of Mexico (UNAM)

PhD in Computational Linguistics

2010-2012

Charles University, Czech Republic. Free University of Bolzano, Italy

MSc in Computational Linguistics

2004-2010

National Autonomous University of Mexico (UNAM)

Degree in Computer Engineering

 

Grants and Scholarships

 

Swiss Government Excellence Scholarship (2019)

Postdoctoral stay

European Commission, Erasmus Mundus Scholarship (September 2010)

Fully funded master's studies

Further Information

Video: X. Gutierrez: From characters to words -- the turning point of BPE merges