Tatyana Ruzsics

MSc Tatyana Ruzsics

Doctoral student, Text Group

Address: Freiestrasse 16, 8032 Zürich

Room number: FRF E 1


Tatyana Ruzsics (Soldatova) joined the URPP Language and Space in May 2016. She is a doctoral student in the project Upstream Text Processing, supervised by Tanja Samardžić and co-supervised by Rico Sennrich. Her research interests include deep learning methods for upstream NLP tasks: writing normalization, lemmatization, morphological segmentation and morphological reinflection. She works on character-level neural machine translation methods that process information at multiple levels of text organization (characters, morphemes, words, sentences) and combine it with structural information (multilevel statistical language models and recurrent neural networks, linguistic annotation) from heterogeneous resources (noisy text, dictionaries).



Ruzsics, T., M. Lusetti, A. Göhring, T. Samardžić and E. Stark (2019). "Neural text normalization with adapted decoding and PoS features". Natural Language Engineering, 585-605. Cambridge University Press.


Lusetti, M., T. Ruzsics, A. Göhring, T. Samardžić and E. Stark (2018). "Encoder-Decoder Methods for Text Normalization". In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (COLING 2018). Santa Fe, New Mexico, USA, 18-28. Association for Computational Linguistics.


Ruzsics, T. and T. Samardžić (2017). "Neural Sequence-to-sequence Learning of Internal Word Structure". In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017). Vancouver, Canada, 184-194. Association for Computational Linguistics.

Makarov, P., T. Ruzsics, and S. Clematide (2017). "Align and copy: UZH at SIGMORPHON 2017 shared task for morphological reinflection". In Proceedings of the CoNLL-SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection. Vancouver, Canada, 49-57. Association for Computational Linguistics. Overall winner of task 1.


Bentz, C., T. Ruzsics, A. Koplenig, and T. Samardžić (2016). "A comparison between morphological complexity measures: Typological data vs. language corpora". In Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (COLING 2016). Osaka, Japan, 142-153. Association for Computational Linguistics.


"Encoder-Decoder Methods for Text Normalization", SwissText 2018, ZHAW, Winterthur

"Morphological segmentation", March 2017, Institute of Computational Linguistics Colloquium, University of Zurich

"Morphological richness through massive parallel corpora", with T. Samardžić, September 2016, URPP Language and Space, Second Meeting with Scientific Advisory Board, University of Zurich


2016 - present

University of Zurich, URPP “Language and Space”, Text Group  



ETH Zurich

CAS in Computer Science with a focus on Information Systems

2012 - 2015

ETH Zurich / University of Zurich

MSc in Quantitative Finance

2003 - 2008

Moscow State University 

MSc in Mathematics