
Tatyana Ruzsics

  • Former doctoral student, Text Group
Freiestrasse 16, 8032 Zürich

Tatyana Ruzsics (Soldatova) was a member of the URPP Language and Space from 2016 to 2021.

She was a doctoral student under the supervision of Tanja Samardžić (co-supervised by Rico Sennrich) in the project Upstream Text Processing. Her research interests include deep learning methods for upstream NLP tasks: writing normalization, lemmatization, morphological segmentation, and morphological reinflection. She works on character-level neural machine translation methods that process information at multiple levels of text organization (characters, morphemes, words, sentences) and combine it with structural information (multilevel statistical language models and recurrent neural networks, linguistic annotation) drawn from heterogeneous resources (noisy text, dictionaries).



Ruzsics, T., M. Lusetti, A. Göhring, T. Samardžić, and E. Stark (2019). "Neural text normalization with adapted decoding and PoS features". Natural Language Engineering, 585-605. Cambridge University Press.


Lusetti, M., T. Ruzsics, A. Göhring, T. Samardžić, and E. Stark (2018). "Encoder-Decoder Methods for Text Normalization". In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (COLING 2018). Santa Fe, New Mexico, USA, 18-28. Association for Computational Linguistics.


Ruzsics, T. and T. Samardžić (2017). "Neural Sequence-to-sequence Learning of Internal Word Structure". In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017). Vancouver, Canada, 184-194. Association for Computational Linguistics.

Makarov, P., T. Ruzsics, and S. Clematide (2017). "Align and copy: UZH at SIGMORPHON 2017 shared task for morphological reinflection". In Proceedings of the CoNLL-SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection. Vancouver, Canada, 49-57. Association for Computational Linguistics. Overall winner of task 1.


Bentz, C., T. Ruzsics, A. Koplenig, and T. Samardžić (2016). "A comparison between morphological complexity measures: Typological data vs. language corpora". In Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (COLING 2016). Osaka, Japan, 142-153. Association for Computational Linguistics.


"Encoder-Decoder Methods for Text Normalization", SwissText 2018, ZHAW, Winterthur

"Morphological segmentation", March 2017, Institute of Computational Linguistics Colloquium, University of Zurich

"Morphological richness through massive parallel corpora" with T. Samardžić, September 2016, URPP Language and Space, Second Meeting with Scientific Advisory Board, University of Zurich


2016 - 2021

University of Zurich, URPP "Language and Space", Text Group



ETH Zurich

CAS in Computer Science with a focus on Information Systems

2012 - 2015

ETH Zurich / University of Zurich

MSc in Quantitative Finance

2003 - 2008

Moscow State University 

MSc in Mathematics


Further information

LxMLS 2018

In June 2018, I participated as a lab monitor at the Lisbon Machine Learning School (LxMLS).

Encoder-Decoder Methods for Text Normalization

SwissText 2018

The CLUZH team is the overall winner of task 1 of the CoNLL-SIGMORPHON 2017 shared task on morphological reinflection

Read more about our model for the CoNLL shared task and other shared tasks organized by the Text Group at our URPP Language and Space Lab.


Poster presented at CoNLL 2017 for our work on morphological segmentation