
Olga Pelloni

  • Doctoral student SNSF Text Group


I have been working on my PhD project at the URPP Language and Space since October 2018. The project is part of Tanja Samardžić's SNSF project "Non-randomness in Morphological Diversity: A Computational Approach Based on Multilingual Corpora".
I have a background in various fields of linguistics, with a main focus on computational linguistics. My research is inspired by the possibilities that programming and other state-of-the-art tools offer for analyzing language data and understanding language-related phenomena.


Geometry of Linguistic Morphology

The PhD project aims to develop new methods for studying linguistic morphological diversity and for language comparison. In particular, I explore tools from information theory (entropy), fractal geometry (fractal dimension) and graph theory (tree structures) in order to establish a rigorous scientific approach to comparing morphological structures. The expected outcomes of the project are 1) a novel method for studying subword structures language-independently; 2) new knowledge of the morphological systems of a sample of 100 typologically balanced languages.

Supervisor: Tanja Samardžić

Co-supervisor, professor in charge: Martin Volk


Current papers in progress:

Subword geometry: picturing word shapes (extended abstract accepted to SIGTYP 2021)

Fractal dimension as a measure of morphological complexity

Should complexity measures be complex? Exploring the influence of text compression on measuring morphological complexity




Ruzsics, T., O. Sozinova, X. Gutierrez-Vasques and T. Samardzic. (2021). Interpretability for morphological inflection: from character-level predictions to subword-level rules. European Chapter of the Association for Computational Linguistics, Long Papers.

Gutierrez-Vasques, X., C. Bentz, O. Sozinova and T. Samardzic. (2021). From characters to words: the turning point of BPE merges. European Chapter of the Association for Computational Linguistics, Long Papers.


Sozinova, O. (2016). Complex networks-based approach to Russian rhyme history description: linguostatistics and database. In Digital Humanities 2016, Conference Abstracts, Krakow, Poland, 891-893.

Sozinova, O. and M. Khudyakova (2016). Tense switching in narratives by Russian aphasia speakers. In Temas de lingüística clínica. Proceedings of the IV Clinical Linguistics International Congress, Barcelona, Spain, 209. 95-939.


Arkhangel’skii, T. and O. Sozinova (2015). A multimedia corpus of the Yiddish language. Automatic Documentation and Mathematical Linguistics, 49(2), 47-53.


Conference presentations

Bentz, C., O. Sozinova and T. Samardžić (2019). Collecting a corpus for 100 typologically diverse languages (100LC). Workshop on language documentation: multilingual settings and technological advances. Uppsala, Sweden.

Sozinova, O., T. Samardžić and C. Bentz (2019). Measuring inflectional and derivational complexity. Interactive Workshop on Measuring Language Complexity, IWMLC2019. Freiburg, Germany.

Sozinova, O. (2015). Rhyme: psychological experiment. Gasparov Readings 2015. Russian State University for the Humanities. Moscow, Russia.

Sozinova, O. (2015). Rhyme properties in Marina Tsvetaeva’s verse. Structure of Verse workshop. Leiden, Netherlands.

Sozinova, O. (2015). Corpus research on the variation of the reflexive postfix -sja in the Russian subdialect of the Ustya river basin. Norwegian Graduate Student Conference in Linguistics and Philology. Tromsø, Norway.

von Waldenfels, R., N. Dobrushina, M. Daniel, A. Ter-Avanesova, I. Levin, O. Sozinova and V. Zhigul’skaya (2015). Modelling speaker variation and dialect change in Northern Russia. The International Conference on Language Variation in Europe, ICLaVE. Leipzig, Germany.

Grabovskaya, M., P. Kasyanova and O. Sozinova (2015). Semantic analysis of Russian augmentatives using RNC. Conference on computational and corpus linguistics, ConCorT 2015. Educational center “Voronovo”, Russia.

Grabovskaya, M., P. Kasyanova and O. Sozinova (2015). Corpus research on Russian augmentatives. I Student conference at the Institute of Linguistics. Russian State University for the Humanities. Moscow, Russia.



Tutoring, Processing Non-standard Language, HS2020

Tutoring, Techniques of Semantic Analysis (MA), FS2020

University of Zurich


Tutoring, Introduction to Programming (MA), HS2019

University of Zurich


2018 Linguist Consultant (Russian), Lionbridge Technologies, Inc.
2015 – 2016 Junior Linguist (German), API.AI

Laboratory Assistant, Neurolinguistic Laboratory,

National Research University

Higher School of Economics, Moscow

2014 – 2015

Teaching Assistant in German, School of Linguistics,

National Research University

Higher School of Economics, Moscow



Main education

2018 – present

PhD in Computational Linguistics

Text Group, URPP Language and Space

University of Zurich
Thesis: Geometry of Linguistic Morphology

Supervised by Dr. Tanja Samardžić, Prof. Dr. Martin Volk. 

2016 – 2018

MA in Linguistics, specialized in Historical Linguistics,

Minor: Slavic Languages and Literatures

University of Bern

Thesis: Reconstruction of Old Chinese Phonology based on Computational Linguistic Analysis of the Shijing Rhymes.

Supervised by Prof. Dr. George van Driem.

2012 – 2016

BA in Fundamental and Applied Linguistics,

specialized in Computational Linguistics

National Research University

Higher School of Economics, Moscow

Thesis: Complex Networks-based Approach to Russian Rhyme History Description: Linguostatistics and Database

full text in Russian

Supervised by Prof. Dr. Boris Orekhov.

Exchange studies

2015 – 2016

Exchange program in Computational Linguistics (winter semester)

University of Tübingen


Exchange program in Linguistics (spring semester)

University of Bern

Summer schools


9th Lisbon Machine Learning School, LxMLS

Instituto Superior Técnico, Lisbon


Revisiting research training in linguistics: theory, logic, method

Petnica Science Centre, Valjevo


Chinese Language Summer School

University of Wuhan


Introduction to Contemporary Neurolinguistics

National Research University

Higher School of Economics, Moscow


2016 – 2018

Master Grant of the University of Bern

2015 – 2016

Oxford Russia Fund Scholarship

2014 – 2016

Increased State Academic Scholarship for academic and research achievements (Moscow, Russia)




Co-organizer at Scientifica 2021

University of Zurich


Volunteer at Scientifica 2019, LiRi Information Event

University of Zurich


Programming: Python, R, Java

Web development: Python (Django, Flask), HTML & CSS, JavaScript (AJAX), Elasticsearch

Databases: Neo4j, MySQL

Graphics software: Adobe Photoshop & Illustrator, Corel Painter & Draw

Mark-up: LaTeX



Russian (native); English, German (fluent); French (intermediate); Chinese, Bulgarian, Serbian (elementary)

Further information

Web development, 2018 – 2019



GitHub projects