Uliana Petrunina joined the URPP Language and Space in July 2014.

PhD project:

Approaching language variation in space: human judgements vs. corpus data


This project addresses the effects of space on variation in tense and aspect across a large number of national varieties of English. For example, non-standard tense and aspect markers such as habitual be, perfective been, and completive done were transferred from the British Isles to the Caribbean (and from Britain to Ireland) and retained in the speech of the native populations of these regions. Though the variation in tense and aspect must be the result of movement in geographical space, we argue that the retention and persistence of these markers can also be influenced by language register (written and oral), or by the interaction of geography and society.

In describing language variation in terms of the spread of structural features, linguists are increasingly using typological databases. These rapidly growing data sources allow empirical hypothesis testing on a large set of language varieties and their features. However, the reliability of the databases depends on the methodologies used for defining and applying codes to features.

We compare the features encoded in the databases with the corresponding features automatically extracted from language corpora to assess the coverage and the reliability of the two sources of data. We also explore the potential complementarity of the two sources in a multivariate statistical analysis.

Supervision: Marianne Hundt, Tanja Samardžić

Funding source: URPP Language and Space.