SDS Projects 2014-2018
This overview highlights the research projects undertaken by Curdin Derungs, leader of the SDS Group (formerly known as the GIS Group), and his team from 2014 to mid-2018, some of which continued in various forms beyond 2018.
Expansion of Bantu Languages
There are contested theories on the expansion of Bantu languages; the central debate is between two paradigms, an early-split and a late-split model. We aimed to reconstruct the most probable expansion of Bantu languages, starting in Nigeria and following the paths of least travel cost towards the east and south. To this end, we conducted a least-cost path analysis to spatially reconstruct the Bantu language tree. The results are published in Christian Wirth's master's thesis "Modellierung der Ausbreitungspfade der Bantu-Sprachfamilie mithilfe Geographischer Informationssysteme" (Modelling the expansion paths of the Bantu language family using geographic information systems; September 2014), supervised by Curdin Derungs, Robert Weibel and Balthasar Bickel.
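To illustrate what a least-cost path analysis computes, here is a minimal sketch using Dijkstra's algorithm on a small raster of travel costs. It is not the method or data used in the thesis: the function name `least_cost_path`, the 4-connected neighbourhood, and the toy grid values are all assumptions made for this example.

```python
import heapq

def least_cost_path(cost, start, goal):
    """Cheapest 4-connected route on a cost grid (Dijkstra).

    cost[r][c] is the (hypothetical) cost of entering cell (r, c);
    None marks an impassable cell. Returns (total_cost, path).
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}          # cheapest known cost to reach each cell
    prev = {}                    # predecessor of each cell on its cheapest route
    heap = [(0.0, start)]
    while heap:
        dist, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if dist > best.get(cell, float("inf")):
            continue             # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] is not None:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (new_dist, (nr, nc)))
    if goal not in best:
        return float("inf"), []
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return best[goal], path

# Toy cost surface (invented values): the middle row has two expensive cells.
grid = [
    [1, 1, 1],
    [9, 9, 1],
    [1, 1, 1],
]
total, path = least_cost_path(grid, (0, 0), (2, 0))
```

On this toy grid the cheapest route detours around the two expensive cells rather than crossing them. In a real study the cost surface would be derived from terrain, rivers, vegetation and similar factors.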
Toolbox for Analyzing Dialect Data
Dialect data, for instance regarding different syntax practices in Switzerland, often share the same characteristics: they tend to be unevenly distributed in space (in particular if collected through online questionnaires or mobile apps), with varying numbers of answers per location, and each linguistic feature may take on a different number of categorical values. For this reason, we decided to develop a toolbox that would formalize a standard spatial analysis workflow for dialect data. The analysis includes visual as well as statistical output and allows conducting a large series of tests one step at a time.
This project was set up as a close cooperation between Elvira Glaser and Philipp Stöckle, of the SNSF project "Modelling morphosyntactic area formation in Swiss German (SynMod)", and the GIS Group.
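The data characteristics described above (varying numbers of answers per location, categorical variants) suggest a first summarising step before any spatial test. The following is a minimal sketch of such a step, not the toolbox's actual code: the function name `summarize_by_location`, the site labels and the variant labels are hypothetical.

```python
import math
from collections import Counter, defaultdict

def summarize_by_location(responses):
    """Summarise categorical answers per location.

    responses: iterable of (location, variant) pairs.
    Returns {location: {"n", "dominant", "share", "entropy_bits"}}.
    """
    by_location = defaultdict(Counter)
    for location, variant in responses:
        by_location[location][variant] += 1

    summary = {}
    for location, counts in by_location.items():
        n = sum(counts.values())
        dominant, top = counts.most_common(1)[0]
        # Shannon entropy: 0 when all answers agree, higher when variants mix.
        entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
        summary[location] = {
            "n": n,
            "dominant": dominant,
            "share": top / n,
            "entropy_bits": entropy,
        }
    return summary

# Invented example: one well-sampled site and one with a single answer.
answers = [
    ("site_a", "variant_1"), ("site_a", "variant_1"),
    ("site_a", "variant_1"), ("site_a", "variant_2"),
    ("site_b", "variant_2"),
]
summary = summarize_by_location(answers)
```

Keeping `n` next to the dominant variant makes the uneven sampling visible, so that a location with a single answer is not given the same weight as a well-sampled one in later analysis.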
Global language similarities explored in large-p/small-n data collections from linguistic typology
Only recently have large compilations of typological data been homogenized and made freely available to the public. Such information usually comes in the form of a matrix containing some four to six hundred categorical (i.e. multinomial) linguistic features for several hundred languages from around the world (i.e. large-p/small-n, with n actually not being so small). One of the overarching questions that might be answered with such data is: What is the global relatedness of languages, and which historical linguistic theory does it support? However, using the data for this purpose is challenging because of, among other things, its categorical character, its large p, its uneven spatial distribution, and its many NA values. For this reason, we introduced a procedure that reduces dimensionality while still reflecting the impact of individual linguistic features, and of NA values in particular. Additionally, our approach accounts for the spatial character of the data and thus combines dimension reduction with spatial analysis.
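One building block that any such analysis needs is a dissimilarity between languages that tolerates categorical values and NAs. The sketch below shows a simple-matching dissimilarity restricted to features coded in both languages. It is an illustrative assumption, not the procedure introduced in the project; the function names, language labels and feature values are invented.

```python
from itertools import combinations

def na_aware_distance(a, b):
    """Share of mismatches among features coded (not None) in both languages.

    Returns (distance, n_shared); distance is None when nothing is shared.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None, 0
    mismatches = sum(x != y for x, y in shared)
    return mismatches / len(shared), len(shared)

def pairwise_distances(features):
    """features: {language: [categorical value or None, ...]}."""
    return {
        (p, q): na_aware_distance(features[p], features[q])
        for p, q in combinations(sorted(features), 2)
    }

# Invented toy matrix: three languages, four categorical features, None = NA.
languages = {
    "lang_a": ["SOV", "yes", None, "2"],
    "lang_b": ["SOV", "no", "a", None],
    "lang_c": [None, None, "a", "2"],
}
distances = pairwise_distances(languages)
```

Returning the number of shared features alongside the distance keeps the effect of NA values visible: a distance of 0 based on a single shared feature is far weaker evidence than one based on several hundred. A matrix like this could then feed a dimension-reduction step such as multidimensional scaling.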
Uneven Distribution of Morphology in Different Language Families
The global distribution of morphological structures across language families is uneven. This project aims to explain the impact of language contact on morphology through a combination of historical linguistics, phylogeny, and geographic analysis. In the first stage of the project, we gained the necessary background in the relevant linguistic theories. In the second stage, we applied these insights, in combination with new methodological approaches, to regions for which only sparse historical linguistic information is available. The project is a collaboration with the SNSF Sinergia project "Limits of Morphology in Time and Space" (LiMiTS).
Distribution of Dogon Languages
The Dogon language family, which consists of some 20 languages, is distributed over an area the size of Switzerland and is located in Mali along the border with Burkina Faso. On the one hand, Dogon languages have not yet been fitted into the puzzle of African languages. On the other hand, they represent a complex spatial pattern of linguistic diversity along a large and often impenetrable natural cliff (the Bandiagara). The goal of this project is to quantify the influence of the Bandiagara and to test whether accessibility can account for some of the unexplained linguistic features, such as extensive lexical borrowing. The project is a cooperation with Steven Moran and Balthasar Bickel.
From Text to Space: Spatial Discourse in Alpine Route Directions and Narratives
The general goal of this project is to explore user-generated descriptions and to develop suitable methods for working with large corpora. In particular, we are interested in examining the way people address their experiences in non-urban natural space through the prism of alpine narratives found in blogs and on alpine clubs’ webpages. The starting point of the research is the non-universal character of space conceptualizations and its dependency on various aspects of context. We aim to study these aspects – those related to the physical world (such as the scale of activity), as well as those related to socially- and individually-dependent constructs (such as the sense of place). By focusing on a specific set of linguistic features – for example, landscape terms – we want to determine how they are used in different ways across different contexts. This project constitutes the PhD thesis of Ekaterina Egorova, supervised by Ross Purves and Thora Tenbrink (Cognitive Linguistics Group, Bangor University). The PhD project is completed (PhD thesis defense April 12, 2018).
Where and What is Near?
The idea that corpora of written language contain interesting insights into the way people perceive the world is well established, but it has not yet been widely taken up in Geography. In this project we use Ngrams (n-word combinations with associated frequencies or probabilities) as an entry point to the Internet's information. In particular, we use Ngrams containing the spatial relation "near" in combination with place names. This allows us to georeference a large number of "near" relations, distributed over several continents and different geographic settings. The collection of "near" instantiations from Ngrams thus offers the possibility of quantitative interpretations of where and what is near.
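The idea of georeferencing "near" relations can be sketched as follows: keep trigrams of the form "place near place", look both place names up in a gazetteer, and compute the distance between them. This is a minimal illustration, not the project's pipeline. The gazetteer is a hypothetical three-entry lookup with approximate coordinates, and the Ngram counts are invented.

```python
import math

# Hypothetical mini-gazetteer: name -> (latitude, longitude), approximate values.
GAZETTEER = {
    "baden": (47.4733, 8.3059),
    "zurich": (47.3769, 8.5417),
    "bern": (46.9480, 7.4474),
}

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def near_relations(ngrams, gazetteer):
    """Keep trigrams 'X near Y' where X and Y are both known place names.

    ngrams: iterable of (text, count). Returns (x, y, count, distance_km) tuples.
    """
    relations = []
    for text, count in ngrams:
        tokens = text.lower().split()
        if (len(tokens) == 3 and tokens[1] == "near"
                and tokens[0] in gazetteer and tokens[2] in gazetteer):
            x, y = tokens[0], tokens[2]
            relations.append((x, y, count, haversine_km(gazetteer[x], gazetteer[y])))
    return relations

# Invented Ngram records; only the first matches the 'place near place' pattern.
sample = [
    ("Baden near Zurich", 120),
    ("hotel near Zurich", 900),
    ("Bern near me", 40),
]
relations = near_relations(sample, GAZETTEER)
```

Real place names are often ambiguous (many places share a name), so an exact-match lookup like this one would need a disambiguation step before the distances could be interpreted.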
Georeferencing Web Ngrams
Ngrams (n-word combinations with associated frequencies or probabilities) are used to index large bodies of written text. In recent years, Google as well as Bing have provided access to their Ngram collections, which represent all one- to five-word combinations on the Internet (i.e. hundreds of billions of web pages). This information has been used in many scientific domains (e.g. computational linguistics, artificial intelligence, genetics). Geography, however, has fallen somewhat short in using Ngrams for spatial analysis or for learning about the use of geographic concepts. One important reason for this gap is that Ngrams are particularly challenging to georeference, which is a precondition for follow-up analyses. In this project, our aim is to find means of associating arbitrary words or word combinations with spatial footprints, which in turn opens the door to in-depth spatial analysis of a broad set of geographic research issues.
Publications Curdin Derungs at UZH (ZORA Query)
- Same data, different conclusions: Radical dispersion in empirical results when independent analysts operationalize and test the same hypothesis. Organizational Behavior and Human Decision Processes, 165, 228–249. doi:10.1016/j.obhdp.2021.02.003
- Dialect borders - political regions are better predictors than economy or religion. Digital Scholarship in the Humanities, 35(2):276-295.
- Prediction of soil formation as a function of age using the percolation theory approach. Frontiers in Environmental Science, 6:108.
- Towards faithfully visualizing global linguistic diversity. In: Calzolari, Nicoletta. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Paris, France: European Language Resources Association (ELRA), 805-809.
- Environmental factors drive language density more in food-producing than in hunter–gatherer populations. Proceedings of the Royal Society of London, Series B: Biological Sciences, 285(1885):20172851.
- Identifying probable pathways of language diffusion in South America. In: AGILE conference 2017, Wageningen, 9 May 2017 - 12 May 2017.
- Decomposition and stabilisation of Norway spruce needle-derived material in Alpine soils using a 13C-labelling approach in the field. Biogeochemistry, 131(3):321-338.
- Characterizing place: an empirical comparison between user-generated content and freelisting data. International Conference on GIScience Short Paper Proceedings, 1(1):336-339.
- Characterising landscape variation through spatial folksonomies. Applied Geography, 75:60-70.
- Mining nearness relations from an n-grams Web corpus in geographical space. Spatial Cognition & Computation, 16(4):301-322.
- Soil attributes and microclimate are important drivers of initial deadwood decay in sub-alpine Norway spruce forests. Science of the Total Environment, 569-570:1064-1076.
- Spatial characteristics of a large web n-gram corpus. In: GIR '15 9th Workshop on Geographic Information Retrieval, Paris, 26 November 2015 - 27 November 2015. ACM Digital Library, online.
- From space to place: place-based explorations of text. International Journal of Humanities and Arts Computing, 9(1):74-94.
- Development and evaluation of a geographic information retrieval system using fine grained toponyms. Journal of Spatial Information Science, (11):1-29.
- More than a list: what outdoor free listings of landscape categories reveal about commonsense geographic concepts and memory search strategies. In: Fabrikant, Sara I; Raubal, Martin; Bertolotto, Michela; Davies, Clare; Freundschuh, Scott; Bell, Scott. Spatial Information Theory. 12th International Conference, COSIT 2015, Santa Fe, NM, USA, October 12-16, 2015, Proceedings. Cham: Springer, 224-243.
- From products to processes: Academic events to foster interdisciplinary and iterative dialogue in a changing climate. Earth's Future, 3(8):289-297.
- Where’s near? Using web tri-grams to explore spatial relations. In: GIScience 2014: Eighth International Conference on Geographic Information Science, Vienna (A), 23 September 2014 - 26 September 2014. Technische Universität Wien, 158-162.
- From text to landscape: extraction of landscape concepts through the resolution of ambiguity and vagueness present in descriptions of natural landscapes. 2014, University of Zurich, Faculty of Science.
- Creating test collections from user generated content for GIR evaluation. In: GIR'13: 7th ACM SIGSPATIAL Workshop on Geographic Information Retrieval, Orlando, FL, USA, 5 November 2013. Association for Computing Machinery, 82-83.
- The meanings of the generic parts of toponyms: use and limitations of gazetteers in studies of landscape terms. In: Tenbrink, Thora; Stell, John; Galton, Antony; Wood, Zena. Spatial Information Theory. Cham: Springer, 261-278.
- From text to landscape: locating, identifying and mapping the use of landscape features in a Swiss Alpine corpus. International Journal of Geographical Information Science, online.
- Resolving fine granularity toponyms: Evaluation of a disambiguation approach. In: GIScience 2012: Seventh International Conference on Geographic Information Science, Columbus, Ohio, 18 September 2012 - 21 September 2012, online.
- Measuring topographic similarity of toponyms. In: AGILE 2012, 15th AGILE International Conference on Geographic Information Science, Avignon, FR, 24 April 2012 - 27 April 2012, 30-34.
- Toponym disambiguation of landscape features using geomorphometric characteristics. In: 11th International Conference of GeoComputation, London, 20 July 2011 - 22 July 2011, 106-110.
- Empirical experiments on the nature of Swiss mountains. In: GISRUK 2007 Geographical Information Science Research Conference, Maynooth (Ireland), 11 April 2007 - 13 April 2007, online.