Research Design

Project Organization

The work on this project will be divided into two subprojects, each scheduled to run for 36 months:

Subproject A: Morphosyntactic Area Formation in Swiss German Dialects

Subproject B: Development and Evaluation of Quantitative Methods for Characterization of Dialectological Phenomena in Geographical Areas

Research Design for Subproject A

Objective and Database

The aim of Subproject A is to systematically describe and classify morphosyntactic variables in the dialect area of German-speaking Switzerland and thus, by means of a regionally limited study, to contribute to research on the areal distribution of linguistic features, a topic that appears to be of fundamental importance to language. Because the subject is extensive and highly complex, it will be approached from several different focal points. The database consists mainly of the material collected for the project “Dialect Syntax of Swiss German” (SADS), supplemented by the so-called Wenker Sentences from the 1930s. As a first result, the project will provide a visualization of the spatial distribution of morphosyntactic variants in the German-speaking part of Switzerland, adding grammatical themes to the classic representations based on phonetic and lexical variation, e.g. Hotzenköcherle (1984).

This necessarily entails studying and comparing different cartographic methods in order to arrive at an adequate representation of language area relationships. Starting from the classic technique of point symbols, a method that SADS essentially continues, the possibilities and limitations of employing color and incorporating quantitative aspects will be explored and evaluated. Particular emphasis is placed on the interpretive value of area maps. Until now, area maps have only rarely been generated from point symbol maps, and they do not seem able to replace point symbol maps, since they would not meet the linguistic requirement of accurate data representation. However, experience has shown that in certain contexts area maps can provide useful visualizations (of tendencies, for instance). Subproject A will evaluate the different cartographic methods and present their strengths and weaknesses.

Even at this stage, Subproject A will work in close cooperation with Subproject B; this collaboration will intensify with the analysis of general patterns in language area relations and the evaluation of hypotheses concerning the spread or containment of linguistic features. Subproject A’s publications aim to discuss different hypotheses of spatial distribution on the basis of the data that Subproject B will have produced by that point. These publications will address the inclusion of various geographic variables that previously marked either the expansion or the arrest of the spread of linguistic features. Finally, dialectological conceptions such as ‘random dispersion’ or ‘regio-typical variants’ will be examined by means of common methods of geographic statistics, as was done in Sibler’s (2011) approach. Furthermore, the concept of isoglosses and isogloss bundles will be examined in light of a new understanding of the alignment of boundary lines developed in geographic information science, cf. below (Subproject B, step B3). In this project, we are therefore not interested in the aggregated variation of linguistic phenomena that can be used to characterize dialects, as applied and propagated by Nerbonne (2009, 2010b). Rather, we focus on the study of individual linguistic phenomena. In a second step, the correspondence of their spatial distributions will be considered as well, although the focus will remain on individual phenomena and not on the characterization of larger aggregated dialect data.

Methodological Approach

i. First, all of the available material will be examined on the basis of the maps created in “Dialect Syntax”, and the types of spatial distribution identifiable in them will be grouped according to traditional dialectological criteria: for instance, into maps and phenomena that show an East-West or South-North contrast; into those that indicate a small-scale distribution of a phenomenon; into maps that show an unsystematic patchwork; and lastly into maps that indicate no intuitive spatial distribution whatsoever within the Swiss German dialect areas. From these groups, the phenomena deemed representative of specific distributional types will be chosen and evaluated in a quantitative test procedure. The results will then be compared with the intuitive dialectological assessment. The methods used in Sibler (2011) have shown that this procedure holds promise for the assessment of linguistic spatial distributions: it could supplement linguistic evaluation and make the description process both easier and, to some extent, more objective, as the case of the inclined plane has shown (Seiler 2005). Further methods provided by geography should therefore be tested in any case. Lastly, the linguistic subproject will decide which methods yield the best results for its purposes.

Conversely, phenomena and maps will be examined specifically for the influence of non-linguistic variables (topographic, political, religious environments), which, in contrast to Sibler (2011), constitutes a new research question with as yet entirely unknown results. In a further step, the maps that exhibit comparable boundary lines will be selected from the different types mentioned above. The main concern here will be, on the one hand, an analysis of linguistically motivated correlations between the phenomena in question (e.g. verb doubling with the verb afa ‘anfangen’ (‘to begin’) and the existence of a short form of the infinitive for this same verb). On the other hand, the concept of coinciding boundary lines, the traditional isogloss bundles, will be reconsidered. In doing so, a distinction is made between clear boundaries separating two complementary areas and cases in which the variants overlap in a transition area without exact boundaries. Because even boundaries that tend towards sharp border distinctions almost never match the boundary lines of other phenomena exactly, dialectological research during the last century tended to distance itself from the concept of borders in general. It will therefore be tested whether the all but abandoned concept of borders or isoglosses could be re-established by drawing on the conceptualizations of borders and border areas provided by geography.

Lastly, the insights gained from the SADS material will be compared on a selective basis with the data of the so-called Wenker material, which covers only a very limited range of phenomena, in order to learn whether the same patterns of area formation recur in it.

Research Design for Subproject B

Objective

The interest of this subproject lies primarily in the development and evaluation of methods. Its goal is to develop and extend methods, drawing on processes from GIScience, that could help to explain the influence of geography on language as well as the variation of linguistic features throughout a geographic region. The informal shorthand ‘geography’ here stands for geographic variables that can potentially affect the characteristics of dialects, such as geographic distance (Séguy, 1971; Szmrecsanyi, 2008; Nerbonne, 2010a), physical barriers (e.g. topography, bodies of water) or political or cultural borders. The central research question of the subproject is the following: how can the manner and degree of influence that geography has on the areal dispersion patterns of linguistic phenomena be quantitatively measured and characterized? Since answering this question will also require the development of methods that are new and unfamiliar to linguistics, a second research question becomes necessary: by which means will linguists be able to evaluate the results of quantitative methods, as opposed to conventional ones (e.g. point maps, manually set isoglosses)?

Methodological Approach

Subproject B is divided into the following steps:


B1) Implementation of working platform

For the sake of efficiency, especially with regard to the evaluation of the envisaged quantitative methods, a consistent working platform in which processes can largely be automated is essential. This step will therefore first implement a basic set of general dialectometric methods (e.g. KDE-based interpolation of intensity values, different measures of linguistic distance, creation of Voronoi diagrams). The statistical software system R, which Sibler (2011) already used with great success, will be used to implement the statistical techniques. Specific geometric methods call for computational geometry libraries such as JTS and CGAL. More recent methods will be implemented either with the software systems already mentioned or with Java/Processing (processing.org). Although statistical visualization is relatively strong in R, its cartographic visualizations are fairly weak. Since the SADS project makes use of the GIS software ArcGIS, it seems reasonable to do so here as well. The open-source GIS OpenJUMP, which has the advantage of direct integration into Java code, could provide an alternative.
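As an illustration of what a kernel-based interpolation of intensity values on such a platform might look like, the following is a minimal sketch in Python (chosen only for brevity; the platform itself is planned in R and Java). It assumes that each survey site is given as planar coordinates in kilometres together with the share of informants using a variant, and it uses a Gaussian kernel-weighted average (Nadaraya-Watson style smoothing) as one simple reading of “KDE-based interpolation”. The function names, the data layout and the bandwidth of 10 km are hypothetical and not taken from SADS or Sibler (2011).

```python
import math


def gaussian_kernel(distance_km, bandwidth_km):
    """Unnormalised Gaussian weight for a site at the given distance."""
    return math.exp(-0.5 * (distance_km / bandwidth_km) ** 2)


def variant_intensity(x, y, sites, bandwidth_km=10.0):
    """Kernel-weighted share of a variant at location (x, y).

    sites: list of (x_km, y_km, share) tuples, with share in [0, 1].
    The layout and the default bandwidth are illustrative assumptions.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for site_x, site_y, share in sites:
        distance = math.hypot(x - site_x, y - site_y)
        weight = gaussian_kernel(distance, bandwidth_km)
        weighted_sum += weight * share
        weight_total += weight
    if weight_total == 0.0:
        # All sites are too far away for the kernel to contribute.
        raise ValueError("no survey site within kernel range")
    return weighted_sum / weight_total
```

Evaluating `variant_intensity` on a regular grid of locations would yield the raster from which an area map could be drawn; the choice of bandwidth controls how strongly local detail is smoothed away.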


B2) Validation of linguistic hypotheses with geostatistical methods

This step pursues three goals. First, the set of geostatistical methods for characterizing and analyzing the patterns of spatial variation of linguistic features will be expanded, building on the work of Sibler (2011). Second, these methods will be used to continue evaluating the content of the hypotheses formulated in B1, again building on the work of Sibler (2011). Third, a basis will be laid for the following steps, both in methodological terms and for further cooperation between the two subprojects in their work on content.
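To indicate the kind of statistic such a method set might contain, the sketch below computes global Moran's I, a standard measure of spatial autocorrelation. This is an assumption for illustration only: the text does not state which statistics Sibler (2011) used or which will be added. The binary distance-band weights, the function name and the parameter `band_km` are likewise hypothetical.

```python
import math


def morans_i(points, values, band_km):
    """Global Moran's I with binary distance-band weights.

    points: list of (x_km, y_km) site coordinates.
    values: one numeric value per site, e.g. the share of a variant.
    Two sites count as neighbours if they lie within band_km of each other.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [value - mean for value in values]
    variance_sum = sum(d * d for d in deviations)

    cross_product_sum = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= band_km:
                cross_product_sum += deviations[i] * deviations[j]
                weight_sum += 1.0

    if weight_sum == 0.0 or variance_sum == 0.0:
        raise ValueError("no neighbouring sites or no variation in values")
    return (n / weight_sum) * (cross_product_sum / variance_sum)
```

Values clearly above zero would point to regionally clustered variants, values near zero to something closer to ‘random dispersion’, and negative values to a checkerboard-like alternation; an actual test would additionally require a significance assessment, for instance by permutation.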


B3) Correlation of dialect variation and geography: linear objects

This step and the following ones are concerned with the development of methods for assessing potential spatial correlations between geographic and linguistic phenomena. In this step, the focus lies on linear objects. In geographic terms, these represent linear barriers that influence the formation of boundaries in language variation, such as bodies of water or borders between cultural regions. The so-called “Jassgrenze” (Jass border, Jass being the name of a popular card game in Switzerland that can be played with French or German card sets), for instance, a cultural border that stretches across all of German-speaking Switzerland, is presumed to have an influence on certain dialect phenomena. The question thus arises of how to establish whether a geographic border and a linguistic border (isogloss) are ‘congruent’. It would be equally interesting to compare two or more isoglosses (isogloss bundles) and analyze their congruency.
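How ‘congruence’ is to be measured is precisely what this step has to work out. As one candidate among several, the following sketch computes a Hausdorff-type distance between two polylines: the largest distance from any vertex of one line to the nearest point on the other line. Because only vertices are checked, this is an approximation of the true Hausdorff distance. The representation of borders as lists of planar (x, y) vertices and all function names are assumptions made for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    segment_length_sq = dx * dx + dy * dy
    if segment_length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / segment_length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def directed_distance(line_a, line_b):
    """Largest distance from a vertex of line_a to the nearest point of line_b."""
    segments_b = list(zip(line_b, line_b[1:]))
    return max(
        min(point_segment_distance(p, a, b) for a, b in segments_b)
        for p in line_a
    )


def hausdorff_distance(line_a, line_b):
    """Symmetric vertex-based Hausdorff distance; each line needs >= 2 vertices."""
    return max(directed_distance(line_a, line_b),
               directed_distance(line_b, line_a))
```

A small distance relative to the spacing of the survey sites would suggest that an isogloss and, say, a cultural border run close together along their whole length; the measure is sensitive to single outlying vertices, which is one reason why alternative measures would need to be compared.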


B4) Correlation of dialect variation and geography: cost surfaces

While step B3 focuses on linear objects, this step is mainly concerned with the analysis of continuously varying geographic phenomena such as topography. Although topography can form barriers that impede the expansion or differentiation of language phenomena, lines are not well suited as models of such barriers. It makes more sense to treat topography as a cost surface. Cost surfaces are a common and versatile device in GIScience and can be used to determine how the distance between two locations can be covered along a path with minimum cumulative cost (e.g. Douglas, 1994; Larue & Nielsen, 2008; Rees, 2004).
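The idea of a minimum cumulative cost can be sketched with Dijkstra's algorithm on a raster, which is a common way to compute least-cost distances. The sketch below is a simplified illustration rather than the method of the cited works: it assumes a small grid of per-cell traversal costs, allows moves only between the four edge-adjacent cells, and charges the mean of the two cell costs per move. The function name and the grid layout are hypothetical.

```python
import heapq


def least_cost_distance(cost_grid, start, goal):
    """Minimum cumulative cost between two cells of a cost raster.

    cost_grid: list of rows with non-negative per-cell traversal costs.
    start, goal: (row, col) tuples.
    """
    rows, cols = len(cost_grid), len(cost_grid[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), float("inf")):
            continue  # outdated queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # A move costs the mean of the two cells it connects.
                step = (cost_grid[r][c] + cost_grid[nr][nc]) / 2.0
                new_cost = cost + step
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return float("inf")  # goal not reachable
```

With costs derived from slope or elevation, the resulting least-cost distances between survey sites could be compared with straight-line distances as alternative predictors of linguistic difference.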


B5) Evaluation of usability

Each of the previous steps B2 to B4 already includes experiments with morphosyntactic data, so the methods will already have undergone verification. However, for most linguists, and particularly for those who are not well versed in the methods of dialectometry, the results of the envisaged methods will be new and unfamiliar. It is thus necessary to evaluate the interpretability, and with it the usability, of the developed methods in linguistic research. This aspect can only be assessed by conducting user experiments. The experiments will be carried out with researchers and students who can be recruited via the Glaser research group and the Zurich Center for Linguistics (Zürcher Kompetenzzentrum für Linguistik, ZüKL).


B6) Completion of dissertation

The doctoral student will write the dissertation as a cumulative dissertation with integrated research papers. To complete it, the student will write a synthesis section of the research, integrate the research articles and document the developed software modules.