How to make data reusable?

Inter-Lab Workshop on data and standards

Friday, 29 May 2015, 9:00 – 18:30, Rooms SOE-E-1 (Morning) and SOE-F-7 (Afternoon), Schönberggasse 11

Registration Deadline: 8 May 2015

Empirical methodology supported by computer technology is a hallmark of linguistic research at the University of Zurich. Advanced digital resources have been created and exploited in all language seminars and institutes. These growing resources call for a more systematic approach to data management. In addition, the URPP Language and Space has opened up new opportunities to combine linguistic and spatial data for interdisciplinary research, which in turn poses new challenges. This creates a good context for a thorough discussion on how to integrate, develop, and exploit currently available resources.

To facilitate the data-related activities of the members of the URPP and other UZH researchers with an interest or experience in empirical methods, the three URPP labs (CorpusLab, GISLab, and VideoLab) are jointly organising a series of workshops entitled “Working with linguistic and spatial data”.

The first workshop in the series, “How to make data reusable”, took place on 29 May 2015.


9:00 – 9:15 Introduction (Curdin Derungs, Tanja Samardžić, Wolfgang Kesselheim)
9:15 – 10:00 Standards and platforms for data sharing (Tomaž Erjavec, Jožef Stefan Institute, Ljubljana)
10:00 – 10:30 Coffee break
10:30 – 11:15 Standards in spoken corpora (Thomas Schmidt, Institut für Deutsche Sprache, Mannheim)
11:15 – 12:00 Standards in spatial data (Beat Tschanz, Swisstopo, Bern)
12:00 – 12:30 Panel discussion
12:30 – 14:00 Lunch break
14:00 – 14:30 Making the data usable: issues in unifying language acquisition corpora (Robert Schikowski and Steven Moran)
14:30 – 15:30 Working session on issues in corpora of spoken language (led by Wolfgang Kesselheim)
15:30 – 16:00 Coffee break
16:00 – 17:30 Working session on issues in creating reusable language corpora (led by Tanja Samardžić)
17:30 – 18:00 Panel discussion


Please send short answers to the following questions:

  1. Briefly introduce one or more data resources that you have created or used in your research.
  2. Is there a spatial component to these data?
  3. How are the data stored? (system, media, format...)
  4. How do you retrieve the information needed for your research from the data? (web interface, local database interface, corpus query, custom scripts, manual inspection, ...)
  5. Who can access the data? How?
  6. What are the limitations of the data with respect to standards?
  7. What would you like to understand better regarding data collection and management?
  8. What is the most useful lesson learned in your experience with these data?