Motivation and Goal: The idea that corpora of written language contain insights into the way people perceive the world is well established, but it is not yet well represented in Geography. In this project we use Ngrams (n-word combinations with associated frequencies or probabilities) as an entry point to the information on the Internet. In particular, we use Ngrams that contain the spatial relation "near" in combination with place names. This allows us to georeference a large number of "near" relations, distributed over several continents and different geographic settings. The collection of "near" instances from Ngrams thus offers the possibility of quantitatively interpreting where and what is near.
Asymmetry of "near", shown by the probabilities of the sentences "cityA near cityB" and "cityB near cityA", given that city A has a larger population than city B.
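The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the function name `near_asymmetry`, the input layout (a dictionary of Ngram counts keyed by ordered city pairs, and a dictionary of populations), and the use of relative frequencies as probability estimates are all hypothetical. The city names and numbers are invented placeholders, not results.

```python
def near_asymmetry(counts, population):
    """Estimate, per city pair, how often each direction of "near" occurs.

    counts:     {(x, y): number of Ngrams of the form "x near y"}  (assumed layout)
    population: {city: population}                                  (assumed layout)

    Returns {(larger, smaller): (p_larger_near_smaller, p_smaller_near_larger)},
    where both values are relative frequencies within the pair.
    """
    # Collect each unordered pair once, whichever direction was observed.
    pairs = {frozenset(pair) for pair in counts if pair[0] != pair[1]}

    result = {}
    for pair in pairs:
        # Order the pair so that the more populous city comes first.
        # For equal populations the order is arbitrary.
        larger, smaller = sorted(pair, key=lambda c: population[c], reverse=True)
        n_ls = counts.get((larger, smaller), 0)  # "larger near smaller"
        n_sl = counts.get((smaller, larger), 0)  # "smaller near larger"
        total = n_ls + n_sl
        if total == 0:
            continue  # no evidence for this pair in either direction
        result[(larger, smaller)] = (n_ls / total, n_sl / total)
    return result


# Invented toy data, for illustration only.
toy_counts = {("A", "B"): 20, ("B", "A"): 80}
toy_population = {"A": 1_000_000, "B": 50_000}
toy_result = near_asymmetry(toy_counts, toy_population)
```

A pair whose two values differ strongly would indicate an asymmetric use of "near". Normalising within each pair keeps the comparison independent of how often the two cities are mentioned overall.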