Comparative Study
PLoS One. 2013 May 17;8(5):e63238.
doi: 10.1371/journal.pone.0063238. Print 2013.

Phonotactic diversity predicts the time depth of the world's language families

Taraka Rama. PLoS One.

Abstract

The ASJP (Automated Similarity Judgment Program) described an automated, lexical similarity-based method for dating the world's language groups using 52 archaeological, epigraphic and historical calibration date points. The present paper describes a new automated dating method based on phonotactic diversity. Unlike ASJP, our method does not require any information on the internal classification of a language group. The method can also use all the available word lists for a language and its dialects, which sidesteps the debate on 'language' vs. 'dialect'. We further combine the dates from the two methods and provide a new baseline which, to our knowledge, is the best one. We make a systematic comparison of our method, ASJP's dating procedure, and the combined dates. We predict time depths for the world's language families and sub-families using this new baseline. Finally, we explain our results in terms of the model of language change given by Nettle.
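
As a rough illustration of the idea of phonotactic diversity, the sketch below counts the distinct segment n-grams found across all word lists of a language group. The abstract does not give the exact definition used in the paper, so this counting rule is an assumption, as are the function names `ngrams` and `ngram_diversity`, the toy word lists, and the choice to treat each character as one segment.

```python
from typing import Iterable, Set, Tuple


def ngrams(word: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of contiguous n-grams of segments in one word.

    Assumption: each character of `word` stands for one sound segment.
    """
    return {tuple(word[i:i + n]) for i in range(len(word) - n + 1)}


def ngram_diversity(word_lists: Iterable[Iterable[str]], n: int) -> int:
    """Count distinct n-grams over every word list in a language group.

    All word lists are pooled, so no internal classification of the
    group (and no language/dialect distinction) is needed.
    """
    seen: Set[Tuple[str, ...]] = set()
    for word_list in word_lists:
        for word in word_list:
            seen |= ngrams(word, n)
    return len(seen)


# Hypothetical toy data: two word lists for the same two meanings.
toy_group = [["hant", "fot"], ["hand", "fut"]]
print(ngram_diversity(toy_group, 2))
print(ngram_diversity(toy_group, 3))
```

Pooling every available word list into a single set is what lets a measure of this kind ignore how the group is subdivided internally.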

Conflict of interest statement

Competing Interests: The author has declared that no competing interests exist.

Figures

Figure 1. Calibration dates against the number of languages in a language group. Different marker symbols distinguish archaeological, archaeological and historical, epigraphic, and historical dates.
Figure 2. Pairwise scatterplot matrix of group size, N-gram diversity and date. The lower matrix panels show scatterplots and LOESS lines; the upper matrix panels show the Spearman rank correlation and the level of statistical significance. The diagonal panels display variable names. All the plots are on a log-log scale.
Figure 3. Comparing predicted dates for various n-grams.
Figure 4. Combining ASJP with 2-grams and 3-grams: the ASJP dates are combined with 2-gram dates and 3-gram dates in different proportions ranging from 1% to 100% at an interval of 1.2
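
One way to read "combined in different proportions" in the Figure 4 caption is as a weighted average of two date estimates. The sketch below assumes a simple linear mixture; the caption does not state the formula, so the weighting scheme, the function name `combine_dates`, the example weights and the example dates are all hypothetical.

```python
def combine_dates(asjp_date: float, ngram_date: float, asjp_weight: float) -> float:
    """Blend two date estimates (e.g. in years before present).

    Assumption: a linear mixture, where `asjp_weight` is the proportion
    given to the ASJP date and the remainder goes to the n-gram date.
    """
    if not 0.0 <= asjp_weight <= 1.0:
        raise ValueError("asjp_weight must lie between 0 and 1")
    return asjp_weight * asjp_date + (1.0 - asjp_weight) * ngram_date


# Hypothetical dates for one language group, tried at a few proportions.
for weight in (0.01, 0.25, 0.5, 1.0):
    print(weight, combine_dates(1000.0, 2000.0, weight))
```

Sweeping the weight and scoring each blend against the calibration dates would be one way to pick a proportion, though how the paper selects it is not stated in this record.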

Grants and funding

The research presented here was supported by the Swedish Research Council (the project Digital areal linguistics, VR dnr 2009-1448) and by the University of Gothenburg through its support of the Centre for Language Technology and of Språkbanken. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
