Journal article
Zipf's law holds for phrases, not words
Scientific Reports, v 5(1), pp 12209-12209
11 Aug 2015
PMID: 26259699
Abstract
With Zipf's law being originally and most famously observed for word frequency, it is surprisingly limited in its applicability to human language, holding over no more than three to four orders of magnitude before hitting a clear break in scaling. Here, building on the simple observation that phrases of one or more words comprise the most coherent units of meaning in language, we show empirically that Zipf's law for phrases extends over as many as nine orders of rank magnitude. In doing so, we develop a principled and scalable statistical mechanical method of random text partitioning, which opens up a rich frontier of rigorous text analysis via a rank ordering of mixed length phrases.
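The two ideas named in the abstract, random partitioning of a text into mixed-length phrases and rank ordering by frequency, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `random_partition` and `rank_frequency`, the cut probability `q`, and the toy sentence are all assumptions made for illustration, and the sketch draws a single random partition rather than applying the statistical mechanical treatment the abstract refers to.

```python
import random
from collections import Counter


def random_partition(words, q=0.5, rng=None):
    """Split a word list into phrases.

    Each gap between adjacent words is cut independently with
    probability q (hypothetical parameter, chosen for illustration).
    """
    rng = rng or random.Random()
    if not words:
        return []
    phrases, current = [], [words[0]]
    for word in words[1:]:
        if rng.random() < q:
            # Cut here: close the current phrase and start a new one.
            phrases.append(" ".join(current))
            current = [word]
        else:
            # No cut: the word extends the current phrase.
            current.append(word)
    phrases.append(" ".join(current))
    return phrases


def rank_frequency(items):
    """Return (rank, item, count) tuples ordered by descending count."""
    counts = Counter(items).most_common()
    return [(rank, item, count)
            for rank, (item, count) in enumerate(counts, start=1)]


# Toy input, not data from the paper.
text = "the cat sat on the mat and the dog sat on the rug"
words = text.split()

phrases = random_partition(words, q=0.5, rng=random.Random(0))
for rank, phrase, count in rank_frequency(phrases)[:5]:
    print(rank, repr(phrase), count)
```

With `q=1.0` every gap is cut and the partition reduces to single words, while `q=0.0` returns the whole text as one phrase; intermediate values yield phrases of mixed length that can be ranked together in one list.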
Details
- Title
- Zipf's law holds for phrases, not words
- Creators
- Jake Ryland Williams - University of Vermont; Paul R. Lessard - University of Colorado Boulder; Suma Desu - Massachusetts Institute of Technology; Eric M. Clark - University of Vermont; James P. Bagrow - University of Vermont; Christopher M. Danforth - University of Vermont; Peter Sheridan Dodds - University of Vermont
- Publication Details
- Scientific Reports, v 5(1), pp 12209-12209
- Publisher
- Springer Nature
- Number of pages
- 7
- Grant note
- National Aeronautics & Space Administration (NASA): NNX 08A096G; National Science Foundation (NSF): DMS-0940271, 0846668
- Resource Type
- Journal article
- Language
- English
- Academic Unit
- Information Science
- Web of Science ID
- WOS:000359287600001
- Scopus ID
- 2-s2.0-84939141504
- Other Identifier
- 991021806672904721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Collaboration types
- Domestic collaboration
- Web of Science research areas
- Multidisciplinary Sciences