Zipf's law holds for phrases, not words
Journal article   Open access   Peer reviewed

Jake Ryland Williams, Paul R. Lessard, Suma Desu, Eric M. Clark, James P. Bagrow, Christopher M. Danforth and Peter Sheridan Dodds
Scientific Reports, v 5(1), article 12209
11 Aug 2015
PMID: 26259699
url
https://doi.org/10.1038/srep12209
Published, Version of Record (VoR) Open

Abstract

Multidisciplinary Sciences; Science & Technology; Science & Technology - Other Topics
With Zipf's law being originally and most famously observed for word frequency, it is surprisingly limited in its applicability to human language, holding over no more than three to four orders of magnitude before hitting a clear break in scaling. Here, building on the simple observation that phrases of one or more words comprise the most coherent units of meaning in language, we show empirically that Zipf's law for phrases extends over as many as nine orders of rank magnitude. In doing so, we develop a principled and scalable statistical mechanical method of random text partitioning, which opens up a rich frontier of rigorous text analysis via a rank ordering of mixed length phrases.
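
To make the idea of random text partitioning and a rank ordering of mixed-length phrases concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that each gap between adjacent words is cut independently with probability q, and the names `random_partition`, `rank_frequency`, and `q` are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def random_partition(words, q=0.5, rng=None):
    """Split a list of words into phrases of one or more words.

    Assumption for illustration: every gap between adjacent words is
    cut independently with probability q. With q = 1 every word stands
    alone; with q = 0 the whole sequence stays one phrase.
    """
    if rng is None:
        rng = random.Random()
    if not words:
        return []
    phrases = []
    current = [words[0]]
    for word in words[1:]:
        if rng.random() < q:
            # Cut here: close the current phrase and start a new one.
            phrases.append(" ".join(current))
            current = [word]
        else:
            # No cut: extend the current phrase.
            current.append(word)
    phrases.append(" ".join(current))
    return phrases


def rank_frequency(phrases):
    """Return (rank, phrase, count) tuples, most frequent phrase first."""
    counts = Counter(phrases)
    return [
        (rank, phrase, count)
        for rank, (phrase, count) in enumerate(counts.most_common(), start=1)
    ]


if __name__ == "__main__":
    text = "the cat sat on the mat and the dog sat on the rug"
    phrases = random_partition(text.split(), q=0.5, rng=random.Random(0))
    for rank, phrase, count in rank_frequency(phrases):
        print(rank, count, phrase)
```

Plotting count against rank on logarithmic axes for a large corpus is the usual way to check how far a Zipf-like scaling extends. A single random partition of a short text, as in the usage example, is far too small to show it.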

Metrics

4 Record Views
53 citations in Scopus

Details

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
Web of Science research areas
Multidisciplinary Sciences