PLoS One. 2014 Feb 26;9(2):e89052.
doi: 10.1371/journal.pone.0089052. eCollection 2014.

Contraction of online response to major events

Michael Szell et al. PLoS One.

Abstract

Quantifying regularities in behavioral dynamics is of crucial interest for understanding collective social events such as panics or political revolutions. With the widespread use of digital communication media, it has become possible to study massive data streams of user-created content in which individuals express their sentiments, often towards a specific topic. Here we investigate messages from various online media created in response to major, collectively followed events such as sport tournaments, presidential elections, or a large snow storm. We relate content length and message rate, and find a systematic correlation during events which can be described by a power law relation: the higher the excitation, the shorter the messages. We show that, on the one hand, this effect can be observed in the behavior of most regular users and, on the other hand, is accentuated by the engagement of additional user demographics who only post during phases of high collective activity. Further, we identify the distributions of content lengths as lognormals, in line with statistical linguistics, and suggest a phenomenological law for the systematic relation between the message rate and the lognormal mean parameter. Our measurements have practical implications for the design of micro-blogging and messaging services. In the case of the existing service Twitter, we show that the imposed limit of 140 characters per message currently leads to a substantial fraction of tweets that their users need to truncate, which possibly makes them dissatisfying to compose.

Conflict of interest statement

Competing Interests: The authors received funding from commercial sources (Ericsson, Audi Volkswagen, BBVA, The Coca Cola Company, Expo 2015, Ferrovial, and GE). This does not alter the authors’ adherence to all the PLOS ONE policies on sharing data and materials.

Figures

Figure 1. Strong anti-correlation between length and rate of messages posted during events.
(A) Content length (number of characters per message) can be related to volume (number of messages per time interval) via a power law with slope […] (green line); because the slope is flat, a logarithmic fit cannot be rejected either. Blue squares are hourly average values from the first five days, in which the Masters Tournament took place; grey crosses are the hourly average values from subsequent times. Message rate and content length are strongly anti-correlated during the Masters ([…], p-value […] for the hypothesis of no correlation) but not after the tournament ([…], p-value […]). (B) Message rate and (C) content length over time, averaged hourly. All plots refer to the Twitter data set 1; results are similar in other media, see Section S2 in File S1.
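The power law relation in panel (A) can be illustrated with a minimal sketch, assuming the fit is an ordinary least-squares line in log-log space. The abstract does not state the fitting procedure, so this is one common choice and not necessarily the authors' method. The function names `fit_power_law` and `pearson` are hypothetical, and the inputs are assumed to be hourly averages of message rate and content length.

```python
import math

def fit_power_law(rates, lengths):
    """Fit lengths ~ prefactor * rates**slope by least squares on the logs.

    Assumption: an ordinary least-squares line in log-log space; the
    source does not specify how its power law was fitted.
    """
    xs = [math.log(r) for r in rates]
    ys = [math.log(l) for l in lengths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    prefactor = math.exp(mean_y - slope * mean_x)
    return prefactor, slope

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)
```

A negative `slope` corresponds to the contraction described in the caption: messages get shorter as the rate grows. Whether the reported correlation was computed on raw or log-transformed values is not stated here, so `pearson` is left generic.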
Figure 2. Lognormal distribution of message lengths and dependence of its parameters on excitation.
(A) Probability distributions of the content length of messages, grouped into logarithmically binned classes of hourly volume (circles), and corresponding lognormal fits (dashed curves, fit range 0 to 120). During low-volume phases (pink and blue), the distribution grows slowly. During high-volume phases (orange and red), however, the distribution grows fast and peaks at […]. Peaks at the maximum length of 140 are an artifact of the length limit in the specific medium (Twitter) and are absent in unlimited media, see Section S2 in File S1. For visual clarity only every third data point is shown. (B) Plot of the lognormal fit parameter […] against message rate, demonstrating the systematic relation between message rate and length (dashed line). (C) Plot of the lognormal fit parameter […] versus message rate. Here the value of the parameter increases with the message rate up to some point and appears independent of the volume class in high-volume regimes. Error bars denote […] confidence intervals.
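As a rough sketch of the procedure in panel (A), the following groups messages into logarithmic volume classes and estimates the two lognormal parameters per class. Several things here are assumptions: the classes are taken to be decades of hourly volume, the parameters are estimated from the mean and standard deviation of the log lengths (a simplification, since the caption describes curve fits over a restricted range, and discarding lengths above 120 biases this estimate), and all function names are hypothetical.

```python
import math
from collections import defaultdict

def volume_class(hourly_volume):
    """Decade index of an hourly volume (assumed binning): 1-9 -> 0, 10-99 -> 1."""
    return math.floor(math.log10(hourly_volume))

def lognormal_params(lengths, max_len=120):
    """Estimate (mu, sigma) from the logs of lengths within the fit range.

    Assumption: moment estimates on lengths in (0, max_len]; this only
    approximates a curve fit restricted to the range 0 to 120.
    """
    logs = [math.log(l) for l in lengths if 0 < l <= max_len]
    if not logs:
        raise ValueError("no lengths inside the fit range")
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return mu, sigma

def params_by_volume_class(messages):
    """messages: iterable of (hourly_volume, length) pairs, one per message."""
    groups = defaultdict(list)
    for hourly_volume, length in messages:
        groups[volume_class(hourly_volume)].append(length)
    return {cls: lognormal_params(ls) for cls, ls in groups.items()}
```

Plotting the first returned parameter against the volume class would give a crude analogue of panel (B), and the second an analogue of panel (C).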
Figure 3. Distributions of individual activity features and their timelines.
(A) Distribution of the number of tweets per user during the whole time span. The distribution approximately follows a power law with slope […]. (B) Cumulative distribution of the absolute thresholds of all users, i.e. for each user the smallest hourly volume in which that user posts a tweet. Approximately […] of users only post during the one hour which marks the final of the event, but roughly one fourth of the users also post during hours in which fewer than 1000 tweets are posted. (C) Participation thresholds over time (the threshold is measured separately for each user over the whole time span; for each hour we average the threshold values of all unique users who tweet in that hour). The curve roughly follows the volume curve, showing that high-volume phases feature additional users who only post during those phases. (D) Timeline of the number of tweets per unique user. During the event, each user posts on average around […] tweets per hour, with a particular peak at the finale of the tournament, showing that single users write slightly more messages during that time of high excitation. After the event, the individual activity increases slightly due to the departure of the masses of casual Twitter users.
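The threshold definitions in panels (B) and (C) translate fairly directly into code. The sketch below assumes the data are available as one `(user, hour)` pair per tweet; that input format and the function names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def participation_thresholds(tweets):
    """Return (hourly volume, per-user threshold) from (user, hour) pairs.

    A user's threshold is the smallest hourly volume among the hours in
    which that user posted at least one tweet.
    """
    tweets = list(tweets)
    volume = defaultdict(int)
    for _user, hour in tweets:
        volume[hour] += 1
    threshold = {}
    for user, hour in tweets:
        v = volume[hour]
        if user not in threshold or v < threshold[user]:
            threshold[user] = v
    return dict(volume), threshold

def mean_threshold_per_hour(tweets, threshold):
    """Average the thresholds of all unique users who tweet in each hour."""
    users_by_hour = defaultdict(set)
    for user, hour in tweets:
        users_by_hour[hour].add(user)
    return {
        hour: sum(threshold[u] for u in users) / len(users)
        for hour, users in users_by_hour.items()
    }
```

Users who only appear in the busiest hours get high thresholds, so the hourly mean rises whenever such users join, which is the behavior the caption describes for panel (C).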
Figure 4. Message length distributions filtered for classes of tweets with certain properties.
Colors correspond to different volume bins, see legend in Fig 2A. The percentage shown in each subplot is the percentage of tweets within the data set that match the corresponding criterion. The contraction effect appears throughout all classes, except for users in the highest percentiles of activity. The strongest contraction appears in tweets of users who post less than others or who have fewer followers or followees than others, and in tweets that contain no mentions, which indicates that the message is not of conversational nature. Users with a high threshold […] are missing most of the contraction since they tend to make single short tweets only. For visual clarity we applied a moving average filter of length 5 to all curves.
Figure 5. Fraction of messages posted immediately before and after an event peak.
The fraction during the peak hour serves as a measure for the temporal singularity of the event. (A) The largest peak occurs during the finale of the golf tournament, data set 1, with a peak fraction of […]. (B) The second singular event is the presidential election thread, data set 2a, with a peak fraction of […]. (C) The snow storm event, data set 4a, on the other hand, shows a clear peak only when the time scale is coarsened from hours to days. The snow storm is thus much less singular than the previous events, but still a distinct event in the broader view.
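A minimal sketch of the peak fraction, assuming it is the share of all messages in the data set that fall into the busiest time bin (the caption does not spell out the normalization). The function name `peak_fraction` and the use of timestamps in seconds are hypothetical choices.

```python
from collections import Counter

def peak_fraction(timestamps, bin_seconds=3600):
    """Return (index of busiest bin, fraction of messages in that bin).

    Assumption: the fraction is taken over all messages passed in.
    Use bin_seconds=86400 to coarsen the time scale from hours to days.
    """
    timestamps = list(timestamps)
    counts = Counter(int(t // bin_seconds) for t in timestamps)
    peak_bin, peak_count = max(counts.items(), key=lambda item: item[1])
    return peak_bin, peak_count / len(timestamps)
```

Calling the function with hourly and then daily bins on the same data mirrors the comparison in panel (C), where a peak only becomes clear at the coarser scale.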

Cited by

  • Structure of 311 service requests as a signature of urban location.
    Wang L, Qian C, Kats P, Kontokosta C, Sobolevsky S. PLoS One. 2017 Oct 17;12(10):e0186314. doi: 10.1371/journal.pone.0186314. eCollection 2017. PMID: 29040314
  • Global multi-layer network of human mobility.
    Belyi A, Bojic I, Sobolevsky S, Sitko I, Hawelka B, Rudikova L, Kurbatski A, Ratti C. Int J Geogr Inf Sci. 2017 Jul 3;31(7):1381-1402. doi: 10.1080/13658816.2017.1301455. Epub 2017 Mar 13. PMID: 28553155
  • Cities through the Prism of People's Spending Behavior.
    Sobolevsky S, Sitko I, Tachet des Combes R, Hawelka B, Murillo Arias J, Ratti C. PLoS One. 2016 Feb 5;11(2):e0146291. doi: 10.1371/journal.pone.0146291. eCollection 2016. PMID: 26849218

Grants and funding

Sebastian Grauwin acknowledges financial support from Ericsson's “Signature of Humanity” fellowship. Further supporters of Senseable City Laboratory are: the National Science Foundation, the AT&T Foundation, the Rockefeller Foundation, the MIT SMART program, the MIT CCES program, Audi Volkswagen, BBVA, The Coca Cola Company, Ericsson, Expo 2015, Ferrovial, GE, and all the members of the MIT Senseable City Lab Consortium. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.