The spread of low-credibility content by social bots

Chengcheng Shao et al. Nat Commun. 2018 Nov 20;9(1):4787. doi: 10.1038/s41467-018-06930-7.
Abstract

The massive spread of digital misinformation has been identified as a major threat to democracies. Communication, cognitive, social, and computer scientists are studying the complex causes for the viral diffusion of misinformation, while online platforms are beginning to deploy countermeasures. Little systematic, data-based evidence has been published to guide these efforts. Here we analyze 14 million messages spreading 400 thousand articles on Twitter during ten months in 2016 and 2017. We find evidence that social bots played a disproportionate role in spreading articles from low-credibility sources. Bots amplify such content in the early spreading moments, before an article goes viral. They also target users with many followers through replies and mentions. Humans are vulnerable to this manipulation, resharing content posted by bots. Successful low-credibility sources are heavily supported by social bots. These results suggest that curbing social bots may be an effective strategy for mitigating the spread of online misinformation.


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Online virality of content. a Probability distribution (density function) of the number of tweets for articles from both low-credibility (blue circles) and fact-checking (orange squares) sources. The distributions of the number of accounts sharing an article are very similar (see Supplementary Fig. 2). As illustrations, the diffusion networks of two stories are shown: b a medium-virality misleading article titled “FBI just released the Anthony Weiner warrant, and it proves they stole election”, published a month after the 2016 US election and shared in over 400 tweets; and c a highly viral fabricated news report titled “‘Spirit cooking’: Clinton campaign chairman practices bizarre occult ritual”, published 4 days before the 2016 US election and shared in over 30,000 tweets. In both cases, only the largest connected component of the network is shown. Nodes and links represent Twitter accounts and retweets of the article, respectively. Node size indicates account influence, measured by the number of times an account was retweeted. Node color represents bot score, from blue (likely human) to red (likely bot); yellow nodes cannot be evaluated because they have either been suspended or deleted all their tweets. An interactive version of the larger network is available online (iunetsci.github.io/HoaxyBots/). Note that Twitter does not provide data to reconstruct a retweet tree; all retweets point to the original tweet. The retweet networks shown here combine multiple cascades (each a “star network” originating from a different tweet) that all share the same article link.
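As a companion to this caption, here is a minimal Python sketch (using networkx) of how a retweet diffusion network like those in panels b and c could be assembled and reduced to its largest connected component. The tweet field names (user_id, retweeted_user_id) are hypothetical placeholders, not the paper's actual data schema.

```python
# Minimal sketch of building a retweet diffusion network.
# Field names are hypothetical stand-ins for the collected Twitter data.
import networkx as nx

def build_retweet_network(tweets):
    """Nodes are accounts; a directed edge points from the retweeter to
    the account whose tweet was retweeted (as the caption notes, all
    retweets point to the original tweet)."""
    g = nx.DiGraph()
    for t in tweets:
        if t.get("retweeted_user_id") is None:
            continue  # keep only retweets of the article link
        g.add_edge(t["user_id"], t["retweeted_user_id"])
    return g

def largest_component(g):
    """Only the largest (weakly) connected component is drawn."""
    nodes = max(nx.weakly_connected_components(g), key=len)
    return g.subgraph(nodes).copy()

# Influence of an account = number of times it was retweeted,
# i.e. its in-degree under this edge orientation:
# influence = dict(largest_component(g).in_degree())
```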
Fig. 2
Anomalies. The distributions of the types of tweets spreading articles from a low-credibility and b fact-checking sources are quite different. Each article is mapped along three axes representing the percentages of the different types of messages that share it: original tweets, retweets, and replies. When user Alice retweets a tweet by user Bob, the tweet is rebroadcast to all of Alice’s followers, whereas when she replies to Bob’s tweet, the reply is seen only by Bob and by users who follow them both. Color represents the number of articles in each bin, on a log scale. c Correlation between the popularity of articles from low-credibility sources and the concentration of posting activity. We consider the collection of articles shared by a minimum number of tweets as a popularity group. For the articles in each popularity group, a violin plot shows the distribution of Gini coefficients, which measure the concentration of posts among a few accounts (see Supplementary Methods). In the violin plots, the width of a contour represents the probability of the corresponding value, and the median is marked by a colored line. d Bot score distributions for a random sample of 915 accounts that posted at least one link to a low-credibility source (orange), and for the 961 “super-spreaders” that most actively shared content from low-credibility sources (blue). The two groups have significantly different scores (p < 10⁻⁴ according to the Mann–Whitney U test): super-spreaders are more likely to be bots.
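The two statistics named in this caption are standard; the following Python sketch shows how the Gini coefficient of per-account posting activity (panel c) and the Mann–Whitney U test on bot scores (panel d) could be computed with NumPy and SciPy. The input arrays are placeholders, not the paper's data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def gini(counts):
    """Gini coefficient of a non-negative count vector: 0 means posts
    are spread evenly across accounts; values near 1 mean a few
    accounts produce most of the posts."""
    x = np.sort(np.asarray(counts, dtype=float))
    n = x.size
    total = x.sum()
    # Standard formula for sorted x: G = (2 * sum_i i*x_i) / (n * sum x) - (n + 1) / n
    return (2 * np.sum(np.arange(1, n + 1) * x)) / (n * total) - (n + 1) / n

# Two-sided Mann-Whitney U test comparing bot scores of a random
# sample of accounts against the "super-spreaders" (illustrative):
# u, p = mannwhitneyu(random_scores, superspreader_scores, alternative="two-sided")
```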
Fig. 3
Bot strategies. a Early bot support after a viral low-credibility article is first shared. We consider a sample of 60,000 accounts that participated in the spread of the 1000 most viral stories from low-credibility sources. We align the times when each article first appears, focus on the first hour of spreading following each of these events, and divide it into logarithmic lag intervals. The plot shows the bot score distribution for accounts sharing the articles during each of these lag intervals. b Targeting of influentials. We plot the average number of followers of Twitter users who are mentioned (or replied to) by accounts that link to the 1000 most viral stories. The mentioning accounts are aggregated into three groups by bot score percentile. Error bars indicate standard errors. Inset: distributions of follower counts for users mentioned by accounts in each percentile group.
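A short Python sketch of the alignment-and-binning step described for panel a: lags since each article's first share are assigned to logarithmically spaced intervals within the first hour. The share_times structure (a sorted list of Unix timestamps per article) is a hypothetical stand-in.

```python
import numpy as np

def log_lag_bins(n_bins=10, horizon_s=3600, t_min=1.0):
    """Logarithmically spaced bin edges from t_min seconds to one hour."""
    return np.logspace(np.log10(t_min), np.log10(horizon_s), n_bins + 1)

def lags_by_bin(share_times, edges):
    """Align t=0 at the article's first appearance and assign each
    subsequent share's lag to a logarithmic interval; returns the bin
    index for every lag that falls within the horizon."""
    t0 = share_times[0]
    lags = np.asarray(share_times[1:], dtype=float) - t0
    lags = lags[(lags > 0) & (lags <= edges[-1])]
    return np.digitize(lags, edges)
```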
Fig. 4
Impact of bots on humans. a Joint distribution of the bot scores of accounts that retweeted links to low-credibility articles and of the accounts that originally posted the links. Color represents the number of retweeted messages in each bin, on a log scale. b The top projection shows the distribution of bot scores for the retweeters, who are mostly human. c The left projection shows the distribution of bot scores for accounts retweeted by likely humans, identified by scores below a threshold of 0.4 (black crosses), 0.5 (purple stars), or 0.6 (orange circles). Irrespective of the threshold, we observe a significant proportion of likely bots among the accounts retweeted by likely humans.
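The joint distribution in panel a and the thresholding in panel c amount to a 2D histogram and a simple filter, as in this illustrative Python sketch; the pairs input is a hypothetical list of (retweeter score, original poster score) tuples.

```python
import numpy as np

def joint_hist(pairs, bins=20):
    """2D histogram of (retweeter bot score, original poster bot score);
    counts per bin are plotted on a log color scale in the figure."""
    r, p = zip(*pairs)
    return np.histogram2d(r, p, bins=bins, range=[[0, 1], [0, 1]])

def scores_retweeted_by_humans(pairs, threshold=0.5):
    """Bot scores of accounts whose posts were retweeted by likely
    humans, i.e. retweeters scoring below the chosen threshold
    (0.4, 0.5, or 0.6 in panel c)."""
    return [p for r, p in pairs if r < threshold]
```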
Fig. 5
Dismantling the low-credibility content diffusion network. This analysis is based on a network of retweets linking to low-credibility articles, collected during the 2016 US presidential campaign. The network has 227,363 nodes (accounts); see Methods for further details. The order in which nodes are disconnected is determined by ranking accounts on the basis of the different characteristics shown in the legend. The remaining fraction of a unique articles from low-credibility sources and b retweets linking to those articles is plotted versus the number of disconnected nodes.
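The dismantling experiment can be pictured as a simple loop over a priority ordering, as in this Python sketch (using networkx). It tracks only the surviving fraction of retweets (panel b); the figure also tracks unique articles, and the priority ranking is assumed precomputed.

```python
import networkx as nx

def dismantle(g, priority):
    """Remove nodes of the retweet network in priority order and yield
    the remaining fraction of edges (retweets) after each removal."""
    g = g.copy()
    total_edges = g.number_of_edges()
    for node in priority:
        if g.has_node(node):
            g.remove_node(node)  # drops all retweets to/from the account
        yield g.number_of_edges() / total_edges

# Example priority: accounts ranked by a precomputed bot score
# (one of the characteristics shown in the legend):
# order = sorted(bot_score, key=bot_score.get, reverse=True)
```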
Fig. 6
Popularity and bot support for the top sources. Satire websites are shown in orange, fact-checking sites in blue, and low-credibility sources in red. Popularity is measured by total tweet volume (horizontal axis) and by the median number of tweets per article (circle area). Bot support is gauged by the median bot score of the 100 most active accounts posting links to articles from each source (vertical axis). Low-credibility sources have greater support from bots, as well as greater median and/or total volume in many cases.
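The per-source quantities plotted here could be aggregated roughly as in the following Python sketch; the tweets and bot_score inputs are hypothetical structures, not the paper's pipeline.

```python
from collections import Counter
from statistics import median

def source_summary(tweets, bot_score, top_n=100):
    """tweets: list of (article_url, account_id) tuples for one source;
    bot_score: mapping from account_id to bot score."""
    per_article = Counter(url for url, _ in tweets)       # circle area input
    per_account = Counter(acc for _, acc in tweets)       # activity ranking
    top_accounts = [acc for acc, _ in per_account.most_common(top_n)]
    return {
        "total_volume": len(tweets),                       # horizontal axis
        "median_tweets_per_article": median(per_article.values()),
        "median_top_bot_score": median(bot_score[a] for a in top_accounts),
    }
```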
