OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation

Authors:

Mourad Khayati,

Philippe Cudré-Mauroux

WWW '20: Proceedings of The Web Conference 2020

Pages 1851 - 1862

https://doi.org/10.1145/3366423.3380254

Published: 20 April 2020

Abstract

Finding social influencers is a fundamental task in many online applications ranging from brand marketing to opinion mining. Existing methods heavily rely on the availability of expert labels, whose collection is usually a laborious process even for domain experts. Using open-ended questions, crowdsourcing provides a cost-effective way to find a large number of social influencers in a short time. Individual crowd workers, however, only possess fragmented knowledge that is often of low quality.

To tackle these issues, we present OpenCrowd, a unified Bayesian framework that seamlessly incorporates machine learning and crowdsourcing for effectively finding social influencers. To infer a set of influencers, OpenCrowd bootstraps the learning process using a small number of expert labels and then jointly learns a feature-based answer quality model and the reliability of the workers. Model parameters and worker reliability are updated iteratively, allowing their learning processes to benefit from each other until an agreement on the quality of the answers is reached. We derive a principled optimization algorithm based on variational inference with efficient updating rules for learning OpenCrowd parameters. Experimental results on finding social influencers in different domains show that our approach substantially improves the state of the art by 11.5% AUC. Moreover, we empirically show that our approach is particularly useful in finding micro-influencers, who engage very directly with smaller audiences.
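
The alternating update described in the abstract can be illustrated with a small sketch. This is not the paper's variational inference algorithm: it is a simplified EM-style loop, written under the assumption that each worker simply names candidates, that the answer quality model is a logistic regression over candidate features, and that worker reliability is a smoothed fraction of named candidates currently believed to be influencers. The function name `aggregate`, the toy data, and all hyperparameters are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def aggregate(features, answers, expert_labels, n_iters=20, lr=0.5, gd_steps=50):
    """Simplified EM-style aggregation of open-ended answers (illustrative only).

    features:      (N, D) array of candidate features
    answers:       dict worker -> list of candidate indices named by that worker
    expert_labels: dict candidate index -> 0 or 1 (the small bootstrap set)
    """
    n, d = features.shape
    X = np.hstack([features, np.ones((n, 1))])  # append a bias column
    theta = np.zeros(d + 1)                     # logistic regression weights

    # q[i] is the current belief that candidate i is an influencer.
    q = np.full(n, 0.5)
    for i, y in expert_labels.items():
        q[i] = float(y)

    reliability = {}
    for _ in range(n_iters):
        # (1) Worker reliability: smoothed share of named candidates believed
        #     to be influencers (add-one smoothing keeps it strictly in (0, 1)).
        for w, named in answers.items():
            reliability[w] = (1.0 + q[named].sum()) / (2.0 + len(named))

        # (2) Quality model: gradient steps on the logistic loss with soft targets q.
        for _ in range(gd_steps):
            p = sigmoid(X @ theta)
            theta -= lr * (X.T @ (p - q)) / n

        # (3) Beliefs: model log-odds plus the log-odds vote of every worker
        #     who named the candidate.
        log_odds = X @ theta
        for w, named in answers.items():
            r = reliability[w]
            log_odds[named] += np.log(r / (1.0 - r))
        q = sigmoid(log_odds)

        # Expert labels stay fixed throughout.
        for i, y in expert_labels.items():
            q[i] = float(y)

    return q, reliability


# Toy data: candidates 0-2 have a high feature value, candidates 3-5 a low one.
features = np.array([[2.0], [1.5], [1.8], [-2.0], [-1.5], [-1.8]])
answers = {"A": [0, 1, 2], "B": [3, 4, 5], "C": [1, 2, 4]}
expert_labels = {0: 1, 3: 0}

beliefs, rel = aggregate(features, answers, expert_labels)
print(np.round(beliefs, 2), {w: round(r, 2) for w, r in rel.items()})
```

In this toy run the two expert labels are enough to orient the feature model, which in turn raises the estimated reliability of worker A and lowers that of worker B. This is the mutual reinforcement between model and worker estimates that the abstract describes, here in a much cruder form.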

Published In

WWW '20: Proceedings of The Web Conference 2020

April 2020

3143 pages

ISBN:9781450370233

DOI:10.1145/3366423

Editors:
Yennun Huang (Academia Sinica, Taiwan),
Irwin King (The Chinese University of Hong Kong, Hong Kong),
Tie-Yan Liu (Microsoft Research Asia, China),
Maarten van Steen (University of Twente, Netherlands)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '20

Sponsor:

SIGWEB

WWW '20: The Web Conference 2020

April 20 - 24, 2020

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics

Total Citations: 16
Total Downloads: 687
Downloads (Last 12 months): 79
Downloads (Last 6 weeks): 10

Reflects downloads up to 28 Jul 2024

Cited By

Chen J, Zhang Y, Cai H, Liu L, Liao M, Fang J (2024). A Comprehensive Overview of Micro-Influencer Marketing: Decoding the Current Landscape, Impacts, and Trends. Behavioral Sciences 14:3 (243). Online publication date: 18-Mar-2024.
https://doi.org/10.3390/bs14030243
Kareem S, Venugopal P (2023). Social Media Influencers’ Traits and Purchase Intention: A Moderated Mediation Effect of Attitude Towards Brand Credibility and Brand Familiarity. FIIB Business Review (231971452311622). Online publication date: 6-Apr-2023.
https://doi.org/10.1177/23197145231162257
Wang Y, Yu Z, Liu S, Zhou Z, Guo B (2023). Genie in the Model. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7:1 (1-29). Online publication date: 28-Mar-2023.
https://dl.acm.org/doi/10.1145/3580815
Munyaka I, Ashktorab Z, Dugan C, Johnson J, Pan Q (2023). Decision Making Strategies and Team Efficacy in Human-AI Teams. Proceedings of the ACM on Human-Computer Interaction 7:CSCW1 (1-24). Online publication date: 16-Apr-2023.
https://dl.acm.org/doi/10.1145/3579476
Kim S, Watkins E, Russakovsky O, Fong R, Monroy-Hernández A (2023). "Help Me Help the AI": Understanding How Explainability Can Support Human-AI Interaction. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (1-17). Online publication date: 19-Apr-2023.
https://dl.acm.org/doi/10.1145/3544548.3581001
Mesbah S, Arous I, Yang J, Bozzon A (2023). HybridEval: A Human-AI Collaborative Approach for Evaluating Design Ideas at Scale. Proceedings of the ACM Web Conference 2023 (3837-3848). Online publication date: 30-Apr-2023.
https://dl.acm.org/doi/10.1145/3543507.3583496
Rezwana J, Maher M (2023). Designing Creative AI Partners with COFI: A Framework for Modeling Interaction in Human-AI Co-Creative Systems. ACM Transactions on Computer-Human Interaction 30:5 (1-28). Online publication date: 23-Sep-2023.
https://dl.acm.org/doi/10.1145/3519026
Hai-Jew S (2022). Self-Presenting Virtually for Remote Social Influence. Practical Peer-to-Peer Teaching and Learning on the Social Web (407-461). Online publication date: 2022.
https://doi.org/10.4018/978-1-7998-6496-7.ch013
Fan M, Yang X, Yu T, Liao Q, Zhao J (2022). Human-AI Collaboration for UX Evaluation: Effects of Explanation and Synchronization. Proceedings of the ACM on Human-Computer Interaction 6:CSCW1 (1-32). Online publication date: 7-Apr-2022.
https://dl.acm.org/doi/10.1145/3512943
Lai V, Carton S, Bhatnagar R, Liao Q, Zhang Y, Tan C (2022). Human-AI Collaboration via Conditional Delegation: A Case Study of Content Moderation. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (1-18). Online publication date: 29-Apr-2022.
https://dl.acm.org/doi/10.1145/3491102.3501999
