
OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation

Published: 20 April 2020

    Abstract

    Finding social influencers is a fundamental task in many online applications ranging from brand marketing to opinion mining. Existing methods heavily rely on the availability of expert labels, whose collection is usually a laborious process even for domain experts. Using open-ended questions, crowdsourcing provides a cost-effective way to find a large number of social influencers in a short time. Individual crowd workers, however, only possess fragmented knowledge that is often of low quality.
    To tackle those issues, we present OpenCrowd, a unified Bayesian framework that seamlessly incorporates machine learning and crowdsourcing for effectively finding social influencers. To infer a set of influencers, OpenCrowd bootstraps the learning process using a small number of expert labels and then jointly learns a feature-based answer quality model and the reliability of the workers. Model parameters and worker reliability are updated iteratively, allowing their learning processes to benefit from each other until an agreement on the quality of the answers is reached. We derive a principled optimization algorithm based on variational inference with efficient updating rules for learning OpenCrowd parameters. Experimental results on finding social influencers in different domains show that our approach substantially improves the state of the art by 11.5% AUC. Moreover, we empirically show that our approach is particularly useful in finding micro-influencers, who are very directly engaged with smaller audiences.
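The alternating updates described above can be illustrated with a much simpler loop than the variational inference the abstract refers to. The following is a minimal sketch only, not OpenCrowd's model: the function name `fit_sketch`, the logistic feature model, the log-odds combination of worker votes, the Beta(1, 1) smoothing, and the toy data are all assumptions made for illustration. It assumes every nominated candidate has a feature vector.

```python
import math


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_sketch(features, nominations, expert_labels,
               rounds=20, lr=0.5, grad_steps=50):
    """Hypothetical EM-style stand-in for the joint learning loop.

    features:      {candidate: [float, ...]}
    nominations:   list of (worker, candidate) open-ended answers
    expert_labels: {candidate: 0 or 1}, the small bootstrap set
    """
    dim = len(next(iter(features.values())))
    w = [0.0] * (dim + 1)  # feature weights, last entry is the bias
    workers = sorted({j for j, _ in nominations})
    reliability = {j: 0.5 for j in workers}
    # Beliefs start at the expert label if one exists, else at 0.5.
    q = {i: float(expert_labels.get(i, 0.5)) for i in features}

    for _ in range(rounds):
        # (1) Refit the feature-based quality model on the soft labels.
        for _ in range(grad_steps):
            grad = [0.0] * (dim + 1)
            for i, x in features.items():
                xb = x + [1.0]
                pred = sigmoid(sum(wk * xk for wk, xk in zip(w, xb)))
                for k in range(dim + 1):
                    grad[k] += (pred - q[i]) * xb[k]
            w = [wk - lr * g / len(features) for wk, g in zip(w, grad)]

        # (2) Re-estimate worker reliability as the smoothed share of a
        #     worker's nominations currently believed to be influencers.
        for j in workers:
            named = [i for jj, i in nominations if jj == j]
            reliability[j] = (1.0 + sum(q[i] for i in named)) / (2.0 + len(named))

        # (3) Update beliefs: model log-odds plus one log-odds vote per
        #     nominating worker. Expert-labelled candidates stay fixed.
        for i, x in features.items():
            if i in expert_labels:
                continue
            logit = sum(wk * xk for wk, xk in zip(w, x + [1.0]))
            for j, ii in nominations:
                if ii == i:
                    r = reliability[j]
                    logit += math.log(r / (1.0 - r))
            q[i] = sigmoid(logit)

    return q, reliability, w


# Toy usage: one made-up feature per candidate, two workers, two expert labels.
toy_features = {"a": [1.0], "b": [0.9], "c": [0.8],
                "d": [0.1], "e": [0.2], "f": [0.0]}
toy_nominations = [("w1", "a"), ("w1", "b"), ("w1", "c"),
                   ("w2", "d"), ("w2", "e"), ("w2", "f")]
beliefs, worker_reliability, weights = fit_sketch(
    toy_features, toy_nominations, {"a": 1, "d": 0})
```

In this toy run the two expert labels seed both the feature model and the reliability estimates, and each round lets one refine the other, which is the mutual-benefit idea stated in the abstract. A faithful implementation would replace the point estimates with the variational updates derived in the paper.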

          Published In

          WWW '20: Proceedings of The Web Conference 2020
          April 2020
          3143 pages
          ISBN:9781450370233
          DOI:10.1145/3366423
          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Author Tags

          1. Human-AI Collaboration
          2. Influencer finding
          3. Variational Inference

          Qualifiers

          • Research-article
          • Research
          • Refereed limited

          Conference

          WWW '20: The Web Conference 2020
          April 20 - 24, 2020
          Taipei, Taiwan

          Acceptance Rates

          Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

          Cited By

          • (2024) A Comprehensive Overview of Micro-Influencer Marketing: Decoding the Current Landscape, Impacts, and Trends. Behavioral Sciences 14:3 (243). DOI: 10.3390/bs14030243. Online publication date: 18-Mar-2024
          • (2023) Social Media Influencers’ Traits and Purchase Intention: A Moderated Mediation Effect of Attitude Towards Brand Credibility and Brand Familiarity. FIIB Business Review (231971452311622). DOI: 10.1177/23197145231162257. Online publication date: 6-Apr-2023
          • (2023) Genie in the Model. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7:1 (1-29). DOI: 10.1145/3580815. Online publication date: 28-Mar-2023
          • (2023) Decision Making Strategies and Team Efficacy in Human-AI Teams. Proceedings of the ACM on Human-Computer Interaction 7:CSCW1 (1-24). DOI: 10.1145/3579476. Online publication date: 16-Apr-2023
          • (2023) "Help Me Help the AI": Understanding How Explainability Can Support Human-AI Interaction. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (1-17). DOI: 10.1145/3544548.3581001. Online publication date: 19-Apr-2023
          • (2023) HybridEval: A Human-AI Collaborative Approach for Evaluating Design Ideas at Scale. Proceedings of the ACM Web Conference 2023 (3837-3848). DOI: 10.1145/3543507.3583496. Online publication date: 30-Apr-2023
          • (2023) Designing Creative AI Partners with COFI: A Framework for Modeling Interaction in Human-AI Co-Creative Systems. ACM Transactions on Computer-Human Interaction 30:5 (1-28). DOI: 10.1145/3519026. Online publication date: 23-Sep-2023
          • (2022) Self-Presenting Virtually for Remote Social Influence. Practical Peer-to-Peer Teaching and Learning on the Social Web (407-461). DOI: 10.4018/978-1-7998-6496-7.ch013. Online publication date: 2022
          • (2022) Human-AI Collaboration for UX Evaluation: Effects of Explanation and Synchronization. Proceedings of the ACM on Human-Computer Interaction 6:CSCW1 (1-32). DOI: 10.1145/3512943. Online publication date: 7-Apr-2022
          • (2022) Human-AI Collaboration via Conditional Delegation: A Case Study of Content Moderation. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (1-18). DOI: 10.1145/3491102.3501999. Online publication date: 29-Apr-2022
