Harnessing Social Media Data to Forecast Voting Trends

In his book, 'The Online Effect: Decoding X to Predict Election Outcomes', journalist Sanjeev Singh reveals how the digital divide in India is narrowing.
Sanjeev’s book explores how online political campaigns, when mirrored offline, can yield powerful results. He argues that the lines between online and offline are blurring in today's elections. Excerpts from his interview with TOI Bookmark:
Q. Before we launch into a conversation about your book, for the benefit of our listeners, please tell us what the online effect is all about.
A:
This book is basically about a scientific study which shows that there is very little divide between online and offline when it comes to political campaigning.
We have always heard that India is a country with very diverse economies and backgrounds, and that there is a digital divide, so to say, in the country. But this book, in a nutshell, describes how that digital divide, if it ever existed, is now slowly narrowing. That gap is closing, and if politicians follow up in the offline mode on what they do in the online space, they can actually get very good results which mirror each other.
Q. How was your curiosity tickled, and what triggered you to write this?
A. This was a culmination of many years of research and curiosity. Ever since the advent of social media, there has been a lot of churning, and communication in elections is no longer the same thing because technology has disrupted it. So how does political communication change, and how is it changing in today's world? We started collecting data from the 2017 UP elections, then the Gujarat elections in late 2017. We tried to see how we could work on a template which can be followed in every state. And since I had mostly been in political reporting, how politicians are adapting to this change was an area of interest to me.

Q. How do you think social media has disrupted news reporting? And has it really empowered the leaders to communicate directly with the citizens?
A. Social media has definitely democratised information and data. Having said that, everything that we discuss, even when we talk about right to freedom of speech and expression, it has to be done with reasonable restrictions… It is a treasure trove of knowledge that can be used. So if you ask me that question, I am looking at the positive sides of how can we use this data, filter this data, and look at certain trends and practices which can help us in the long run. Of course, there will always be people who are misusing it, and therefore, there is a need for regulation.
Q. How do you actually map, specifically with regard to this book, ever-evolving political tastes?
A.
We spoke to 15-20 of the top news editors across India and tried to get their feedback on what are the issues that are discussed during elections and we arrived at four basic issues: jobs, development, corruption and farmers. Between 2018 and 2020, we tried to see if political parties and politicians were focusing on these issues and what was the kind of engagement they were getting. So that threw up very interesting trends, because in some cases, some sitting chief ministers were not at all talking about employment or they were not at all talking about corruption. But on the other hand, their political counterparts were focusing more on those issues and getting enough traction. So there was a clear trend that politicians and political parties do not talk more about issues where they are weak, but you will see the opposition talking more about it. But if the opposition is talking more about it, are they getting enough traction or not? We focused on the vote share to see if this engagement that they are getting on this platform, is it translating into something meaningful for them? We were able to build a foundational model which shows that if on certain topics, certain leaders get x amount of engagement, it will have a direct correlation to the vote share. And that is what this book is all about.
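The interview does not spell out the book's formula, so the following is only a minimal sketch of the general idea of relating engagement to vote share, not the author's actual model. It assumes each party's engagement on a topic is first converted into a share of total engagement and then compared with vote share using a plain Pearson correlation. The function names and all numbers are hypothetical placeholders, not data from the book.

```python
from math import sqrt


def engagement_share(engagement_by_party):
    """Convert raw engagement counts per party into shares that sum to 1."""
    total = sum(engagement_by_party.values())
    if total <= 0:
        raise ValueError("total engagement must be positive")
    return {party: count / total for party, count in engagement_by_party.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up placeholder counts for three hypothetical parties on one topic.
    shares = engagement_share({"Party A": 200, "Party B": 350, "Party C": 450})
    # Made-up placeholder vote shares for the same three parties.
    vote_share = {"Party A": 0.25, "Party B": 0.33, "Party C": 0.42}
    parties = sorted(shares)
    print(pearson([shares[p] for p in parties], [vote_share[p] for p in parties]))
```

A correlation computed this way only shows whether the two quantities move together across parties or elections; it does not by itself establish that engagement causes votes.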
Q. I find while looking at the Internet that a lot of people, those with the best intentions at heart, tend to leave a lot of digital crumbs. So how do you actually find that there is a synergy between these digital crumbs, between their online engagement and the offline engagement with the politicians? Is there any difference in their behaviour? Or do they unwittingly reveal more online?
A.
Online is more real time, where people have to react immediately, especially on a platform like X. And that actually gives an insight into the way politicians think. But you have to back it up with offline work. So there is no saying that if you are very good with online platforms, that means that you are going to win the election.
And the other thing is that if politicians think they can use social media only during election campaigns and that it will result in some kind of gain in vote share for them, they are highly mistaken, because people can see through this. This is a full-time, 24/7 job in which you need to actively engage with your followers. And even as a follower, you may just be following somebody, but you can actually change the opinion of other followers regarding a certain politician or any celebrity that you follow. If you're a responsible follower, you can always point out the mistakes, or you can call out somebody if they've done something wrong, and that has an effect on other followers as well. Therefore, people need to be very careful about the kind of image that they want to build online.
Q: Musk has changed the ways X operates. Do you think there will be any significant change, or do you think people across the country will use X as they have always done?
A:
One of the drawbacks has been that there is very little scope for developers and researchers to get access to the data. Even though you may have a developer API, there is a limit on the tweets you can pull out: you can go back only 14 or 15 days. So it is now very restrictive in that sense to study that data, unless you are pulling the data on a regular basis.
But what you will see is that people are always looking at other platforms as well… But X will always remain there, because X is a platform used for real-time dissemination of information, and that has not changed.
Q. Maybe this is a good time to discuss your success with the modelling, especially with the 2018 Chhattisgarh elections.
A:
So in the 2018 elections, you had many BJP politicians who had a far greater following than Congress politicians, but when you compared the engagement levels, you could see that the Congress politicians were doing far better. And that was a clear indication of the mood of the people as well. And you saw that in the elections, Congress won Rajasthan and Congress won Chhattisgarh very clearly. It was Madhya Pradesh where there was a seesaw battle and it was a very close election. So the kind of engagement each party leader got gave us a very clear idea of where the election was going. We had those kinds of trends, but we had to wait for the election results to come out to be able to make that correlation.
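The distinction between follower count and engagement can be illustrated with a small sketch. This is an assumed, simplified metric (total interactions divided by followers), not necessarily the measure used in the book, and the account names and figures are invented purely for illustration.

```python
def engagement_rate(total_engagement, followers):
    """Interactions (likes, replies, reposts) per follower for one account."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return total_engagement / followers


def rank_by_engagement(accounts):
    """Sort (name, total_engagement, followers) tuples by engagement rate, highest first."""
    return sorted(
        accounts,
        key=lambda account: engagement_rate(account[1], account[2]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical accounts: the larger following does not lead the ranking.
    accounts = [
        ("Leader with large following", 20_000, 1_000_000),
        ("Leader with small following", 30_000, 200_000),
    ]
    for name, engagement, followers in rank_by_engagement(accounts):
        print(name, engagement_rate(engagement, followers))
```

Ranking by rate rather than by raw follower count is what lets a smaller account come out ahead, which is the pattern described in the answer above.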
Q. This kind of stuff has not really been attempted before, certainly not in India, has it?
A.
There have been some studies, but, you know, most of the studies are qualitative in nature. So they study the sentiment. Most people look at hashtags and then they look at the sentiment and whether the majority of the sentiment was positive or negative. So there have been these kinds of studies, and these studies have been limited to maybe one election or, you know, looking at hashtags at a pan-India level. There needs to be some quantitative modelling as well, because, you know, unless you put some data there, it becomes tough to call it a proper scientific study which has gone through the rigour of these mathematical models. Therefore, we chose the keywords, and we selected the politicians and the political party handles. We got all that data together, and then we worked on this formula.
Q. What next with this modelling? Are you going to take it forward? Are you going to improve upon it?
A.
I think I would definitely want to do something on it. But there are many limitations for researchers on this particular platform, especially when it comes to scraping data. So I'm not very sure how I will do it, but I will keep an eye out for it and see what can be done, because it is a humongous exercise.