Sunday, 11 March 2018

Internet-age student movement research in the era of big data and social media

I have written previously about the South African student movement (of 2015/16) and the way social media use by students and others during the movement signaled a new era in student politics - and possibly grassroots politics overall in South Africa, making it the first 'internet-age student movement in South Africa' (see publications). As during the Arab Spring, Facebook and Twitter, along with other platforms (like WhatsApp), were used prolifically by activists, sympathizers and journalists to report on the movement.

This Wednesday at the HSRC, Nkululeko Makhubu and I were really excited to receive, in an Excel file, the raw data of all tweets with the hashtag #feesmustfall from the period 1 January 2015 to 31 December 2016 for analysis. It is extremely exciting to have such big data to work with - it is also a bit scary to think how much information about a person remains stored in the 'cloud'. Today it is almost exactly three years since the start of #RhodesMustFall, and two and a half years since that of #FeesMustFall. Yet, with the right extraction tools, it is possible to get information down to the most minuscule datum.

For instance, in our dataset, we have for every one of the 576,000 tweets extracted so far the exact date and time when it was tweeted, the location, the language, the user name, Twitter handle and gender of the tweeter, the mentions (handles of others mentioned in the tweet), the actual tweet content (including links to pictures, videos, websites etc. in the tweet), other hashtags, and so forth.

Even though this is open, publicly accessible data, it still requires researchers to be extremely sensitive to the ethics of research. Thus, how one treats matters like the identity of a tweeter is really an important question; after all, using certain data in ethically questionable ways may have very real-life implications for the tweeter. What I tweeted in the hype of activism in 2015 must be seen in that context... and such big data sets do not provide such context. As a young student being thrust into the midst of student activism and protest, I may have said (or rather: tweeted) things that I may very well want to disown now; the university experience - including participating in a student protest - is, after all, a learning experience and it is part of learning that one makes mistakes...
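One common way of handling the identity question in practice is to replace Twitter handles with stable pseudonyms before any analysis is done. The snippet below is only a minimal sketch of that idea and not a description of how we will treat our own data: the function name `pseudonymise`, the `user_` prefix and the example key are all made up for illustration. It uses a keyed hash (HMAC-SHA256) from the Python standard library, so that the same handle always maps to the same pseudonym, while the mapping cannot easily be reversed without the secret key.

```python
import hashlib
import hmac


def pseudonymise(handle, secret_key):
    """Return a stable pseudonym for a Twitter handle.

    The handle is normalised (surrounding spaces and a leading '@' removed,
    lower-cased) so that '@Example' and 'example' get the same pseudonym.
    `secret_key` is a bytes value that the research team keeps private.
    """
    normalised = handle.strip().lstrip("@").lower()
    digest = hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256)
    # Twelve hex characters keep the label short; a longer slice lowers
    # the (already small) chance of two handles sharing a pseudonym.
    return "user_" + digest.hexdigest()[:12]


# Hypothetical key for illustration only; a real key should be long,
# random and stored separately from the data set.
example_key = b"replace-with-a-private-random-key"

alias_a = pseudonymise("@Example_Student", example_key)
alias_b = pseudonymise("example_student", example_key)
alias_c = pseudonymise("@another_account", example_key)
```

Pseudonyms of this kind only protect the handle column. The tweet text itself can still identify a person, for example through mentions or through a simple search for the quoted wording, so this is one small step and not a complete answer to the ethical question.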

Pear factor, a media monitoring, research and analysis company, sent me a 'teaser' last year of their capabilities when it comes to doing social media data analysis and infographics (thanks very much!). I have included (above) two snips of those infographics.

We will be doing different kinds of analysis on our own data set, but this just illustrates how, in an aggregate fashion, much can be learnt about the 'virtual'/online dimension of the student movement. The timeline here indicates that tweeting that matched their keyword and hashtag specification 'exploded' in mid-October 2015 - coinciding with the national timeline of #FeesMustFall (i.e. the march to Parliament on 21 October, the march to Luthuli House on 22 October, and the march to the Union Buildings on 23 October 2015). It also shows the geographic spread of tweeting, centred on South Africa and the main metropolitan centres in particular, but spreading around the globe, including English-speaking Africa, Europe (especially the UK) as well as the United States, India, the Middle East and Australia.
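To give a sense of how such a timeline can be produced from raw tweet data, here is a minimal sketch in Python that counts tweets per day and lists the busiest days. It is not Pear factor's method, nor necessarily the one we will use: the column name `created_at`, the timestamp format and the handful of sample rows are assumptions made up for illustration, and the real export may look quite different.

```python
from collections import Counter
from datetime import datetime


def tweets_per_day(rows, timestamp_field="created_at", fmt="%Y-%m-%d %H:%M:%S"):
    """Count tweets per calendar day.

    `rows` is any iterable of dicts, one per tweet. The field name and
    timestamp format are assumptions; adjust them to the actual export.
    """
    counts = Counter()
    for row in rows:
        day = datetime.strptime(row[timestamp_field], fmt).date()
        counts[day] += 1
    return counts


def busiest_days(counts, n=3):
    """Return the n days with the most tweets, ties broken by earlier date."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


# Invented sample rows, only to show the shape of the input.
sample_rows = [
    {"created_at": "2015-10-14 09:15:00"},
    {"created_at": "2015-10-21 10:02:11"},
    {"created_at": "2015-10-21 13:45:30"},
    {"created_at": "2015-10-21 18:20:05"},
    {"created_at": "2015-10-22 08:00:00"},
    {"created_at": "2015-10-22 16:30:42"},
]

counts = tweets_per_day(sample_rows)
for day, total in busiest_days(counts):
    print(day.isoformat(), total)
```

With the real data, the rows could be read from a CSV export of the Excel file using `csv.DictReader` from the standard library, and the daily counts plotted to see when tweeting peaked.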

I will be documenting the progress of this study in detail on the Osphera.net website and making regular updates via @osphera and on my own blog, Facebook and Twitter accounts.