Elements for an epistemology of instrumentation and collaboration in Twitter data research

Abstract: Twitter has acquired quite a reputation among scholars from all backgrounds, especially in computer science and the social sciences. According to its advocates, it could help us understand society and predict its behaviour, from movie blockbusters to flu epidemics, with immediate applied use cases for industry. Given that a whole new object of study is at stake for scientists of every kind, there is a need to understand how new epistemic content is built from these "Big Social Data", and how, beyond their obvious appeal, these data actually have the potential to be a new source of knowledge for researchers. Such an investigation would confirm or refute the possibility of using Twitter as a research tool, and measure the limits and scope of such an endeavour. To this end, I studied how Twitter is used in academia, what it is used for, and which theoretical status is conferred on Twitter data. I did so by collecting, over 6 months, all scientific papers about Twitter (almost 300 papers in total), and recording for each one the field in which the authors work, the subject of the paper, the methods used, and further attributes such as the paper's source (journal or conference), the country of the researcher's university, the language of publication and the publication date. Through this bibliometric study, I empirically show that, in quantitative terms, research based on Twitter (or research about Twitter) comes in equal measure from computer science on the one hand and from the social sciences and humanities on the other, with a smaller contribution from the natural sciences. Beyond the perspective of their field, researchers mobilize quite a range of tools and techniques, from traditional ethnographic approaches to new algorithms and information systems. This may dramatically affect the meaning of Twitter data use, the scope of the results and the very nature of the knowledge they bring.
In academia as in industry, the use of Twitter data is often heavily instrumented by various digital technologies that are not neutral with respect to scientific processes and results, notably because data must be computable by a Turing machine in theory and, in practice, processable within reasonable time and computing power. More importantly, 'studying Twitter' has various meanings: some researchers actually use Twitter data to do so, while others do not, preferring instead to study users, media, or the general public. Among those who collect data, volumes differ by several orders of magnitude from one paper to another. This is not a value judgment (more is not necessarily better) but a statement of fact, easily explained by the scope of each publication and the corresponding epistemological status given to Twitter or Twitter data. More precisely, I have found three possible statuses for Twitter in its academic use:
- it can be a handy source of data for algorithm design and calibration;
- it is also seen as a linguistic, social and cultural phenomenon and an object of study in itself;
- lastly, scientists hope they can use it as a mirror of society and human behaviour, a tool to study reality in general.
Most papers seem to aim for this third approach, regardless of what is actually done. These epistemological statuses are not each bound to a particular field, and there is a general blurriness in Twitter research regarding how and why data are collected and analyzed. Some computer scientists make no distinction between their algorithmic experiments and an actual study of social phenomena that could be conducted with their tools; social scientists sometimes do not make clear whether their conclusions relate to Twitter users or to people in general.
Finally, I found very few collaborations in which the two fields would bring means and ends to each other, one showing the possibilities and limits of instrumentation, the other contributing questions and objects of study, in a virtuous circle. In conclusion, I will offer modest advice for such research configurations, based on field observation and epistemological analysis.
Document type:
1st International Conference on Twitter for Research, Apr 2015, Lyon, France. 2015, 〈conftwitter2015.org〉

Contributor: Eglantine Schmitt
Submitted on: Wednesday, 20 May 2015 - 16:06:31
Last modified on: Wednesday, 5 September 2018 - 15:04:46
Document(s) archived on: Tuesday, 15 September 2015 - 06:31:21


Files produced by the author(s)


  • HAL Id: halshs-01153895, version 1



Eglantine Schmitt. Elements for an epistemology of instrumentation and collaboration in Twitter data research. 1st International Conference on Twitter for Research, Apr 2015, Lyon, France. 2015, 〈conftwitter2015.org〉. 〈halshs-01153895〉


