Non-random Tweet Mortality and Data Access Restrictions: Compromising the Replication of Sensitive Twitter Studies (Accepted for publication, Political Analysis)

Abstract

Used by politicians, journalists, and citizens, Twitter has for over a decade been the most important social media platform for investigating political phenomena such as hate speech, polarization, or terrorism. In a high proportion of Twitter studies of emotionally charged or controversial content, the ability to replicate findings is limited by incomplete Twitter-related replication data and the inability to recrawl the datasets entirely. This paper shows that these Twitter studies and their findings are considerably affected by non-random tweet mortality and data access restrictions imposed by the platform. Sensitive datasets suffer a notably higher removal rate than non-sensitive datasets, and attempting to replicate key findings of Kim’s (2023) influential study on the content of violent tweets leads to significantly different results. The results highlight that access to complete replication data is particularly important in light of dynamically changing social media research conditions. Thus, the study raises concerns about, and points to potential solutions for, the broader implications of non-random tweet mortality for future social media research on Twitter and similar platforms.

Publication
Andreas Küpfer, n.d. "Non-random Tweet Mortality and Data Access Restrictions: Compromising the Replication of Sensitive Twitter Studies." Accepted for publication in Political Analysis.
Twitter Replication text-as-data