Online social networks are a rich resource of unedited user-generated multimedia content. Buried within their day-to-day chatter, we can find breaking news, opinions and valuable insight into human behavior, including the articulation of emerging social movements. Nevertheless, in recent years social platforms have become fertile ground for diverse information disorders and expressions of hate speech. This situation poses an important challenge to the extraction of useful and trustworthy information from social media. In this talk I provide an overview of existing work in the area of social media information credibility, starting with our research in 2011 on rumor propagation during the massive earthquake in Chile in 2010. I also discuss the complex problem of automatic hate speech detection in online social networks. In particular, I show how our review of the existing literature in the area reveals important experimental errors and dataset biases that lead to an overestimation of the performance of current state-of-the-art techniques. These issues become evident when attempting to apply these models to more diverse scenarios or to transfer this knowledge to languages other than English. As a particular way of dealing with the need to extract reliable information from online social media, I talk about two applications, Twically and Galean. These applications harvest collective signals created from social media text to provide a broad view of natural disasters and real-world news, respectively.
Speaker: Bárbara Poblete
Universidad de Chile
Dr. Bárbara Poblete is an Associate Professor at the Computer Science Department of the Universidad de Chile and an Amazon Visiting Scholar at Alexa Shopping Research.