#AoIR2018 has ended
Wednesday, October 10 • 9:00am - 12:30pm
Exploring the Shifting Sands: Accounting for Evolution during Analysis of Data from Social Media Platforms



The goal of this half-day workshop is to explore the problems of studying social media platforms and social media data, while accounting for their evolution. For example, how should research conducted on a single platform account for differences generated over time, particularly when changes may impact observed results? We focus on how reliance on historical social media data raises questions about the reproducibility of science, the applicability of findings across changing platforms, and issues when applying methods developed for one era of social media to another. We explore how to alleviate data discrepancies by accounting for platform evolution; contextualizing changes to qualify their impact on future research; and evaluating prior research with the appropriate lenses.


Researchers are drawn to the magnitude of social media data available today. Hundreds of studies have been published using Twitter data (Zimmer & Proferes, 2014); as a result of its popularity, Tufekci (2014) calls Twitter the “model organism” for social media analysis. Yet Twitter is not a static platform: it has undergone numerous changes to its user interface, default settings, affordances for engagement, and algorithms over time. Its API changed each time new data structures were added, and its Terms of Service, Community Rules, and Privacy Policies were collectively revised more than a dozen times.

Twitter data from a decade ago differs from Twitter data today, and analyses of the older data may be inappropriate for current studies. To ignore platform evolution is to compare dissimilar data constructs over time. These questions extend beyond Twitter. For example, Instagram added stories, allowing users to share all posts across a single day. Snapchat added mapping features, later allowing users to access them both inside and outside the mobile application. Rigorous inferences from historical data require an account of platform and data evolution, and a transparent awareness of how this evolution affects the conclusions that we draw from such data.

While social media data used for scientific research has opened new opportunities in machine learning and artificial intelligence, allowing for new techniques for investigating large-scale trends, researchers do not systematically address the rapid shifting of the research space. Changing platforms and data restrict conclusions to one point in time, yet researchers do not account for shifts in orientation. How do we account for this amalgamation of data, its evolution, and the impact that platform and design changes have on the kinds of data sets produced and analyzed, and on the conclusions drawn?

Unfortunately, information about the evolution of these platforms is only available in part. Data such as public tweets, public changes to policy, and visible UI enhancements are disparately available. In contrast, most APIs, underlying platform performance mechanisms, and internal policies that drive system design are private. The lack of systematic information about these changes prevents accurate comparison across changes. Ultimately, research evolves based on assumptions about the actual state of the data. We will ask participants to tackle the following questions:
- How have changes to social media platforms influenced user behavior and vice versa? That is, can we quantify the effect platform evolution has on its users’ perceptions?
- How have researchers using social media data contextualized or integrated historical research/citations? How can we develop a theoretical basis or methodology to account for historical data?
- How do researchers describe/document the current state of a social media platform under investigation?
- What types of documentation are necessary to account for platform changes over time?
- Are social media platforms comparable at different points in time?

Workshop Structure:

This workshop is structured to facilitate open discussion of the challenges faced by researchers working with social media archives and knowledge produced about social media that has since aged. The workshop will include brainstorming grand challenges to data analysis on shifting platforms; discussions on case studies and hypothetical methodologies for addressing the issues identified; and identifying ideas to contextualize social media data sets.

The workshop will consist of three main parts:
- First, the workshop will open with brief introductory remarks from the workshop organizers and selected participants, laying out what we see as the scope of the problem. (30 minutes)
- We will next conduct a fishbowl session to brainstorm the grand challenges of this area: What are the dangers of relying on outdated historical data, and how do we address this methodologically? (60 minutes)
- For the third session, we will split participants up into small groups for an interactive session to develop hypothetical approaches to address the grand challenges. (60 minutes)
- Lastly, groups will report back to the audience, identifying the kinds of resources required to make these projects or approaches actually happen. (30 minutes)
- The desired output includes the publication of a prioritized roadmap for future research in this area.

Tufekci, Z. (2014). Big Data: Pitfalls, Methods and Concepts for an Emergent Field. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM).

Zimmer, M., & Proferes, N. J. (2014). A Topology of Twitter Research: Disciplines, Methods, and Ethics. Aslib Journal of Information Management, 66(3), 250–261.

Wednesday October 10, 2018 9:00am - 12:30pm EDT
Sheraton - Salon 8
