Trust in Data Science: Collaboration, Translation, and Accountability in Corporate Data Science Projects

Best Paper Award, Conference Paper
Samir Passi, Steven Jackson
In Proceedings of the ACM on Human-Computer Interaction, Vol. 2, CSCW, Article 136 (November 2018). ACM, New York, NY.
Publication year: 2018

Abstract:

The trustworthiness of data science systems in applied and real-world settings emerges from the resolution of specific tensions through situated, pragmatic, and ongoing forms of work. Drawing on research in CSCW, critical data studies, and history and sociology of science, and six months of immersive ethnographic fieldwork with a corporate data science team, we describe four common tensions in applied data science work: (un)equivocal numbers, (counter)intuitive knowledge, (in)credible data, and (in)scrutable models. We show how organizational actors establish and re-negotiate trust under messy and uncertain analytic conditions through practices of skepticism, assessment, and credibility. Highlighting the collaborative and heterogeneous nature of real-world data science, we show how the management of trust in applied corporate data science settings depends not just on pre-processing and quantification, but also on negotiation and translation. We conclude by discussing the implications of our findings for data science research and practice, both within and beyond CSCW.

Data Vision: Learning to See Through Algorithmic Abstraction

Best Paper Award, Conference Paper
Samir Passi, Steven Jackson
In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW '17). ACM, New York, NY, USA, 2436-2447. DOI: https://doi.org/10.1145/2998181.2998331
Publication year: 2017

Abstract:

Learning to see through data is central to contemporary forms of algorithmic knowledge production. While often represented as a mechanical application of rules, making algorithms work with data requires a great deal of situated work. This paper examines how the often-divergent demands of mechanization and discretion manifest in data analytic learning environments. Drawing on research in CSCW and the social sciences, and ethnographic fieldwork in two data learning environments, we show how an algorithm’s application is seen sometimes as a mechanical sequence of rules and at other times as an array of situated decisions. Casting data analytics as a rule-based (rather than rule-bound) practice, we show that effective data vision requires would-be analysts to straddle the competing demands of formal abstraction and empirical contingency. We conclude by discussing how the notion of data vision can help better leverage the role of human work in data analytic learning, research, and practice.

Media coverage of this paper is available from Cornell Research.