In my doctoral dissertation project, I study the human work comprising data science learning, research, and practice. I study such forms of work ethnographically in the context of academic data science as well as corporate data science. I explore questions such as: what forms of human and technical work are involved in data science and analytics, what are the relations between routine, creative, and expert data science work, how do people situate and evaluate data science results to make them meaningful in business, technical, and social contexts, etc. My aim is to not only produce a better understanding of data science, but also to design and develop methods and standards that can help us more effectively perform, demonstrate, and evaluate data science. My research is supported by the NSF grant CHS-1526155: Advancing the Human Work of Data Analytics. In the past, my work was supported by Cornell’s Information Science Department Fellowship and Intel Science & Technology Center for Social Computing (ISTC) Fellowship.
Supervised Maya Klabin – a senior at Cornell with a major in Information Science – on a project in which we explored ways to effectively identify and better support the myriad forms of human work (choices, decision making, assumptions, etc.) that go into data science. We used the methodology of critical and speculative design to identify possible solutions that can help not only produce more effective analyses, but also provide new ways to visualize data and algorithmic work.
Supervised Dou Mao – an Information Science MPS graduate at Cornell University – on a project in which we developed mid- and high-fidelity prototypes of a “Decision Dashboard” that makes human choices and decisions within data science visible. The proof-of-concept InVision prototype developed as part of the project draws loosely on the work of Denis Batalov, an ML expert at Amazon, as detailed on Amazon’s research blog.
Supervised four Information Science MPS graduate students (Dai Siqi, Chen Pan, Zhenyi Xia, & Val Mack) to design and develop data analytic tools and engage in data science. The students developed critical designs related to the human work of data science, gained experience in conducting data science, and designed and implemented a web template that focuses on forms of human data science work. The aim of this project was to create a series of critical designs and build a series of tools that highlight and support the different forms of human work and decisions that comprise data science practices.
In this project, under the supervision of Steven Jackson, I focused on notions of use & non-use within user-technology relations. The research question was: what does it mean to be a (non-)user of information? I showed how the processes of information gathering, dissemination, analysis, and visualization are not merely technical acts, but sites in which specific relations between people, data, and practices are negotiated and actualized.
In this research I explored the works of four cyberneticists – Norbert Wiener (1950), William Ross Ashby (1956), Gordon Pask (1968), and Heinz von Foerster (1979) – to understand how each of them conceptualized the notion of control with regard to cybernetic systems.
This project was done in collaboration with STS PhD candidate Ranjit Singh. We showed that in the cybernetic imagination, control occurs in relation to several concepts such as probability, uncertainty, feedback, information, entropy, communication, self-regulation, organization, behavior, constraint, determinacy, observation, largeness, disturbance, equilibrium, regulators, structure, object language, metalanguage, environment, conversation, and autonomy.
In this project, I focused on an empirical case study of software development to highlight specific aspects concerning the negotiated, temporal, and situated character of software testing processes. When and where is software testing? What is the relation between testing, use, non-use, and the user? What is the distinction between software testing and software repair/maintenance? These are some of the questions that I dealt with as part of this research.
Theoretically, this project drew on the works of Madeleine Akrich, Bruno Latour, Donald MacKenzie, Trevor Pinch, and Steve Woolgar. I showed how the empirical case study of software development can help us think through some of the existing concepts and notions within the sociology of testing.
This project, under the supervision of Dan Cosley, was aimed at understanding how people discuss software bugs on GitHub. The outcome of this project was a topic model of a set of related discussions that showcase ‘themes’ in people’s conversations on software bugs. The three main categories of identified topics included programming language syntax, integrative aspects of programming, and ways of finding and representing bugs.
The meta goal of this research was to experiment with the use of quantitative research as an add-on to qualitative research. The topic model helped me get an overview of specific themes in people’s conversations on bugs. This overview was used to inform conceptual discussions and to create a topic guide for interviewing users, programmers, and developers about software debugging practices.
In this project, under the supervision of Phoebe Sengers, I analyzed students’ practices of annotating academic texts. My main focus was on the choices students make while accomplishing this practice: which part to annotate and for what purpose, what tool to use, what the use of a particular tool signifies, etc.
The study’s format was a combination of two qualitative research methods: focus groups and semi-structured interviewing. Eight students, in groups of two, each participated in an hour-long discussion with me. All the participants, including myself, were students enrolled in the same course. In the discussions we focused on the similarities and differences in our annotations: How and why do we annotate? What annotation tools do we use and what do we use them for? How do these tools shape the ways in which we engage with the text?
This project is situated within my larger interest in studying how people accomplish everyday practices. To accomplish is to achieve some form of a successful outcome through an involvement with the world around us. An everyday practice, for a particular group of people, is one that is understood as being a part of everyday life – ordinary and mundane. Studying everyday practices helps us unpack this mundaneness to better understand the nature of entanglement between us and the material world.
For this project, I worked with Sally Wyatt as part of the eHumanities group at the Royal Netherlands Academy of Arts and Sciences (Koninklijke Nederlandse Akademie van Wetenschappen, KNAW). As part of this job, I worked on the EU project titled Network for Excellence in InterNet Science (EINS). My work involved researching the social shaping of the notions of privacy and trust in relation to online social media technologies, as well as analyzing how various online technologies manage user expectations regarding the privacy of their data.
This project was done as part of the Research Master’s thesis for the CAST programme under the supervision of Jan de Roder. The thesis, situated at the intersection of Sociology and Science and Technology Studies (STS), focused on the public expressions of ressentiment within the Dutch debate on immigration. Building on Max Scheler’s sociology of ressentiment, the thesis used theoretical heuristics to analyze the Dutch sociopolitical and cultural landscape with respect to the issue of immigration. A special emphasis of the thesis was on the nature and implications of the democratization of information, primarily through the advent of news media and internet technologies, for the social shaping of Dutch public opinion.
As a junior research assistant, I worked for four months at Cardiff University, UK, in 2011. During this time I worked with Harry Collins and Robert Evans on the Imitation Game (IMGAME) project, funded by a European Research Council (ERC) Advanced Research Grant. The IMGAME project develops a new method for cross-cultural and cross-temporal comparison of societies using a web version of the famous parlor game, played between two different, yet interrelated, social groups.
During this time I helped organize IMGAME research experiments in Cardiff and Poland as part of the core organizing team, and worked closely with the research team on the design and implementation of the project. The work ranged from conceptualizing innovations concerning the game’s design and implementation to exploring the multiplicity of ways in which the gathered quantitative data could be analyzed. Another important strand of my work involved testing the IMGAME software and analyzing how users interacted with it.