“I do think that the Internet truly makes us feel the world can become a smaller place,” an interlocutor, whom I will call Bo, told me in his parents’ home in Shijiazhuang, a city in China’s Hebei Province. It was late 2014, and he was studying to become a filmmaker in Beijing. During our conversation, he told me about discovering Google Earth when he was younger, recalling how, suddenly, he could “see any place in the world” from the comfort of his home. He could zoom in to explore a mountain village in Iceland, a house, and even a village dog, feeling that, without Google Earth, he would never have been able to visit such faraway places. The experience might have been virtual (xuni), he mused, but it had also been real (zhenshi). His account expressed a kind of enthusiasm for the digital that I often encountered during my ethnographic fieldwork on digital opportunity in China. However, his story was made especially compelling by the oppressive smog plaguing the city outside. While neighboring buildings disappeared in a toxic fog, he enthused about “seeing” the world through a digitally mediated Google Earth.
At the time, Bo’s words made me think about the complex experiences made possible by data-driven digital technologies. His excitement about the possibilities afforded by Google Earth highlighted the significant ways in which data applications can shape our lives. Since joining the Center for Data Ethics and Justice at the University of Virginia’s Data Science Institute in 2018 and starting to research and teach data ethics, I have repeatedly found myself revisiting the encounter. While Google Earth empowered Bo to travel the world virtually, it also compelled him to knowingly or unknowingly share user data with one of the world’s most powerful data brokers and dominant forces in today’s “surveillance capitalism” (Zuboff 2016). Public discussions of data ethics often revolve around issues of privacy and surveillance, which isn’t surprising given the attention controversies such as the Cambridge Analytica scandal have brought to how much of our private lives is captured in big data’s “mining of intimacy” (Lemov 2016). While privacy and surveillance are key issues in critical discussions of big data, I feel that this focus sometimes comes at the expense of an urgently needed debate on how big data and data technologies regulate and reorganize our worlds.
Big data applications have far-reaching implications because of the kind of large-scale control afforded by algorithms (Cheney-Lippold 2011, Schüll 2016, Seaver 2018). The tweaking of an algorithm can change the music suggestions, search results, and news we get to see, thus regulating the relations that make up our lives. A critical discussion of this kind of digital control is especially important since big data has been shown to both reflect and actively reinforce inequalities based on gender, race, and class (Noble 2012, O’Neil 2017). My intention in this blog is not to demonize data applications as a whole, but to point out the need for a discussion of their conditions and ramifications. In the context of algorithmic decision-making, who gets to see what? What connections are made and which are canceled? And, perhaps most importantly, who gets to be in charge of these decisions?
China’s Proposed “Social Credit System”
In 2014, the Chinese government articulated a jarring vision of big data-based governance when it announced plans to establish a national “social credit system,” a policy aimed at assigning every Chinese citizen a “trustworthiness rating” by 2020. While English-language news media predominantly discussed this vision of using big data to evaluate citizens as a form of Orwellian state surveillance, the Chinese government insists that the system will foster certainty and transparency in an unpredictable socio-economic landscape. The national system remains a vision for now. However, local municipalities and private companies are already operating social credit scoring systems in China. Sesame Credit, a credit score operated by the Alibaba Group’s Ant Financial, is a prominent example. A high Sesame Credit score will get customers preferential treatment on Alibaba’s platforms (Raphael and Xi 2019). Thus, it is the data generated by the customers using the company’s digital services that feeds a system regulating “good behavior” and “trustworthiness.” This system and, ultimately, the national social credit score are not just about finding statistical correlations and patterns to enable predictions about a person’s trustworthiness; they are about regulating personal relationships and interactions.
The scale and the concentration of power implied in China’s plans for a national social credit system make it a stark and worrisome example. At the same time, it is crucial to see that its vision of data-based regulation is merely an explicit example of digital forms of control already in place, manifest not just in Alibaba’s scoring system but in the algorithms behind Google’s search engine, Facebook’s user feed, or Amazon’s shopping recommendations. Such applications, of course, have histories. In the United States, FICO’s credit scoring systems date back to the late 1950s, and, more generally speaking, data-based regulations and interventions, governmental or otherwise, shaped people’s lives long before the large-scale data practices now referred to as “big data.” However, the analysis of massive amounts of data with the help of algorithms and machine learning enables new modes of control.
Keeping Up with Big Data
Big data-driven algorithmic control works in highly responsive and targeted ways that make critical assessments challenging. As we use digital technologies to make new connections and build new relationships to others and ourselves, the data we thus generate serves to continuously and uniquely adapt the digital infrastructures and predictive technologies we use. Data-based feedback loops may not be a novel phenomenon, but the speed and scale at which contemporary algorithmic systems implement them are. All the while, we can neither be certain how we are identified nor what such identification implies. How do we critically discuss the implications of big data if the logic behind such regulatory control over what we do and see online is continuously changing and, by and large, hidden away in proprietary algorithms? How do we assess the implications of data mining when our data is sold and re-sold, our lives commodified in a series of opaque data transactions? Given these conditions, how can we know the implications of Bo’s use of Google Earth and the data extraction it entailed?
Big data applications have far-reaching ethical and political implications. Beyond the ways in which data gathered on social media can shape elections, there is also much at stake in how data is used, for example, in areas such as policing, investment, and health. The challenge for critical scholarship remains finding meaningful ways of investigating how big data allows for the fluid, targeted, minuscule, and ephemeral interventions undertaken by private actors and governments that, at their potentially massive scale, will have profound consequences. We urgently need to find ways of engaging with this new state of things. Bo’s virtual exploration reminds me that we need to have a broad public debate about how digital technologies shape our ability to see and experience the world. I believe that anthropologists should be part of this conversation, but so should everyone else affected by big data. What kind of participation, access, and vision do we need to counter the injustices caused and the forms of control sustained by big data? Apart from critically examining complex and fine-tuned data politics, can we also collectively envision a better world? And can we perhaps find ways of doing so not just in spite of but through big data?
Cheney-Lippold, John. 2011. “A New Algorithmic Identity: Soft Biopolitics and the Modulation of Control.” Theory, Culture & Society 28 (6):164-181. doi: 10.1177/0263276411424420.
Lemov, Rebecca. 2016. “Big Data Is People: The Sum of Our Clickstreams Is Not an Objective Measure of Who We Are, But a Personal Portrait of Our Hopes and Desires.” Aeon. Last Modified June 16, 2016, accessed May 28, 2018. https://aeon.co/essays/why-big-data-is-actually-small-personal-and-very-human.
Noble, Safiya Umoja. 2012. “Missed Connections: What Search Engines Say About Women.” Bitch Magazine 54:36-41.
O’Neil, Cathy. 2017. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Penguin Books.
Raphael, René, and Ling Xi. 2019. “Discipline and Punish: The Birth of China’s Social-Credit System.” The Nation. Last Modified January 23, 2019, accessed February 20, 2019. https://www.thenation.com/article/china-social-credit-system/.
Schüll, Natasha Dow. 2016. “Data for Life: Wearable Technology and the Design of Self-care.” BioSocieties 11 (3):317-333.
Seaver, Nick. 2018. “Captivating Algorithms: Recommender Systems as Traps.” Journal of Material Culture. doi: 10.1177/1359183518820366.
Zuboff, Shoshana. 2016. “Google as a Fortune Teller: The Secrets of Surveillance Capitalism.” Frankfurter Allgemeine Zeitung. Last Modified March 5, 2016, accessed February 19, 2019. https://www.faz.net/aktuell/feuilleton/debatten/the-digital-debate/shoshana-zuboff-secrets-of-surveillance-capitalism-14103616-p2.html?printPagedArticle=true#pageIndex_1.