Over*Flow: Digital Humanity: Social Media Content Moderation and the Global Tech Workforce in the COVID-19 Era
Sarah T. Roberts / University of California, Los Angeles

Author’s Note: Over the past days, I have fielded many questions about commercial content moderation work during the global coronavirus (COVID-19) crisis. There are many aspects to consider, including location, logistics and infrastructure, legal worker protections, and state-level actions. As I have written and rewritten this article, I have repeatedly needed to update it to reflect changing circumstances. At this point, the evening of March 17, I will not fundamentally change it but will continue to update it until it goes to press. My heart goes out to all of those around the world who are touched by this disease: all of us.

Last week, at a small gathering at UCLA, in what we could not know at the time was likely to be the last event of its kind for most of us for the foreseeable future, a group of scholars at all levels of career and life came together with community activists, artists and others to respond to a conversation curated by Professor Patrik Svensson under the aegis of Humane Infrastructures. It was an apt context for what we were about to experience collectively, though assuredly not one on the horizon during the event’s planning.

For the purposes of this event I was asked, in what I have now come to regard as an uncanny bit of timing, to discuss technology labor forces dedicated to social media content moderation and/as infrastructure, prompting me to open my remarks with a nod to “human infrastructure” more generally. It is an exercise I find useful in my work, but as a metaphor or description it has serious limitations. And so I use it while also applying various caveats, the first of which is simply that humans are humans. They are not pipe. They are not fiber. They are not, despite all the attempts of management theorists of the early 20th century and gig-work proponents of the 21st, cogs to be replaced when one becomes worn, reducible to their motion study-documented singular movements, or blips on a delivery map.

Yet because the approach to provisioning labor for large-scale technology operations often takes on these overtones, it bears discussing labor forces as infrastructure, if for no other reason than to accurately account for them in the production chain of things like, in my case, social media, or manufactured goods, or textiles, or whatever the product or output may be. I also believe that gaining insight into corporate orientations toward such labor forces helps us develop a more thorough and sound critique of those orientations and of the concomitant practices that emerge from characterizing a workforce as infrastructure in the first place. In other words, we need to see as the firms see in order to make the most salient and effective critiques of their practices and credos.

I will cut to the chase of what many readers want to know: how is the COVID-19 pandemic, the coronavirus now sweeping around the globe, impacting the moderation of social media? More to the point, your question may be, “Why is corona having an impact on moderation at all?” Let me give the briefest overview I can: the practice of social media moderation happens at industrial scale, with many of the transnational service outsourcing firms now involved and countless other players of lesser size at the table. It is a global system that involves labor pools at great geographic and cultural distance, as well as jurisdictional and legal remove, from where we might imagine the center of social media action to be: Menlo Park, or Mountain View, or Cupertino, or another Silicon Valley enclave.

The second thing to bear in mind is that there is a vast human workforce doing an incredible amount of high-impact content moderation for the firms; my typical estimate (which I consider extremely conservative) is that the global moderation workforce numbers in the six figures at any given time, and I likely need to revise that number significantly. Yes, there are AI and computational tools that also conduct this work, but it is important to keep in mind that it is exceedingly difficult for those systems to operate without human oversight, or without humans vetting content and doing manual removals, too.


Facebook's Announcement on March 16, 2020
Facebook’s announcement on March 16th indicated to many that a new experiment in content moderation was forthcoming.

This particular fragility, the dependence of automated systems on human oversight, has been seen most acutely today at Facebook, which announced yesterday evening that it would shut down as much of its operations as it could and have workers work from home when possible. In the case of its commercial content moderators, Facebook has explained that there are many cases in which workers cannot do their work effectively from home, and the company is therefore moving to a much greater reliance on its AI tools and automated moderation systems. The switch in reliance upon automated removal appears to have occurred today, when vast numbers of users began reporting the deletion of benign and sometimes even newsworthy content (in many cases, about COVID-19). Some representatives from Facebook have confirmed that there was a “bug” in some of the automated content removal systems, which has now been corrected.[ ((It bears mentioning that there was some debate on Twitter about whether or not this bug was related to the letting go of human content moderators, with Guy Rosen of Facebook stating that it was not and former Facebook CSO Alex Stamos expressing skepticism. My guess is that the new widespread reliance on AI tools has already revealed and will continue to reveal a variety of hits a human would not make.))]


Professor Vaidhyanathan's Tweet
Professor Siva Vaidhyanathan of UVA expresses frustration with Facebook’s moderation under all-AI, March 17, 2020.

To understand this better, I will describe the general status quo at most of the top-tier American social media firms and their content moderation ecosystem.[ ((The operative phrase here is “top-tier”; many smaller firms have considerably fewer resources to put toward moderation and may have devised other systems entirely to manage the bulk of their moderation needs. Two important examples of alternative systems are Reddit and Wikipedia, both of which rely on a huge network of volunteer community moderators whose interventions are user-facing and who are typically themselves close to the communities they moderate.))] The chief characteristics of that ecosystem are that it tends to be arranged around contract labor supplied through third-party companies and that it has a global footprint. The firms have created their own networks of call center-like facilities that form a web across the globe and cover a massive array of linguistic, cultural, regional and other competencies and knowledge (although there are inevitable gaps and gaffes).

The distributed nature of the contract commercial content moderation system indeed allows for some degree of redundancy when it comes to issues of natural disaster or other catastrophic events that could take a center, a city or even a region offline. That said, most firms are at capacity when it comes to their screening needs, and the loss of a major site could very well impact quality. That appears to have happened in the last 72 hours, when Metro Manila and, indeed, much of the island upon which it is located, Luzon—a part of the Philippine archipelago that is home to 57 million people—went into quarantine. Reports the Singaporean Straits Times, “Police began closing off access to the Philippines’ sprawling and densely populated capital Manila, a city of some 12 million people, imposing a month-long quarantine that officials hope will curb the nation’s rising number of coronavirus cases.”

The Philippines is also the call center capital of the
world, and competes with India for the vast outsourced business of commercial
content moderation for the so-called Global North. In short, the Philippines is
where social media content goes to be screened.

Eleven days ago, I communicated with a reporter colleague to give my sense of how a virus-related shutdown in the Philippines could affect American social media giants. I told him that while a lot of the most irritating and highest-volume unwanted content (as deemed by the platforms) can be found and removed by automated tools—here I refer to spam, pornographic content, copyright violations, and other already known-bad material—those tools tend to be imperfect and blunt instruments whose interventions can be calibrated to be more sophisticated or to cast a wider net.[ ((See the work of Safiya U. Noble, Ruha Benjamin, Cathy O’Neil, Frank Pasquale, Joan Donovan and others who demonstrate that algorithmic interventions are deeply imbued with and shaped by a host of values, manipulation and bias, following key critiques of the politics of software by Wendy H. K. Chun and of computation by David Golumbia, after the fundamental question posed and answered by Langdon Winner that artifacts, indeed, have politics.))] But the loss of a major moderation site, and the switchover to reliance on these tools that it would entail, would invariably cause disruption in social media’s production chain, and could even lead to quality issues perceived by users.

That appears to be precisely what happened today, as users became frustrated by false positives: cases in which overzealous and undersophisticated AI tools aggressively remove reasonable content because their judgment is too rudimentary. The alternative is also no alternative at all, for if the AI tools were turned off altogether, the result would be an unusable social media platform flooded with unbearable garbage, spam and irrelevant or disturbing content. One moderator interviewed in my book described the internet without workers like him as “a cesspool.”
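
To make the tradeoff described above a bit more concrete, the following is a minimal, purely illustrative sketch (not any platform’s actual system; the classifier, post examples, scores, and threshold values are all invented for illustration) of how calibrating a single confidence threshold on an automated tool trades one kind of error for another: cast a wider net and benign, even newsworthy, posts are swept up; narrow it and more unwanted material slips through.

```python
from dataclasses import dataclass

@dataclass
class Post:
    text: str
    score: float  # hypothetical classifier confidence that the post violates policy

def auto_moderate(posts: list[Post], threshold: float) -> tuple[list[Post], list[Post]]:
    """Split posts into (removed, kept) using only the model's confidence score."""
    removed = [p for p in posts if p.score >= threshold]
    kept = [p for p in posts if p.score < threshold]
    return removed, kept

# Invented examples: one clear violation, one benign post the model is unsure about,
# and one genuinely harmful post the model is also unsure about.
queue = [
    Post("link-farm spam for counterfeit goods", 0.97),
    Post("news report on local COVID-19 case counts", 0.62),
    Post("coordinated harassment of a public figure", 0.58),
]

# A strict threshold removes only the clearest case but misses real abuse;
# a looser one "casts a wider net" and also deletes the newsworthy post.
removed_strict, _ = auto_moderate(queue, threshold=0.90)
removed_loose, _ = auto_moderate(queue, threshold=0.50)
print(len(removed_strict), len(removed_loose))  # 1 vs. 3
```

Either setting produces errors a human reviewer would be unlikely to make, which is precisely the gap the departed moderators had been filling.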

Which, then, is the lesser of two evils: an overpoliced, automated, AI-moderated internet, or a “hole of filth” (as another Silicon Valley-based worker described it) of unbearable human self-expression? Ultimately, the firms will decide for the former, as it is a powerful combination of brand protection and legal mandates (most from outside the United States) that will drive their choice in this matter. I suspect this will be much of the public’s first contact with both the contours of content moderation on their platforms and with the way the virtually overnight disappearance of legions of humans doing this work has led to marked and immediate quality decline.

I return to perhaps the most important question that has been asked about this issue: why can the work not simply be done by workers from home? The answer, like everything about this issue, is complex. In many cases, such work can be, and is, done at home. In the case of AAA social media firms, however, constraints like privacy agreements and data protection policies in various jurisdictions may preclude this. There is also a nontrivial infrastructure that goes into setting up a computing center with the requisite hardware, software (internally developed and maintained systems) and routing of data. The call center locations themselves are often highly secure, with nothing allowed on the floor where workers are logged in. Working from home eliminates the ability to oversee and surveil workers and their practices, both what they are doing and what they are not, to the extent that can be achieved on-site. This alone is possibly a deal-breaker for moving the work home. In a moment of dark humor, one rightly cynical colleague pointed out that this event, while likely wholly unimagined and unplanned, is allowing for a certain amount of stress testing of these tools at scale.

Bringing this work, which consists of the rapid review of thousands of images and videos, many of them psychologically difficult and taxing, into the home may also be too much to ask of workers in a time of crisis. Workers in call centers rely on each other and their teams for support in doing commercial content moderation, and may have access to an on-site or on-call therapist, counselor or other mental health professional.[ ((Even when counselors are available, it is not always the panacea it may seem. Some workers contracted by Accenture discovered that the workplace therapists with whom they presumed they were having private sessions were actually reporting on those sessions to Accenture’s management, according to The Intercept.))] But it is also worth mentioning that many people already do this kind of work at home, whether as contractors or on microtask sites from anywhere in the world.[ ((See this report released just yesterday on the state of microwork in Canada, from the Toronto Workforce Innovation Group (TWIG), or an interview with sociologist Antonio Casilli on microwork in France.))]

Further, infrastructure differences play into the picture locally. For example, the Republic of Ireland, a European tech hub, has widespread penetration of at-home fixed broadband, whereas in the Philippines the story looks different. Here is where we return to the way the firms themselves view the matter of outsourced labor in what we can consider the production chain of social media: as a component in a production cycle characterized by the East-to-West flow of supply-chain logistics for manufactured goods. The model is one of just-in-time, in which all aspects of the process, from putting up a site to hiring in workers to the actual moderation itself, take place as quickly and as “leanly” as possible, particularly for functions such as content moderation that are seen as a “cost center” rather than a “value-add” site of revenue generation.

Just-in-time supply-chain logistics may be under similar strain in other parts of the tech industry and in industries reliant on other types of manufactured products, when we consider the goods’ origin point (frequently East Asia in general and China specifically, particularly for textile, tech and other material goods). Consider the recent shuttering of numerous retail chains (e.g., Apple Stores, Lululemon, Victoria’s Secret) not only as a matter of dwindling clientele or employee safety, but as one that may reflect a significant gap in the availability of goods making their way out of factories and across oceans: “Just how extensive the crisis is can be seen in data released by Resilinc, a supply-chain-mapping and risk-monitoring company, which shows the number of sites of industries located in the quarantined areas of China, South Korea, and Italy, and the number of items sourced from the quarantined regions of China,” reports the Harvard Business Review.

When we consider a social media production chain whose product (user-facing content on a social media site) is less material, perhaps, than an H&M fast-fashion jacket or a pair of Apple AirPods Pro, the essential nature of the presence of humans in that chain is just as apparent as when a production line goes down for a month and no goods leave the factory. Here, where content moderators are both the product (in the form of the cultural and linguistic sense-making ability upon which their labor is frequently valued and sold) and the producer (in the form of the work they undertake), the impact of their loss in the production chain must be considered profound.


Microsourcing, a Manila-based commercial content moderation outsourcing firm
Microsourcing, a Manila-based commercial content moderation outsourcing firm, advertised their labor force as having specialized linguistic and cultural “skills.” In this way, these “skills” were the commodity on offer.

In essence, what is supposed to be a resilient just-in-time chain of goods and services making their way from production to retail may, in fact, be a much more fragile ecosystem in which some aspects of manufacture, parts provision, and/or labor are reliant upon a single supplier, factory, or location. Just as in manufacturing, where a firm may discover that a part is made in only one factory and that the factory’s going offline affects everything downstream, so it is, decidedly, for the fragile ecosystem of outsourced commercial content moderation and its concentration in areas of the world such as the Philippines. The reliance on global networks of human labor is revealing cracks and fissures in a host of supply-chain ecosystems. In the case of the human moderators who screen social media, their absence is likely to give many users a glimpse, quite possibly for the first time, of the digital humanity that goes into crafting a usable and relatively hospitable online place for them to be. Perhaps just when we need those moderators the most, to combat the flood of misinformation, hate speech, and racism now circulating online, inspired by the global pandemic that is COVID-19, they are gone. Will we finally learn to collectively value this aspect of the human infrastructure just a little bit more than not at all?



Image Credits:

  1. Facebook’s announcement on March 16th indicated to many that a new experiment in content moderation was forthcoming.
  2. Professor Siva Vaidhyanathan of UVA expresses frustration with Facebook’s moderation under all-AI, March 17, 2020.
  3. Microsourcing, a Manila-based commercial content moderation outsourcing firm, advertised their labor force as having specialized linguistic and cultural “skills.” In this way, these “skills” were the commodity on offer. Source: Behind the Screen: Content Moderation in the Shadows of Social Media (Yale University Press, 2019)



Section 230 as American Tech’s “Soft Power” Secret Weapon
Sarah T. Roberts / University of California, Los Angeles


The October 16th, 2019 Hearing.


The Hearing

On Wednesday, October 16th, the Subcommittee on Communications and Technology and the Subcommittee on Consumer Protection and Commerce of the Committee on Energy and Commerce convened for a joint hearing in the Rayburn House Office Building, where most such hearings take place.[ ((The official web page for the hearing can be found here, and includes video: https://energycommerce.house.gov/committee-activity/hearings/hearing-on-fostering-a-healthier-internet-to-protect-consumers.))] Largely unremarkable to the general public, particularly against such other high governmental proceedings as the building impeachment case against President Trump then dominating the news, it likely went unnoticed by you, the reader. This is no surprise: despite its expansive and perhaps overly optimistic title, “Fostering a Healthier Internet to Protect Consumers,” the conceit of the event was actually much narrower in scope and of interest to a much more specialist, if not wonkier, crowd. The focus of the hearing was primarily on the specificities of a particular provision of the 1996 Communications Decency Act known as Section 230. Accordingly, those assembled as witnesses before the committee on the morning of October 16th were, to those who follow the barometer of public opinion toward Section 230 (as I do), recognized as a cavalcade of stars—for and against altering or eliminating 230 altogether, which quickly became the theme of the day.[ ((Present at the hearing were the following individuals, with links to their written testimony: Steve Huffman, Co-Founder & CEO of Reddit, Inc. (Testimony); Danielle Keats Citron, Professor of Law, Boston University School of Law (Testimony); Corynne McSherry, Legal Director, Electronic Frontier Foundation (Testimony); Hany Farid, Professor, University of California, Berkeley (Testimony); Katherine Oyama, Global Head of Intellectual Property Policy, Google, Inc. (Testimony); Gretchen S. Peters, Executive Director, Alliance to Counter Crime Online (Testimony).))]

Despite the fact that this event, and others like it (carried on a channel like C-SPAN if it merits enough interest, but more likely accessible primarily via a government live stream and through written testimony), must assuredly constitute what is meant by the phrase “inside baseball” when it is derisively levied against conversation so esoteric as to shut others out, I am using the occasion of my FlowTV column debut to argue that few portions of legislation are so key to the nature of the internet as we currently know it: commercial, opaque, global and controlled by few. If you take me at my word, it would therefore stand to reason that any change to the legislation would undoubtedly yield a shift in the status quo that, depending on where you sit in relation to Section 230, would either free the internet from an American corporatist stranglehold, irrevocably destroy it as it devolved into a useless cesspool of unfettered “free expression,” or…something else. For this reason, what may at first blush appear to be a case of nothing more than D.C. wonk inside baseball is actually incredibly important, and more of us should be engaged in understanding what Section 230 is, what it has been, and what changes to it might mean. For something that most Americans may never have heard of, its power is immense, and it is now being wielded in new ways.


Section 230: A(n Extremely) Brief Overview

In 1996, as Congress hurried to respond to the moral and technological panics being wrought by the wide-scale adoption of internet service by the masses (and the concomitant and probably realistic suspicion that hardcore pornography was its primary engine), it passed the Communications Decency Act of 1996, believing that it was not possible to rely on extant telecommunications legislation to contend with these new media forms. Yet arguably its most enduring legacy (after its indecency mandates later faced, and lost, court challenges) came in the form of an addition to it, Section 230, which, in 1996, gave so-called “internet service providers” (or ISPs) near-total immunity from being held responsible for material transmitted through their services. In the late 20th century, when ISPs were primarily and quite literally banks of modems that one dialed up from home to connect to a larger internet offering a variety of user experiences bounded mostly by technology, first, and imagination, second, this provision made good sense. Few could have conceived of an internet landscape dominated by a handful of powerful American companies that not only transmitted user-generated content, or UGC, but actually solicited it at a scale unimaginable a generation earlier. And so Section 230 became a powerful mechanism by which the internet, dominated by private companies and their new services, and enhanced by those firms’ ingenuity and technological prowess, as well as a cozy relationship with the federal government, exploded. Since that time, the firms themselves and those who support them, as well as a host of lobbying groups from the Electronic Frontier Foundation to the American Civil Liberties Union, have actively and vociferously resisted any encroachment on Section 230 that might change the firms’ immunity, the very thing that has largely afforded them absolute discretion to decide what stays up and what must come down on their platforms. The resulting regime—the policies, practices and people actually implementing that discretion—is what is known as commercial content moderation, itself the subject of constant and polarized debate.[ (( The background and history of Section 230 certainly cannot be done justice here, nor by me, and are therefore best taken up by legal scholars, of which I am not one. I point readers to the work of those who are, including but not limited to: Kate Klonick, Jeff Kosseff, Frank Pasquale, Danielle Citron (herself a witness at the October Congressional hearing), and others to learn more about Section 230 and its relationship to the contemporary internet. If you are interested in reading more about commercial content moderation’s people and practices, the politics inherent in it, and Section 230 as it relates to those things, I humbly recommend my own book, Behind the Screen: Content Moderation in the Shadows of Social Media (Yale University Press 2019).))]


screen grab of the electronic frontier foundation's CDA page
EFF’s feelings about Section 230 are clear.

With this background in mind, it may now seem more critical than ever to understand Section 230 in order to better grasp the contemporary American-dominated commercial internet ecosystem, how it has come to be, and under what logics (legal and otherwise) it is constituted and reconstituted. It also therefore becomes key to understand where the various players invested in debates around the control of the internet sit, particularly with regard to its status quo as relatively underregulated in the American context and to the endurance of an unmolested Section 230. To this end, the October 16th hearing was fascinating.


Section 230 as American Soft Power’s Secret Weapon

On a recent visit to Australia, I had occasion to speak to tech reporter and radio host Ariel Bogle about my work on the commercial content moderation of social media. She asked me, rightly, if the sum total of platforms’ content moderation rules, use policies, engagement in markets, and so on constituted a form of American “soft power”[ ((First coined by Joseph Nye in Foreign Policy: https://www.jstor.org/stable/1148580; an interesting study that offers an expanded description of what might constitute “soft power,” albeit “developed in collaboration with Facebook,” can be found here: https://softpower30.com/what-is-soft-power/))] being meted out around the globe. I agreed, and described how the resulting patchwork of rules, affordances, operations and policies—not to mention the siting of commercial content moderation call centers around the globe—could fairly be considered an example of American hegemony, tightly wrapped up and packaged, then exported seamlessly, in the form of technological platforms that rarely betray the politics embedded in their functionality and affordances, never mind in their prohibitions and constraints.

As I watched the House subcommittee hearing on October 16th and listened to Boston University legal scholar Danielle Citron and Berkeley computer scientist and PhotoDNA creator Hany Farid argue for a sensible rethinking of Section 230, I was fascinated by the pushback. It came, predictably, from EFF Legal Director Corynne McSherry, whose organization has long been resistant to the notion of any tampering with Section 230. And it also came, no surprise, from Google’s representative, Global Head of Intellectual Property Policy Katherine Oyama. And yet, even an old cynic such as myself was surprised by what I heard.

In her statement, Oyama wasted no time in mentioning the positive economic impact that the tech (internet content) industry brings to the United States, with the suggestion, both overt and latent, that it is Section 230 itself that fosters this incredible remunerative benefit—a powerful reminder to lawmakers who get to Congress, and stay there, largely on their ability to deliver a healthy economy for the country.[ ((Indeed, Google’s Ms. Oyama makes this point again on page 2 of her written testimony, saying, “This creativity and innovation continues to yield enormous economic benefits for the United States. Digital platforms help millions of consumers find legitimate content across the internet, facilitating almost $29 trillion USD in online commerce each year. In 2018, the internet sector contributed $2.1 trillion to the U.S. economy and created 6 million jobs.”))]


Google's Katherine Oyama testifies
Google’s Katherine Oyama testifies before a Congressional Subcommittee, October 16th, 2019.

But it was the subsequent back-and-forth questioning of the assembled experts by the Committee that took me by surprise when, at a number of points in the dialog, Oyama made it clear that while Section 230 was under discussion and even ostensibly up for debate and review in this very House panel, Google and others were lobbying powerfully for Section 230-like language to be included in trade agreements with countries like Japan and Mexico—trade agreements that largely remain secret and in which the American public, as well as the citizens of the trading partners, have very little ability to intervene.

This was a fascinating new bit of information for me: outside of a fairly small group of specialists, many legal scholars and policy advocates who work in the social media or tech space have indicated a certain lack of interest in conversations about 230 because, they rightly point out, it is an American statute and therefore pertinent only in the context of the United States. Given that social media firms emanating from the United States are, more and more, subject to jurisdictional demands from outside the country, they have argued to me that the impact of Section 230 is destined to lessen as the demands from elsewhere ramp up.

Yet, as this testimony unequivocally demonstrated, what is underway is not simply a de facto and waning export of the US legal norm but indeed a literal export of it, codified in trade agreements that are developed and ratified far beyond the reach of the citizens who are ultimately subject to them. The soft power of United States-based tech and social media firms therefore goes far beyond what is seen on the user-facing side and, in fact, must now be tracked and traced through their extensive lobbying practices, as they urge representatives of the United States in trade and commerce agreements to force “partners” to adopt Section 230-like measures in their own countries. If this does not constitute the soft power of American tech firms, I am certain I do not know what does.



Image Credits:

  1. The October 16th, 2019 Hearing.
  2. EFF’s feelings about Section 230 are clear. (author’s screengrab from the EFF CDA page)
  3. Google’s Katherine Oyama testifies before a Congressional Subcommittee, October 16th, 2019. (author’s screengrab)

