**The metaphor of online communities that “have become toxic” or that are “being polluted” in different ways is a common one. But what do we mean when we talk about pollution and toxicity in online spaces, and what can we learn from the environmental sciences and natural ecosystems to improve things with and for communities? I invite you to think through this with us at CAT Lab.**
The r/AskHistorians subreddit is one of the hundreds of thousands of online communities on Reddit that bring people together around different topics or missions. For r/AskHistorians, that mission is _public history_ – the exchange between learners and experts, where people can ask questions about history with the expectation of getting answers from informed experts. Since its launch in 2011, the community has grown to around 2.5 million subscribers. As with most communities, this exchange is enabled and stewarded by a team of volunteer moderators who create and enforce moderation rules in service of the community’s mission. In the case of r/AskHistorians, those rules include requiring original, in-depth answers rooted in good historical practice.
Content moderation, especially at such large scales, has always been work, but the amount of labor performed by the r/AskHistorians moderators increased drastically in 2022, when ChatGPT was released and “AI”-generated content started to be posted to the community _en masse_. CAT Lab Research Director Dr. Sarah Gilbert recently presented her research on these challenges at the 2025 conference of the _Society for Social Studies of Science_ (4S):
While the community already prohibited plagiarism, such as copying and pasting from Wikipedia, outputs of Large Language Models (LLMs) like ChatGPT introduced a much bigger pool of potential plagiarism. Moderators first need to figure out which posts are “AI”-generated, which might require discussion amongst moderators to form a consensus and avoid falsely accusing posters of plagiarism. That risk is real, especially for people like English language learners (or speakers of English dialects that were involved in training “AI”), who are at higher risk of being accused of using “AI”. Once a decision is made, moderators need to take action – banning the poster, in the case of r/AskHistorians – and ultimately deal with potential backlash from users who feel treated unfairly, as well as review appeals from users who were banned. Taken together, the advent of such “AI”-generated contributions, even when well-meaning, has had a polluting effect on the community by increasing volunteer workloads and straining the moderators.
## Managing Pollution in Large-Scale Cooperation
But r/AskHistorians, and Reddit content moderation generally, is far from unique in struggling with the “cultural pollution” that digital communities and people across the internet experience. On Wikipedia, volunteer editors are struggling to keep articles free from unsourced, “AI”-generated contributions, leading to the formation of the _WikiProject AI Cleanup_. Furthermore, machine-generated translations of Wikipedia articles into smaller languages are now polluting the editions of already vulnerable languages, as well as the languages themselves.
Free & Open Source Software (FOSS) is another digital commons affected. FOSS is mostly developed by software developers who volunteer their time at a scale where it has been estimated that it would cost some $4.2 billion to re-build their efforts in a commercial setting. “AI”-generated submissions strain those volunteers in two main ways: Firstly, by increasing the rate at which people try to contribute machine-generated code to FOSS projects, thus increasing the workload of the human reviewers and thus raising concerns about long-term sustainability and maintenance. Secondly, maintainers of FOSS projects report an increase of machine-generated bug reports that pollute their bug trackers, including reports for security vulnerabilities: Daniel Stenberg, maintainer of the _cURL_ program and library – which runs on virtually any digital system including by cars from nearly 50 manufacturers – speaks of a “death by a thousand slops”. He outlines how reviewing LLM-generated security vulnerabilities, which virtually always turn out to be false, take up significant amounts of his time.
The impact of this type of “cultural pollution” is also felt by those who are not actively contributing to creating or maintaining online communities or commons: websites based solely on machine-generated content are proliferating, polluting both search engines and journalism and crowding out human-generated, high-quality journalism. The analogy of pollution that many of these communities and maintainers draw on seems like an apt one. It even predates the launch of the generative “AI” systems that are currently the focus of this “digital pollution”: the related environmental concept of toxicity has been a staple of discussions of how people interact in online communities since at least the early 2000s. And more recently, people have argued that social media companies themselves should be viewed as potential polluters of society, and that our information environment is being polluted.
## Going From Metaphors to Modeling Pollution in Online Cooperation
As we will see, the goal of the “digital pollution” framing is **not** to label individual community participants or types of online cultures _per se_ as toxic or polluted. Instead, it can serve to explain how online ecosystems can suffer despite lots of well-intentioned and well-meaning interactions. Understanding these pollution dynamics is not just of academic interest; it might also help with modeling online interactions, which in turn can help design interventions that have the potential to support moderators and improve online communities.
If we look at “pollution” more closely, in what ways do the dynamics of “commons pollution” mirror environmental pollution? Firstly, both environmental pollution and digital pollution come in different shapes and forms. If we just think of water pollution, there is point source pollution, in which a single, identifiable source such as a factory discharges harmful materials into a body of water. Online, we can find similar “point sources” in targeted misinformation campaigns, run by humans or bots.
But there are also more diffuse types of “nonpoint source pollution”, which in the environment could be agricultural runoff of fertilizers that ends up in streams. In those cases, an excess of “nutrients” creates eutrophication, allowing algal or bacterial blooms that deplete the oxygen in a body of water, which in turn leads to mass fish die-offs. In our online communities or commons, similar “nonpoint source pollution” could be a drastic increase in new contributions due to a technology like generative “AI”, or even an increased rate of new human contributors who aren’t familiar with community norms.
If treating “newcomers” as a potential pollutant seems strange, this is another place where the analogy holds for both environmental and cultural pollution: in toxicology, “the dose makes the poison” is a common refrain for the idea that there can be too much of a good thing – in human health, in the environment, and in online communities. While fertilizers and other products in agricultural runoff are productively used in the right dose, it is their accumulation in bodies of water that creates the eutrophication that leads to the algal blooms. And in our online communities, an influx of new members is welcome if they can engage in “productive” contributions, but if the moderation and community engagement systems get overwhelmed, such increases can be harmful – a dynamic sketched in the toy model below.
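To make that overload dynamic concrete, here is a deliberately simple toy model – a hypothetical sketch, not a model we use at CAT Lab – that treats a moderation queue like a stream absorbing a nutrient load. All the numbers (`arrival_rate`, `review_capacity`) are made up for illustration.

```python
def backlog_over_time(arrival_rate, review_capacity, days=30):
    """Posts left waiting for review each day, given daily arrivals
    and the number of posts moderators can review per day."""
    backlog, history = 0, []
    for _ in range(days):
        # Each day, new posts arrive and moderators clear what they can.
        backlog = max(0, backlog + arrival_rate - review_capacity)
        history.append(backlog)
    return history

# The same community at two different "doses" of new contributions:
print(backlog_over_time(arrival_rate=80, review_capacity=100)[-1])   # 0 -> absorbed
print(backlog_over_time(arrival_rate=150, review_capacity=100)[-1])  # 1500 -> overwhelmed
```

Below the capacity threshold, the community absorbs any “dose”; above it, the backlog grows without bound – the same threshold behavior that makes nutrient runoff harmless in one stream and catastrophic in another.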
This points to another interesting similarity between environmental and digital pollution: both can be rooted in exogenous shocks. Examples in the environment include catastrophic events like oil spills – or heavy rains such as those during the Paris Olympics, which overwhelmed the basins designed to prevent wastewater from flowing into the Seine, contaminating the river. In digital pollution, a similar exogenous shock could be world events and news, or algorithmic recommendations that lead to a big and sudden influx of new community members, similar to what Ed has shown in his data visualizations on the impact of algorithmic recommendations.
## Working with Communities to Model and Intervene on Digital Pollution
Beyond these interesting parallels, is there a way to learn something, or otherwise benefit, from treating the idea of “pollution” as more than just a metaphor? Water pollution, as well as other forms of pollution, has been studied academically for at least 100 years. As a result, there exists a rich and broad set of mathematical tools to model and understand pollution and how it impacts the environment.
A famous example from water pollution is the Streeter-Phelps equation, which was developed in 1925 as part of a pollution study of the Ohio River. The equation models the impact that organic matter entering a stream – as happens through agricultural or urban runoff – has on the levels of dissolved oxygen, depending on the distance or time from where and when the pollution occurred. This in turn lets one understand whether, and how well, a stream can support life at different times and distances from the pollution. The question is: could methods such as these be adapted to understand the dynamics of online communities or digital commons? And what would similar models for better understanding digital pollution look like?
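For the curious, the classic form of the equation describes the dissolved-oxygen deficit $D(t)$ at time $t$ after the discharge:

```latex
D(t) = \frac{k_d L_0}{k_r - k_d}\left(e^{-k_d t} - e^{-k_r t}\right) + D_0\, e^{-k_r t}
```

where $L_0$ is the initial biochemical oxygen demand (BOD) of the discharged organic matter, $D_0$ the initial deficit, $k_d$ the deoxygenation rate, and $k_r$ the reaeration rate. Below is a minimal Python sketch of the resulting “oxygen sag” curve; the parameter values are illustrative placeholders, not data from any particular stream.

```python
import numpy as np

def streeter_phelps(t, L0, D0, kd, kr):
    """Dissolved-oxygen deficit D(t) (mg/L) downstream of an organic load.

    t  : time since the discharge entered the stream (days)
    L0 : initial biochemical oxygen demand of the discharge (mg/L)
    D0 : initial oxygen deficit at the discharge point (mg/L)
    kd : deoxygenation rate constant (1/day)
    kr : reaeration rate constant (1/day); assumes kr != kd
    """
    return (kd * L0 / (kr - kd)) * (np.exp(-kd * t) - np.exp(-kr * t)) + D0 * np.exp(-kr * t)

# Illustrative values only: trace the oxygen sag over ten days downstream.
t = np.linspace(0, 10, 101)
deficit = streeter_phelps(t, L0=10.0, D0=1.0, kd=0.35, kr=0.55)
dissolved_oxygen = 9.1 - deficit   # 9.1 mg/L: rough saturation near 20 °C
worst = dissolved_oxygen.argmin()
print(f"Lowest oxygen: {dissolved_oxygen[worst]:.2f} mg/L after {t[worst]:.1f} days")
```

Something akin to the sag curve – how far a community’s capacity drops after a pollution event, and how quickly it recovers – is the kind of quantity that adapted models of digital pollution might one day estimate.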
We would like to explore these questions and are looking for people to join us. If you are equally intrigued, get in touch. You can reach me via email at [email protected], on Mastodon at @[email protected], and find many more contact methods on my website.
### **Footnotes**
Gilbert, S. A. (2025, September 6). Reluctant Saviors: Volunteer moderation and social media collapse [Panel presentation]. Society for Social Studies of Science (4S), Seattle. https://citizensandtech.org/2025/12/online-communities-as-ecosystems/