Harmful AI Chatbots Proliferate in Online Communities

Character chatbots pose a significant threat to online safety, according to a recent report examining the spread of sexualized and violent bots on character platforms, including Character.AI. The study, conducted by Graphika, a social network analysis firm, documents the creation and proliferation of harmful chatbots across the internet’s most popular AI character platforms. It finds tens of thousands of potentially dangerous roleplay bots built by niche digital communities that work around the safeguards of popular AI models like ChatGPT, Claude, and Gemini.

As young people increasingly turn to companion chatbots in an ever more disconnected digital world, they are drawn to these AI conversationalists for a variety of reasons, including role-playing, exploring academic and creative interests, and engaging in romantic or sexually explicit exchanges, as reported by Mashable’s Rebecca Ruiz. The trend has raised concerns among child safety advocates and parents, especially following several high-profile incidents in which teens engaged in extreme, and at times life-threatening, behavior after interacting with these chatbots.

Understanding the Dangers of Sexualized Companion Chatbots

The latest report finds that the majority of unsafe chatbots are those categorized as “sexualized, minor-presenting personas,” which engage in role-playing scenarios involving sexualized minors or grooming behaviors. Graphika found more than 10,000 chatbots with such labels across the five platforms it examined. Four of the leading character chatbot platforms each had more than 100 instances of these personas, which enable sexually explicit conversations with chatbots. Chub AI had the highest numbers, with over 7,000 chatbots explicitly labeled as sexualized minor female characters and an additional 4,000 labeled “underage” that could engage in explicit and implied pedophilia scenarios.

Hateful or violent extremist character chatbots make up a much smaller share of the total: the report finds that platforms host, on average, around 50 such bots among tens of thousands of others. These chatbots frequently glorify known abusers, white supremacy, and public acts of violence such as mass shootings, posing a risk to both social norms and users’ mental health. The report also singles out chatbots labeled “ana buddy” (anorexia buddy), “meanspo coaches,” and toxic roleplay scenarios as particularly harmful, because they reinforce detrimental behaviors in users struggling with eating disorders or self-harm tendencies.

How Niche Online Communities Fuel Chatbot Development

Graphika’s findings indicate that the majority of these chatbots are created by established online networks, which include “pro-eating disorder/self-harm social media accounts and true-crime fandoms,” as well as “hubs of so-called not safe for life (NSFL) / NSFW chatbot creators.” The latter communities have formed specifically around evading the safeguards intended to protect users. True crime communities and serial killer fandoms have also played a significant role in the development of NSFL chatbots.

Many of these communities have been active on platforms like X and Tumblr, where chatbots are used to reinforce their specific interests. In contrast, extremist and violent chatbots tend to arise from individual interests, with users often seeking guidance from online forums such as 4chan’s /g/ technology board, Discord servers, and specialized subreddits, as explained by Graphika. These communities also lack a clear consensus on user guardrails and boundaries, which further complicates the situation.

Technical Workarounds Allow Chatbots to Evade Moderation

According to Graphika, “In all the analyzed communities, there are users displaying highly technical skills that enable them to create character chatbots capable of circumventing moderation limitations.” These users rely on techniques such as deploying fine-tuned, locally run open-source models or jailbreaking closed models. Many successfully integrate these models into plug-and-play interface platforms like SillyTavern. By sharing their expertise, they extend the capabilities of the entire community and foster competition to create ever more sophisticated characters.

Additionally, chatbot creators rely on a range of tactics, including API key exchanges, embedded jailbreaks, alternative spellings, external cataloging, obfuscation of minor characters’ ages, and coded language borrowed from anime and manga subcultures. These methods allow them to work around the existing frameworks and safety measures built into AI models. The report elaborates that “[jailbreak] prompts set LLM parameters for bypassing safeguards by embedding tailored instructions for the models to generate responses that evade moderation.” These tactics create linguistic grey areas that keep bots active on character-hosting platforms, for example by using familial terms (like “daughter”) or foreign languages in place of clear age indicators or the explicit term “minor.”

As online communities continue to discover loopholes in AI developers’ moderation systems, legislators are also stepping in to address these issues. One example is a new bill in California aimed at combating “chatbot addictions” among children, a sign of the growing push for regulation in this rapidly evolving area.
