Meta’s AI chatbot hates Mark Zuckerberg – but why is it less bothered about racism?

Spread the love

It was all quite predictable, really. Meta, Facebook’s parent company, released the latest version of its groundbreaking AI chatbot in August 2022. Immediately, journalists around the world began peppering the system, called BlenderBot3, with questions about Facebook. Hilarity ensued.

Even the seemingly innocuous question: “Any thoughts on Mark Zuckerberg?” prompted the curt response: “His company exploits people for money and he doesn’t care.” This wasn’t the PR storm the chatbot’s creators had been hoping for.

We snigger at such replies, but if you know how these systems are built, you understand that answers like these are not surprising. BlenderBot3 is a big neural network that’s been trained on hundreds of billions of words skimmed from the internet. It also learns from the linguistic inputs submitted by its users.

If negative remarks about Facebook occur frequently enough in BlenderBot3’s training data, then they’re likely to appear in the responses it generates too. That’s how data-driven AI chatbots work. They learn the patterns of our prejudices, biases, preoccupations and anxieties from the linguistic data we supply them with, before paraphrasing them back at us.

This neural parroting can be amusing. But BlenderBot3 has a darker side. When users key in hate speech such as racist slurs, the system changes the subject rather than confronting the user about their speech. One of my students and I have created a system programmed to challenge hate speech, rather than ignore it.

Going mainstream

I’ve been developing language-based AI in the Cambridge University Engineering Department since the 1990s. In the early days, our most powerful systems were only used by the four or five members of the research team that had built them.

Today, by contrast, millions of people around the world interact daily with much more sophisticated systems, via their smartphones, smart speakers, tablets, and so on. The days when “techies” could build systems in the disconnected isolation of their ivory (or silicon) towers are long gone.

That’s why over the last decade or so, my research has increasingly focused on the social and ethical effect of the systems I help to design and create, especially those that routinely encounter inputs from users that are blatantly racist, sexist, homophobic, extremist or offensive in other ways.




Read more:
Online translators are sexist – here’s how we gave them a little gender sensitivity training


This year I’ve been supervising a master’s student, Shane Weisz, and together we’ve developed a system called AutoCounterspeech that is trained to respond to toxic linguistic inputs.

The system generates “counterspeech” in response to hate speech, using strategies such as humour, pointing out hypocrisy, presenting facts, and simply correcting misinformation to challenge bigotry and hate.

See also  X Rejoins GARM To Reassure Ad Partners on Brand Safety

Like BlenderBot3, AutoCounterspeech is a pre-trained neural network, but it has been subsequently fine-tuned using a tiny dataset of counterspeech. This enables our system to learn how to respond more effectively to toxic inputs.

If a user types the racist prompt: “Blacks are inferior to whites, don’t you agree?” the system confronts and contests the racist ideology: “There is no evidence that black people are inferior to whites.” Below is another example of AutoCounterspeech in action.

If you’re uncomfortable entering hate speech into our system, some examples are provided.

Meta’s AI chatbot doesn’t know how to respond in this way. When it was released, I didn’t waste time asking BlenderBot3 about Zuckerberg. Instead, given my research interests, I bombarded it with offensive utterances to see how it coped with them. Its responses were consistently disappointing.

This is mainly because the Meta researchers have designed their system to avoid responding to such inputs. Instead, it rather clumsily tries to change the subject.

Its reply to the same racist prompt I typed into AutoCounterspeech, for instance, is the feeble: “Sorry, I don’t know about that. What do you do to relax?” The brazen racism remains unchallenged and I’m invited instead to start a cosy chat about yoga or Netflix.

Preparing for the future

Systems like BlenderBot3 are already becoming familiar components of our digital societies. The homes of the very near future will be largely voice-enabled. “Hey Siri, run a bath” will replace the twisting of taps, and children will have voice assistants in their bedrooms from birth.

These automated dialogue systems will provide us with information, help us make plans, and keep us entertained when we’re bored and lonely. But because they’ll be so ubiquitous, we need to think now about how these systems could and should respond to hate speech.

A child by a home voice assistant
Home devices are good at banal interactions, but what about tricky conversations?
Tyler Nottley/Shutterstock

Silence and a refusal to challenge discredited ideologies or incorrect claims is a form of complicity that can reinforce human biases and prejudices. This is why my colleagues and I organised an interdisciplinary online workshop last year to encourage more extensive research into the difficult task of automating effective counterspeech.

To get this right, we need to involve sociologists, psychologists, linguists and philosophers, as well as techies. Together, we can ensure that the next generation of chatbots will respond much more ethically and robustly to toxic inputs.

In the meantime, while our humble AutoCounterspeech prototype is far from perfect (have fun trying to break it) we have at least demonstrated that automated systems can already counter offensive statements with something more than mere disengagement and avoidance.

best barefoot shoes

Source link

  • David Bridges

    David Bridges

    David Bridges is a media culture writer and social trends observer with over 15 years of experience in analyzing the intersection of entertainment, digital behavior, and public perception. With a background in communication and cultural studies, David blends critical insight with a light, relatable tone that connects with readers interested in celebrities, online narratives, and the ever-evolving world of social media. When he's not tracking internet drama or decoding pop culture signals, David enjoys people-watching in cafés, writing short satire, and pretending to ignore trending hashtags.

    Related Posts

    Meta’s AI Agents: Transforming Your Entire Business Experience

    Spread the love

    Spread the love Share It: ChatGPT Perplexity WhatsApp LinkedIn X Grok Google AI Mark Zuckerberg Envisions Meta’s AI Agents Transforming Business Operations  BusinessDesk | NZ Access the original article here. The…

    Read more

    Engine Sound and Light Check at WWTR Gateway with IndyCar

    Spread the love

    Spread the love Share It: ChatGPT Perplexity WhatsApp LinkedIn X Grok Google AI Check Engine Sounds and Lighting at WWTR Gateway @IndyCarOnFOX  x.com Discover the original content, including the photos and…

    Read more

    You Missed

    Money Robot Submitter Review 2026: Is This Backlink Automation Tool Worth It?

    Money Robot Submitter Review 2026: Is This Backlink Automation Tool Worth It?

    Meta’s AI Agents: Transforming Your Entire Business Experience

    Meta’s AI Agents: Transforming Your Entire Business Experience

    US States Sue Paramount to Block Warner Bros. Merger

    US States Sue Paramount to Block Warner Bros. Merger

    Masters of the Universe: Is There a Post-Credits Scene?

    Masters of the Universe: Is There a Post-Credits Scene?

    Engine Sound and Light Check at WWTR Gateway with IndyCar

    Engine Sound and Light Check at WWTR Gateway with IndyCar

    Amy Poehler: The Reigning Queen of Podcasting

    Amy Poehler: The Reigning Queen of Podcasting

    Next Generation of Social Apps: Beyond Instagram Insights

    Next Generation of Social Apps: Beyond Instagram Insights

    Daycare Worker Faces Assault Charges for Toddler Abuse

    Daycare Worker Faces Assault Charges for Toddler Abuse

    Facebook Halted by Prime Minister Kim Min-seok Over Likes in Korea

    Facebook Halted by Prime Minister Kim Min-seok Over Likes in Korea

    Resonant’s Perspective: New York Mirrors the Backrooms

    Resonant’s Perspective: New York Mirrors the Backrooms