Perplexity Accused of Unauthorized Website Scraping Again

Cloudflare reports that Perplexity’s web crawlers are using stealth crawling techniques, disguising their identity to get around the restrictions that websites set in robots.txt files and firewall rules. The alleged behavior raises significant concerns about data scraping ethics and compliance with web standards.

The robots.txt file tells web crawlers which parts of a website they are permitted to access. Perplexity’s declared crawlers, PerplexityBot and Perplexity-User, are supposed to follow these directives. However, Cloudflare’s tests found that even when robots.txt disallowed both bots, Perplexity still extracted content from new, unindexed websites. The same happened on sites with specific Web Application Firewall (WAF) rules, which are designed to block web crawlers from accessing their content.
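To make the mechanism concrete, here is a minimal sketch of how a robots.txt rule set is evaluated, using Python's standard `urllib.robotparser`. The robots.txt content, the `is_allowed` helper, and the example.com URL are illustrative assumptions, not taken from Cloudflare's test sites. The point it shows is that the rules are matched against the name a crawler declares for itself, so they only restrain a crawler that identifies itself honestly.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows both declared Perplexity crawlers
# and leaves the site open to every other user agent.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    chrome_like = (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    )
    for agent in ("PerplexityBot", "Perplexity-User", chrome_like):
        print(agent[:20], is_allowed(agent, page))
```

A crawler that announces itself as PerplexityBot or Perplexity-User is refused, while a request carrying a generic browser user agent falls under the wildcard rule and is allowed. Nothing in robots.txt itself enforces the rules; compliance is voluntary.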

A flowchart created by Cloudflare to illustrate the different ways Perplexity's web crawlers try to access the content of a website.

Cloudflare

According to Cloudflare’s analysis, Perplexity appears to get around these measures by using “a generic browser intended to impersonate Google Chrome on macOS” when its declared bots are blocked by robots.txt. Testing also showed that Perplexity’s undeclared crawler rotated through IP addresses outside its official range, which let it slip past firewall rules written against that range. Cloudflare added that the crawler also switches between autonomous system numbers (ASNs), the unique identifiers for networks whose IP addresses are managed by a single entity, and that it observed this behavior “across tens of thousands of domains and millions of requests per day.”
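The following sketch illustrates why rules keyed only on a declared user agent and a published IP range miss this kind of traffic. It is not Cloudflare's detection method. The `classify` function and its labels are hypothetical, and the address ranges are reserved documentation blocks standing in for a crawler's published range, not Perplexity's real addresses.

```python
from ipaddress import ip_address, ip_network

# Hypothetical published ranges (reserved documentation addresses),
# used here only as stand-ins for a crawler's official IP list.
DECLARED_RANGES = [ip_network("192.0.2.0/24"), ip_network("198.51.100.0/24")]
DECLARED_AGENTS = ("perplexitybot", "perplexity-user")


def classify(user_agent: str, source_ip: str) -> str:
    """Label a request using only its declared name and source address."""
    declares_bot = any(name in user_agent.lower() for name in DECLARED_AGENTS)
    in_range = any(ip_address(source_ip) in net for net in DECLARED_RANGES)
    if declares_bot and in_range:
        return "declared crawler"
    if declares_bot:
        return "declared name, unlisted address"
    if in_range:
        return "listed address, undeclared name"
    return "unidentified"


if __name__ == "__main__":
    print(classify("Mozilla/5.0 (compatible; PerplexityBot/1.0)", "192.0.2.10"))
    print(classify("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Chrome/124.0.0.0", "203.0.113.7"))
```

A request with a browser-like user agent from an address outside the listed ranges comes back as "unidentified", so a block rule that targets only the declared name or range never fires on it.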

In response to the findings, Engadget has reached out to Perplexity seeking their perspective on Cloudflare’s claims. This article will be updated should we receive any feedback from the company.

Access to current and accurate information from across the web is essential for companies that train AI models. This is particularly true for services like Perplexity, which aim to serve as alternatives to traditional search engines. Perplexity has been accused of bypassing restrictions before. In 2024, several websites reported that Perplexity continued to access their content despite explicit prohibitions in their robots.txt files. At the time, the company attributed the problem to third-party web crawlers it had employed. Perplexity later established partnerships with various publishers to share revenue from advertisements placed alongside their content, seemingly as a corrective measure for past infractions.

Efforts to stop companies from scraping content from the internet will likely remain a game of whack-a-mole. In the meantime, Cloudflare has removed Perplexity’s bots from its verified bot list and has implemented mechanisms to identify and block Perplexity’s stealth crawler from accessing its customers’ content.

Here you can find the original content; the photos and images used in our article also come from this source. We are not their authors; they have been used solely for informational purposes with proper attribution to their original source.

  • David Bridges

    David Bridges is a media culture writer and social trends observer with over 15 years of experience in analyzing the intersection of entertainment, digital behavior, and public perception. With a background in communication and cultural studies, David blends critical insight with a light, relatable tone that connects with readers interested in celebrities, online narratives, and the ever-evolving world of social media. When he's not tracking internet drama or decoding pop culture signals, David enjoys people-watching in cafés, writing short satire, and pretending to ignore trending hashtags.
