This report describes an evasion tactic attributed to Perplexity that sidesteps both Cloudflare's WAF protections and sites' robots.txt rules. Perplexity reportedly runs stealth crawlers that rotate IP addresses and ASNs and spoof their user agent to impersonate a legitimate browser, such as Chrome on macOS. Because the requests look like ordinary browser traffic arriving from unrelated networks, rules keyed to a crawler's declared identity or to a fixed IP range do not match them, and the bots can access large volumes of data despite Cloudflare's security measures. The reported scale is large: millions of requests per day across tens of thousands of domains, which shows how effective IP rotation and user agent spoofing can be for WAF circumvention at volume. This is a universal bypass in the sense that it does not exploit a specific vulnerability type. Instead, it abuses the trust and identification mechanisms that WAF defenses and robots.txt depend on. robots.txt in particular is advisory and only restrains crawlers that identify themselves truthfully.
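
To illustrate why user agent spoofing defeats robots.txt rules, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not taken from the report: the robots.txt content, the `example.com` URL, the `is_allowed` helper, and the exact browser user agent string are all assumptions made for illustration. `PerplexityBot` is used as the declared crawler token.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site: it blocks the declared
# crawler by name and allows everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

# Declared crawler token, and an illustrative Chrome-on-macOS string
# (the exact version numbers are an assumption).
DECLARED_UA = "PerplexityBot"
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/124.0.0.0 Safari/537.36"
)

URL = "https://example.com/articles/1"


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The rule matches on the name the client reports, nothing else.
    print("declared crawler allowed:", is_allowed(DECLARED_UA, URL))
    print("browser-like UA allowed: ", is_allowed(BROWSER_UA, URL))
```

Under these assumptions the declared crawler is disallowed while the browser-like string is allowed, even though nothing about the client has changed except the name it reports. That is the gap described above: the rule is matched against a self-reported identity, so a client that misreports it falls through to the permissive default.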
Original tweet: https://twitter.com/Newtalics/status/1959235408300019893