
 




 

 



Internet & Media




Websites Accuse AI Startup Anthropic Of Bypassing Their Anti-Scraping Rules And Protocol



 


July 29th, 2024 | 00:58 AM

ENGADGET

 

iFixit and Freelancer said Anthropic's bot aggressively crawled their websites.

 

Freelancer has accused Anthropic, the AI startup behind the Claude large language models, of ignoring its "do not crawl" robots.txt protocol to scrape its websites' data. Meanwhile, iFixit CEO Kyle Wiens said Anthropic has ignored the website's policy prohibiting the use of its content for AI model training. Matt Barrie, the chief executive of Freelancer, told The Information that Anthropic's ClaudeBot is "the most aggressive scraper by far." His website allegedly got 3.5 million visits from the company's crawler within a span of four hours, which is "probably about five times the volume of the number two" AI crawler. Similarly, Wiens posted on X/Twitter that Anthropic's bot hit iFixit's servers a million times in 24 hours. "You're not only taking our content without paying, you're tying up our devops resources," he wrote.
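For a rough sense of scale, the reported figures can be converted into average request rates. The short Python sketch below does only that arithmetic. It treats each reported visit or hit as a single request and assumes the traffic was spread evenly over each window, which real crawler traffic rarely is. The function name is purely illustrative.

```python
def average_requests_per_second(total_requests, window_hours):
    """Average rate, assuming requests were spread evenly over the window."""
    return total_requests / (window_hours * 3600)


# Figures as reported in the article: 3.5 million visits in four hours
# (Freelancer) and one million hits in 24 hours (iFixit).
freelancer_rate = average_requests_per_second(3_500_000, 4)
ifixit_rate = average_requests_per_second(1_000_000, 24)

print(f"Freelancer: about {freelancer_rate:.0f} requests per second")
print(f"iFixit: about {ifixit_rate:.0f} requests per second")
```

Under those assumptions, Freelancer's figure works out to roughly 243 requests per second on average and iFixit's to roughly 12.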

 

Back in June, Wired accused another AI company, Perplexity, of crawling its website despite the presence of the Robots Exclusion Protocol, or robots.txt. A robots.txt file typically contains instructions for web crawlers on which pages they can and can't access. Compliance is voluntary, but historically it has mostly been ignored only by bad bots. After Wired's piece came out, TollBit, a startup that connects AI firms with content publishers, reported that Perplexity isn't the only one bypassing robots.txt signals. While TollBit didn't name names, Business Insider said it learned that OpenAI and Anthropic were ignoring the protocol as well.
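To illustrate how a well-behaved crawler might consult these instructions, here is a minimal sketch using `urllib.robotparser` from Python's standard library. The robots.txt contents, the `ExampleBot` name and the example.com URLs are made up for illustration and are not taken from any site mentioned here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = may_fetch(ROBOTS_TXT, "ExampleBot", "https://example.com/private/page")
allowed = may_fetch(ROBOTS_TXT, "ExampleBot", "https://example.com/guides/page")
print(blocked, allowed)
```

Nothing in the file enforces these rules. The check only has an effect if the crawler chooses to run it before each request, which is what makes compliance voluntary.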

 

Barrie said Freelancer tried to refuse the bot's access requests at first, but it ultimately had to block Anthropic's crawler entirely. "This is egregious scraping [which] makes the site slower for everyone operating on it and ultimately affects our revenue," he added. As for iFixit, Wiens said the website has set alarms for high traffic, and his people got woken up at 3AM due to Anthropic's activities. The company's crawler stopped scraping iFixit after the site added a line to its robots.txt file that specifically disallows Anthropic's bot.
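A rule aimed at a single crawler might look like the hypothetical file in the sketch below, which again uses Python's standard `urllib.robotparser`. The exact line iFixit added is not quoted in the article, so the `ClaudeBot` token is an assumption based on the crawler name reported above, and `OtherBot` and the example.com URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out one named crawler and leaves
# every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/guide/123"
claudebot_allowed = parser.can_fetch("ClaudeBot", url)
otherbot_allowed = parser.can_fetch("OtherBot", url)
print(claudebot_allowed, otherbot_allowed)
```

The named crawler is refused everywhere on the site, while any other user agent falls through to the wildcard rule.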

 

The AI startup told The Information that it respects robots.txt and that its crawler "respected that signal when iFixit implemented it." It also said that it aims "for minimal disruption by being thoughtful about how quickly [it crawls] the same domains," which is why it's now investigating the case.
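One common way for a crawler to limit how quickly it returns to the same domain is to enforce a minimum interval between requests per host. The sketch below shows that general idea only. It is not a description of how Anthropic's crawler works, and the class name, the two-second interval and the URLs are all assumptions made for illustration.

```python
from urllib.parse import urlparse


class DomainThrottle:
    """Tracks, per domain, the earliest time the next request may start."""

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self._next_allowed = {}

    def delay_for(self, url, now):
        """Return how many seconds to wait before fetching url at time now."""
        domain = urlparse(url).netloc
        ready_at = self._next_allowed.get(domain, now)
        start = max(now, ready_at)
        # Reserve the slot so the following request to this domain waits too.
        self._next_allowed[domain] = start + self.min_interval_s
        return start - now


throttle = DomainThrottle(min_interval_s=2.0)
first = throttle.delay_for("https://example.com/a", now=0.0)
second = throttle.delay_for("https://example.com/b", now=0.5)
other = throttle.delay_for("https://example.org/x", now=0.5)
print(first, second, other)
```

Here the second request to the same domain is held back until the two-second interval has passed, while a request to a different domain goes out immediately.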

 

AI firms use crawlers to collect content from websites that they can use to train their generative AI technologies. They've been the target of multiple lawsuits as a result, with publishers accusing them of copyright infringement. To prevent more lawsuits from being filed, companies like OpenAI have been striking deals with publishers and websites. OpenAI's content partners, so far, include News Corp, Vox Media, the Financial Times and Reddit. iFixit's Wiens seems open to the idea of signing a deal for the how-to-repair website's articles as well, telling Anthropic in a tweet that he's willing to have a conversation about licensing content for commercial use.

 

 


 

Source:
courtesy of ENGADGET

by Mariella Moon

 


 
