Last week, it was reported that Reddit had decided to block access to its data from search engines unless they pay a fee to do so, including Microsoft"s Bing. So far, only Google"s search engine has been confirmed as paying Reddit for the rights to access its data.
In a post on X earlier this week, Microsoft"s head of search, Jordi Ribas, pointed out that the company "provided all publishers including Reddit with webmaster crawling controls in September 2023". Even with this feature, Ribas stated in a follow-up post that Reddit decided to block Bing from its data anyway "favoring another search engine and impacting competition from Bing and Bing-powered engines."
Despite this, Reddit has blocked Bing from crawling their site for search, favoring another search engine and impacting competition from Bing and Bing-powered engines.
— Jordi Ribas (@JordiRib1) July 29, 2024
Today, as part of a new interview on The Verge, Reddit CEO Steve Huffman offered his side of the story. He claims that Microsoft had already been taking data from Reddit and using it to train its AI service, along with summarizing its content in the Bing search engine "without telling us".
Huffman added that two more AI companies, Anthropic and Perplexity, were also training their systems via Reddit"s data. He stated:
We’ve had Microsoft, Anthropic, and Perplexity act as though all of the content on the internet is free for them to use . . . That’s their real position.
Indeed, Microsoft AI CEO Mustafa Suleyman recently said in a separate interview that in terms of using data for AI, " Anyone can copy it, recreate with it, reproduce with it. That has been “freeware,” that’s been the understanding."
Huffman stated it was "a real pain in the ass to block these companies". However, he does feel that the idea of a search engine taking content from a site and reusing it without some kind of compensation is changing, adding "he value exchange of crawling in exchange for traffic back is becoming muddied.”