hhmx.de

piperef

· Federation EN Thu 23.01.2025 23:28:17

@clive

Does anyone know whether these web crawlers also scrape the content of Kindle e-books? Can they get into these kinds of products?
Is anything safe from this kind of scraping?
How can we protect internet content against it in general? A standard website will be scraped easily, won't it?

@jasonkoebler @404mediaco

Frank Heijkamp

Federation NL Fri 24.01.2025 00:06:41

@piperef @clive @jasonkoebler @404mediaco Bezos probably already sold everything to those AI houses.

Federation EN Fri 24.01.2025 00:23:59

@piperef @clive @jasonkoebler @404mediaco If it can be read, it can be scraped. You can mitigate the issue (often by putting it behind an account wall), but not eliminate it entirely. The film industry has been desperately trying to stop piracy and I have yet to see a situation where a movie was released but wasn't available on piracy sites.

But also, yeah, if it's Kindle, it's probably already part of Amazon's AI dataset.

Clive Thompson

Federation EN Fri 24.01.2025 00:37:06

@StarkRG @piperef @jasonkoebler @404mediaco

yeah, I am sure this is true

I read recently, though I can't find the source (still looking, will update if I can find it), that US AI firms used corpuses of cracked Western ebooks that circulate in Russia etc. for training

piperef

Federation EN Fri 24.01.2025 09:57:05

@clive

This is so ironic. Definitely plausible, but also scary.

@StarkRG @jasonkoebler @404mediaco