
an incomplete list of fediverse instances scraped by meta to train AI

cyberpunk.lol :pona_plush: #FediPact :pona_plush: (@FediPact@cyberpunk.lol)

# INSTANCES KNOWN TO HAVE BEEN SCRAPED BY META INCLUDE: • mastodon.social • mastodon.online • tech.lgbt • hackers.town • chaos.social • mastodon.org.uk • mastodont.cat • mastodon.de • mastodon.xyz • mastodon.coffee • mastodon.cloud • mastodon.scot • mastodonapp.uk • mastodon.green • m...

26 comments
  • I see that shitposter.club is on the list. Good to know they're using only the highest-quality training material.

    • I tried to visit but their security certificate is expired. Are they still a legit site?

      • Moved to shitposter.world according to their site with the expired cert, but I haven't seen as much on fedi from the new domain as I used to from the old one.

  • This is only a loosely related thought, but are there any new FOSS licenses or anything that prohibit AI usage? I know it'll be ignored, but it feels like explicitly disallowing things could be important in opening the door to successful legal challenges to AI scraping and theft...

    • Case law is still pretty young in this area, but it's looking like training AI on copyrighted content doesn't in itself infringe copyright. That makes it something a license can't restrict: the trainers can simply reject the license and carry on training under what the law already allows them to do.

      Open source licenses only have power because they grant permissions that people normally wouldn't have and put conditions on those permissions. If you don't need those permissions then you don't have to be bound by those conditions.

      • Ahhh, that sucks ass :(

        Thank you for expanding my understanding of the problem!

  • Can we poison our posts by putting a nonsense “signature” at the end of each of them?
