So according to this article: dropsitenews.com/p/meta-facebo…

#Meta is scraping the media proxies of mstdn, masto and .coffee..

If this is true, this is very worrying and pisses me off very much

No wonder our media loads so crappily if they are constantly tapping in..

Fuck #Meta to hell

#meta


in reply to stux⚡️

I wonder if adding another line to that stanza, specifically so that everything hitting that rule gets logged to a separate log file and you can harvest all the source IPs out of it, would be handy?

Because I'm sure they will do the Anthropic thing soon, and will start moving their scrapers into various other clouds to get around people blocking their ASN and user agents.

if you can fingerprint their patterns, block them by pattern >😁

Reverse-Cambridge-Analytica them.
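
A rough sketch of the "harvest the source IPs" step, assuming the blocked requests end up in their own access log in the common combined format. The function name `harvest_ips`, the sample lines, and the documentation-range addresses are made up for illustration; nothing here comes from the actual server config.

```python
import re
from collections import Counter

# Assumption: the separate log uses the combined access log format, where
# the client address is the first field of each line.
LINE_RE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+')


def harvest_ips(lines):
    """Count how often each source IP appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # skip anything that does not look like a log entry
            counts[match.group("ip")] += 1
    return counts


# Hypothetical sample entries (addresses are from documentation ranges).
SAMPLE_LINES = [
    '203.0.113.7 - - [09/Aug/2025:19:40:01 +0000] "GET /media_proxy/1 HTTP/1.1" 403 0 "-" "meta-externalagent/1.1"',
    '198.51.100.22 - - [09/Aug/2025:19:40:02 +0000] "GET /media_proxy/2 HTTP/1.1" 403 0 "-" "meta-externalagent/1.1"',
    '203.0.113.7 - - [09/Aug/2025:19:40:05 +0000] "GET /media_proxy/3 HTTP/1.1" 403 0 "-" "meta-externalagent/1.1"',
    'not a log line',
]

if __name__ == "__main__":
    # List the noisiest sources first.
    for ip, hits in harvest_ips(SAMPLE_LINES).most_common():
        print(ip, hits)
```

In practice you would feed it an open file handle for the separate log instead of `SAMPLE_LINES`. The per-IP counts are a starting point for spotting patterns, not a fingerprint by themselves.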

in reply to stux⚡️

I've been looking at The Ultimate Nginx Bad Bot Blocker, I just want to make sure it doesn't include Mastodon due to the "DDOS" link preview issue.

It claims, "The Ultimate Nginx Bad Bot, User-Agent, Spam Referrer Blocker, Adware, Malware and Ransomware Blocker, Clickjacking Blocker, Click Re-Directing Blocker, SEO Companies and Bad IP Blocker with Anti DDOS System, Nginx Rate Limiting and Wordpress Theme Detector Blocking. Stop and Block all kinds of bad internet traffic even Fake Googlebots from ever reaching your web sites. "

github.com/mitchellkrogza/ngin…

This entry was edited (Saturday, August 9, 2025, 7:43 PM)
in reply to stux⚡️

Wouldn't it be more effective to fight fire with fire?

Create a large amount of fake-ish looking content and then serve that up as real content to the scraper, effectively poisoning the AI? For example, impersonate a user post, or serve a fake image with misleading alt text.

This way, they might not catch on to it right away. A 403 means they'll instantly change their methodology because they know they're blocked.
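
A minimal sketch of that idea, assuming the scraper can be recognised by a user-agent token. The token list, the function names, and the decoy body are all hypothetical placeholders; a real scraper may not identify itself this way at all.

```python
# Assumption: these substrings identify the scraper. The list is a
# placeholder for illustration and would need to be checked against real logs.
SCRAPER_TOKENS = ("meta-externalagent",)


def is_suspected_scraper(user_agent):
    """Return True if the user-agent string contains a known scraper token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in SCRAPER_TOKENS)


def choose_response(user_agent, real_body, decoy_body):
    """Pick the body to serve; both cases answer 200 so nothing looks blocked."""
    if is_suspected_scraper(user_agent):
        return 200, decoy_body
    return 200, real_body


if __name__ == "__main__":
    print(choose_response("meta-externalagent/1.1", "real post", "decoy post"))
    print(choose_response("Mozilla/5.0", "real post", "decoy post"))
```

The point of returning 200 in both branches is the one made above: a 403 tells the scraper it has been noticed, while a normal-looking response does not.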

in reply to stux⚡️

Does that mean Fedi people can take part in the class-action lawsuit that could do "immense harm not only to a single AI company, but to the entire fledgling AI industry and to America’s global technological competitiveness," as stated by said industry reps? :think_bread:

arstechnica.com/tech-policy/20…

via @Lazarou
mastodon.social/@Lazarou/11499…

in reply to stux⚡️

techcrunch.com/2025/08/04/perp…

Once you put information out there, it appears corporations and other unscrupulous actors take that as an invitation to use the "free data" for whatever they want. They have weaponized it more than once (Clearview et al.), and I see nothing that prevents them from doing this again. "Public domain", they'll cry.

I've minimized my web presence, but there's nothing to suggest they will stop at what they consider "public domain".
