2w ago

How to keep bots and AI slop off lemmy?

I have noticed that Lemmy so far does not have a lot of fake accounts from bots and AI slop, at least from what I can tell. I am wondering how the heck we keep this community free of that kind of stuff as continuous waves of redditors land here and the platform grows.

EDIT a potential solution:

I have an idea where people can flag a post or a user as a bot, and if it turns out to be a bot, the moderators could have some tool that essentially shadow bans the bot into an inbox that just gets dumped occasionally. My thinking is that people running bots might not realize their bot has been banned, so they would not try to create replacement bots. This could effectively reduce the number of bots without bot creators knowing whether their bots have been blocked. The one other thing needed is a way to request an unban for anyone hit as a false positive. These would have to be built into Lemmy's moderation tools, and I don't know if any of that exists currently.

53 comments

My instance has "Rule 3: No AI Slop. This is a platform for humans to interact" and it's enforced pretty vigorously.
As far as "how":
Sometimes it's obvious. In those cases, the posts are removed and the account behind it investigated. If the account has a pattern of it, they get a one way ticket to Ban City
Sometimes they're not obvious, but the account owner will slip up and admit to it in another post. Found a handful that way, and you guessed it, straight to Ban City.
Sometimes it's difficult on an individual post level unless there are telltale signs. Typically have to look for patterns in different posts by the same account and account for writing styles. This is more difficult / time consuming, but I've caught a few this way (and let some slide that were likely AI generated but not close enough to the threshold to ban).
I hate the consumer AI crap (it has its place, but in every consumer product is not one of them), but sometimes if I'm desperate, I'll try to get one of them to generate a post similar to the one I'm evaluating. If it comes back very close, I'll assume the post I'm evaluating was AI-generated and remove it while looking at other content by that user, changing their account status to Nina Ban Horn if appropriate.
If an account has a high frequency of posts that seems inorganic, the Eye of Sauron will be upon them.
User reports are extremely helpful as well
I've even banned accounts that post legit news articles but use AI to summarize the article in the post body; that violates rule 3 (no AI slop) and Rule 6 (misinformation) since AI has no place near the news.
If you haven't noticed, this process is quite tedious and absolutely cannot scale under a small team. My suggestion: if something seems AI generated, do the legwork yourself (as described above) and report them; be as descriptive in the report as possible to save the mod/admin quite a bit of work.
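For the "high frequency of posts" check above, here is a minimal sketch of how a mod tool might surface bursty accounts for a human to look at. The one-hour window and the threshold of 20 posts are made-up assumptions, not anything Lemmy or this instance is known to use, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def max_posts_in_window(timestamps, window=timedelta(hours=1)):
    """Return the largest number of posts that fall inside any single window."""
    times = sorted(timestamps)
    best = 0
    start = 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most `window`.
        while times[end] - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


def looks_inorganic(timestamps, window=timedelta(hours=1), threshold=20):
    """Flag an account for manual review; the threshold is an assumed value."""
    return max_posts_in_window(timestamps, window) >= threshold
```

A flag here would only mean "worth a look", since a human on a posting spree can trip it too.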
- That's interesting. I suppose everyone has their own moderation style. I am not 100% opposed to all AI. I define AI slop more as really low-effort posts and bulk posts: a person who is just posting all AI-generated content and cross-posting to tons of communities. Basically AI spam, I guess you could say. If someone were to, say, generate an AI image and make a post talking about the prompt they used and maybe sharing what they like about the image, and then commenters make derivatives or share their own results using a similar prompt, I could see that sort of post being useful. Maybe there is a balance... but at the same time I can see that some people might prefer an instance that takes more of a hard-line stance.
- Seems to me like we could build a bot for this…🤔
  
  I refuse to reward the "Create problem, sell solution" business model these tech companies are employing lol.
- Maybe you could train an AI to help you with the tedious work of banning AI? Lol
- Sometimes it’s difficult on an individual post level unless there are telltale signs. Typically have to look for patterns in different posts by the same account and account for writing styles.
  The problem is that this is only going to get harder. First of all, AI is going to get better and be able to produce more natural sounding stuff.
  But also, people will inevitably get affected by AI as well and people will drift towards sounding more like AI too. So both AI and humans will converge on each other and they'll likely be impossible to tell apart in general in not too many years.
  I'm not sure how we solve this tbh.
  
  But also, people will inevitably get affected by AI as well and people will drift towards sounding more like AI too.
  The “AI checkers” that schools/unis use have shown a strong correlation between neurodiversity and sounding like AI. Basically, AI sounds autistic, so autistic people get flagged as AI.
- That instance bans people for nothing, and has some automated ban sync system in place. It’s crazy.
  
  We're not a general purpose instance, we have a defined mission statement, and the site info clearly states the rules apply to local and federated accounts. 🤷‍♂️ And the ban syncs are no longer needed as later versions of Lemmy server do the same thing automatically (our automod just implemented something almost identical prior to Lemmy adding that natively).

What I find more annoying than bots is real people copy/pasting their comments from ChatGPT prompts because they can't be arsed to formulate/organize their own thoughts. It is just aggressively wasting both their and my time. Mindboggling.

Re: bots
If feasible, I think the best option would be an instance that functions similarly to how Reddit’s now defunct r/BotDefense operated, and instances which want to filter out bots would federate with that. Essentially, if there is an account that is suspected of being a bot, users could submit that account to this bot defense server, and an automated system would flag obvious bots, whereas less obvious bots would have to be inspected manually by informed admins/mods of the server. This flagging would signal to the federated servers to ban these suspected/confirmed bot accounts. Edit 1: This instance would also be able to flag when a particular server is being overrun by bots and advise other servers to temporarily defederate.
If you are hosting a Lemmy instance, I suggest requiring new accounts to provide an email address and pass a captcha. I’m not informed enough with the security side of things to suggest more, but https://lemmy.world/c/selfhosted or the admins of large instances may be able to provide more insight for security.
Edit 2: If possible, an improved search function for Lemmy, or cross-media content in general, would be helpful. Since this medium still has a relatively small userbase, most bot and spam content is lifted from other sites. Being able to track where bots’ content is coming from is extremely helpful to conclude that there is no human curating their posts. This is why I’m wary of seemingly real users on Lemmy who do binge spam memes or other non-OC. Being able to search for a string of text, search for image sources/matching images, being able to search for strings of text within an image, and being able to find original texts that a bot has rephrased are on my wishlist.
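The "search for a string of text" part of that wishlist could be roughly sketched as word-shingle overlap (Jaccard similarity) between a suspect post and a candidate source. This is only an illustration under stated assumptions: the shingle size of 3 and the function names are hypothetical, it only catches lifted or lightly edited text rather than a thorough rephrase, and it does nothing for images.

```python
import re


def shingles(text, size=3):
    """Split text into a set of overlapping word n-grams (assumed size 3)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Too short for a full shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Similarity between 0.0 (no shared shingles) and 1.0 (identical sets)."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

A high score against content from another site would be a hint that a post was lifted, not proof that nobody is curating the account.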
Re: AI content
AFAIK, the best option is just to have instance/community rules against it if you’re concerned about it.
The best defense against both is education and critical examination of what you see online.
- If you are hosting a Lemmy instance, I suggest requiring new accounts to provide an email address and pass a captcha
  Those are easy to bypass (or a human can spin up a bunch with throwaway emails and plug them into bots). I recommend enabling registration applications. While not foolproof, it gives the admins eyes on every new account. Also, consider denying any application that uses a throwaway email service.
- If you are hosting a Lemmy instance, I suggest requiring new accounts to provide an email address and pass a captcha.
  The captchas are ridiculously ineffective and anyone can get dummy emails. Registration applications is the only way to go.
  
  Plenty of websites filter out dummy email generators, could do the same in addition to applications. Making a drawing of something specific, but random (think of a list of a dozen or two images gen-ai gets wrong) could be a captcha replacement.
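
Filtering throwaway email services, as suggested in the replies above, could look something like this when reviewing registration applications. This is a sketch, not a Lemmy feature: the blocklist entries are placeholder domains, the function names are hypothetical, and a real list would have to come from a maintained source.

```python
# Placeholder entries only; a real deployment would load a maintained list.
PLACEHOLDER_BLOCKLIST = {"throwaway.example", "tempmail.example"}


def email_domain(address):
    """Return the lowercased domain of an address, or None if it has none."""
    _, sep, domain = address.strip().rpartition("@")
    if not sep or not domain:
        return None
    return domain.lower().rstrip(".")


def uses_throwaway_email(address, blocklist=PLACEHOLDER_BLOCKLIST):
    """True if the domain, or any parent domain, is on the blocklist."""
    domain = email_domain(address)
    if domain is None:
        return False  # malformed input is left to normal validation
    labels = domain.split(".")
    # Check "a.b.c", then "b.c", then "c" so subdomains are caught too.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))
```

This only raises the cost of signing up a little, so it works best alongside applications rather than instead of them.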

I don't think there's really a solution to this.
Everyone is so fixated on getting more users but honestly I don't think that will make it a better experience.
- Growth for growth’s sake is the destruction of many good things.
  Keep Lemmy Obscure!
- I kind of agree. It seems like there is some point at which it's ideal and then after it grows to a certain size things become unhinged.
  
  It would be nice to see some additional interests and communities.
- To me it would be worth it for Lemmy to get somewhat Eternal September'd if it meant Reddit being destroyed/replaced with something that isn't a company.
  
  I respect your opinion, and can see some benefit to reddit's demise, but I think I'm too cynical and jaded to hold that belief.
  It looks like bluesky will be twitter's replacement, and it's not clear that bluesky will be better.
  If reddit implodes there's not really any likelihood that refugees will seek out lemmy.
  That said, at least lemmy is self hostable and federated. If the larger lemmy network did shit itself there would be smaller instances which are not federated with the majority of other servers so potentially they might be somewhat sheltered from bots and trolls.

Just ask them to draw images of full glasses of wine.
- here's my captcha, hope I pass 🙏
  
  Same

While shadow banning is an option, it's also a terrible idea because of how it will eventually get used.
Just look at how Reddit uses it today.
- I am getting shadow banned constantly just by existing in a developing country. let's not do this with Lemmy because otherwise I'm fucked
- It would only succeed in filtering really low effort bots anyway, because it's really easy to programmatically check if you are shadowbanned. Someone who is trying to ban evade professionally is going to be way better equipped to figure it out than normal users.

I've noticed that a lot of AI slop exists in dedicated instances, so the first answer is to just block those. Dunno if blocking whole instances also blocks cross-posts from said instance, but it's the first step in the journey of 1000 miles.
Bots tend to have flair stating they're bots, which makes them easier to block. And since lemmy preaches open-source like gospel, you can probably write some optional code for specific lemmy clients that auto blocks that stuff for you
- You can block properly flaired bots in your user settings.

Keeping bots and AI-generated content off Lemmy (an open-source, federated social media platform) can be a challenge, but here are some effective strategies:
Enable CAPTCHA Verification: Require users to solve CAPTCHAs during account creation and posting. This helps filter out basic bots.
User Verification: Consider account age or karma-based posting restrictions. New users could be limited until they engage authentically.
Moderation Tools: Use Lemmy’s moderation features to block and report suspicious users. Regularly update blocklists.
Rate Limiting & Throttling: Limit post and comment frequency for new or unverified users. This makes spammy behavior harder.
AI Detection Tools: Implement tools that analyze post content for AI-generated patterns. Some models can flag or reject obvious bot posts.
Community Guidelines & Reporting: Establish clear rules against AI spam and encourage users to report suspicious content.
Manual Approvals: For smaller communities, manually approving new members or first posts can be effective.
Federation Controls: Choose which instances to federate with. Blocking or limiting interactions with known spammy instances helps.
Machine Learning Models: Deploy spam-detection models that can analyze behavior and content patterns over time.
Regular Audits: Periodically review community activity for trends and emerging threats.
Do you run a Lemmy instance, or are you just looking to keep your community clean from AI-generated spam?
- I see what you did there
  
  Yep, realized in the first sentence and had a laugh
- Lmao

By not having a corporate owner who wants the site to appear more active

In short: you don't.

The only true solution to this is cryptographically signed identities.
One method is identity verification tied to a public key, which can be done with claims aggregation (I am X on GitHub, and Y on LinkedIn, and Z on my national ID, etc), but this removes anonymous use.
Another is a central resource to verify a user's key is a real human, where only one entity controls the identity verification. While this allows pseudo-anonymous use, it also requires everyone to trust one individual entity, and that has other risks.
We've been discussing this with FedID a lot, lately.

What stops banned people from creating a new account and continuing?
- Little at this point. That's bound to change eventually. Just ask Nicole from Toronto.
  
  I smoked dabs with Nicole from Toronto

A shadow ban doesn't do anything because the people running the bot could just create a script to check if a comment is visible from another account (or logged out). And if it isn't visible, they'll know there's a shadowban.
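
To show how little effort that check takes, here is a rough sketch of the comparison. It deliberately does not call any real Lemmy API: `fetch_public_ids` is a hypothetical callable standing in for "whatever a logged-out viewer can see", and the 0.9 ratio is an assumed cutoff.

```python
def hidden_comment_ids(own_comment_ids, fetch_public_ids):
    """Return the account's comment ids that a logged-out viewer cannot see."""
    visible = set(fetch_public_ids())
    return sorted(set(own_comment_ids) - visible)


def probably_shadowbanned(own_comment_ids, fetch_public_ids, min_hidden_ratio=0.9):
    """Guess at a shadowban when nearly all of the account's comments are hidden."""
    own = set(own_comment_ids)
    if not own:
        return False  # nothing posted, nothing to compare
    hidden = hidden_comment_ids(own, fetch_public_ids)
    return len(hidden) / len(own) >= min_hidden_ratio
```

A set difference is all it takes, which is the point: the check is trivial for a bot operator and invisible to the mods.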

I don't think it really answers your question. But I have been blocking every AI comm that comes up on my feed. Except for c/fuckai

Cunningham's law helps. You can make a stand-alone website that's slop and hope an individual user doesn't notice the hallucinations, but on Lemmy people can reply and someone's going to raise the alarm.

I hate memes and images so I don't look at any of them on this platform so I don't know what you're talking about. You're welcome

I was going to make a joke by posting an obviously stupid and AI generated answer, but I'm genuinely surprised by the insightfulness of Gemini's answer:
Dealing with bots and AI-generated "slop" (low-quality, repetitive, or irrelevant content) is an ongoing challenge for any online platform, including Lemmy. Here's a breakdown of strategies and considerations:
Challenges:
Federation Complexity:
Lemmy's federated nature means content comes from various instances, making centralized control difficult.
Moderation efforts need to be distributed, which can lead to inconsistencies.
AI Advancement:
AI-generated content is becoming increasingly sophisticated, making it harder to distinguish from human-created posts.
Bots can mimic human behavior, evading simple detection methods.
Balancing Freedom and Control:
Lemmy values user autonomy and free speech, which can conflict with aggressive moderation.
Overly strict measures might stifle legitimate contributions.
Strategies:
Instance-Level Moderation:
Individual Lemmy instances play a crucial role in setting their own moderation policies.
Instance administrators can implement tools and rules to filter out unwanted content.
Blocking known bot-infested instances is a common practice.
Community-Driven Moderation:
Empowering users to report suspicious activity and content.
Implementing robust reporting mechanisms and clear guidelines.
Utilizing community moderators to review and take action on reports.
Technical Solutions:
Developing algorithms to detect patterns of bot behavior and AI-generated content.
Implementing CAPTCHAs or other verification methods to prevent automated accounts.
Enhancing Lemmy's software to provide better moderation tools.
User Filtering:
Lemmy itself has user level settings that allow for the hiding of bots. As shown in the search results, there are settings that can hide bot activity.
Community Education:
Raising awareness about the dangers of bots and AI slop.
Educating users on how to identify and report suspicious content.
Development of Lemmy Software:
As shown in the github issue, there is development discussion regarding excluding bot comments from sorting algorithms. This type of software development will be key to combating bot influence on the platform.
Key Considerations:
The fight against bots and AI slop is an ongoing process.
A multi-faceted approach is necessary, combining technical solutions, community involvement, and responsible moderation.
Maintaining a balance between control and freedom is essential for the long-term health of the Lemmy network.
- I was going to make a joke by posting an obviously stupid and AI generated answer
  Non-joke slop is even worse.
- I don't mind you generating an AI answer to gain insight. However it would be better if you took the reply and then gave your own view on it rather than just reposting it.
