
Mastodon thinks Lemmy’s privacy stinks. What say you?

Federated services have always had privacy issues, and I expected Lemmy to have the fewest, but it's visibly worse for privacy than even Reddit.

  • Deleted comments remain on the server, hidden from non-admins only, and the username remains visible
  • Deleted account usernames remain visible too
  • Anything remains visible on federated servers!
  • When you delete your account, media does not get deleted on any server
413 comments
  • First - we're all using alpha/beta software (Lemmy is 0.17.4, Kbin is 0.10.). None of these services are "production quality" software yet, so let's keep that in our minds - we're all early adopters.

    The points mentioned in the OP are a bad look, naturally. Users should be able to expect their data to be deleted on request, especially since such a request might be a regulatory privacy request (GDPR related). It's a clear failure of the software and should be improved and iterated upon.

    The expectation shouldn't be "oh well it's on the Internet, live with it". While Facebook might keep mining your data after deletion request, our software shouldn't behave like that, we should strive to be better with this stuff.

    And finally, ensuring privacy in a federated system is hard. Mastodon suffers from the same problems. We shouldn't give up on the idea though.

    • It is early-stage software and such things can be worked out, you're right. But on the other hand, such basic elements should be based on a thorough concept before a single line is coded, and implementing something like a delete button with "Let's just make it delete the most visible stuff for now, we can always improve that later when there is time" is a recipe for disaster.

    • The more important part for privacy: Mail address is optional, and IP addresses are not stored in the database. A correctly configured instance (at least for EU legislation) also will not log IP addresses in the web server - with that you can have profiles that can't be tied to an actual human, and you don't have location and movement data.

      The data deletion is pretty much a nice to have - it's on the level of the Exchange feature to recall Emails: Sure, you can ask nicely, but outside of your own server pretty much nobody will care. Lemmy is federated over multiple jurisdictions, so even with full deletion implemented there'll almost certainly be instances which will ignore the deletion request - and it will be completely legal for them to do so. More important is education about what you publish, and a basic understanding of the technical and legal realities you'll have to deal with if you later decide you want that information gone.

      I already had that discussion with my 6 year old when she wanted to publish some videos - and she understood the problems quite well.

      • but outside of your own server pretty much nobody will care. Lemmy is federated over multiple jurisdictions, so even with full deletion implemented there’ll almost certainly be instances which will ignore the deletion request - and it will be completely legal for them to do so

        Lemmy also seems to federate your matrix_user_id, which is clearly personal data. It does not matter how the data gets to the federated server; this is still user data within the scope of the GDPR. It does not matter that that server does not have an agreement with the user; an instance that ignored a GDPR-related deletion request would be in direct violation of the GDPR. Maybe it can do that without consequences, though.

        I completely understand that making Lemmy fully GDPR compliant will probably be impossible, however I don't like the approach of "we will not succeed, so we don't make any attempt". Instances should actually delete data when that is requested, or instance hosts can get fined. For now, Lemmy has bigger issues to solve, but eventually they should make at least a best-effort attempt to respect user data.

    • But is it solvable at all in principle? The only enforcement policy available is defederation, but that just means future posts won't go to that instance, the older posts will still be there. Plus an instance could just lie when confirming delete requests and you'd never know unless the non-deleted posts leaked.

      • Not really, same as email. Once you send it out and it's on somebody else's server, you can request they delete it but that's about it. They have a copy of your message and can do whatever they want with that.

        This is not a principle that needs solving imo, it's the nature of Internet. If you post it online then you should know that there's a chance it'll be there permanently.

      • Hmm, it's an interesting problem. I'm afraid you are right and there's really nothing left but defederation. On the other hand, it's the same as with stuff like the parsers that could show deleted Reddit messages, or things like the Wayback Machine, which basically do the same, so the core logic of the base Lemmy source should be as privacy-respecting as possible.

        I remember reading about Signal a few years ago and learning that there is some way to verify that their server is running the same code as the one published (and audited heavily), so you can be 100% sure that there were no modifications. Wouldn't something like that be a solution? That would prevent servers from modifying the code that deletes data. I don't know how it works, and I couldn't find it when I tried looking for it again, but assuming such a thing is possible, each Lemmy instance could just have a verify widget on their VCS and you could be sure that this instance really does delete your data, since they didn't modify the deletion code.

        But this is just theorycrafting; I don't really have enough experience to create something like that and I can imagine that it's not an easy thing. But if anyone knows more details about the way Signal verification works, assuming I didn't just misunderstand something (since it's literally a memory I have of a single sentence from one random article when I was researching the best private messaging app), I would love to read more about the way it works!

        But yeah, outside of that, I'm afraid that the following set of features is mutually exclusive:

        • A user is able to delete their data, and it's guaranteed that it is deleted from everywhere.
        • If a Lemmy instance dies, its data is not lost.
        • There is not a single centralized authority for anything.

        Another option would be to create some kind of reputation system, where self-hosted bots could check for servers that still provide posts and comments that should be deleted, and flag offenders. But that's overengineering anyway, and as I've already said, there's still no way to stop a scraper or anyone else from simply copying your data when they see it.
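
        For what it's worth, the self-hosted checker bot could be sketched roughly like this. Everything here is hypothetical: the `flag_offenders` name, the instance URLs, and the assumption that a plain HTTP status is enough to tell whether a post is still being served. The fetcher is passed in as a function so the sketch runs without touching the network.

```python
from typing import Callable, Iterable, List, Optional


def flag_offenders(
    instances: Iterable[str],
    deleted_path: str,
    fetch_status: Callable[[str], Optional[int]],
) -> List[str]:
    """Return the instances that still serve a post that should be gone.

    fetch_status is any callable that maps a URL to an HTTP status code
    (or None if the URL could not be reached).
    """
    offenders = []
    for base_url in instances:
        url = base_url.rstrip("/") + deleted_path
        # Assumption: status 200 means the post is still publicly served.
        if fetch_status(url) == 200:
            offenders.append(base_url)
    return offenders


# Fake fetcher standing in for real HTTP requests (hypothetical data).
FAKE_WEB = {
    "https://a.example/post/1": 404,
    "https://b.example/post/1": 200,
}

print(flag_offenders(["https://a.example", "https://b.example/"], "/post/1", FAKE_WEB.get))
```

        Note that a 404 only shows the post is no longer publicly served, not that it was actually erased, so a bot like this could flag offenders but never certify compliance.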

  • So, I was born in the late 90's - I don't know if they still have "computer literacy" as a core course in schools these days, but they did when I was going through K-12 (or, well K-9.. once you were in high school they assumed you knew the basics of how to use a computer, and had more advanced courses).

    One of the very first things we learned about the internet is that once you put something on the internet, there is no way to take it back. At the time, uploading pictures to the "cloud" and such wasn't really a thing so we learnt this by using email: Once you've sent an email to someone, you cannot "unsend" it. You can kindly ask the other party to delete the copy of the email without opening it, but you cannot guarantee that the email wasn't saved on another computer, or saved somewhere else along the route between your computer and the receiver's computer. Clicking the send button was taught to us as "etching your letter into stone".

    Because of this, I've always (or at least, as far as I can remember) made sure that anything I put on the internet, or even "put into digital form" (such as even writing something in a file on your computer - you can recover deleted files from a hard drive unless you really put in the effort to actually erase them... there is a huge difference between erasing a file, and marking it as "deleted") is something that I'm okay with being tied to me forever. I'm sure if you looked hard enough, you could find me participating on message boards as a young teenager - and to that I just say "Oh well". Is some of it probably very cringe-inducing and embarrassing? I have no doubt.

    (This is also why you should take extreme caution when talking about say, your friend, on the internet - if you post something about them on the internet, you're condemning them to this same exact thing)

    Now funnily enough, as far as I understand the ActivityPub protocol, it is for all intents and purposes the exact same as email in this regard. Once you've sent something, there are no "take backs". All you can do is kindly ask others to delete their copy, and that comes with zero guarantees. If I had a mastodon server, and someone deletes their toot - I could take down my server and my server would never receive that delete request. Or, just simply change the source code of the Mastodon instance on my server to straight up ignore deletion requests.
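
    To make that concrete, here is a toy sketch of the flow. The `Delete` activity and `Tombstone` object types come from the ActivityStreams vocabulary; the `ToyInstance` class, its `honours_deletes` flag, and the example URLs are made up for illustration and are not how Mastodon or Lemmy are actually implemented.

```python
from typing import Dict


def make_delete_activity(actor: str, object_id: str) -> dict:
    """Build a Delete activity asking receivers to drop object_id."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Delete",
        "actor": actor,
        "object": {"id": object_id, "type": "Tombstone"},
    }


class ToyInstance:
    """Hypothetical remote server that may or may not honour deletes."""

    def __init__(self, honours_deletes: bool) -> None:
        self.honours_deletes = honours_deletes
        self.store: Dict[str, str] = {}

    def receive_create(self, object_id: str, content: str) -> None:
        self.store[object_id] = content

    def receive_delete(self, activity: dict) -> None:
        if activity.get("type") != "Delete":
            return
        obj = activity["object"]
        # The object may be embedded or referenced by its id only.
        object_id = obj["id"] if isinstance(obj, dict) else obj
        if self.honours_deletes:
            self.store.pop(object_id, None)
        # A patched server simply does nothing here.


toot = "https://example.social/users/me/statuses/1"
polite = ToyInstance(honours_deletes=True)
patched = ToyInstance(honours_deletes=False)
for server in (polite, patched):
    server.receive_create(toot, "something I regret")
    server.receive_delete(make_delete_activity("https://example.social/users/me", toot))

print(toot in polite.store, toot in patched.store)
```

    The sender gets the same result from both servers' point of view: the request went out. Only the server operator knows which branch ran.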

    Would it be nice for Lemmy to have a way to actually delete your content? Sure. But that's not technically feasible, and personally (as controversial as it may seem) I would rather Lemmy not try to give you the false sense that everything was completely gone forever. I'm not saying that you shouldn't be able to delete your account off a Lemmy instance, but it shouldn't come with an option that says "Check here to remove your data/media from all federated instances" because Lemmy/no one can promise that, and I really hate it when software (or really anyone/anything) attempts to make a promise in bad-faith knowing that they can't possibly ever uphold it.

    Anyone who thinks Reddit is "better" than Lemmy in this regard probably doesn't realize that Reddit is making a claim they can't keep. The most obvious example of this is all of these subreddits that have gone dark: you can bring up most of their posts on the Wayback Machine or Google Cache. That would be the case regardless of whether they were set to private, or even if they were just straight up "deleted".

    We really should not be setting the belief for people that there exists a way to completely nuke a piece of data off the internet, because you cannot make a guarantee of that being the case.

  • The illusion of privacy in Mastodon (or social media in general)

    There's a reason why when you go to "private mentions" on Mastodon, this appears:

    Yes, we should be able to delete our content if we want, but it's a bit naive to think there could be true privacy in any decentralised social media platform.

    There's a reason why one of the things people tell you when you come to the fediverse is not to share personal and sensitive information.

    The only decentralised social media that has some level of privacy is Matrix, and that's why it has its own protocol and only federates within/between its own servers.

  • This demonstrates a fundamental misunderstanding of digital privacy. You can never be guaranteed that data is deleted, just like you can never be guaranteed that someone has "forgotten" something. It doesn't matter what any entity claims they are doing under the hood, you have to assume they can't be trusted. That's not an expectation you can have, and not something privacy advocates are asking for.

    I'm posting this comment publicly, and there's nothing stopping any random user (or non-user) from scraping this lemmy instance and archiving the data themselves. I know that when I post it. Same for reddit, raddle, any mastodon instance, etc. I can copy the text and usernames of everyone involved in that raddle thread and do whatever I want with it, there's nothing anyone can do to stop me.

    To think otherwise reminds me of that first day on the internet kid meme. "I deleted my comments off of their servers, hah, they'll never get them now!"

    What I can demand is: if I send a message directly to another party, I want to be able to verify that that party and ONLY that party can read the message (end-to-end encryption). I can also demand that they not require me to dox myself to them, that they not run weird js-based fingerprinting/port scanning processes on my system/network, and that I am allowed to connect to their services through a VPN should I so choose.

  • i mean raddle is a site that has an anti doctor post pinned in the mental health community ... like c'mon I and many others need medicine to survive and you are encouraging anti-psychiatrist posting, Church of Scientology levels of anti-medicalist posting

    • That's fucking ghoulish.

      — someone who has to do that shit in order to have a stable life where I don't want to end it all on a daily basis

  • Damn, Raddle seems worse than Reddit when it comes to toxic attitudes. I never looked much into it since it's just another centralized platform like Reddit with different management, but boy oh boy are those comments just awful. Great community you folks got over there 😬

  • Unlike on Instagram or Facebook, on Lemmy or Mastodon you can create an anonymous account. Yes it will be logged (normal public internet), but you won't be traceable. The UI doesn't have any tracking scripts, and many instances don't even require an email to sign up. Use the Tor browser to spoof your IP.

    • There are certainly ways to manage your privacy in how you use this service, and it's different in a lot of ways from other services out there. Users should be educated on the risks against different types of threat models:

      • In what ways can my comments be linked to my real world identity, through correlation to my username, registered email address/phone number/Matrix ID/other identifier, by other users of this service?
      • In what ways can my comments and activity be linked to my real world identity by site administrators or other privileged users of the service (through access to things like server logs, trackers, etc.)?
      • How can I control what activity I consider to be public or private on this service, and who can view that activity I prefer to be considered private?

      Even with end to end encryption (which Lemmy does not have for DMs), the most secure protocol is only as secure as the other end you don't control. People can and will screenshot, save, log, or simply remember what you've sent them before.

      Lemmy and ActivityPub are new services and protocols to a lot of people. The shortcuts they have internalized on what is or isn't true about privacy of other services (Facebook, Instagram, TikTok, Snapchat, Reddit, plain old email, cell phones, WhatsApp, iMessage/Facetime, etc.) need to be re-learned for these specific services.

      New users should understand that the Lemmy/ActivityPub protocols on deletion or privacy of DMs don't necessarily work like other services they're used to. And we should encourage robust discussion around these things until they become common knowledge.

  • It’s no different than me sending an email to someone and then sending a request to delete it. There likely is still a copy on the email provider’s server and the recipient could have potentially backed up their emails to something outside of the email ecosystem.

    Unfortunately the only way to be absolutely sure that there isn’t information you don’t want on the internet is to not share it at all. There will always be an issue of making sure every system actually deletes content when you request it. Like I said, that doesn’t stop anyone from backing up the data to another system. (E.g. Reddit archives from 2005 to now are available to download, even content that has already been deleted)

    • Honestly, I kinda question how good of a time investment it is to try and allow deletion from the public facing parts of the internet, given the numerous places where your content will be cached or otherwise stored.

      There is certainly some value in simply making it as hard as possible to find things you want to delete. Why let perfect be the enemy of good, after all. There are plenty of types of content we certainly want to do our best at deleting even if we can't be perfect. E.g., do you wanna be the one to tell a revenge porn victim, "sorry, we can't make it harder to find the content that harms you because we can't delete all of it anyway"?

      But at the same time, development time is limited. Everything is a trade off. We do have to decide what is most important, because we can't do it all immediately. The fact we can't actually delete everything does have to be a factor in this prioritization, too.

      There is something to be said for ensuring people know and understand that nothing can truly be 100% deleted once it's posted on the internet. Not that Lemmy is doing well at that, either (especially since deleted comments apparently lie about being deleted).

      All this said, I do think federated, reliable deletion is critical for illegal content. Such content needs to be removed quickly and easily from as many places as possible. Without this, instance owners are put at considerable legal risk. This risk poses a threat to the scalability of the Fediverse.

      • Oh I wish we had the ability to fully delete our content that we’ve posted or that someone has posted of us. Illegal content is a huge concern with federation. As soon as someone pushes something like that, it gets sent to all the federated instances so they have a copy as well. That is a huge concern for instance owners (and honestly the fediverse as a whole).

        I run a kbin instance and I’m a software developer for my day job. I honestly don’t have a great answer for “how do we ensure the data we request be deleted on the fediverse is actually deleted.” My best solution would be for us to have several federated master databases that we maintain our federated content with. If there is a big delete flag for some content then the child instances will follow suit.

  • I didn't know anything about Raddle besides the name until now. But gosh, is that a needlessly toxic pit. There's a poor guy there getting completely beaten up by an admin and some others who seem to be enjoying their time-wasting public bullying. Oh well...

  • Mastodon's privacy issues are just the same as the rest of the fediverse/threadiverse.

    With federation there is more openness, transparency and accountability. Take care of your privacy, use alts.

  • I find all the "privacy isn't possible on the clearnet, lol" comments quite troubling. Yes, the internet doesn't forget and we should always behave on the internet as if our moms could read it.

    But that kind of "privacy realism" fosters an attitude that doesn't care about privacy at all, no matter how it could be improved (even if it's never perfect). Just because anyone on the street can follow me home and therefore can find my home address, I'm not carrying a sign with my address when going to a protest.

    According to this comment, privacy is worse than with mastodon. And while data always can be scraped, it still isn't too much to ask to properly federate deletions.

    Yes, the internet is a public place and reddit is bad and you might not like raddle, but come on, people. Have you all given up on improving things already? And do only tech-savvy people with the knowledge and resources to run their own servers have a right to privacy on the internet?

    • I think you are conflating some things.

      The analogy you used doesn't quite work, because you are not telling everyone at the protest where you live. A more accurate analogy might be you going to a protest, loudly saying something which you later regret, and then ask everyone to just forget about it and delete any footage you might be on. Some might comply, but many won't, and you won't have any idea who didn't.

      Furthermore, "people with the knowledge and resources to run their own servers" would be no more safe than you are, because other servers (instances) will still record whatever they post out there. If I make my own federated server and send out a comment, other instances that federate with mine will receive a copy of it. At that point I can ask them to delete it; however, even if they do comply, there is no guarantee that another user hasn't made a local backup of the comment or just screenshotted it.

      At the end of the day, tech isn't magic. Everything has limitations, and you can't do everything at once. You can't have a system that allows you to make public comments that go out to several servers where they are shown to thousands or millions of people, and at the same time expect to be able to delete all of it when you feel like it. Tech can't do everything, and at some point we need to take agency and accept responsibility for what we put out there.

      Finally, I'll add on what another user said:

      Opposite to Instagram or Facebook, on Lemmy or Mastodon you can create an anonymous account. Yes it will be logged (normal public internet), but you won’t be traceable. The UI doesn’t have any tracking scripts, and many instances don’t require an email even to sign up. Use the Tor browser to spoof your IP.

  • Anyone who has open discussions on the Internet and thinks they're somehow private is a fool. Short of end to end encrypted chat I'm not sure what they expect.

  • After reading some more comments, I think I came up with a good analogy to explain this issue, and I wanted to share.

    Think of websites like a bar that also has an open mic.

    Now, when I go to a bar, I don't want to have to give the bouncers and staff my full name as well as my address. I also wouldn't want them to know that I just came, for example, from a store where I was looking for a vacuum, and then have them warn a vacuum seller about it. A vacuum seller who is then going to sit next to me, while I'm trying to have a drink, and show me a pamphlet regarding the "amazing vacuum" he has for sale.

    Ideally, I can also look for a bar that will allow me to come in costumed and not show my face. Or I could ask the bar to delete footage of me at some point, and to not store my ID if I do have to show it to a bouncer at the entrance.

    All of that is relatively feasible and within the realm of reason; and all of that are things that privacy advocates might advocate for.

    However, what is not feasible, or within the realm of reason, or what privacy advocates tend to advocate for, is the ability for me to willingly go up on stage, say something on the mic which I immediately regret, and then ask everyone present to forget it ever happened and delete any footage they might have of it. No reasonable person would ask for something like that, because it is not a reasonable request.

    That is how regular websites work. With federated websites, that becomes enhanced; it's like if the bar you're in has a camera pointed at the microphone, and transmits both video and audio directly into several other bars. So when you go up to that mic, you better make sure you're okay with what you are saying being made public and available to anyone.

    • Allow me to pick your example apart a bit.

      However, what is not feasible, or within the realm of reason, or what privacy advocates tend to advocate for, is the ability for me to willingly go up on stage, say something on the mic which I immediately regret, and then ask everyone present to forget it ever happened and delete any footage they might have of it. No reasonable person would ask for something like that, because it is not a reasonable request.

      That's not what is demanded. No one demands that the audience (users) forget what I said (the comment), much less: immediately. No one is asking for mind-erasing power or the ability to remove screenshots from other people's client devices.

      With federated websites, that becomes enhanced; it's like if the bar you're in has a camera pointed at the microphone, and transmits both video and audio directly into several other bars.

      Now, that is where the actual demands come into play: As you pointed out, it is reasonable to demand that the bar deletes any recording of what I said on stage. But the way the footage is shared with the other bars can be regulated via a protocol. In your analogy, it's like the other bars copy tapes from the original bar and show them at their place. Now, implementing a procedure of "delete that tape, please" is not impossible. In fact, it already works on Mastodon. If a bar doesn't comply, it simply won't get any tapes from the other bars (it gets defederated).

      AFAIK, there is already such a feature planned on github. Which is great. But that is exactly the reason why these things need to be brought up and "privacy realism" is counterproductive.

      • That’s not what is demanded. No one demands that the audience (users) forget what I said (the comment), much less: immediately. No one is asking for mind-erasing power or the ability to remove screenshots from other people’s client devices.

        Well, that's why it is an analogy; the forgetting is equivalent to erasing from someone else's storage. You have no real control over it. Other people can say they deleted it, but you can't know that. And that is what is being demanded - right now I can already "delete" my comments and Beehaw will indicate to other instances that it was deleted, but it can't control whether they do it, and it has no way to know if they really deleted something or just hid it from public view.
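
        A toy illustration of that last point, with a made-up `RemoteInstance` class (the names and the `purge_on_delete` switch are hypothetical, not anything from Lemmy's code): one server really purges the row, the other only flips a hidden flag, and the public view is identical for both.

```python
from typing import Dict, Optional


class RemoteInstance:
    """Hypothetical federated server storing copies of remote comments."""

    def __init__(self, purge_on_delete: bool) -> None:
        self.purge_on_delete = purge_on_delete
        self._rows: Dict[str, dict] = {}

    def store(self, comment_id: str, text: str) -> None:
        self._rows[comment_id] = {"text": text, "hidden": False}

    def handle_delete(self, comment_id: str) -> None:
        if comment_id not in self._rows:
            return
        if self.purge_on_delete:
            del self._rows[comment_id]  # data is really gone
        else:
            self._rows[comment_id]["hidden"] = True  # soft delete only

    def public_view(self, comment_id: str) -> Optional[str]:
        """What any outside user or other instance can observe."""
        row = self._rows.get(comment_id)
        if row is None or row["hidden"]:
            return None
        return row["text"]

    def admin_view(self, comment_id: str) -> Optional[str]:
        """What the operator of this server can still read."""
        row = self._rows.get(comment_id)
        return None if row is None else row["text"]


purging = RemoteInstance(purge_on_delete=True)
hiding = RemoteInstance(purge_on_delete=False)
for server in (purging, hiding):
    server.store("c1", "my comment")
    server.handle_delete("c1")

print(purging.public_view("c1"), hiding.public_view("c1"))
print(purging.admin_view("c1"), hiding.admin_view("c1"))
```

        From the outside both servers answer the same way, so the originating instance has nothing to check against.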

        Differentiating between a client and a provider becomes extra tricky when you remember everyone can start up their own instance and still be essentially just a client - and, I think this is also worth mentioning, people can create their own backends that also federate using ActivityPub, but which are not open-source, and you'll have no idea what goes on in their servers. In the bar analogy, this would be people watching a stream of the mic at home; or another place, other than a bar with the same set-up, streaming and recording what goes on in that bar.

        Also, if no one is demanding that things be deleted from client devices, then logically nothing should stop someone from sharing it with other people/clients. And if you believe otherwise, then as an example: what if someone posts a comment, I reply, and then they edit it to put me in a bad light? Is it an invasion of privacy for me to show what it said previously?

        This is not a privacy issue; you cannot demand privacy for something you shared willingly and publicly.

        Respectfully, I find it more counterproductive, and even harmful, to encourage and spread the idea that people should have any expectation of privacy regarding things they have shared publicly.

  • The privacy stinks you say? Did you know that Likes and Dislikes are public too? That was the most shocking to me. Because it is very much not like Reddit or others.

    It's still a fantastic piece of software, with all its flaws, though.

    • It's impossible to federate these without making them public in this way.

      The up-votes are also mapped to favourites in Mastodon etc, so that was always public anyway.

      You could argue that this should not be hidden in the Lemmy UI, but there are also good reasons to not highlight that much who voted on a post.
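
      A rough sketch of why that is, assuming votes travel as ActivityStreams `Like`/`Dislike` activities (both types exist in the vocabulary; whether Lemmy's wire format looks exactly like this is an assumption, and the function names and URLs are hypothetical). The activity names its actor, presumably so the receiving server can count each voter only once, which means the admin of every federated server can see who voted:

```python
from typing import Dict, List


def make_vote_activity(actor: str, post_id: str, upvote: bool = True) -> Dict[str, str]:
    """Build a vote as a Like (upvote) or Dislike (downvote) activity."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Like" if upvote else "Dislike",
        "actor": actor,  # the voter is part of the federated message
        "object": post_id,
    }


def voters_visible_to(inbox: List[Dict[str, str]]) -> List[str]:
    """List every voter a receiving server can read out of its inbox."""
    return sorted({a["actor"] for a in inbox if a["type"] in ("Like", "Dislike")})


inbox = [
    make_vote_activity("https://a.example/u/alice", "https://a.example/post/1"),
    make_vote_activity("https://b.example/u/bob", "https://a.example/post/1", upvote=False),
]
print(voters_visible_to(inbox))
```

      Hiding the names in the UI changes what ordinary users see, not what arrives in each server's inbox.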

      • The up-votes are also mapped to favourites in Mastodon

        Explains why this obvious issue is not brought up by Mastodon lol

      • I thought votes didn't federate yet anyways... but, yes, it is possible, and i can come up off the top of my head with three or four potential implementations.

      • Hey 👋 I know you. Hehe.

        And yes, it should not be hidden. It is very much unexpected, because Reddit doesn't do it, and it's not visible to normal users.

  • The stuff listed in OP doesn't really seem like much concern. "What you put on the internet is there forever!" is completely true, and things like this should only make it more concrete that you can't rely on your service provider to delete information somebody else already archived.
    With that being said, default privacy settings - at least on Kbin - seem pretty bad.

  • Eww. Well, there is a reason why I try and be extremely careful about what I post nowadays. Don't want to regret dumb shit I said in the future.
