I know that Lemmy is open source and it can only get better from here on out, but I do wonder if any experts can weigh in: is the foundation well written, or are we building on top of 4 years' worth of tech debt?
From some comments I've read, it's at least in better shape than kbin? A few people expressed interest in helping with that project and then went running for the hills after reading through the code.
No, the answer is that it's written in a modern language, is in its infancy, and needs a lot of work to be really great, but it's based on a standardized protocol, ActivityPub (a W3C recommendation), which Mastodon and other "fediverse" systems use. It's going to be really great, eventually.
It's fine. Nothing impressive about it, but nothing horrifying either. Could use better testing and better documentation; it's pretty weak on both fronts. It's a pretty young/immature code base, so it's hard to have much tech debt yet. It's not like its core dependencies can be a decade out of date. But it's easy to navigate and understand, relatively speaking.
I've seen one dev talk about documentation, and it's admittedly weak, but they're pretty swamped by everything else. It's on the back burner and they'll work on it at some point.
As long as the backend is stateless, it can be scaled to handle a huge number of users, at least in theory. IMO the main issue right now with Lemmy deployment is pictrs not being stateless. It uses an embedded, filesystem-based database called sled. Not only does this make pictrs stateful, you can't even run multiple replicas of pictrs on the same host, because sled would crash if the database file lock is already held by another replica. Someone with some Rust skill should consider donating their time to add PostgreSQL support to pictrs soon, which would go a long way toward making Lemmy scalable. Too bad I know nothing about Rust.
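For anyone curious what that looks like in practice, here's a minimal sketch, assuming the sled crate (0.34); the data directory path is made up, and this isn't actual pictrs code:

```rust
// sled takes an exclusive lock on its data directory, so a second replica
// pointed at the same path gets an error instead of a shared handle.
fn main() {
    match sled::open("/var/lib/pictrs/sled-db") {
        Ok(db) => {
            // We hold the lock: this process is the only one allowed to
            // touch the database until the handle is dropped.
            println!("opened sled db, default tree has {} entries", db.len());
        }
        Err(e) => {
            // A second pictrs replica on the same host lands here.
            eprintln!("could not open sled db (lock already held?): {e}");
        }
    }
}
```

With Postgres instead, every replica would just open another connection to the same server, which is what makes the rest of the Lemmy backend horizontally scalable.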
An interesting choice of solution there for image hosting... I would have thought they'd have gone with a simple proxy through to an object store like S3, GCS, Wasabi, insert other clone here. Or even picked an off-the-shelf BLOB-capable system for self-hosting like Mongo or Cassandra. Then your image hosting becomes stateless: you just give each image a flake ID, pop it in the storage system, and hand back a shortened URL. I'm sure they had their reasons though :-)
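To make that concrete, here's a hedged sketch of the pattern, assuming the aws-sdk-s3 and uuid (with the "v7" feature) crates; the bucket name, URL shape, and ID scheme are all made up for illustration:

```rust
use aws_sdk_s3::{primitives::ByteStream, Client};

// Store an uploaded image in an object store and hand back a short URL.
// No local state is written, so any number of replicas can run this.
async fn store_image(client: &Client, bytes: Vec<u8>) -> Result<String, aws_sdk_s3::Error> {
    // A flake-style ID: unique without coordination between replicas.
    // (A UUIDv7 stands in here; real flake IDs are time-sortable integers.)
    let id = uuid::Uuid::now_v7().simple().to_string();
    client
        .put_object()
        .bucket("lemmy-images") // hypothetical bucket
        .key(&id)
        .body(ByteStream::from(bytes))
        .send()
        .await?;
    // Any replica can later serve the image by proxying a GET for this key.
    Ok(format!("https://img.example.com/{id}"))
}
```

(The `Client` comes from the usual `aws_config` setup; error handling is simplified.)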
It’s decent, but it isn’t scalable, at least not yet.
Right now the entire Lemmy backend is one big “monolith”. One app does everything from logins and signups to posting and commenting. This makes it a little harder to scale, and I wouldn’t be surprised to see it split into multiple microservices sooner rather than later so the bigger instances can scale better.
I’d love to know where the higher-level dev stuff is being discussed, and whether they’ve made a decision on microservices and why or why not.
There’s no reason that a monolith can’t scale. In fact, you scale a monolith the same way you scale microservices.
The real reason to use microservices is so you can have individual teams own a small set of services. Lemmy isn’t built by a huge corporation, though, so that doesn’t really make sense here.
I disagree that it being a monolith is immediately a problem, but also:
"In fact you scale a monolith the same way you scale microservices."
This is just not true. With microservices, it's easy to scale out individual services to multiple instances as demand requires. Hosting a fleet of entire Lemmy instances is far more expensive than hosting extra copies of just the small slices that need the additional processing power.
Microservices aren't a silver bullet. There's likely quite a lot that can be done until we need to split some parts out, and once that happens I expect that federation would be the thing to split out as that's one of the more "active" parts of the app compared to logins and whatnot.
Definitely not a silver bullet, but it should stop the app from locking up when one thing gets overloaded. I’m sure they have their reasons for how it’s designed now, and I’m probably missing something that would explain it all.
I’m still not familiar enough with how federation works to speak to how easy that would be. Unfortunately this has all happened just as I’ve started moving, and I haven’t gotten a chance to dive into the code like I’d want to.
There's nothing stopping you from putting a load balancer in front and running multiple instances of the monolith connected to one database. Then the database will become a bottleneck, but that would still happen with microservices.
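As a toy illustration of that, here's a sketch of a round-robin balancer in front of two replicas of the monolith, assuming the tokio crate; the ports are placeholders (8536 is Lemmy's default, I believe):

```rust
use tokio::io::copy_bidirectional;
use tokio::net::{TcpListener, TcpStream};

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // Two identical copies of the monolith, both talking to one database.
    let backends = ["127.0.0.1:8536", "127.0.0.1:8537"];
    let listener = TcpListener::bind("127.0.0.1:8080").await?;
    let mut next = 0usize;

    loop {
        let (mut client, _) = listener.accept().await?;
        // Round-robin: each new connection goes to the next replica.
        let backend = backends[next % backends.len()];
        next += 1;
        tokio::spawn(async move {
            if let Ok(mut upstream) = TcpStream::connect(backend).await {
                // Shuttle bytes both ways until either side closes.
                let _ = copy_bidirectional(&mut client, &mut upstream).await;
            }
        });
    }
}
```

In practice you'd use nginx or HAProxy rather than rolling your own, but the principle is the same: the replicas are interchangeable because the state lives in the database.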
Exactly, and nothing prevents a monolith from doing vertical slicing at the database level, as long as the monolith doesn't cross those boundaries. That's the only scaling property that's actually inherent to microservices.
If the issue is horizontal scaling, microservices don't solve anything here.
Also, specifically given what I understand of the Fediverse, you want something easy to host and monitor, since a lot of people will roll out their own instances, and both of those are known pain points when running microservices.
This is a discussion I'm also interested in. Migrating a monolith to microservices is a big decision that can have serious performance, maintainability and development impact.
Microservices can be very complex and hard to maintain compared to a monolith. Just the deployment and monitoring could turn into a hassle for instance maintainers. Ease of deployment and maintenance is a big deal in a federated environment. Add too much complexity and people won't want to be part of it.
I've seen some teams do hybrids, like allowing the codebase to be deployed as a single artifact or broken up by functionality. That way people can deploy it the easy way or the performant way, as their needs change.
That’s what I’m thinking. Microservices could be a huge pain in the ass, but a hybrid approach would make things much better. Smaller instances wouldn’t be a problem, but the larger instances would be able to separate out components.
Keeping it possible to run monolithically would probably take a lot of work, but it's doable and would probably be the best approach.
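For what it's worth, here's a hypothetical sketch of what that hybrid could look like: one binary whose components are toggled at startup, so a small instance runs everything in one process and a big one runs one component per process. The component names and env var are made up, not Lemmy's actual internals:

```rust
use std::{env, thread};

fn main() {
    // e.g. COMPONENTS=federation ./lemmy on a dedicated federation box;
    // with the variable unset, a small instance just runs everything.
    let components = env::var("COMPONENTS")
        .unwrap_or_else(|_| "api,federation,media".to_string());

    let handles: Vec<_> = components
        .split(',')
        .filter_map(|c| match c.trim() {
            "api" => Some(thread::spawn(run_api)),
            "federation" => Some(thread::spawn(run_federation)),
            "media" => Some(thread::spawn(run_media)),
            other => {
                eprintln!("unknown component: {other}");
                None
            }
        })
        .collect();

    for handle in handles {
        let _ = handle.join();
    }
}

// Stubs standing in for the real subsystems.
fn run_api() { println!("api: serving HTTP requests"); }
fn run_federation() { println!("federation: handling ActivityPub traffic"); }
fn run_media() { println!("media: proxying images"); }
```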
I have a friend who's a lot more technical than me, and he said that Lemmy's codebase is kinda messy and relies on libraries that are still in beta and haven't been tested well in the real world (since Rust is a relatively new language). This was a few months ago though; I'm not sure how much has changed since then, as it's been getting a lot more support and the front-end was rewritten. The good news is it'll get a lot better as more developers contribute.
I think a lot of people assume that because it's written in Rust that means it has to be super stable but that really isn't the case.
I think it will improve as more people get involved. The fundamentals seem to work fine. Haven't looked at the repository yet but I am planning to do so and see whether I can make a (small) contribution somewhere. Probably in the form of cleaning up some technical debt.
Someone mentioned they had started out using websockets instead of HTTP. I guess they've since migrated, but that design choice makes me wonder about the devs' qualifications for making that kind of choice.
Websockets were deprecated with 0.18.0, which is the latest official release, but a lot of instances are on the release candidate for 0.18.1 because of some major improvements. My login instance is on that one, which is great for me because I'm using a desktop browser and it looks way nicer. A lot of fixes too.
Websockets are easier to implement, so that's probably why they started with them. The downside is that they have heavy overhead and don't scale well. It wasn't trivial for them to move to HTTP. Websockets were probably a better starting point at the time, but they did recognize the shortcomings and deprecated them in time to support growth. I don't know if I'd hold that against them.
It's probably decent, but it is also worth noting that Lemmy was never really expecting the massive explosion of activity it currently has quite so soon.
The current code base was probably fine for a small number of users/instances, but everything isn't holding up quite as well now that there are thousands, or even tens of thousands, of users rattling about the place.
Websockets are meant for applications where it's important to receive updates quickly, in a push fashion, e.g. collaborative editors like Google Docs or chat applications. To scroll Lemmy or open a specific Lemmy post you don't need that at all. You can just fetch the data once and have users refresh manually if, for example, they want the latest comments on a post. Using websockets for that type of application just puts unnecessary strain on the server.
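To illustrate, here's a hedged sketch of the plain request/response version, assuming the axum (0.7) and tokio crates; the route and payload are made up:

```rust
use axum::{routing::get, Json, Router};

// The client fetches comments when it wants them (page load, manual
// refresh) -- no per-reader socket is held open on the server.
async fn comments() -> Json<Vec<String>> {
    // A real handler would query the database here.
    Json(vec!["newest comments would be returned here".to_string()])
}

#[tokio::main]
async fn main() {
    let app = Router::new().route("/comments", get(comments));
    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
        .await
        .unwrap();
    axum::serve(listener, app).await.unwrap();
}
```

A websocket version would instead hold a connection open per reader and push every new comment to all of them, which is exactly the per-connection overhead a read-mostly site doesn't need.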
I've been waiting for an Elixir version for a while now.
Haven't seen much traction on the few projects I've come across, but I'm still holding out hope. Maybe I should get off my ass and start actually contributing.