
Upcoming 0.18 upgrade, 404 errors and infrastructure costs

Hello, fellow lemmings!

I have a few quick updates about lemm.ee. If you don't want to read a wall of text, then the key points are summarized here for you:

  • There is a Lemmy upgrade (0.18) on the horizon, and executing this upgrade will require downtime for lemm.ee
  • I have made some improvements to our infrastructure in order to reduce those pesky 404 errors that some users have been seeing
  • It's already looking like ~15% of our infrastructure bill for this month is going to be covered by community funding. A huge thanks to all financial supporters of lemm.ee! It's extremely heartwarming to see that people believe in this platform and are willing to share the costs with me.

Upcoming 0.18 upgrade

With the next version of Lemmy nearing completion, I am starting to plan the upgrade for lemm.ee.

With the 0.17.3 -> 0.17.4 upgrade, I was able to keep lemm.ee online during the upgrade with no downtime. That's how I would prefer to do all upgrades in the future as well, but unfortunately, there are some fundamentally incompatible changes in 0.18. This means that running a mix of 0.17.4 and 0.18 servers in our infrastructure at the same time will not work - effectively meaning that we can only execute this upgrade with some downtime.

In order to keep surprises to a minimum, I am planning to create a post with a title like "When this post is 1h old, the server will go down for an upgrade". Once 1h has passed from that post, you will be unable to access lemm.ee until the upgrade completes. If everything goes smoothly, then total expected downtime will be around 15 minutes, but in case of any issues, it could be slightly longer!

It's not clear yet when 0.18 will be fully ready, but if everything goes well, then this could already happen as early as next week. I will keep you all posted!

Why do we even want 0.18?

There are some very important optimizations landing in 0.18, which should help make the Lemmy UI feel considerably snappier and at the same time give the backend servers some much-needed breathing room. This should help take a lot of pressure off the federated network as a whole, and is a good first step towards scaling further.

Additionally, there are some key fixes that AFAIK will all land in 0.18, such as:

  • Additional posts will no longer automatically appear in your feeds while you're scrolling
  • You should stop getting redirected onto a completely different post when opening other posts in other tabs
  • The front page will stop showing stale posts for all instances (lemm.ee users have been enjoying this patch since yesterday already, as I am the author of the patch and decided to apply it early here 😃)

All in all, 0.18 is looking like a great upgrade, so I’m personally looking forward to it.

Random 404 errors

Several users have been experiencing errors on lemm.ee (and similarly on other instances) where some page loads will fail with a white page and a 404 error.

I have spent some time debugging and attempting to mitigate this issue today. I have identified the root cause (spikes in database load related to the number of new posts in the federated network in every 5-minute interval), and after some database tuning, I have managed to significantly mitigate this issue. Previously, this issue was appearing for about 6000 page loads every hour. In the hour following my changes, this error only appeared for roughly 596 page loads! It’s still not 0, so I will continue to try and improve this, but we are starting to brush up against the limits of what our current database infra can manage.
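
For a sense of scale, here is a quick back-of-the-envelope calculation of that drop, as a minimal Python sketch. The function name is purely illustrative, and both inputs are the approximate hourly figures quoted above, so treat the result as a rough estimate rather than a precise measurement.

```python
def reduction_percent(before: float, after: float) -> float:
    """Return the percentage drop going from `before` to `after`."""
    return (before - after) / before * 100


# Approximate failed page loads per hour, taken from the figures above.
before_per_hour = 6000
after_per_hour = 596

drop = reduction_percent(before_per_hour, after_per_hour)
print(f"{drop:.1f}% fewer failed page loads per hour")
```

In other words, roughly nine out of ten of these errors are gone, based on a single hour of data after the change.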

In the longer term, we will seriously benefit from any Lemmy optimizations - I am hopeful that even 0.18 will start bringing down the load on our servers. Additionally, we have a lot of room to upgrade our database infrastructure, but of course this would mean increasing the budget, which I’m not in a position to do for now. This segues us nicely into the third and final topic I wanted to cover:

Server costs

As of today, our infrastructure has scaled up as far as my own budget will allow. To be more specific, I am able to keep the servers running as is indefinitely, but I am not able to make any further upgrades to our servers out of my own pocket.

Thankfully, we have some extremely kind members in our community, who have already decided to begin supporting lemm.ee and are thus ensuring that every single one of us can enjoy a well-functioning platform and potential further upgrades down the line! As of today, we have 4 supporters who have signed up for monthly (!!) contributions on my GitHub Sponsors page, as well as one supporter who has donated money through my Ko-Fi page. I want to seriously thank each of you! I am personally super excited about Lemmy as a network, and specifically lemm.ee as an instance, so I’m truly happy to see that others share this excitement and are willing to join me in funding all this.

Pinning updates on the front page

Finally, I am looking for some feedback on how you feel about update posts such as this being pinned to the top of your lemm.ee front page.

My current plan is to pin this post on the front page for the next ~24 hours. After that, I will unpin it, but you will still be able to find it in !meta@lemm.ee.

I have seen some comments complaining about too many pinned posts, so alternatively, I could start pinning the latest site update post to the top of the !meta community, and avoid pinning it to the front page altogether.

If you have thoughts about this (or anything else I have mentioned), please comment below!


37 comments
  • 15 minutes is nothing. But as usual, those 15 minutes might easily become hours if something breaks. If you want to minimize the downtime inconvenience as much as possible, you could do it in the middle of the night in the instance's time zone. But I imagine you'd like to sleep too. Second best time would be early morning.

    Do you plan to use 0.18 right as it comes out? A more cautious approach would be to monitor some other instance that pilots it for a day or two and then, if it works smoothly, adopt.

    • 0.18 is already being piloted on https://voyager.lemmy.ml, but this is not a federated instance and it has very few people actively testing things, so for sure some issues could come out once 0.18 starts being rolled out on proper instances.

      In general, lemmy.ml seems to always have been the first to roll out new versions in order to verify them. My plan for now is to give it some time on lemmy.ml and then follow with the upgrade. I expect any major issues to become apparent quite quickly (probably within hours rather than days).

      • Just wondering if there is an update on when the 0.18 rollout will be? I updated jerboa to 0.0.35 and so can't use it to access lemmy instances older than 0.18 (just to clarify, I'm totally fine with this - just using a web browser for now). Thanks for all the work you are doing to build and maintain this!

        Oh and I like the idea of having pinned posts (as long as they are relevant) to convey important information (such as when lemm.ee will be down during the server upgrades). That way more people will see the message even if they don't necessarily subscribe to the meta (lemm.ee) community

        • For the moment, there are several key issues with 0.18, so I'm holding off on the upgrade until they can be addressed - most likely it will happen some time next week. I'll let you guys know ahead of time before I upgrade!
