
Anubis is awesome and I want to talk about it

I got into the self-hosting scene this year when I wanted to start up my own website running on an old recycled ThinkPad. A lot of time was spent learning about ufw, reverse proxies, header security hardening, and fail2ban.

Despite all that, I still had a problem with bots knocking on my ports and spamming my logs. I tried some hackery getting fail2ban to read Caddy logs, but that didn't work for me. I nearly gave up and went with Cloudflare like half the internet does, but my stubbornness about open-source self-hosting and the recent Cloudflare outages this year encouraged me to try alternatives.

Coinciding with that, I'd been seeing this thing more and more in the places I frequent, like Codeberg. This is Anubis, a proxy-style firewall that forces the browser client to do a proof-of-work security check, plus some other clever things, to stop bots from knocking. I got interested and started thinking about beefing up security.

I'm here to tell you to try it if you have a public-facing site and want to break away from Cloudflare. It was VERY easy to install and configure with a Caddyfile on a Debian distro with systemctl. In an hour it's filtered multiple bots, and so far it seems the knocks have slowed down.

https://anubis.techaro.lol/

My botspam woes have seemingly been seriously mitigated, if not completely eradicated. I'm very happy with tonight's little security upgrade project, which took no more than an hour of my time to install and read through the documentation. Current chain is: Caddy reverse proxy -> Anubis -> services.

A good place to start for the install is here:

https://anubis.techaro.lol/docs/admin/native-install/
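
If it helps anyone picture it, here's roughly what that chain looks like in config form. This is a simplified sketch rather than my exact setup: the domain, ports, env file path and backend service are placeholders, and the Anubis variable names are what I remember from the docs, so double-check the install guide above.

    # /etc/caddy/Caddyfile -- Caddy terminates TLS and hands everything to Anubis
    example.com {
        reverse_proxy localhost:8923
    }

    # environment file for the Anubis systemd unit (placeholder path/values)
    # Anubis listens on BIND and forwards requests that pass its checks to TARGET
    BIND=:8923
    TARGET=http://localhost:3000
    DIFFICULTY=4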

155 comments
  • I don't think you have a use case for Anubis.

    Anubis is mainly aimed at bad AI scrapers, plus some DDoS mitigation if you have a heavy service.

    You are getting hit exactly the same; Anubis doesn't put up a block list or anything, it just puts itself in front of the service. The load on your server and the risk you take are very similar with or without Anubis. Most bots are not AI scrapers, they are just probing, so the hit on your server is the same.

    What you want is to properly set up fail2ban or, even better, CrowdSec. That would actually block and ban bots that try to probe your server.

    If you are just self-hosting, then with Anubis the only thing you are doing is diverting the log noise into Anubis's logs and making your devices do a PoW every once in a while when you want to use your services.

    Honestly, I don't know what you are self-hosting. But unless it's something that's going to get DDoSed or AI-scraped, there's not much point to Anubis.

    Also, Anubis is not a substitute for fail2ban or CrowdSec. You need something to detect and ban brute-force attacks. If not, an attacker only needs to execute the Anubis challenge, get the token for the week, and then they are free to attack your services as they like.

  • I appreciate a simple piece of software that does exactly what it’s supposed to do.

    • The front page of the website is excellent. It describes what the software does and what its feature set is in quick, simple terms.

      I can't tell you how many times I've gone to a website for some open-source software and had no idea what it was or what it was trying to do. They often dive deep into the 300 different ways of installing it, tell you what the current version is and what features it has over the last version, but they just assume you know the basics.

  • At the time of commenting, this post is 8h old. I read all the top comments, many of them critical of Anubis.

    I run a small website and don't have problems with bots. Of course I know what a DDoS is - maybe that's the only use case where something like Anubis would help, instead of the strictly server-side solution I deploy?

    I use CrowdSec (it seems to work with Caddy, btw). It took a little setting up, but it does the job; a rough sketch of my setup is at the end of this comment.
    (I think it's quite similar to fail2ban in what it does, plus community-updated blocklists.)

    Am I missing something here? Why wouldn't that be enough? Why do I need to heckle my visitors?

    Despite all that, I still had a problem with bots knocking on my ports and spamming my logs.

    By the time Anubis gets to work, the knocking has already happened, so I don't really understand this argument.

    If the system is set up to reject a certain type of request, these are microsecond transactions that do no harm (DDoS excepted).
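
    In case anyone wants to replicate it, my setup is roughly the following. Treat it as a sketch from memory rather than a copy-paste guide; the package and collection names may differ on your distro, and Caddy has to be writing logs somewhere CrowdSec can read (see the CrowdSec docs for the Caddy integration details).

      # install the agent plus a firewall bouncer that actually drops banned IPs
      sudo apt install crowdsec crowdsec-firewall-bouncer-iptables

      # pull the Caddy parser/scenarios from the hub
      sudo cscli collections install crowdsecurity/caddy

      # then point CrowdSec at the Caddy logs in /etc/crowdsec/acquis.yaml:
      #   filenames:
      #     - /var/log/caddy/*.log
      #   labels:
      #     type: caddy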

    • AI scraping is a massive issue for specific types of websites, such as git forges, wikis and, to a lesser extent, Lemmy etc., that rely on complex database operations that cannot be easily cached. Unless you massively overprovision your infrastructure, these web applications grind to a halt as scrapers constantly max out the available CPU power.

      The vast majority of the critical commenters here seem to talk from a point of total ignorance about this, or assume operators of such web applications have time for the hypervigilance of constantly monitoring and manually blocking AI scrapers (which do their best to circumvent more basic blocks). The realistic options for such operators right now are: Anubis (or similar), Cloudflare, or shutting down their servers. Of these, Anubis is clearly the least bad option.

    • With Varnish and Wazuh, I've never had a need for Anubis.

      My first recommendation for anyone struggling with bots is to fix their cache.

    • If CrowdSec works for you that's great, but it's also a corporate product whose premium subscription tier starts at $900/month; not exactly a pure self-hosted solution.

      I'm not a hypernerd; I'm still figuring all this out among the myriad of possible solutions with different complexity and setup times. All the self-hosters in my internet circle started adopting Anubis, so I wanted to try it. Anubis was relatively plug-and-play with prebuilt packages and great install guide documentation.

      Allow me to expand on the problem I was having. It wasn't just that I was getting a knock or two; I was getting 40 knocks every few seconds, scraping every page and searching for a bunch of paths that don't exist but would be exploit points on unsecured production VPS systems.

      On a computational level, the constant network activity of web pages, zip files and images downloaded by scrapers pollutes traffic. Anubis stops this by trapping them on a landing page that transmits very little information from the server side. By trapping the bot on an Anubis page, which it hammers 40 times over a single open connection before giving up, it reduces overall network activity and data transferred (which is often billed as a metered thing) as well as the log volume.

      And this isn't all or nothing. You don't have to pester all your visitors, only those with sketchy clients. Anubis uses a weighted priority which grades how legit a browser client is: most regular connections get through without triggering anything, while weird connections get various grades of checks depending on how sketchy they are. Some checks don't require proof of work or JavaScript. (There's a rough sketch of the policy file at the end of this comment.)

      On a psychological level, it gives me a bit of relief knowing that the bots are getting properly sinkholed and I'm punishing/wasting the compute of some asshole trying to find exploits in my system to expand their botnet. And a bit of pride knowing I did this myself on my own hardware without having to cop out to a corporate product.

      It's nice that people of different skill levels and philosophies have options to work with. One tool can often complement another too. Anubis worked for what I wanted: filtering out bots so they stop wasting network bandwidth, and giving me peace of mind where before I had no protection. All while not being noticeable for most people, because I have the ability to configure it not to heckle every client every 5 minutes like some sites want to do.
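
      Since the weighting came up a few times: the checks are driven by Anubis's bot policy file. Below is a simplified sketch of the general shape as I understand it from the docs, not a drop-in config; the rule names and regexes are made up, and the real schema is in the Anubis policy documentation.

        # botPolicies.yaml (simplified sketch, not a real config)
        bots:
          # well-known search engine crawlers pass straight through
          - name: search-engines
            user_agent_regex: "(Googlebot|bingbot)"
            action: ALLOW
          # things that identify themselves as AI scrapers get denied outright
          - name: ai-scrapers
            user_agent_regex: "(GPTBot|ClaudeBot)"
            action: DENY
          # everything else that looks like a generic browser gets challenged
          - name: generic-browsers
            user_agent_regex: "Mozilla"
            action: CHALLENGE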

      • If CrowdSec works for you that's great, but it's also a corporate product

        It's also fully FLOSS with dozens of contributors (not to speak of the community-driven blocklists). If they make money with it, great.

        not exactly a pure self-hosted solution.

        Why? I host it, I run it. It's even in Debian Stable repos, but I choose their own more up-to-date ones.

        Allow me to expand on the problem I was having. It wasn't just that I was getting a knock or two; I was getting 40 knocks every few seconds, scraping every page and searching for a bunch of paths that don't exist but would be exploit points on unsecured production VPS systems.

        • Again, a properly set up WAF will deal with this pronto
        • You should not have exploit points in unsecured production systems, full stop.

        On a computational level, the constant network activity of web pages, zip files and images downloaded by scrapers pollutes traffic. Anubis stops this by trapping them on a landing page that transmits very little information from the server side.

        • And instead you leave the computations to your clients, which becomes a problem on slow hardware.
        • Again, with a properly set up WAF there's no "traffic pollution" or "downloading of zip files".

        Anubis uses a weighted priority which grades how legit a browser client is.

        And apart from the user agent and a few other responses, all of which are easily spoofed, this means "do some JavaScript stuff on the local client" (there's a link to an article here somewhere that explains this well), which will eat resources on the client's machine and becomes a real PITA on e.g. smartphones.

        Also, I use one of those less-than-legit, weird and non-regular browsers, and I am being punished by tools like this.

        edit: I feel like this part of OP's argument needs to be pointed out, it explains so much:

        All the self-hosters in my internet circle started adopting Anubis, so I wanted to try it. Anubis was relatively plug-and-play with prebuilt packages

    • I also used CrowdSec for almost a year, but as AI scrapers became more aggressive, CrowdSec alone wasn't enough. The scrapers used distributed IP ranges and spoofed user agents, making them hard to detect and costing my Forgejo instance a lot on expensive routes. I tried custom CrowdSec rules but hit its limits.

      Then I discovered Anubis. It’s been an excellent complement to CrowdSec — I now run both. In my experience they work very well together, so the question isn’t “A or B?” but rather “How can I combine them, if needed?”

    • You are right. For most self-hosting use cases Anubis is not only irrelevant, it actually works against you: a false sense of security, plus making your devices do extra work for nothing.

      Anubis is meant for public-facing services that may get DDoSed or AI-scraped by some untargeted bot (for a targeted bot it's trivial to get past Anubis in order to scrape).

      And it's never a substitute for CrowdSec or fail2ban. Getting an Anubis token is just a matter of executing the PoW challenge. You still need a way to detect and ban malicious attacks.
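
      To make that concrete: a proof-of-work challenge just means finding a nonce so that the hash of challenge + nonce has enough leading zero bits. Conceptually it's something like the sketch below; this is not Anubis's actual algorithm or parameters, only the general idea, and it shows why a determined attacker can simply pay the cost once in a script and move on.

        import hashlib

        def solve(challenge: str, difficulty_bits: int) -> int:
            # brute-force a nonce until sha256(challenge + nonce)
            # starts with `difficulty_bits` zero bits
            nonce = 0
            while True:
                digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
                if int.from_bytes(digest, "big") >> (256 - difficulty_bits) == 0:
                    return nonce
                nonce += 1

        # a scripted client can do this just as easily as a browser can
        print(solve("server-issued-challenge", difficulty_bits=16))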

  • Stop playing whack-a-mole with these fucking people and build TARPITS!

    Make it HURT to crawl your site illegitimately.

  • I don't really understand what I am seeing here, so I have to ask -- are these Security issues a concern?

    https://github.com/TecharoHQ/anubis/security

    I have a server running a few tiny web sites, so I am considering this, but I'm always concerned about the possibility that adding more things to it could make it less secure, versus more. Thanks for any thoughts.

    • all of the issues listed are closed so any recent version is fine.

      also, you probably don't need to deploy this unless you have a problem with bots.

    • Security issues are always a concern; the question is how much. Looking at them, they seem at most to be ways to circumvent the Anubis redirect system and get to your page using very specific exploits. They are marked as low to moderate severity, and I do not see anything that implies system-level access, which is the big concern. Obviously do what you feel is best, but IMO it's not worth sweating about. The nice thing about open-source projects is that anyone can look through and fix them; if this gets more popular you can expect bug bounties and professional pen-testing submissions.

    • This isn't really a security issue as much as it is a DDoS issue.

      Imagine you own a brick and mortar store. And periodically one thousand fucking people sprint into your store and start recording the UPCs on all the products, knocking over every product in the store along the way. They don't buy anything, they're exclusively there to collect information from your store which they can use to grift investors and burn precious resources, and if they fuck your shit up in the process, that's your problem.

      This bot just sits at the door and makes sure the people coming in are actually shoppers interested in what's in your store.

  • I use it with OpenBSD’s relayd and I find it amazing how little maintenance it needs.
