What's the most cost-effective way to serve lots of large files?
I'm looking at starting a service that involves hosting a lot of LLM models, which are often going to be 16GB+ (compressed). I did a bit of searching for cloud storage providers with cheap egress, and the cheapest I could find is $0.01 per GB, which would still be $0.16+ per download.
How do sites like Hugging Face or CivitAI do it? Lots of VC funding?
If the files are not going to change much, the usual approach is to put a CDN service (e.g. Cloudflare, Akamai, Fastly) in front of them. The idea is that you have an "origin", which can be any old server that serves your files over HTTP (even a VPS running nginx). The CDN is configured to proxy requests to the origin, building up a cache of the files it serves. Once a file is cached, the CDN serves it from its own (very large) infrastructure, so repeat downloads no longer hit your origin.
See also What is a CDN?
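To make the "origin" idea concrete, here is a minimal sketch of one using only the Python standard library. It is an illustration, not a production setup (a real origin would more likely be nginx, as mentioned above, and this stdlib server is not built for heavy traffic). The names `CacheFriendlyHandler` and `make_origin`, the port, and the one-year `Cache-Control` value are all assumptions chosen for the example. Whether and for how long a given CDN caches large files depends on its own configuration and limits, so check the provider's documentation.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

# Assumption: model files never change once published, so caches may keep
# them for a long time. Adjust to your own release process.
CACHE_CONTROL = "public, max-age=31536000, immutable"


class CacheFriendlyHandler(SimpleHTTPRequestHandler):
    """Static file handler that marks successful responses as cacheable."""

    def send_response(self, code, message=None):
        # Remember the status so end_headers() can skip errors such as 404.
        self._status = code
        super().send_response(code, message)

    def end_headers(self):
        if getattr(self, "_status", None) == 200:
            self.send_header("Cache-Control", CACHE_CONTROL)
        super().end_headers()


def make_origin(directory, port=8080):
    """Build an HTTP server that serves `directory` on localhost."""
    handler = partial(CacheFriendlyHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (hypothetical directory name):
#   make_origin("./models").serve_forever()
```

The CDN would then be pointed at this server's public address as its origin. The header only tells downstream caches that the file is safe to keep; the actual caching behaviour is decided on the CDN side.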
To keep costs down and depending on how much you want to get your hands dirty, you could start investigating renting dedicated servers. Some hosting providers offer unmetered network connectivity. Here's something from OVH: https://www.ovhcloud.com/en/bare-metal/rise/rise-stor-1/
And hey, depending on how grassroots the project is, there's always BitTorrent! ;)
Storj does it at 7 USD/TB. And there are providers that technically provide unlimited bandwidth, like Hetzner's dedicated servers; they still have some abuse limits, but even working within the limits should make it much cheaper. This means custom engineering though.
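For a rough feel of what the per-GB prices quoted in this thread mean per download, here is a small back-of-the-envelope sketch. The two rates come from the posts above ($0.01 per GB from the question, 7 USD/TB from this answer); the function names, the 1 TB = 1000 GB conversion, and the 10,000 downloads per month figure are assumptions made up purely for illustration.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes


def cost_per_download(size_gb, usd_per_gb):
    """Egress cost in USD for a single download of `size_gb` gigabytes."""
    return size_gb * usd_per_gb


def monthly_egress_cost(size_gb, downloads, usd_per_gb):
    """Egress cost in USD for `downloads` downloads in a month."""
    return downloads * cost_per_download(size_gb, usd_per_gb)


# Rates quoted in the thread, expressed in USD per GB.
RATES = {
    "cheapest found by asker": 0.01,
    "Storj (7 USD/TB)": 7 / GB_PER_TB,
}

if __name__ == "__main__":
    size_gb = 16        # model size from the question
    downloads = 10_000  # hypothetical monthly volume
    for name, rate in RATES.items():
        single = cost_per_download(size_gb, rate)
        month = monthly_egress_cost(size_gb, downloads, rate)
        print(f"{name}: ${single:.3f} per download, ${month:,.0f} per month")
```

With unmetered dedicated servers the per-GB term goes away and the cost becomes a flat monthly fee, which is why they can work out much cheaper at volume, as long as you stay within the provider's abuse limits.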
Probably. They might also have gotten additional discounts off the advertised price by talking with sales and committing to the service for a year, or through other arrangements.
Interesting, I'll have to have a look at doing something like that. If I remember correctly, the CivitAI devs are active on Discord, so maybe I could just ask them directly.
There's no harm in messaging the sales team directly. Whatever deals CivitAI got might not work for you, and they might not even be legally allowed to mention specifics. But you can shop around and see what different providers will offer you.