scaling to millions
There's a really fine line between needing a spreadsheet and needing a database, and I've not yet found it. It's probably fuzzier than I realized, but I have participated in so many programming projects that amounted to a spreadsheet that lived too long.
Does it need to be accessed by multiple people? Does it need to be updated frequently? Does it need to be accessed programmatically? Does performance matter? If you answered "yes" to any of these questions, you should probably use a database.
If it's something you interact with manually, has fewer than 100,000 rows, and is mostly static, then sure, use a spreadsheet.
I used to have some scripts to convert and merge between CSV and sqlite3. Even a lightweight db like sqlite3 has value. Google Sheets fills some of the gaps with its QUERY function, but I still find it a bit awkward to use a lot of the time.
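For anyone wondering what that kind of CSV/sqlite3 round trip can look like, here is a minimal sketch using only the Python standard library. It is not the original scripts: the helper names (`load_csv`, `dump_csv`), the table names, and the sample rows are all made up for illustration, and it assumes every CSV has a header row and that the sheets being merged share the same columns.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Create `table` from CSV text (first row is the header) and insert the rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    # Columns are created untyped, so values are stored as the text read from the CSV.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)


def dump_csv(conn, query):
    """Run `query` and return the result as CSV text with a header row."""
    cur = conn.execute(query)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(col[0] for col in cur.description)
    writer.writerows(cur)
    return out.getvalue()


# Hypothetical sample data standing in for two exported sheets.
FORM_A = "id,name\n1,alice\n2,bob\n"
FORM_B = "id,name\n3,carol\n"

conn = sqlite3.connect(":memory:")
load_csv(conn, "form_a", FORM_A)
load_csv(conn, "form_b", FORM_B)

# Merge the two tables and export the result back to CSV.
merged = dump_csv(
    conn,
    "SELECT id, name FROM form_a UNION ALL SELECT id, name FROM form_b ORDER BY id",
)
print(merged)
```

Swapping `":memory:"` for a file path would keep the database around between runs, and reading the CSV text from real files instead of the inline strings is a one-line change.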
Google Sheets works just fine for access by multiple people.
The line is probably somewhere between human-readable and machine-readable.
Performance. If you get 30k transactions per second don’t even SAY spreadsheet lol
Per second? If you get that many per day, I wouldn't touch it with a spreadsheet.
I can answer yes to all of these questions but still use a spreadsheet. I understand your point, but I feel even with these the line is still gray.
I just checked and my largest spreadsheet currently has 14,300 lines across 12 tabs. Each tab contains the same information, just pulled from a separate form, so each tab is linked to a form and updates automatically when someone submits a new response. We then process these responses periodically throughout the day. Finished responses are color coded, so there's a ton of formatting. Also, 7+ people interact with it daily.
Then we have a data team that aggregates the information weekly through a script that sends them a .csv with the aggregate data.
The spreadsheet (and subsequent forms) are updated twice a year. It was updated in June and will be updated again in December. It’s at 14k now and will continue to grow. We’ve been doing this through a few iterations and do not run into performance issues.
At some point you end up surpassing databases and end up with a giant pile of spreadsheets called a data warehouse.
As soon as you stop maintaining the data by hand, start using a db.