Data Organization
If you call the bottom picture a "Data Lake" you can IPO and walk away with millions
"Unstructured Data".
It's horizontal scaling!
Time series
I shit you not, in IT around 2004, I had a nurse who stored all her important docs in the "Recycle Bin".
She put in a ticket that her computer was slow. We scheduled a time to look at it and made sure she knew to be there.
When I showed up, she had left to go to lunch on purpose so she could take a free long lunch. I asked her manager to call her back in, she refused.
I diagnosed she was out of space, and emptied her bin.
That did not end up going well.
She was furious. Her boss was mad. My boss was pissed that it happened but considered it reasonable since she refused to be there.
I spent the better part of 4 hours undeleting deleted recycle bin contents which is WAYYYYYY harder than undeleting deleted files. They're already UUID's and bringing them back into existence will not put them back in the recycle bin, all that meta is gone.
Well duh.
   
It is a recycle bin after all.
  
The thoughts will be reused at some point for something new /s
Project designer: the project function is self explanatory.
User:
I asked her what the fuck she was thinking later in the process. She knew that files weren't supposed to be there. She just thought it was a good idea, and was very defensive, borderline offensive, about being able to store files wherever she wanted.
My first inclination was she was just putting non-work-related stuff in there so that her manager would never see it. But no, there were hundreds of megs of work-related stuff. I recommended she not store the 500 megs of personal digital camera fodder on that computer if she was that tight on space. Hard drives of this era were only a handful of gigs large. She just flipped out some more and demanded a bigger disk. I had a private consult with her manager and mentioned that we could get a bigger disk but it was going to come out of her budget. She declined.
A year later we did SOX compliance and as part of that we deleted emails over 3 months and deleted any recycling bin data over a month old. I made sure her manager noted this and that it would delete her preferred file storage and never heard another word out of them.
Apparently ISO 8601:2000 allowed YY-MM-DD, but the 2004 version does not.
TIL
Anyone who uses YYMMDD instead of ISO 8601 needs to be fed feet first into a wood chipper.
ISO 8601 is YYYYMMDD (or YYYY-MM-DD in extended format)
Are you really going to wood chipper someone for leaving off the leading 20? I think we can safely infer the century and millennium with a high confidence, why not trade them for two extra name characters?
As an old person who has archives dating back to the 90s, yes.
I recently had an accountant file something for the IRS that was dated as expiring in 1940 when it should've been 2040. I had to catch it myself after reading through 70 pages of dense forms before it was sent off, and I could've easily missed it.
Digital records have existed long enough now that it's downright irresponsible to leave off the century for anything where having an accurate date might even slightly matter.
I used to do that but got tired of typing out unnecessary characters and appreciate the shorter character length. I think my folders and files will be long gone by Y2Point1K.
So, was the time of murder 20th of October 2021 - 1:25 PM or 21st of October 2020 - 1:25 AM?
Depending upon that, you may/may-not have an alibi.
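To make the ambiguity concrete, here's a rough Python sketch. The helper name and the sample string are made up for illustration. The same six digits give two different valid days depending on which field order you assume, and a bare two-digit year only gets a century because the parser guesses one.

```python
from datetime import datetime

def possible_readings(six_digits):
    """Return every valid interpretation of an undelimited six-digit date."""
    readings = {}
    for label, fmt in (("YYMMDD", "%y%m%d"), ("DDMMYY", "%d%m%y"), ("MMDDYY", "%m%d%y")):
        try:
            readings[label] = datetime.strptime(six_digits, fmt).date().isoformat()
        except ValueError:
            pass  # this field order does not produce a real calendar date
    return readings

# Two of the three orderings are valid dates, a year apart.
ambiguous = possible_readings("201021")

# Python follows the POSIX pivot: 00-68 become 20xx, 69-99 become 19xx.
# Other software may pick a different pivot, which is how 2040 turns into 1940.
guessed_year = datetime.strptime("40", "%y").year
```

Writing 2020-10-21 or 2021-10-20 in full leaves nothing for the reader or the parser to guess.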
I make a point to train people on this at work, and I also make a point to periodically delete all relevant files that are not dated or not dated correctly
oh no you lost some important files? should've followed the standards
we only have so much space and your 1.2 GB undated file that isn't even in the folder it should be in is getting deleted
one place i was at had ridiculous formatting standards. but like i loved that i could tell everything in a document by reading its title. just, when your pdf scan of your supporting documents for your tax return is 135 pages long, well the title took ten minutes to read
it was like 2010 tax return supporting documents + w2 - john doe - abc corp + w2 - john doe - def corp + 1099INT - john doe - BankBank +...pdf
and one of my jobs was to double check that the title accurately represented all 135 documents in that godsforsaken supporting documents scan. That was a rough year.
Other firm i worked at that year, because i was stupid and moonlit at TWO tax firms one tax season, just called the file SUPPORTING DOCS.pdf . Typed everything in all caps because we thought the IRS was blind. Also allowed us to stream music online and not have to play it on headphones with our doors shut in our offices. They were better.
nah sideways
I assumed they meant it like 2025-08-18…
Though TBF I sometimes rename files using the terminal and go mv "$file" "some_name_$(date +%s).ext"
I’ll say that as much as I love Apple and macOS, Finder has some pretty terrible defaults that make file management pretty difficult for the average user. The default “All Files” view is atrocious.
I HATE that windows will sort folders at the top instead of alphabetically with everything else. I guess it comes from using a Mac for so long.
I agree about .DS_Store in any mixed os environment though.
What is a spring-loaded folder?
Folders aren’t by default listed at the top
This is an awful Windows-only thing. Anyone who likes it should be ashamed.
No good intuitive way to set defaults for ALL folders at once
This is inexperience with the finder because it's ridiculously easy to set this.
Man, I hate my mom's PC folder layout, like why do you have a Documents folder inside of a documents folder inside of a Documents folder? Why do you create Excel sheets inside the Downloads folder when you didn't download them???
Just missing a random pile of files on the desktop.
What is this "desktop" of which you speak?
Is that what's under all these files?
Desktops, are like ogres. They have layers 😬
My actual desk and office - messy. My desktop - folder, folder, 4 shortcuts. My phone - groups of apps ordered by function - Pebble, Office, Entertainment, etc. My garage - absolute hoarder nightmare from hell 'cause I just can't seem to get to it. Why I can be ordered in one area and not in another is beyond me.
"SDD"?
Yes. Solid Disk Disk.
Solid disk drive
Thank you.
I think most computer users now don't know that file systems exist
Especially younger people. They're used to files just... being there on their phone. Photo albums? Nah, just scroll through every photo you've ever taken to find the right one.
That, and having powerful search functionality + tagging has made perfect folder structures less of a requirement. I've never had trouble finding documents in paperless-ngx just by searching, for example.
Photo albums? Nah, just scroll through every photo you’ve ever taken to find the right one.
Then screenshot it so that the screenshot of the photo is at the top, then switch to the other app and upload the screenshot of the photo there.
Ok. Calling me out like that. It's fine, I deserve it.
I store everything "temporarily" because "I'll sort it later" on the Desktop.
It's never later.
The most items I had on my work desktop was 1366, they were overlapping on my screen and windows+D would lag the whole computer. It was glorious.
I call it "Purgatory"
I wish someone would make GitHub but for documents. Imagine your documents being forked by someone, with many branches and revisions. It would be hilarious.
You can literally just upload a library of documents to github or another repo service like codeberg. That's basically what a code project is, a bunch of files.
You just described SVN. It's what we used before the invention of git. And is still used today for team projects that use complex file formats, like images, binary blobs, 3d models, that sort of stuff. It will work with any files.
P.A.R.A. - It's a simple organization method and very easy to maintain.
This is really damn good. Thanks for sharing it!
I find myself having too many nested folders, and I’m just a normie. I wonder how deep they go for you tech people.
At some points, Windows won’t let me change the file name because it was too long and I’m assuming the file path to it plus the ridiculously long name (“person last name, first name - type of document (purpose) yyyymmdd”) just breaks Windows.
Sometimes I have to copy those files to my desktop just to rename the new file, so that I can upload the file to an online system that only lets me upload files with names under 42 characters long. It’s wild.
You can enable long path support in Windows, essentially removing that restriction and letting the full path, with all the subfolders, run to roughly 32,767 characters.
This was one of the reasons I quit trying to develop on Windows way back when. I had a very well organized system of subfolders for all my code, and it was literally running into some kind of path length limit trying to import deeply nested dependencies in certain projects. This was WELL into the era of 64-bit computing, absolutely no excuse other than Microsoft taking shortcuts.
I still run into this issue when one of my company's clients requires developing on Windows. Doesn't take many subfolders before node_modules just starts breaking.
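If you want to see how close a tree is to that wall before something breaks, a small walker will do. This is a rough Python sketch, not any official tool. The function name is made up, and 260 is the classic Windows MAX_PATH value that applies when long path support is off.

```python
import os

MAX_PATH = 260  # classic Windows limit; it counts the drive prefix and the terminating NUL

def find_long_paths(root, limit=MAX_PATH):
    """Yield (length, path) for every file or folder under root whose full path reaches the limit."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if len(full) >= limit:
                yield len(full), full
```

Something like sorted(find_long_paths("node_modules"), reverse=True) lists the worst offenders first.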
There are lots of reasons I hate developing on windows and that's certainly one of them.
In my projects folder I have an "all" folder where I store all my projects. But back at the projects folder there are others like "by-client", "by-language", and "by-date". When I make a new project I create it inside the all folder, and then place shortcuts inside the corresponding folders.
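That layout is easy to automate. Here's a minimal Python sketch of the idea, assuming symlinks stand in for the shortcuts (on Windows, creating symlinks may need extra privileges). The add_project helper and the example client and language names are hypothetical.

```python
import os
from pathlib import Path

def add_project(root, name, **views):
    """Create root/all/<name> and link it from each view folder, e.g. by-client/<client>/<name>."""
    root = Path(root)
    project = root / "all" / name
    project.mkdir(parents=True, exist_ok=True)
    for view, value in views.items():
        # by_client="acme" becomes the folder by-client/acme/
        link_dir = root / view.replace("_", "-") / value
        link_dir.mkdir(parents=True, exist_ok=True)
        link = link_dir / name
        if not link.is_symlink() and not link.exists():
            # A relative target keeps the links valid if the whole tree is moved.
            target = os.path.relpath(project, start=link_dir)
            link.symlink_to(target, target_is_directory=True)
    return project
```

Calling add_project("projects", "website", by_client="acme", by_language="python") keeps the real folder in all/ and gives each view a link to it, so nothing is ever stored twice.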
I do something like:
From Documents > ‘routine documents’ > FY > Month > Section (personnel, operations, or logistics) > and whatever task from there for my main day-to-day stuff
But, for operations outside of the monthly sort, like managing personnel training, it gets really weird;
From Documents > Training > FY > department > categories of training > subcategory > individual person’s folder for the course > application folders with dates (the last folder here is the one that got approved, dated for when they’re going to the school).
This one is where I end up with file names I can’t rename.
In my obsidian notes folder, i have
.
For file navigation, i use links and references within the notes themselves, which creates a network of linked files that is far far easier to navigate than folders
Everything else is sorta all over the place, but in general
~/ is the user home directory
For pictures, i use a self hosted Immich instance
Too deep.
  
I am having a problem because sometimes I broke my own rules or sorted every item into its own folder.
My paths are pretty short ngl. /home/user/devel/projects/android/testproject/ is probably the longest one. Or maybe even /home/user/devel/lessons/dotnet-aspnet/exam/AspnetExam/xxxroot/libs/bootstrap-icons/, but that one is temporary, I'll archive it once it's done.
wtf is an SDD?
Super duper drive ...
 says SSD
  
 shows a symbol of an HDD  
MFW most people don't care because they understand the nuance of communication except for me
Actually it says SDD. Must be referring to those Seagate hybrid drives, but even those are referred to as SSHD, so I'm at a loss for what they mean.
So I'm not the only one who misspells. Cool!
That's clearly an ipod
Do you even git?
Surely experiment 1…n should be branches.
With git LFS there's no excuse.
You guys have never had to handle a 300GB tiff file from microscopy and it shows.
Just put it all in the same folder and call it something like:
20250816_ProjectType_ActualNameHere_v001
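The nice side effect of that pattern is that a plain alphabetical sort is also a chronological one. A quick Python sketch, with a made-up helper name and the placeholder fields from the example above:

```python
from datetime import date

def versioned_name(day, project_type, name, version):
    """Build a name like 20250816_ProjectType_ActualNameHere_v001."""
    return f"{day:%Y%m%d}_{project_type}_{name}_v{version:03d}"

names = [
    versioned_name(date(2025, 8, 16), "ProjectType", "ActualNameHere", 2),
    versioned_name(date(2024, 12, 31), "ProjectType", "ActualNameHere", 1),
    versioned_name(date(2025, 8, 16), "ProjectType", "ActualNameHere", 1),
]
names.sort()  # plain string sort: oldest date first, then lowest version
```

This only holds because the year has four digits and the version is zero-padded. Drop either and the sort order falls apart.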
How about New folder (11)/Final/Final2/TO DELETE/New version/DO NOT DELETE/20250816_Version 4
Mmmmm... Just format the hard drive.
Realistically, the skip should be named "Desktop"
Hmmm yeah. But most of it lives in an automatic cloud backup as well.. Photos, important documents, game saves, programming projects. I've lost drives before and apart from one or two moments where I couldn't find a very specific file I didn't really miss anything. The only things that I really do need to backup at the moment are my music projects and the raw files from my photography
~/Desktop/sort/sort/sortme/shit_from_dt/sort/really_important_shit/sort
Hey, I know what's in my folder labeled Stuff.
Ugh thanks for reminding me to clean up my desktop, I guess…
Shouldn't it show the directory the file is in instead of just showing them grouped together? Or is Projects 2 through 4 in the Project 1 folder and ditto with all of the experiment folders?
It is clear what the comic is trying to communicate, and it does so in a logically sound way.
It's not like a comic has to be realistic
Data shouldn't be organized hierarchically.
How would you propose to organize it then?
Relationally
I often catch myself using Downloads to store a very suspicious quantity of files.
Yes. Downloads is the way.
If you want to make yourself organize better, set up a cron to remove all downloads older than 7 days 😳 then you’ll be efficient—and probably have nightmares.
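For anyone brave enough to try it, here's roughly what that job could run. It's a Python sketch with made-up function names, and it defaults to a dry run because the rest of this thread is a cautionary tale. Scheduling it is left to cron or whatever you use.

```python
import time
from pathlib import Path

def stale_files(folder, max_age_days=7, now=None):
    """Return files directly inside folder whose modification time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(p for p in Path(folder).iterdir()
                  if p.is_file() and p.stat().st_mtime < cutoff)

def purge(folder, max_age_days=7, dry_run=True):
    """Delete stale files; with dry_run=True (the default) only report what would go."""
    victims = stale_files(folder, max_age_days)
    if not dry_run:
        for path in victims:
            path.unlink()
    return victims
```

purge(Path.home() / "Downloads") just lists the candidates. Nothing is removed until you pass dry_run=False, and subfolders are left alone either way.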
No, I'll just disable the cron job before it executes and forget about it.
😳
You're a massive du -sh
Downloads is usually my largest folder. Funny thing is that it is literally all just Linux isos because I'm trying some things with servers
"Linux ISOs"
"Trying some things"
Then you end up liking what you slapped together with jank and a prayer to the temporary work-around and now there's a new prod server!
Linux or Windows… doesn’t matter. Downloads is where I will find it.
Yeah, 'cause my work computer limits OneDrive storage to like 100 MB and Downloads is one of the only places where I have write access and it doesn't go to OneDrive.