Zip makes different tradeoffs. Its compression is basically the same as gz (both default to DEFLATE), but you wouldn't know it from the file sizes.
Tar archives everything together, then compresses. The advantage is that there are more patterns available across all the files, so it can be compressed a lot more.
Zip compresses individual files, then archives them. The individual files aren't going to be compressed as much, because the compressor can't use patterns shared between files. The advantages are that an error early in the archive won't propagate to all the other files after it, and you can read a file in the middle without decompressing everything before it.
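To see the difference for yourself, you can build both from the same directory (the names here are just placeholders) and compare:

tar -czf project.tar.gz project/   # archive everything first, then compress the whole stream
zip -qr project.zip project/       # compress each file on its own, then bundle them
ls -lh project.tar.gz project.zip  # with many small, similar files the tar.gz usually comes out smaller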
Yeah that's a rather important point that's conveniently left out too often. I routinely extract individual files out of large archives. Pretty easy and quick with zip, painfully slow and inefficient with (most) tarballs.
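For the curious, pulling a single file out of each looks roughly like this (archive and path names are made up):

unzip big-archive.zip docs/README.md        # zip can seek straight to the entry via its index
tar -xzf big-archive.tar.gz docs/README.md  # tar has to decompress the stream from the start to find it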
Voicebanks for Utau (free (as in beer, iirc) clone of Vocaloid) are primarily distributed as SHIFT-JIS encoded zips. For example, try downloading Yufu Sekka's voicebank: http://sekkayufu.web.fc2.com/ . If I try to unzip the "full set" zip, it produces a folder called РсЙ╠ГЖГtТPУ╞Й╣ГtГЛГZГbГgБi111025Бj. But unar detects the encoding and properly extracts it as 雪歌ユフ単独音フルセット(111025). I'm sure there's some flag you can pass to unzip to specify the encoding, but I like having unar handle it for me automatically.
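For anyone who hasn't used it, the invocation is about as simple as it gets (the filename here is a placeholder, and as far as I remember -e lets you force an encoding if the auto-detection guesses wrong):

unar full-set.zip               # filename encoding detected automatically
unar -e shift-jis full-set.zip  # or specify it yourself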
When I was on Windows I just used 7-Zip for everything. Multi-core decompression is so much better than Microsoft's slow single-core nonsense from the 90s.
Makes sense. There are actual programmers working at Facebook. Programmers want good tools and functionality. They also just want to make good/cool/fun products. I mean, check out this interview with a programmer from Pornhub. The poor dude still has to use jQuery, but is passionate about making the best product they can, like everyone in programming.
There are several levels that let you trade extra processing time for better compression. That being said, I hate xz and it still feels slow AF every time I use it.
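For reference, the levels and threading look roughly like this (file names are placeholders):

xz -1 -c big.tar > fast.tar.xz    # low level: quick, lighter compression
xz -9 -c big.tar > small.tar.xz   # high level: slow, best ratio, much more RAM
xz -T0 -c big.tar > mt.tar.xz     # -T0 uses every core, which helps a lot with the slowness

Multithreading splits the input into blocks, so the ratio can drop a hair, but it's usually worth it.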
Yeah, those tend to be per-folder settings for the File Explorer.
Like View options, thumbnails and such.
It's been a while for me, but I think there was something specifically for thumbnails too.
You might find one if you go into the folder options, set a folder to be optimized for pictures/videos, and add some to it.
This is a completely uneducated guess from a relatively tech-illiterate guy, but could it contain Mac-specific information about weird non-essential stuff like folder backgrounds and item placement in the no-grid view?
They're metadata specific to Macs.
If you download a third-party compression tool, it'll probably have an option somewhere to exclude these from the zips, but the default tool doesn't, AFAIK.
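If you're comfortable with a terminal, the zip that ships with macOS can at least skip the worst offender (folder name is made up; the __MACOSX folder itself only appears when Finder creates the zip):

zip -r clean.zip MyFolder -x '*.DS_Store'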
HFS+ has a different feature set than NTFS or ext4; Apple elected to store metadata that way.
I would imagine modern filesystems like ZFS or btrfs could benefit from something similar, but nobody has chosen to implement it that way.
I'm the weird one in the room. I've been using 7z for the last 10-15 years and now .tar.zst, after finding out that ZStandard achieves higher compression than 7-Zip, even with 7-Zip in "best" mode, LZMA version 1, huge dictionary sizes and whatnot.
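For anyone curious, building one of those is just (directory and file names are placeholders; --zstd needs a reasonably recent GNU tar):

tar --zstd -cf backup.tar.zst myproject/
tar -I 'zstd -19 -T0' -cf backup.tar.zst myproject/   # same thing, but picking the level and thread count yourself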
You can actually use Zstandard as your codec for 7z to get the benefits of better compression and a modern archive format! The downside is that it's not a default codec, so when someone else tries to open the archive they may be confused when it doesn't work.
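Assuming you're on the 7-Zip Zstandard fork (stock 7-Zip doesn't ship the codec), the invocation is roughly this, going from memory on the -m0 syntax (names are placeholders):

7z a -m0=zstd -mx=19 backup.7z myproject/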
First bundling everything into a tar file just to compress it in a separate step is kinda stupid, though. Everything takes much longer because of that. If you don't need to preserve POSIX permissions, tar is pointless anyway.
Zip is fine (I prefer 7z), until you want to preserve attributes like ownership and read/write/execute rights.
Some zip programs support saving Unix attributes, others do not. So when you download a zip file from the internet, it's always a gamble.
Tar + gzip/bz2/xz is more Linux-friendly in that regard.
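A minimal sketch of the tar round trip, with made-up names (ownership only gets restored if you extract as root):

tar -czf backup.tar.gz somedir/   # owner, group and mode are recorded by default
sudo tar -xzpf backup.tar.gz      # -p restores the modes exactly instead of filtering them through your umask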
Also, zip compresses each file separately and then collects all of them in one archive.
Tar collects all the files first, then you compress the tarball, which is more efficient and produces a smaller file.
Zip, RAR, 7z, etc. store and compress the files in one step. Tarballs work differently: tar stores the files, and a second program compresses the tar as one continuous stream.
They originated on different media: programs like zip were born to deal with folder structures, while tar was created to deal with linear tape archives (hence the name).
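That "continuous stream" part is easy to see if you spell the two steps out yourself (directory name is made up); tar -czf does essentially the same thing internally:

tar -cf - myfiles/ | gzip -9 > myfiles.tar.gz   # tar writes one stream to stdout, gzip compresses it
gunzip -c myfiles.tar.gz | tar -tvf -           # and back: decompress first, then let tar read the stream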
Kind of redundant. Both .zip and .rar store an index of files within the archive and are a bit 'inside-out' compared to what we get from tar.gz.
That is, ZIP is pretty close to what you'd get if you first gzipped all your files and then put them into a .tar.
RAR does a little more (if I remember correctly), such as generating a dictionary of common redundancies between files and then using that knowledge to compress the files individually, but better. Something akin to a .tar file is still the result, though.
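To illustrate the 'inside-out' point with a rough sketch (file names are made up, and this is only conceptually like a zip, not the actual format):

gzip -k file1 file2 file3                           # compress each file on its own (-k keeps the originals)
tar -cf like-a-zip.tar file1.gz file2.gz file3.gz   # then bundle the already-compressed pieces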
If you download and extract the tarball as two separate steps instead of piping curl directly into tar xz (for gzip) / tar xj (for bz2) / tar xJ (for xz), are you even a Linux user?
They really, really aren’t. Let’s take a look at this command together:
curl -L [some url goes here] | tar -xz
Sorry the formatting's a bit messy, Lemmy's not having a good day today
This command will start to extract the tar file while it is being downloaded, saving both time (since you don’t have to wait for the entire file to finish downloading before you start the extraction) and disk space (since you don’t have to store the .tar file on disk, even temporarily).
Let’s break down what these scary-looking command line flags do. They aren’t so scary once you get used to them, though. We’re not scared of the command line. What are we, Windows users?
curl -L – tells curl to follow 3XX redirects, which it does not do by default. If the URL you paste into cURL is one that redirects (GitHub release links famously do) and you don’t specify -L, it’ll spit out the HTML of the redirect page, which browsers never normally show.
tar -x – eXtract the tar file (other tar “command” flags, of which you must specify exactly one, include -c for Creating a tar file, and -t for Testing a tar file (i.e. listing all of the filenames in it and making sure their checksums are okay))
tar -z – tells tar that its input is gzip compressed (the default is not compressed at all, which with tar is an option) – you can also use -j for bzip2 and -J for xz
tar -f which you may be familiar with but which we don’t use here – -f tells tar which file you want it to read from (or write to, if you’re creating a file). tar -xf somefile.tar will extract from somefile.tar. If you don’t specify -f at all, as we do here, tar will default to reading the file from stdin (or writing a tar file to stdout if you told it to create). tar -xf somefile.tar (or tar -xzf somefile.tar.gz if your file is gzipped) is exactly equivalent to cat somefile.tar.gz | tar -xz (or tar -xz < somefile.tar.gz – why use cat to do something your shell has built-in?)
tar -v which you may be familiar with but which we don’t use here – tells tar to print each filename as it extracts the file. If you want to do this, you can, but I’d recommend telling curl to shut up so it doesn’t mess up the terminal trying to show download progress also: curl -L --silent [your URL] | tar -xvz (or -xzv, tar doesn’t care about the order)
You may have noticed also that in the first command I showed, I didn’t put a - in front of the arguments to tar. This is because the tar command is so old that it takes its arguments BSD style, and will interpret its first argument as a set of flags regardless of whether there’s a dash in front of them or not. tar -xz and tar xz are exactly equivalent. tar does not care.
The problem is that if the connection gets interrupted, your progress is gone. If you download to a file first and it gets interrupted, you can just resume the download later.
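For the record, curl can pick a partial download back up; with a made-up URL it looks like:

curl -L -C - -O https://example.com/some-release.tar.gz   # -C - resumes a partial download where it left off
tar -xzf some-release.tar.gz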
I stopped doing that because I found it painfully slow. And it was quicker to gzip and upload than to bzip2 and upload.
Of course, my hardware wasn't quite as good back then. I also learned to stop adding the 'v' flag because the bottleneck was actually stdout! (At least when extracting).
Bzip2 compression is often surprisingly good with text files, especially log files. It seems to "see" redundancies there - and logs often have a lot of them - far better than gzip and sometimes even LZMA.
Anyway, if I saw a bunch of tar.bz2 files, that's what I'd expect to find in them.
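If you want to check the claim on your own logs, something like this works (paths are placeholders):

bzip2 -k access.log                    # -k keeps the original around
gzip -k access.log
ls -l access.log.bz2 access.log.gz     # compare the two
tar -cjf app-logs.tar.bz2 /var/log/myapp/   # -j is the bzip2 flag for tar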
I unironically used xz for a long time. It was just easy and the compression was all-around very good. A close second is 7-Zip, because I used it on Windows for years.
I still wonder what that's like. Somebody must still occasionally get a notification that SOMEWHERE somebody paid for their WinRAR license and is like "WOAH WE GOT ANOTHER ONE!"
Compatibility aside, I'd say that .tar.pxz aka .tpxz is probably my vote.
LZMA is probably what I'd want to use. xz and 7-Zip use it. It's a bit slow to compress, but it has good compression ratios, and it's faster to decompress than bzip2.
pixz allows for parallel LZMA compression/decompression. On present-day processors with a lot of cores, that's desirable.
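The tar side of it is just a compressor swap (file and directory names are placeholders):

tar -Ipixz -cf backup.tpxz myproject/   # compression fans out across all cores
tar -Ipixz -xf backup.tpxz              # decompression is parallel too, and pixz indexes the tarball so single-file extraction is fast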