
On Mon, Nov 20, 2017 at 11:31:01PM +0000, Giles Orr wrote:
Wow. I did not know that, thank you. And I see there's a specific switch to rsync for better handling of sparse files.
But ... then what exactly does straight-up 'ls' (without the '-s') report? The man page says '-s' "print[s] the allocated size of each file, in blocks." I was under the mistaken impression (for 23 years now) that that was more or less what 'ls' was already doing.
ls shows the file size, that is, the current total length of the file: if you were to open the file and read it from one end to the other, that is how many bytes you would get. If the file was created sparse (i.e., you open it for writing, seek somewhere, then write some data without writing anything before the seek position), then the parts that have never been written are not allocated yet and are simply implied to be all zeros. Reading the file returns a stream of zeros for the unallocated parts (and conveniently reads those parts VERY fast). So the size of a file and the space allocated on disk for it are not always the same thing (the file size is always greater than or equal to the allocated size, not counting the wasted space in the last block if the file size is not a multiple of the block size).
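To make the seek-then-write case concrete, here is a minimal Python sketch. It assumes a Linux-like system and a filesystem that supports sparse files. The helper names `make_sparse_file` and `sizes` and the file name `sparse.bin` are made up for this illustration. It relies on `st_blocks` being counted in 512-byte units, which is the case on Linux.

```python
import os
import tempfile


def make_sparse_file(path, hole_bytes, payload=b"data"):
    """Seek past `hole_bytes` without writing anything, then write a small payload."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)  # nothing is written for the skipped range
        f.write(payload)


def sizes(path):
    """Return (file size in bytes, bytes actually allocated on disk)."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units (true on Linux; the field
    # is not available on Windows).
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        make_sparse_file(path, 100 * 1024 * 1024)  # 100 MiB hole, then 4 bytes
        size, allocated = sizes(path)
        print(f"file size (what 'ls -l' shows): {size}")
        print(f"allocated (what 'ls -s' and 'du' are based on): {allocated}")
        with open(path, "rb") as f:
            print(f.read(16))  # the unwritten region reads back as zero bytes
```

On a filesystem with sparse-file support, the allocated figure should come out far smaller than the file size; on one without it, the two will be roughly equal.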
Here are some answers: https://en.wikipedia.org/wiki/Sparse_file ... I get the utility of the idea, but it seems to come with some fairly significant hazards.
'ls' can be made to indicate directories (with '/') and links (with '@') and a couple other things with '-F'/'--classify': sparse files would seem to be staggeringly misleading and thus a good target for this kind of marking as well ... Is that possible?
Where else am I likely to run into sparse files? Sounds like mostly things that create file systems, like VirtualBox and friends, Docker (obviously) ... anywhere else?
Sorry to ask so many questions, but 'ls' seems like one of the most basic commands of Linux and I thought I knew what it did: I'm suddenly feeling like a newbie again and would like to get a handle on this ...
Unix file systems simply allow files that are not fully allocated yet. It is a useful feature: FAT does not support it, but NTFS in fact does. It might have been required for POSIX compliance in the past.
-- 
Len Sorensen