I think it is just the clusters currently being written.
The stuff I was writing was compressible, and was not pathologically bad.
If I'd known in advance that the data could not be compressed, it would
have been pretty dumb to find out the hard way that it was not going to save
any space. (I did some tests first, to see how much individual files would
compress. So I knew my 600GB of files would easily fit in the 500GB of
available space.)
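A rough way to run that kind of pre-test could look like the Python sketch
below. It is only an approximation: it uses zlib, which is not the algorithm
NTFS uses (NTFS uses LZNT1), so the numbers are a ballpark at best. The
function names and the 64 kB chunk size are my own assumptions for
illustration.

```python
import zlib


def estimated_ratio(data: bytes, chunk_size: int = 64 * 1024) -> float:
    """Return (compressed size / original size), compressing chunk by chunk.

    zlib stands in for the real file system compressor here, so treat the
    result as a rough estimate only.
    """
    if not data:
        return 1.0
    compressed = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Assume a chunk that does not shrink is stored at its original size.
        compressed += min(len(zlib.compress(chunk)), len(chunk))
    return compressed / len(data)


def file_ratio(path: str) -> float:
    """Estimate the ratio for one file (reads the whole file into memory)."""
    with open(path, "rb") as handle:
        return estimated_ratio(handle.read())
```

Multiplying the total size of a sample of files by the ratio gives a rough
idea of whether the full set would fit in the space available.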
I do not know if the NTFS compression scheme is clever enough to leave
things that do not compress well uncompressed, or whether it goes ahead
anyway. It takes slightly more space to store something that does not
compress well. And the I/O rate would be pretty "lumpy" if the file system
had to make decisions on the fly about how to store things. So my guess
would be, the compressor does its thing in any case.
*******
http://en.wikipedia.org/wiki/Doublespace
"For instance, it compresses whole discs rather than select files.
Furthermore, it hooks into the file routines in the operating system so
that it can handle the compression/decompression (which operates on a
per-cluster basis) transparently to the user and to programs running
on the system."
http://en.wikipedia.org/wiki/NTFS_Compression#File_compression
"Files are compressed in 16-cluster chunks. With 4 kB clusters, files
are compressed in 64 kB chunks. If the compression reduces 64 kB of data
to 60 kB or less, NTFS treats the unneeded 4 kB pages like empty sparse
file clusters - they are not written. This allows not unreasonable
random-access times."
Those descriptions sound different, but I am not sure they are. When I
enabled the tick box on that NTFS partition, it applied to the partition,
so you could claim it applies to the "whole disk". I do not know if the
DoubleSpace concept was different, in that it was "visible", or whether it
was just the description that was not clear about how it worked.
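The chunk arithmetic in the NTFS description quoted above can be sketched in
a few lines of Python. This only models what the quoted text says (16
clusters per chunk, 4 kB clusters, a chunk stored compressed only if it frees
at least one cluster). The names are made up for illustration, and it is not
a model of the real on-disk format.

```python
CLUSTER_BYTES = 4 * 1024      # 4 kB clusters, as in the quoted example
CLUSTERS_PER_CHUNK = 16       # 16 clusters = one 64 kB compression chunk


def clusters_stored(compressed_bytes: int) -> int:
    """Clusters written for one 64 kB chunk, per the quoted description."""
    # Round up to whole clusters using integer arithmetic.
    needed = -(-compressed_bytes // CLUSTER_BYTES)
    # If compression does not free at least one whole cluster, the chunk
    # occupies all 16 clusters anyway.
    if needed >= CLUSTERS_PER_CHUNK:
        return CLUSTERS_PER_CHUNK
    return needed


for size_kb in (10, 60, 61, 64):
    print(size_kb, "kB compressed ->", clusters_stored(size_kb * 1024), "clusters")
```

So a chunk that shrinks to 60 kB occupies 15 clusters, while one that only
shrinks to 61 kB still occupies all 16 and nothing is saved.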
Windows has the ability to unzip a ZIP archive (a thing ending in .zip), but
that is different from a file system compression scheme. The files on my disk
did not end up with a new extension of .zip or anything. They still had their
original file names.
And it is not something I left enabled. After the experiment was finished,
and I had my answer, I returned the disk to uncompressed mode. If compression
takes a whole CPU core and handles 20MB/sec with something like video content,
it would not be a very good permanent choice.
A highly compressible file (pathologically so) is worthwhile to compress, if
you count keeping a CPU core pegged as a good use of CPU. It would allow
a write rate faster than the raw disk alone. But not many things are going
to be that compressible. I do not generally work with data files that are all
zeros (or some other repetitive value besides zero).
What I was doing at the time was converting a Camstudio screen capture movie
into individual BMP files for analysis. That is how I discovered that the
stupid thing duplicates captured frames if it has not finished processing the
current frame. In Camstudio,
the default screen capture speed is set to 200 frames per second, leaving the
impression the program can actually do that. It turns out, in fact, that the
same frame is repeated about 30 times before a new frame has finished
processing and can be written. It means the output "screen movie" contains
about 30x more data than it has really got. It was actually only capturing at
6 to 7 frames per second. So again, I'd be stupid to leave the tool set at the
default 200, if it had no intention of actually capturing 200 frames per second
and making a "smooth" movie. The movie was far from smooth, and did not succeed
in capturing mouse movement well. By converting the movie to BMP format,
then computing a checksum on each frame, I was able to determine how many
frames contained identical content, and I could easily see how the Camstudio
algorithm worked. But to get there, I needed 600GB of space, while the
BMP files were being written out. The individual BMPs were compressible, so
the whole thing fit easily on a 500GB compressing drive. And once I understood
what the thing was doing, I could dump the 600GB of BMP files as they were
no longer needed. If I were to use Camstudio again, I'd simply adjust the
default to something more realistic (perhaps a setting of 12 to 14 FPS, if
the actual hardware can only manage to capture 6 to 7 real frames; there would
be no sense in keeping the 200 setting). The wasted space is only evident if
you attempt to convert the movie to something else. Then it balloons.
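The checksum step described above could be sketched roughly like this in
Python. This is not the exact tooling I used. SHA-256 is just one choice of
checksum, and the function names and the "frames" directory in the usage note
are hypothetical. The idea is to hash each BMP in frame order, count how many
consecutive frames share a hash, and scale the nominal frame rate by the
fraction of frames that are actually new.

```python
import hashlib
from itertools import groupby
from pathlib import Path


def frame_digest(path) -> str:
    """Checksum of one frame file (SHA-256 chosen here as an assumption)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_lengths(digests) -> list:
    """Lengths of runs of identical consecutive digests, in order."""
    return [sum(1 for _ in group) for _, group in groupby(digests)]


def effective_fps(nominal_fps: float, digests) -> float:
    """Nominal rate scaled by (distinct runs / total frames)."""
    digests = list(digests)
    if not digests:
        return 0.0
    return nominal_fps * len(run_lengths(digests)) / len(digests)
```

Usage would be something like
`effective_fps(200, [frame_digest(p) for p in sorted(Path("frames").glob("*.bmp"))])`,
assuming the file names sort in frame order. With every frame repeated about
30 times at a nominal 200 FPS, that works out to 200 / 30, or roughly 6.7 real
frames per second.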
Paul