1) Today in my Imaging class, we were taught that 'file size' and 'size on disk' were different because 'file size' is the raw data of the file, and 'size on disk' is the whole size including file header/tags. I always thought it had something to do with clusters, but my teacher was having none of it... which is it?
2) We were also made to do an exercise saving indexed-colour images as TIFF with a decreasing number of colours. The decrease in filesize from 256 to 128 colours was rather large as expected, then upon saving in 64, 32, and 16 colours, the file size was pretty much the same as 128. Why is this? Again, the teacher said this is due to lots of tags in the file header. He didn't elaborate, leading me to think he could be wrong.
Thanks in advance, and excuse my ignorance.
Name:
Anonymous 2006-02-21 5:48
Question 1) does not have an answer with the given information. It depends on how you count file size, what filesystem you are using...
Name:
Anonymous 2006-02-21 6:08
How is that not enough information? I'm going on the two filesizes given on the properties dialogue of an image file. There are two figures, "file size" and "size on disk". Everywhere I've read on Google says that this is indeed to do with the clusters used by the file. Even using NTFS (which I am) there will be a difference.
I'd just like some confirmation that that's the only reason, and it's not because "the 'size on disk' includes image file headers and 'file size' includes only the raw image data".
And of course, question 2: why do uncompressed TIFFs have the same filesize (in fact, occasionally a larger one) with fewer indexed colors?
Your analysis is correct. The answer is not simple, since different filesystems handle it in different ways, but it generally works like this: disk sectors usually are 512 bytes in size. In order to keep overhead down, most filesystems group sectors into clusters (or zones/blocks/whatever). You can only have one file per cluster, no more. So, since file data generally doesn't stop at convenient cluster boundaries, you usually end up with something known as "slack space", since a file, or the remainder of a file, doesn't take up the entire cluster.
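To put numbers on the slack space idea, here is a minimal sketch in Python. It assumes a 4096-byte cluster (an assumption, check your own volume) and ignores special cases such as filesystem compression, sparse files, or very small files that NTFS can keep inside its own metadata. The function names are made up for illustration.

```python
def size_on_disk(file_size: int, cluster_size: int = 4096) -> int:
    """Bytes allocated when space is handed out in whole clusters."""
    if file_size == 0:
        return 0
    # Round up to the next whole cluster using integer arithmetic.
    clusters = -(-file_size // cluster_size)
    return clusters * cluster_size


def slack_space(file_size: int, cluster_size: int = 4096) -> int:
    """Allocated bytes that the file does not actually use."""
    return size_on_disk(file_size, cluster_size) - file_size


if __name__ == "__main__":
    # Hypothetical file sizes, in bytes.
    for size in (1, 4096, 4097, 270_000):
        print(size, size_on_disk(size), slack_space(size))
```

The point is that "size on disk" depends only on the file's length and the cluster size, not on what the bytes inside the file mean, so headers and tags are counted in both figures.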
Now, I'm not familiar with TIFF, but I suspect the reason you may not be seeing improved benefit with reduced colours is related to the type of image you're using. Is it mostly gradients (e.g.: an actual photograph), or is it full of flat areas?
I suspected as much. At least I can write this in the exam confident in the knowledge that we're being taught wrong and I should really just do my own research.
As for the TIFF question, yes, this is involving photographs, but I'm not sure that that has much to do with it, seeing as a 128 color image has the same amount of colors whether or not they're in a gradient formation. A simplified version of my question would be:
"why are 8, 16, 32 and 64-colour TIFFs the same size as a 128-colour TIFF, when the 128-colour one is half the size of a 256-colour image?"
Please also excuse the thoroughly boring nature of the question.
The correct answer is whatever your instructor says it is. At least until the test. You can concern yourself with less valuable things like "facts" and "reality" after you ace the course.
Name:
Anonymous 2006-02-21 7:24
Short answer. TIFF is a complex format, and your editor doesn't seem to do a very good job at it. Are you using Photoshop? One would think Adobe would do a good job at their own format, but that isn't so.
From your numbers it looks like it's saving the 256 color picture as 8-bit data (as it should), and the 128 color picture as 7-bit data (also as it should), but is for some reason using 7-bit data for the 64, 32, and 16 color pictures as well. Which simply doesn't make any fucking sense at all.
Name:
Anonymous 2006-02-21 7:32
God damn, is it too much to expect your lecturer to know what he's talking about? Apparently so. I wouldn't mind so much if I wasn't paying so much to be at this uni, and hadn't had to correct instructors in the past.
In my last essay on the attractions of the videogame medium, I discussed the differing forms of ludus and paidia play (taught in the lectures), only to have my essay handed back with "what are these words? Are they some kind of spelling mistake?"
Anyway, back to the subject. So, Photoshop is using 7-bits for <128-color indexing... and nothing at all to do with "lots of tags in the header bumping up the filesize".
Name:
Anonymous 2006-02-21 7:46
Come to think of it, it must be doing something even weirder, since a jump from 8-bit data to 7-bit data shouldn't account for an almost halving of size if you're saving the pictures uncompressed. Do you have some webspace allocated to put the pictures up so we could examine them?
Name:
Anonymous 2006-02-21 17:39
> So, Photoshop is using 7-bits for <128-color indexing... and nothing at all to do with "lots of tags in the header bumping up the filesize".
This probably isn't true. What is true is that you shouldn't try guessing what's going on in there, since image compression depends on a number of variables (as well as the algorithm). Unless you know how TIFF works, you're better off not assuming anything. Your professor is probably wrong regarding headers though.
PS. Photoshop really is shitty when it comes to image compression. You'd expect the premiere raster editing program to do this well, but nooooooo...
Your best bet would be to get hold of a hex editor/viewer and a copy of the TIFF specification, then dissect the files, looking for the reason for their enlarged size. Post results here please.
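If a hex editor feels like too much work, the tag directory can be dumped with a few lines of Python. This is only a rough sketch: it reads the first IFD (tag directory) only, treats anything that isn't a single SHORT as a raw 4-byte value or offset, and the tag-name table is a hand-picked subset. The function names and the file path in the usage note are hypothetical.

```python
import struct

# A few baseline TIFF tags that matter for this thread.
TAG_NAMES = {
    256: "ImageWidth",
    257: "ImageLength",
    258: "BitsPerSample",
    259: "Compression",
    262: "PhotometricInterpretation",
    273: "StripOffsets",
    279: "StripByteCounts",
    320: "ColorMap",
}


def read_ifd_entries(data: bytes):
    """Return (tag, type, count, value_or_offset) for each entry in the first IFD."""
    order = data[:2]
    if order == b"II":
        endian = "<"   # little-endian ("Intel")
    elif order == b"MM":
        endian = ">"   # big-endian ("Motorola")
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file: bad magic number")
    (n_entries,) = struct.unpack_from(endian + "H", data, ifd_offset)
    entries = []
    for i in range(n_entries):
        pos = ifd_offset + 2 + 12 * i   # each entry is 12 bytes
        tag, typ, count = struct.unpack_from(endian + "HHI", data, pos)
        if typ == 3 and count == 1:
            # A single SHORT sits in the first 2 bytes of the value field.
            (value,) = struct.unpack_from(endian + "H", data, pos + 8)
        else:
            # Simplification: report the field as a raw 4-byte value or offset.
            (value,) = struct.unpack_from(endian + "I", data, pos + 8)
        entries.append((tag, typ, count, value))
    return entries


def describe(path: str) -> None:
    """Print the first IFD of the TIFF file at `path`."""
    with open(path, "rb") as f:
        data = f.read()
    for tag, typ, count, value in read_ifd_entries(data):
        name = TAG_NAMES.get(tag, str(tag))
        print(f"{name}: type={typ} count={count} value/offset={value}")
```

Calling `describe("example.tif")` on each saved file and comparing BitsPerSample, Compression and the ColorMap count should show fairly quickly whether the files really differ in how the pixels are stored.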
That's what I got. Uncompressed btw. I don't think tiff supports indexed colour..
Name:
Anonymous 2006-02-22 9:03 (sage)
Oh wait this is better, with LZW compression
256 colours - 285kb
128 colours - 244kb
64 colours - 209kb
32 colours - 169kb
16 colours - 133kb
The sizes now drop steadily, by roughly 35 to 41kb each time the colour count halves.
Name:
Anonymous 2006-02-22 12:55
^ Not OP ^
That would make sense, but we weren't using compression in class. I'll see if I can find the pics and upload them.
Name:
Anonymous 2006-02-22 13:24
OK, I had a go at reducing the colors in the same images he used, and I got similar results as >>13 ...which is fine, just means the instructor fucked up.
TIFF definitely supports indexed colour, it just doesn't seem to reduce the filesize under 256 colours... which again, I can live with.
Now I'm back to the problem of file size vs size on disk. Can anyone explain these file sizes to me?
So... all the 'Size on Disk' sizes are the same, understandably, but why are the 'Size' ones different? I'm guessing this is why my instructor thought that 'Size' doesn't include file header tags, but 'Size on Disk' does.
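Clusters are usually 4k in size (8 sectors of 512 bytes). Sometimes they're 1k or 8k (or another power of 2), depending on the size of the filesystem. Yours is most likely 4k, the NTFS default, and your size on disk is a whole number of clusters at that size (remainders in bytes):
278,528 % 1,024 = 0
278,528 % 2,048 = 0
278,528 % 4,096 = 0
So even though the file itself is less than 278,528 bytes, since a cluster can only belong to one file, it takes up 278,528 bytes on disk (68 clusters of 4k).
278,528 % 8,192 = 0 (so this one figure doesn't rule out 8k clusters either)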
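The divisibility check is easy to run yourself, in bytes. A small sketch, assuming cluster sizes are powers of two between 512 bytes and 64k (the function name is made up); 278,528 is the size-on-disk figure from this thread. Note that a check like this can only narrow the cluster size down to a set of candidates, it can't single one out from a single number.

```python
def candidate_cluster_sizes(size_on_disk: int, max_cluster: int = 65536):
    """Power-of-two cluster sizes that divide the size on disk exactly."""
    sizes = []
    cluster = 512  # smallest size tried: one 512-byte sector
    while cluster <= max_cluster:
        if size_on_disk % cluster == 0:
            sizes.append(cluster)
        cluster *= 2
    return sizes


if __name__ == "__main__":
    size = 278_528  # bytes, "size on disk" quoted in the thread
    for cluster in candidate_cluster_sizes(size):
        print(f"{cluster} bytes per cluster -> {size // cluster} clusters")
```

To settle it for certain, look up the volume's actual cluster size with the operating system's own tools rather than inferring it from one file.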
Name:
Anonymous 2006-02-22 17:33 (sage)
The last line should be next to the other numbers.
In TIFF parlance it's known as a 'ColorMap', and is referenced by one byte per pixel. Hence the lack of size reduction below 256 colours.
Your two-colour image could potentially be stored as a 'BiLevel' image in the TIFF file, with 1 bit per pixel, if it was black and white. Which would reduce the filesize to approximately an eighth.
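The arithmetic behind this, as a rough sketch. The 640x480 dimensions are made up for illustration, and the function names are hypothetical; the ColorMap size formula (three channels, 2^bits entries, 2 bytes each) follows the TIFF 6.0 definition of that tag.

```python
def pixel_data_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Uncompressed pixel data size, with each row padded to a whole byte."""
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height


def colormap_bytes(bits_per_pixel: int) -> int:
    """TIFF ColorMap size: 3 channels x 2**bits entries x 2 bytes per entry."""
    return 3 * (2 ** bits_per_pixel) * 2


if __name__ == "__main__":
    w, h = 640, 480  # hypothetical image size
    for bpp in (8, 4, 1):
        print(bpp, "bits/pixel:", pixel_data_bytes(w, h, bpp), "bytes of pixels")
    print("ColorMap at 8 bits:", colormap_bytes(8), "bytes")
```

At one byte per pixel the pixel data is the same size whether the palette holds 256 colours or 16, and the ColorMap is a fixed 1,536 bytes, which is tiny next to the pixels. Only dropping the bits per pixel (for example to 1 for a bilevel image) shrinks an uncompressed file.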
Name:
Anonymous 2006-02-22 21:08
The TIFF format sucks, there is no format that is worse.
The only reason it is still used is the same reason people use Macs, and that is: People are fucking stupid desu.
Apparently Photoshop CS actually has a decent PNG implementation, so it doesn't need SuperPNG?
Name:
Anonymous 2006-02-23 1:41
I suggest you inform your professor and give him a chance to correct the information he has given the class. That way you get more wiggle room in terms of "what is correct" when the professor tries to screw the class out of marks. I hate people who are like "I wasn't wrong. I was just sort-of right. You were still wrong."