2010-05-16

If you know it, you can...

I confess it: I often creep in forums and "ask your question" sites, sometimes I add my answers to solve a problem or manifest an idea (and often I think it's the best answers of course). Mostly it's boring but it is also a cheap pastime and I find it relaxing.

Few hours ago someone asked help for the following problem: I have a JPG file N kbytes long, and I want to make it M kbytes long (M > N). A typical trait of these questions is an almost total inability of the asker to express what they want in a non-ambiguous way.

The already written answers talked about increasing JPG quality and/or resizing image's width and height; these actions have as consequence to make the file bigger.

The common answerer focused on the fact that the file was an image (that can be resized) in a specific format the "quality" can be set of. Since the asker talked about file sizes, I focused on the file instead and swifly thought of cat and dd: dd can be used to "cut" a stream at the desired size, while cat allows the creation of a stream bigger than M. Since the simplest way to make a file bigger is to pad it with zeros, I thought of /dev/zero too, and wrote the line

cat file.jpg /dev/zero |
dd of=modified.jpg bs=1 count=800k


if we want the final file to be 800k bytes (800*1024) long. Neat and easy, isn't it? At this point someone can ask: but this way the JPG gets corrupted while the other answers have the merit not to waste it! And at this point it is useful a little bit of knowledge about JPG files: you can append data to a well-terminated JPG file without problem, since decoders stop to a marker which says the JPG file is finished. If you put data after the stop-marker, you do no harm.

The story taught me that a little bit of knowledge about file formats and classical "*nix" tools (in their GNU versions in my case) can solve a lot of problems easily, and that who does not know these things have as only option the use of softwares that do the job as needed; if there are not such softwares then you've not bricks you can build the solution with. If you don't have a bigger picture of what computers are for, your only idea will be to load you preferred image manipulation software and try to match the file size changing "parameters", and maybe you'll get it after several trial-error cycles... only to notice that now it looks different and you don't want it to look so different!!

So the suggestion is read read read read and learn learn learn learn (and install GNU coreutils:D)

And now a brief explanation of what the line does. cat A B concatenate two streams (e.g. two files) one after the other. The /dev/zero can be considered as a "virtual" infinite file made of all zero bytes. For this reason we need dd to select the right amount of bytes from the stream, copying to "output file" only the quantity we want (see e.g. this Wikipedia link about dd).

No comments:

Post a Comment