2020-08-09

Old times news

Weeks ago I dug into one of my dad’s hard drives on his Microsoft Windows office machine, searching for interesting stuff he could have left there. Since I didn’t have the time to cope with the amount of data found there, I did a simple

dir /s ... >a_file.txt

(or something like this) to have a list to check against the content of the other hard drives at home. Most likely, what’s there has already been copied here.

Now I need to make a similar list of all the folders on the home computer. In fact, I’ve (hopefully) already backed everything up on my external hard drives, and some of it also on my current laptop. Which runs GNU Linux, of course.

I thought a_file.txt was encoded as windows-1252. But

iconv -f windows-1252 -t utf8 a_file.txt

didn’t give what I expected: that’s not the encoding of the characters in the file!

NOTE: the header produced by dir, and also many file names, contain Italian accented letters like à, ù, … So it was very easy to spot the error.

After trying other windows-NNNN encodings with no success, I tried the next “obvious” thing: IBM-nnn charsets. At last, through automated attempts, I found that the charset used by cmd was IBM437. This funny relic.
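For the record, such automated attempts could look more or less like the sketch below. It is not the exact thing I ran: guess_charset is a made-up helper name, and the candidate names are assumed to be known to the local iconv (check with iconv -l). The idea is to convert the file with each candidate charset and keep only the candidates whose output contains a string you know must be there, for instance an accented word from the dir header.

```shell
#!/bin/sh
# guess_charset FILE EXPECTED CANDIDATE...
# Hypothetical helper: prints every candidate charset for which
# converting FILE to UTF-8 yields text containing EXPECTED.
guess_charset() {
    file=$1
    expected=$2
    shift 2
    for cs in "$@"; do
        # iconv fails if the charset is unknown or the bytes are invalid
        # for it; in that case the candidate is silently skipped.
        if out=$(iconv -f "$cs" -t UTF-8 "$file" 2>/dev/null) &&
           printf '%s\n' "$out" | grep -qF -- "$expected"; then
            printf '%s\n' "$cs"
        fi
    done
}
```

Usage would be something like guess_charset a_file.txt 'à' WINDOWS-1252 IBM850 IBM437. Note that accented letters alone may not be enough to tell IBM437 from IBM850, since the two share many of them; a box-drawing character such as ╒ is a better probe.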

Next time I’ll try PowerShell. Maybe.

What I’ve done

Just to fill the page:

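> # convert the listing from IBM437 to UTF-8
> iconv -f ibm437 -t utf8 x >y
> # on lines starting with a date that are not directories, print everything after column 36 (the file name)
> gawk 'BEGIN{FIELDWIDTHS="36 *"} !/<DIR>/ && /^[0-9]{4}-[0-9]{2}-[0-9]{2}/ {print $2 }' <y >z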

A sort -u can be useful here. Then I plan to produce a similar list (with find) of all the files I have backed up from his computer, and to compare the two lists with diff.
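A rough sketch of that plan, under a few assumptions: missing_names is a made-up helper name, find is the GNU one (for -printf), and the list made from the dir output contains bare file names that may still carry the carriage return of the Windows line endings. I use comm here instead of diff, because what matters is only which names are in the Windows list and not in the backup.

```shell
#!/bin/sh
# missing_names WINDOWS_LIST BACKUP_DIR
# Hypothetical helper: prints the file names that appear in WINDOWS_LIST
# but not among the regular files found under BACKUP_DIR.
missing_names() {
    list=$1
    dir=$2
    a=$(mktemp) && b=$(mktemp) || return 1
    # Names from the dir listing: drop any CR, then sort and deduplicate.
    tr -d '\r' <"$list" | LC_ALL=C sort -u >"$a"
    # Base names of all backed-up files (-printf is a GNU find extension).
    find "$dir" -type f -printf '%f\n' | LC_ALL=C sort -u >"$b"
    # comm -23 keeps only the lines unique to the first file.
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Something like missing_names z /path would then list the names still to be hunted down. It compares names only, so two different files with the same name in different folders count as one, and a difference in letter case counts as a miss.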

So far, I’ve picked a pattern matching a collection of several PDF files and checked that at least the number of such files is the same.

> grep -ci "pattern" <z
29
> find /path -iname "pattern" -type f |wc -l
29

So I assume I have all those files. But I want to check more thoroughly for all the other files, so… I’ll work on it.

For now, this whole post is
• to drop summer drops, and
• to show how the past can still be present (IBM 437… OMG!)

Now, a little bit of old time graphics:

╒═════════════════╗
├─────────────────╢
│CHARACTER GRAφICS╟──────────┐
└─────────────────╫──────────┘
                  ║
                  ▀

The only thing I like… (maybe because it reminds me of the ZX Spectrum’s frames, which were thicker and nicer anyway)

This is how it should look with a proper font:
