2011-04-09

What happens if...

Back again. This time I fished out something I had put apart for future fun. It is simply a piece of code that, like a lot of pieces of code I've seen recently, drove me crazy; since it is clearly wrong, but people using it think it is right, just because it happens that it works. (Moreover names chosen for the functions are misleading, but this is a different fact).

How about that, you ask: if something happens to work, it means it is right. Doesn't it? Wrong. Of course, it does not work miraculously: it works since the "real world space" of inputs which feeds it, is smaller than the "theoretical space" of inputs that the code pretends to handle.

Less words, let's look at the code (pseudocode).


procedure Copychar (start, end : integer,
src, out dest, repl : string);

var tmp : string,
i, k : integer;

k := 0;
for i := 0 to src.length;
if i >= start-1 and i < repl.length then
tmp[k] := src[i];
k := k + 1;
else if i >= end then
tmp[k] := src[i-1];
k := k + 1;
end if;
end for;
dest := tmp;
end procedure.


Hopefully every real programmer (being using Pascal or whatever) can see how ill this code is (and the procedure name does not explain what the code's intentions were). First, the for runs from the first character (indexed by 0) to past the last one (indexed by the length of the string minus 1); the guard ring is on the length of the repl string, not on src that is the variable indexed by i. If the repl string is longer than src, bad things may happen.

So the procedure must have at least this constraint: repl.length <= src.length; with this (undeclared) constraint, the index i can index src correctly in the first if. When the index has passed that "threshold" the else if is taken into account for sure, and its body executed when i >= end.

This code is so evil that it is hard to explain why (and it is hard to imagine that it did the job it was written for! But if I would show the rest of the code, you will see that this is very specialized code, camouflaged as generic function, used in a very "special" way, in a very "special" context).

The intention of the procedure (by the name and by input parameters) seemed to be to substitute a string with another, where of the old string we know just where it starts, and where it ends. But if we take a look at it, we see it does not read from repl, and it does indexing in the wrong way...

A correct, correctly general procedure could be instead (I am using more abstractness in place of explicit for loops):


procedure ReplaceStrPart (start, end : integer,
src, out dest, repl : string);

dest := "";
dest := dest + src[0:start-1];
dest := dest + repl;
dest := dest + src[end:];

end procedure.


Hopefully everyone who can be considered a programmer can translate this with for loops and indexes, if the language does not support strings slicing or similar syntax (note: in the syntax a[i:f], if i > f, nothing is copied, while a[i:i] is the same as a[i] and picks the i-th character).

This procedure does what we expect and has a name that is a little bit better.

As for the original evil code, the programmer failed to write a senseful generic function/procedure; rather, he proceeded in a trial-error-fix fashion, in a "limited input space", and he had to "compensate" for what likely he recognized as odd behaviour, until he reached the purpose... I would say, by chance. And everything went fine since the "input space" stood limited, though (I know this) there was no reason why it should happen, there was no a special constraint or handshake about that... it simply was so since people have their habits and generated strings keep their "places"...

No comments:

Post a Comment