2018-03-18

Sather reloaded

Sather Tower By John Galen Howard - from the exhibition catalog of the 'Roma Pacifica' Online Exhibit. Of the John Galen Howard Collection (1955-4), Environmental Design Archives, University of California, Berkeley., Public Domain

Once upon a time, there was Sather… I briefly mentioned its existence in an old article, Languages, OOP, Sather et al.

Since then I discovered other things I liked about Sather, but I abandoned it because there is no maintained compiler, and then I forgot about it.

While I was taking a look at Eiffel, Sather popped into my mind again. It was easy to understand why:

Originally, it was based on Eiffel […] It is probably best to view it as an object-oriented language, with many ideas borrowed from Eiffel.

Besides,

Even the name is inspired by Eiffel; the Sather Tower is a recognizable landmark at Berkeley […]

Its feature list, according to Wikipedia, includes design by contract; in fact, among other things, it has pre, post, and the invariant routine.1

An example:

class MAIN is
  something(a, b:INT):INT
    pre a < b
    post result = initial(b) - initial(a)
  is
    return b - a;
  end;

  main is
    r ::= something(1, 5); -- ok
    #OUT + r + "\n";
    r := something(4, 3);  -- fail
    #OUT + r + "\n";
  end;
end;

As in Ada, these checks are disabled by default, so they must be enabled when necessary. The Sather compiler (sacomp) does it with the options -chk_pre, -chk_post, and -chk_invariant, each followed by the list of classes we want to enable the checks for.2 I haven't found a way to control these checks from within the language, as you do in Ada with the pragma Assertion_Policy.

When compiled with sacomp -chk file_name.sa the output of the example is

4
Runtime error - Violation of precondition
in MAIN::something(INT,INT):INT

as expected.
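
For readers without a Sather compiler at hand, the same contract can be approximated in C++. This is only a sketch under stated assumptions: the ContractViolation type and the hand-written checks are hypothetical stand-ins for what sacomp generates, and unlike Sather they can't be switched on and off per class from the command line.

```cpp
#include <stdexcept>

// Hypothetical exception type standing in for Sather's runtime error.
struct ContractViolation : std::logic_error {
    using std::logic_error::logic_error;
};

// Mirrors MAIN::something: requires a < b, promises result == b - a.
int something(int a, int b)
{
    if (!(a < b))                          // pre a < b
        throw ContractViolation("Violation of precondition");

    const int initial_a = a;               // plays the role of initial(a)
    const int initial_b = b;               // plays the role of initial(b)

    const int result = b - a;

    if (result != initial_b - initial_a)   // post
        throw ContractViolation("Violation of postcondition");
    return result;
}
```

Calling something(1, 5) gives 4, while something(4, 3) throws before the subtraction is ever done, which is the same order of events shown in the Sather output.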

Uniform access principle

All services offered by a module should be available through a uniform notation, which does not betray whether they are implemented through storage or through computation.

Words by Bertrand Meyer (Eiffel's father), cited in Uniform access principle.

This is another feature I like, and another inheritance from Eiffel. You don't really know whether you are accessing a variable/property/attribute or calling a method (in theory you are always calling a method, though optimization can turn a call into a plain memory “access”3).

  object.speed := 100;
  if object.altitude > 100 then
    -- ...
  end;

They look like an assignment and an access of a member attribute, but they aren't necessarily.

The following is a bad class in C++.4

#include <cstddef>
#include <iostream>

class State
{
public:
    size_t voteCount;   // public data member: anyone can change it
    State();
    void countVotes();
};

State::State() : voteCount(0)
{}

void State::countVotes()
{
    // heavy computation which gives 1999566
    voteCount = 1999566;
}

This class can be used in an unintended way. E.g.

int main()
{
    State Florida;
    Florida.countVotes();
    std::cout << Florida.voteCount << "\n";
    Florida.voteCount -= 100;
    std::cout << Florida.voteCount << "\n";
    return 0;
}

Instead, you would write it more like this.

class State
{
private:
    size_t m_voteCount;
public:
    State();
    void countVotes();
    size_t voteCount() const;
};

State::State() : m_voteCount(0)
{}

void State::countVotes()
{
    // heavy computation which gives 1999566
    m_voteCount = 1999566;
}

size_t State::voteCount() const
{
    return m_voteCount;
}

Now the caller can't tamper with the votes. Unless he's fine with making everyone believe there are 0 votes: in that case he “forgets” to call countVotes() and gets the result he wants.

int main()
{
    State Florida;
    //Florida.countVotes();

    std::cout << Florida.voteCount() << "\n";

    return 0;
}

Even if the caller has no malicious intent, he must know how to use the class properly: he must know that voteCount() gives wrong results if called before countVotes().

That is a wrong usage of the class, but a question must be asked: why must the caller be in charge of this duty? Is it necessary?

Let's try to mitigate the burden.

class State
{
private:
    size_t m_voteCount;
    bool m_computed;
public:
    State();
    void countVotes();
    size_t voteCount();
};

State::State() : m_voteCount(0),
                 m_computed(false)
{}

void State::countVotes()
{
    // heavy computation which gives 1999566
    m_voteCount = 1999566;
    m_computed = true;
}

size_t State::voteCount()
{
    if (!m_computed)
        countVotes();

    return m_voteCount;
}

Some users use the class correctly, some don't. John Doe, who worked on two projects made by different teams, saw both “styles”: the one which calls countVotes() first and then uses voteCount() wherever the count of the votes is needed, and the one which calls only voteCount() wherever the count is needed5. Both styles work and John, who's a smart guy, can imagine why, but he's a little bit pissed off by this difference. Therefore he writes to Jim Cee, the provider of the class, telling him his idea.

Jim Cee isn't scared of breaking clients' code and doesn't care about legacy code: it's against his well known manifesto — his employers know it very well. So he writes a new version of the class State and announces old versions won't be maintained anymore.

In this last version there is only one method to access the count of the votes, that is, voteCount(). Nobody but Jim can see it, but all he did was make countVotes() private. Now the interface of his class is cleaner (according to him) and offers only one way to do the right thing.
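
The text doesn't show Jim's last version in C++, so what follows is only a guess at what it might look like: the previous class with countVotes() moved to the private section. The m_computations counter and its accessor are hypothetical additions, there only to make it visible that the heavy computation runs once.

```cpp
#include <cstddef>

class State
{
private:
    std::size_t m_voteCount;
    bool m_computed;
    std::size_t m_computations;   // hypothetical: counts runs of countVotes()

    void countVotes();            // now private: callers can't see it
public:
    State();
    std::size_t voteCount();
    std::size_t computations() const;
};

State::State() : m_voteCount(0), m_computed(false), m_computations(0)
{}

void State::countVotes()
{
    // heavy computation which gives 1999566
    m_voteCount = 1999566;
    m_computed = true;
    ++m_computations;
}

std::size_t State::voteCount()
{
    // compute on first request only
    if (!m_computed)
        countVotes();

    return m_voteCount;
}

std::size_t State::computations() const
{
    return m_computations;
}
```

However many times a client calls voteCount(), computations() stays at 1 after the first call, and there is no longer a code path that can read the count before it has been computed.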

In Sather this final version would be something like:

class STATE is
  private attr votes:INT;
  private attr computed:BOOL;

  create:SAME is
    return new;
  end;

  voteCount:INT is
    if ~computed then
      votes := countVotes;
    end;
    return votes;
  end;

  private countVotes:INT is
    -- heavy computation
    votes := 1999566;
    computed := true;
    return votes;
  end;
end;

This is very close to the C++ example (or vice versa), except for the syntax: C++ stresses the fact that it's a method call by making the () mandatory (without them the call isn't made and you get something entirely different), whereas in Sather countVotes can be both the name of a variable and a call to a method with no arguments.

Usage example6:

class MAIN is
  main is
    florida ::= #STATE;
    #OUT + florida.voteCount + "\n";
  end;
end;

Of course critics have a point: in the voteCount (or voteCount() in C++) approach you don't seem to have control over when the count is done. Actually, the answer is simple: the first time you ask for it. So, if you have an application which can take different paths according to conditions you can't control, and you want to be sure of where and when the heavy computation is done, you simply have to call/access voteCount at a point where it can be done safely.

  -- (A) here we can afford a heavy computation, so we
  -- "trigger" voteCount for the first time
  votes ::= florida.voteCount;
  -- ...

  -- (B) maybe we access florida.voteCount here, maybe not,
  -- who knows? (To avoid problems in (C), we have done
  -- something in (A); then (B) too can use votes, or
  -- florida.voteCount, with no heavy computation in sight)

  -- (C) now we can't afford a heavy computation anymore,
  -- and we don't know if (B) has done it or not... Luckily
  -- we did it in (A), which is always executed.

This isn't a specific problem of the “only one function following the UAP” approach. We have the exact same thing with the voteCount() / countVotes() pair, but we use different functions, that's all.

  // (A)
  countVotes();
  // ...

  // (B)
  // maybe does something calling voteCount()

  // (C)
  // does something calling voteCount() somewhere

This isn't a strong case in favor of having both countVotes() and voteCount() (or better names doing the same thing). Moreover, if there are other paths where countVotes() isn't called (a bug, of course), you get 0 from voteCount(); and if you fix that as Jim Cee did in the second version of his class State, then you are better off providing only one function, e.g. voteCount(), which counts the votes only the first time it's called.

In the case of the setter, the syntactic sugar makes the code cleaner.

In general you have getCount() and setCount(value) to get and set a count. In Sather you write:

class CLASS is
  attr count:INT;
  -- ...
end;

This implicitly defines a setter and a getter so that you can write

   if a_class.count > 100 then
     -- ...
   end;

and also

  a_class.count := 100;

which is syntactic sugar7 for

  a_class.count(100);

So, in order to make something more complicated in an assignment, you define a method accepting a value:

class CLASS is
  private attr p_count:INT;
  
  -- ...
  
  count(v:INT) is
    p_count := v * 2;
  end;

  count:INT is
    return p_count;
  end;
end;

And if you write in a main:

  o ::= #CLASS;
  o.count := 500;
  #OUT + o.count + "\n";

the output will be 1000 (then you can infer correctly that something more than a simple assignment was done in the o.count := 500 statement).
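
For comparison, the same getter/setter pair can be written in C++ without the := sugar by overloading one name. A minimal sketch, where the class name Counter is a hypothetical stand-in for CLASS:

```cpp
class Counter
{
private:
    int m_count;
public:
    Counter() : m_count(0) {}

    // "setter": does more than a plain assignment, like count(v:INT)
    void count(int v) { m_count = v * 2; }

    // "getter"
    int count() const { return m_count; }
};
```

With it, o.count(500) followed by o.count() yields 1000, as in the Sather version; the only thing lost is the o.count := 500 spelling.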


  1. As in Eiffel, invariant is the name of a routine (a feature in Eiffel's lingo) to be written to check the “invariance” of the class instance. In Ada, you “attach” the aspect Type_Invariant to a private type specifying a function to be called to check whichever condition must be checked.

  2. There are other checks which can be enabled, e.g. the out-of-bounds check. The option -chk_all enables all the checks for a class or several classes (specified after the option), and -chk “is a shortcut for -chk_all all”.

  3. Default accessors can be optimized this way. The Problems section of the Wikipedia page is mostly meaningless. It borrows concerns from the c2 wiki; some of these are odd, to say the least. One argument basically says that object.value is bad because it hides what could happen behind it (a method call). But a class does exactly this: it hides stuff and controls access to its state through methods (or the like). It's part of the OO game. There's no difference from OO languages which lack this syntactic sugar and where you must write object.getValue(). That approach “hides the cost of getting the result”, too, while allowing direct access to a variable of an instance (which, by the way, might not hold a correct value if the caller hasn't called the right method beforehand) is a bad pattern. The implementation of getValue() should handle the case where the user calls the method many times: the costly computation behind the scenes (if there is one) is done once (and maybe again if the internal state has changed as a consequence of a call to another method). The same happens when the syntax is object.value.

  4. The example isn't random. See my pseudo-rant in another note and take a look at c2's page on the Uniform Access Principle; search for these words (unless they have changed in the meantime): Although the idea carries a nice idealism, one thing that bothers me about it is that it hides the cost of getting the result.

  5. Jim Cee, the provider of the class, wrote it this way because he kept receiving bug reports from several users complaining that sometimes the voteCount() method returned 0. Jim imagined the problem was that they didn't call countVotes() in every “code path”, but he got tired of telling his clients to check their code first and to read the documentation carefully (and of course he didn't want to debug clients' code). Thus he changed the class so that voteCount() calls countVotes() if it hasn't been called before.

  6. The main method can be put in the class STATE; but then you must compile with the -main STATE option.

  7. If you don't care about syntactic sugar, you can do the same in many, if not all, OO languages, simply using overloading: name() is a getter and name(xxx) is a setter for the same property. C# allows for a better approach than C++ or Java, with results equal to Sather (and Eiffel). That is, in C# you can follow this principle.
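
The remark in note 3, that the costly computation should be done once and again only if the internal state has changed, can be sketched in C++ as follows. All the names here (Ballots, add, total, recounts) are hypothetical and chosen only for illustration; the recount counter exists just to show when the computation actually runs.

```cpp
#include <cstddef>
#include <vector>

class Ballots
{
private:
    std::vector<int> m_votes;         // internal state
    mutable std::size_t m_total;      // cached result
    mutable bool m_valid;             // is the cache up to date?
    mutable std::size_t m_recounts;   // hypothetical: number of recomputations
public:
    Ballots() : m_total(0), m_valid(false), m_recounts(0) {}

    void add(int votes)
    {
        m_votes.push_back(votes);
        m_valid = false;              // state changed: the cache is stale
    }

    std::size_t total() const
    {
        if (!m_valid) {
            // the "costly" computation, done only when needed
            m_total = 0;
            for (int v : m_votes)
                m_total += static_cast<std::size_t>(v);
            m_valid = true;
            ++m_recounts;
        }
        return m_total;
    }

    std::size_t recounts() const { return m_recounts; }
};
```

Repeated calls to total() reuse the cached value; only a call to add() in between forces a new computation, and the caller never has to know which of the two happened.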
