2019-01-13

“Dynamic overloading” in other languages (not C++)

In a previous post I imagined that a future C++ could dynamically dispatch a call to the proper function/method according to the derived types of its arguments: that is, dynamic overloading, whereas in C++ overloading is notoriously a compile-time feature.

Now the question is: what about other languages?

Python

Python isn't statically typed, and methods and functions can't really be overloaded: a new definition with the same name simply replaces the previous one, and the closest thing to overloading is accepting a different number of arguments through default values. In recent revisions of the language we have type hints; in 3.5.1 (the latest version I have) you can hint, almost¹, but the interpreter ignores the hint.

def stronghold(t: int) -> str:
    return str(t) + " old"

print(stronghold("hello"))

This should fail, but it doesn't (and at the same time the syntax is accepted). Do I need a more recent version? Maybe 3.6? Experimenting with python.org/shell (currently Python 3.7…) gives the same result: as far as I can tell, hints are meant for external static checkers (like mypy), and the interpreter never enforces them at runtime.

Anyway… I doubt the feature is used to extend the way Python overloads functions and methods. So we need to write our own dynamic dispatcher and implement what we need inline, or in functions which must have different names…:

# Number and String are our own wrapper classes.
def op_sum(a, b):
    if type(a) == Number and type(b) == Number:
        return Number(int(a.value()) + int(b.value()))
    elif type(a) == Number and type(b) == String:
        return String(str(a.value()) + b.value())
    elif type(a) == String and type(b) == Number:
        return String(a.value() + str(b.value()))
    elif type(a) == String and type(b) == String:
        return String(str(a.value()) + str(b.value()))
    else:
        raise Exception("...")

By the way, we can use the very same “inlined” solution in C++, without the need for separate (overloaded) functions.
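
For instance, a minimal sketch of it (mine, not the code from the gist; it assumes Number and String wrapper classes with a value() accessor, derived from a common Base as discussed below) could be:

#include <stdexcept>
#include <string>
#include <utility>

struct Base { virtual ~Base() = default; };

struct Number : Base {
    int v;
    explicit Number(int v) : v(v) {}
    int value() const { return v; }
};

struct String : Base {
    std::string v;
    explicit String(std::string v) : v(std::move(v)) {}
    const std::string& value() const { return v; }
};

// The "inlined" solution: a single function, dynamic_cast instead of overloading.
Base* op_sum(Base* a, Base* b)
{
    auto* na = dynamic_cast<Number*>(a);
    auto* nb = dynamic_cast<Number*>(b);
    auto* sa = dynamic_cast<String*>(a);
    auto* sb = dynamic_cast<String*>(b);

    if (na && nb) return new Number(na->value() + nb->value());
    if (na && sb) return new String(std::to_string(na->value()) + sb->value());
    if (sa && nb) return new String(sa->value() + std::to_string(nb->value()));
    if (sa && sb) return new String(sa->value() + sb->value());
    throw std::runtime_error("op_sum: unsupported types");
}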

I suppose what holds for Python holds for Ruby too: this dynamic argument-based dispatch (dynamic overloading, in my words) is something the designers of mainstream languages didn't care about, and maybe not without a reason.

Ruby

The most recent version I have is 2.1. The answer is explained here, so we can cook up a solution resembling the one written for Python, since the language itself doesn't offer anything different out of the box.

However, this answer is promising, and so is this one (in fact, I expect that functional languages don't have these problems at all).

def op_sum(a, b)
  if a.is_a?(ONumber) and b.is_a?(ONumber)
    ONumber.new(a.value + b.value)
  elsif a.is_a?(ONumber) and b.is_a?(OString)
    OString.new(a.to_s + b.value)
  elsif a.is_a?(OString) and b.is_a?(ONumber)
    OString.new(a.value + b.to_s)
  elsif a.is_a?(OString) and b.is_a?(OString)
    OString.new(a.value + b.value)
  else
    raise RuntimeError
  end
end

In Ruby, String is a built-in class and numbers already have their own built-in classes (Numeric, Integer, …), hence the initial capital O for our classes.

Ruby and Python (and goals)

In C++ I need to derive Number and String from Base, because otherwise I can't put them in a container like std::stack: containers are homogeneous. In Python and Ruby I don't need a common ancestor for Number (ONumber) and String (OString); two unrelated classes are enough, and we only need them at all if we want to implement special behaviours. That isn't always the case: in my string-number example, String (OString) could be replaced by each language's native string class, and Number (ONumber) could be replaced too. That is, there's no real reason to mess with new classes.

In C++, primitive types don't carry runtime type information… and anyway you can't put an int and a std::string together into the same container. We need a wrapper class for int (like Number), and still a common base class.
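
A minimal illustration of the point, reusing the hypothetical Base/Number/String sketch above:

#include <stack>

void demo()
{
    std::stack<Base*> stack;         // homogeneous: it holds pointers to Base...
    stack.push(new Number(42));      // ...but those can point to any derived type
    stack.push(new String("hello"));
    // std::stack<int> or std::stack<std::string> could not mix the two values,
    // and a plain int carries no runtime type information at all.
}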

Now, let me explain (again?) what I wanted to do. This is related to Golfrun, which in 2019 I can declare dead — but you never know if and when and where metempsychosis will hit.

There is a stack that can hold objects of different types, and a program which pushes objects on the stack and “combines” them using operators. These are “overloaded operators” that pop one or more objects according to their arity, like + (dyadic) and . (pop and print, monadic) in the example.

So + is a sum when the two top objects are numbers, concatenation when both are strings, and concatenation again when just one of them is a string (with an implicit stringification of the other).

I wanted passive objects: the operators do the job. That's why I don't want the sum operator to be a method of any of the objects, even if that would solve the problem of dynamic dispatch at least for the first argument-object (and therefore solve it completely for unary operators)².

I want to write code like this:

   //...
   case OP_PLUS:
      o2 = stack.pop();
      o1 = stack.pop();
      stack.push(op_plus(o1, o2));
      // (renamed to op_plus() to stress the fact that
      // it isn't necessarily a sum)
      break;
   //...

But at the same time I would like to have only

Base* op_plus(Number*, String*);
Base* op_plus(Number*, Number*);
Base* op_plus(String*, Number*);
Base* op_plus(String*, String*);

without having to carve my own dispatcher:

Base* op_plus(Base*, Base*)
{
   // generated by the compiler... what a dream!
}
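
Carved by hand, that dispatcher would look roughly like this (again a sketch of mine, reusing the Base/Number/String classes sketched earlier and the four overloads above):

#include <stdexcept>

Base* op_plus(Base* a, Base* b)
{
    // Overload resolution picks the most specific op_plus for the casted pointers.
    if (auto* na = dynamic_cast<Number*>(a)) {
        if (auto* nb = dynamic_cast<Number*>(b)) return op_plus(na, nb);
        if (auto* sb = dynamic_cast<String*>(b)) return op_plus(na, sb);
    } else if (auto* sa = dynamic_cast<String*>(a)) {
        if (auto* nb = dynamic_cast<Number*>(b)) return op_plus(sa, nb);
        if (auto* sb = dynamic_cast<String*>(b)) return op_plus(sa, sb);
    }
    throw std::runtime_error("op_plus: unsupported types");
}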

If a new type is added, you need to add a lot of new op_plus overloads³ and also new dispatching paths for them. Thus the question was: is there a way (without external tools like a specialized preprocessor) to automatically avoid at least this by-hand update of the dispatcher?

The much-blamed C preprocessor can help to reduce the amount of keystrokes and to keep the code (somewhat) cleaner; a sketch of the idea follows. But I was searching for an automatic “in language” solution.
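
Something along these lines, for instance (a hypothetical sketch: TYPE_PAIRS and op_plus_dispatch are invented names, not code from the gist; it assumes the Number/String classes and the op_plus overloads above):

#include <stdexcept>

// One list of the supported (left, right) type pairs, expanded with an X-macro,
// so the dispatching paths don't have to be edited one by one.
#define TYPE_PAIRS(X)   \
    X(Number, Number)   \
    X(Number, String)   \
    X(String, Number)   \
    X(String, String)

Base* op_plus_dispatch(Base* a, Base* b)
{
#define X(A, B)                                 \
    if (auto* pa = dynamic_cast<A*>(a))         \
        if (auto* pb = dynamic_cast<B*>(b))     \
            return op_plus(pa, pb);
    TYPE_PAIRS(X)
#undef X
    throw std::runtime_error("op_plus: unsupported types");
}

Adding a new type still means touching the TYPE_PAIRS list (besides writing the new overloads), but at least the dispatching paths come out of a single place.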

It seems I'm looking in vain.

Erlang and Haskell

These are my hope 😄, even though their particular nature can make life harder. I doubt it makes sense to speak of dynamic argument-based dispatch here; anyway, these languages rely on powerful pattern matching (not exactly the kind you use with grep 😉), and you can see it as a form of dynamic dispatch.

In Erlang the + implementation could look like this:

plus({string, A}, {string, B}) -> ostring:new(A ++ B);
plus({string, A}, {number, B}) -> ostring:new(A ++ integer_to_list(B));
plus({number, A}, {number, B}) -> onumber:new(A + B);
plus({number, A}, {string, B}) -> ostring:new(integer_to_list(A) ++ B);
plus(_, _) -> error.

But we could use guards as well:

plus(A, B) when is_list(A) and is_list(B) -> A ++ B;
plus(A, B) when is_list(A) and is_number(B) -> A ++ integer_to_list(B);
plus(A, B) when is_number(A) and is_number(B) -> A + B;
plus(A, B) when is_number(A) and is_list(B) -> integer_to_list(A) ++ B;
plus(_, _) -> error.

And that's it. Nice and clean. One could think of the imagined feature dynamic_dispatch<Base*> of C++ as a way to specify guarded entry points (functions) where the guards can only inspect the type of the arguments (which then must be determined at runtime — this works for classes and anything else which comes with RTTI).

In Erlang I used the idea of “annotating” each element pushed on the stack with a proper label, so that it can be used to match the correct function to execute. (Guards can work as in the example, but they have limits: you can't use arbitrary functions to check whether an argument has the desired property, so exploiting pattern matching is more flexible.)

C++ doesn't perform runtime pattern matching of any kind on the function arguments; the dispatcher function is the runtime code which does it. The idea of annotating each element pushed on the stack has, as its C++ equivalent, the wrapping of every type in a class; and if we have different classes, they all need to derive from a base class, so that the container can hold them.⁴
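
The closest relative of the Erlang labels would probably be an explicit type tag stored in the base class, with the dispatcher switching on the tags instead of using dynamic_cast; a sketch (invented names, and likely one of the frowned-upon tricks of note 4):

enum class Tag { Number, String };

struct TaggedBase {
    Tag tag;
    explicit TaggedBase(Tag t) : tag(t) {}
    virtual ~TaggedBase() = default;
};

// Encode a pair of tags as a single integer usable in a switch.
constexpr int pair_of(Tag a, Tag b) { return static_cast<int>(a) * 2 + static_cast<int>(b); }

TaggedBase* op_plus(TaggedBase* a, TaggedBase* b)
{
    switch (pair_of(a->tag, b->tag)) {
        case pair_of(Tag::Number, Tag::Number): /* sum */                 break;
        case pair_of(Tag::Number, Tag::String): /* stringify, concat */   break;
        case pair_of(Tag::String, Tag::Number): /* concat, stringify */   break;
        case pair_of(Tag::String, Tag::String): /* concatenation */       break;
    }
    return nullptr; // sketch only: each case would build and return the result
}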

In Haskell we can achieve the same result with an approach similar to Erlang's. Haskell is a strongly typed, purely functional language which sometimes makes your life a little harder, but except for a few minor details we get the same thing as in Erlang in this case.

data StackEl = Onum Integer | Ostr String

instance Show StackEl where
  show (Onum a) = show a
  show (Ostr s) = "\"" ++ s ++ "\""

plus :: StackEl -> StackEl -> StackEl
plus (Onum a) (Onum b) = Onum $ a+b
plus (Onum a) (Ostr b) = Ostr $ (show a) ++ b
plus (Ostr a) (Onum b) = Ostr $ a ++ (show b)
plus (Ostr a) (Ostr b) = Ostr $ a ++ b

My conclusion is that the features of these languages make it possible to write clean code for what I wanted to do. But they also have a few (or many, it depends…) drawbacks compared to an imperative/OO language like C++.

Perl6

Perl6 shows what's missing in Python (before version 3.7…?) and Ruby (before version …?): they are dynamically but strongly typed, without the ability to overload a function according to the types of its arguments; i.e., they lack exactly the feature I need, “dynamic overloading” as I called it, or dynamic argument-based dispatching, while C++ has only static argument-based dispatching⁵.

As in Ruby and Python, since containers can be heterogeneous and all values carry runtime type information, we can do without wrapper classes; and if we want them anyway (maybe to add special features like a string representation of the value), we don't need them to be subclasses of a common base class.

The solution is trivial:

multi plus(Numeric $a, Numeric $b --> Numeric) { $a + $b }
multi plus(Str $a, Numeric $b --> Str) { $a ~ Str($b) }
multi plus(Numeric $a, Str $b --> Str) { Str($a) ~ $b }
multi plus(Str $a, Str $b --> Str) { $a ~ $b }

In the gist I've also shown the code for when the values are wrapped in wrapper classes. There's nothing fancy, just a few more lines of code and an unwrapping/rewrapping ballet.

What's next

In the first article I imagined how a similar feature could be added to C++.

In this article I've explored what can be done “out of the box” in other languages (Ruby, Python, Haskell, Erlang, Perl6). Ruby and Python (at least in the latest versions available to me) need an implementation of a dispatcher very similar to the C++ one; Haskell and Erlang can exploit their pattern matching (Erlang with guards or by “labeling” the objects on the stack, Haskell by matching directly on the constructors). Perl6 can embrace the best of every world and would allow me to do exactly what I wanted, in an “environment” I find more programmer-friendly than Haskell (and Erlang), at least when coping with this kind of program.

Hence, I think there's a “winner” so far: Perl6. I am not surprised, indeed.

In future posts I will explore other languages: Go, Java? Smalltalk? Objective-C? Ada? …

And then maybe I will end up programming a preprocessor in Perl6 to generate the needed code in C++…

Code

For those who want working examples, here is a gist containing the code.

Note: dynerlang.erl depends on onumber.erl, ostring.erl and stack.erl.


  1. It's OK in def x(a: Type) -> Type: ..., but it's a syntax error in p: int = 10. Version 3.6 (tried on repl.it, for instance) and more recent ones accept both.

  2. This is also a clue that it would be possible in general, with the help of the compiler and runtime code/data, of course.

  3. If N is the number of different types, the total number of functions for a dyadic operator is N². When you add a new type, the total becomes (N+1)², hence you need to add 2N + 1 functions; in our case, 5 functions. If there are three types (Number, String, Block), each dyadic operator needs 9 functions. If the operator is commutative we can spare some code. However, the whole point of this “heavy” overloading is to have more operators than available symbols: in our case, the fact that "o"(N, S) may differ from "o"(S, N) is a virtue!

  4. I have other tricks in mind, but I bet these are frowned upon.

  5. That is, dispatching based on the arguments' number and runtime types vs. dispatching based on the arguments' number and compile-time types.
