2019-01-12

Dynamic overloading

I have always stated that I dislike C++. “Modern C++” (from C++11 on) changed a lot, but the language still has plenty of dark corners. And limits when you would like to achieve easily some goals, but the language reminds you it doesn't work that way, so things can become more complicated than you would like to think.

Overloading is static

By that I mean that the actual method, or function, is determined at compile time by using the compile-time-known types of the arguments. This avoids “hidden” overhead and maybe new possible runtime errors, but it doesn't allow for syntactical ease in some circumstances.

Let us suppose we have a stack, and dyadic functions behaving differently according to their arguments:

Number* op_sum(const Number*, const Number*);
String* op_sum(const Number*, const String*);
String* op_sum(const String*, const Number*);
// ...

I would like to do something like this:

// ...
Stack<Base*> stack;

   // ...
   case '+':
      stack.push(op_sum(stack.pop(), stack.pop()));
      break;
   // ...

Of course this can't work because we can determine which op_sum() to call only at runtime, i.e., “dynamically”, but “statically” the type returned by stack.pop() will be always the same (a base class), and the op_sum() searched for will be:

return_type op_sum(const Base*, const Base*);

In this case I would finish my days writing code like this…

    // ...
    if (dynamic_cast<Number*>(o1) && dynamic_cast<Number*>(o2)) {
       stack.push(op_sum(dynamic_cast<Number*>(o1),
                         dynamic_cast<Number*>(o2)));
    }
    // ...

…ignoring people saying it's ugly, gives maintenance problems, and there must be better way — i.e., it's better if you redesign your code.

What I'd really like to see it's this feature added to C++ language: under-the-hood dynamic dispatch to the proper function/method based on the real actual class. Some sort of upcast, which of course must be explicitly requested, so that the programmer knows what's going on.

The implementation will use RTTI to determine the actual “upper” class, and then we need a way to tell the compiler that our function must be dynamically dispatched, and maybe we should also list the possible dynamic types, or maybe the compiler can deduce them. Likely there will be usage constraints, too.

dynamic_dispatch_group<Base*> {
   Number* op_sum(const Number*, const Number*);
   String* op_sum(const Number*, const String*);
   String* op_sum(const String*, const Number*);   
}

Or maybe:

Number* op_sum(const Number*, const Number*) dynamic_dispatch<Base*>;
String* op_sum(const Number*, const String*) dynamic_dispatch<Base*>;
String* op_sum(const String*, const Number*) dynamic_dispatch<Base*>;   

Or anything else which will fit better into the grammar of C++ (which is, in my opinion, already a mess, thus anything can experimentally work and then be cleaned up in future revisions of the standard).

Since op_sum() must be dynamically dispatched, when you use the function (or method…) the implementation must produce the code which tries to “upcast” each argument according to the signatures of the functions, hence finding the correct op_sum() to be called; if it doesn't exist, a runtime exception will be thrown. It shouldn't be a great burden, even when implemented roughly (in the worst case we need a number of comparisons equal to the total number of arguments of all the “dynamically overloaded” functions; e.g., 6 in our case1).

Once the correct op_sum() is selected, we have also a certain return type. Likely we have other constraints here. At compile time the compiler could check if all the return type of the group are “compatible”. If we add:

int op_sum(const String*, const String*) dynamic_dispatch<Base*>;

the compiler should give an error, since int isn't Base* nor a derived type, but other return types in the group are. But if we had:

int op_sum(const Number*, const Number*) dynamic_dispatch<Base*>;
int op_sum(const Number*, const String*) dynamic_dispatch<Base*>;
int op_sum(const String*, const Number*) dynamic_dispatch<Base*>;

then it's ok, unless we use op_sum() like in our example, since we want to push the result in a Stack<Base*>. So, it must be possible to cast all the return types of the “dynamic dispatch group” to the expected type on the calling site.

This:

int op_sum(const Number*, const Number*) dynamic_dispatch<Base*>;
double op_sum(const Number*, const String*) dynamic_dispatch<Base*>;

should give at least a warning, no matter how op_sum() is actually called.

And with such a declaration, the following will fail to compile, as mentioned already:

Number* a = new Number(10);
Number* b = new Number(11);
Base* x = op_sum(a, b);

We can mix static common overloading too:

Number* op_sum(const Number*, const Number*) dynamic_dispatch<Base*>;
Number* op_sum(const Number*, int) dynamic_dispatch<Base*>;
Number* op_sum(int, int) dynamic_dispatch<Base*>;

The call to the last line can be determined at compile time, as the usual overloading. Nonetheless, maybe that op_sum() needs to be part of the dynamic dispatching group anyway. I don't know if it is good or not to mix, for the same function name, regular overloading and this new theoretical mechanisms. Anyway, if it is part of the group, it must adhere to the rule for the return types in the group.

But what about virtual methods?

When does C++ dynamically dispatch something? In C++ we have virtual methods; these are dynamically dispatched, but according to the inheritance hierarchy. A method overrides another method in a parent class if the signature matches (return type not included). How can this mechanism play altogether with the “dynamic overloading”?

I think there shouldn't be problems; we need just some (messy) rule to regulate the usage. With the second syntax I proposed, we can imagine that the inheritance rules will be basically the same. (I am going to use struct but of course usually it isn't what you want.)

struct Number : public Base
{
   // ...
   virtual Number* op_sum(const Number* n) const override;
   virtual String* op_sum(const String* s) const override;
   // ...
};

This resolves half the problem in the case of a dyadic operator:

   // ...
   case '+':
      stack.push(stack.pop()->op_sum(stack.pop()));
      break;
   // ...

This would try to call a certain op_sum(const Base*), a method which isn't in Number. But it could be in Base itself. The implementation should upcast to the correct derived class, and then dispatch the call to the correct overloaded method (of the Number class). Again, to make it possible automagically, we should add some information so that the compiler can generate the proper code. In this case however I think the compiler should assume an implied whatever_type op_sum(const Base* s) in the derived class, which would act like a (hidden) entry point for the automatic dynamic dispatch.

struct Number : public Base
{
   // ...
   virtual const Number* op_sum(const Number* n) const override
       dynamic_dispatch<Base*>;
   virtual const String* op_sum(const String* s) const override
       dynamic_dispatch<Base*>;
   // ...
};

In the base class we should have something like this:

    virtual const String* op_sum(const String* s) const = 0;
    virtual const Number* op_sum(const Number* n) const = 0;
    virtual const String* op_sum(const Number* n) const = 0; // ?

But we can't, indeed (only the types of the arguments matter). Then we must return always Base:

    virtual const Base* op_sum(const String* s) const = 0;
    virtual const Base* op_sum(const Number* n) const = 0;

Because of the invented dynamic_dispatch<Base*>, the compiler should implicitly create the entry point for the dynamic dispatching, as if we had:

    const Base* op_sum(const Base* b) const /* ... ? ... */;

From where does it come the constness of the argument and the return type? It must be deduced, and it must be consistent — so, we can't mix const and non-const arguments.

The code, written by hand, could look like this:

const Base* Number::op_sum(const Base* b) const
{
    if (dynamic_cast<const Number*>(b)) return this->op_sum(dynamic_cast<const Number*>(b));
    else if (dynamic_cast<const String*>(b)) return this->op_sum(dynamic_cast<const String*>(b));
    else throw std::runtime_error("dispatching error");
}

In the case when we have an argument whose actual type can be determined at compile time, it would partially work as for the well known overloading. For example (a common function, not a method of a class anymore — but the idea can be used in that case as well):

Base* op_sum(const Number* n, int n1) dynamic_dispatch<Base*>;
Base* op_sum(const Number* n, Number* n1) dynamic_dispatch<Base*>;

// calling (a possible implementation of the operator inc)
    stack.push(op_sum(o1, 1));

At compile time, the compiler already can rule out all those functions of the group op_sum whose second argument isn't an int.

Can (variadic) templates help?

C++ hasn't the feature I imagined above. Is there a way to do something without writing code like the dispatchers (... Number::op_sum(Base*) and similar) by hand?

Can anything be attempted with the use of (eventually variadic) templates (and eventually typeinfos and type_traits)?

I have no clue so far.

The only thought I had was to leverage the idea of tuples, as shown here. The tuple would contain the “base” dynamically casted to every possible derived class. Ignoring constness:

tuple<String*, Number*> t(dynamic_cast<String*>(b),
                          dynamic_cast<Number*>(b));

Only one of the elements of the tuple is not null. Then what? How can it be exploited? I don't know. The vague idea is that at the end of the template game, the typename survivors will be exactly the type of the object, and finally the correct op_num will be used.

I think that if this is possible, it must be also possible this: creating an upcast “dynamic” operator which returns the real class of a base class… and it makes the found type available to the static type system so that the correct function can be called… Of course this can't be possible.

   Base* op_num(Number* a, Number* b);
   // ...

   Base* o1 = stack.pop(); // pick doesn't pop, but pop should pop...
                           // not according to STL, but let's prented...
   Base* o2 = stack.pop();
   stack.push(op_num(upcast<Number*,String*>(o1),
                     upcast<Number*,String*>(o2));

We give a list of “upcasts” to try, and the one which succeeds is the type of the argument… But there's no way we can use the static overload mechanism to call the correct op_num(). This idea simply doesn't carry us anywhere. Dead end2.


  1. The number of arguments of each “overload” is a first classification: we don't try to match the caller's function with those “overloads” which have a different number of arguments.

  2. The idea fails since the beginning because templates are a compile time feature, so they can't be used to do anything special that can be done only at run time. However, I thought there could be a way to use them to help to generate the “dispatching” implementation.

No comments:

Post a Comment