2017-09-11

Overloading so much

I suppose that my average readers know what overloading is, but let me give a quick definition: overloading is a feature that lets you reuse (overload) a function name, provided that the signatures make each function distinguishable. From the types of the arguments in a call the compiler can pick the right code to execute.

I wanted to exploit this in a very simple case: I have a class with the method load which loads some data using different keys. For this story these keys will be name and surname.

So, at a certain point in the code I want to write

o.load(the_name);

and in another point I will have

o.load(the_surname);

The object o can be passed around with its data loaded, and the callee of course doesn't care about how (from which key) those data were loaded.

Here comes the problem: both the_name and the_surname are strings (std::string) coming from somewhere. So there's no way to make any distinction. How can I change this?

One may think the following would do:

typedef std::string Name;
typedef std::string Surname;

This would be nice because I would use those “types” to clarify the “meaning” of the content of the strings, and different types would “force” the call to the right method.

But it actually doesn't work: typedef does not define a new type, despite its name. It is the same for using (since C++11), which at least has the merit not to disguise its real nature.

The typedef creates a synonym which is seen exactly as std::string, so that the following code excerpt wouldn't compile even when you add all the garnish code you can infer by yourself.

void MyClass::load(const Name& name) {
   // ...
}

void MyClass::load(const Surname& surname) {
   // ...
}

The compiler should emit an error like redefinition of, because from its point of view it's like you've written the same method twice.

It's as if Surname and Name were replaced early with std::string, so the annoying behaviour depends only on the “stage” at which the typedefs are resolved.
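A quick way to convince yourself that the alias is not a type of its own is to ask the compiler directly. This is only a minimal sketch: load_key and alias_demo are names I made up for the illustration, they don't belong to MyClass.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

typedef std::string Name;      // old-style alias
using Surname = std::string;   // C++11 alias, same effect

// Both aliases name the very same type as std::string.
static_assert(std::is_same<Name, std::string>::value, "Name is std::string");
static_assert(std::is_same<Name, Surname>::value, "Name and Surname are one type");

// Hypothetical function: one definition serves every alias, which is why
// adding a second load_key(const Surname&) would be a redefinition.
inline std::string load_key(const Name& key) { return "key=" + key; }

inline void alias_demo() {
    Surname s = "Bing";
    std::string plain = "Chandler";
    assert(load_key(s) == "key=Bing");          // a Surname goes in as a Name
    assert(load_key(plain) == "key=Chandler");  // and so does a plain string
}
```

If the static_asserts pass, the compiler sees one single type under three names, and that is all there is to the redefinition error.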

I can't see a reason why signatures using typedefined names must be considered the same when the compiler has to decide whether it's an overloaded method or just a redefinition of an already existing method. But of course there must be a reason. For instance,

std::string s("Hello");
o.load(s);

Should this code trigger an error? If typedef defined a new type, it should, because the compiler couldn't decide which load to call: there isn't any overload taking const std::string& or the like. And an error here would be perfectly fine if Name and Surname were real types!

At the same time, inside load (outside the signature of the method) I want to use name or surname exactly as an instance of std::string.

In other words, I would have liked to use typedef to clarify the meaning of the variable (the datum it holds) and to select the correct method to execute, faking a type in the signature (in order to “deceive” the overloading mechanisms) but behaving as the real type in the body of the method (otherwise we'd need a cast) and in other places.

If feasible, is this a good idea?

Anyway, it would likely break a lot of code; I suppose the committee would rather introduce a new keyword than change the meaning of an old one.

Plain wrapping

Let's see a “fix” that could be worse: wrapping.

struct Name // minimal wrapping
{
  std::string value;
  Name(const std::string& s) : value(s) {}
};

// ---

void MyClass::load(const Name& name_arg) {
  const std::string name = name_arg.value;
  // the same code ...
}

// ---

o.load(Name(the_name));

I can't use Name(the_name) as a std::string in the rest of the code, hence the wrapping would occur only there, in order to “trigger” the right overloaded function (or method in this case). Inside the method we unwrap the string at the very beginning and go on with the same code we had before.

But at this point one may wonder why all this effort, when methods load_by_name and load_by_surname would be clear and would need less code. (Is this how software becomes bloated so easily?)

By the way, I have a second alternative that can mix with the load_by_* solution: defining constants to specify what that std::string is. Example:

o.load(the_name, MyClass::LoadBy::Name);

Something like that. The load method would dispatch to the correct method:

void MyClass::load(const std::string& s, MyClass::LoadBy b) {
    switch (b) {
       case LoadBy::Name:
          load_by_name(s);
          break;
       case LoadBy::Surname:
          load_by_surname(s);
          break;
    }
}

In the public section of our MyClass (!) we have something like

enum class LoadBy  // since C++11 on
{
   Name,
   Surname
};

A sort of handmade dispatching which doesn't spare us the small bloat and doesn't make the code clearer. It is exactly like having to pick a load_by_* method, but done through a value passed as argument to a more general method. It doesn't look like a very nice idea.

All this happens because I would like to “tag” the std::string as bearer of a name or a surname and let the compiler pick one of those (overloaded) methods accordingly.
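If I wanted to go down the tagging road without writing one struct per key, something like the following might do. It is a sketch only: Tagged, NameTag, SurnameTag and Loader are hypothetical names, and Loader is just a stand-in for MyClass that records which overload ran.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: one wrapper template, made distinct by an empty tag type.
template <typename Tag>
struct Tagged {
    std::string value;
    explicit Tagged(const std::string& s) : value(s) {}
};

struct NameTag {};
struct SurnameTag {};

// These aliases do name different types, because the template
// instantiations behind them are different.
using Name = Tagged<NameTag>;
using Surname = Tagged<SurnameTag>;

// Stand-in for MyClass: it only remembers which overload was picked.
class Loader {
public:
    void load(const Name& n)    { last_ = "by name: " + n.value; }
    void load(const Surname& s) { last_ = "by surname: " + s.value; }
    const std::string& last() const { return last_; }
private:
    std::string last_;
};

inline void tagged_demo() {
    Loader o;
    o.load(Name("Chandler"));
    assert(o.last() == "by name: Chandler");
    o.load(Surname("Bing"));
    assert(o.last() == "by surname: Bing");
}
```

The string still has to be unwrapped through value inside load, so this saves typing but not the wrapping itself.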

Harder wrapping

In the plain wrapping case I wrote:

o.load(Name(the_name));

That is, the_name is a string which I wrap “in place” only to trigger the correct overloaded load method.

The following would make more sense:

Name the_name(...);

// ...

o.load(the_name);

If typedef had defined a new type which is such only when used in signatures, I would have:

Name the_name;

// code using the_name as if it were std::string
// ... because in fact it is std::string

o.load(the_name); // run the code that loads by name

This was my original idea. If I use the wrapper class Name I don't obtain what I want because I can't use the_name as if it were std::string.

How can we achieve something close enough?

class Name : public std::string
{
public:
    Name(const std::string& s) : std::string(s) {}
    Name() : std::string() {} // removable
};

Something like this.

This works in many cases, but not all cases; e.g. the following won't compile:

Name p;
// ...
p = "Chandler";

Nor will this compile:

Name p = "Chandler";

But the following would:

Name chandler("Chandler");
Name p = Name("Mr. " + chandler);

And also the following works as expected:

Name x("X");
Name y1(x);
Name y2 = y1;
Name px = y2 + " H.";
Name p("Mr. Chandler");
p += " Bing";

And so on.
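The two failures above have the same cause, as far as I understand the rules: going from a string literal to Name needs two user-defined conversions in a row (literal to std::string, then std::string to Name), and an implicit conversion may use only one. The type traits below spell this out for the Name class shown earlier; conversion_demo is a made-up name for the usage check.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class Name : public std::string {
public:
    Name(const std::string& s) : std::string(s) {}
    Name() : std::string() {}
};

// const char* -> std::string -> Name would need two user-defined conversions.
static_assert(!std::is_convertible<const char*, Name>::value,
              "Name p = \"Chandler\"; is rejected");
static_assert(!std::is_assignable<Name&, const char*>::value,
              "p = \"Chandler\"; is rejected");

// Direct initialization, or starting from a std::string, needs only one.
static_assert(std::is_constructible<Name, const char*>::value,
              "Name p(\"Chandler\"); is accepted");
static_assert(std::is_convertible<std::string, Name>::value,
              "Name p = some_std_string; is accepted");

inline void conversion_demo() {
    Name chandler("Chandler");
    Name p = Name("Mr. " + chandler);  // operator+ yields a std::string
    p += " Bing";                      // inherited from std::string
    assert(p == "Mr. Chandler Bing");
}
```

The assignment fails also because the operator= that the compiler generates for Name hides all the operator= overloads of std::string.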

In this case deriving from std::string isn't an attempt to extend std::string: I want exactly a string, but wrapped into another class just to let the compiler see it as a different type (because in this case it is). The fact that I can use the instances of Name almost like std::string is a convenience. I wouldn't add more code to the class just to cover a few gaps: I would try to keep it as thin as possible, and likely I would also remove the Name() ctor.

Moreover, I am also fine with the fact that an instance of Name can “become” a plain std::string (e.g. when you use a method of std::string which returns std::string&). I don't care if it is “degraded” into its base class as long as it already served its purpose.

Of course this will limit your freedom to use it mindlessly before the call to the method load is done, but you still have a lot of liberty. E.g. the following is fine.

Name p("Doris");
Surname s = p.append(" Surnamed");
o.load(p); // load by name
// ...
o.load(s); // load by surname

But this is not:

Name p("Doris");
// ...
o.load(p.append("!"));

In fact append gives a std::string& and both

void load(const Name& name);

and

void load(const Surname& surname);

are candidates. Anyway the wrapper offers an almost-std::string experience which can be enough for my purposes.
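When the value has already “degraded” to std::string, wrapping it again at the call site removes the ambiguity. Here is a small sketch of that, assuming Surname is written exactly like Name; the free functions load and rewrap_demo are hypothetical stand-ins for MyClass::load and its caller.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

class Name : public std::string {
public:
    Name(const std::string& s) : std::string(s) {}
};

class Surname : public std::string {
public:
    Surname(const std::string& s) : std::string(s) {}
};

// Hypothetical free functions standing in for MyClass::load.
inline std::string load(const Name& n)    { return "by name: " + n; }
inline std::string load(const Surname& s) { return "by surname: " + s; }

// append() is inherited from std::string, so its result is a plain std::string&.
static_assert(std::is_same<decltype(std::declval<Name&>().append("!")),
                           std::string&>::value,
              "append degrades Name to std::string&");

inline void rewrap_demo() {
    Name p("Doris");
    // load(p.append("!"));  // ambiguous: std::string converts to either wrapper
    assert(load(Name(p.append("!"))) == "by name: Doris!");
    assert(load(p) == "by name: Doris!");  // p itself is still a Name
}
```

This is the plain wrapping again, of course, only confined to the few calls where the base class leaks out.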

(By the way, I'll drop this idea: let strings stay strings, even if typedefined for purely semantic clarity, and accept all the consequences.)

Overlord Java

This was about C++. Now, what if I have the same “need” in Java? In this case I can't solve it the same way, because String is final and can't be inherited from:

    class Name extends String
    {
        public Name(String s) {
            super(s);
        }
    }

This piece won't compile. Likely in Java there's no option but the “plain” wrapping, or another clever solution that doesn't come to my not-a-fan-of-Java mind. The plain wrapper could at least override Object's toString() to gain a little bit of convenience, but that is nothing compared to what you have in C++.

Something like:

    final class Name
    {
        private String value;

        public Name(String s) {
            value = s;
        }
        
        @Override
        public String toString() {
            return value;
        }
    }

This code doesn't implement any of the interfaces implemented by String. So likely you will use toString() at the beginning of the implementation of load(), like we did in one of the C++ “solutions”.

public void load(Name s) {
  String name = s.toString();
  // ...
}

This is another example where you wrap the String only when you want to “select” the right load method:

String the_name = "Jessica";
// ... do anything with the_name
o.load(new Name(the_name + " Rabbit"));

We must wrap the string at the end, just to call the proper load method. (This is ugly, just in case I haven't written so above.)

Kneel in the Church of Haskell

At this point I would say that a language like Haskell offers a more interesting approach, though we can't try to compare C++ or Java or anything else OO with a purely functional language like Haskell.

Moreover one may say that this is the closest thing to the “wrap just to call the right function”, but it comes rather naturally in Haskell.

data PersonalData = Name String | Surname String deriving (Show)

-- this isn't an action, and this isn't general...
generalAction s = s

load (Name n) = generalAction $ "Name " ++ n
load (Surname n) = generalAction $ "Surname " ++ n

I think this is beautiful but there's a little problem: no states, no objects. What can the load function load, and where? Since load has side effects, we must exit the purity in the Haskell way, which means, in my opinion, that what is in this case only a syntactic advantage suddenly becomes a burden.

A syntactic advantage… compare

load (Name n) = ...

to

void load(const Name& n);

or to

void load_by_name(const std::string& n);

What about Scala?

I don't know Scala, apart from a few buzz facts: it supports functional programming and it is strongly typed (with a strong static type system).

Maybe I can achieve what the beauty of overlord Java doesn't admit… If only I knew Scala. So far I've done something that resembles very much the Java wrapper solution.

Here it is:

import scala.language.implicitConversions

trait Wrapper[T] {
  val value: T
}

implicit class Name(val value: String) extends Wrapper[String] {
   implicit override def toString :String = value
}

implicit class Surname(val value: String) extends Wrapper[String] {
   implicit override def toString :String = value
}


def act(n: Name) = println("Name " + n.toString)
def act(n: Surname) = println("Surname " + n.toString)


act(new Name("name"))
act(new Surname("surname"))

Not very thrilling. I'm sure we can have a better solution, but currently I don't know enough of the language to say if I'm right or wrong.

Reading here I think I'm misusing/misimplementing the implicit conversion feature.

What about Ada?

No way. Ada won't implicitly convert anything, nor allow you to use a type as if it were another.

Incredibly (and what I said about Scala holds here too, even though I've explored Ada a little bit more) my first attempt made it look as if Ada behaves like C++ with typedef.

-- ...
   type Name is new String;
   type Surname is new String;
   
   procedure Load(S : in Name) is
   begin
      Put_Line("Special Name");
   end Load;
   
   procedure Load(S : in Surname) is
   begin
      Put_Line("Special Surname");
   end Load;
-- ...

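This won't compile: ambiguous expression. It is tempting to conclude that, from the point of view of the type system, both Loads are equal, but that can't be the whole story: type ... is new String declares a derived type, and a derived type is distinct from its parent, so the two Loads are legitimately overloaded. The ambiguity most likely sits in the call, which isn't shown above: if the argument is a string literal, the literal fits any string type and the compiler cannot choose between Name and Surname. A qualified expression such as Name'("...") says which one is meant.

The case where no new type is defined is subtype, as explained here:

A subtype of an indefinite subtype that does not add a constraint only introduces a new name for the original subtype (a kind of renaming under a different notion). …

Their example is:

subtype My_String is String;

My_String and String are interchangeable.

With subtype the two Loads would have the same profile, and I would expect a different error (a conflict between the two declarations rather than an ambiguity). (I will explore the differences between using type and subtype in this case in a far future.)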
