2012-04-09

Modelling social relationships using a knowledge base

A few weeks ago I attended Codemotion 2012. Among the talks I was interested in were those about graph databases (a personal aside: I currently think classical relational databases are misused more often than one would imagine). When the very first example appeared, an old memory came to mind: Prolog. Each time I have taken a look at Prolog, I've stumbled on examples like these: genealogy (X has A and B as parents, Y has Z as sibling...), tastes (X likes A...), ... and of course graphs.
I had not thought about the analogy until then: a collection of Prolog facts is nothing but an internal representation of a graph database (or rather the other way round: a knowledge base made of Prolog facts can be represented as a graph, hence the idea of a graph database).
So I fired up my gprolog installation (the least used bunch of bytes on my hard disk, but I like to have the largest possible number of programming languages/environments installed) to do a little bit of testing, just for the fun of it. But since "social stuff" has a lot of momentum nowadays, I wanted to create a small knowledge base focused on who knows who, who likes who, who loves who... and the possible consequences. The imagined application is for «services» like "find friends", dating sites, and the like.

In the given example there are five persons with made-up names (Mary Brod, Carl Stuckart, Rudolf Fisher, Amanda Least, Minority Report) and the following information about them: birth date, gender, sexual orientation, who knows who, who likes who, and who loves who. Then there are some illustrative rules (one for friendship, i.e. reciprocal knowledge, and another to ask the system whether two persons are "sex compatible") and some helper rules (get the birth year, compute the age). Everything is of course simplified: human relationships happen to be a little bit more complicated.
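
To make the structure concrete, here is a minimal sketch of how the friendship rule and the age helpers might look. It is not the original file: the predicate names born/2, friend/2, birth_year/2 and age/3 are my assumptions, the birth dates are invented placeholders, and only a few knows/2 facts are included.

```prolog
% Sketch only: born/2, friend/2, birth_year/2 and age/3 are assumed names,
% and every birth date below is an invented placeholder.

% born(Person, date(Year, Month, Day)).
born('Mary Brod',       date(1980, 1, 1)).
born('Minority Report', date(1999, 1, 1)).

% knows(A, B): A knows B (directed, not necessarily reciprocal).
knows('Mary Brod',       'Minority Report').
knows('Carl Stuckart',   'Minority Report').
knows('Minority Report', 'Mary Brod').      % assumed for the sketch

% Friendship as reciprocal knowledge.
friend(A, B) :-
    knows(A, B),
    knows(B, A).

% Helper: extract the birth year of a person.
birth_year(Person, Year) :-
    born(Person, date(Year, _, _)).

% Helper: age in whole years, ignoring month and day for simplicity.
% The reference year is passed in explicitly to keep the sketch portable.
age(Person, CurrentYear, Age) :-
    birth_year(Person, Year),
    Age is CurrentYear - Year.
```

With these placeholder facts, the query friend('Mary Brod', F). binds F to 'Minority Report', and age('Minority Report', 2012, A). binds A to 13.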

The following messy graph (generated by graphviz's dot from a source produced by a Perl script "parsing" the Prolog knowledge base file) depicts the relationships between those persons: black edges are "who knows who", blue edges are "who likes who", red edges are "who loves who"; the background of each node represents the gender of the person, while the male-female symbols show their sexual orientation.
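
The Perl script is not shown here. As an alternative illustration (not the method actually used), the edges could also be emitted straight from Prolog. In this sketch edge/3 and print_dot/0 are hypothetical names, the facts are invented placeholders, and node backgrounds and orientation symbols are left out.

```prolog
% Sketch only: invented placeholder facts about two made-up persons.
knows(ann, bob).
likes(ann, bob).
loves(bob, ann).

% edge(From, To, Color): one coloured edge per relationship fact,
% using the same colours as the graph described above.
edge(A, B, black) :- knows(A, B).
edge(A, B, blue)  :- likes(A, B).
edge(A, B, red)   :- loves(A, B).

% Print a Graphviz digraph on standard output.
% The inner disjunction is a failure-driven loop over all edge/3 solutions.
print_dot :-
    write('digraph social {'), nl,
    (   edge(A, B, Color),
        format('  "~w" -> "~w" [color=~w];~n', [A, B, Color]),
        fail
    ;   true
    ),
    write('}'), nl.
```

Calling print_dot. writes a small digraph that can be fed to dot.
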
The knowledge base file can now be used to run queries (I mean, Prolog queries). They are rather intuitive. For example, if we want to know all the people who know Minority, we write this query:

findall(P, knows(P, 'Minority Report'), L).

which returns the list L = ['Mary Brod', 'Carl Stuckart', 'Rudolf Fisher', 'Amanda Least'], and in fact everyone knows Minority (who doesn't?).

The knowledge base as written allows us to check whether two subjects can have sex according to the following criteria: if A and B love or like each other, and the gender of each is in the other's list of preferred genders (which I've called sexual orientation), and if both are over 18, or both are over 13 and their age difference is no more than 5 years, then they might have sex. Let's try Minority Report and Amanda Least. Unfortunately for Minority, who is very young, they can't (because the age difference is too great). Let's try Mary and Amanda... no luck either: since Amanda is not bisexual, Mary's unrequited love can't be consummated.
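
The criteria above could be written roughly as follows. This is a sketch under my own assumptions, not the original rules: only loves/2, likes/2 and canHaveSex/2 are names taken from this post, while gender/2, prefers/2, age/2, attracted/2, orientation_ok/2 and age_ok/2 are hypothetical, and the facts describe two made-up persons. I read "over 18" and "over 13" as strict comparisons.

```prolog
% Sketch only: invented placeholder facts about two made-up persons.
gender(ann, female).
gender(bob, male).

% prefers(Person, Genders): the genders Person is attracted to.
prefers(ann, [male]).
prefers(bob, [female, male]).

% age/2 is given directly here instead of being computed from a birth date.
age(ann, 25).
age(bob, 30).

loves(ann, bob).
likes(bob, ann).

% A is attracted to B if A loves or likes B.
attracted(A, B) :- loves(A, B).
attracted(A, B) :- likes(A, B).

% The gender of each must be among the other's preferred genders.
orientation_ok(A, B) :-
    gender(A, GA), gender(B, GB),
    prefers(A, PA), prefers(B, PB),
    memberchk(GB, PA),
    memberchk(GA, PB).

% Both over 18, or both over 13 and at most 5 years apart.
age_ok(A, B) :-
    age(A, AgeA), age(B, AgeB),
    (   AgeA > 18, AgeB > 18
    ->  true
    ;   AgeA > 13, AgeB > 13,
        abs(AgeA - AgeB) =< 5
    ).

canHaveSex(A, B) :-
    attracted(A, B),
    attracted(B, A),
    A \== B,
    orientation_ok(A, B),
    age_ok(A, B).
```

With these facts canHaveSex(ann, bob). succeeds. Because attracted/2 has two clauses, a pair that both loves and likes each other would succeed more than once, so collecting answers with findall could yield repeated entries.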

Instead of going on this way, let's ask the system to list, for each person, the possible lovers. The query I would use looks like

setof(X, canHaveSex(Person, X), List).

We are asking for the set (to be sure there are no repetitions, since a pair could match more than one sex compatibility criterion) of persons X that Person can have sex with (according to the stated rules and the knowledge in the database). The answer is

List = ['Rudolf Fisher']
Person = 'Amanda Least'

which can be read as: the Person Amanda Least is sex compatible with Rudolf Fisher. They are in love, in fact.
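
A small aside on why setof behaves differently from findall here. The sketch below uses an invented relation pair/2 as a stand-in for a predicate that can succeed more than once for the same couple; the names and facts are placeholders.

```prolog
% Sketch only: pair/2 and its facts are invented placeholders.
pair(ann, bob).
pair(ann, bob).   % the same solution found a second time
pair(cat, dan).

% findall collects every solution into one list and keeps duplicates:
%   ?- findall(X, pair(_, X), L).
%   L is [bob, bob, dan]
%
% setof sorts, removes duplicates, and leaves P free, so it yields
% one list per binding of P, the next one on backtracking:
%   ?- setof(X, pair(P, X), L).
%   first P is ann with L = [bob], then P is cat with L = [dan]
%
% Quantifying P with ^ gives a single list for everybody:
%   ?- setof(X, P^pair(P, X), L).
%   L is [bob, dan]
```

Note also that setof fails when there are no solutions, whereas findall returns the empty list.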

And who has an unrequited love?

findall(P, (loves(P, X), \+ loves(X, P)), L).

gives

L = ['Mary Brod','Carl Stuckart','Rudolf Fisher']

i.e. Mary, Carl and Rudolf each love a person who does not love them back. People who love more than one person are potentially polygamous, and this is the last query we do, promise:

setof(X, loves(P1, X), L), length(L, N), N > 1.

That is, find the set L of persons X loved by P1, where the length of L is greater than 1, i.e. P1 loves more than one person. The variable bindings matching the criteria are:

L = ['Amanda Least','Mary Brod']
N = 2
P1 = 'Rudolf Fisher'

thus, Rudolf is potentially polygamous.

The way queries are written is rather intuitive (once you get a bit accustomed to Prolog programming), and I think the algorithms behind the scenes (backtracking and so forth) are likely already exploited in "digital social contexts" (and if they are not, why not?).