Some languages allow nested functions (or procedures, or subroutines, or whatever you prefer to call them), others don't.
I think those can be useful and I wonder why they aren't embraced by all living programming languages.
The example are made thinking of this scenario: you have a structure (a compound type, a record, a…), and an array of this structure/type, and you want to count the unique elements of this array using a very specific idea of equality (and less than…) which holds only for that purpose. That is, outside this specific count unique elements function, a different rule to compare elements exists (where two elements are equal if each field of the elements is equal).
What about C?
ISO C forbids nested functions. GNU C allows them, unless you specify the -pedantic
or -Wpedantic
option, in which case you have a warning that you can turn into an error by adding -Werror
.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int x[] = {1, 2, 3, 4, 5, 10, -4};
const size_t n = sizeof (x) / sizeof (x[0]);
const int shift = 1;
int invcmp(const void *a, const void *b)
{
const int *i = a;
const int *j = b;
printf("INSIDE %zu\n", n);
return *j - *i + shift;
}
printf("** %d\n", __STDC__);
qsort(x, n, sizeof (x[0]), invcmp);
size_t i = 0;
for (i = 0; i < n; ++i)
printf("%d\n", x[i]);
return 0;
}
In this simple code shift
and n
are used just to show that the nested function can see them. Anything inside the function invcmp
, instead, is out of scope outside it, of course. (Beware: it's not a closure! We don't need closures for this, anyway.)
The following example shows how you would do it in C, without nested functions.
This is the main
:
#include "elements.h"
#include <stdio.h>
int main(void)
{
element s[] = {
{ "zak", 2, 0 },
{ "mac", 3, 1 },
{ "abc", 4, 2 },
{ "zak", 5, 0 },
{ NULL, -1, -1 }
};
printf("%zu\n", count_unique(s));
return 0;
}
In order to avoid “exposing” the specific idea of equality needed by the count_unique()
function to the code actually using count_unique()
, you put it in a file of its own (a compilation unit).
In the place where you use count_unique()
, the compare()
function (see elements.c
) isn't visible. You can't achieve this if you put count_unique()
in the same file of the function using it — namely the file containing main()
, in this case.
In elements.c
, even if the two functions compare()
and count_unique()
are very close and work together, compare()
can't access to anything visible to count_unique()
— except for global variables and static symbols in the same compilation unit (I don't remember how it is called… top scope? outermost scope of the compilation unit?).
Files elements.h
and elements.c
can be found in this gist on github.
What about C++?
C++ suffers from the same problems as C. But modern C++ has lambda functions, which provide exactly for what we are looking for, even if with a different syntax.
size_t count_uniques(const std::vector<SElement> v)
{
size_t c = 0;
bool found;
auto compare = [&v](int i, int j) -> int {
if (v[i].e2 == v[j].e2)
return 0;
else if (v[i].e1 != v[j].e1)
return v[j].e1 - v[i].e1;
else
return v[j].e2 - v[i].e2;
};
for (size_t i = 0; i < v.size(); ++i) {
found = false;
for (size_t j = 0; j < i; ++j) {
if (compare(i, j) == 0) {
found = true;
break;
}
}
if (!found) ++c;
}
return c;
}
With this tool we can write compare()
using indexes instead of references to the elements.
The SElement
is something like this (from the header file):
//...
struct SElement
{
std::string s;
int e1;
int e2;
};
size_t count_uniques(const std::vector<SElement> v);
// ...
The complete example can be found as a gist on github.
What about Fortran?
Fortran allows for one level of nested functions or subroutine, and only in a contains
of a module. For the common usage of a nested function, it makes sense — I suppose that too much nesting can be a sign of bad programming practice.
The idea is the same already used for the C++ example:
! ...
function count_unique(s) result(c)
implicit none
integer :: c
type(element), dimension(:), intent(in) :: s
integer :: i, j
logical :: found
c = 0
do i = 1, size(s)
found = .false.
do j = 1, i-1
if (comp(i, j) == 0) then
found = .true.
exit
end if
end do
if (.not. found) c = c + 1
end do
contains
function comp(i, j) result(d)
implicit none
integer :: d
integer, intent(in) :: i, j
if (s(i)%e2 == s(j)%e2) then
d = 0
else if (s(i)%e1 /= s(j)%e1) then
d = s(j)%e1 - s(i)%e1
else
d = s(j)%e2 - s(i)%e2
end if
end function comp
end function count_unique
! ...
The complete example can be found as a gist on github. It won't compile with Fortran 95 or older versions.
The code for count_unique()
must stay in its module file, so that from the point of view of the file organizations, it is more like C. The advantage here is the ability of the nested function to see the variables of its container, hence allowing us to use indexes (as in the C++ version) instead of references or pointers (as in the C version).
What about Haskell?
Haskell is a different beast, but anyway it is useful to take a look at it. Of course, it allows what we could call nested functions — but I believe this isn't the correct terminology.
The example can be written like this:
countUnique :: [Element] -> Int
countUnique [] = 0
countUnique (x:xs) | x `foundIn` xs = countUnique xs
| otherwise = 1 + countUnique xs
where foundIn :: Element -> [Element] -> Bool
foundIn x [] = False
foundIn x xs = any (== 0) $ map (compare x) xs
where compare :: Element -> Element -> Int
compare a b | (e2 a) == (e2 b) = 0
| (e1 a) /= (e1 b) = (e1 b) - (e1 a)
| otherwise = (e2 b) - (e2 a)
The where
can be thought as the starts of the definition of the “nested” function(s). Here we have already two levels, and I suppose there's no limits.
You can also use let … in
, even if this is an expression (see Let vs. Where), or mix let … in
and where
as in the next example:
countUnique2 :: [Element] -> Int
countUnique2 [] = 0
countUnique2 (x:xs) = let foundIn :: Element -> [Element] -> Bool
foundIn x [] = False
foundIn x xs = any (== 0) $ map (compare x) xs
where compare :: Element -> Element -> Int
compare a b | (e2 a) == (e2 b) = 0
| (e1 a) /= (e1 b) = (e1 b) - (e1 a)
| otherwise = (e2 b) - (e2 a)
in
case () of _
| x `foundIn` xs -> countUnique2 xs
| otherwise -> 1 + countUnique2 xs
In general, I think you choose what it seems more expressive to explain what the code does.
In this Haskell example you can read a
deriving (Show, Eq)
That is, the type has “embedded” its own obvious idea of equality. In C++ we could so something similar overloading the required operators, and in Fortran too. In C, we can't: we need to be more explicit and write other functions.
The test code is:
import Counter
main = do print $ countUnique2 [ Element "zak" 2 0,
Element "mac" 3 1,
Element "abc" 4 2,
Element "zak" 5 0 ]
and the full code can be found in this gist.
Note: the functions in where
can “look back”; e.g., foundIn
could be written with a single argument, and compare
too, because the first argument of these two functions is in fact the x
which is the head of the list passed as argument to countUnique
(see the unexported countUnique3
in the gist). The functions in let … in
too can “look back”, but of course they can't “look forward”.
What about Ada?
Ada accepts nesting as you please.
function Count_Unique (S : Elements) return Integer
is
Found : Boolean;
C : Integer := 0;
function Compare (I, J : Integer) return Integer
is
begin
if S(I).E2 = S(J).E2 then
return 0;
elsif S(I).E1 /= S(J).E1 then
return S(J).E1 - S(I).E1;
else
return S(J).E2 - S(I).E2;
end if;
end;
begin
for I in S'Range loop
Found := False;
for J in S'First .. I - 1 loop
if Compare(I, J) = 0 then
Found := True;
exit;
end if;
end loop;
if not Found then
C := C + 1;
end if;
end loop;
return C;
end Count_Unique;
This function could be put in the declarative part of the “main” procedure, hence making it an example of nesting with depth 3:
procedure CTest is
function Count_Unique (S : Elements) return Integer
-- ...
end Count_Unique;
S : Elements := (
("zak", 2, 0),
("mac", 3, 1),
("abc", 4, 2),
("zak", 5, 0)
);
begin
Put (Count_Unique (S));
New_Line;
end CTest;
In the gist I've put it in its own file. Note that in the specification of course there's no clue about the fact that Count_Unique
uses a helper function to achieve its purpose.
What about other languages?
Every language which has lambda but doesn't allow nested functions, can be considered like C++ (modern C++). If the language allows nested functions and has also lambda, which one to use? It depends.
Python
An example of a language that has lambdas and also allows nested functions defines is Python.
I don't know if those def
s can be considered functions or they are methods of an anonymous class, like in Ruby if I remember correctly.
Anyway they can “see” the variables of the outer scopes.
Maybe using lambdas is better. But Python's lambda must be defined as expressions, so we need to write a thing like this:
comp = lambda i, j: \
0 \
if a[i].e2 == a[j].e2 \
else ( \
a[j].e1 - a[i].e1 \
if a[i].e1 != a[j].e1 \
else a[j].e2 - a[i].e2 \
)
Ruby
In Ruby it is possibile to nest def
s, but the nested def
behaves almost like a function defined outside. The scope of its name with respect to outside the “container” is different, but the problem here is that it doesn't see the variables of the container (said differently, it's not lexically scoped).
def a
z = 1
def b
puts z # ERROR
end
b
puts z
end
a
Luckly lambdas are nicer than Python and our example can be written with those:
# ...
def unique_count(a)
comp = lambda do |i, j|
if a[i].e2 == a[j].e2
return 0
elsif a[i].e1 != a[j].e1
return a[j].e1 - a[i].e1
else
return a[j].e2 - a[i].e2
end
end
# ...
return c
end
The only quirk is that we need to write comp.call(i, j)
, not just comp(i, j)
.
Conclusions
For the purpose of this article, I would say that only the following languages have nested function (uniform syntax, lexical scoping):
- C (only GNU extension)
- Fortran
- Ada
- Python (?)
Those which can use lambda, can achieve the same expressiveness, maybe, but I don't consider them languages with nested functions.
- C++
- Ruby
Haskell belongs to another world, so I don't know if it makes sense to consider it for the exploration of this article — it was used to make an example.
No comments:
Post a Comment