5. Functions

At their simplest, functions can be thought of as a way to organize code, making it easier to read and avoiding repetition. In fact, we have seen functions before: computer programmes are designed around functions, statements that take arguments and return a value. What we look at here is how the programmer defines functions.

3nPlus1.cpp
5.1

      1    //http://en.wikipedia.org/wiki/Collatz_conjecture
      2    
      3    #include<iostream> 
      4    
      5    using namespace std; 
      6    
      7    int update(int integer)
      8    {
      9      if(integer%2==0)
      10       return integer/2;
      11     else
      12       return 3*integer+1;
      13   }
      14   
      15   
      16   int main() 
      17   {
      18              
      19     cout<<"Choose an integer"<<endl;
      20     int number;	 
      21     cin >> number;
      22     
      23     while(number!=1)
      24       {
      25         cout<<number<<endl;
      26         number=update(number);
      27       }
      28     
      29     cout<<"1"<<endl;
      30   
      31     return 0;
      32   }

The Collatz conjecture holds that if you choose any positive integer and successively halve it if it is even, or triple it and add one if it is odd, you will eventually get to one. This programme illustrates this. For our purposes here the important thing is the code at lines 5.1.7-13; this defines the function int update(int integer). Inside the loop in the main programme, at line 5.1.26, the function is called when number is assigned the value update(number). Inside the function there is a local int called integer, and it is initialized with the value of number. The function returns an int: when it gets to a return statement it returns the value following the return, in this case either integer/2 or 3*integer+1.

The function statement has three parts: it is written with the return type, followed by the function name, followed by the parameters in brackets, each of which is also given a datatype. The function is written at the top of the programme because the compiler needs to know about the function when it gets to line 5.1.26, where it is called; otherwise it would not know what sort of object update(number) is. However, it can be very messy having all the functions written at the top, and it is possible to prototype the function instead. This means that there is a one-line statement at the top of the programme which gives the compiler the information it needs as it is compiling main; the function proper is elsewhere, at the bottom of the file or even in a different file, and is linked in when the programme is built. The easiest thing is to look at an example.

3nPlus1_withHeader.cpp
5.2

      1    #include<iostream> 
      2    
      3    using namespace std; 
      4    
      5    int update(int number);
      6    
      7    int main() 
      8    {
      9               
      10     cout<<"Choose an integer"<<endl;
      11     int number;	 
      12     cin >> number;
      13     
      14     while(number!=1)
      15       {
      16         cout<<number<<endl;
      17         number=update(number);
      18       }
      19     
      20     cout<<"1"<<endl;
      21   
      22     return 0;
      23   }
      24   
      25   int update(int number)
      26   {
      27     if(number%2==0)
      28       return number/2;
      29     else
      30       return 3*number+1;
      31   }

The prototype is at line 5.2.5; it is the same as the function statement except that it ends in a semicolon instead of a block, delimited by curly brackets, giving the function commands. The function itself is at lines 5.2.25-31.

Functions can call themselves; this is known as recursion. It is usually illustrated with a factorial example; here I use it in another version of the 3nPlus1 programme.

3nPlus1_recursive.cpp
5.3

      1    #include<iostream> 
      2    
      3    using namespace std; 
      4    
      5    void update(int number);
      6    
      7    int main() 
      8    {
      9               
      10     cout<<"Choose an integer"<<endl;
      11     int number;	 
      12     cin >> number;
      13     
      14     cout<<endl;
      15   
      16     update(number);
      17   
      18     return 0;
      19   }
      20   
      21   
      22   void update(int number)
      23   {
      24     cout<<number<<endl;
      25   
      26     if (number==1)
      27       return;
      28   
      29     if(number%2==0)
      30       update(number/2);
      31     else
      32       update(3*number+1);
      33   }

Now the function only gets called once from inside the main function. In the function itself, it prints out the value of number and stops if it is one; otherwise it calls itself with the value number/2 or 3*number+1. This function does not return a value and so it is given the return type void in the prototype at line 5.3.5 and at the start of the function statement at line 5.3.22.

Here is another example.

Cursing.cpp
5.4

      1    #include<iostream> 
      2    #include<cstdlib>
      3    #include<ctime>
      4    
      5    using namespace std; 
      6    
      7    
      8    void curse(int);
      9    int rand0To3();
      10   
      11   int main() 
      12   {
      13     
      14     srand ( time(NULL) );
      15   
      16     curse(rand0To3());
      17   
      18     return 0;
      19   }
      20   
      21   void curse(int number)
      22   {
      23     switch(number)
      24       {
      25       case 0:
      26         cout<<"Meatus"<<endl;
      27         return;
      28       case 1:
      29         cout<<"Belgium"<<endl;
      30         return;
      31       case 2:
      32         cout<<"R*!&%er"<<endl;
      33         return;
      34       case 3:
      35         cout<<"Tanj"<<endl;
      36         return;
      37       default:
      38         cout<<"f/p"<<endl;
      39         return;
      40       }
      41   }
      42   
      43   int rand0To3()
      44   {
      45     return int(4*double(rand())/(double(RAND_MAX)+1));
      46   }

In this programme there are two functions, void curse(int number) and int rand0To3(); one does not return anything so its return type is void, the other has no argument so it has an empty pair of round brackets. These functions are prototyped at the start, at lines 5.4.8 and 5.4.9. Notice that the prototype does not actually need the variable names for the arguments; these are only needed in the actual function definition, and all the prototype needs is the datatypes of the arguments. It is, however, quite common to leave the variable names in, partly because it sometimes helps, for a complicated function, to remember which argument is which, and partly because the prototype is often cut-and-pasted from the function definition.

At line 5.4.16 curse is called; its argument is something that evaluates to an integer, rand0To3(). The call to rand0To3 returns a number between 0 and 3 inclusive and, using a switch statement, curse prints out one of four curses, depending on what value is passed to it. This shows that an argument can be a call to another function, provided the call evaluates to the correct type; the compiler can check that it does because it has read the prototype at line 5.4.9. Notice as well that there is no need for break statements: the returns serve the same purpose, preventing the programme from executing the rest of the statements in the switch block.

The Cursing.cpp programme uses the random number generator; this is done inside rand0To3. Of course, one advantage of writing things in functions is that, provided you are told what a function does and what its interface is, and provided you trust the person who wrote it, you can use the function without knowing precisely how it does what it does. Furthermore, if you discover a better way of doing something, having it in a function allows you to change it in only one place. This might actually happen with the C++ random number generator: better random number support was added to the language in C++11, in the <random> header.

The rand() random number generator is declared in the cstdlib header and so this must be included; this happens at line 5.4.2. The call rand() returns an integer between zero and a large number, RAND_MAX; this number is often equal to the maximum possible int. Here you can see the C-style syntax with a pre-defined global constant (it is in fact a macro); a more C++ syntax would have a function which returns this value. Anyway, with suitable casting to doubles, rand() is divided by RAND_MAX+1 to give a number in [0,1); the cast is done before adding one because RAND_MAX+1 may not fit in an int. Multiplying by four and casting to an integer gives the required random number. An alternative way of doing this is to use modulo:

5.5

      42   
      43   int rand0To3()
      44   {
      45     return rand()%4;
      46   }

The random numbers produced by rand() are, of course, actually calculated by a deterministic algorithm seeded by an initial condition. Since the default seed is always the same, a programme will always produce the same random numbers unless the seed is explicitly changed. Sometimes this is useful, but if it isn't, the usual way of getting different random numbers each time is to seed the random number generator using the internal clock. This requires that ctime is included, as it is at line 5.4.3; the random number generator is seeded at line 5.4.14.

Two functions can have the same name provided they have different argument datatypes; the signature of the argument list allows the compiler to identify which function is being called. This is not true of the return type: having two functions which differ only in return type will cause an error. Here is an example of a programme where two functions have the same name:

Adding.cpp
5.6

      1    #include<iostream> 
      2    
      3    using namespace std; 
      4    
      5    
      6    int add(int,int);
      7    char add(char,int);
      8    
      9    int main() 
      10   {
      11     
      12     cout<<add(5,3)<<" "<<add('h',3)<<endl;
      13   
      14     return 0;
      15   }
      16   
      17   int add(int a, int b)
      18   {
      19     return a+b;
      20   } 
      21   
      22   char add(char letter,int number)
      23   {
      24     int newNumber=int(letter)+number;
      25   
      26     if(newNumber>127)
      27       newNumber-=96;
      28     
      29     return newNumber;
      30   }

In the argument list new variables are declared and initialized with the call value. In Adding.cpp, for example, the char letter declared at line 5.6.22 is a new char scoped to the function, initialized with the value 'h' given in the call at line 5.6.12. The same happens when the argument is a variable: the parameter is initialized with the variable's value but is a different variable. This is called passing by copy, and the next example is the standard illustration of how passing by copy works.

BadSwap.cpp
5.7

      1    #include<iostream> 
      2    
      3    using namespace std; 
      4    
      5    void swap(int a,int b);
      6    
      7    int main() 
      8    {
      9               
      10     int a=4;
      11     int b=1;
      12   
      13     cout<<"inside main, before a="<<a<<" b="<<b<<endl;
      14   
      15     swap(a,b);
      16   
      17     cout<<"inside main, after  a="<<a<<" b="<<b<<endl;
      18   
      19   }
      20   
      21   
      22   void swap(int a,int b)
      23   {
      24   
      25     cout<<"inside swap, before a="<<a<<" b="<<b<<endl;
      26   
      27     int c=a;
      28     a=b;
      29     b=c;
      30   
      31     cout<<"inside swap, after  a="<<a<<" b="<<b<<endl;
      32   
      33   }

This has the output

5.8

      1    inside main, before a=4 b=1
      2    inside swap, before a=4 b=1
      3    inside swap, after  a=1 b=4
      4    inside main, after  a=4 b=1

So, although the a and b inside swap are swapped, this doesn't swap the a and b in main. There are two alternatives to passing by copy: one involves pointers, which we won't deal with here, and the other is passing by reference. When a variable is passed by reference, the function is told where the calling variable is and works with that variable instead of making its own. The syntax for passing by reference is simple: an ampersand & is added after the data type in the parameter list.

Swap.cpp
5.9

      1    #include<iostream> 
      2    
      3    using namespace std; 
      4    
      5    void swap(int & a,int & b);
      6    
      7    int main() 
      8    {
      9               
      10     int a=4;
      11     int b=1;
      12   
      13     cout<<"inside main, before a="<<a<<" b="<<b<<endl;
      14   
      15     swap(a,b);
      16   
      17     cout<<"inside main, after  a="<<a<<" b="<<b<<endl;
      18   
      19   }
      20   
      21   
      22   void swap(int & c,int & d)
      23   {
      24   
      25     cout<<"inside swap, before c="<<c<<" d="<<d<<endl;
      26   
      27     int e=c;
      28     c=d;
      29     d=e;
      30   
      31     cout<<"inside swap, after  c="<<c<<" d="<<d<<endl;
      32   
      33   }

Now, noting the ampersand at lines 5.9.5 and 5.9.22, the c and d inside swap are references to the variables a and b. I have changed the names of the variables in swap so as to make this easier to explain, but they could just as well have been a and b. Variables inside a scope are like dummy indices in sums: it does not matter what they are called. In BadSwap.cpp the a and b inside the swap function are new variables scoped to the function; in Swap.cpp c and d are references to the a and b in main, and the output is

5.10

      1    inside main, before a=4 b=1
      2    inside swap, before c=4 d=1
      3    inside swap, after  c=1 d=4
      4    inside main, after  a=1 b=4

There are two main reasons to use pass by reference. The first is the one illustrated here, where we want the function to act on more than one variable: a function can only return one value, and using references allows more than one value to be recovered from the action of the function. The other reason has to do with speed and memory. We have only been using simple, small datatypes so far: ints, doubles and so on. Soon we will start looking at more complex datatypes, vectors, vectors of vectors and then classes, and these can take some time to copy, making a pass by copy expensive from a performance point of view. Passing by reference does not carry that expense; however, you need to be aware that if something is passed by reference and changed in the function, it is changed elsewhere as well. We will see that using const in the datatype can help protect against errors of this sort.

Exercises 5

Write a programme to allow someone to play high-or-low against the computer. In each round the computer draws a card and displays it, and the user bets whether a second card will be higher or lower than that one, with the suits ranked by alphabetical order and clubs lowest. The user starts with 50 Euro and can choose each time how much to bet; the user wins by reaching 100 and loses by reaching zero. The computer cheats in two ways: if the same card is drawn twice the second one is altered to suit the computer, and the draw itself is skewed slightly depending on how much the user bets. Use plenty of functions in this programme. It might also be convenient to use enum, enumerated constants, for the suits.