2. Variables

In mathematics variables are quite a subtle idea, and if we weren't so familiar with them, variables in C++ would seem much more straightforward. In C++, when you declare a variable, say x, the program reserves a memory location to store a value. x is the name of the variable, and evaluating it returns the value at that memory location. As a program runs it evaluates expressions; a variable evaluates to the value it stores.

Add.cpp
2.1

      1    #include<iostream> 
      2 
      3    using namespace std; 
      4 
      5    int main() 
      6    {
      7 
      8        int a=2; 
      9 
      10       int b; 
      11       b=3; 
      12 
      13       cout<<a<<" "<<b<<" "<<a+b<<endl; 
      14       a=7; 
      15       cout<<a<<"\n"<<endl; 
      16       cout<<&a<<endl;
      17       cout<<&b<<endl; 
      18 
      19       return 0; 
      20 
      21    }

Here we see it compiled and run.
2.2

      1   $ g++ Add.cpp -o ad
      2   $ ./ad
      3   2 3 5
      4   7
      5
      6   0x7fff32d681cc
      7   0x7fff32d681c8
      8   $

In the example Add.cpp a variable a is declared at line 2.1.8. First the variable type is specified, in this case int for integer. Declaring the variable type tells the program how much memory to set aside: an int requires more memory than a bool, for example; a bool, short for boolean, can only take the values false and true, stored as 0 and 1. Knowing the variable type also specifies which operations can be performed on the variable. An int can be added, for example, and on line 2.1.13 that is what happens: we output the value of a, of b and of a+b. Along with declaring a and b we give them values. = is the assignment operator; b=3 gives b the value 3, that is, it stores 3 at the memory location set aside for b.

Notice that a value can be assigned when the variable is declared, as with a in line 2.1.8, or the variable can be declared and given a value later, as with b in lines 2.1.10/11. Even though a was given a value when it was declared, it can still be given a new value later, as happens on line 2.1.14. When a variable is being declared it can be initialized, that is, given an initial value; this is effectively the same as assigning it a value and is done using brackets: int a(3) declares an int a with the value 3, just as int a=3 does.
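
The three ways of giving a variable a value can be put side by side. This is only a sketch: the function name sumOfBoth is made up for this illustration and is not part of Add.cpp, and the code is wrapped in a small function simply so that it can be called from main.

```cpp
// Hypothetical helper, not part of Add.cpp: gives three ints their
// values in the three ways described above and returns the last one.
int sumOfBoth()
{
    int a = 3;   // declared and assigned a value with =
    int b(3);    // declared and initialized using brackets
    int c;       // declared with no value yet
    c = a + b;   // given a value later
    return c;
}
```

Calling sumOfBoth() from main and printing the result with cout should show that a and b hold the same value whichever syntax is used.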

As a curio the actual memory locations are also printed out: &a is the memory location of a, and this is printed out in line 2.2.6. This will typically vary from run to run and from machine to machine, since where a is stored is decided when the program runs. Luckily we needn't know anything about this; the high level language and the compiler hide it from us.

In the Add.cpp example, b=3 assigns a constant as the value of b; however, while the left hand side of an assignment statement has to be a variable, the right hand side only needs to be something that returns a value when it is evaluated. There are all sorts of expressions that can appear on the right hand side; for the moment let's look at a simple example using some simple algebra.

Assign.cpp
2.1a

 
      1    #include<iostream> 
      2 
      3    using namespace std;
      4 
      5    int main() 
      6    { 
      7 
      8        int a; 
      9        int b; 
      10       int c; 
      11 
      12       a=2; 
      13       b=3; 
      14
      15       c=a+b; 
      16 
      17       cout<<a<<" "<<b<<" "<<c<<endl; 
      18 
      19       c=c+a; 
      20 
      21       cout<<c<<endl;
      22 
      23       c+=a; 
      24 
      25       cout<<c<<endl; 
      26 
      27       c++; 
      28 
      29       cout<<c<<endl; 
      30 
      31       ++c; 
      32 
      33       cout<<c<<endl;
      34 
      35       return 0; 
      36 
      37    }

giving
2.1b

      1     2 3 5 
      2     7 
      3     9 
      4     10 
      5     11

So, in line 2.1a.15 the right hand side of the assignment is an expression; this is evaluated to give 2+3=5 and c is set equal to 5, as testified by line 2.1b.1. Next, in line 2.1a.19 we see that c itself can appear on the right hand side. The right hand side is read first by the program; it has to be, since the program works out what value to assign before assigning it. So the current value of c is used in the expression, giving 5+2=7, and this value is assigned to c, line 2.1b.2. In fact, expressions like this are so common that there is a shorthand: for any variable foo for which addition is defined, foo+=bar is short for foo=foo+bar. Line 2.1a.23 gives an example, and there are similar -=, *= and /= assignment operators. Now, expressions like c+=1 are so common that there is an even shorter shorthand for adding one to a variable: c++ and its near synonym ++c, lines 2.1a.27 and 2.1a.31. There are also decrement operators c-- and --c, which both have the effect of reducing the value of c by one.
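
Assign.cpp only exercises += and the increment operators. The other shorthands mentioned above behave in the same way; the following sketch, with a made-up function name shorthandDemo, steps through them. The values in the comments follow from ordinary integer arithmetic.

```cpp
// Hypothetical function for illustration: applies each shorthand in
// turn and returns the final value of c.
int shorthandDemo()
{
    int c = 10;
    c -= 4;   // same as c = c - 4, so c is now 6
    c *= 3;   // same as c = c * 3, so c is now 18
    c /= 2;   // same as c = c / 2, so c is now 9
    c--;      // reduces c by one, so c is now 8
    --c;      // reduces c by one again, so c is now 7
    return c;
}
```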

Here is a list of some standard data types:

int for integer stores an integer; since the program assigns a fixed amount of memory to a variable there are limits on the size of an integer. In principle this varies from compiler to compiler and chip-type to chip-type; generally an int uses 32 bits and lies in the range -2^31 to 2^31-1.
unsigned int is an unsigned integer, so it cannot be negative; with 32 bits it lies in the range 0 to 2^32-1.
bool is a boolean, used to record true or false; it stores the values 1 or 0.
float is a floating point number, effectively a real number stored to limited precision. It is usually stored using 1 bit for the sign (s), 8 bits for the exponent (e) and 23 bits for the mantissa (m), such that the number is equal to s x m x 2^e.
double is also used to store real numbers, but it takes up twice as much memory as a float and stores the number roughly twice as precisely: about 16 significant figures rather than about 7.
char stores a single character; this can be a letter or number, or a special character, like '\n' for new line or '\t' for tab.

There are also, for example, long int and short int data types. For most compilers short ints have a rather small range and use half the memory of an ordinary int; long ints are often the same size as ordinary ints, though on many 64-bit systems they are twice the size.
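
Since these sizes and ranges vary between compilers, it can be useful to ask the compiler directly. The sketch below uses sizeof and the standard header <limits>, neither of which appears in the text above; the function names are invented for this example.

```cpp
#include <limits>

// Hypothetical helpers, named for this sketch only: the extreme values
// that the integer types can hold on the machine the program is compiled for.
int largestInt()  { return std::numeric_limits<int>::max(); }
int smallestInt() { return std::numeric_limits<int>::min(); }
unsigned int largestUnsignedInt() { return std::numeric_limits<unsigned int>::max(); }

// Memory used by each integer type, in bytes; the values depend on the platform.
int bytesInShort() { return int(sizeof(short int)); }
int bytesInInt()   { return int(sizeof(int)); }
int bytesInLong()  { return int(sizeof(long int)); }
```

Printing these with cout shows the actual limits on your own machine; the only thing guaranteed everywhere is the ordering, a short int is never bigger than an int and an int is never bigger than a long int.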

This program uses the standard data types

DataTypes.cpp
2.1c

 
      1    #include<iostream> 
      2 
      3    using namespace std;
      4 
      5    int main() 
      6    { 
      7 
      8        int a(-23); 
      9        cout<<a<<endl;
      10       
      11       unsigned int b(23); 
      12       cout<<b<<endl;
      13       
      14       bool c(false); 
      15       cout<<c<<endl;
      16 
      17       cout.precision(22);
      18
      19       float d(3.14159265358979323846);
      20       cout<<d<<endl;
      21
      22       double e(3.14159265358979323846);
      23       cout<<e<<endl;
      24     
      25       char f('E');
      26       cout<<f<<endl; 
      27    }

which gives output

2.1d

      1    -23
      2    23
      3    0
      4    3.141592741012573242188
      5    3.141592653589793115998
      6    E

Without worrying about its syntax, the statement cout.precision(22) at line 2.1c.17 changes the precision that cout outputs at; it is normally six significant figures and cout.precision(22) changes it to 22. This is larger than the precision of either the float or the double data type: you can see on lines 2.1d.4/5 that the float produces nonsense after about eight significant figures and the double after about 16. In line 2.1d.3 we see that for bools false has the value 0; true has the value 1.
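
The loss of precision can also be checked without reading digits off the screen. This is a rough sketch, with invented function names, which again assumes the standard header <limits>: digits10 is the number of decimal digits a type is guaranteed to preserve, and losesPrecisionAsFloat tests whether a value survives being stored in a float.

```cpp
#include <limits>

// Hypothetical helpers: decimal digits each type is guaranteed to preserve.
int floatDigits()  { return std::numeric_limits<float>::digits10; }
int doubleDigits() { return std::numeric_limits<double>::digits10; }

// Hypothetical helper: true if storing x in a float and converting
// it back to a double changes its value.
bool losesPrecisionAsFloat(double x)
{
    float f = float(x);       // keeps only as many digits as a float can hold
    return double(f) != x;    // compare with the original double
}
```

A value like 0.5, which a float can hold exactly, comes back unchanged, whereas the 21 digit approximation to pi used in DataTypes.cpp does not.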

One important point is that different data types have different operations and, in fact, the same operation can behave differently for different data types. For example if you take one unsigned int from another the answer is assumed to be an unsigned int.

UIntAdd.cpp
2.3a

     1    #include<iostream>
     2   
     3    using namespace std;
     4
     5    int main()
     6    {
     7
     8         unsigned int a=2,b=1;
     9         cout<<a-b<<endl; 
     10        cout<<b-a<<endl; 
     11  
     12   }

giving
2.3b

     1    $ g++ UIntAdd.cpp  -o ui
     2    $ ./ui
     3    1
     4    4294967295
     5    $

The program treats a-b as an unsigned int, which is fine, it is 1. It also treats b-a as an unsigned int; this is a problem since the true answer is -1, a negative number, and as you can see in line 2.3b.4 the result instead wraps around to 4294967295, which is 2^32-1. Notice, by the way, that in line 2.3a.8 the same unsigned int is used to declare two variables, separated by commas. It is possible to get around the problem by casting the data; this instructs the program to convert one type to another, see line 2.4.9 of Cast.cpp.
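
Before turning to Cast.cpp, it is worth noting that the value printed for b-a is not random: unsigned arithmetic wraps around, so taking 2 from 1 gives the largest value an unsigned int can hold. The following sketch checks this; the function name unsignedDifference is hypothetical and <limits> is a standard header not used elsewhere in this section.

```cpp
#include <limits>

// Hypothetical helper: subtracts two unsigned ints, as UIntAdd.cpp does.
unsigned int unsignedDifference(unsigned int x, unsigned int y)
{
    return x - y;   // wraps around instead of going negative
}

// Hypothetical helper: true if 1-2, computed with unsigned ints, is the
// largest unsigned int, whatever size that happens to be on this machine.
bool wrapsToLargest()
{
    return unsignedDifference(1, 2) == std::numeric_limits<unsigned int>::max();
}
```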

Cast.cpp
2.4

     1    #include<iostream>
     2   
     3    using namespace std;
     4
     5    int main()
     6    {
     7
     8         unsigned int a=2,b=1;
     9         cout<<int(b)-int(a)<<endl; 
     10  
     11        char letter='e';
     12        cout<<letter<<endl;
     13        cout<<int(letter)<<endl;
     14  
     15        double x=1.2;
     16        cout<<int(x)<<endl;
     17        int c=x;
     18        cout<<c<<endl;
     19   }

giving
2.5

     1     $ g++ Cast.cpp -o ca
     2     $ ./ca
     3     -1
     4     e
     5     101
     6     1
     7     1
     8     $

As you can see from lines 2.5.5 and 2.4.13, a char can be cast to an int; a char stores a number, not a character, but when it is printed that number is converted to a character; if you cast the char to an int you get the number instead. As lines 2.5.6 and 2.4.15/16 show, doubles cast to ints by discarding the fractional part, so positive values round down. Finally, assignments between types are often converted automatically, as lines 2.5.7 and 2.4.17/18 show.
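
Cast.cpp only casts a positive double. A short sketch, using made-up helper names, shows what happens more generally: the fractional part is discarded, so negative values move towards zero rather than down. The code for 'e' shown in the comment assumes the usual ASCII character set, as in output 2.5.

```cpp
// Hypothetical helpers for illustration.
int toInt(double x)     { return int(x); }     // drops the fractional part
int codeOf(char ch)     { return int(ch); }    // the number stored in the char, 101 for 'e' in ASCII
char charOf(int code)   { return char(code); } // and back to a character again
```

So toInt(1.9) gives 1 but toInt(-1.9) gives -1, not -2, and charOf(codeOf('e')) returns 'e'.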

One data type we haven't covered is string: unlike the data types above, string has to be loaded separately. If you want to use strings in a program you need to have a #include<string> command at the start of the program and, unless there is a using namespace std; at the start, string will need to be written as std::string. string, of course, stores strings, sequences of characters.

String.cpp
2.6

     1    #include<iostream>
     2    #include<string>
     3
     4    using namespace std;
     5
     6    int main()
     7    {
     8
     9         string line1="Of man's first disobedience";
     10        cout<<line1<<endl; 
     11  
     12        string line2="and the fruit of that forbidden tree";
     13        
     14        string twoLines=line1+" "+line2;
     15  
     16        cout<<twoLines<<endl;
     17        
     18        cout<<twoLines.length()<<endl;
     19        cout<<twoLines.at(5)<<endl; 
     20   }

giving
2.7

     1    Of man's first disobedience
     2    Of man's first disobedience and the fruit of that forbidden tree
     3    64
     4    n

Thus, line1 stores "Of man's first disobedience". Notice in lines 2.6.14 and 2.7.2 that addition for strings is defined as concatenation. The string data type, or class, has other functions defined along with addition. The length of the string twoLines is printed at line 2.6.18: the program prints out twoLines.length(), which is the output of the string function length() applied to the string twoLines. Put another way, since twoLines is an instance of a string it has a set of functions which can be called; they are called using a dot, the function name and brackets. In the case of length the brackets are empty, because length() doesn't take an argument. In lines 2.6.19 and 2.7.4 we see another function, at(int i), which takes an integer argument and returns the character at that position, in this case 'n'; note that it counts from 0, so twoLines.at(0) would return 'O' from "Of".
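
Because counting starts at 0, the last character of a string sits at position length()-1. The sketch below combines concatenation, length() and at() to illustrate this; joinWithSpace and lastCharacter are names invented for the example and lastCharacter assumes the string is not empty.

```cpp
#include <string>

// Hypothetical helper: joins two strings with a single space between them.
std::string joinWithSpace(std::string first, std::string second)
{
    return first + " " + second;
}

// Hypothetical helper: the last character of a non-empty string.
char lastCharacter(std::string s)
{
    return s.at(s.length() - 1);   // counting starts at 0, so the last position is length()-1
}
```

Applied to the two lines used in String.cpp, joinWithSpace gives a string of length 64 whose last character, at position 63, is the final 'e' of "tree".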

There is another, older, syntax which is almost equivalent to at(int i): [int i]. twoLines[5] would have had the same effect. However, this method is not as good: firstly, it is ugly, it doesn't conform with other language conventions as well as at(int i) does, and secondly, it is not range safe. If you wrote twoLines.at(64) you would get a runtime out of range error, because twoLines only has 64 characters and, counting from 0, twoLines.at(64) attempts to return the 65th. The [] form does no such check: an index beyond the end of the string, twoLines[100] say, will also do something wrong, but the consequence is less predictable and may not be so easy to find.
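
The out of range error raised by at() is something a program can detect. The following sketch uses try and catch, which have not been introduced yet, so treat it as a preview rather than something to learn now; the function name inRange is hypothetical, while std::out_of_range is the standard error type that at() reports.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true if position i can be read with at(),
// false if at() reports an out of range error.
bool inRange(std::string s, std::size_t i)
{
    try {
        char ch = s.at(i);    // reports std::out_of_range when i >= s.length()
        return ch == s[i];    // for a valid position at() and [] agree, so this is true
    } catch (const std::out_of_range&) {
        return false;         // at() refused the position
    }
}
```

For the 64 character string twoLines, inRange(twoLines, 5) is true and inRange(twoLines, 64) is false.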

One important aspect of variables is that they are scoped; this means they only exist inside the block of code, delimited by braces, in which they are declared. This is something we will see more of later, but here is an example.

Scope.cpp
2.8

     1    #include<iostream>
     2    #include<string>
     3
     4    using namespace std;
     5
     6    int main()
     7    {
     8
     9         string line;
     10        
     11        {
     12              string word("disobedience");
     13              line="Of man's first "+word; 
     14        }
     15  
     19        cout<<line<<endl; 
     20   }

which will print out Of man's first disobedience. line is declared in the main scope; word, however, is scoped inside the braces at lines 2.8.11/14. If you tried to print out word at line 2.8.19 you would get a compiler error, because word is not declared outside of its scope: the program creates it at the declaration, line 2.8.12, and then deletes it when it goes out of scope at line 2.8.14. Obviously, here, the braces have only been put in to scope the variable word; this is actually sometimes useful, but in most examples the braces are also there for other reasons. Scoping can be subtle if a variable name is repeated: the more local name takes precedence over the less local one.

Scope_2.9.cpp
2.9

     1    #include<iostream>
     2    #include<string>
     3
     4    using namespace std;
     5
     6    int main()
     7    {
     8
     9         string line("Of man's first disobedience");
     10        
     11        {
     12              string line;
     13              line="And the fruit of that forbidden tree"; 
     14        }
     15  
     16        cout<<line<<endl; 
     17   }

prints out Of man's first disobedience; the assignment to "And the fruit of that forbidden tree" happens only to the local line scoped inside the braces.
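
The difference the inner declaration makes can be seen by comparing two versions of the same code. In this sketch the function names are made up for the illustration, and the code is wrapped in functions only so that the two results can be compared.

```cpp
#include <string>

// Hypothetical function mirroring listing 2.9: the inner block declares
// its own line, so the outer line is untouched.
std::string outerAfterShadowing()
{
    std::string line("Of man's first disobedience");
    {
        std::string line;   // a new, local variable that hides the outer one
        line = "And the fruit of that forbidden tree";
    }                       // the local line goes out of scope here
    return line;            // the outer line, unchanged
}

// Without the second declaration the assignment reaches the outer variable.
std::string outerWithoutShadowing()
{
    std::string line("Of man's first disobedience");
    {
        line = "And the fruit of that forbidden tree";
    }
    return line;            // the outer line, now changed
}
```

The first function returns "Of man's first disobedience" and the second returns "And the fruit of that forbidden tree".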

Summary 2

Variables are declared with a data type, so for a data type T a variable named foo is declared as T foo, for example int foo declares an int named foo.
The equals sign does assignment, so a=4 assigns the value 4 to a.
The right hand side of an assignment can be any expression, even one which includes the variable itself. foo+=bar is short for foo=foo+bar. foo++ and ++foo both add one to foo.
Variables can be assigned when they are declared, int a=4, or initialized, int a(4), which amounts to the same thing.
The common data types are
- int used for integers.
- unsigned int used for non-negative integers.
- bool used for true and false.
- float used for real numbers.
- double used for more precise real numbers.
- char used for characters.
One data type can be cast to another, int(1.4) casts 1.4 to an int giving 1; sensible rules are used in casting, but if you aren't sure how one data type casts to another write a small test program to check.
There is a string data type but it must be loaded at the start using #include<string>. It has member functions allowing you to get the length of a string, for example.
Variables are scoped.

Exercises 2

What value does a variable store if you declare it without assigning a value to it? Write int a, for example, and then print out a.
c++ and ++c are subtly different. The point is that all expressions return a value when evaluated; one of these two expressions adds one and then returns the resulting value, the other does it the other way around. Which is which? Which is identical to c+=1?
What happens if you set int a=7 and int b=3 and output a/b? What about double(a)/double(b) and int(double(a)/double(b))?
What happens if you add two chars?
On many newer compilers numerical data types such as double can take the values inf, -inf and nan, short for not-a-number, to allow the program to deal gracefully with division by zero, the square root of minus one and so on. What are the properties of these quantities: what is inf+1, inf-inf and 2*inf, is -inf==(-1)*inf, is nan==nan? This exercise won't work on older compilers, where double a=0,b=1,infinity=b/a; will cause a runtime error rather than, what we want, setting infinity=inf.