# Vectors II, typedef, C-style arrays and C-strings, character-level I/O, representation of numbers

In C++ a string behaves in many ways like a vector of char elements. char is a datatype; a char is a single character. For example:

string s = "eyrie";
s[1] = s[0];      // s is now "eerie"

string t = "cat";
char ch = t[0];   // ch is the letter c; note that ch is of type char, not type string
t[0] = t[1];
t[1] = ch;        // t is now "act"

String literals are enclosed in double quotes - "Z" is a string of length one - whereas char literals are enclosed in single quotes - 'Z' is a single character value.

You can also define strings using the same sort of constructor function that you use for vectors. For example:

string s(20, '*'); // creates a string of length 20, initialised to asterisks; note that the second argument is a char, not a string

The library cctype contains a number of useful predicate functions for returning information about char data, for example:

• isupper(ch) returns true if ch is upper case
• islower(ch) returns true if ch is lower case
• isdigit(ch) returns true if ch is a digit
• isalpha(ch) returns true if ch is a letter

C++ strings are different from C strings. C strings are arrays of char, not vectors (more about C strings below). If you need to pass a C++ string to something that expects a C string, you would use s.c_str() (where s is a string), which gives you a null-terminated array of char holding the same characters.

You can use the at function with strings, as you can with vectors, as an alternative to square brackets so as to get the benefit of array bounds checking. It can be used as follows:

string s = "hello, world!";
cout << s.at(10) << endl; // prints the letter l
s.at(10) =  'e';          // changes l to e
cout << s.at(10) << endl; // prints the letter e
cout << s.at(s.length()) << endl;   // out of range: throws an exception, which aborts the program if it is not caught

However, a string is not exactly the same as a vector<char>. For example, before C++11 a string had push_back but not pop_back (C++11 added pop_back to string).

## Two-dimensional vectors

A two-dimensional vector in C++ is just a vector of vectors. For example, you could define a two-dimensional vector of integers as follows:

vector<vector<int> >   v2;


Note the space between <int> and the second >. If you have them next to each other, as in >>, older (pre-C++11) compilers interpret it as the >> operator and flag it as an error.

This definition gives you an empty two-dimensional vector. If you wanted to grow it with push_back, you would have to push back a one-dimensional vector, not a single int. For example:

vector<int> v(5);
v2.push_back(v);


To initialise a two-dimensional vector to be of a certain size, one way is to initialise a one-dimensional vector first and then use this to initialise the two-dimensional one:

vector<int> v(5);
vector<vector<int> >   v2(8,v);

You can picture v2 as a two-dimensional vector consisting of eight rows with five integers in each row.

You refer to individual elements of a two-dimensional vector by using two subscripts; the following sets the third element of the fifth row of v2 to 99:

v2[4][2] = 99;  // remember that the first element of the first row is v2[0][0]

The following procedure would display a two-dimensional vector of int:

void display(const vector<vector<int> >& vy)
{   for (int i = 0; i < vy.size(); i++)           // loops through each row of vy
    {   for (int j = 0; j < vy[i].size(); j++)    // loops through each element of each row
            cout << vy[i][j] << " ";              // prints the jth element of the ith row
        cout << endl;
    }
}


## Defining datatypes with typedef

You can give your own names to datatypes using the typedef keyword. The following code creates a datatype called Words, which is a vector of string.

typedef vector<string> Words;  //note the capitalisation: this is a convention

This looks just like a definition of a variable called Words except that it has the word typedef on the front. This means that we have not defined a variable - we do not now have a vector called Words. We have defined a datatype. As well as being able to define variables of type int or string or vector and so on, we can now also define variables of type Words.

Once you have defined a datatype in this way you can use it in the same way as any other datatype to declare variables of that type. For example:

 Words w; // creates a vector of strings, called w

Defining new types can help to simplify the notation of a multidimensional vector. Let's use a typedef to create a 2-D vector of integers:

typedef vector<int> Vint;   // creates the datatype Vint
Vint v(5);                  // creates a vector v of five integers, all zero
vector<Vint> vx(4, v);      // creates a vector of 4 Vints called vx

vx[1][2] = 99;   // vx[1] is a vector, so we can subscript it
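
The contents of vx are now as follows (each row is one Vint; the column headings are the subscripts within it):

|       | [0] | [1] | [2] | [3] | [4] |
|-------|-----|-----|-----|-----|-----|
| vx[0] | 0   | 0   | 0   | 0   | 0   |
| vx[1] | 0   | 0   | 99  | 0   | 0   |
| vx[2] | 0   | 0   | 0   | 0   | 0   |
| vx[3] | 0   | 0   | 0   | 0   | 0   |
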
vector<Vint> vy(10);          // creates a vector vy of 10 Vint vectors, all empty
Vint v(5);
for (int i = 0; i < 10; i++)
    vy[i] = v;                // makes each vy[i] a vector of 5 zeroes - note that you can use assignment with vectors
vy.push_back(v);              // You can push a Vint onto the end of vy (or pop one off)
vy[3].push_back(66);          // You can push an int onto the end of one of the component Vints
vy[4].pop_back();             // (or pop one off) - the component Vints do not all have to be the same length


### Generic algorithms

Consider a vector of integers v that contains the following values

 5 2 9 6 3 4

If we #include the <algorithm> library, we can use the generic algorithm sort() as follows:

sort(v.begin(), v.end());

which sorts the vector v in ascending order from start to finish. begin() and end() are iterator values corresponding to the position of the first element of the vector and a position immediately after the last element. The results of this procedure are thus:

 2 3 4 5 6 9

Similarly we can use the algorithm reverse() as follows to reverse the order of elements in v:

reverse(v.begin(), v.end());

which gives:

 9 6 5 4 3 2

There are a number of these generic algorithms - see the appendix to Lippman and Lajoie. They are called generic because they operate on a range of container types, including vectors and strings. We will not say much about them in this course since the course is an introduction to programming in general, using C++, rather than a course specifically about C++. If you were going to be a serious C++ programmer, you would need to get acquainted with the generic algorithms - there are about 70 of them - and you would make extensive use of them, but I don't expect you to do that on this course.

## Arrays

C-style arrays are less flexible than C++ vectors, but they are still occasionally useful, as we shall see. An array is declared as follows:

int a[6];

This creates an array of six integers, all uninitialised (like ordinary int variables).

• Arrays are not objects in the C++ sense, so they do not have member functions. If a is an array, you cannot have things like a.size() or a.push_back(99)
• The size of an array has to be known to the compiler and cannot be set or changed at runtime.
• When you pass an array to a function, you need also to pass its length since there is no way of finding out, once inside the function, how long it is.
• Arrays are always passed to functions by reference (the function receives the address of the first element, not a copy of the array) - there is no need for an &.

Note one consequence of the second point - the size must be known at compile time. People are often tempted to write something like this:

void proc(int len)
{   int a[len];     // Can't be done!
    ...
}

The array is local to the procedure and the programmer is trying to set the length of the array with a parameter. But this value is only passed to the procedure at run time. The compiler cannot set aside the right amount of storage for this array at compile time. Standard C++ does not allow this, although some compilers accept it as a non-standard extension.

When we passed vectors as parameters, the procedures or functions that used them typically contained lines like this:

int func(const vector<int>& v)
{   for (int i = 0; i < v.size(); i++)
    ...
}

But we can't do that with an array since there is no a.size() function. So, if we want to pass an array as a parameter, we also have to pass the length as a separate parameter, like this:

int func(int a[], int alen)

We have to pass the array length, as there is no other way for the function to know how long it is. (You can put a number inside the square brackets if you like, for example int func(int a[6], int alen), but the compiler ignores it.)

Arrays are always passed by reference - you don't include an &. This means that a procedure or function that takes an array as a parameter can potentially make changes to the array. Sometimes you don't want this to happen and you can prevent it by adding a const to the parameter, like this:

int func(const int a[], int alen)

Now the function should treat a as though it were composed of const int. (At the least you should get a warning from the compiler if the function contains code that might change a value in a.)

### Initialising arrays

One advantage of arrays over vectors (at least before C++11, which allows the same curly-bracket syntax for vectors) is that you can initialise them with a set of different values as follows:

int a[6] = {2, 5, -9, 14};

which produces an array of 6 elements as follows:

 2 5 -9 14 0 0

The last two elements are initialised to zero as a by-product of the initialisation. (If you explicitly initialise any elements, even just the first, the rest get initialised to zero.) The size in square brackets is optional, and if omitted, the array is populated only with the elements specified in curly brackets. In this example, if you left out the 6, you'd get an array of 4 elements.

This feature of arrays provides a back-door route to initialising vectors. We can specify the elements of the array to be used to populate a vector as follows:

int a[6] = {2, 5, -9, 14};
vector<int> v(a, a + 6);

This code will populate the vector with the first 6 elements of the array. Note that the second argument of the initialisation (a + 6) corresponds to the position just after the last element to be copied, here the position after the end of the array. You do not have to copy the whole array; for example the following code:

int a[] = {1, 5, 10, 50, 100, 500, 1000};
vector<int> v(a, a + 5);

populates the vector with the values 1, 5, 10, 50 and 100.

### C-strings

Just as C++ strings are like vectors of char, C-strings are arrays of char. For example:

char cs[4] = "C++"; // four characters to allow for the null byte at the end

C-strings always occupy one byte more than their apparent length because a C-string always ends with a null byte. (As a literal, a null byte is written as '\0'.) All the functions in C that manipulate strings rely on the presence of this null byte.

If you are providing a string literal as initialization, you need not specify how long the array is to be - you can leave it to the compiler to work it out from the literal:

char csx[] = "mary smith";

The compiler determines how long the array has to be to hold the string and the null byte.

C-strings, like other arrays, are always passed by reference. However, we do not need also to pass the length in the case of C-strings because we can find where it ends by looking for the null byte. Consider the following function which receives an array containing a C-string and converts all the spaces it contains into asterisks:

void stars(char cs[])
{   for (int i = 0; cs[i] != '\0'; i++)
        if (cs[i] == ' ') cs[i] = '*';
}

### Converting strings to C-strings

C-strings are not equivalent to C++ strings. For example:

char ch[] = "E"; // an array of two bytes, an E and a \0
string s = "E";  // a vector of character data, containing an E

With some C++ compilers (those that predate C++11), certain library functions cannot handle C++ strings as arguments. In particular the open() function for an ifstream on many compilers takes a C-string as its argument (it can also take a string literal, but that is because a string literal is a C-string). If, for example, you ask a user for the name of a file to be opened and store the filename as a string, you must convert it before passing it to the open() function as follows:

string filename;
cin >> filename;
infile.open(filename.c_str());

Similarly, the function atoi() (from the cstdlib library) only takes C-strings. This function is used to convert a string (holding the character representation of an integer) into an integer. For example:

string s = "1234";
int n = atoi(s.c_str());
cout << "string " << s << " is integer " << n << endl;

Or we can use an istringstream (from the sstream library) for the same purpose:

string s = "1234";
int n;
istringstream iss(s);
iss >> n;


## Input and output of characters

### get() and put()

If we want to analyse files on a character-by-character basis, we use the input_stream.get(char) function, to read the next character from the specified input stream. Similarly you place char data into the output stream using the output_stream.put(char) function. The following code reads in character data from the input stream and places a copy of it in the output stream:

int main()
{   char ch;
    while (cin.get(ch))
        cout.put(ch);
}

### ignore()

If for any reason you need to ignore a character in the input stream, for example because you know it will send the stream into a fail state, you can use the input_stream.ignore() function. If you need to ignore several characters, you can pass arguments to ignore() to specify a character to act as a delimiter (ignore all characters up to and including the delimiter) and the maximum number of characters to ignore. For example:

// assume cin contains abcde$fgh
cin.ignore(10, '$');
string s;
cin >> s;  // s is now "fgh": characters up to and including $ ignored

// again assume cin contains abcde$fgh
cin.ignore(4, '$');
string s;
cin >> s;  // s is now "e$fgh": maximum of 4 characters ignored

You know that you can effectively skip the rest of a line with getline(instream, junk); you could also do it with instream.ignore(INT_MAX, '\n'); (INT_MAX comes from the climits library, described below).

### peek()

input_stream.peek() returns the next character in the buffer without getting it, so you can have a look at the next character that's coming up before you have committed yourself to reading it. This could be useful if you wanted to avoid sending a stream into a failed state, e.g. by reading char data into an int variable.

### unget()

input_stream.unget() takes the last character you obtained (with a get()) and puts it back into the input stream, so that it will be the next character to be read. You can only rely on being able to unget() one character.

## Problems with number representation

Here are a couple of issues you should be aware of when working with numerical data:

### Largest and smallest integers

The largest integer that can be represented as an int varies from system to system. If you try to go beyond the limits you will get incorrect results, and the compiler won't check the limit for you. To get around this, include the <climits> library, which gives you access to the INT_MIN and INT_MAX constants of your compiler, so you can check that you are not going to go beyond the legitimate bounds.

If you have an int, say bignum, containing some large integer, and another called smallnum containing a small one, and you want to add smallnum to bignum but are afraid that you might exceed INT_MAX, do not try to do the following:

if (bignum + smallnum > INT_MAX)    // Can never be true!
    cerr << "Integer overflow" << endl;
else
    bignum += smallnum;

If the total of bignum and smallnum really does exceed INT_MAX, the machine will not be able to represent that number as an int. The value it would get by adding bignum to smallnum would not be an int larger than INT_MAX (it can't be), but in all likelihood a large negative number - refer to your Computer Architecture notes on two's complement. (Strictly, the C++ standard leaves the result of signed integer overflow undefined, so you cannot rely on any particular value.)

What you have to do is establish first whether you have enough room to make the addition (this version assumes that bignum is not negative):

if (INT_MAX - bignum >= smallnum)
    bignum += smallnum;
else
    cerr << "Integer overflow" << endl;


### Round off errors

Are computers good at arithmetic? The following code seems to suggest not:

#include <iostream>
using namespace std;

int main( )
{   double d = 7;
    for (int i = 0; i < 10; i++)
        d -= 0.7;
    if (d == 0.0) cout << "Yes";
    else cout << "No";
}

The representation of doubles is necessarily approximate. Consider the decimal representation of one third - 0.3333333.... No matter how many 3's you stick on the end, you haven't quite got it exactly. Computer representation of doubles is subject to the same sort of problem. In this example, after subtracting 0.7 from 7.0 ten times, we ought to be at 0.0, but, because of these approximations, a tiny amount of error creeps in so that the final value of d is not exactly zero.

In computations involving huge numbers of calculations, typical of many applications in science and engineering, these cumulative errors can render the final result wildly inaccurate and so completely useless. Consequently great effort is devoted in such applications to devising algorithms that minimise these errors, and the results are not presented as being completely accurate, but rather as accurate to within certain limits.

The important thing to remember is that a test for absolute equality with a double is dangerous.

If you wanted to fix the example program so that it output "Yes" as it ought to, you would decide on what level of accuracy you were prepared to accept, and write a near_enough function that took two doubles and returned true if they were within your accepted distance of each other. For example:

bool near_enough(double x, double y)
{   const double ACCEPT = 0.000001;
    return x <= y ? x >= y - ACCEPT : y >= x - ACCEPT;
}


posted on 2006-03-04 19:28  cy163