This document discusses the manner in which the C++ programming language stores and manipulates data. For a basic introduction to the fundamentals of how computers do this in general, view the document entitled Data Types & Languages.
The C++ programming language provides a variety of data languages for storing numbers and characters.
In computer science, data languages are referred to as data types.
The most common primitive data types in C++ are
int (for storing whole numbers, also known as integers), float and double (both for storing numbers with decimal precision, also known as real or floating point numbers), and char (for storing symbols, known as characters).
Each of these data types has limitations as to the range of values and precision of the data it can hold.
Not all data types use the same amount of storage in the computer. For example, a double
typically uses more storage bits than an int (commonly 64 bits versus 32), while a float and an int are often the same size. Numbers with decimal precision
must hold both their significant digits and an exponent, so storing them with high precision requires more space.
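The storage used by each type on a particular system can be checked with the sizeof operator. The following is a minimal sketch; the function name printTypeSizes is only illustrative, and the numbers it prints depend on the compiler and computer being used.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical helper (the name is illustrative): prints the number of bytes
// each common primitive type occupies on the machine running the program.
void printTypeSizes()
{
    // sizeof(char) is always 1; the other sizes vary between systems.
    assert(sizeof(char) == 1);

    std::cout << "char:   " << sizeof(char)   << " byte(s)\n"
              << "int:    " << sizeof(int)    << " byte(s)\n"
              << "float:  " << sizeof(float)  << " byte(s)\n"
              << "double: " << sizeof(double) << " byte(s)\n";
}
```

Calling printTypeSizes() from main displays one line per type.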
A C++ programmer can specify the data type of literal data implicitly by simply typing the data value in the proper format. For example, the integer value five would be typed as simply "5" (without the quotes), whereas the floating point value five must be typed as "5.0" (without the quotes). Floating point literals are typed in C++ with a decimal point and at least one digit and also will accept a unary sign (such as - or +). It is advisable to write at least one digit on each side of the decimal point. It is more readable to type the value one-quarter as "0.25" rather than as ".25". Dollar signs, percent signs, commas, and other special characters are not permitted in numeric data.
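The effect of the literal's format can be seen in a short sketch such as the one below. The function name checkLiteralTypes is an assumption made for illustration; the checks rely on C++11 features (static_assert and decltype).

```cpp
#include <cassert>
#include <type_traits>

// The way a literal is typed determines its data type at compile time.
static_assert(std::is_same<decltype(5), int>::value, "5 is an integer literal");
static_assert(std::is_same<decltype(5.0), double>::value, "5.0 is a floating point literal");

// Hypothetical helper: the same division written with each kind of literal.
void checkLiteralTypes()
{
    assert(5 / 2 == 2);       // integer division discards the fraction
    assert(5.0 / 2 == 2.5);   // 2 is converted to 2.0 before dividing
}
```

The two divisions differ only in how the value five was typed, yet they produce different results.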
A C++ programmer must specify the size of each variable (changeable storage location), and consequently the range of values that it can hold, by declaring the variable. This is done by typing a statement that starts with a word (or words) indicating the data type, followed by the identifier(s) of the variable(s) being declared as having that data type. For example, to declare two storage locations named A and B to hold integer data, the C++ statement would be:
int A, B;
When declaring a numeric variable, a C++ programmer can be even more specific regarding its size (range of values and precision) by using the reserved words long, short, or unsigned. The qualifier signed is also a reserved word in C++, but it is rarely needed because short, int, and long data is signed by default. The exact quantity of bits used by each data type depends on the computer system being used; the language guarantees only minimum sizes. For example, short int data occupies at least 16 bits, which can represent 65,536 different values. Normally, half of these values are negative and the other half are zero and the positive values, allowing a range of -32,768 to 32,767. The C++ statement to declare a small integer storage location named N with the capacity to hold only a non-negative whole number in the range 0 to 65,535 is:
unsigned short N;
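One consequence of a limited range is that a value cannot grow past the top of it. The sketch below (the function names are illustrative assumptions) stores the largest possible value in an unsigned short and then adds one.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: adds 1 to the largest unsigned short value.
unsigned short incrementPastMaximum()
{
    // 65,535 on systems where short occupies 16 bits
    unsigned short N = std::numeric_limits<unsigned short>::max();

    // Converting the sum back to unsigned short keeps only the low-order
    // bits, so the stored value wraps around to 0.
    N = static_cast<unsigned short>(N + 1);
    return N;
}

void checkWrapAround()
{
    assert(incrementPastMaximum() == 0);
}
```

The wrap-around to zero is defined behavior for unsigned types; going past the range of a signed type is not, and should be avoided.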
The basic integer data type is int. By adding qualifiers, you can produce the following combinations:
| Declared Data Type | Size | Typical* Range of Values Possible | Stores Sign |
|---|---|---|---|
| short | 2 bytes | -32,768 to +32,767 | Yes |
| int | 4 bytes | -2,147,483,648 to +2,147,483,647 | Yes |
| long int | 4 bytes | -2,147,483,648 to +2,147,483,647 | Yes |
| unsigned short | 2 bytes | 0 to 65,535 | No |
| unsigned int | 4 bytes | 0 to 4,294,967,295 | No |
| unsigned long | 4 bytes | 0 to 4,294,967,295 | No |
* On older 16-bit computers, short and int typically occupy 16
bits, and long 32 bits. Newer 32-bit computers typically use 16 bits
for short, and 32 bits for both int and long.
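Because the sizes in the table are only typical, the actual values for a given system can be displayed with std::numeric_limits. This is a sketch, not a definitive listing: printRange and printIntegerRanges are hypothetical names, and on many 64-bit systems long occupies 8 bytes rather than 4.

```cpp
#include <cassert>
#include <iostream>
#include <limits>

// Hypothetical helper template: prints the size and range of integer type T.
template <typename T>
void printRange(const char* name)
{
    std::cout << name << ": " << sizeof(T) << " bytes, "
              << std::numeric_limits<T>::min() << " to "
              << std::numeric_limits<T>::max() << '\n';
}

// Prints one line for each combination listed in the table.
void printIntegerRanges()
{
    // The language guarantees this ordering of sizes, but not the exact sizes.
    assert(sizeof(short) <= sizeof(int) && sizeof(int) <= sizeof(long));

    printRange<short>("short");
    printRange<int>("int");
    printRange<long>("long int");
    printRange<unsigned short>("unsigned short");
    printRange<unsigned int>("unsigned int");
    printRange<unsigned long>("unsigned long");
}
```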
Integer literals such as the value five can be written in any of the following numbering bases:
Decimal (base 10), with no prefix, as in: 5
Octal (base 8), with the prefix 0, as in: 05
Hexadecimal (base 16), with the prefix 0x, as in: 0x5
An integer literal is normally of type int, if its value lies within the range of that data type; if not, it is taken to be of type long. You can force the literal to be of type long or unsigned by following the literal with any combination of the letters L, l, U, or u. To store the value ten as an unsigned long, write it as 10UL.
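The prefixes and suffixes can be tried out in a short sketch like the following. The function names are assumptions chosen for illustration, and the type checks rely on C++11.

```cpp
#include <cassert>
#include <type_traits>

// A suffix selects the literal's type at compile time.
static_assert(std::is_same<decltype(10), int>::value, "no suffix gives int");
static_assert(std::is_same<decltype(10L), long>::value, "L gives long");
static_assert(std::is_same<decltype(10U), unsigned int>::value, "U gives unsigned int");
static_assert(std::is_same<decltype(10UL), unsigned long>::value, "UL gives unsigned long");

// Hypothetical helper: true when the three spellings of five agree.
bool fiveIsSameInAllBases()
{
    int decimal = 5;    // base 10
    int octal   = 05;   // base 8, leading 0
    int hex     = 0x5;  // base 16, leading 0x
    return decimal == octal && octal == hex;
}

void checkBases()
{
    assert(fiveIsSameInAllBases());
    assert(010 == 8);    // a leading 0 changes the value: octal 10 is eight
    assert(0x10 == 16);  // hexadecimal 10 is sixteen
}
```

The second and third checks show why a leading zero should never be added to a decimal number for alignment: it silently changes the base.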
Floating point numbers are numbers that have to be stored with a decimal point and fractional digits, such as 8.2 or 0.75. There are three floating point data types: float, double, and long double. On most machines, float provides about 6 decimal digits of precision and double about 15. The type long double is rarely used because of its large size, but typically provides between 16 and 30 decimal places, which is far greater precision than would be required for most applications.
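The precision available on a particular machine can be inspected with std::numeric_limits. The sketch below uses a hypothetical function name, printFloatingPrecision, and its output varies between systems, so no expected output is shown.

```cpp
#include <cassert>
#include <iostream>
#include <limits>

// Hypothetical helper: reports the decimal precision of each floating point type.
void printFloatingPrecision()
{
    // digits10 is the number of decimal digits the type can hold reliably.
    std::cout << "float:       " << std::numeric_limits<float>::digits10 << " digits\n"
              << "double:      " << std::numeric_limits<double>::digits10 << " digits\n"
              << "long double: " << std::numeric_limits<long double>::digits10 << " digits\n";

    assert(std::numeric_limits<double>::digits10 >= std::numeric_limits<float>::digits10);

    // Print one third with 20 significant digits to see where each type runs
    // out of accurate digits, then restore the previous precision setting.
    std::streamsize oldPrecision = std::cout.precision(20);
    std::cout << "1/3 as float:  " << 1.0f / 3.0f << '\n'
              << "1/3 as double: " << 1.0 / 3.0 << '\n';
    std::cout.precision(oldPrecision);
}
```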
Inside the computer, floating point numbers are stored in a manner similar to scientific (exponential) notation, separating the value into two parts: a "mantissa" (which represents the significant digits of the number) and an "exponent" (which represents the power or "magnitude" of the number). Thus the value 123.456 would be represented as 1.23456e+002 (that is, 1.23456 times ten to the 2nd power, or 1.23456 times 100). The value could be written in any of the following forms: 123.456, 1.23456e+002, 12.3456e+001, 123.456e+000, etc. Whenever we use scientific notation to write a number in C++, the data type is interpreted as being floating point. Thus the value 1e3 will be interpreted as 1000.0 and the value 1e-2 will be interpreted as 0.01 or one hundredth (10 to the minus 2nd power). The number following the letter e indicates how many places, and in which direction, to move the decimal point to get the true value. Positive numbers move the decimal point to the right; negative numbers move it to the left.
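A program can also display a value in scientific notation by using the std::scientific manipulator. In this sketch, toScientific and checkExponentMeaning are hypothetical names; the number of exponent digits displayed (for example e+02 or e+002) depends on the compiler's library.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats a value in scientific notation.
std::string toScientific(double value)
{
    std::ostringstream out;
    out << std::scientific << value;  // default: 6 digits after the decimal point
    return out.str();
}

void checkExponentMeaning()
{
    assert(1e3 == 1000.0);    // exponent 3: move the point three places right
    assert(1.5e2 == 150.0);   // exponent 2: move the point two places right
    assert(25e-2 == 0.25);    // exponent -2: move the point two places left
}
```

For the value 123.456, toScientific returns a string that begins with 1.234560e+ followed by the exponent digits.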
Floating point literals must contain either a decimal point or an exponent (in the case of scientific notation). Thus 3E3 and 3000.0 represent the same value. The exponent is specified with the letter E (or e) and may optionally contain a sign (+ or -). By default, floating constants are of type double. To specify otherwise, add F (or f) to indicate type float, or add L (or l) to indicate long double to the end of the literal, e.g. 3.1E2L.
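The effect of the F and L suffixes can be confirmed at compile time. The following is a small sketch that assumes a C++11 compiler; checkFloatingLiterals is an illustrative name.

```cpp
#include <cassert>
#include <type_traits>

// The suffix alone decides which floating point type the literal has.
static_assert(std::is_same<decltype(3.1E2), double>::value, "no suffix gives double");
static_assert(std::is_same<decltype(3.1E2F), float>::value, "F gives float");
static_assert(std::is_same<decltype(3.1E2L), long double>::value, "L gives long double");

// Hypothetical helper: different spellings of the same values.
void checkFloatingLiterals()
{
    assert(3E3 == 3000.0);      // an exponent without a decimal point
    assert(3.1E2L == 310.0L);   // a long double literal
}
```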
The representation of characters in C++ is machine-dependent, although most computers today use the ASCII code. For more information on the ASCII code, see the web page at:
[http://www.neurophys.wisc.edu/comp/docs/ascii/]
Some compilers treat the data type named char as unsigned, with values normally ranging between 0 and 255. Other compilers treat char as signed, with values between -128 and 127. Character literals are specified using apostrophes (also called single quotes), as in 'a' or '#'. In C++, characters are treated as small integers, most often stored in one byte. For example, in the ASCII code, 'A' has the value 65, and 'a' has the value 97. Characters can appear anywhere that integers can; C++ simply uses the character's integer value. Thus, characters can be added, subtracted, compared, incremented, assigned, and so on. Arithmetic expressions such as 'a' * '#' make little sense, but are valid. We can also use a character's integer value to compare it to another character. For example, on a system that uses the ASCII code, the relational expressions ('A' == 65) and ('B' == 66) are both true.
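Treating characters as small integers makes simple character arithmetic possible. The sketch below assumes the ASCII code throughout, and toUpperAscii is a hypothetical function written for illustration (the standard library already provides std::toupper for real programs).

```cpp
#include <cassert>

// Hypothetical helper: converts a lowercase letter to uppercase using
// character arithmetic. Assumes ASCII, where the letters are contiguous.
char toUpperAscii(char c)
{
    if (c >= 'a' && c <= 'z')
        return static_cast<char>(c - ('a' - 'A'));  // 'a' - 'A' is 32 in ASCII
    return c;  // anything else is returned unchanged
}

void checkCharacterArithmetic()
{
    assert('A' == 65);        // true only under the ASCII assumption
    assert('a' == 97);
    assert('B' - 'A' == 1);   // characters can be subtracted

    char next = 'A';
    ++next;                   // characters can be incremented
    assert(next == 'B');

    assert(toUpperAscii('q') == 'Q');
}
```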
Not all characters are symbols. Output actions such as [Tab], [Backspace], and [Enter] (carriage return) are also considered to be characters and are part of the ASCII data format. Such characters are called "control codes" and are produced using escape sequences such as '\n'. Each escape sequence produces a single character (which is why they are typically written enclosed in apostrophes rather than quotes). In C++ source code, some characters are interpreted by the compiler as having a special meaning. Examples of this are the quote mark ("), which indicates the beginning or end of a string, and the backwards slash (\) (spoken aloud by some using the word "whack"), which indicates the start of an escape sequence. When you want to display such characters literally, you must use the escape sequence that represents that character instead of typing the character itself; otherwise the compiler will interpret the character as having special significance. For example, to display a quote mark in a message, you must produce the quote mark as the escape sequence \" within a quoted message, as in:
cout << "His name is \"Mark\".";
The statement above will display the string: His name is "Mark".
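That each escape sequence produces exactly one character can be verified by counting the characters in a string. This is a minimal sketch with illustrative function names.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the message from the cout statement above.
std::string quotedMessage()
{
    return "His name is \"Mark\".";
}

void checkEscapes()
{
    const std::string message = quotedMessage();

    assert(message.size() == 19);   // each \" counts as a single character
    assert(message[12] == '"');     // the stored character is the quote mark itself

    assert(std::string("a\tb").size() == 3);       // \t is one tab character
    assert(std::string("C:\\temp").size() == 7);   // \\ is one backslash
}
```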
For information about how to output special characters, read the class web page about Displaying Special Characters in C++.
Escape sequences also can be written as "numeric escapes" in which you express any character by its numerical ASCII code by writing a backslash character (\) followed by the ASCII code expressed as either an octal (base-8) or hexadecimal (base-16) number. Octal digits must immediately follow the backslash (for example \23 or \40), and with hexadecimal digits an x character must be written before the digits themselves (for example \x20 or \x4A).
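The numeric escapes mentioned above can be compared against ordinary character literals. The sketch below assumes the ASCII code; checkNumericEscapes is an illustrative name.

```cpp
#include <cassert>

// Hypothetical helper: compares numeric escapes with the characters they name.
void checkNumericEscapes()
{
    assert('\x20' == ' ');   // hexadecimal 20 is decimal 32, the space character
    assert('\40' == ' ');    // octal 40 is also decimal 32
    assert('\x4A' == 'J');   // hexadecimal 4A is decimal 74
    assert('\101' == 'A');   // octal 101 is decimal 65
    assert('\23' == 19);     // octal 23 is decimal 19, a control code
}
```

Inside a quoted string, a hexadecimal escape continues for as long as hexadecimal digits follow it, so care is needed when the next character of the message is a digit or a letter from a to f.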
Note: C++ provides special input methods for reading character data that are discussed in Chapter 3.