Array Declaration and Manipulation


Imagine that you have been asked to write a program to print a sales report similar to the one below for a maximum of ten employees. (Note that the example contains only four employees, but could contain more.)

Emp. ID      $ Sales   % Sales 
12345        3000.00        30 
24681        1000.00        10 
35791        4000.00        40 
48484        2000.00        20 
Total:      10000.00       100 

Initially, you might think that this task could be accomplished using a simple sentinel-controlled loop that re-uses just a few variables, similar to those shown in the web page about Accumulation Using a Sentinel Loop. The difficulty in this problem lies in calculating the percentages in the right-most column. It is impossible to calculate any employee's percentage of sales until all of the employee sales data has been entered and totaled, so no employee's row can be output until all of the data is in. That, in turn, requires all of the data to be retained throughout the program, which rules out re-using just a few storage locations in a loop.

The challenge here is to find a way to manipulate multiple storage locations (such as the four dollar sales amounts in the example above) without having to define and manipulate many separate variable labels. In the example above, we could define four separate variables (such as S1, S2, S3, and S4) for the sales amounts and then total them using the expression S1+S2+S3+S4. But imagine the difficulty if we had 100 employees!

The solution is to use a data structure known as an array. An array is a type of storage in which one identifier relates to a group (or set) of storage locations, all of the same data type. For example, if you wanted to store the four dollar sales amounts above, you could create a storage array named S that you might visualize or draw as a horizontal table like this:

S
  3000.00   1000.00   4000.00   2000.00

Each individual storage location within an array is called an element. Within analysis documentation, each individual element within an array is referenced using a subscript, which is a small numeral that follows the identifier, such as in S₂. The C++ language numbers the elements within its arrays starting with zero. Thus, the first element in the array (shown above) that contains the value 3000.00 would be referred to as S₀. The last element in the array above, containing the value 2000.00, would be referred to as S₃. In C++ source code, a subscript is written as a numeral that follows the identifier enclosed within brackets, such as S[3].

Many programmers visualize and draw arrays as vertical tables, with the first (lowest numbered) element at the bottom and the last (highest numbered) element at the top. In the example above, such programmers would refer to the element containing the 2000.00 value as the "top of the array".

 
                      S
Top Element:    3     2000.00
                2     4000.00
                1     1000.00
Bottom Element: 0     3000.00

Because arrays use subscripts, they are often called subscripted variables, which raises the question: Is there a term for ordinary storage where there is only one storage location for each identifier? There is. Ordinary variables are called scalars.

DECLARATION:

In C++, arrays can be declared in two different ways, depending on whether you know the contents of each element in advance or not.

Option 1 - Array Declaration without Initialization

If you do not know the contents of the array in advance of the program's execution, then you would declare the array in the following manner:

     datatype label[size];

The statement used to declare an array is written in a manner similar to other variable declarations. The data type is written first, followed by the variable label, and finally an integer constant (or symbolic constant) in brackets indicating the size (or quantity of elements to be allocated). All elements must be the same data type. The identifiers used to label an array must conform to the same rules as any other identifier in C++ and cannot duplicate a name already in use by a scalar in the same scope. Thus, the sales array pictured in the illustrations above could be declared (but not initialized) with the statement:

     double S[4];

This would declare four double storage locations identified as: S[0], S[1], S[2], and S[3]. Another array for holding the four percentage sales values could be declared with the statement:

     double P[4];

Because both of the arrays mentioned so far are of the same data type (double), they could (optionally) both be declared with the single statement:

     double S[4], P[4];

Then an integer array for holding the four employee IDs could be declared with the statement:

     int EID[4];

Many textbooks show an approach to declaring arrays that is more detailed and useful in situations where we are declaring many arrays that share some common properties. For example, all of the arrays in the example above have four elements. For this reason, a programmer might take a more step-by-step approach when declaring the three arrays (EID, S, and P), using separate steps to define the things that these arrays have in common. Consider the following declaration:

     #define MAX 4  /* Define the maximum quantity of elements expected */

     int    EID[MAX];  /* Array of MAX employee identification numbers */
     double S[MAX], P[MAX];  /* Arrays for $ Sales and % Sales of MAX elements */

First, the symbolic constant MAX is defined so that the size of the arrays is easy to change later. Then MAX is used to declare the arrays. It can also be used throughout the program in loop tests to keep us from trying to access elements that are outside the bounds of the array.
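To illustrate that last point, here is a minimal sketch of how MAX might appear both in a loop test and in a check on a subscript. The function names fillSales and getSale are hypothetical, invented only for this illustration, and using assert to halt the program on a bad subscript is just one possible approach. Loops of this kind are discussed further in the MANIPULATING ARRAYS section below.

```cpp
#include <cassert>

#define MAX 4  /* Define the maximum quantity of elements expected */

// Hypothetical helper: store VALUE in every element of S.
// S is assumed to have exactly MAX elements.
void fillSales(double S[], double VALUE)
{
    // The test COUNTER < MAX stops the loop after the last valid
    // subscript, which is MAX - 1.
    for (int COUNTER = 0; COUNTER < MAX; COUNTER++)
        S[COUNTER] = VALUE;
}

// Hypothetical helper: return element INDEX of S, halting the program
// (in a debug build) if INDEX is outside the range 0 through MAX - 1.
double getSale(const double S[], int INDEX)
{
    assert(INDEX >= 0 && INDEX < MAX);
    return S[INDEX];
}
```

Because both functions refer to MAX rather than to the literal 4, changing the #define line is the only edit needed if the arrays grow.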

Option 2 - Array Declaration with Initialization

If you know the contents of the array in advance of the program's execution, then you would declare the array in the following manner:

     datatype  label[4] = { value0, value1, value2, value3 };
or
     datatype  label[ ] = { value0, value1, value2, value3 };

The data type is written first, followed by the variable label, followed by the desired size in brackets, then an equal sign, and finally a list of initial values for the elements (from bottom to top subscript) inside of braces {}. If the brackets are left empty, the size is inferred from the quantity of items in the initialization list. The quantity of items can be less than the size (any remaining elements are set to zero), but never greater. All elements must be the same data type. The identifiers used to label an array must conform to the same rules as any other identifier in C++ and cannot duplicate a name already in use by a scalar in the same scope. The sales array mentioned above would be declared with the statement:

     double  S[ ] = { 3000.0, 1000.0, 4000.0, 2000.0 };
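The following sketch places the initialization forms side by side, using the values from the sample report. The constant name EID_SIZE is hypothetical, and the sizeof calculation is shown only as one common way to recover an inferred size; it is not something the declarations themselves require.

```cpp
// Size given explicitly, with all four elements listed.
double S[4] = { 3000.0, 1000.0, 4000.0, 2000.0 };

// Size left out: the compiler counts the four initial values.
int EID[] = { 12345, 24681, 35791, 48484 };

// Fewer values than elements: P[0] holds 30.0 and the
// remaining three elements are set to 0.0.
double P[4] = { 30.0 };

// Quantity of elements in EID: total bytes divided by bytes per element.
// (EID_SIZE is a hypothetical name used only in this sketch.)
const int EID_SIZE = sizeof(EID) / sizeof(EID[0]);
```

Listing more values than the declared size, such as five values for S[4], would be rejected by the compiler.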

MANIPULATING ARRAYS:

The major advantage of using an array to store data is that we can easily refer to all of the elements of an array by using only a single identifier (to refer to the array of elements) and then using a variable such as a loop counter as the array's subscript. For example, after declaring the dollar sales array from the example above, we could initialize all of its elements to hold the value 0.0 with the loop statement:

     for (COUNTER=0; COUNTER<MAX; COUNTER++) S[COUNTER] = 0.0;

Note the use of an integer variable (COUNTER) to act as a subscript to each element of the array during each pass of the loop. If we had used scalars (such as S0 and S1) to store the four sales amounts, it would require the following multiple statements to initialize them:

     S0=0.0; S1=0.0; S2=0.0; S3=0.0;

Not so bad - unless you decide to change MAX to one hundred. Now the advantages of using arrays should be obvious. In fact, any process that we want to apply to all of the elements within an array now becomes very easy to do by simply placing that process within a counting loop and using the counter as the subscript to point at each element in the array. For example, to load the dollar sales array with values from the keyboard, we could write a loop like this:

     for (COUNTER=0; COUNTER<MAX; COUNTER++)
     {
          cout << "Dollar sales? ";
          cin >> S[COUNTER];
     }

To total (accumulate) all of the elements in the dollar sales array in a variable named TOT (which must already hold 0.0 before the loop begins), we could write a one-statement loop like this:

     for (COUNTER=0; COUNTER<MAX; COUNTER++) TOT = TOT + S[COUNTER];

To display the entire dollar sales array in a column, we could write a one-statement loop like this:

     for (COUNTER=0; COUNTER<MAX; COUNTER++) cout << S[COUNTER] << endl;

So you see, arrays are very useful when we want to manipulate large amounts of data using a single identifier, provided that we know how to write a loop that can use its counter as the array's subscript.
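Returning to the sales report that opened this page, the percentage column can now be handled with two passes over the arrays: one to total the sales, and a second to divide each amount by that total. The sketch below shows one way this might look. The function names totalSales and percentSales are hypothetical, and treating a zero total as 0 percent is an assumption made to avoid dividing by zero.

```cpp
#define MAX 4  /* Define the maximum quantity of elements expected */

// Hypothetical helper: return the total of the MAX elements of S.
double totalSales(const double S[])
{
    double TOT = 0.0;  // the accumulator must start at zero
    for (int COUNTER = 0; COUNTER < MAX; COUNTER++)
        TOT = TOT + S[COUNTER];
    return TOT;
}

// Hypothetical helper: fill P with each element of S expressed as a
// percentage of the total. If the total is zero, every percentage is
// set to 0.0 (an assumption made to avoid dividing by zero).
void percentSales(const double S[], double P[])
{
    double TOT = totalSales(S);
    for (int COUNTER = 0; COUNTER < MAX; COUNTER++)
    {
        if (TOT != 0.0)
            P[COUNTER] = S[COUNTER] * 100.0 / TOT;
        else
            P[COUNTER] = 0.0;
    }
}
```

The second loop cannot begin until the first has finished, which is exactly why every sales amount has to stay in storage instead of being overwritten on each pass.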


For a full analysis and coding example of the program described on this page, view the web page entitled Array Analysis and Programming.
