Computer programming starts with an analysis of the task to be performed. The first step is to clearly define (and document) the problem or task. A computer analyst must determine the objectives of the program and then record them in a clearly written "problem statement". This statement is a formal document that is placed at the front of the analysis documentation. It must mention all important goals of the program, including the output to be produced and the items of input that the program must obtain, along with their units of measure. It should not explain how the program should work, but rather what the program should do. It should be sufficiently thorough that all other parts of the analysis could be performed based on its content. It must be clear and unambiguous. For example, if it mentions output, it should indicate which device (screen, printer, disk, etc.), any units of measure (such as feet and inches or dollars and cents), and the precision or format (such as "rounded to two decimal places"). The problem statement should be written from the point of view of the computer rather than the programmer. In other words, don't write about your task (writing a program); instead write about the computer's task (doing whatever the program should accomplish).
The analyst must recognize the "scope" (boundaries) of the problem. If it is very large and complex, it should be subdivided into smaller, simpler modules. This approach of dividing big problems into smaller ones is called "top-down design". When this is done, each smaller task is treated as a separate programming problem and is analyzed and documented independently. The relationship between the separate modules is illustrated in a diagram called a "structure diagram" (also referred to by some as a "hierarchy chart"). A simple program does not require top-down design, so no such diagram is included in its documentation.
Once the number of modules has been determined, a separate problem statement should be written for each module, describing its objective. An analyst often must clarify and refine the initial requirements described by clients or prospective program users to make them more concise. In this programming course, for each programming assignment, students will be presented with an initial "situation" that describes a reason for developing a program. From it the student will be required to perform an analysis of the task to be accomplished by the program and produce various items of documentation that clearly illustrate the decisions made by the student during the problem solving process. The remainder of this document provides an example of the processes performed by a programmer while developing the documentation for a program analysis. The example is simple, so it will not include top-down design, nor require a structure diagram.
You need to calculate a sales tax for a purchase amount and then add them to determine a total payment. This situation might seem clear initially. But it must be analyzed and refined as described above to produce a concise problem statement, written from the computer's point of view, such as:
Problem Statement
Request and store a dollar purchase amount (a number which has a decimal portion). Calculate and store the sales tax on that amount (based on the local sales tax rate). Calculate and store the sum of the resulting tax amount added to the purchase amount. Display the results rounded to two decimal places with identifying text. The program will evaluate only one purchase each time it is run. It should also display a title, program credits, and an introduction.
A student would be asked to analyze this problem in preparation for writing a computer program to accomplish the objective stated above. The analysis should be documented clearly so that any programmer could write the program's code later. Although program code is often developed by different people than those who perform the analysis, in this example the development of the program code (instructions written in a computer programming language) will be included too.
Assignment instructions specify the items of documentation that should be submitted. These are typically items such as:
The items above are listed in the order that they should be organized, reflecting the problem solving steps defined on the Computer Programming Overview.
Note that an analyst should always label each item of documentation (for example: "Problem Statement:") to ensure that the reader recognizes its significance. Some authors choose to start a new page for each item of documentation, but this is not necessary as long as each item is clearly labeled. Analysts must also be careful to date and number each page of documentation and to indicate which program it relates to. Each page should contain at least: your name, the program's name, the date, and the page number.
After defining the task (as explained above), the next step of an analysis is to define the data that the program will handle. This activity produces a collection of documents that serve as examples of any program output and lists that describe in detail each item of data that the program will store.
Because the program in this example must produce output on a screen, the analyst must create an example of how the screen should look when the program runs. This document is referred to as a "Sample Softcopy". It describes the program's visual objectives. Some programs have very specific requirements about the content and format of the output. Other programs have few specific requirements, which leaves the analyst free to select whatever wording and format seems most appropriate to the task. The important thing here is to be careful to conform to any requirements defined in the problem statement and to be clear.
When composing a sample softcopy, an analyst can promote readability by including notations that explain anything a novice reader might find helpful (such as distinctions between input and output in the example). It is also helpful to include line numbers to the left of the example in case the analyst wants to refer to those lines in other analysis documents. Any notations such as these should be explained ahead of their use, as shown in the example below.
| LINE | SAMPLE SOFTCOPY |
|---|---|
| 1 | SALES TAX CALCULATOR<CR> |
| 2 | <CR> |
| 3 | Written by (your name here) - 01 January 2011<CR> |
| 4 | <CR> |
| 5 | This program will request a purchase amount and then calculate<CR> |
| 6 | the sales tax based on a 6.5% tax rate and display the result<CR> |
| 7 | rounded to 2 decimal places.<CR> |
| 8 | <CR> |
| 9 | Enter the purchase amount: $[100.10]<CR> |
| 10 | <CR> |
| 11 | The sales tax for that amount is: $6.51<CR> |
| 12 | The total payment with tax is: $106.61<CR> |
Some programs involve the use of an item of data which will not change each time the program runs. In this example, the tax rate of 6.5% is such a value. Because it will not change with each run of the program, it is referred to as a "constant". Normally, constants require no special treatment within documentation. You simply write them as needed. In this case, it is important to note that the value 6.5% is actually 0.065. The '%' symbol used by humans to represent percentage indicates that the digits ahead of it should be divided by 100 to obtain the true value. Computers do not do this automatically; so it is important for analysts to pay attention to this when noting values.
Sometimes, programs involve constant values which might change someday, but normally will not change with each run of the program. In cases such as this, programmers often treat the constant as if it were a "variable" (stored item of data). They give it a name and store the constant value in it. If they then refer to the constant throughout their documentation by using its name instead of the literal value, it will be easy to change the constant in the future. For example, if we define the label RATE to represent 0.065, then we can use the label RATE throughout our program documentation instead of 0.065. If we ever want to change the tax rate, we need only change the statement defining the label RATE instead of searching through all of the program documentation for everywhere we might have written the value 0.065 and changing all of them. There are two different approaches to labeling constants. One involves the use of a variable (storage location) that is specially marked as being "read-only" (unchangeable). This approach is easy to understand and use, but it takes up unneeded computer storage space. In C++, this type of constant is referred to as a named constant. The other approach is to define a label called a symbolic constant which is used as an alias for the literal value, but is not actually stored by the program. Instead, the analyst simply notes the label and the value it represents. While constructing the program code, the constant value is substituted at each position within the program's instructions in which the symbolic constant (label) was used. This approach uses no storage locations because the literal value is not stored by the program. Instead it is embedded in the program instructions everywhere that the programmer mentioned the symbolic constant.
If the analyst chooses to use a read-only variable to represent a constant, then the constant is listed and noted as such with the variables in a "variable list" (explained later in this document). If the analyst chooses to use a symbolic constant to represent a constant, then the symbolic constant is listed and defined in a separate "symbolic constant list" (example follows).
Symbolic constants are fixed values that may be referenced using a label at many different locations within a program's instructions rather than repeating the value many times. This practice makes it easy to update the program if the value ever needs to be changed. The constant identified below is typical of those found in most programs.
| IDENTIFIER | DESCRIPTION | DATA TYPE | VALUE | USAGE | DESTINATION |
|---|---|---|---|---|---|
| RATE | Sales tax rate | Floating Point | 0.065 | for TAX | Screen (as a percentage) |
The format of the list is columnar, as shown above. Each item listed should be clearly defined in the Description column. A label should be chosen to represent the symbolic constant and recorded in the first column. The constant value should be written in the Value column. Note that fractional values (such as .065) should always be written with a leading zero to be sure that the decimal point is noticed. Specific notes about any processing that will depend on the value should be placed in the Usage column. If the constant will be output by the program, the target device should be noted in the Destination column. No position in the list should ever be left blank (which would imply that the analyst had not yet determined the item's properties). If there is no applicable notation for a position, enter "---" or "N/A".
If there are a few symbolic constants, they are typically listed in the order in which they are expected to be used in the program. If there are many symbolic constants, they are typically listed in alphabetical order by label.
Variables are storage locations that hold values that will be different each time a program is run. Each variable is given an identifier (label) that can be used by programmers to identify the storage location without having to know its numeric address in computer memory. The labels chosen by the programmer can be influenced by the programming language that will be used to code the program; however it is normally acceptable to use basic nouns or abbreviations of them. Variables are documented in a columnar table similar to a symbolic constant list (see above), with the exception of the fourth column which is used to indicate the initial source of each variable (rather than the Value indicated in a symbolic constant list). For this example, the following variable list would be reasonable.
| IDENTIFIER | DESCRIPTION | DATA TYPE | SOURCE | USAGE | DESTINATION |
|---|---|---|---|---|---|
| PUR | Purchase amount (in dollars) | Floating Point | Keyboard | for TAX and PAY | --- |
| TAX | Sales tax (in dollars) | Floating Point | Calculated | for PAY | Screen |
| PAY | Total payment (in dollars) | Floating Point | Calculated | --- | Screen |
Notice that the DESCRIPTION column clearly identifies each piece of data that must be stored including units of measure. The DATA TYPE column indicates the data type of the variable. The last three columns are used to indicate: (SOURCE) where the data comes from, (USAGE) what happens to it while it is stored, and (DESTINATION) where it will end up (for example, a screen, a printer, or disk storage). Notice that each variable has an entry in the SOURCE column (all data comes from somewhere) and each variable has an entry in at least one of the other two columns as well. In some programs, variables have an entry in all three columns.
If the analyst chooses to use a named constant rather than a symbolic constant to represent the constant value 0.065 within the analysis, then there would be no symbolic constant list written. Instead, RATE would be listed within the variable list exactly as it was in its row from the symbolic constant list with the value 0.065 written in the Source column.
An algorithm is a recipe or plan describing how to perform the program's task. Algorithms must have the following properties:
Algorithms can be written in a variety of forms, including:
The following algorithm is written in outline form.
A. Start.
B. Output Intro. & Instructions as shown in Softcopy.
B1. Program Title on Line 1.
B2. Blank Line.
B3. Program Credits (Author and Date).
B4. Blank Line.
B5. Introduction on Softcopy Lines 5 - 7.
B5a. (As shown on line 5).
B5b. (As shown on line 6 using the value of RATE for the tax rate).
B5c. (As shown on line 7).
B6. Blank Line.
C. Request and store input data as shown in Softcopy on line 9.
C1. Display prompt for PUR (w/o carriage return).
C2. Read keyboard entry and store it in PUR, then display car. return.
D. Calculate and store interim and final results.
D1. Assign TAX as PUR times RATE.
D2. Assign PAY as PUR plus TAX.
E. Display results on the screen as shown in the Softcopy on lines 10-12.
E1. Blank Line 10.
E2. Line 11.
E2a. "The sales tax for that amount is: $".
E2b. TAX (rounded to two decimal places) followed by a carriage return.
E3. Line 12.
E3a. "The total payment with tax is: $".
E3b. PAY (rounded to two decimal places) followed by a carriage return.
F. End.
The level of detail in each step should be such that it can be translated into a single statement in a programming language. Any step that is more complex than that (such as B5 or D) should be broken into sub-steps. That activity is referred to as "step-wise refinement".
The algorithm above could be written in the form of pseudocode instead of as an outline. Pseudocode is an informal but terse notation using simple verbs in place of the strict syntax of most computer programming languages. There is no formal standard for pseudocode. The statements below comprise an example of the algorithm above written as pseudocode.
BEGIN
DISPLAY Intro. & Instructions (as shown in softcopy)
DISPLAY Introduction on softcopy lines 5 - 8.
DISPLAY Prompt for PUR
READ PUR
ASSIGN TAX as PUR times RATE
ASSIGN PAY as PUR plus TAX
DISPLAY Blank Line
DISPLAY "The sales tax for that amount is: $", then TAX (rounded to cents)
DISPLAY "The total payment with tax is: $", then PAY (rounded to cents)
END
The informal nature of pseudocode makes it popular with programmers and fast to write. But that characteristic allows for imprecise algorithms, so we will not use pseudocode on assignments in this class. A reasonable alternative to an outline or pseudocode is a flowchart. Flowcharts use graphic symbols to represent each action that a computer can perform. These are linked together by arrows to indicate the flow of control from one step to the next. The algorithm above can also be written in the form of a flowchart (see the linked image). For more information on flowcharting, see the Flowcharting Symbols & Guidelines page on this site. Only one form of algorithm should be used on an analysis. If an outline is written, then pseudocode and flowcharts would be redundant.
A desk check is an activity performed by the analyst to confirm that all prior documentation is valid. To perform a desk check, simply make up some sample input data to use as you read through and perform the steps in your algorithm on paper. Whenever your algorithm says to output something, write that on a blank piece of paper simulating either the monitor screen or printed paper output. Above the output, write a title of Test Softcopy (or Test Hardcopy for printed output). Whenever your algorithm says to store a value, record that in a tracing chart similar to the one below so that you can easily keep track of all of the variables while performing your test. When you reach the end of your algorithm, the simulated output (softcopy or hardcopy) should exactly match the ones that you wrote as sample goals at the beginning of the analysis. The important thing to remember here is to write what the algorithm indicates rather than what you expect the program to produce. The purpose of the desk check activity is to test the algorithm you wrote rather than what you are thinking the program should produce.
The Data Tracing Chart is used to document what would be happening in the computer's memory during the execution of the steps described in your algorithm. A column is provided for each variable in your analysis, with an additional column (#1) to serve as a reference to steps in your algorithm. Note that the chart below relates only to the steps in the algorithm that affect memory (i.e., steps C and D).
| STEP | PUR (Input) | TAX (Calculated) | PAY (Calculated) |
|---|---|---|---|
| C2 | 100.10 | | |
| D1 | | 6.5065 | |
| D2 | | | 106.6065 |
The Test Output is produced by manually reading through the steps in your algorithm and recording (on paper) any output (softcopy or hardcopy) that your steps would produce. For example, given the test data used in the chart above, the algorithm above would produce the following softcopy.
| LINE | TEST SOFTCOPY |
|---|---|
| 1 | SALES TAX CALCULATOR |
| 2 | |
| 3 | Written by (your name here) - 01 January 2011 |
| 4 | |
| 5 | This program will request a purchase amount and then calculate |
| 6 | the sales tax based on a 6.5% tax rate and display the result |
| 7 | rounded to 2 decimal places. |
| 8 | |
| 9 | Enter the purchase amount: $[100.10] |
| 10 | |
| 11 | The sales tax for that amount is: $6.51 |
| 12 | The total payment with tax is: $106.61 |
Notice that the value stored in TAX by step D1 in the desk check's tracing chart was 6.5065. That step mentioned nothing about rounding. The rounding was a requirement placed on the output value rather than on the stored value; so it was mentioned in step E2b instead. The tracing chart properly shows the stored value. The test softcopy shows the rounded value. Some programs require that stored values be rounded. In those programs, the formulas in the calculation steps would mention rounding the stored value.
Only after the analysis (the logical part of programming) has been completed and verified by the desk check should the programmer consider the physical part of programming and move on to translating the algorithm (in conformance with the other analysis documentation) into source code (statements written in a computer programming language). Some of the assignments in this course will require only an analysis and will not require the submission of source code. Other assignments will provide the students with the analysis documentation and ask them to write the source code. And others will request both the analysis documentation and the source code.
It is not unusual for a programmer to wait until the analysis is complete to decide which programming language is best suited to writing the source code. Over the years, this course has used a variety of different programming languages to demonstrate coding practices. At the present time, we are using the language C++ for that. The C++ source code for the analysis above is shown on the page entitled "Primary Example of Coding in C++". For an explanation of the statements in that source code, study chapters 2 and 3 in your textbook and the coding examples on this web site.