Building Performance Optimization: Introduction to Programming

This post, the second in a series on API programming for beginners, introduces the basics of programming. It is based on notes on programming provided by Dr. Mark Clayton.

General definitions

Computer programming is the process and science of creating software. Although the field has a great deal of sophistication, it is based on just a few fundamental concepts.

Program

A software program is a collection of operations upon data that work together to accomplish a task. Typically, the program must have some user interface by which a user can invoke one or more of the operations. When the operations in the program are occurring, the program is said to be executing or running. To invoke the operations and cause them to take place is to run or execute the program.

Algorithm

A software program implements an algorithm. An algorithm is a step-by-step process for accomplishing a task. For example, a recipe in a cookbook is an algorithm. A chemistry experiment must be described as an algorithm so that it is repeatable.
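As a rough illustration, the familiar recipe for converting a temperature from Fahrenheit to Celsius can be written as a sequence of steps in C#. The variable names and the input value are invented for this sketch, and it assumes a recent C# compiler (C# 9 or later) that accepts statements outside a class.

```csharp
using System;

// Hypothetical input: a temperature in degrees Fahrenheit.
double fahrenheit = 68.0;

// Step 1: subtract 32.
double afterSubtracting = fahrenheit - 32.0;

// Step 2: multiply by 5.
double afterMultiplying = afterSubtracting * 5.0;

// Step 3: divide by 9 to get degrees Celsius.
double celsius = afterMultiplying / 9.0;

// Step 4: report the result.
Console.WriteLine(celsius);
```

Following the same steps in the same order always gives the same answer, which is what makes the process an algorithm.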

Programming languages

Programs must be written in a language, which defines the operations that perform fundamental actions. Making the chips and circuits in a computer do anything useful requires instructions in low-level “machine code.” Because writing at the level of an integrated circuit is very difficult, computer scientists created “high-level” languages that enable more than just mathematicians and electrical engineers to write programs. Since LISP and FORTRAN were invented in the late 1950s, hundreds of other languages have followed.

Although computer languages all essentially do the same thing, they express the fundamental concepts in different terms to appeal to different categories of specialists and to optimize for particular kinds of problems. For example, LISP has traditionally been favored for artificial intelligence applications. FORTRAN is tailored to mathematical and engineering calculations. COBOL was invented for data processing. Pascal was intended as an aid to teaching people to program. Newer languages implement more sophisticated concepts that aid in the creation of complex, modern programs. C++ and C#, both descended from C, are general-purpose languages that are widely used in contemporary software development.

Interpreting and compiling

There are two general categories of software for translating a high-level language into machine language:

1. An interpreter is a system that converts the statements in the programming language into machine language and executes them one at a time. Interpreters are good for learning to program because the user may simply type in a single statement in the language and see what it does. Interpreters have the disadvantage that to run the program, one must have a copy of the program and a copy of the interpreter. Also, interpreters tend to be slow in executing large programs.

2. A compiler uses a different strategy. First, the program must be written completely. Then the user invokes the compiler on the entire program instead of a single statement. The compiler converts the entire program into machine code in an "executable" format. The user can then run the program without needing the compiler. A compiled program tends to run faster than the same program executed by an interpreter.

Programming concepts

All programming uses a few simple concepts. In some ways, these concepts are so simple that they are confusing. They relate to symbolic logic, classification of data, and manipulation of data. A fundamental idea is that of a symbol, which is a pairing of a name with a value. Referring to the name retrieves the value.

There are basically two kinds of things represented by symbols: data and actions. A data symbol can be fixed and unchanging, in which case it is called a “constant”, or its value can change as the result of actions, in which case it is called a “variable”. There are also basically two kinds of action symbols, although the distinction is not as clear-cut. An action symbol can be an “operator” that simply transforms one or two values into a new value. Or an action symbol can be a complex set of operations that have been named so that they can be retrieved as a group. These are variously called routines, procedures, functions, or methods. While there are differences among these concepts, the terms are usually used interchangeably and imprecisely. For now we can consider them all to be the same thing.

A statement is a line of code that tells the computer to do something. A program is made of many statements that manipulate the various symbols to do something useful. A statement is similar to a sentence in that it must be syntactically correct. Many languages end each statement with a semicolon.
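The kinds of symbols described above can be sketched in a few lines of C#. All names and values here are hypothetical, chosen only for illustration, and the sketch assumes a recent C# compiler (C# 9 or later) that accepts statements outside a class.

```csharp
using System;

// A constant symbol: the name HoursPerDay always retrieves the value 24.
const int HoursPerDay = 24;

// A variable symbol: the value paired with the name "days" can change.
int days = 2;
days = 7;

// An operator (*) transforms two values into a new value.
int hours = days * HoursPerDay;

// A function: a group of operations that has been given a name
// so that it can be retrieved and used as a group.
int HoursInDays(int numberOfDays)
{
    return numberOfDays * HoursPerDay;
}

// Each line ending in a semicolon is a statement.
Console.WriteLine(hours);
Console.WriteLine(HoursInDays(30));
```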

Data types

Most languages define a variety of types of data. Many languages, including C#, further require that the type of a symbol be declared when the symbol is created. Writing a program is a matter of declaring the symbols and then sequencing the execution of operators upon those symbols.

The common simple types of data are characters, integers, real numbers, strings (collections of characters such as words or sentences), and Booleans (values that can be either true or false). In the example below, a new symbol named “num1” is created with type “int” (integer).

int num1;
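For comparison, here is a sketch that declares one symbol of each common simple type in C#. The symbol names and values are made up for this example, and the statements assume a recent C# compiler (C# 9 or later) that accepts them outside a class.

```csharp
using System;

char grade = 'A';                  // a single character
int floorCount = 12;               // an integer (whole number)
double ceilingHeight = 2.7;        // a real number
string buildingName = "Library";   // a string (a collection of characters)
bool isOccupied = true;            // a Boolean: either true or false

Console.WriteLine(grade);
Console.WriteLine(floorCount);
Console.WriteLine(ceilingHeight);
Console.WriteLine(buildingName);
Console.WriteLine(isOccupied);
```

Here each declaration also gives the symbol a starting value, which C# allows in the same statement.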

Operators

A first step in understanding programming is to understand the actions that the software can take. There are really very few. The most basic operation is assignment, which associates a value with a symbol. Next, there are arithmetic operations like addition, subtraction, multiplication and division. There are also comparison operators (sometimes loosely called logical operators) that compare one value to another and produce a true or false result, such as <, >, and ==.
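A minimal sketch of the arithmetic and comparison operators in C# follows. The symbol names and values are arbitrary, and the statements assume a recent C# compiler (C# 9 or later) that accepts them outside a class.

```csharp
using System;

int a = 7;
int b = 2;

// Arithmetic operators produce a new number from two numbers.
int sum = a + b;          // 9
int difference = a - b;   // 5
int product = a * b;      // 14
int quotient = a / b;     // 3, because dividing two integers drops the remainder

// Comparison operators produce a Boolean (true or false).
bool isLess = a < b;      // false
bool isGreater = a > b;   // true
bool isEqual = a == b;    // false

Console.WriteLine(sum);
Console.WriteLine(quotient);
Console.WriteLine(isGreater);
```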

Assigning

In assignment, a value is given to a symbol. Many programming languages write assignment with an equal sign, so in C# an assignment might look like:

num1 = 5;

This simple statement can be understood as placing the value “5” into the symbol “num1”.
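The sketch below extends that statement to show two details that often surprise beginners: assigning one symbol to another copies the current value, and the right side of an assignment is evaluated before the result is stored. The symbol num2 is added only for this illustration, and the statements assume a recent C# compiler (C# 9 or later) that accepts them outside a class.

```csharp
using System;

int num1;          // declare the symbol
num1 = 5;          // place the value 5 into num1

int num2 = num1;   // copy the current value of num1 (5) into num2

num1 = num1 + 1;   // evaluate the right side (5 + 1), then store 6 in num1

Console.WriteLine(num1);
Console.WriteLine(num2);
```

Note that the single equal sign assigns a value, while the double equal sign (==) compares two values.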

Control structures

A set of statements is executed in a particular sequence, and the sequence matters: executing the same statements in random order will produce unpredictable results. However, there are only three patterns for executing a set of statements. These patterns are called control structures.

1. A sequence is the default control structure. Most programming languages start at the top of the page and work their way down, one statement after another.

2. A conditional is the next kind of control structure. The basic idea is that if an expression is true, one set of statements is executed. If the expression is false, another set of statements is executed.

3. A loop is the third control structure. A loop says to do the same set of statements repeatedly until some condition is met. Loops are the true power of computing. Doing something once is often no faster with a computer. Doing the same thing dozens or hundreds of times is almost always faster with a computer. A common mistake is to make a loop but forget to change the value of the variable checked in the condition of the loop. The loop then executes forever, or until the program is forcibly stopped.
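The three control structures can be sketched together in C#. The symbol names and values are invented for this example, and the statements assume a recent C# compiler (C# 9 or later) that accepts them outside a class.

```csharp
using System;

// Sequence: these statements run from top to bottom.
int count = 3;
int total = 0;

// Conditional: one set of statements runs if the expression is true,
// and another set runs if it is false.
string size;
if (count > 10)
{
    size = "large";
}
else
{
    size = "small";
}

// Loop: repeat the same statements while the condition remains true.
while (count > 0)
{
    total = total + count;
    count = count - 1;   // without this line the loop would never end
}

Console.WriteLine(size);
Console.WriteLine(total);
```

The statement that decreases count is what eventually makes the loop condition false. Leaving it out is exactly the common mistake described above.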

Function (Method)

In computer programming, a function is a sequence of actions that are bundled together and named so that they can be used as a group. A function has arguments or parameters that act as the input information to the actions. It often has a return value or result that is the new information computed or manipulated by the function. The use of a function in other code is referred to as "calling" or "invoking" the function.

The essence of computer programming is writing functions that call other functions. In this way, each piece of the problem may be defined as a small and focused task. Complex operations can be built up from many small and simple actions.
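As a small sketch of one function calling another, the C# code below builds a room-volume calculation on top of a simpler area calculation. The function names, parameters, and dimensions are hypothetical, and the code assumes a recent C# compiler (C# 9 or later) that accepts functions and statements outside a class.

```csharp
using System;

// A small, focused function: two parameters in, one return value out.
double RectangleArea(double width, double length)
{
    return width * length;
}

// A second function that is built by calling the first one.
double RoomVolume(double width, double length, double height)
{
    return RectangleArea(width, length) * height;
}

// Calling (invoking) the function with three arguments.
double volume = RoomVolume(4.0, 5.0, 3.0);
Console.WriteLine(volume);
```

RoomVolume does not need to know how the area is computed. It only needs the name of the function, its parameters, and its return value.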

Wednesday, October 10, 2012