Beyond C: Programming Languages Past, Present, Future
from the July 1985 issue of Unix World magazine
by David Spencer
Current third-generation languages such as C and FORTRAN will have to move aside at some point for a new family of fourth-generation languages.
At 30 years old, FORTRAN is graying at the temples; third-generation programming languages are in their heyday. So you are probably wondering how we will speak to computers during the next decade. If current projections hold true, computers will seem (and talk) more like us fairly soon. In order for that to happen, however, current third-generation computer languages (such as C and FORTRAN) will have to move aside for a new family of fourth-generation languages.
Have no fear, though. As computer architectures and programming methodologies have come to simulate human thinking more closely, programming languages have increasingly abandoned the procedural approach (how machines do something) in favor of a nonprocedural, functional one (what is to be done). As the address-related recall capability of the microprocessor is transformed into the associative recall of the brain, high-level languages (HLLs) move closer to, and will eventually be replaced by, the very high-level languages (VHLLs), also known as fourth-generation languages.
To belong to the fourth generation, a language must have crossed a threshold: the programmer specifies only the task to be done, and the knowledge of how to do the task is contained in the language itself.
Historically, languages have moved further from the machine level, becoming more abstract with each generation. To understand fourth-generation languages, you must first have some idea of what came before.
First-generation languages were little more than machine code, which was closest to the computer’s flip-flopping switches. Each set of ones and zeros represented current-on/current-off settings. The steps for even a simple operation were necessarily very discrete and seemed artificial in their total lack of assumed information. Arithmetic was all done in binary, and how a program was coded directly mirrored how an operation was performed by the machine.
The second generation brought assembly languages, which separated operations (instructions) from operands (data) and substituted names for binary numbers. Assembly languages added a layer of address-related recall to the programmer’s repertoire as assembler programs did the tedious, error-prone work of putting machine code together. Assembly languages were purely procedural because addition/subtraction involved the load-move-store of binary data. A simple operation still took many discrete steps, however, and decisions were made according to condition tests—the state of the machine’s registers—instead of the task’s inherent logic.
THIRD-GENERATION HLLs
The third-generation HLLs gave abstraction an algebraic form. Statements subsumed a greater number of incremental steps. Choices of action based on logical conditions replaced the comparisons of register contents and condition tests, moving closer to the human perception of the job to be done and away from the machine’s demands for a binary representation.
In an article entitled “Programming Languages,” James W. Hunt proposes task-centered criteria for a good language. Any HLL should allow the programmer to design programs easily, document programs, debug programs, move programs between machines (portability), verify a program’s correctness, and compile parts of a program separately.
These criteria are met to some degree by most widely used HLLs: FORTRAN, COBOL, Pascal, and C. Some functions associated with fourth-generation VHLLs also appear in existing languages. Ada, for instance, employs a class concept to abstract data structures and also allows a basic type definition to incorporate new items in the class or to be changed in response to specified behaviors.
A similar extension to C, called Objective-C, seeks to make object-oriented programming part of the Unix system repertoire. Ada, some Pascals, and Modula-2 provide concurrency, which allows separate processes to share resources, including the CPU. But these languages are not true members of the fourth generation because their syntactical structures haven’t really changed.
DIFFERS FROM ITS PREDECESSORS
As I said before, a fourth-generation language differs from its predecessors in the degree to which it minimizes the user’s need to specify machine behavior. Heather Bryce, in an article in Electronic Design, lists the general characteristics of fourth-generation languages:
(one) designed for on-line operation;
(two) easily used by nonprofessional programmers (generally, users should be able to learn a subset of the language in two days and get satisfactory results);
(three) employs a database management system (DBMS) directly and requires one-tenth the number of instructions necessary for coding in COBOL or PL/1;
(four) uses nonprocedural code and makes intelligent default assumptions where possible, encourages structured code, produces code that is easily understood and maintained by others, and allows easy debugging of programs.
According to Greg Blanpied, Xerox vice president of technology development, fourth-generation language is an umbrella term that usually covers four areas of new software:
(one) presentation languages, such as formal query languages, natural query languages, reporting, and graphics;
(two) specialty/specialized functions, such as spreadsheets, modeling, analysis, and simulation;
(three) application generators, usually for COBOL;
(four) VHLL, including such nonprocedural languages as LISP and Forth.
The decisions to be made in each area reflect the needs of the user. For example, a growing number of users are primarily ordinary business people whose needs may be served by different capabilities of each area. Such users generally want the presentation languages, coupled with some DBMS, to give them easy access to a company’s huge information system without becoming programmers or being totally dependent upon management information systems (MIS) staff.
They often need a special program for predicting the results of business decisions and modeling scenarios based on hypothetical situations or predictions. They don’t want to wait for MIS to make changes in existing programs; instead, they want to use COBOL program generators that tailor general programs to their specific needs. And they need decision-support systems (sometimes called expert systems) to analyze and extract data based on knowledge stored in the system.
Fourth-generation languages are generally one of three types: declarative, functional, or object-oriented. Declarative, or rule-based, languages use a set of operators to define the relationship between data. Once all the rules are established, the program executes them (Malpas and O’Leary, 1984). Functional languages apply mathematical expressions to data to get a result. The relationships come from applied mathematics, and the program has no constructs that change the original data. Each statement is executed independently, and the state of the machine does not affect the program in any way.
In object-oriented programs, a construct called an object contains the data and the commands to which the data responds. Objects can be organized into classes and analyzed for common features, “so that conclusions can be drawn or deductions made about the data” (Hindin, 1984).
WHAT COMES AFTER C?
Many people consider the Unix system and the C language to be the ultimate children of the third generation. But acceptance by systems programmers does not necessarily lead to general acceptance by the rest of the computing world. The Unix system and C form one kind of programming environment. However, several factors will determine whether that environment will support fourth-generation languages as well.
DBMS, the user interface, menus and windows—whole issues of various publications have been devoted to the ways in which the Unix system can be tamed and brought into the “user-friendly” world now inhabited by the Sun workstations and Apple Macintosh systems (see “References”). But the languages themselves, the VHLLs used to implement and control such systems, will still have to become less machine-oriented before the Unix system loses its “fit for true hackers only” reputation.
Specific implementations of existing languages such as C and Pascal often provide a programming environment in which fourth-generation characteristics might be incorporated. Proponents of the Unix system and C, for example, may suggest extensions to bring C into the fourth generation (Cox, 1983, 1984) or the addition of a widely used functional language to the Unix and C environment (Saunders, 1984).
Building the next generation’s language on top of the current one is a common and sensible approach, considering the time needed to start from scratch. It remains to be seen whether C, which most closely resembles assembly language, is the best third-generation base for such development. Adding an object-oriented component or fourth-generation interface to an effective Pascal implementation (such as Pascal-2) would be an equally practical approach.
Adding languages is a relatively easy way to make the Unix system accessible. Two special-purpose declarative languages, make and yacc, for example, are already part of the Unix system’s utility set. With relatively little training, users of these utilities can greatly increase their productivity (Malpas and O’Leary, 1984). Query languages accompany many of the relational databases available for Unix-based systems. General-purpose declarative languages are the next step. Prolog, for example, offers increased efficiency for users and for machines. William Wong claims that “Prolog compilers on larger machines generate code that is as efficient as C or LISP so that programming logically does not necessarily imply inefficiency” (Wong, 1984).
FUTURE LANGUAGES FOR THE UNIX SYSTEM
For the Unix system, the new languages developed by artificial intelligence (AI) researchers are generally expected to succeed the third-generation languages now in use. LISP and Prolog are most often mentioned; APL and Forth are less prominent, perhaps with good reason. To give you a taste of each, without getting bogged down in academic generalities or technical specifics, suppose we solve a simple problem with a demonstration program in each language.
This problem comes from the best-known book on Prolog (Clocksin and Mellish’s Programming in Prolog, 1981). In a database, we have four countries. For each country, we know the population (in millions) and the size (in millions of square miles). From that data, we want to know each country’s population density and relative information such as the country with the largest area, largest population, or greatest population density.
APL
APL (A Programming Language) was an early attempt to program according to the logic of the problem rather than to the architecture of the machine. Originally a notation for applied math algorithms, APL has been adopted by IBM and DEC and is now available on supermicros.
Proponents tout APL’s terse form, simple rules, and concise representation of concepts through graphic symbols. You enter calculations as if you were using a calculator. Figure 1 illustrates our sample problem in APL. The first two lines establish our database. The “country” matrix has four items; each item has six characters. The “data” matrix is a corresponding matrix of four items, each with two numbers. The last three lines define the relationships as mathematical calculations.
To compute the density, we divide the first element of each item pair in the data matrix by the second element. In the fourth line, the function symbol for “maximum” is compressed onto density with the function symbol (“/”). Compression allows one element to be picked from a group, in this case the largest one. Using this same technique, the final line gives us the name of the country with the highest population.
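For readers more at home in a conventional notation, the same whole-array steps can be sketched in Python. This is only an illustration of the calculations described above, not a transcription of the APL in Figure 1; the country names and figures are hypothetical placeholders, and the variable names are assumptions chosen to mirror the text.

```python
# Hypothetical placeholder data, NOT the values used in Figure 1.
# Each pair is (population in millions, area in millions of square miles).
country = ["alpha", "beta", "gamma", "delta"]
data = [(800, 4), (500, 1), (200, 3), (100, 3)]

# Divide the first element of each pair by the second, for every row at once.
density = [pop / area for pop, area in data]

# Pick the single largest element from the group of densities.
max_density = max(density)

# Select the name whose population equals the largest population.
populations = [pop for pop, _ in data]
largest = [name for name, pop in zip(country, populations)
           if pop == max(populations)]

print(density, max_density, largest)
```

Note that nothing in the sketch loops over indexes or manages storage explicitly; each line states a relationship over the whole collection, which is the property the APL version is meant to show.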
I can hardly discuss APL without mentioning the “funny symbols”; they are the focal point, it seems, for the real debate about APL’s usefulness as a language. APL’s features may appeal more to mathematicians than to business people. Its creator, Kenneth Iverson, offers in his book A Programming Language this incentive: “The descriptive and analytic power of an adequate programming language repays the effort required for its mastery.” The degree to which APL rewards casual users is debatable: Other languages allow users to concentrate on the problem, free from “concern with computer-oriented details.” And these languages use more familiar symbols.
LISP
LISP stands for List Processing Language (not Lots of Inconsequential Silly Parentheses). It was one of the first AI languages to become well known outside AI research centers. Popular architectures for Unix systems, such as the Motorola 68000, have allowed LISP to reveal its true potential.
Several LISP features are a definite move toward associative memory recall. One such feature is modularity: Data structures can be linked to form larger ones. By changing a set of pointers, you change the relative location of the structure. As a result, the essentially dynamic allocation of storage areas replaces the lengthy definitions of parameters for each program. LISP also provides “automatic garbage collection” as part of its efficient management of memory.
Figure 2 shows how to solve our demonstration problem with a LISP program. As you can see, the program in Figure 2 requires little more than a knowledge of the relationships between the data. Using English words or recognizable abbreviations, we define those relationships in the function definitions. In the applications definition, the relationships may be ordered to produce the desired results. Most computer manufacturers are offering LISP or Common LISP (a more recent and more consistent implementation) as part of their system software packages.
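The split between function definitions and an application that orders them can be sketched in Python as follows. This is a loose analogue under stated assumptions, not the LISP of Figure 2: the records are hypothetical placeholders, and every function name here is invented for illustration.

```python
from functools import reduce

# Hypothetical placeholder records, NOT the article's data:
# (name, population in millions, area in millions of square miles).
COUNTRIES = (
    ("alpha", 800, 4),
    ("beta", 500, 1),
    ("gamma", 200, 3),
    ("delta", 100, 3),
)

# Function definitions: each one names a relationship between the data.
def name(c):
    return c[0]

def population(c):
    return c[1]

def area(c):
    return c[2]

def density(c):
    return population(c) / area(c)

def largest(measure, countries):
    """Fold over the list, keeping the record with the greatest measure."""
    return reduce(lambda best, c: c if measure(c) > measure(best) else best,
                  countries)

# Application: order the relationships to produce the desired results.
report = {
    "largest area": name(largest(area, COUNTRIES)),
    "largest population": name(largest(population, COUNTRIES)),
    "greatest density": name(largest(density, COUNTRIES)),
}
print(report)
```

The data is never modified; each answer comes from applying one function to the result of another, which is the style the LISP program relies on.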
PROLOG
Prolog (the name stands for Programming in Logic) allows you to do many of the same things you can do in LISP. Both languages employ object-oriented relationships. Prolog’s basic relationship is the Horn clause, a predicate calculus formula that contains only one conclusion. Prolog uses a rule-based search to manipulate the data in our sample problem, a slightly different approach from LISP’s function/application definitions. Facts are added to the database and are then interpreted according to the rule.
The statements in Figure 3 headed by pop and area are clauses stating the facts from which conclusions may be found if we know the population and area of X and divide the population by the area.
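The relationship between facts and a rule can be approximated in Python, as a rough sketch only. The pop and area tables stand in for the fact clauses, with hypothetical placeholder values rather than those of Figure 3, and the density generator is an assumed stand-in for a rule that succeeds only when both facts about X are known.

```python
# Facts: one entry per clause. Values are hypothetical placeholders,
# NOT the figures from the article (millions of people, millions of sq. miles).
pop = {"alpha": 800, "beta": 500, "gamma": 200, "delta": 100}
area = {"alpha": 4, "beta": 1, "gamma": 3, "delta": 3}

def density(x):
    """Rule: the density of X is known if pop(X) and area(X) are known,
    and it equals the population divided by the area.
    Yields nothing when a fact is missing, much as a query would fail."""
    if x in pop and x in area:
        yield pop[x] / area[x]

def query_density():
    """Enumerate every (X, density) pair for which the rule holds."""
    for x in pop:
        for y in density(x):
            yield x, y

answers = dict(query_density())
print(answers)
print(list(density("nowhere")))  # no facts about this country, so no answers
```

The program never says how to search; it states what is true and leaves the enumeration to a general mechanism, which a Prolog system supplies for itself.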
Prolog is heavily endorsed in Japan and England, and many American firms are using it as an interface to Unix system databases.
FORTH
Forth is more compact than LISP, so it’s more popular on current microcomputers. It’s also very transportable and can do real-time applications.
A Forth routine consists primarily of addresses that point to commands written in machine code, “primitives” in Forth terminology. The user’s instructions, called “secondaries,” are written in primitives, and all secondaries for a program go into a dictionary. The syntax of the language is a string consisting of Forth words separated by spaces. Subroutines are called implicitly by “words” that start actions.
The big difference between Forth and other functional languages is mostly a matter of applicability. Like APL’s, Forth’s syntax is terse and specialized, but unlike APL, Forth would make even our simple demonstration program very difficult. We would have to extend the language by entering new words into Forth’s dictionary, or we’d have to do our arithmetic using registers and a stack, as we would in assembly language.
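To make the dictionary-and-stack model concrete, here is a toy interpreter sketched in Python. It is a simplified illustration, not real Forth: the two Python functions play the role of primitives, the define helper and the DENSITY word are hypothetical, the numbers are placeholders, and the sketch uses floating-point division for readability.

```python
stack = []

def divide():
    """Primitive: replace the top two stack items with their quotient."""
    divisor = stack.pop()
    dividend = stack.pop()
    stack.append(dividend / divisor)

def maximum():
    """Primitive: replace the top two stack items with the larger one."""
    stack.append(max(stack.pop(), stack.pop()))

# The dictionary maps each word to the action it starts.
dictionary = {"/": divide, "MAX": maximum}

def run(source):
    """Execute a string of words separated by spaces."""
    for token in source.split():
        if token in dictionary:
            dictionary[token]()         # a known word: perform its action
        else:
            stack.append(float(token))  # anything else is pushed as a number

def define(word, body):
    """Extend the language: enter a new word built from existing words."""
    dictionary[word] = lambda: run(body)

# A hypothetical secondary: population and area on the stack, density left behind.
define("DENSITY", "/")

# Placeholder figures: two countries' densities, then the larger of the two.
run("800 4 DENSITY 500 1 DENSITY MAX")
result = stack.pop()
print(result)
```

Even in this small form, the programmer is tracking what sits on the stack at every step, which is the machine-level bookkeeping the other three languages hide.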
Forth is a machine-level programmer’s language. Even more specialized than APL, Forth is an object-oriented language for those who want to manipulate the computer itself.
TOO SOON TO CHOOSE?
Fourth-generation languages may be common about 10 years from now. However, government funding might accelerate the process and make micro-based artificial intelligence software available even sooner. But success of any one language or product will probably depend as much on the marketing of the product as on the technical development.
Harvey J. Hindin, special features editor for Computer Design, expressed skepticism about the readiness of AI languages in an article (Hindin, 1984) on the new software: “Prolog may end up being fundamentally flawed just because it is a logic-based language. It turns out that logic-based languages (some with even more features than Prolog) have been proposed before and found lacking. They have turned out to be duds because logic-based languages are not flexible enough for the real world. . . . Other issues in the great debate between LISP and Prolog await the test of time.”
The fifth-generation languages will be AI-inspired natural-language systems that characteristically handle a variety of grammatical/non-grammatical constructions, infer from user inquiry where data will be found in a database, and execute necessary manipulations, procedures, and formatting.
These systems are yet to be made generally available, but you can see the beginnings in such AI languages as Prolog and LISP.
In 1969 J. E. Sammet concluded her seminal book on the history of programming languages (Programming Languages: History and Fundamentals) with the prediction that future developments would either be theory-oriented or user-oriented. The goal of a theory-oriented approach is to give the system a complete characterization of all objectives to be achieved and let it design the program accordingly. Many fourth-generation languages take this approach.
The goal of a user-oriented approach is to implement a natural language to allow the system to respond to the specific idiosyncrasies of each user’s needs. Sixteen years later, the debate still continues, and both approaches are still in progress. □
David Spencer is technical publications manager at Oregon Software Inc., Portland, Ore. After teaching English for 10 years, he has written software documentation over the last five years for Sperry Univac and Oregon Software. His interest in programming languages stems from his graduate days at USC, where he studied rhetoric, linguistics, and literature.
References
Baker, Linda, and Mitch Derick. Pocket Guide to Forth. Reading, Mass.: Addison-Wesley, 1983.
Berg, Eric H. “Bell Labs Unveil New 1-Megabit Chip.” The (Portland) Oregonian, December 21, 1984, page E9.
Blanpied, Greg. “Using Fourth-Generation Languages Well.” Software News, July 1984.
Bryce, Heather. “Software Engineers Seek to Make Computers Understand Natural Languages.” Electronic Design, May 3, 1984.
Cantral, David. “Speaking the Users’ Language.” Software News, July 1984, page 24.
Clocksin, W. F., and C. S. Mellish. Programming in Prolog. New York: Springer-Verlag, 1981.
Cox, Brad J. “Object-Oriented Programming in C.” Unix Review, October/November 1983, page 67, and February/March 1984, page 56.
Epstein, Arnold, Jeffrey D. Morris, and Barry Unger. “Forth Efficiency Blends with C and Pascal Syntax.” Computer Design, November 1984, page 183.
Fletcher, George W. “Talking to Your Computer in English.” Software News, July 1984, page 30.
Grosch, Herbert. Computer Design, November 1984, page 150.
Hindin, Harvey J. “Fifth-Generation Computing: Dedicated Software Is The Key.” Computer Design, September 1984, page 150.
Hunt, James W. “Programming Languages.” Computer, April 1982, page 70.
Li, Deyi. A Prolog Database System. New York: John Wiley & Sons, 1984.
Lipkin, Efrem, and Theodore Goldstein. “Software Wars and the Development Post-Structural Programming.” Unpublished paper, 1985.
Malpas, John, and Kathy O’Leary. “Declarative Languages Under Unix.” Microsystems, August 1984, page 94.
Roland, Jon. “Unix Database Management Systems.” In three parts: Unix Review, December/January 1983, page 43; February/March 1984, page 26; and April/May 1984, page 24.
Roland, Jon. “AI, Unix, and C.” Unix/World, Vol. 1, No. 3, 1984, page 96.
Sammet, J. E. Programming Languages: History and Fundamentals. Englewood Cliffs, N.J.: Prentice-Hall, 1969.
Santiarelli, Mary-Beth. “What Kind of Aid? It Depends on Skill.” Software News, July 1984, page 22.
Saunders, David. “Unix, C, and APL.” Unix/World, Vol. 1, No. 6, page 59.
Shinder, Max. “Engineering Software.” Electronic Design, January 12, 1984, page 150.
Spivey, Mike. University of York Portable Prolog System Users Guide, University of York, 1984.
Wong, William D. “Prolog: A Tutorial/Review.” Microsystems, January 1984, page 104.
GLOSSARY
Declarative languages. A program consists of a set of rules governing the relationships between various types of data.
Procedural languages. A program consists of flow-control constructs and data structures; users must bind data to types.
Rule-based language. Same as declarative language.
Functional programming. A program consists of a set of computational instructions.
Object-oriented programming. Data is coupled with a set of operations; the combination (called an object) is activated by commands to do things.
Query language. A query language such as SQL is used to access a database.
Expert system. Stores knowledge that a human expert might give in response to questions.
Knowledge system. Same as an expert system.