Copyright © 2007 Steven F. Lott
This work is licensed under a Creative Commons License. You are free to copy, distribute, display, and perform the work under the following conditions:
Attribution. You must give the original author, Steven F. Lott, credit.
Noncommercial. You may not use this work for commercial purposes.
No Derivative Works. You may not alter, transform, or build upon this work.
For any reuse or distribution, you must make clear to others the license terms of this work.
6/1/2007
The Walrus and the Carpenter.
"The time has come," the Walrus said, "To talk of many things: Of shoes — and ships — and sealing-wax — Of cabbages — and kings — And why the sea is boiling hot — and whether pigs have wings."
You'll need to read this book when you have the following three things happening at the same time:
Or, perhaps you are a tinkerer who likes to know how things really work. For many people, a computer is just an appliance. You may not find this satisfactory, and you want to know more. People who tinker with computers are called hackers, and you are about to join their ranks.
Python is what you've been looking for. It is an easy-to-use tool that can do any kind of processing on any kind of data. Seriously: any processing, any data. Programming is the term for setting up a computer to do the processing you define on your data. Once you learn the Python language, you can solve your data processing problem.
Our objective is to get you, a non-programming newbie, up and running. When you're done with this book, you'll be ready to move on to a more advanced Python book: for example, a book about the Python libraries. These libraries can help you build high-quality software with a minimum of work.
This book is about many things. The important topics include Python, programming, languages, data, processing, and some of the skills that make up the craft of programming. We'll talk about the core intellectual tools of abstraction, algorithms and the formality of computer languages. We'll also touch on math and logic, statistics, and casino games.
Python. Python is a powerful, flexible toolbox and workbench that can help solve your data processing problem. If you need to write customized software that does precisely what you want, and you want that software to be readable, maintainable, adaptable, inexpensive and make best use of your computer, you need Python.
Here's a very important distinction:
What does this distinction mean? First, there is an opportunity for us to confuse Python (the program) and Python (the language). We'll attempt to be as clear as we can on the things the Python program does when you give it commands in the Python Language. For people very new to computers, this raises questions like “what is a programming language?” and “why can't it just use English?” and “what if I'm not good with languages?” We'll return to these topics in the section called “Concepts FAQ's”. For now, we'll emphasize the point that the Python language is more precise than English, and also easy to read and write.
The other thing that the distinction between program and language means is that we will focus our efforts on learning the language. The data processing we want to perform will be completely defined by a sequence of statements in the Python language. Learning a computer language isn't a lot different from learning a human language, making our job relatively easy. We'll be reading and writing Python in no time.
Programming. When we've written a sequence of Python statements, we can then use that sequence over and over again. We can process different sets of data in a standard, automatic fashion. We've created a program that can automate data processing tasks. It can replace tedious or error-prone pointing and clicking in other software tools. Or it can do new things that other desktop tools can't do at all.
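As a small sketch of this idea, the statements below are packaged under a hypothetical name, average, and then reused on two made-up sets of data. The name and the numbers are ours, chosen only for illustration, and we write print with parentheses so the sketch runs on both older and newer versions of Python. Function definitions are covered properly later in the book.

```python
# A hypothetical, reusable sequence of statements: compute an average.
def average(values):
    total = 0.0
    for v in values:
        total += v
    return total / len(values)

# The same statements process different sets of (made-up) data.
monday = [3, 5, 10]
tuesday = [2, 4]
print(average(monday))
print(average(tuesday))
```

The processing is written once; only the data changes from one use to the next.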
The big picture is this: the combination of the Python program plus our unique sequence of Python language statements has the effect of creating a new application program for our computer. It means that our new program is built on the existing Python program as its foundation. The Python program, in turn, depends on many other libraries and programs on your computer. The whole structure forms a kind of technology stack, with your program on top, controlling the whole assembly.
Languages. We'll look at three facets of a programming language: how you write it, what it means, and the additional practical considerations that make a program useful. We'll use these three concepts to organize our presentation of the language. We need to separate these concepts to assure that there isn't a lot of confusion between the real meaning and the ways we express that meaning.
The sentences “Xander wrote a tone poem for chamber orchestra” and “The chamber orchestra's tone poem was written by Xander” have the same meaning, but express it in different ways. They have the same semantics, but different syntax. For example, in one sentence the verb is “wrote”, in the other sentence it is “was written by”: different forms of the verb to write. While they have the same semantics, the first form is written in the active voice, and the second form is written in the passive voice. Pragmatically, the first form is slightly clearer and more easily understood.
The syntax of the Python language is covered here, and in the Python Reference Manual. Python syntax is simple, and very much like English. We'll provide many examples of language syntax. We'll also provide additional tips and hints focused on newbies and non-programmers. When you install Python, you will also install a Python Tutorial that presents some aspects of the language, so you'll have at least three places to learn syntax.
The semantics of the language specify what a statement really means. We'll define the semantics of each statement by showing what it makes the Python program do to your data. We'll also be able to show where there are alternative syntax choices that have the same meaning. In addition to semantics being covered in this book, you'll be able to read about the meaning of Python statements in the Python Reference Manual, the Python Tutorial, and chapter two of the Python Library Reference.
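Here is a minimal sketch of two syntax choices with the same meaning. The variable names are made up for illustration; both forms add one to a number.

```python
# Two different spellings (syntax) of the same action (semantics).
count_a = 5
count_a = count_a + 1   # the long form

count_b = 5
count_b += 1            # the short form, called augmented assignment

print(count_a == count_b)
```

Both variables end up holding 6, so the comparison is true. Which form to write is a pragmatic choice, not a difference in meaning.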
In this book, we'll try to provide you with plenty of practical advice. In addition to breaking the topic into bite-sized pieces, we'll also present lots of patterns for using Python that you can apply to real-world problems.
Extensions. The extensive libraries are part of the Python technology stack. These libraries are added onto Python, which has the advantage of keeping the language trim and fit. Software components that you might need for specialized processing are kept separate from the core language. Plus, you can safely ignore the components you don't need.
This means that we actually have two things to learn. First, we'll learn the language. After that, we'll look at a few of the essential libraries. Once we've seen that, we can see how to make our own libraries, and our own application programs.
Programming and Computer Skills. We're going to focus on programming skills, which means we have to presume that you already have general computer skills. You should fit into one of these populations.
What skills will you need? How will we build up your new skills?
Skills You'll Need. This book assumes an introductory level of skill with any of the commonly-available computer systems. Python runs on almost any computer; because of this, we call it platform-independent. We won't presume a specific computer or operating system. Some basic skills will be required. If these are a problem, you'll need to brush up on these before going too far in this book.
How We Help. Newbie programmers with an interest in Python are our primary audience. We provide specific help for you in a number of ways.
Programming is an activity that includes the language skills, but also includes design, debugging and testing; we'll help you develop each of these skills.
When you've finished with this book you should be able to do the following.
A Note on Clue Absorption. Learning a programming language involves accumulating many new and closely intertwined concepts. In our experience teaching, coaching and doing programming, there is an upper limit on the “Clue Absorption Rate”. In order to keep below this limit, we've found that it helps to build up the language as ever-expanding layers. We'll start with a very tiny, easy to understand subset of statements; to this we'll add concepts until we've covered the entire Python language and all of the built-in data types.
Our part of the agreement is to do things in small steps. Here's your part: you learn a language by using it. In order for each layer to act as a foundation for the following layers, you have to let it solidify by doing small programming exercises that exemplify the layer's concepts. Learning Python is no different from learning Swedish. You can read about Sweden and Swedish, but you must actually use the language to get it off the page and into your head. We've found that doing a number of exercises is the only way to internalize each language concept. There is no substitute for hands-on use of Python. You'll need to follow the examples and do the exercises. As you can probably tell from this paragraph, we can't emphasize this enough.
The big difference between learning Python and learning Swedish is that you can immediately interact with the Python program, doing real work in the Python language. Interacting in Swedish can be more difficult. The point of learning Swedish is to interact with people: for example, buying some kanelbulle (cinnamon buns) for fika (snack). However, unless you live in Sweden, or have friends or neighbors who speak Swedish, this interactive part of learning a human language is difficult. Interacting with Python only requires a working computer, not a trip to Sweden.
Also, your Swedish phrase-book gives you little useful guidance on how to pronounce words like sked (spoon) or sju (seven); words which are notoriously tricky for English-speakers like me. Further, there are some regional accents within Sweden, making it more difficult to learn. Python, however, is a purely written language so you don't have subtleties of pronunciation, you only have spelling and grammar.
This book falls into thirteen distinct parts. To manage the clue absorption rate, the parts are organized in a way that builds up the language in layers from simple, central concepts to more advanced features. Each part introduces a few new concepts. Programming exercises are provided to encourage further exploration of each layer.
Some programming languages (like Pascal or Basic) were specifically designed to help teach programming. Most other programming languages (like Python) are designed for doing the practical work of solving information processing problems. One consequence of this is that Python is a tightly integrated whole. Some features of the language will have both simple and advanced semantics. In many cases some simple-looking features will actually depend on some more advanced parts of the language. This forces us to revisit some subjects several times, first for an introduction, then for more in-depth treatment.
Chickens and Eggs. One subtext woven into this book is the two-sided coin labeled “data processing”. The processing side of the coin reflects the imperative-voice verb statements in the Python language. This active sense of “first do this, then do that” is central to programming. On the other side of the coin, we have the data side, which includes numbers, strings of letters, related groups of values, lists of values and relationships between values. Often, when we think of computer data, we think of files. The way we structure our data is also central to programming.
Data and Processing have a chicken-and-egg relationship. We could cover either of these topics first and get to the other second. In this book, we had to choose, and we elected to look at processing first and then, in Getting Our Bearings, switch over to the data side.
The other topics that weave through this book are the design, debugging and testing skills you'll need to grow. We'll develop these skills through hands-on use, so each chapter has five kinds of information.
Some Big Problems. There are a couple of problems that we'll use throughout this book to show how you use Python. Both problems are related to casino games. We don't embrace gambling; indeed, as you work through these sample problems, you'll see precisely how the casino games are rigged to take your money. We do, however, like casino games because they are moderately complex and not very geeky. Really complex problems require whole books just to discuss the problem and its solution. Simple problems can be solved with a spreadsheet. In the middle are problems that require Python.
We'll provide some of the rules for Roulette in Chapter 4, Two Minimally-Geeky Problems as well as some of the rules for Craps. We'll look at a couple of interesting casino gambling problems in this chapter that will give us a representative problem that we can solve with Python programming.
Getting Started. Part I, “Getting Started” introduces the basics of computers, languages and Python. Chapter 1, About Computers defines the basic concepts we'll be working with. Chapter 2, About Programs will more fully define a program and the art of programming. Chapter 3, Let There Be Python covers installation of Python. Chapter 4, Two Minimally-Geeky Problems gives an overview of two problems we'll use Python to solve. Chapter 5, Why Python is So Cool provides some history and background on Python.
Using Python. Part II, “Using Python” introduces using Python and the IDLE development environment. We'll cover direct use of Python in Chapter 6, Instant Gratification. We'll cover IDLE in Chapter 7, IDLE Time.
Additional sections will add depth to this material as we explore more of the language. Chapter 14, Turning Python Loose With a Script shows how to control Python with a script of statements. Chapter 23, Turning Python Loose with More Sophisticated Scripts will make use of the Python control statements for more sophisticated scripts.
Processing. Part III, “Arithmetic and Expressions” introduces the basic features of the Python language. Chapter 8, Simple Arithmetic includes the basic arithmetic operations and numeric types. Chapter 9, Better Arithmetic Through Functions introduces the most useful built-in functions. Chapter 11, Special Ops covers some additional operators for more specialized purposes. Chapter 12, Peeking Under the Hood has some additional topics that may help you get a better grip on how Python works.
Part IV, “Programming Essentials” introduces the essential programming constructs for input, processing and output. Chapter 13, Seeing Results shows how to do output with the print statement. Chapter 14, Turning Python Loose With a Script shows how to control Python with a script of statements. Chapter 15, Generalizing a Calculation introduces variables and the assignment statement. We'll cover some additional assignment topics in Chapter 16, Assignment Bonus Features, including multiple assignment and how to make best use of the Python shell. Chapter 17, Can We Get Your Input? shows the two simple input functions.
Part V, “Some Self-Control” introduces the various ways to control which statements execute. Chapter 18, Truth and Logic adds truth and conditions to the language. We'll look at comparisons in Chapter 19, Making Decisions. Chapter 20, Conditional Processing: Only When Necessary adds conditional and Chapter 21, Iterative Processing: While We Have More To Do adds iterative processing statements. In Chapter 22, Becoming More Controlling we'll cover some additional topics in control. Chapter 23, Turning Python Loose with More Sophisticated Scripts will make use of these control statements for more sophisticated scripts.
Part VI, “Organizing Programs with Function Definitions” shows how to define functions to organize a program. Chapter 24, Function Definitions — Adding New Verbs introduces the basic function definition and use. From there we'll look at Extra Functions: math and random. Chapter 25, Flexibility, Clarity and a Close Relative adds some useful features to these basics. Chapter 26, A Few More Tools describes concepts like returning multiple values.
After introducing some basic types of collections in the next part, we'll return to the language topics in Part IX, “Additional Processing Patterns — Exceptions and Iterations”. This will add exceptions in Chapter 33, Exceptions and Unusual Events and generators in Chapter 34, Looping Back To Look At Iteration.
Course Change. Programming is all about data and processing. Up to this point, we've focused on processing. From this point forward, we'll focus on data. Since these are two sides of the same coin, there's no absolute separation, it's only a matter of focus. Getting Our Bearings will clarify this relationship between data and processing.
Data. We'll start covering the data side of data processing in Part VIII, “Basic Collections of Data: Strings, Lists and Tuples”, which is an overview of the sequential collections. Chapter 28, Collecting Items in Sequence extends the data types to include various kinds of sequences. These include Chapter 29, Sequences of Characters: Strings, Chapter 30, Doubles, Triples, Quadruples: Tuples! and Chapter 31, Using Lists To Stay Organized. We'll look at some additional topics in Chapter 32, Common List Design Patterns.
We'll revisit some processing elements in Additional Processing Patterns — Exceptions and Iterations. This will include Exceptions and Unusual Events as well as Looping Back To Look At Iteration.
We'll cover more data structures in Part X, “More Data Collections: Sets, Mappings, Dictionaries and Files”. We'll look at the set in Collecting Items in a Set. Mappings and Dictionaries describes mappings and dictionaries. We'll use the map and sequence structure in Defining More Flexible Functions. Chapter 38, Files — The Permanent Record covers the basics of files. Chapter 39, Files II — Beyond the Basics covers several closely related operating system (OS) services. Chapter 40, Files III — The Grand Unification presents some additional material on files and how you can use them from Python programs.
Organization and Structure. Part XI, “Data + Processing = Objects” describes the object-oriented programming features of Python. Objects: A Retrospective reviews objects we've already worked with. Then we can examine the basics of class definitions in Defining New Objects. In Inheritance, Generalization and Specialization we'll introduce a very significant technique for simplifying programs. Additional Classy Topics describes some more tools that help simplify class definition.
We'll take a first look at how we can write classes that look like Python's built-in classes in Special Behavior Requires Special Methods. Sophisticated Numbers: Fractions and Currency shows how we can build very useful kinds of numbers. We can create more sophisticated collections using the techniques in Creating New Types of Collections.
Part XII, “Organizing Programs with Modules” describes modules, which provide a higher-level grouping of class and function definitions. It also summarizes selected extension modules provided with the Python environment. Chapter 48, Module Definitions — Adding New Concepts provides basic semantics and syntax for creating modules. It also covers the organization of the available Python modules. Chapter 49, Essential Modules surveys the modules you're most likely to use. We'll look at how to handle currency in Fixed-Point Numbers — Doing High Finance. Chapter 51, Time and Date Processing defines the time and calendar modules. Chapter 52, Text Processing and Pattern Matching shows how to do string pattern matching and processing with the re module.
Some of the commonly-used modules are covered during earlier chapters. In particular the math and random modules are covered in the section called “The math Module — Trig and Logs” and the string module is covered in Chapter 29, Sequences of Characters: Strings. Chapter 39, Files II — Beyond the Basics touches on os, os.path, glob, and fnmatch.
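As a small taste of two of these modules, here is a sketch that uses math for a square root and a logarithm, and random to simulate one roll of a six-sided die. The seed value is an arbitrary choice of ours so that repeated runs behave the same way; it is not something the modules require.

```python
import math
import random

# math: a square root and a natural logarithm.
print(math.sqrt(16.0))
print(math.log(math.e))

# random: simulate one roll of a six-sided die.
random.seed(1)              # arbitrary seed, chosen for repeatability
die = random.randint(1, 6)  # a whole number from 1 to 6, inclusive
print(die)
```

Everything here comes from the libraries; the core language only supplies the import statement that makes them available.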
Fit and Finish. We finish talking about the fit and finish of a completed program in Part XIII, “Fit and Finish: Complete Programs”. The basics of a complete program are covered in Chapter 53, Wrapping and Packaging Our Solution. Many species of programs are described in Chapter 54, Architectural Patterns — A Family Tree.
Here is how we'll show Python programs in the rest of the book. The programs will be in separate boxes, in a different font, often with numbered “callouts” to help explain the program. This example is way too advanced to read in detail (it's part of Chapter 36, Mappings and Dictionaries); it just shows what examples look like.
Example 1. Python Example
# Count how many of the 36 two-dice combinations produce each total.
combo = { }
for i in range(1,7):
    for j in range(1,7):
        roll= i+j
        combo.setdefault( roll, 0 )
        combo[roll] += 1
# Show each total with its chance of occurring, as a percentage.
for n in range(2,13):
    print "%d %.2f%%" % ( n, 100.0*combo[n]/36 )
The output from the above program will be shown as follows:
2 2.78%
3 5.56%
4 8.33%
5 11.11%
6 13.89%
7 16.67%
8 13.89%
9 11.11%
10 8.33%
11 5.56%
12 2.78%
Tool completed successfully
We will use the following type styles for references to a specific Class, method function, or attribute, which includes both class variables and instance variables.
There will be design tips, and warnings, in the material for each exercise. These reflect considerations and lessons learned that aren't typically clear to starting OO designers.
I have to thank all of the people at my employer, CTG, for giving me so many decades of opportunities to practice the craft of programming.
This part provides some necessary background to help non-programming newbies get ready to write their own programs. If you have good computer skills, this section may be all review. If you are very new to computers, our objective is to build up your skills by providing as complete an introduction as we can. Computing has a lot of obscure words, and we'll need some consistent definitions.
We'll start with the big picture. In About Computers we'll provide a list of concepts that are central to computers, programs and programming. In About Programs we'll narrow our focus to programs and how we create them.
In Let There Be Python we'll describe how to install Python. You'll need to choose just one of Windows Installation, Macintosh Installation or GNU/Linux and UNIX Installation. This chapter has the essential first step in starting to build programs: getting our tools organized.
We'll describe two typical problems that Python can help us solve in Two Minimally-Geeky Problems. We'll provide many, many more exercises and problems than just these two. But these are representative of the problems we'll tackle.
We also provide some history and background to help show why Python is so cool. If you are already convinced that Python is your tool of choice, you can skip Why Python is So Cool. If you've heard about Visual Basic, Java or C++ and wonder why Python is better, you might find something helpful in that section. It involves some computer-science jargon; you've been warned.
Our job as a programmer is to write statements in the Python language that will control our computer system. This chapter describes the basic topics of what a computer is and how we set up a computer to perform a task. We need to be perfectly clear on what computing is so that you can be successful in programming a computer to solve your problems.
In Terminology we'll provide a common set of terms, aimed at newbies who will soon become programmers. The computer industry has a lot of marketing hype, which can lead to confusing use of terms. Worse, the computer industry has some terminology that is intended to make computers easier to use, but really only succeeds in muddying the waters.
We'll build on the terminology foundation in What is a Program? and define a program more completely. This is, after all, our goal, and we'll need to have it clearly defined so we can see how we're closing in on it.
We want to clearly define some terms that we'll be using throughout the book. We're going to build up our Python understanding from this foundational terminology. In the computer world, many concepts are new, and we'll try to make them familiar. Further, some of the concepts are abstract, forcing us to borrow existing words and extend or modify their meanings. We'll also define them by example as we go forward in exploring Python.
This section is a kind of big-picture road map of computers. We'll refer back to these definitions in the sections which follow.
Okay, this is perhaps silly, but we want to be very clear. We're talking about the whole system of interconnected parts that make up a computer. We're including displays and keyboards and mice. We're drawing a line between our computer and the network that interconnects it to other computers. Inside a computer system there are numerous electronic components, one of which is the processor, which controls most of what a computer does.
It helps to think of two species of computers: your personal computer — desktop or laptop — sometimes called a “client” and shared computers called “servers”. When you are surfing a web site, you are using more than one computer: your personal computer is running the web browser, and one or more server computers are responding to your browser's requests. Most of the internet things you see involve your desktop and a server somewhere else.
We do need to note that the principle of abstraction is being applied here. A number of electronic devices are all computers on which we can do Python programming. Laptops, desktops, iMacs, PowerBooks, clients, servers, Dells and HP's are all examples of this abstraction we're calling a computer system.
We have a number of devices that are part of our computers. Most devices are plugged into the computer box and connected by wires, putting them on the periphery of the computer. A few devices are wireless; they connect using Bluetooth, WiFi (IEEE 802.11) or infrared (IR) signals. We call the connection the interface.
The most important devices are hidden within the box, physically adjacent to the central processor. These central items are memory (called random-access memory, RAM) and a disk. The disk, while inside the box, is still considered peripheral because once upon a time, disks were huge and expensive.
The other peripheral devices are the ones we can see: display, keyboard and mouse. After that come other storage and input/output devices, including CD's, DVD's, USB drives, cameras, scanners, printers, drawing tablets, etc. Finally we have network connections, which can be Ethernet, wireless or a modem. All devices are controlled by pieces of software called drivers.
Note that we've applied the abstraction principle again. We've lumped a variety of components into abstract categories.
The computer's working memory (Random-Access Memory, or RAM) contains two things: our data and the processing instructions (or program) for manipulating that data. Most modern computers are called stored program digital computers. The program is stored in memory along with the data. The data is represented as digits, not mechanical analogies. In contrast, an analog computer uses mechanical analogs for numbers, like spinning gears that make an analog speedometer show the speed, or the strip of metal that changes shape to make an analog meat thermometer show the temperature.
The central processor fetches each instruction from the computer's memory and then executes that instruction. We like to call this the fetch-execute loop that the processor carries out. The processor chip itself is hardware; the instructions in memory are called software. Since the instructions are stored in memory, they can be changed. We take this for granted every time we double click an icon and a program is loaded into memory.
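To make the fetch-execute loop concrete, here is a toy sketch written in Python. The instruction names (LOAD, ADD, HALT) and the single accumulator are inventions of ours for illustration; a real processor has a far richer instruction set and works with electronic circuits, not Python lists.

```python
# A toy stored-program machine. The instruction names are invented
# for this sketch; no real processor works exactly this way.
memory = [
    ("LOAD", 2),     # put 2 in the accumulator
    ("ADD", 3),      # add 3 to the accumulator
    ("ADD", 4),      # add 4 to the accumulator
    ("HALT", None),  # stop
]

accumulator = 0
address = 0  # where the next instruction lives in memory
while True:
    operation, operand = memory[address]  # fetch
    address += 1
    if operation == "LOAD":               # execute
        accumulator = operand
    elif operation == "ADD":
        accumulator += operand
    elif operation == "HALT":
        break

print(accumulator)
```

Because the program is just data sitting in the memory list, changing the list changes what the machine does. That is the essence of a stored program.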
The data on which the processor is working must also be in memory. When we open a document file, we see it read from the disk into memory so we can work on it.
Memory is dynamic: it changes as the software does its work. Memory which doesn't change is called Read-Only Memory (ROM).
Memory is volatile: when we turn the computer off, the contents vanish. When we turn the computer on, the contents of memory are random, and our programs and data must be loaded into memory from some persistent device. The tradeoff for volatility is that memory is blazingly fast.
Memory is accessed “randomly”: any of the 512 million bytes of my computer's memory can be accessed with equal ease. Other kinds of memory have sequential access; for example, magnetic cassette tapes must be accessed sequentially.
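The difference between the two kinds of access can be sketched in Python. The list of numbers and the read_tape function below are made up for illustration: the list stands in for random-access memory, and the function imitates a tape that must be wound past earlier items.

```python
# Random access: a list lets us jump straight to any position.
cells = [10, 20, 30, 40, 50]
print(cells[3])

# Sequential access: like a tape, we must pass over the earlier
# items to reach a later one.
def read_tape(tape, position):
    skipped = 0
    for item in tape:
        if skipped == position:
            return item, skipped
        skipped += 1

print(read_tape(cells, 3))
```

Both approaches find the same value, 40, but the tape version had to pass over three items first. The farther along the tape an item sits, the longer it takes to reach.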
For hair-splitters, we recognize that there are special-purpose computing devices which have fixed programs that aren't loaded into memory at the click of a mouse. These devices have their software in read-only memory, and keep only data in working memory. When our program is permanently stored in ROM, we call it firmware instead of software. Most household appliances that contain computers keep their programs in ROM.
We call these disk drives because the memory medium is a spinning magnetizable disk with read-write heads that shuttle across the surface; you can sometimes hear the clicking as the heads move. Individual digits are encoded across the surface of the disk in separate blocks of data. Some people are in the habit of calling them “hard” to distinguish them from the obsolete “floppy” disks that were used in the early days of personal computing.
Disk memory isn't completely random-access because the read-write heads have to move across the surface and the surface is rotating. There are delays while the computer waits for the heads to arrive at the right position. There are also delays while the computer waits for the disk to spin to the proper location under the heads.
Your computer's disk can be imagined as persistent, slow memory: when we turn off the computer, the data remains intact. The tradeoff is that it is agonizingly slow: it reads and writes in milliseconds, close to a million times slower than dynamic memory.
Disk memory is also cheaper than RAM by a factor of almost 100: we buy 40 gigabytes (40 billion bytes, or 40,000 megabytes) of disk for $100, which is also the cost of 512 megabytes of memory.
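We can check this comparison with a little arithmetic in Python, using only the figures quoted above. The variable names are ours.

```python
# Figures quoted in the text: $100 buys either amount.
disk_megabytes = 40000.0   # 40 gigabytes of disk
ram_megabytes = 512.0      # 512 megabytes of RAM

# For the same money, how many times more disk than RAM do we get?
ratio = disk_megabytes / ram_megabytes
print(ratio)
```

The ratio works out to roughly 78, so the same $100 buys about 78 times as much disk as RAM.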
The human interface to the computer typically consists of three devices: a display, a keyboard and a mouse. Some people use additional devices: a second display, a microphone, speakers or a drawing tablet are common examples. Some people replace the mouse with a trackball. These are often wired to the computer, but wireless devices are also popular.
In the early days of computers — before the invention of the mouse — the displays and keyboards could only handle characters: letters, numbers and punctuation. When we used computers in the early days, we spelled out each command, one line at a time. Now, we have the addition of sophisticated graphical displays and the mouse. When we use computers now, we point and click, using graphical gestures as our commands. Consequently, we have two kinds of human interfaces: the Command-Line Interface (CLI), and the Graphical User Interface (GUI).
A keyboard and a mouse provide inputs to software. They work by interrupting what the computer is doing, providing the character you typed, or the mouse button you pushed. A piece of software called the Operating System has the job of collecting this stream of input and providing it to the application software. A stream of characters is pretty simple. The mouse clicks, however, are more complex events because they involve the screen location as well as the button information, plus any keyboard shift keys.
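To see why a mouse click carries more information than a typed character, here is a sketch of the two kinds of input written as Python data. The field names and values are hypothetical, invented for this illustration; they are not the format used by any particular operating system.

```python
# Hypothetical event records; the field names are ours, not any
# operating system's.
key_event = {"kind": "key", "character": "a"}
mouse_event = {
    "kind": "mouse",
    "x": 120, "y": 45,         # screen location
    "button": "left",          # which button was pushed
    "shift_keys": ["shift"],   # keyboard shift keys held down
}

def describe(event):
    if event["kind"] == "key":
        return "typed " + event["character"]
    return "%s click at (%d, %d)" % (event["button"], event["x"], event["y"])

print(describe(key_event))
print(describe(mouse_event))
```

The key event needs only one piece of data, while the mouse event needs several.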
A display shows you the outputs from software. The display device has to be shared by a number of application programs. Each program has one or more windows where its output is sent. The Operating System has the job of mediating this sharing to assure that one program doesn't disturb another program's window. Generally, each program will use a series of drawing commands to paint the letters or pictures. There are many, many different approaches to assembling the output in a window. We won't touch on this because of the bewildering number of choices.
Historically, display devices used paper; everything was printed. Then they switched to video technology. Currently, displays use liquid crystal technology. Because displays were once almost entirely video, we sometimes summarize the human interface as the Keyboard-Video-Mouse (KVM).
In order to keep things as simple as possible, we're going to focus on the command-line interface. Our programs will read characters from the keyboard, and display characters in an output window. Even though the programs we write won't respond to mouse events, we'll still use the mouse to interact with the operating system and programs like IDLE.
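Here is a minimal sketch of the kind of character-in, character-out program we have in mind. The function name `shout` is our own invention. To let the example run unattended, we use in-memory stand-ins for the keyboard and the output window; a real command-line program would use `sys.stdin` and `sys.stdout` instead.

```python
import io

def shout(source, sink):
    """Read lines of characters from source; write them to sink in upper case."""
    for line in source:
        sink.write(line.upper())

# Stand-ins for the keyboard and the output window (an assumption made
# so the sketch runs without anyone typing).
keyboard = io.StringIO("hello\nworld\n")
window = io.StringIO()

shout(keyboard, window)
print(window.getvalue(), end="")
```

Because `shout` only asks for something it can read lines from and something it can write to, the same function would work unchanged with the real keyboard and display.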
These storage devices are slightly different from the internal disk drive or hard drive. The difference is the degree of volatility of the medium. Packaged CD's and DVD's are read-only; we call them CD Read-Only Memory (CD-ROM). When we burned our own CD or DVD, we used to call it creating a Write-Once-Read-Many (WORM) device. Now there are CD-RW devices which can be written (slowly) many times, and read (quickly) many times, making the old WORM acronym outdated.
Where does that leave Universal Serial Bus (USB) drives (known by a wide variety of trademarked names like Thumb Drive™ or Jump Drive™) and the memory stick in our camera? These are just like the internal disk drive, except they don't involve a spinning magnetized disk. They are slower, have less capacity and are slightly more expensive than a disk.
These are usually USB devices; they are unique in that they send data in one direction only. Scanners send data into our computer; our computer sends data to a printer. These are a kind of storage, but they are focused on human interaction: scanning or printing photos or documents.
The scanner provides a stream of data to an application program. Properly interpreted, this stream of data is a sequence of picture elements (called “pixels”) that show the color of a small section of the document on the scanner. Getting input from the scanner is a complex sequence of operations to reset the apparatus and gather the sequence of pixels.
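The phrase “properly interpreted” can be illustrated with a small sketch. We assume, purely for illustration, that each pixel arrives as three bytes giving its red, green and blue levels; real scanners use a variety of formats, and the function name `pixels` is our own.

```python
def pixels(stream):
    """Group a flat sequence of bytes into (red, green, blue) pixels."""
    if len(stream) % 3 != 0:
        raise ValueError("stream length is not a multiple of 3")
    return [tuple(stream[i:i + 3]) for i in range(0, len(stream), 3)]

# A made-up stream: one red, one green, one blue and one white pixel.
raw = bytes([255, 0, 0,  0, 255, 0,  0, 0, 255,  255, 255, 255])

for red, green, blue in pixels(raw):
    print(red, green, blue)
```

The raw stream is just twelve numbers. It only becomes a picture when the program agrees with the device on how to group and read them.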
A printer, similarly, accepts a stream of data. Properly interpreted, this stream of data is a sequence of commands that will draw the appropriate letters and lines in the desired places on the page. Some printers require a sequence of pixels, and the printer uses this to put ink on paper. Other printers use a more sophisticated page description language, which the printer processes to determine the pixels, and then deposits ink on paper. One example of these sophisticated graphic languages is PostScript.
A network is built from a number of cooperating technologies. Somewhere, buried under streets and closeted in telecommunications facilities, is the global Internet: a collection of computers, wires and software that cooperates to route data. When you have a cable-modem, or use a wireless connection in a coffee shop, or use the Local Area Network (LAN) at school or work, your computer is (indirectly) connected to the Internet. There is a physical link (a wire or an antenna), and there are software protocols for organizing the data and sharing the link properly. There are also software libraries used by the programs on our computer to surf web pages, exchange email or purchase MP3's.
While there are endless physical differences among network devices, the rules, protocols and software make these various devices almost interchangeable. There is a stack of technology that uses the principle of abstraction very heavily to minimize the distinctions among wireless and wired connections. This kind of abstraction assures that a program like a web browser will work precisely the same no matter what the physical link really is. The people who designed the Internet had abstraction very firmly in mind as a way to allow the Internet to expand with new technology and still work consistently.
The Operating System (OS) ties all of the computer's devices together to create a usable, integrated computer system. The operating system includes the software called device drivers that make the various devices work consistently. It manages scarce resources like memory and time by assuring that all the programs share those resources. The operating system also manages the various disk drives by imposing some organizing rules on the data; we call the organizing rules and the related software the file system.
The operating system creates the desktop metaphor that we see. It manages the various windows; it directs mouse clicks and keyboard characters to the proper application program. It depicts the file system with a visual metaphor of folders (directories) and documents (files). The desktop is often shown to you by a program called the “finder” or “explorer”; this program draws the various icons and the dock or task bar.
In addition to managing devices and resources, the OS starts programs. Starting a program means allocating memory, loading the instructions from the disk, allocating processor time to the program, and allocating any other resources in the processor chip.
When we double click an icon, a fair number of things are going on under the hood. Since we'll be writing our own programs, we'll need to look closely at what is really happening.
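We can ask the operating system to start a program from within Python, which gives a small taste of what a double-click sets in motion. This sketch uses the standard `subprocess` module to start a second copy of the Python program and have it run a one-line script. The message text is made up for the example.

```python
import subprocess
import sys

# Ask the operating system to start a second copy of the Python program
# (sys.executable is the path to the one running this script) and have
# it run a one-line script.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a new program')"],
    capture_output=True,   # collect what the new program prints
    text=True,             # give us characters rather than raw bytes
)

print("exit status:", result.returncode)
print("output:", result.stdout.strip())
```

The operating system does the real work here: it finds the file, allocates memory, loads the instructions and gives the new program its share of processor time. When the program finishes, the exit status tells us whether it ended normally.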
Finally, we have to note that it is the OS that provides most of the abstractions that make modern computing possible. The idea that a variety of individual types of machines could be summarized by a single abstraction of “storage” allows disk drives, CD-ROM's, DVD-ROM's and thumb drives to peacefully co-exist. It allows us to run out and buy a thumb drive and plug it into our computer and have it immediately available to store the pictures of our trip to Sweden.
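The “storage” abstraction shows up directly in Python. In this sketch, one function writes the same lines to a buffer in memory and to a file in a temporary folder, without knowing or caring which is which. The function name `save_notes` and the text it writes are invented for the example.

```python
import io
import os
import tempfile

def save_notes(destination):
    """Write two lines to any file-like object."""
    destination.write("first line of notes\n")
    destination.write("second line of notes\n")

# The same function works with a buffer in memory...
in_memory = io.StringIO()
save_notes(in_memory)

# ...and with a file on whatever device holds the temporary folder.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")
    with open(path, "w") as on_disk:
        save_notes(on_disk)
    with open(path) as on_disk:
        from_disk = on_disk.read()

print(from_disk == in_memory.getvalue())
```

The file could just as well live on a hard drive or a thumb drive. The operating system hides the difference, so the program stays the same.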
A program is started by the operating system to do something useful. We'll look at this in depth in What is a Program? and What Happens When a Program “Runs?”. Since we will be writing our own programs, we need to be crystal clear on what programs really are and how they make our computer behave.
There isn't a useful distinction between words like “program”, “command”, “application”, “application program”, “application system”. Some vendors even call their programs “solutions”. We'll try to stick to the word program. A program is rarely a single thing, so we'll try to identify a program with the one file that contains the main part of the program.
In Terminology we provided a kind of road map to computers. Here, we're going to look a little more closely at these things called “programs”.
What — Exactly — is the Point? The essence of a program is that it sets up a computer to do a specific task. We could say that it is a program which applies the computer to a particular problem. Sometimes we call them “application programs” because the programs are applied to definite data processing needs.
There is a kind of parallel between a computer system running programs and a television playing a particular TV show. Without the program, the computer is just a pile of inert electronics. Similarly, if there is no TV show, the television just plays noise and hisses.
We're going to focus on data and processing. We'll be aiming at programs which read and write files of data, much like our ordinary desktop tools open and save files. We aren't excluding game programs or programs that control physical processes. A game's data is the control actions from the player plus the description of the game's levels and environments. The processing that a game does matches the inputs, the current state and the level to determine what happens next. An interactive game, however, is considerably more complex than a program to evaluate a file that has a list of our stocks.
Program Varietals. At this point, we need to make a distinction between two varieties of programs: a binary executable and a script. A binary executable or binary application is a program that takes direct control of the computer's processor. We call it binary because it uses the binary codes specific to the processor chip inside the computer. If you haven't encountered “binary” before, see Binary Codes. Most programs that you buy or download fit this description. Most of the office applications you use are binary executables (NeoOffice/J is the notable exception). A web browser, for example, is a binary executable, as is the python program (named python.exe in Windows).
Your operating system (for example, Windows or GNU/Linux or MacOS) is a complex collection of binary executables. Primarily, when the computer starts running, a “kernel” of binary software is loaded; this kernel controls everything that goes on in the computer. Once this kernel of software is running, it then loads a number of additional programs that constitute the working operating system with which we interact. These programs don't solve any particular problem, but they enable the computer to be used by non-engineers.
A binary executable's direct control over the processor is beneficial because it gives the best speed and uses the fewest resources. However, the cost of this control is the relative opacity of the coded instructions that control the processor chip. The processor instruction codes are focused on the electronic switching arcana of gates, flip-flops and registers. They are not focused on data processing at a human level. If you want to see how complex and confusing the processor chip can be, go to Intel or AMD's web site and download the technical specifications for one of their processors.
One subtlety that we have to acknowledge is that even the binary applications don't have complete control over the entire computer system. Recall that the computer system loads a kernel of software when it starts. All of the binary applications outside this kernel do parts of their work by using program fragments provided by the kernel. This important design feature of the operating system assures that all of the application programs share resources politely. One of the kernel's two jobs is to coordinate among the application programs. If every binary application simply grabbed resources willy-nilly, one badly behaved program could stop all other programs from working. Imagine the tedium of quitting your browser to make notes in your word processor, then quitting your word processor to go back to your web browser. The other of the kernel's two jobs is to embody the abstraction principle and make a wide variety of processors have a nearly identical set of features.
Layers of Abstraction. Let's take a close look at our metaphor again. We said there is a strong parallel between a computer running a program and a TV playing a particular TV show. We now have two layers of meaning here: