Before digging into the details, we'll introduce the common features of three of the structured data types that manipulate sequences of values. In the chapters that follow we'll look in detail at Chapter 12, Strings; Chapter 13, Tuples; and Chapter 14, Lists. In Chapter 15, Mappings and Dictionaries, we'll introduce another structured data type for manipulating mappings between keys and values.
In the section called “Semantics” we will provide an overview of the semantics of sequences. We describe the common features of the sequences in the section called “Overview of Sequences”.
The sequence is central to programming and central to Python. A number of statements and functions we have covered have sequence-related features that we have glossed over, danced around, and generally avoided.
We'll revisit a number of functions and statements we covered in previous sections, and add the power of sequences to them. In particular, the for statement is something we glossed over in the section called “Iterative Processing: For All and There Exists”.
A sequence is a container of objects which are kept in a specific order. We can identify objects in a sequence by their position or index. Positions are numbered from zero in Python; the element at index zero is the first element.
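As a minimal sketch of zero-based positions, the following uses a small list and a string whose values are invented purely for illustration. The same indexing notation applies to every kind of sequence.

```python
# Illustrative data only; the names and values are not from the text.
colors = ["red", "green", "blue"]
word = "spam"

first_color = colors[0]   # index zero is the first element
third_color = colors[2]   # the third element sits at index two
first_letter = word[0]    # strings are indexed the same way
count = len(colors)       # number of objects in the sequence

print(first_color, third_color, first_letter, count)
```

Note that the last valid index is always one less than the length of the sequence.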
We call these containers because they are a single object which contains (or collects) any number of other objects. The “any number” clause means that they can collect zero other objects, meaning that an empty container is just as valid as a container with one or thousands of objects.
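To illustrate the “any number” clause, here is a short sketch that builds empty containers alongside a large one. The variable names are chosen only for this example.

```python
# Empty containers are perfectly valid objects.
empty_string = ""
empty_tuple = ()
empty_list = []
sizes = [len(empty_string), len(empty_tuple), len(empty_list)]

# A container can just as easily hold a thousand objects.
big = list(range(1000))

print(sizes, len(big))
```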
Some programming languages use words like "vector" or "array" to refer to sequential containers. Other languages have very specific implementations of sequential containers. For example, in C or Java, the primitive array has a fixed number of positions. In Java, a reference outside that specific number of positions raises an exception. In C, however, a reference outside the defined positions of an array is an error that may never be detected. Really.
There are four commonly-used subspecies of sequence containers: the string, the Unicode string, the tuple and the list. A string is a container of single-byte ASCII characters. A Unicode string is a container of multi-byte Unicode (or Universal Character Set) characters. A tuple and a list are more general containers; they can hold objects of any type.
When we create a tuple or string, we've created an immutable, or static, object. We can examine the object, looking at specific characters or objects, but we can't change it. This means that we can't put additional data on the end of a string. What we can do, however, is create a new string that is the concatenation of two existing strings.
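The following sketch shows what immutability looks like in practice. The sample values are made up for illustration; the point is that an attempt to change a string or tuple in place raises a TypeError, while concatenation builds a new object and leaves the originals alone.

```python
greeting = "hello"
point = (3, 4)

# Item assignment is rejected for both strings and tuples.
try:
    greeting[0] = "j"
    string_changed = True
except TypeError:
    string_changed = False

try:
    point[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Concatenation creates a new string; the originals are untouched.
first = "hello"
second = ", world"
combined = first + second

print(string_changed, tuple_changed, combined, first, second)
```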
When we create a list, on the other hand, we've created a mutable object. A list can have additional objects appended to it or inserted in it. Objects can be removed from a list, also. A list can grow and shrink; the order of the objects in the list can be changed without creating a new list object.
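Here is a small sketch of those list operations, using invented data. We use the built-in id() function only to confirm that the same list object survives every change.

```python
items = ["b", "c"]
original_id = id(items)   # identity of the list object before any change

items.append("d")         # grow at the end:   ['b', 'c', 'd']
items.insert(0, "a")      # grow at the front: ['a', 'b', 'c', 'd']
items.remove("c")         # shrink:            ['a', 'b', 'd']
items.reverse()           # reorder in place:  ['d', 'b', 'a']

same_object = id(items) == original_id
print(items, same_object)
```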
One other note on strings. While strings are sequences of characters, there is no separate character data type. A character is simply a string of length one. This relieves programmers from the C or Java burden of remembering which quotes to use for single characters as distinct from multi-character strings. It also eliminates any problems when dealing with Unicode multi-byte characters.
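A brief sketch, with an arbitrary sample word, shows that picking one character out of a string simply gives us another string, and that either kind of quote produces the same value.

```python
word = "spam"
letter = word[0]                     # a single character picked out by position

letter_is_string = type(letter) is type(word)   # same type as the whole string
letter_length = len(letter)                     # a string of length one
quotes_match = 'a' == "a"                       # single and double quotes are equivalent

print(letter, letter_is_string, letter_length, quotes_match)
```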