> Page Moved to www.itmaybeahack.com

This page has been moved to http://www.itmaybeahack.com/homepage/books/python/html/p02/p02c10_adv_seq.html. Please update your bookmarks.

Functional Programming with Collections

This chapter presents some advanced collection concepts. In Lists of Tuples we describe the relatively common Python data structure built from a list of tuples.

We’ll cover a powerful list construction method called a list comprehension in List Comprehensions. We can use this to simplify some common list-processing patterns.

In Sequence Processing Functions: map(), filter() and reduce() we’ll cover three functions that can simplify some list processing. map(), filter() and reduce() provide features that overlap with list comprehensions.

In Advanced List Sorting we cover some advanced sequence sorting techniques. In particular, we’ll look closely at how to provide a suitable key function to control sorting.

We’ll look at lambda forms in The Lambda. These aren’t essential for Python programming, but they’re handy for clarifying a piece of code in some rare cases.

In Multi-Dimensional Arrays or Matrices we cover simple multidimensional sequences. These are sometimes called matrices or arrays.

Even more complex data structures are available, of course. Numerous modules handle the sophisticated representation schemes described in the Internet standards (called Requests for Comments, RFC’s). We’ll touch on these in The Python Library.

Lists of Tuples

The list of tuple structure is remarkably useful. In other languages, like Java or C++, we are forced to either use built-in arrays or create an entire class definition to simply keep a few values togther.

One common situation is processing list of simple coordinate pairs for 2-dimensional or 3-dimensional geometries. Additional examples might include examining a list of tuples that contain the three levels for red, green and blue that define a color. Or – for printing – the values for cyan, magenta, yellow and black that define a color.

As an example of using a red, green, blue tuple, we may have a list of individual colors that looks like the following.

colorScheme = [ (0,0,0), (0x20,0x30,0x20), (0x10,0xff,0xff) ]

We’ve already seen how dictionaries (Mappings and Dictionaries) have an dict.items() method that provides the dictionary keys and values as a list of 2-tuples. Additionally, the zip() built-in function interleaves two or more sequences to create a list of tuples.

The for Statement. A interesting form of the for statement is one that exploits multiple assignment to work with a list of tuples. Consider the following examples:

for c,f in [ ("red",18), ("black",18), ("green",2) ]:
    print "%s occurs %f" % (c, f/38.0)

for r, g, b in [ (0,0,0), (0x20,0x30,0x20), (0x10,0xff,0xff) ]:
    print "red: %x, green: %x, blue: %x" % ( 255-r, 255-g, 255-b )

In these examples, we have created a list of tuples. The for statement uses a form of multiple assignment to split up each tuple into a fixed number of variables.

The first example is equivalent to the following.

for p in [ ("red",18), ("black",18), ("green",2) ]:
    c,f = p
    print "%s occurs %f" % (c, f/38.0)

This technique works because tuples are expected to have a fixed, known number of elements.

Here’s an example using dict.items(). We looked at dictionaries in Mappings and Dictionaries.

d = { 'red':18, 'black':18, 'green':2 }
for c,f in d.items():
    print "%s occurs %f" % (c, f/38.0)

List Comprehensions

Python provides several list literals or “displays”. The most common list display is the simple literal value shown in List Literal Values: values are written as follows:

[ expression 〈 , ... 〉 ]

Python has a second kind of list display, based on a list comphrehension. A list comprehension is an expression that combines a function, a for statement and an optional if statement into one tidy package. This allows a simple, clear expression of the processing that will build up an iterable sequence.

List Comprehension Semantics. The most important thing about a list comprehension is that it is an iterable that applies a calculation to another iterable. A list display can use a list comprehension iterable to create a new list.

When we write a list comprehension, we will provide an iterable, a variable and an expression. Python will process the iterator as if it was a for-loop, iterating through a sequence of values. It evaluates the expression, once for each iteration of the for-loop. The resulting values can be collected into a fresh, new list, or used anywhere an iterator is used.

List Comprehension Syntax. A list comprehension is – technically – a complex expression. It’s often used in list displays, but can be used in a variety of places where an iterator is expected.

expr for-clause

The expr is any expression. It can be a simple constant, or any other expression (including a nested list comprehension).

The for-clause mirrors the for statement:

for variable in sequence

A common use for this is in a list display. We’ll show some list comprehension examples used to create new lists.

even = [ 2*x for x in range(18) ]
hardways = [ (x,x) for x in (2,3,4,5) ]
samples = [ random.random() for x in range(10) ]
even:This is a list of values [ 0, 2, 4, ..., 14 ].
hardways:This is a list of 2-tuples. Each 2-tuple is built from the values in the given sequence. The result is [(2,2), (3,3), (4,4), (5,5)].
samples:This is a list of 10 random numbers.

A list display that uses a list comprehension behaves like the following loop:

r= []
for  variable in sequence :
    r.append( expr )

The basic process, then, is to iterate through the sequence in the for-clause, evaluating the expression, expr. The values that result are assembled into the list. If the expression depends on the for-clause, each value in the list can be different. If the expression doesn’t depend on the for-clause, each value will be the same.

Here’s an example where the expression depends on the for-clause.

>>> [ v*2+1 for v in range(10) ]
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
>>> sum(_)
100

This creates the first 10 odd numbers. It starts with the sequence created by range(10). The for-clause assigns each value in this sequence to the local variable v. The expression, v*2+1, is evaluated for each distinct value of v. The expression values are assembled into the resulting list.

Here’s an example where the expression doesn’t depend on the for-clause.

b= [ 0 for i in range(10) ]

b will be a list of 10 zeroes.

Comprehensions Outside List Displays. A list comprehension can be used outside of a list display. When we write a list display (using [ and ]) were using an iterable (the list comprehension) to create a new list.

We can use the iterable list comprehension in other contexts that expect an iterator.

square = sum( (2*a+1) for a in range(10) )
column_1 = tuple( 3*b+1 for b in range(12) )
rolls = ( (random.randint(1,6),random.randint(1,6)) for u in range(100) )
hardways = any( d1==d2 for d1,d2 in rolls )
square:

This is the sum of odd numbers. The list comprehension ((2*a+1) for a in range(10)) is an iterable which the sum() function can use.

column_1:

This creates a tuple of 12 values using a a list comprehension.

rolls:

This creates a generator object that will iterate over 100 values using a list comprehension. This does not actually create 100 values – it defines an iterator that will produce 100 values.

Note that this generator has an internal state: it can only be used once. Once it has generated all it’s values, it will not generate any other values.

A statement like r1 = list(rolls) will use the iterator object, rolls with the list() function to actually create an object that has 100 random values.

hardways:

This iterates through the iterator named rolls to see if any of the pairs had the number rolled “the hard way” with both values equal.

The if Clause. A list comprehension can also have an if-clause.

The basic syntax is as follows:

expr for-clause if-clause

The for-clause mirrors the for statement:

for variable in sequence

The if-clause mirrors the if statement:

if filter

Here is an example of a complex list comprehension in a list display.

hardways = [ (x,x) for x in range(1,7) if x+x not in (2, 12) ]

This more complex list comprehension behaves like the following loop:

r= []
for  variable in sequence :
    if filter:
        r.append( expr )

The basic process, then, is to iterate through the sequence in the for-clause, evaluating the if-clause. When the if-clause is True, evaluate the expression, expr. The values that result are assembled into the list.

Here’s another example.

>>> [ (x,2*x+1) for x in range(10) if x%3 == 0 ]
[(0, 1), (3, 7), (6, 13), (9, 19)]

This works as follows:

  1. The for-clause iterates through the 10 values given by range(10), assigning each value to the local variable x.
  2. The if-clause evaluates the filter function, x%3==0. If it is False, the value is skipped. If it is True, the expression, at (x,2*x+1), is evaluated and retained.
  3. The sequence of 2-tuples are assembled into a list.

A list comprehension can have any number of for-clauses and if-clauses, freely-intermixed. A for-clause must be first. The clauses are evaluated from left to right.

Sequence Processing Functions: map(), filter() and reduce()

The map(), filter(), and reduce() built-in functions are handy functions for processing sequences without writing lengthy for-loops. These owe much to the world of functional programming languages. The idea of each is to take a small function you write and apply it to all the elements of a sequence, saving you from writing an explicit loop. The implicit loop within each of these functions may be faster than an explicit for loop.

Additionally, each of these is a pure function, returning a result value. This allows the results of the functions to be combined into complex expressions relatively easily.

Processing Pipeline. It is very, very common to apply a single function to every value of a list. In some cases, we may apply multiple simple functions in a kind of “processing pipeline”.

Here’s an example.

>>> def ftin_2_in( ftin ):
...     feet, inches = ftin
...     return 12.0*feet + inches
...
>>> heights = [ (5,8), (5,9), (6,2), (6,1), (6,7) ]
>>> map( ftin_2_in, heights )
[68.0, 69.0, 74.0, 73.0, 79.0]
>>>
>>> def in_2_m( inches ):
...     return inches * 0.0254
...
>>> map( in_2_m, map( ftin_2_in, heights ) )
[1.7271999999999998, 1.7525999999999999, 1.8795999999999999, 1.8541999999999998, 2.0065999999999997]
  1. We defined a simple function, ftin_2_in() that converts a distance in the form (ft,in) into a distance measured in inches.
  2. We defined a set of people’s heights in heights.
  3. We used the map() function to apply our ftin_2_in() function to each value in heights, creating a new list of values.
  4. We defined a simple function, in_2_m() that converts a distance in inches into a distance in meters.
  5. We did a fairly complex calculation where we applied ftin_2_in() to each value in heights. We then applied in_2_m() to each of those values. We’ve converted a list of values from English (ft,in) to proper metric units by applying two simple functions to each value.

This concept can be used extensively using these functions and list comprehensions to create complex and sophisticated software from a series of simple transformations.

Definitions. Each of the map(), filter() and reduce() functions transform an iterable (a sequence or generator function).

The map() and filter() each apply some function to a sequence to create a new sequence.

The reduce() function applies a function which will reduce the sequence to a single value. There are a number of special-purpose reduce functions that we’ve already seen. These reductions include sum(), any(), all().

The map() and filter() functions have no internal state, they simply apply the function to each individual value of the sequence. The reduce() function, in contrast, maintains an internal state which is seeded from an initial value, passed to the function along with each value of the sequence and returned as the final result.

Here are the formal definitions.

map(function, sequence[, ...]) → list

Create a new list from the results of applying the given function to the items of the the given sequence.

>>> map( int, [ "10", "12", "14", 3.1415926, 5L ] )
[10, 12, 14, 3, 5]

This function behaves as if it had the following definition.

def map( aFunction, aSequence ):
    return [ aFunction(v) for v in aSequence ]

It turns out that more than one sequence can be given. In this case, the function must accept multiple arguments, and there must be as many sequences as arguments to the function. The corresponding items from each sequence are provided as arguments to the function.

If any sequence is too short, None is used for missing the values. If the function is None, map() will create tuples from corresponding items in each list, much like the zip() function.

filter(function, sequence) → list

Return a list containing those items of sequence for which the given function is True. If the function is None, return a list of items that are equivalent to True.

This function behaves as if it had the following definition.

def filter( aFunction, aSequence ):
    return [ v for v in aSequence if aFunction(v) ]

Here’s an example

>>> import random
>>> rolls = list( (random.randint(1,6),random.randint(1,6)) for u in range(100) )
>>> def hardways( pair ):
...     d1, d2 = pair
...     return d1 == d2 and d1+d2 in ( 4, 6, 8, 10 )
>>> filter( hardways, rolls )
[(4, 4), (5, 5), (2, 2), (5, 5), (4, 4), (5, 5), (5, 5), (3, 3), (2, 2), (2, 2), (5, 5), (4, 4)]
>>> len(_)
12
  1. We created 100 random rolls.
  2. We defined a function that evaluates a roll to see if the point was made the “hard way”.
  3. We applied our filter to our rolls to see that 12/100 rolls are hardway rolls.

Here’s anther example

>>> def over_2( m ):
...     return m > 2.0
>>> filter( over_2, map( in_2_m, map( ftin_2_in, heights ) ) )
[2.0065999999999997]
  1. We defined a filter function which returns True if the argument value is greater than 2.0.
  2. We filtered our list of heights to locate any heights over 2.0 meters.
reduce(function, sequence[, initial=0]) → value

The given function must accept two argument values. This function will apply that function to an internal accumulator and each item of a sequence, from left to right, so as to reduce the sequence to a single value.

The internal accumulator is initialized with the given initial value (or 0 if no value is provided.)

This behaves as if it had the following definition.

def reduce( aFunction, aSequence, init= 0 ):
    r= init
    for s in aSequence:
        r= aFunction( r, s )
    return r

Here’s an example.

>>> def plus( a, b ):
...     return a+b
>>> reduce( plus, [1, 3, 5, 7, 9] )
25

Note that Python has a number of built-in reductions: sum(), any() and all() are kinds of reduce functions.

Costs and Benefits. What are the advantages? First, the functional version can be clearer. It’s a single line of code that summarizes the processing. Second, and more important, Python can execute the sequence processing functions far faster than the equivalent explicit loop.

You can see that map() and filter() are equivalent to simple list comprehensions. This gives you two ways to specify these operations, both of which have approximately equivalent performance. This also means that map() and filter() aren’t essential to Python, but they are widely used.

The reduce() function is a bit of a problem. It can have remarkably bad performance if it is misused. Consequently, there is some debate about the value of having this function.

Another Example. Here’s an interesting example that combines reduce() and map(). This uses two functions defined in earlier examples, add() and oddn().

def plus( a, b ):
    return a+b
def oddn( n ):
    return 2*n+1
for i in range(10):
    sq=reduce( plus, map(oddn, range(i)), 0 )
    print i, sq

Let’s look at the evaluation of sq from innermost to outermost.

  1. The range(i) generates a sequence of numbers from 0 to i -1.
  2. A map() function applies oddn() to the sequence created by range(i), creating i odd values. The oddn() function returns the n th odd value.
  3. A reduce() function applies plus() to the sequence of odd values, creating a sum.

The zip() Function. The zip() function interleaves values from two or more sequences to create a new sequence. The new sequence is a sequence of tuples. Each item of a tuple is the corresponding values from from each sequence.

zip(sequence[, sequence...]) → sequence
Interleave values from the various sequences to create tuples. If any sequence is too long, truncate it.

Here’s an example.

>>> zip( range(5), range(1,12,2) )
[(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]

In this example, we zipped two sequences together. The first sequence was range(5), which has five values. The second sequence was range(1,12,2) which has 6 odd numbers from 1 to 11. Since zip() truncates to the shorter list, we get five tuples, each of which has the matching values from both lists.

The map() function behaves a little like zip() when there is no function provided, just sequences. However, map() does not truncate, it fills the shorter list with None values.

>>> map( None, range(5), range(1,12,2) )
[(0, 1), (1, 3), (2, 5), (3, 7), (4, 9), (None, 11)]

Advanced List Sorting

Consider a list of tuples. We could get such a list when processing information that was extracted from a spreadsheet program.

For example, if we had a spreadsheet with raw census data, we can easily transform it into a sequence of tuple s that look like the following.

jobData= [
(001,'Albany','NY',162692),
(003,'Allegany','NY',11986),
...
(121,'Wyoming','NY',8722),
(123,'Yates','NY',5094)
]

We can also save the spreadsheet data in csv format and use the csv module to read it. We’ll return to the csv module in Components, Modules and Packages.

Sorting this list can be done trivially with the list.sort() method.

jobData.sort()

Recall that this updates the list in place. The sort() method specifically does not return a result. A common mistake is to say something like: a= b.sort(). The sort method always returns None.

This kind of sort will simply compare the tuple items in the order presented in the tuple. In this case, the county number is first. What if we want to sort by some other column, like state name or jobs?

Let’s say we wanted to sort by state name, the third element in the tuple. We have two strategies for sorting when we don’t want the simplistic comparison of elements in order.

  1. We can provide a “key extraction” function to the sort() method. This will locate the key value (or a tuple of key values) within the given objects.

  2. We can use the “decorate - sort - undecorate” pattern. What we do is decorate each element in the list, making it into a new kind of 2-tuple with the fields on which we want to sort as the first element of this tuple and the original data as the second element of the tuple.

    This has the side effect of creating a second copy of the original list.

Sorting With Key Extraction. The sort() method of a list can accept a keyword parameter, key, that provides a key extraction function. This function returns a value which can be used for comparison purposes. To sort our jobData by the third field, we can use a function like the following.

def byState( a ):
    return a[2]
jobData.sort( key=byState )

This byState() function returns the selected key value, which is then used by sort to order the tuples in the original list. If we want to sort by a multi-part key, we cna do something like the following.

def byStateJobs( a ):
    return ( a[2], a[3] )

This function will create a two-value tuple and use these two values for ordering the items in the list.

Sorting With List Decoration. Superficially, this method appears more complex. However it is remarkably flexible. This is a slightly more general solution than using a key extractor function.

The idea is to transform the initial list of values into a new list of 2-tuples, with the first item being the key and the second item being the original tuple. The first item, only used for sorting, is a decoration placed in front of the original value.

In this example, we decorate our values with a 2-tuple of state names and number of jobs. We can sort this temporary list of 2-tuples. Then we can strip off the decoration and recover the original values.

Here’s the example shown as three distinct steps.

deco= [ ((a[2],a[3]),a) for a in jobData ]
deco.sort()
state_jobs= [ v for k,v in deco ]

Here’s an example shown with a single operation.

state_jobs = [ v for k,v in sorted( ((a[2],a[3]),a) for a in jobData ) ]

This works by evaluating the “undecorate” list comprehension, v for k,v in sorted(). That list comprehension depends on the output from the sorted() function. The sorted() function depends on the “decorate” list comprehension, (...,a) for a in jobData.

The Lambda

The functions map(), filter(), reduce(), and the list.sort() method all use small functions to control their operations.

For example, we would write something like:

def byState( x ):
    return x[2]
data.sort( key=byState )

In this case, we provided the byState() function to the list.sort() method.

In many cases, this function is used only once, and it hardly seems necessary to define a function object for a single use like this.

Instead of defining a function, Python allows us to provide a lambda form. This is a kind of anonymous, one-use-only function body in places where we only need a very, very simple function.

A lambda form is like a defined function: it has parameters and computes a value. The body of a lambda, however, can only be a single expression, limiting it to relatively simple operations. If it gets complex, you’ll have to define a real function.

Syntax. The syntax is relatively simple.

lambda :replaceable: parameter 〈 , ... 〉 : expression

There can be any number of parameters. The result of the expression is the value when the lambda is applied to arguments. No actual return statement is required. No statements can be used, just an expression.

Note that a lambda form is not a statement; it’s an expression that’s used within other expressions. The lambda form does not define an object with a long life-time. The lambda form object – generally – exists just in the scope of a single statement’s execution.

Generally, a lambda will look like this.

lambda a: a[0]*2+a[1]

This is a lambda which takes a tuple argument value and returns a value based on the first two elements of the tuple.

Examples. Here’s an example of using a lambda form and applying it directly to it’s arguments.

>>> from math import pi
>>> (lambda x: pi*x*x)(5)
78.539816339744831
  1. The (lambda x: pi*x*x) is a function with no name; it accepts a single argument, x, and computes pi*x*x.
  2. We apply the lambda to an argument value of 5. The value of applying the lambda to 5 is the value 5^2\pi = 25\pi.

Here’s a lambda form used in the map function.

>>> map( lambda x: pi*x*x, range(8) )
[0.0, 3.1415926535897931, 12.566370614359172, 28.274333882308138,
50.26548245743669, 78.539816339744831, 113.09733552923255,
153.93804002589985]

This map() function applies a lambda form to the values from 0 to 7 as created by the range(). The input sequence is mapped to the output sequence by having the lambda object applied to each value.

Parameterizing a Lambda. Sometimes we want to have a lambda with an argument defined by the “context” or “scope” in which the lambda is being used.

This is very similar to a closure, in which a free variable is bound into the lambda expression.

Here’s the basic form that’s commonly used to create a closure in Python. This is a function that returns a lambda for later use.

>>> def timesX( x ):
...     return lambda a: x*a
...
>>> t2= timesX(2)
>>> t2(5)
10
>>> t3= timesX(3)
>>> t3(5)
15

We can use this kind of thing as follows. We call our “closure” function to create a lambda that’s got a constant bound into it. We can then use the resulting lambda. In this case, we’re using it in a map() function evaluation.

>>> map( timesX(3), range(5) )
[0, 3, 6, 9, 12]

Consider this more complex example.

>>> spins = [ (23,"red"), (21,"red"), (0,"green"), (24,"black") ]
>>> def byColor( color ):
...     return lambda t: color == t[1]
...
>>> filter( byColor("red"), spins )
[(23, 'red'), (21, 'red')]
>>> filter( byColor("green"), spins )
[(0, 'green')]
  1. We have four sample spins of a roulette wheel, spins.

  2. We have a function, byColor(), that creates a closure. This function binds the name of a color into a simple lambda, lambda t: color == t[1].

    The resulting lambda can be used anywhere a function is required.

  3. We use the byColor() function to create a lambda what we use for filtering our collection of spins. We create a lambda with byColor("red") that will return True for spins that have the color of "red".

  4. We use the byColor() function to create another lambda what we use for filtering our collection of spins. We create a lambda with byColor("green") that will return True for spins that have the color of "green".

As an alternative to creating lists with the filter() function, similar results can be created with a list comprehension.

>>> byRed= byColor("red")
>>> [ s for s in spins if byRed(s) ]
[(23, 'red'), (21, 'red')]

Multi-Dimensional Arrays or Matrices

There are situations that demand multi-dimensional arrays or matrices. In many languages (Java, COBOL, BASIC) this notion of multi-dimensionality is handled by pre-declaring the dimensions (and limiting the sizes of each dimension). In Python, these are handled somewhat more simply.

If you have a need for more sophisticated processing than we show in this section, you’ll need to get the Python Numeric module, also known as NumPy. This is a Source Forge project, and can be found at http://numpy.sourceforge.net/.

Let’s look at a simple two-dimensional tabular summary. When rolling two dice, there are 36 possible outcomes. We can tabulate these in a two-dimensional table with one die in the rows and one die in the columns:

  1 2 3 4 5 6
1 2 3 4 5 6 7
2 3 4 5 6 7 8
3 4 5 6 7 8 9
4 5 6 7 8 9 10
5 6 7 8 9 10 11
6 7 8 9 10 11 12

In Python, a multi-dimensional table like this can be implemented as a sequence of sequences. A table is a sequence of rows. Each row is a sequence of individual cells. This allows us to use mathematical-like notation. Where the mathematician might say A_{i,j}, in Python we can say A[i][j]. In Python, we want the row i from table A, and column j from that row.

This is essentiall the like the list of tuples, yet again. See Lists of Tuples.

List of Lists Example. We can build a table using a nested list comprehension. The following example creates a table as a sequence of sequences and then fills in each cell of the table.

table= [ [ 0 for i in range(6) ] for j in range(6) ]
print table
for d1 in range(6):
    for d2 in range(6):
        table[d1][d2]= d1+d2+2
print table

This program produced the following output.

[[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]
[[2, 3, 4, 5, 6, 7], [3, 4, 5, 6, 7, 8], [4, 5, 6, 7, 8, 9],
[5, 6, 7, 8, 9, 10], [6, 7, 8, 9, 10, 11], [7, 8, 9, 10, 11, 12]]
  1. First, we created a six by six table of zeroes, named table.

    Each item in the table is a 6-item list of zeroes. We used a list comprehension to create an object for each value of j in the range of 0 to 6. Each of the objects is a list of zeroes, one for each value of i in the range of 0 to 6.

  2. We printed that list of lists.

  3. We then filled this with each possible combination of two dice. We iterate over all combinations of two dice, filling in each cell of the table. This is done as two nested loops, one loop for each of the two dice. The outer enumerates all values of one die, d1. The loop enumerates all values of a second die, d2.

    Updating each cell involves selecting the row with table[d1]; this is a list of 6 values. The specific cell in this list is selected by [d2]. We set this cell to the number rolled on the dice, d1+d2+2.

Additional Examples. The printed list of list structure is a little hard to read. The following loop would display the table in a more readable form.

>>> for row in table:
...     print row
...
[2, 3, 4, 5, 6, 7]
[3, 4, 5, 6, 7, 8]
[4, 5, 6, 7, 8, 9]
[5, 6, 7, 8, 9, 10]
[6, 7, 8, 9, 10, 11]
[7, 8, 9, 10, 11, 12]

As an exercise, we’ll leave it to the reader to add some features to this to print column and row headings along with the contents. As a hint, the "%2d" % value string operation might be useful to get fixed-size numeric conversions.

Explicit Index Values. Let’s summarize our matrix of die rolls, and accumulate a frequency table. We’ll use a simple list with 13 buckets (numbered from 0 to 12) to accumulate the frequency of each die roll.

fq= 13*[0]
for i in range(6):
    for j in range(6):
        c= table[i][j]
        fq[ c ] += 1
  1. We initialize the frequency table, fq, to be a list of 13 zeroes.
  2. The outer loop sets the variable i to the values from 0 to 5.
  3. The inner loop sets the variable j to the values from 0 to 5.
  4. We use the index value of i to select a row from the table, and the index value of j to select a column from that row. This is the value, c . We then accumulate the frequency occurances in the frequency table, fq .

This looks very mathematical and formal. However, Python gives us an alternative, which can be somewhat simpler.

Using List Iterators Instead of Index Values. Since our table is a list of lists, we can make use of the power of the for statement to step through the elements without using an index.

fq= 13*[0]
print fq
for row in table:
    for c in row:
        fq[c] += 1
  1. We initialize the frequency table, fq, to be a list of 13 zeroes.
  2. The outer loop sets the variable row to each element of the original table variable. This decomposes the table into individual rows, each of which is a 6-element list.
  3. The inner loop sets the variable c to each column’s value within the row. This decomposes the row into the individual values.
  4. We count the actual occurances of each value, c by using the value as an index into the frequency table, fq . The increment the frequency value by 1.

Mathematical Matrices. We use the explicit index technique for managing the mathematically-defined matrix operations. Matrix operations are done more clearly with this style of explicit index operations.

We’ll show matrix addition as an example, here, and leave matrix multiplication as an exercise in a later section.

m1 = [ [1, 2, 3, 0], [4, 5, 6, 0], [7, 8, 9, 0] ]
m2 = [ [2, 4, 6, 0], [1, 3, 5, 0], [0, -1, -2, 0] ]
m3= [ 4*[0] for i in range(3) ]
for i in range(3):
    for j in range(4):
        m3[i][j]= m1[i][j]+m2[i][j]
  1. In this example we created two input matrices, m1 and m2, each three by four.
  2. We initialized a third matrix, m3, to three rows of four zeroes, using a comprehension.
  3. We iterated through all rows (using the i variable), and all columns (using the j variable) and computed the sum of m1 and m2.

Python provides a number of modules for handling this kind of processing. In Components, Modules and Packages we’ll look at modules for more sophisticated matrix handling.

Exercises

  1. All Dice Combinations. Write a list comprehension that uses nested for-clauses to create a single list with all 36 different dice combinations from (1,1) to (6,6).

  2. Temperature Table. Write a list comprehension that creates a list of tuple s. Each tuple has two values, a temperature in Farenheit and a temperature in Celsius.

    Create one list for Farenheit values from 0 to 100 in steps of 5 and the matching Celsius values.

    Create another list for Celsius values from -10 to 50 in steps of 2 and the matching Farenheit values.

  3. Define max() and min(). Use reduce() to create versions of the built-ins max() and min().

    You may find this difficult to do this with a simple lambda form. However, consider the following. We can pick a value from a tuple like this: (a,b)[0] == a, and (a,b)[1] == b. What are the values of (a,b)[a<b] and (a,b)[a>b]?

  4. Compute the Average or Mean. A number of standard descriptive statistics can be built with reduce(). The basic formulae are given in Tuples.

    Write a function based on reduce() which computes the mean. Mean is a simple “add-reduction” of the values in a sequence divided by the length.

    In essence, you’re inventing a version of sum() based on reduce().

  5. Compute the Variance and Standard Deviation. A number of standard descriptive statistics can be built with reduce(). The basic formulae are given in Tuples.

    Given a sequence of values A = \lbrace a_0, a_1, ..., a_n \rbrace, the standard deviation has a number of alternative definitions. One approach is to sum the values and square this number, as well as sum the squares of each number. Summing squares can be done as a map() to compute squares and then use the sum() function.

    \begin{split}
s_1& \gets \sum_{0 \leq i < n} (A_i^2)\\
s_2& \gets (\sum_{0 \leq i < n} A_i)^2\\
v& \gets \frac{s_1 - s_2}{n}\\
\sigma_A& \gets \sqrt{v}
\end{split}

    Also the standard deviation can be defined as the square root of the average variance.

    \begin{split}
m& \gets \dfrac{ \sum_{0 \leq i < n} A_i }{ n }\\
v& \gets \dfrac{ \sum_{0 \leq i < n} (A_i - m)^2 }{ n-1 }\\
\sigma_A& \gets \sqrt{v}
\end{split}

  6. Distinct Values In A Sequence. In List Exercises, one of the exercises looked at accumulating the distinct values in a sequence.

    Given an input sequence, seq, we can easily sort this sequence. This will put all equal-valued elements together. The comparison for unique values is now done between adjacent values, instead of a lookup in the resulting sequence.

    Unique Values of a Sequence, seq , using sort()

    Initalize

    Set result \gets \text{[ ]} an empty sequence.

    Sort the input sequence, seq.

    Loop. For each value, v, in the sorted seq.

    Already in result? Is v the last element in result? If so, ignore it. If not, append v to the sequence result.

    Result. Return array result, which has distinct values from seq.

    While this is appealing in it’s simplicity, the sort operation makes it relatively inefficient.

  7. Compute the Median. The median function arranges the values in sorted order. It locates either the mid-most value (if there are an odd number) or it averages two adjacent values (if there are an even number).

    If len(data) % 2 == 1, there is an odd number of values, and (len(data)+1)/2 is the midmost value. Otherwise there is an even number of values, and the len(data)/2 and len(data)/2-1 are the two mid-most values which must be averaged.

  8. Portfolio Reporting. In Tuple Exercises, one of the exercises presented a stock portfolio as a sequence of tuples. Plus, we wrote two simple functions to evaluate purchase price and total gain or loss for this portfolio.

    Develop a function (or a lambda form) to sort this porfolio into ascending order by current value (current price × number of shares). This function (or lambda) will require comparing the products of two fields instead of simply comparing two fields.

  9. Matrix Formatting. Given a 6 × 6 matrix of dice rolls, produce a nicely formatted result. Each cell should be printed with a format like "| %2s" so that vertical lines separate the columns. Each row should end with an ‘|’. The top and bottom should have rows of "----" printed to make a complete table.

  10. Three Dimensions. If the rolls of two dice can be expressed in a two-dimensional table, then the rolls of three dice can be expressed in a three-dimensional table. Develop a three dimensional table, 6 x 6 x 6, that has all 216 different rolls of three dice.

    Write a loop that extracts the different values and summarizes them in a frequency table. The range of values will be from 3 to 18.

Table Of Contents

Previous topic

Files

Next topic

Advanced Mapping Techniques

This Page