Persistent Objects and Workflow-Management with Paos

by Carlos Maltzahn
(translated into English by Jeff Bizarro)


Paos is a system for the consistent administration of Python objects across a network. In today's issue of Python Tools, Carlos provides the background to this interesting tool. He also introduces Chautauqua, a workflow management system built on Paos, as an example of a larger application that shows what Python can do in real applications. (A German version of this article first appeared in Linux Magazin 7/1996, pp. 56-59.)


The Python module shelve supports storing Python objects in a file. After opening a shelve file, one can store any Python object that can be pickled under a name:

>>> import shelve
>>> db = shelve.open('database')
>>> db['first object'] = [1, 2, 3]
>>> db['second object'] = ('hallo', [])
>>> db['first object']
[1, 2, 3]
>>> db.close()

After this interpreter session ends, the stored objects are preserved in a file called database. In the next Python session they can be loaded again with shelve.open('database'); in other words, these objects are persistent. The implementation underlying shelve can be configured when the Python interpreter is compiled. Several database C libraries are available, e.g. dbm, gdbm and bsddb. These libraries implement data structures that allow fast access to the stored data. In two important respects, however, shelve offers no support. First, it does not implement parallel access control: if several processes access persistent objects in the same shelve file, the file can become corrupt. Second, shelve provides no query language.
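Both points can be seen in a short sketch. It is written for a current Python 3 interpreter rather than the interpreter of the original listing, and the temporary directory and the variable names are our own choices for illustration. The file is closed and reopened to stand in for two separate interpreter sessions, and the "query" at the end has to be written by hand as a loop over all keys.

```python
import os
import shelve
import tempfile

# Use a throwaway directory so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), 'database')

# First "session": store two objects and close the file.
db = shelve.open(path)
db['first object'] = [1, 2, 3]
db['second object'] = ('hallo', [])
db.close()

# Second "session": reopen the same file; the objects are still there.
db = shelve.open(path)
first = db['first object']

# shelve has no query language, so selecting objects means looping
# over every key and testing each stored value ourselves.
lists_only = sorted(key for key in db.keys() if isinstance(db[key], list))
db.close()
```

Nothing in this sketch protects the file if a second process opens it at the same time, which is the gap Paos sets out to close.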

Paos (Python Active Object server) is based on shelve and implements a Client/Server architecture with parallel access control and a simple query language.

http://www.cs.colorado.edu/~carlosm/paos1-arch-english.gif

In addition, Paos provides a notification service through which applications can have the server inform them about certain conditions. Applications define these conditions in the form of queries and register them with the notification service in the server. Every time an application stores something with the server, the server applies the registered queries again to the stored objects. If the response to a query is not empty, the server sends the response to the application that registered the query.
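The idea can be modeled in a few lines without any networking. The following is only a toy sketch in Python 3, not the Paos implementation: the class TinyNotifier and its methods are hypothetical, queries are plain Python functions, and we assume that the registered queries are applied to the objects that have just been stored.

```python
class TinyNotifier:
    """Toy in-memory model of a store with registered queries (hypothetical)."""

    def __init__(self):
        self.objects = []     # everything stored so far
        self.requests = []    # (request_id, predicate, callback) triples

    def register(self, predicate, callback):
        """Register a query (here: a predicate) and return its request ID."""
        request_id = len(self.requests)
        self.requests.append((request_id, predicate, callback))
        return request_id

    def store(self, new_objects):
        """Store objects, then apply every registered query to them."""
        self.objects.extend(new_objects)
        for request_id, predicate, callback in self.requests:
            matches = [obj for obj in new_objects if predicate(obj)]
            if matches:       # only non-empty responses are delivered
                callback(request_id, matches)


received = []
server = TinyNotifier()
rid = server.register(lambda obj: obj.get('name') == 'Sue',
                      lambda req_id, objs: received.append((req_id, objs)))
server.store([{'name': 'John'}])   # no match, so nothing is delivered
server.store([{'name': 'Sue'}])    # match, so one notification arrives
```

In Paos the callback corresponds to a message sent over the network to the registering application, as the second example in this article shows.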

An example and somewhat more (or too much?) detail

The following example illustrates how an application connects to the server, issues a query and accesses attributes of a loaded, persistent object. We assume that a Paos server runs on the machine cheesy.cs.colorado.edu and waits for requests on port 5000. The example creates some objects of the class Person, stores them and executes a query.

import Client
import ExampleSchema

# connect to the Paos server
conn = Client.Connection('cheesy.cs.colorado.edu', 5000, 'example')

# create objects
john = ExampleSchema.Person()
john.name = 'John'
sue = ExampleSchema.Person()
sue.name = 'Sue'
john.loves = sue
bill = ExampleSchema.Person()
bill.name = 'Bill'
sue.loves = bill
bill.loves = sue

# registers objects with the server
conn.register_objs([john, sue, bill])

# store the objects
conn.commit([john, sue, bill])

# get all instances of Person that love Sue
answer = conn.get('r', 'ExampleSchema.Person', [('loves', '==', sue)])

# for each object in the response, print its name and the name of the loved one
for obj in answer:
    if hasattr(obj, 'loves'):
        print obj.name, obj.loves.name

First we import the module Client in order to be able to communicate with the Paos server. Then we import the module ExampleSchema, which defines the class Person. Finally we connect to Paos by instantiating the class Client.Connection. We specify the host name and the port on which the Paos server runs. The third argument accepts any string that can be used to identify the particular client application.

We create three ExampleSchema.Person instances and assign them attribute values. Before we can store these objects, they must be registered with the server. The registration assigns unique database object IDs to the new objects. This pays off in the query that follows: "give me all objects of the class ExampleSchema.Person that love sue". If sue had not been registered, the server could not compare this object with the stored objects.
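The query itself is a list of (attribute, operator, value) triples. How the server evaluates such a list is not shown in this article, so the following Python 3 sketch is only one plausible reading. The names OPERATORS, matches and select are hypothetical, Person is a stand-in for ExampleSchema.Person, and the comparison is done on the Python objects themselves rather than on database object IDs.

```python
import operator

# Assumed operator table; only '==' appears in the article.
OPERATORS = {'==': operator.eq, '!=': operator.ne}


class Person:
    """Stand-in for ExampleSchema.Person."""


def matches(obj, conditions):
    """Return True if obj satisfies every (attribute, operator, value) triple."""
    for attr, op, value in conditions:
        if not hasattr(obj, attr):
            return False
        if not OPERATORS[op](getattr(obj, attr), value):
            return False
    return True


def select(objects, cls, conditions):
    """Return the instances of cls among objects that satisfy all conditions."""
    return [obj for obj in objects
            if isinstance(obj, cls) and matches(obj, conditions)]


john, sue, bill = Person(), Person(), Person()
john.name, sue.name, bill.name = 'John', 'Sue', 'Bill'
john.loves, sue.loves, bill.loves = sue, bill, sue

answer = select([john, sue, bill], Person, [('loves', '==', sue)])
names = [person.name for person in answer]
```

Here names holds 'John' and 'Bill'. In this local sketch the comparison with sue works because all objects live in one process; across a network the two sides need a shared identifier, which is what registration provides.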

The first argument 'r' in the query means that we request only read rights for the objects returned by the query. If we wanted to modify any of the returned objects, we would have to either use 'rw' in the query or subsequently acquire the write rights for those objects with the method conn.lock. Only one client can have write access to a given object (more accurately: only one Client.Connection instance). While we own the write rights, no other client can modify the corresponding objects. Every call to the conn.commit method relinquishes all access rights we own. The same is true if the Paos server loses the connection to us.
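These rules can be summarized in a small model. The Python 3 sketch below is not taken from Paos: the class LockTable and its methods are hypothetical, connections are represented by plain strings, and what Paos does when a write request is refused is not described here, so the sketch simply reports the refusal.

```python
class LockTable:
    """Toy model: at most one connection holds write rights per object."""

    def __init__(self):
        self.writers = {}   # db_id -> name of the connection holding write rights

    def lock(self, conn, db_id):
        """Grant write rights if they are free or already held by conn."""
        holder = self.writers.setdefault(db_id, conn)
        return holder == conn

    def release_all(self, conn):
        """On commit or a lost connection, drop all rights held by conn."""
        for db_id in [i for i, c in self.writers.items() if c == conn]:
            del self.writers[db_id]


locks = LockTable()
a_got_it = locks.lock('client A', 42)   # granted: nobody holds object 42
b_got_it = locks.lock('client B', 42)   # refused: client A holds it
locks.release_all('client A')           # client A commits
b_retry = locks.lock('client B', 42)    # granted now
```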

The query supplies a list with two new objects, which are equivalent to john and bill. In the following loop we print the names of the lover and the loved. This harmless-looking loop hides a bit of tricky logic, though: the Client module guarantees that sue, john.loves and bill.loves point to the same object. This is made possible by the registration of sue and by a resolution process that is built into the attribute access of john.loves and bill.loves. This resolution process is implemented using the special Python method __getattr__, which allows attribute access to be redefined. The resolution process also automatically loads objects from the Paos server as they are needed. For instance, if we had loaded only john, the expression john.loves would automatically load the object sue from the Paos server. Subsequent access to john.loves is served out of a cache. This greatly simplifies the writing of Paos client applications and reduces client/server traffic. The implementation of the attribute access is defined in Schema.py in the class DBobject. The method register_objs, defined in the class Client.Connection, installs this attribute access for each new object in the argument list.
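The resolution process can be imitated without a server. The following Python 3 sketch is not the code from Schema.py: Ref, LazyObject, STORE and load are hypothetical names, and a dictionary stands in for the Paos server. It shows the two properties described above, namely that referenced objects are loaded only when an attribute is accessed, and that every path to the same database object ID ends at the same Python object.

```python
class Ref:
    """Placeholder stored in place of a referenced object."""

    def __init__(self, db_id):
        self.db_id = db_id


class LazyObject:
    """Object whose references are resolved on attribute access."""

    def __init__(self, db_id, attrs, loader, cache):
        # Set these through __dict__ so that they are ordinary attributes.
        self.__dict__.update(db_id=db_id, _raw=attrs,
                             _loader=loader, _cache=cache)

    def __getattr__(self, name):
        # Python calls this only when the normal attribute lookup fails.
        raw = self.__dict__['_raw']
        if name not in raw:
            raise AttributeError(name)
        value = raw[name]
        if isinstance(value, Ref):
            cache = self.__dict__['_cache']
            if value.db_id not in cache:      # load on demand, once
                cache[value.db_id] = self.__dict__['_loader'](value.db_id)
            value = cache[value.db_id]
        return value


# A dictionary plays the role of the server in this sketch.
STORE = {1: {'name': 'John', 'loves': Ref(2)},
         2: {'name': 'Sue', 'loves': Ref(3)},
         3: {'name': 'Bill', 'loves': Ref(2)}}
loads = []    # records every simulated trip to the server
cache = {}


def load(db_id):
    loads.append(db_id)
    return LazyObject(db_id, STORE[db_id], load, cache)


john = load(1)
cache[1] = john
sue_via_john = john.loves          # loads object 2 on demand
bill = sue_via_john.loves          # loads object 3 on demand
sue_via_bill = bill.loves          # served from the cache, no new load
```

After this run, loads contains each ID exactly once, and sue_via_john and sue_via_bill are the same object.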

It is important to understand that this resolution process can ensure referential consistency only among registered objects. In the above example the variables john and bill point to objects that are not contained in the response. It is the responsibility of the programmer to detect when variables point to outdated objects. With a simple trick one can make a variable point to the most up-to-date object: john = conn.cache[john.db_id]. The Client.Connection instance maintains a cache of the most recently loaded objects. Each object in the cache can be accessed by its database object ID (db_id). Since the older and newer versions of john have the same database object ID, one can easily make the variable john point to the most recently loaded object.
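The trick can be tried out with a plain dictionary in place of conn.cache. In this Python 3 sketch the class Person, the ID 7 and the changed name are invented for illustration; only the last line corresponds to the statement given above.

```python
class Person:
    """Minimal stand-in for a persistent object with a database object ID."""

    def __init__(self, db_id, name):
        self.db_id = db_id
        self.name = name


cache = {}                      # stands in for conn.cache

john = Person(7, 'John')
cache[john.db_id] = john

# A later query delivers a newer version with the same database object ID,
# which replaces the cache entry. The variable john still points to the old one.
fresh = Person(7, 'Johnny')
cache[fresh.db_id] = fresh
stale = john is not cache[john.db_id]

# The trick: look the object up again by its ID.
john = cache[john.db_id]
```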

The most interesting feature of Paos, however, is the notification service. The following example shows how this service is used:

import Client
import ExampleSchema
import Utilities
import os
import pickle

# create a pipe for notifications
(read_pipe_fd, write_pipe_fd) = os.pipe()

# connect to the server
conn = Client.Connection('cheesy.cs.colorado.edu', 5000,
                         'example', (read_pipe_fd, write_pipe_fd))

# register a query with the notification service
request_id = conn.register('ExampleSchema.Person', [('name', '==', 'Sue')])

while 1:

    # wait for a notification and read it
    data = Utilities.READ(read_pipe_fd, 10000)

    # unpack the notification
    (req_id, obj_list, other_client) = pickle.loads(data)

    # unpack the identification of the other client
    (other_host, other_pid, other_uid, other_name) = other_client

    # do something with it

Compared with the first example, three additional modules must be loaded. Utilities is a module with auxiliary procedures that are used in all Paos modules. The modules os and pickle are standard Python modules that provide operating system services and functions for the transformation of objects into strings (object serialization).

First we create a pipe on which we will receive notifications. A pipe is a unidirectional communication channel that consists of two file descriptors, one for the reading end and one for the writing end. As in the previous example, we create a connection to the Paos server by instantiating the Client.Connection class. This time, however, we add the pipe as a fourth argument. Then we register a notification request, which ensures that the server sends us all new Person objects with the name 'Sue' as soon as these objects are entered into the database. The call to conn.register(...) returns a registration number. Each notification includes this registration number so that we can match notifications with notification requests in case we have issued several notification requests.
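For readers who have not used pipes from Python before, here is a minimal Python 3 sketch that uses only the os module and no Paos code. Both ends are held by the same process, which is enough to show the direction of the data flow; the variable names are our own.

```python
import os

# os.pipe() returns the reading end first, then the writing end.
read_fd, write_fd = os.pipe()

# Data written to the writing end ...
os.write(write_fd, b'hello')
os.close(write_fd)

# ... comes out of the reading end. The second argument is an upper
# bound on the number of bytes returned, not an exact length.
data = os.read(read_fd, 10000)

# Once the writing end is closed and the pipe is empty, read returns
# an empty bytes object.
at_end = os.read(read_fd, 10000)
os.close(read_fd)
```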

We receive notifications on the reading end of the pipe. We use the Utilities procedure READ, which guarantees that the full length of the notification is read from the pipe. The notification is sent as a string over the network and is converted into a Python object using pickle.loads(data). A notification consists of

  1. the registration number of the notification request,
  2. the response to the notification request in the form of a non-empty object list and
  3. the identification of the application that triggered the notification, which in turn consists of
    1. the host name
    2. the process ID
    3. the user ID of the user running the application, and
    4. the third argument of the Client.Connection instantiation in that application.
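The structure listed above can be tried out locally. The following Python 3 sketch sends one pickled notification through a pipe within a single process and unpacks it again. How Utilities.READ determines the length of a notification is not described in this article, so the length prefix used here is an assumption, and send_message, read_exactly and receive_message are hypothetical helpers. The process ID 1234 and the user ID 501 are made-up sample values.

```python
import os
import pickle
import struct


def send_message(fd, obj):
    """Write a pickled object, preceded by its length as a 4-byte integer."""
    payload = pickle.dumps(obj)
    os.write(fd, struct.pack('!I', len(payload)) + payload)


def read_exactly(fd, count):
    """Keep reading until exactly count bytes have arrived."""
    chunks = []
    while count > 0:
        chunk = os.read(fd, count)
        if not chunk:
            raise EOFError('pipe closed before the message was complete')
        chunks.append(chunk)
        count -= len(chunk)
    return b''.join(chunks)


def receive_message(fd):
    """Read the length prefix, then the pickled object of that length."""
    (length,) = struct.unpack('!I', read_exactly(fd, 4))
    return pickle.loads(read_exactly(fd, length))


read_fd, write_fd = os.pipe()

# A notification with the three parts listed above (sample values).
notification = (0,
                [{'name': 'Sue'}],
                ('cheesy.cs.colorado.edu', 1234, 501, 'example'))
send_message(write_fd, notification)

req_id, obj_list, other_client = receive_message(read_fd)
other_host, other_pid, other_uid, other_name = other_client

os.close(read_fd)
os.close(write_fd)
```

An application with several notification requests would use req_id at this point to decide which part of the program should handle obj_list.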

Chautauqua: A larger application with Paos

Paos is a spin-off of a workflow research project. One of the results of this project is the experimental workflow system Chautauqua. With its notification service, Paos implements the communication infrastructure for the different Chautauqua system components. Chautauqua users interact with the system using a web browser and a graph editor. The web browser displays dynamically generated to-do lists for each employee and is used for filling out forms. The graph editor displays the structure of an office process and the status of different jobs. For example, if an office employee completes a form, Paos notifies the Chautauqua workflow manager, which delegates the form to the next office employee. Each user can continually track this process in their graph editor because each graph editor receives the appropriate notifications and immediately converts them into graphic representations.

The special feature of Chautauqua is that it enables users to change the structure of office processes while jobs are running. In the following snapshot we see the structure of an office process:

http://www.cs.colorado.edu/~carlosm/paos2_ICN.gif

Employees are represented by asterisks, office roles by squares, and activities by circles and triangles. Small points on the upper right of activities represent "tokens", which show the status of the work and which move along edges as work progresses. The graph editor allows users to change any part of the graph. If activities are deleted, tokens can lose their location. Chautauqua offers mechanisms to gather and reassign these lost tokens to new locations in the changed graph.

Further Information

Paos and Chautauqua are written entirely in Python and are freely available at:

ftp://ftp.cs.colorado.edu/users/carlosm/paos-1.4.tar.gz

ftp://ftp.cs.colorado.edu/users/carlosm/chautauqua-1.4.tar.gz (Chautauqua contains Paos)

More detailed documentation for Paos and Chautauqua is in preparation and will be announced in the newsgroup comp.lang.python.



Carlos Maltzahn is currently a computer science student in the Ph.D. program of the University of Colorado in Boulder. His research interests concentrate at the moment on Internet caches and distributed indexing. In his spare time he either roams somewhere in the fantastically beautiful Rocky Mountains or builds mobile robots from FischerTechnik. To reach him, use carlosm@cs.colorado.edu

Copyright © Linux Magazin