arrayfrombuffer

The arrayfrombuffer Python package makes it possible to memory-map files as Numerical Python arrays. It contains two modules, arrayfrombuffer and maparray.

Some documentation for arrayfrombuffer and maparray is available online. The package is free software, available for download under an X11-style license.

Evangelism

Memory-mapping files as arrays has the following advantages.

In theory, you could also use this package for things like arrays in memory shared between programs or arrays in distributed shared memory. I haven't tried using it for those things, though.

Someone built a package called "Vmaps" that does something similar, but provides nice things like atomic compare-and-swap for shared-memory arrays. It was released 2002-01-22.

Usage

Using it is very easy. To create the array file on disk:

    open('tmp.foo', 'wb').write(somenumericarray.tostring())

To load it back in as a flat array of type 'l':

    import maparray
    myarray, mymmapobj, myfile = maparray.maparray('tmp.foo')

Now myarray is a perfectly ordinary Numeric array whose data just happens to be stored in the file 'tmp.foo'.
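For readers who want to see the underlying idea without Numeric installed, here is a minimal sketch using only the Python standard library. It is not how maparray is implemented and does not use its API: the mmap and array modules and memoryview.cast stand in for it, and the scratch path and sample values are made up for illustration.

```python
import array
import mmap
import os
import tempfile

# Write eight native longs to a scratch file as raw bytes (no header).
# The path and the values are hypothetical, chosen only for this sketch.
values = array.array('l', range(8))
path = os.path.join(tempfile.mkdtemp(), 'tmp.foo')
with open(path, 'wb') as f:
    f.write(values.tobytes())

# Map the whole file (length 0) and view its bytes as a flat run of longs.
with open(path, 'r+b') as f, mmap.mmap(f.fileno(), 0) as mm:
    with memoryview(mm) as raw, raw.cast('l') as view:
        first = view[0]          # read directly from the mapped pages
        loaded = view.tolist()   # copy the elements out of the mapping
```

The views are released before the map is closed because an mmap object cannot be closed while something still exports its buffer.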

If you want a different data type, you can specify it:

    # the typecode can be passed positionally or by keyword
    myarray, mymmapobj, myfile = maparray.maparray('tmp.foo', 'f')
    myarray, mymmapobj, myfile = maparray.maparray('tmp.foo', typecode='f')

You can specify a shape as well:

    # typecode and shape, passed positionally
    myarray, mymmapobj, myfile = maparray.maparray('tmp.foo', 'f', (-1, 24))
    # shape by keyword; the typecode stays at its default
    myarray, mymmapobj, myfile = maparray.maparray('tmp.foo', shape=(-1, 24))
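To make the typecode and shape arguments concrete, here is a rough standard-library sketch of the same idea. It assumes that a -1 in the shape means "work this dimension out from the file size"; that reading, the COLS constant, and the scratch file are my assumptions, and none of this is maparray's own code.

```python
import array
import mmap
import os
import struct
import tempfile

COLS = 24  # the fixed second dimension, as in shape=(-1, 24)

# Scratch file holding 3 * 24 single-precision floats (hypothetical data).
path = os.path.join(tempfile.mkdtemp(), 'tmp.foo')
with open(path, 'wb') as f:
    f.write(array.array('f', range(3 * COLS)).tobytes())

with open(path, 'r+b') as f, mmap.mmap(f.fileno(), 0) as mm:
    itemsize = struct.calcsize('f')
    # Assumed meaning of -1: infer the row count from the file size.
    rows = len(mm) // (itemsize * COLS)
    with memoryview(mm) as raw, raw.cast('f', (rows, COLS)) as table:
        shape = table.shape
        corner = table[2, 23]   # last element of the last row
```

If the file size is not a whole multiple of one row, the cast raises an error rather than silently truncating, which is usually what you want.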

If you make changes via myarray and want them written to the file before all references to the mmap object and the array are gone, do this:

    mymmapobj.flush()
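Here is a small sketch of what flushing does, using the standard library's mmap module rather than maparray. The scratch file and values are invented for the example.

```python
import array
import mmap
import os
import tempfile

# Scratch file holding four native longs (hypothetical data).
path = os.path.join(tempfile.mkdtemp(), 'tmp.foo')
with open(path, 'wb') as f:
    f.write(array.array('l', [1, 2, 3, 4]).tobytes())

with open(path, 'r+b') as f, mmap.mmap(f.fileno(), 0) as mm:
    with memoryview(mm) as raw, raw.cast('l') as view:
        view[0] = 99   # writes straight into the mapped pages
    mm.flush()         # ask the OS to write the changed pages to the file

# Read the file back through ordinary file I/O to confirm the change.
reread = array.array('l')
with open(path, 'rb') as f:
    reread.frombytes(f.read())
```

After this runs, reread holds 99 followed by the three untouched values.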

Wishlist

It might be nice to have some of the following features:

kragen@pobox.com | Kragen's software | Kragen's home page