Software written in Python.

rss2email

Available in a git repository.
Repository: rss2email
Browsable repository: rss2email
Author: W. Trevor King

Since November 2012 I've been maintaining rss2email, a package that converts RSS or Atom feeds to email so you can follow them with your mail user agent. Rss2email was created by the late Aaron Swartz and maintained for several years by Lindsey Smith. I've added a mailing list (hosted with mlmmj) and PyPI package and made the GitHub location the homepage.

Overall, setting up the standard project infrastructure has been fun, and it's nice to see interest in the newly streamlined code picking up. The timing also works out well, since the demise of Google Reader may push some talented folks in our direction. I'm not sure how visible rss2email is, especially the fresh development locations, hence this post ;). If you know anyone who might be interested in using (or contributing to!) rss2email, please pass the word.

SymPy

SymPy is a Python library for symbolic mathematics. To give you a feel for how it works, let's extrapolate the extremum location for f(x) given a quadratic model:

$$f(x) = A x^2 + B x + C \tag{1}$$

and three known values:

$$\begin{aligned}
f(a) &= A a^2 + B a + C \\
f(b) &= A b^2 + B b + C \\
f(c) &= A c^2 + B c + C
\end{aligned} \tag{2}$$

Rephrase as a matrix equation:

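$$\begin{pmatrix} f(a) \\ f(b) \\ f(c) \end{pmatrix}
= \begin{pmatrix} a^2 & a & 1 \\ b^2 & b & 1 \\ c^2 & c & 1 \end{pmatrix}
\begin{pmatrix} A \\ B \\ C \end{pmatrix} \tag{3}$$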

So the solutions for A, B, and C are:

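$$\begin{pmatrix} A \\ B \\ C \end{pmatrix}
= \begin{pmatrix} a^2 & a & 1 \\ b^2 & b & 1 \\ c^2 & c & 1 \end{pmatrix}^{-1}
\begin{pmatrix} f(a) \\ f(b) \\ f(c) \end{pmatrix}
= (\text{long complicated stuff}) \tag{4}$$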

Now that we've found the model parameters, we need to find the x coordinate of the extremum.

$$\frac{df}{dx} = 2 A x + B, \tag{5}$$

which is zero when

$$\begin{aligned}
2 A x &= -B \\
x &= \frac{-B}{2A}
\end{aligned} \tag{6}$$

Here's the solution in SymPy:

>>> from sympy import Symbol, Matrix, factor, expand, pprint, preview
>>> a = Symbol('a')
>>> b = Symbol('b')
>>> c = Symbol('c')
>>> fa = Symbol('fa')
>>> fb = Symbol('fb')
>>> fc = Symbol('fc')
>>> M = Matrix([[a**2, a, 1], [b**2, b, 1], [c**2, c, 1]])
>>> F = Matrix([[fa],[fb],[fc]])
>>> ABC = M.inv() * F
>>> A = ABC[0,0]
>>> B = ABC[1,0]
>>> x = -B/(2*A)
>>> x = factor(expand(x))
>>> pprint(x)
 2       2       2       2       2       2   
a *fb - a *fc - b *fa + b *fc + c *fa - c *fb
---------------------------------------------
 2*(a*fb - a*fc - b*fa + b*fc + c*fa - c*fb) 
>>> preview(x, viewer='pqiv')

Where pqiv is the executable for pqiv, my preferred image viewer. With a bit of additional factoring, that is:

$$x = \frac{a^2 [f(b) - f(c)] + b^2 [f(c) - f(a)] + c^2 [f(a) - f(b)]}{2 \{ a [f(b) - f(c)] + b [f(c) - f(a)] + c [f(a) - f(b)] \}} \tag{7}$$
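
As a quick sanity check, the factored result can be evaluated numerically without SymPy. The sketch below is only an illustration: the function name `extremum_x` and the test parabola `f` are made up for this example and are not part of SymPy or of the derivation above.

```python
def extremum_x(a, fa, b, fb, c, fc):
    """Return the x coordinate of the extremum of the parabola through
    (a, fa), (b, fb), and (c, fc), using the factored formula."""
    numerator = a**2 * (fb - fc) + b**2 * (fc - fa) + c**2 * (fa - fb)
    denominator = 2 * (a * (fb - fc) + b * (fc - fa) + c * (fa - fb))
    if denominator == 0:
        # Collinear points (A = 0) or repeated x values leave the
        # extremum undefined.
        raise ValueError('no unique extremum for these points')
    return numerator / denominator


def f(x):
    # Hypothetical test parabola with its minimum at x = 3.
    return 2 * (x - 3)**2 + 1


# Sample the parabola at three arbitrary points and recover the minimum.
print(extremum_x(0, f(0), 1, f(1), 5, f(5)))
```

Sampling a known parabola at any three distinct points should recover its vertex, which is a cheap way to catch a dropped sign in the formula.
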
Mutt-LDAP

Available in a git repository.
Repository: mutt-ldap
Browsable repository: mutt-ldap
Author: W. Trevor King

I wrote this Python script to query an LDAP server for addresses from Mutt. In December 2012, I got some patches from Wade Berrier and Niels de Vos. Anything interesting enough for others to hack on deserves its own repository, so I pulled it out of my blog repository (linked above, and mirrored on GitHub).

The README is posted on the PyPI page.

curses_check_for_keypress

Available in a git repository.
Repository: curses-check-for-keypress
Browsable repository: curses-check-for-keypress
Author: W. Trevor King

There are some points in my experiment control code where the program does something for an arbitrary length of time (e.g., waits while the operator manually adjusts a laser's alignment). For these situations, I wanted to be able to loop until the user pressed a key. This is a simple enough idea, but the implementation turned out to be complicated enough for me to spin it out as a stand-alone module.
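
To illustrate the basic idea (not the module's actual API), here is a minimal sketch that polls a curses window without blocking. The function name `loop_until_keypress` and its parameters are assumptions made up for this example, and it ignores the complications that motivated the stand-alone module.

```python
import curses
import time


def loop_until_keypress(stdscr, step, poll_interval=0.1):
    """Call step() repeatedly until a key is pressed; return the call count.

    `stdscr` is a curses window, e.g. the one passed in by curses.wrapper().
    """
    stdscr.nodelay(True)  # make getch() return immediately instead of blocking
    count = 0
    while stdscr.getch() == curses.ERR:  # curses.ERR means no key is waiting
        step()
        count += 1
        time.sleep(poll_interval)
    return count
```

In a real terminal, something like `curses.wrapper(loop_until_keypress, step)` would set up and restore the screen around the loop, where `step` is whatever the experiment should keep doing.
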

pyassuan

Available in a git repository.
Repository: pyassuan
Browsable repository: pyassuan
Author: W. Trevor King

I've been trying to come up with a clean way to verify detached PGP signatures from Python. There are a number of existing approaches to this problem. Many of them call gpg using Python's multiprocessing or subprocess modules, but to verify detached signatures, you need to send the signature in on a separate file descriptor, and handling that in a way safe from deadlocks is difficult. The other approach, taken by PyMe, is to wrap GPGME using SWIG, which is great as far as it goes, but development seems to have stalled, and I find the raw GPGME interface excessively complicated.

The GnuPG tools themselves often communicate over sockets using the Assuan protocol, and I'd already written an Assuan server to handle pinentry (originally for my gpg-agent post, not part of pyassuan). I thought it would be natural if there were a gpgme-agent which would handle cryptographic tasks over this protocol, which would make the pgp-mime implementation easier. It turns out that there already is such an agent (gpgme-tool), so I turned my pinentry script into the more general pyassuan package. Now using Assuan from Python should be as easy as (or easier than?) using it from C via libassuan.
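
For a feel of what the protocol looks like on the wire, here is a small sketch of how payload bytes are escaped for an Assuan `D` (data) line, where `%`, CR, and LF have to be percent-escaped. This is not pyassuan's API: the helper names are made up for illustration, and the sketch ignores the protocol's line-length limit, so long payloads would still need to be split across several `D` lines.

```python
def assuan_escape(data):
    """Percent-escape the bytes that may not appear raw in an Assuan line."""
    out = bytearray()
    for byte in bytearray(data):
        if byte in (0x25, 0x0D, 0x0A):  # '%', CR, LF
            out += ('%%%02X' % byte).encode('ascii')
        else:
            out.append(byte)
    return bytes(out)


def assuan_unescape(escaped):
    """Reverse assuan_escape(), turning %XX sequences back into bytes."""
    escaped = bytearray(escaped)
    out = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == 0x25:  # '%' introduces a two-digit hex escape
            out.append(int(bytes(escaped[i + 1:i + 3]).decode('ascii'), 16))
            i += 3
        else:
            out.append(escaped[i])
            i += 1
    return bytes(out)


def data_line(data):
    """Format a payload as a single Assuan data line."""
    return b'D ' + assuan_escape(data) + b'\n'
```
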

The README is posted on the PyPI page.

pygrader

Available in a git repository.
Repository: pygrader
Browsable repository: pygrader
Author: W. Trevor King

The last two courses I've TAd at Drexel have been scientific computing courses where the students are writing code to solve homework problems. When they're done, they email the homework to me, and I grade it and email them back their grade and comments. I've played around with developing a few grading frameworks over the years (a few years back, one of the big intro courses kept the grades in an Excel file on a Samba share, and I wrote a script to automatically sync local comma-separated-value data with that spreadsheet. Yuck :p), so I figured this was my chance to polish up some old scripts into a sensible system to help me stay organized. This system is pygrader.

During the polishing phase, I was searching around looking for prior art ;), and found that Alex Heitzmann had already created pygrade, which is the name under which I had originally developed my own project. While they are both grade databases written in Python, Alex's project focuses on providing a more integrated grading environment.

Pygrader accepts assignment submissions from students through its mailpipe command, which you can run on your email inbox (or from procmail). Students submit assignments with an email subject like

[submit] <assignment name>

mailpipe automatically drops the submissions into a student/assignment/mail mailbox, extracts any MIME attachments into the student/assignment/ directory (without clobbers, with proper timestamps), and leaves you to get to work.

Pygrader also supports multiple graders through the mailpipe command. The other graders can request a student's submission(s) with an email subject like

[get] <student name>, <assignment name>

Then they can grade the submission and mail the grade back with an email subject like

[grade] <student name>, <assignment name>

The grade-altering messages are also stored in the student/assignment/mail mailbox, so you can peruse them later.
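
The three subject formats above are simple enough to parse with a regular expression. The sketch below only illustrates that convention and is not pygrader's own parser: the function name `parse_subject` and the returned dictionary layout are assumptions made up for this example.

```python
import re

# Matches "[submit] ...", "[get] ...", or "[grade] ..." at the start
# of a subject line.
_SUBJECT = re.compile(r'^\s*\[(?P<command>submit|get|grade)\]\s*(?P<rest>.*)$')


def parse_subject(subject):
    """Split a subject line into its command and arguments."""
    match = _SUBJECT.match(subject)
    if match is None:
        raise ValueError('unrecognized subject: {!r}'.format(subject))
    command = match.group('command')
    rest = match.group('rest').strip()
    if command == 'submit':
        # Students only name the assignment.
        return {'command': command, 'assignment': rest}
    # Graders give "<student name>, <assignment name>".
    student, comma, assignment = rest.partition(',')
    if not comma:
        raise ValueError('expected "<student name>, <assignment name>"')
    return {
        'command': command,
        'student': student.strip(),
        'assignment': assignment.strip(),
    }


print(parse_subject('[grade] Jane Doe, Homework 1'))
```
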

Pygrader doesn't spawn editors or GUIs to help you browse through submissions or assign grades. As far as I am concerned, this is a good thing.

When you're done grading, pygrader can email (email) your grades and comments back to the students, signing or encrypting with pgp-mime if either party has configured a PGP key. It can also email a tab-delimited table of grades to the professors to keep them up to speed. If you're running mailpipe via procmail, responses to grade requests are sent automatically.

While you're grading, pygrader can search for ungraded assignments, or for grades that have not yet been sent to students (todo). It can also check for resubmissions, where new submissions come in response to earlier grades.

The README is posted on the PyPI page.

Cython

Cython is a Python-like language that makes it easy to write C-based extensions for Python. This is a Good Thing™, because people who will write good Python wrappers will be fluent in Python, but not necessarily in C. Alternatives like SWIG allow you to specify wrappers in a C-like language, which makes thin wrappers easy, but can lead to a less idiomatic wrapper API. I should also point out ctypes, which has the advantage of avoiding compiled wrappers altogether, at the expense of dealing with linking explicitly in the Python code.
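
To make the ctypes trade-off concrete, here is a minimal sketch that calls `cos` from the C math library with no compiled wrapper. It assumes a POSIX-like system where `ctypes.util.find_library('m')` can locate the math library; the library lookup and the prototype declaration are exactly the linking details that end up in the Python code.

```python
import ctypes
import ctypes.util

# Locate the C math library by name; the actual file name differs
# between platforms, so let ctypes search for it.
libm_path = ctypes.util.find_library('m')
libm = ctypes.CDLL(libm_path)

# ctypes assumes int arguments and return values unless told otherwise,
# so declare the C prototype by hand: double cos(double).
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

print(libm.cos(0.0))  # prints 1.0
```
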

The Cython docs are fairly extensive, and I found them to be sufficient for writing my pycomedi wrapper around the Comedi library. One annoying thing was that Cython does not support __all__ (cython-users). I took a stab at fixing this, but got sidetracked cleaning up the Cython parser (cython-devel, later in cython-devel). I must have bitten off more than I should have, since I eventually ran out of time to work on merging my code, and the Cython trunk moved off without me ;).

SWIG

SWIG is a Simplified Wrapper and Interface Generator. It makes it very easy to provide a quick-and-dirty wrapper so you can call code written in C or C++ from code written in another language (e.g. Python). I don't do much with SWIG, because while building an object oriented wrapper in SWIG is possible, I could never get it to feel natural (I like Cython better). Here are my notes from when I do have to interact with SWIG.

%array_class and memory management

%array_class (defined in carrays.i) lets you wrap a C array in a class-based interface. The example from the docs is nice and concise, but I was running into problems.

>>> import example
>>> n = 3
>>> data = example.sample_array(n)
>>> for i in range(n):
...     data[i] = 2*i + 3
>>> example.print_sample_pointer(n, data)
Traceback (most recent call last):
  ...
TypeError: in method 'print_sample_pointer', argument 2 of type 'sample_t *'

I just bumped into these errors again while trying to add an insn_array class to Comedi's wrapper:

%array_class(comedi_insn, insn_array);    

so I decided it was time to buckle down and figure out what was going on. All of the non-Comedi examples here are based on my example test code.

The basic problem is that while you and I realize that an array_class-based instance is interchangeable with the underlying pointer, SWIG does not. For example, I've defined a sample_vector_t struct:

typedef double sample_t;
typedef struct sample_vector_struct {
  size_t n;
  sample_t *data;
} sample_vector_t;

and a sample_array class:

%array_class(sample_t, sample_array);

A bare instance of the double array class has fancy SWIG additions for getting and setting attributes. The class that adds the extra goodies is SWIG's proxy class:

>>> print(data)  # doctest: +ELLIPSIS
<example.sample_array; proxy of <Swig Object of type 'sample_array *' at 0x...> >

However, C functions and structs interact with the bare pointer (i.e. without the proxy goodies). You can use the .cast() method to remove the goodies:

>>> data.cast()  # doctest: +ELLIPSIS
<Swig Object of type 'double *' at 0x...>
>>> example.print_sample_pointer(n, data.cast())
>>> vector = example.sample_vector_t()
>>> vector.n = n
>>> vector.data = data
Traceback (most recent call last):
  ...
TypeError: in method 'sample_vector_t_data_set', argument 2 of type 'sample_t *'
>>> vector.data = data.cast()
>>> vector.data  # doctest: +ELLIPSIS
<Swig Object of type 'double *' at 0x...>

So .cast() gets you from proxy of <Swig Object ...> to <Swig Object ...>. How do you go the other way? You'll need this if you want to do something extra fancy, like accessing the array members ;).

>>> vector.data[0]
Traceback (most recent call last):
  ...
TypeError: 'SwigPyObject' object is not subscriptable

The answer here is the .frompointer() method, which can function as a class method:

>>> reconst_data = example.sample_array.frompointer(vector.data)
>>> reconst_data[n-1]
7.0

Or as a single line:

>>> example.sample_array.frompointer(vector.data)[n-1]
7.0

I chose the somewhat awkward name of reconst_data for the reconstituted data, because if you use data, you clobber the earlier example.sample_array(n) definition. After the clobber, Python garbage collects the old data, and because the old data claims it owns the underlying memory, Python frees the memory. This leaves vector.data and reconst_data pointing to unallocated memory, which is probably not what you want. If keeping references to the original objects (like I did above with data) is too annoying, you have to manually tweak the ownership flag:

>>> data.thisown
True
>>> data.thisown = False
>>> data = example.sample_array.frompointer(vector.data)
>>> data[n-1]
7.0

This way, when data is clobbered, SWIG doesn't release the underlying array (because data no longer claims to own the array). However, vector doesn't own the array either, so you'll have to remember to reattach the array to something that will clean it up before vector goes out of scope to avoid leaking memory:

>>> data.thisown = True
>>> del vector, data

For deeply nested structures, this can be annoying, but it will work.

Maple

Some of the classes I TA use Maple. (Caveat: I prefer Python, as a more general language. Use SymPy or Sage if you need symbolic processing.) Anyhow, I get Maple worksheets to grade. SSHing into the department computer lab to fire up xmaple is a pain, so I wrote mw2txt.py to extract the Maple commands from the worksheet. It benefits from the fact that worksheets are fairly clean XML. Graphs and equations are more difficult, since they have complicated layout and are stored as encoded blobs. Other than that, things work pretty well. Here's the output from my example.mw example worksheet, picking out the math-mode sections (in red) and unprocessed blocks (in yellow) from the comments (uncolored).

$ mw2txt.py --color example.mw 
Hi there
> restart;
> interface(prettyprint=0):
> 1;# one  + plus 2 two ;
1
> 3 + 4;  bold
7
Equation
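
To show why clean XML makes this kind of extraction straightforward, here is a rough sketch using the standard library's ElementTree. It is not mw2txt.py: the sample document and its tag names (`Worksheet`, `Group`, `Text`, `Input`, `Output`) are made up for illustration and are not Maple's real worksheet schema.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for a worksheet; the tag names are NOT Maple's real schema.
SAMPLE = """<Worksheet>
  <Group><Text>Hi there</Text></Group>
  <Group><Input>restart;</Input></Group>
  <Group><Input>3 + 4;</Input><Output>7</Output></Group>
</Worksheet>"""


def worksheet_to_text(xml_source):
    """Flatten the text in an XML document, marking inputs with '> '."""
    root = ET.fromstring(xml_source)
    lines = []
    for element in root.iter():  # document order, including the root
        text = (element.text or '').strip()
        if not text:
            continue  # skip container elements that only hold whitespace
        if element.tag == 'Input':
            lines.append('> ' + text)
        else:
            lines.append(text)
    return '\n'.join(lines)


print(worksheet_to_text(SAMPLE))
```
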
Unicode `long_description` in `setup.py`

I've been trying to figure out how to set up Unicode long descriptions in setup.py. I often use Unicode in my README files and then use the contents of the README to set the long description with something like:

…
_this_dir = os.path.dirname(__file__)
…
setup(
    …
    long_description=codecs.open(
        os.path.join(_this_dir, 'README'), 'r', encoding='utf-8').read(),
    )

This crashed in Python 2.7 with a UnicodeDecodeError when I tried to register the package on PyPI. The problem is that packages are checked before registration to avoid being registered with broken metadata, and Unicode handling was broken in distutils (bug 13114). Unfortunately, there haven't yet been Python releases containing the fixes (applied in October 2011).

How do you work around this issue until you get a more recent Python 2.7? Just use Python 3.x, where Unicode handling is much cleaner. You may need to hide Python-3-incompatible code inside:

if _sys.version_info < (3,0):

blocks, but you shouldn't be pulling in huge amounts of code for setup.py anyway.
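
For reading the README itself, a small helper keeps the decoding explicit on both Python 2 and 3. This is only a sketch: the name `read_long_description` is made up for this example, and it does not work around the distutils bug described above; it just reads the file as UTF-8 text and closes it afterwards.

```python
import io


def read_long_description(path):
    """Return the contents of `path` decoded as UTF-8 text."""
    # io.open behaves the same on Python 2.6+ and 3.x, and the context
    # manager closes the file once it has been read.
    with io.open(path, 'r', encoding='utf-8') as readme:
        return readme.read()
```

In setup.py, the result would be passed along as `long_description=read_long_description(os.path.join(_this_dir, 'README'))`.
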
