Universal mailbox converter in 24 lines


Sometimes you have a number of email messages stored in a mailbox in one format and you need to convert them to another format, for example from Maildir to mbox or from MH to Maildir. There are some solutions out there: programs capable of translating from one format to another, or mail clients (MUAs) that understand both formats, so you can do the conversion by moving messages from within the client itself. A few days ago I found out, by accident, that Python has a mailbox module that can read and write many different mailbox formats (at the time of writing, Python 2.5.1 understands Maildir, mbox, MH, Babyl, and MMDF). This module makes it easy to write a very simple "universal mailbox converter", like the following one:

#!/usr/bin/env python
import mailbox
import os.path
import re
import sys

try:
    appname = os.path.basename(sys.argv[0])
    source = sys.argv[1]
    dest = sys.argv[2]

    (sfmt, dfmt) = re.match(r'^(.+)2(.+)', appname).groups()
    sbox = mailbox.__dict__[sfmt](source)
    dbox = mailbox.__dict__[dfmt](dest)

    for key in sbox.iterkeys():
        dbox.add(sbox.get_message(key))

except IndexError:
    sys.exit('Usage: %s source destination' % appname)
except (AttributeError, KeyError):
    sys.exit('ERROR: invalid mailbox type')
except mailbox.Error, err:
    sys.exit('ERROR: %s' % err)

This 24-line program uses the mailbox module to do the conversion. The conversion it performs depends on the name under which it's invoked. If you call the program "Maildir2mbox", it converts from Maildir to mbox. In general, the name should be "[source format]2[destination format]", where the source and destination formats are the ones mentioned above: Maildir, mbox, MH, Babyl, MMDF. The names are case-sensitive, because they must match the class names in the mailbox module exactly. The program expects two arguments, which are the paths to the source and destination mailboxes, respectively. If more formats are added to the module in the future, they will be supported automatically by invoking the program with the appropriate name. For example, on my hard drive I have called the program "Maildir2mbox" and created symbolic links to it for every other permutation of formats:

$ ls -l
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 Babyl2MH -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 Babyl2MMDF -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 Babyl2Maildir -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 Babyl2mbox -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 MH2Babyl -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 MH2MMDF -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 MH2Maildir -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 MH2mbox -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 MMDF2Babyl -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 MMDF2MH -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 MMDF2Maildir -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 MMDF2mbox -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 Maildir2Babyl -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 Maildir2MH -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 Maildir2MMDF -> Maildir2mbox
-rwxr-xr-x 1 root root  566 2007-11-22 12:22 Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 mbox2Babyl -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 mbox2MH -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 mbox2MMDF -> Maildir2mbox
lrwxrwxrwx 1 root root   12 2007-11-22 12:36 mbox2Maildir -> Maildir2mbox
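
Creating those links by hand is tedious, so here is a minimal sketch of how they could be generated. It is not part of the original program: the helper names link_names and create_links are made up for illustration, and it assumes the real script is called Maildir2mbox and lives in the directory you pass in.

```python
import itertools
import os

# Format names, spelled exactly like the classes in the mailbox module.
FORMATS = ['Maildir', 'mbox', 'MH', 'Babyl', 'MMDF']


def link_names(formats=FORMATS):
    """Return every '<source>2<destination>' name for two distinct formats."""
    return ['%s2%s' % pair for pair in itertools.permutations(formats, 2)]


def create_links(directory, target='Maildir2mbox'):
    """Create a relative symlink to target for each name that does not exist yet.

    Hypothetical helper: 'target' is assumed to be the real script,
    already present inside 'directory'.
    """
    created = []
    for name in link_names():
        path = os.path.join(directory, name)
        if name == target or os.path.lexists(path):
            continue  # skip the script itself and anything already there
        os.symlink(target, path)
        created.append(name)
    return created
```

Calling create_links('.') from the directory that holds Maildir2mbox should produce the 19 links shown in the listing above. Five formats give 20 ordered pairs, and one of them is the script itself.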

For Python programmers, the trick is very simple. The mailbox module has a general class called Mailbox with one subclass for each supported mailbox type. For example, mailbox.Maildir is the class implementing Maildir mailboxes. Fortunately, in Python you can create an object from a string holding its class name, thanks to the __dict__ dictionary every Python module has, which maps each name defined in the module (class names included) to the corresponding object. To create a mailbox.Maildir object I can use mailbox.Maildir(…) or mailbox.__dict__['Maildir'](…). The program simply extracts the mailbox format names from the program name obtained from sys.argv[0] and uses this method to create mailboxes in the right format.
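
The listing above targets Python 2, and its "except mailbox.Error, err" form is no longer valid syntax in Python 3. As a rough sketch of the same trick for Python 3, the lookup by name and the copy loop could look like the code below. The function names split_formats, mailbox_class and convert are my own, not part of the mailbox module. The sketch uses getattr instead of indexing __dict__ directly (the two are equivalent here), and it flushes and closes the destination, which the original leaves to interpreter exit.

```python
import mailbox
import re


def split_formats(appname):
    """Split a name such as 'Maildir2mbox' into ('Maildir', 'mbox')."""
    match = re.match(r'^(.+)2(.+)$', appname)
    if match is None:
        raise ValueError('invalid program name: %s' % appname)
    return match.groups()


def mailbox_class(name):
    """Look up a concrete mailbox class by name, like mailbox.__dict__[name]."""
    cls = getattr(mailbox, name, None)
    is_mailbox = isinstance(cls, type) and issubclass(cls, mailbox.Mailbox)
    if not is_mailbox or cls is mailbox.Mailbox:
        raise ValueError('invalid mailbox type: %s' % name)
    return cls


def convert(sfmt, source, dfmt, dest):
    """Copy every message from source into dest and return how many were copied."""
    # create=False makes a missing source an error instead of an empty mailbox.
    sbox = mailbox_class(sfmt)(source, create=False)
    dbox = mailbox_class(dfmt)(dest)
    count = 0
    try:
        for message in sbox:  # iterating a mailbox yields its messages
            dbox.add(message)
            count += 1
        dbox.flush()  # make sure pending changes reach the disk
    finally:
        dbox.close()
        sbox.close()
    return count
```

With these helpers, the body of the script reduces to something like convert(*split_formats(appname)[:1], source, *split_formats(appname)[1:], dest), or more readably: sfmt, dfmt = split_formats(appname) followed by convert(sfmt, source, dfmt, dest). Locking is left out to keep the sketch short. If another program may write to the destination at the same time, wrapping the loop in dbox.lock() and dbox.unlock() would be the safer choice.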
