26 July 2018

Using Python 3 imaplib to connect to Gmail - Part 1: Searching and Labeling

Using imaplib to access Gmail. The full notebook file can be found on GitHub here.
In [1]:
from imaplib import IMAP4_SSL #IMAP4 subclass that connects over SSL/TLS
import getpass #interactive password prompt works on command line too
In [2]:
#First we need to authenticate and get an IMAP session
M = IMAP4_SSL('imap.gmail.com')
M.login('myEmail@gmail.com', getpass.getpass()) #you will need to have set up an auth method that works without a 2FA prompt, e.g. an app password
········
Out[2]:
('OK', [b'myEmail@gmail.com authenticated (Success)'])
Let's take a look at the folders. In Gmail these are actually labels, and a message can have multiple labels. Gmail also marks some folders with the special-use attributes described in https://tools.ietf.org/html/rfc6154. These are some of the Gmail folders; in particular, look at the special indicators \\All, \\Junk, \\Trash and so forth:
  b'(\\All \\HasNoChildren) "/" "[Gmail]/All Mail"',
  b'(\\HasNoChildren \\Trash) "/" "[Gmail]/Bin"',
  b'(\\Drafts \\HasNoChildren) "/" "[Gmail]/Drafts"',
  b'(\\HasNoChildren \\Important) "/" "[Gmail]/Important"',
  b'(\\HasNoChildren \\Sent) "/" "[Gmail]/Sent Mail"',
  b'(\\HasNoChildren \\Junk) "/" "[Gmail]/Spam"',
  b'(\\Flagged \\HasNoChildren) "/" "[Gmail]/Starred"',
\\All is the All Mail view, and it contains everything except the mail in \\Trash (and, in Gmail, \\Junk).
In [3]:
M.list()
Out[3]:
('OK',
 [b'(\\HasNoChildren) "/" "BookedEmail"',
  b'(\\HasNoChildren) "/" "SpecialBooking"',
  b'(\\HasChildren \\Noselect) "/" "[Gmail]"',
  b'(\\All \\HasNoChildren) "/" "[Gmail]/All Mail"',
  b'(\\HasNoChildren \\Trash) "/" "[Gmail]/Bin"',
  b'(\\Drafts \\HasNoChildren) "/" "[Gmail]/Drafts"',
  b'(\\HasNoChildren \\Important) "/" "[Gmail]/Important"',
  b'(\\HasNoChildren \\Sent) "/" "[Gmail]/Sent Mail"',
  b'(\\HasNoChildren \\Junk) "/" "[Gmail]/Spam"',
  b'(\\Flagged \\HasNoChildren) "/" "[Gmail]/Starred"',
  b'(\\HasNoChildren) "/" "automaticThings"'])
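Because the folder names can differ between accounts (the \\Trash folder above is called "[Gmail]/Bin", for instance), it can be handy to look a folder up by its special attribute rather than by name. The following is a minimal sketch of that idea, not a full IMAP parser: the names `parse_list_line` and `find_special_folder` are my own, and the regular expression assumes every line has the quoted `(flags) "delimiter" "name"` shape seen in the output above.

```python
import re

# Matches one LIST line such as: (\HasNoChildren \Trash) "/" "[Gmail]/Bin"
# Assumption: the delimiter and the folder name are both double-quoted.
LIST_LINE = re.compile(rb'\((?P<flags>[^)]*)\) "(?P<delim>[^"]*)" "(?P<name>.*)"$')


def parse_list_line(line):
    """Split one LIST response line into (flags, delimiter, folder name)."""
    match = LIST_LINE.match(line)
    if match is None:
        raise ValueError("Unrecognised LIST line: {0!r}".format(line))
    flags = match.group("flags").decode().split()
    return flags, match.group("delim").decode(), match.group("name").decode()


def find_special_folder(lines, attribute):
    """Return the name of the first folder carrying the attribute, or None."""
    for line in lines:
        flags, _delim, name = parse_list_line(line)
        if attribute in flags:
            return name
    return None


# A few lines copied from the M.list() output above.
sample = [
    b'(\\HasNoChildren) "/" "BookedEmail"',
    b'(\\All \\HasNoChildren) "/" "[Gmail]/All Mail"',
    b'(\\HasNoChildren \\Trash) "/" "[Gmail]/Bin"',
]
print(find_special_folder(sample, "\\Trash"))  # [Gmail]/Bin
```

With a live session you would pass the second element of the `M.list()` result in place of `sample`.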
The next stage is to select a folder that you'd like to start working on. When using imaplib you'll get a tuple response with the status and the result. The result in this case is the number of items in that particular folder.
In [4]:
typ, data = M.select('"[Gmail]/All Mail"') 
print(typ)
print(data)
OK
[b'634633']
We can search using the standard IMAP SEARCH command, which is documented in the IMAP RFC: https://tools.ietf.org/html/rfc3501
In [5]:
typ, data = M.search(None, '(ON 1-May-2018)', 'ALL') #Find all mail received on this date
print(typ)
print("Number of results: {0}".format(len(data[0].decode().split()))) #it returns a space separated list of message sequence numbers
OK
Number of results: 1657
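The search above hard-codes the date string. If the date comes from elsewhere in your program, a small helper can build the criterion for you. This is only a sketch: `imap_search_date`, `on_criterion` and `between_criterion` are names I made up, and the month names are spelled out by hand so the result does not depend on the system locale. SINCE and BEFORE are standard search keys from the same RFC.

```python
from datetime import date

# English month abbreviations, as the IMAP date format expects them.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_search_date(d):
    """Format a date the way IMAP SEARCH expects, e.g. 1-May-2018."""
    return "{0}-{1}-{2}".format(d.day, MONTHS[d.month - 1], d.year)


def on_criterion(d):
    """Criterion for mail received on exactly this date."""
    return "(ON {0})".format(imap_search_date(d))


def between_criterion(start, end):
    """Criterion for mail received on or after start and before end."""
    return "(SINCE {0} BEFORE {1})".format(
        imap_search_date(start), imap_search_date(end))


print(on_criterion(date(2018, 5, 1)))  # (ON 1-May-2018)
print(between_criterion(date(2018, 5, 1), date(2018, 6, 1)))
# (SINCE 1-May-2018 BEFORE 1-Jun-2018)
```

The returned string can be passed to `M.search(None, ...)` in place of the literal used above.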
Google has included its own special extensions, which in some cases are the only way to actually work with Gmail: https://developers.google.com/gmail/imap/imap-extensions Here are the two that I have needed to get things done:
  • X-GM-RAW: this allows you to use the full Gmail search syntax, and is much easier than pulling down an email and doing a text search on the body
  • X-GM-LABELS: this allows you to add labels to, and remove labels from, a particular piece of mail
In [6]:
typ, data = M.search(None, 'X-GM-RAW "{search} label:SpecialBooking"'.format(search="Subscription"), 'ALL')
print(typ)
print("Number of results: {0}".format(len(data[0].decode().split()))) #it returns a space separated list of message sequence numbers
OK
Number of results: 2568
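The X-GM-RAW argument is sent as a quoted string, so a Gmail query that itself contains double quotes (an exact phrase, say) needs them escaped first. Here is a rough sketch of one way to do that. The helper names `quote_imap_string` and `gmail_raw_criterion` are hypothetical, and I am assuming a plain ASCII query; anything else would need more care with the charset.

```python
def quote_imap_string(value):
    """Wrap value in double quotes, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"{0}"'.format(escaped)


def gmail_raw_criterion(query):
    """Build an X-GM-RAW search criterion from a Gmail search query."""
    return "X-GM-RAW {0}".format(quote_imap_string(query))


# The same search as above, built with the helper.
print(gmail_raw_criterion("Subscription label:SpecialBooking"))
# X-GM-RAW "Subscription label:SpecialBooking"

# A query with an exact phrase: the inner quotes get backslash-escaped.
print(gmail_raw_criterion('subject:"monthly invoice" label:SpecialBooking'))
```

The result is passed to `M.search(None, ...)` exactly like the hand-written string in the cell above.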
The data returned is a space separated list of message sequence numbers, which are indexes into the folder you currently have selected. Sequence numbers can shift when messages are expunged, so before you actually work on a message you should get its UID.
In [7]:
messageIdx = data[0].decode().split() #get a list of the indexes
for midx in messageIdx[:1]: #I'm just doing the first out of the list in this case
    resp, data = M.fetch(midx, "(UID)") #Have to get the "actual" UID...
    print(resp)
    print(data)
    messageUID = data[0].decode().split()[-1][:-1] #This is a quick and dirty hack to pull the standalone number out
    print(messageUID)
    typ, data = M.uid('STORE', messageUID, '+X-GM-LABELS', '"MyLabel"') #This adds MyLabel to the mail
    print(typ)
    print(data)
    #If you did -X-GM-LABELS it would remove the label
OK
[b'211 (UID 2706090)']
2706090
OK
[b'211 (X-GM-LABELS ("MyLabel" NewLabel) UID 2706090)']
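The string slicing used above to pull out the UID works, but it relies on the UID being the last token before the closing bracket. A slightly sturdier alternative, sketched below, is a regular expression. `extract_uid` is just a name I picked, and this is still not a full IMAP response parser (a label literally named "UID 123" could fool it).

```python
import re

# Looks for the token UID followed by a number anywhere in the response line.
UID_PATTERN = re.compile(rb'\bUID (\d+)')


def extract_uid(response_line):
    """Return the UID from a FETCH or STORE response line as a string."""
    match = UID_PATTERN.search(response_line)
    if match is None:
        raise ValueError("No UID in response: {0!r}".format(response_line))
    return match.group(1).decode()


# The two response lines from the output above.
print(extract_uid(b'211 (UID 2706090)'))  # 2706090
print(extract_uid(b'211 (X-GM-LABELS ("MyLabel" NewLabel) UID 2706090)'))  # 2706090
```

In the loop above this would replace the slicing line with `messageUID = extract_uid(data[0])`.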
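Now just clean up and close your connection. Note that close() only deselects the folder and returns you to the authenticated state; call M.logout() afterwards to actually end the session. Next time I'll show how you can pull a message down and parse the header and body so you can do further work.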
In [8]:
M.close()
Out[8]:
('OK', [b'Returned to authenticated state. (Success)'])