High Performance Python

Greetings All,

I recently came across a nice, short YouTube talk by Ben Lerner on how to improve the performance of Python code. Here is a quick rundown of the techniques he mentions that I found particularly useful:

Avoid accidentally quadratic code – take advantage of the built-in data types that give you O(1) lookups, and hence an O(n) loop, for free:

The running time of the code below is O(n^2). The square comes from searching the entire second list for each item in the first list. Nice.
A typical job interview question from a smug interviewer will be – can we do better than that? “Yes, we can”, you can reply, smiling even “smug-lier” – we will just use a Python set.

import numpy as np
import time

np.random.seed(1)

list1 = range(60000)
rand_arr = np.random.randint(0, 60000, 60000)  # high is exclusive, so values span 0-59999
list2 = rand_arr.tolist()  # WORST PERFORMANCE with a list
intersect = []

start_time = time.time()

for l1 in list1:
    if l1 in list2:
        intersect.append(l1)

print("List one has %s elements in common with list two" % len(intersect))
print("--- %s seconds ---" % (time.time() - start_time))

list2 = rand_arr  # BETTER PERFORMANCE with a NumPy array
intersect = []
start_time = time.time()

for l1 in list1:
    if l1 in list2:
        intersect.append(l1)

print("List one has %s elements in common with list two" % len(intersect))
print("--- %s seconds ---" % (time.time() - start_time))

list2 = set(rand_arr)  # BEST PERFORMANCE with a set
intersect = []
start_time = time.time()

for l1 in list1:
    if l1 in list2:
        intersect.append(l1)

print("List one has %s elements in common with list two" % len(intersect))
print("--- %s seconds ---" % (time.time() - start_time))


[Figure: timing comparison – list vs NumPy array vs set lookups]

Note that the only thing we have changed is to convert list2 to a set. This example is trivial and could be done with two sets and intersection(). But sometimes you need to perform a more complex task than finding an intersection, and keeping the “set” trick in mind will be helpful. Just remember that converting a list to a set removes duplicates.
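For the trivial case, the two-set version is a one-liner (the toy lists here are my own illustration):

```python
list1 = list(range(10))
list2 = [2, 3, 5, 7, 11, 11]  # note the duplicate

# Converting both to sets makes the intersection linear-time,
# and the duplicate 11 disappears in the process.
intersect = set(list1) & set(list2)  # same as set(list1).intersection(list2)
print(sorted(intersect))  # [2, 3, 5, 7]
```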

Function calls and global variables are expensive

In general, the advice is to avoid unnecessary function calls and avoid using global variables. Every function call carries overhead (a new stack frame has to be set up and torn down), and every access to a global variable goes through a dictionary lookup, whereas local variables are resolved much faster.
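A minimal sketch of the global-versus-local effect (the function names and the timing loop are my own illustration, not from the talk):

```python
import time

x = 5  # a global

def sum_with_global(n):
    total = 0
    for _ in range(n):
        total += x  # global dictionary lookup on every iteration
    return total

def sum_with_local(n):
    x_local = x  # bind the global to a local variable once
    total = 0
    for _ in range(n):
        total += x_local  # fast local lookup
    return total

for f in (sum_with_global, sum_with_local):
    start = time.time()
    f(5_000_000)
    print(f.__name__, "--- %s seconds ---" % (time.time() - start))
```

Both functions compute the same result; the local-variable version typically comes out measurably faster on CPython.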

Use map/filter/reduce instead of for-loops

map applies a function to every item of an iterable (i.e. map(function, iterable)).
filter is similar, and returns those items of an iterable for which the function is true.
reduce cumulatively applies a function to an iterable, reducing it to a single output value (note that in Python 3 it lives in functools).
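A quick sketch of all three on a toy list:

```python
from functools import reduce  # in Python 3, reduce lives in functools

nums = [1, 2, 3, 4, 5]

squares = list(map(lambda n: n * n, nums))        # [1, 4, 9, 16, 25]
evens = list(filter(lambda n: n % 2 == 0, nums))  # [2, 4]
total = reduce(lambda acc, n: acc + n, nums)      # 15
```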

Use cProfile to analyse slow code

cProfile can be used to generate code statistics to pin down that loop or that function call that slows down your code. It is very easy to use.
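A minimal usage sketch (the profiled function is my own toy example):

```python
import cProfile

def slow_sum(n):
    # deliberately loop-heavy so it shows up in the profile
    total = 0
    for i in range(n):
        total += i * i
    return total

# Prints a table of call counts and per-function times
cProfile.runctx("slow_sum(200000)", globals(), locals())
```

The output table shows, for each function, how many times it was called and how much time was spent in it, which is usually enough to spot the hot loop.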

In-line C with Cython

The main reason NumPy is fast is that much of its core is written in C. In-lining C with Python is not hard with Cython, which adds a small number of keywords to the Python language to give access to C/C++ static data types. Finally, Cython can also be used to simply compile Python into C.
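A minimal sketch of what a Cython module might look like (a hypothetical example.pyx of my own, not from the talk – it has to be compiled with Cython before it can be imported from Python):

```cython
# example.pyx – hypothetical; compile with Cython, then import from Python
def integrate_x_squared(double a, double b, int n):
    cdef double dx = (b - a) / n   # cdef declares C-typed variables
    cdef double total = 0.0
    cdef int i
    for i in range(n):
        total += (a + i * dx) ** 2 * dx
    return total
```

Because i, dx and total are declared with C types, the loop body runs as plain C arithmetic instead of manipulating Python objects, which is where the speed-up comes from.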

Hope this gives you enough useful pointers. Happy flying!