Python gotchas

Here is the thing – I am a big fan of Python programming language. Now that there is an Intel distribution of Python, I don’t think I ever want to write in any other language again…

Having said that, Python has its moments. Most of the examples below are based on Fluent Python book by Luciano Romalho. I highly recommend it to all Python programmers.

Here are some “gotchas” I am taking about:

***********************************
*   Leaking Variables 
*   Times what? 
*   An Inside Job
*   Deeply Shallow
*   Out of Order
*   We are all Sharing  
***********************************

Leaking Variables

In Python 2.x variables created inside list comprehension are leaked, offering nasty surprise.

x = "I need you later"
ctoten = [-1, -2, -3, -4, -5, -6, -7, -8, -9, -10]
abs_ctoten = [abs(x) for x in ctoten]
print("Oh no! ", x) # prints x to be -10

Note that this problem does not exist in generator expressions (aka. genexps):

y = "abcde"
w = "see you later"
upper_y = array.array('c',(str.upper(w) for w in y))
print ("Still here: ", w) # prints w to be "see you later"

Times what?

Let’s say I need a string of 20 a’s. I can simply create it like this:

twenty_as = "a"*20

Great. I now need a list of three lists. I proceed to create it with * and end up with another surprise!

abc_list = [['a', 'b','c']]*3
print abc_list
abc_list[1][1]='x'
print abc_list  # prints ['a', 'x', 'c'], ['a', 'x', 'c'], ['a', 'x', 'c']]

This happens because the abc_list is made of references to the same [‘a’, ‘b’, ‘c’] list. The solution is to ensure that each list a separate/new copy:

abc_list = [['a', 'b','c'] for i in range(3)]

An Inside Job

Tuples are immutable and one can take an advantage of this when an immutability is required. However, if you put a mutable object inside a tuple, keep in mind that it can still be changed.

imm = (1,2)
imm[0]+=1 # will throw an exception
imm2 = (1, 2, [3, 4])
imm2[2]+=[10] # succeeds to modify the inner list and throws an exception

Deeply Shallow

You did not think I was going to write a post on Python’s dark corners without touching on deep copying, did you?
Here is a nice little trick for you to create a shallow copy with a slicing operator. It works the first time, but fails the second time when we need a deep copy instead.

list1 = [1,2,3]
list2 = list1[:] # shallow copy
list2[2] = 5

print ([(l, k) for l, k in zip(list1, list2)]) # all good

list1 = [1, 2, 3, [8,9]]
list2=list1[:]  # shallow copy again
list2[3][0] = 7

print ([(l, k) for l, k in zip(list1, list2)]) # shows that both are modified

Out of Order

Unless you are using collections.OrderedDict, the order of Python’s dicts’s keys and values cannot be relied on. This has to do which how Python’s dicts are stored in the memory. Also, dicts equality is determined on the basis of key-item pairs, and not their order in the dict. Take a look at the example below. The output of this code is implementation dependent. Finally, adding new items to dicts will likely to reorder the keys. Python’s sets also do not guarantee a particular order will be maintained. There is no “orderedset” in the standard library, but if you need one, you can find a PyPi package (e.g. orderedset).

FRUIT_CODES = [
    ("orange", 1),
    ("apple", 45),
    ("banana", 70),
    ("grapes", 81),
    ("pineapple", 86),
    ("kiwi", 52),
    ("papaya", 413),
    ("mango", 55),
    ("lemon", 62),
    ("nectarine", 910)
]

orig = copy.copy(FRUIT_CODES)
sorted1 = sorted(FRUIT_CODES, key=lambda x:x[0])
sorted2 = sorted(FRUIT_CODES, key=lambda x:x[1])

fruit_dict = dict(FRUIT_CODES)
fruit_sorted_dict1 = dict(sorted1)
fruit_sorted_dict2 = dict(sorted2)

print fruit_dict.keys() == fruit_sorted_dict1.keys() and fruit_sorted_dict1.keys() == fruit_sorted_dict2.keys() # prints False or True (implementation dependent)
print fruit_dict == fruit_sorted_dict1 and fruit_sorted_dict1 == fruit_sorted_dict2 # prints True

We are all Sharing

In Python, mutable types are passed to functions by sharing. This means that a function/method can modify the parameter, but it cannot replace it with another object. Here is a typical “gotcha” with functions being able to modify its parameters:

def plusone(my_list):
    my_list.append(1)  # can modify
    

def newlife(my_list, your_list):
    my_list=your_list  # cannot replace with a new object

first_list = [2, 3, 4]
plusone(first_list)
print first_list # prints [2, 3, 4, 1]

second_list = [5, 6, 7]
newlife(first_list, second_list)
print first_list # prints [2, 3, 4, 1]

This should give you enough “food for thought”. Happy programming everyone! 🙂