
Thursday, November 29, 2012

NumPy arrays vs regular arrays - common operations

NumPy is a set of scientific computing tools for Python. I'm not going to cover the whole package here; I mention NumPy because it comes with an optimized array type, numpy.ndarray, which you usually create with numpy.array. Unlike regular Python lists (which I'll loosely call arrays), NumPy arrays work well with matrix/vector operators.

Lists and NumPy arrays do not share a common interface, but both support iteration and both implement the __getitem__ and __setitem__ methods. The first thing that differs is the initialization (obviously):

>>> import numpy as np
>>> a = np.array([1,2,3,4,5])
>>> b = [1,2,3,4,5]
>>> a
array([1, 2, 3, 4, 5])
>>> b
[1, 2, 3, 4, 5]
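
Since both types support iteration, indexing and item assignment, code that sticks to those operations runs unchanged on either one. Here is a minimal sketch of that idea; the helper name double_in_place is made up for this example and is not part of NumPy or the standard library:

```python
import numpy as np

def double_in_place(seq):
    """Hypothetical helper: doubles every element using only len(),
    __getitem__ and __setitem__, so lists and NumPy arrays both work."""
    for i in range(len(seq)):
        seq[i] = seq[i] * 2
    return seq

np_result = double_in_place(np.array([1, 2, 3]))   # NumPy array in, NumPy array out
list_result = double_in_place([1, 2, 3])           # list in, list out
```

Both calls end up holding 2, 4, 6; only the container type differs.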



So far it's almost the same: you can iterate over both, and get and set elements by index. However, when you want to add or remove elements you have to act differently, by using the functions NumPy provides:

>>> np.delete(a, 2)
array([1, 2, 4, 5])
>>> del b[2]
>>> b
[1, 2, 4, 5]


The main difference here is that deleting a list element modifies the object in place, while np.delete returns a new array with the element removed and leaves the original untouched. np.append works in a similar manner: it returns a new array with the given elements (or a whole other array) concatenated at the end.
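
To make the copy-versus-in-place point concrete, here is a small sketch. The variable names are only for illustration:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

without_third = np.delete(a, 2)   # new array without the element at index 2
with_six = np.append(a, 6)        # new array with a single element added
joined = np.append(a, [6, 7])     # new array with another sequence concatenated

# a itself still holds 1, 2, 3, 4, 5 at this point

b = [1, 2, 3, 4, 5]
del b[2]                          # the list is modified in place
```

If you want the NumPy result to "stick", you have to rebind the name yourself, for example a = np.delete(a, 2).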

Now let's look at the filter function. These two instructions are equivalent:

>>> b = [1,2,3,4,5]
>>> list(filter(lambda x: x>2, b))
[3, 4, 5]
>>> a[a>2]
array([3, 4, 5])
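
The NumPy version works in two steps: the comparison builds a boolean array (a mask), and indexing with that mask keeps only the positions where it is True. A short sketch that spells the steps out, with a plain list comprehension for comparison:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

mask = a > 2          # elementwise comparison: [False, False, True, True, True]
selected = a[mask]    # keeps the elements where the mask is True

b = [1, 2, 3, 4, 5]
filtered = [x for x in b if x > 2]   # the usual list idiom
```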


And for the map function:

>>> np_map = np.vectorize(lambda x: x**2)
>>> np_map(a)
array([ 1,  4,  9, 16, 25])
>>> list(map(lambda x: x**2, b))
[1, 4, 9, 16, 25]
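
A side note on np.vectorize: it is a convenience wrapper that still calls the Python function once per element, so for something as simple as squaring you can apply the operator to the array directly. A sketch comparing the options (same numbers as above, variable names are mine):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

squared_vectorize = np.vectorize(lambda x: x**2)(a)   # wraps a Python function
squared_operator = a ** 2                             # operator applied elementwise

squared_list = [x**2 for x in [1, 2, 3, 4, 5]]        # list comprehension for lists
```

All three produce 1, 4, 9, 16, 25. np.vectorize is mostly useful when the function can't easily be written with array operators.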


Let's move on and see where NumPy arrays have more potential: vector operations. Suppose we want to sum two vectors. Using regular Python lists you have to combine elements with the zip and sum functions, or just iterate and create the sums on the fly. An example solution:

>>> b = [1,2,3,4,5]
>>> d = [3,3,3,4,4]
>>> list(map(sum, zip(b, d)))
[4, 5, 6, 8, 9]


Now for the NumPy solution:

>>> a = np.array([1,2,3,4,5])
>>> b = np.array([3,3,3,4,4])
>>> a + b
array([4, 5, 6, 8, 9])
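
A quick contrast worth keeping in mind, as a sketch: on plain lists the + operator concatenates, while on NumPy arrays it (and the other arithmetic and comparison operators) works element by element. The variable names below are just for this example:

```python
import numpy as np

list_b = [1, 2, 3, 4, 5]
list_d = [3, 3, 3, 4, 4]
concatenated = list_b + list_d   # lists: + glues the two lists together

a = np.array(list_b)
d = np.array(list_d)

added = a + d        # elementwise sum
difference = a - d   # elementwise difference
product = a * d      # elementwise product
remainder = a % d    # elementwise modulo
greater = a > d      # elementwise comparison, gives a boolean array
```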


And this works elementwise for Python's arithmetic and comparison operators (+, -, *, /, **, %, <, > and so on), which is just awesome. The keywords and, or and not are the exception, since Python doesn't allow overloading them. When doing 'scientific' stuff, it's good to use appropriate data types. What's more, NumPy is widely used by other projects, so running into a numpy.array from time to time shouldn't surprise anybody. I guess matrices and more complex operations will be covered some other day.

Cheers!
KR

