The Differences Between Python Lists and NumPy Arrays

By George Bennett

If you're learning Python, you will quickly become familiar with the list data type. A list is an ordered collection of objects, whether they are numbers, strings, other collections, or anything else. NumPy arrays are similar to lists, but they normally hold elements of a single data type, and they simplify mathematical processes when you are dealing with numerical information. NumPy is an essential module to know for almost any numerical or data-oriented work in Python.

# import numpy module
import numpy as np

One example of NumPy array magic is performing basic arithmetic. Ordinarily, if you wanted to add each item in a list of numbers to the corresponding value in another list, you would need some type of loop. With arrays, you can simply add the two arrays together. See the example:

list_a = [1, 2, 3]
list_b = [4, 5, 6]
# add the corresponding values of each list together
# without numpy
sum_list = []
for index in range(len(list_a)):
    sum_list.append(list_a[index] + list_b[index])
# with numpy
sum_array = np.array(list_a) + np.array(list_b)

Using NumPy is also much less computationally expensive on large inputs, because the element-by-element loop runs in compiled code rather than in the Python interpreter. The same technique applies to subtraction, multiplication, division, and so on. As you can see, you can turn any list of numbers into an array by calling np.array and passing the list.
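To make the "multiplying, dividing, and so on" point concrete, here is a minimal sketch that reuses the two lists from the example above. The variable names other than list_a and list_b are made up for illustration. It also shows a common surprise: using + on plain lists joins them instead of adding their values.

```python
import numpy as np

list_a = [1, 2, 3]
list_b = [4, 5, 6]

array_a = np.array(list_a)
array_b = np.array(list_b)

# element-wise arithmetic: each result has the same length as the inputs
product = array_a * array_b      # 1*4, 2*5, 3*6
quotient = array_b / array_a     # 4/1, 5/2, 6/3 (true division gives floats)
doubled = array_a * 2            # a single number is applied to every element

# plain lists behave differently: + concatenates them
concatenated = list_a + list_b

print(product.tolist())      # [4, 10, 18]
print(quotient.tolist())     # [4.0, 2.5, 2.0]
print(doubled.tolist())      # [2, 4, 6]
print(concatenated)          # [1, 2, 3, 4, 5, 6]
```

The .tolist() calls are only there to print the results as ordinary Python lists; the arithmetic itself works directly on the arrays.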

Another cool thing about arrays is that you can get statistical measures quickly. Simply call methods on an array, such as .sum() for the total or .var() for the variance. The median is an exception: arrays have no .median() method, so call the function np.median() and pass the array instead.

# find the median of an array
print(np.median(sum_array))
>>>7.0
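As a slightly fuller sketch of these summary statistics, the block below rebuilds the values of sum_array (5, 7, and 9) and computes several measures at once. The name values is just for illustration. Note that .var() computes the population variance by default; passing ddof=1 gives the sample variance.

```python
import numpy as np

values = np.array([5, 7, 9])   # the same values as sum_array above

total = values.sum()                  # 5 + 7 + 9
mean = values.mean()                  # total divided by the number of items
variance = values.var()               # population variance (ddof=0)
sample_variance = values.var(ddof=1)  # sample variance divides by n - 1
median = np.median(values)            # median is a function, not an array method

print(total)            # 21
print(mean)             # 7.0
print(sample_variance)  # 4.0
print(median)           # 7.0
```

The population variance here is 8/3, or roughly 2.67, since the squared distances from the mean are 4, 0, and 4.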

Arrays don’t stop there, though. They can be multidimensional and have a lot of nifty methods for reshaping. A 2-dimensional array can be thought of as an array of arrays. You can access the shape of an array through its .shape attribute. You can call .reshape on a 3x2 array to make it a 2x3 array. You can also transpose a matrix with .T, which is an attribute rather than a method, so it takes no parentheses.

# return shape of an array
twoDarray = np.array([[1, 2], [3, 4], [5, 6]])
twoDarray.shape
>>>(3, 2)
# reshape 3x2 array into a 2x3 array
twoDarray = twoDarray.reshape(2, 3)
print(twoDarray.shape)
twoDarray
>>>
(2, 3)
array([[1, 2, 3],
       [4, 5, 6]])
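Reshaping and transposing both turn a 3x2 array into a 2x3 array, but they are not the same operation. The sketch below, using an illustrative array named grid with the same values as above, shows the difference: reshape keeps the elements in reading order, while the transpose swaps rows and columns.

```python
import numpy as np

grid = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)

reshaped = grid.reshape(2, 3)   # same reading order, new row length
transposed = grid.T             # rows become columns (attribute, no parentheses)
flattened = grid.reshape(-1)    # -1 asks numpy to work out the length itself

print(reshaped.tolist())     # [[1, 2, 3], [4, 5, 6]]
print(transposed.tolist())   # [[1, 3, 5], [2, 4, 6]]
print(flattened.tolist())    # [1, 2, 3, 4, 5, 6]
print(transposed.shape)      # (2, 3)
```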

Many Python libraries are built upon NumPy. For instance, pandas is built on top of it and uses NumPy arrays and data types throughout its DataFrames. Scikit-learn relies on it as well. NumPy greatly increases computational efficiency, and anyone who does numerical work in Python should be familiar with it.

Data scientist learning at Flat Iron School