Introduction to NumPy¶
Introducing NumPy¶
NumPy is a library for Python designed for efficient scientific (numerical) computing. It is an essential library in Python that is used under the hood in many other modules. Here, we will get a sense of a few things NumPy can do.
Importing NumPy¶
To start using the NumPy module we will need to import it.
import numpy as np
The import library as syntax can be used to give the library a different name in memory.
Since we may want to use NumPy many times, shortening numpy to np is helpful.
Creating NumPy arrays of values¶
A common NumPy task is to create an array whose values range from one value to another.
A NumPy array is similar in concept to a Python list, but only contains data of one type.
If we wanted to calculate the sin() of a variable x at 10 points from zero to 2 * pi, we could do the following.
x = np.linspace(0., 2 * np.pi, 10)
print(x)
[0. 0.6981317 1.3962634 2.0943951 2.7925268 3.4906585
4.1887902 4.88692191 5.58505361 6.28318531]
In this case, x starts at zero and goes to 2 * pi in 10 evenly spaced points, with both end values included.
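To see the spacing that np.linspace() chose, we can ask for it with the retstep=True option. The short sketch below is only an illustration; the variable name step is ours and not part of the lesson.

```python
import numpy as np

# retstep=True asks linspace to also return the spacing it used
x, step = np.linspace(0.0, 2 * np.pi, 10, retstep=True)

print(len(x))        # number of points in the array
print(x[0], x[-1])   # first and last values (both end values are included)
print(step)          # spacing between neighbouring points, (2 * pi) / 9
```

Ten points leave nine gaps between them, which is why the spacing is 2 * pi divided by 9 rather than by 10.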
Alternatively, if we wanted to specify the size of the increments for a new variable x2, we could use the np.arange() function.
x2 = np.arange(0.0, 2 * np.pi, 0.5)
print(x2)
[0. 0.5 1. 1.5 2. 2.5 3. 3.5 4. 4.5 5. 5.5 6. ]
In this case, x2 starts at zero and goes up to the largest value that is smaller than 2 * pi in increments of 0.5.
Both ways of creating arrays are useful in different situations.
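A side-by-side comparison may make the difference clearer. In this sketch the names stop, by_count and by_step are our own choices for illustration.

```python
import numpy as np

stop = 2 * np.pi

# linspace: we choose the number of points; the end value is included
by_count = np.linspace(0.0, stop, 10)

# arange: we choose the step size; the end value is excluded
by_step = np.arange(0.0, stop, 0.5)

print(by_count.size, by_count[-1])
print(by_step.size, by_step[-1])
```

np.linspace() is convenient when we know how many values we need, while np.arange() is convenient when the spacing itself matters.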
Math using NumPy¶
To calculate the sine function values, we can simply use the np.sin() function.
sine = np.sin(x)
print(sine)
[ 0.00000000e+00 6.42787610e-01 9.84807753e-01 8.66025404e-01
3.42020143e-01 -3.42020143e-01 -8.66025404e-01 -9.84807753e-01
-6.42787610e-01 -2.44929360e-16]
Note that performing calculations on NumPy arrays will produce output in an array.
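For comparison, here is a rough sketch of the same calculation done one value at a time with the standard math module. The name sine_loop is hypothetical and only used here.

```python
import math
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 10)

# Pure-Python version: loop over the values one at a time
sine_loop = [math.sin(value) for value in x]

# NumPy version: one call is applied to every element of the array
sine = np.sin(x)

# Both approaches give the same numbers
print(np.allclose(sine, sine_loop))
```

The NumPy version is shorter, and it returns an array with the same shape as its input.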
NumPy array data types¶
As before, we can check out the type of data in our arrays x and x2 using the type() function.
type(x)
numpy.ndarray
OK, so we have something new here. NumPy has its own data types that are part of the module. In this case, our data is stored in a NumPy n-dimensional array.
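As a small illustration, we can compare an ordinary Python list with an ndarray and convert one into the other using np.array(). The names values and as_array are made up for this example.

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 10)
values = [0.0, 0.5, 1.0]            # an ordinary Python list

print(type(values))                 # the built-in list type
print(type(x))                      # NumPy's ndarray type
print(isinstance(x, np.ndarray))    # another way to check the type

# np.array() converts a list into an ndarray
as_array = np.array(values)
print(type(as_array))
```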
Size of NumPy arrays¶
How much data do we have in our x variable?
print(x.shape)
(10,)
10 values along a single dimension.
The shape has one entry per dimension, so a one-dimensional array such as x has only one number in its shape and no separate column count.
shape is a member or attribute of x, and is part of any NumPy ndarray.
Printing x.shape tells us the size of the array.
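The following sketch shows how shape changes when the same 10 values are arranged in two dimensions. It uses the reshape() method, which the lesson has not introduced, and the names grid and column are ours.

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 10)

# one dimension: the shape has a single entry
print(x.shape, x.ndim, x.size)

# the same 10 values arranged as 2 rows and 5 columns
grid = x.reshape(2, 5)
print(grid.shape, grid.ndim, grid.size)

# an explicit single column: 10 rows, 1 column
column = x.reshape(10, 1)
print(column.shape)
```

The ndim attribute gives the number of dimensions and size gives the total number of values, whatever the shape.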
Type of data in NumPy arrays¶
We can also check the data type of the values in our array by using x.dtype.
print(x.dtype)
float64
OK, so it seems that all the data in our array is of the float data type, i.e., decimal numbers (stored with a precision of 64 bits).
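A brief sketch of how the data type relates to storage, and of what happens when integers and decimal numbers are mixed. The names mixed and small are only for illustration.

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 10)

print(x.dtype)       # data type of the values
print(x.itemsize)    # bytes used per value: 8 bytes = 64 bits

# Mixing integers and decimal numbers promotes every value to float
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)

# A data type can also be requested explicitly
small = np.zeros(3, dtype=np.float32)
print(small.dtype, small.itemsize)
```

Because an array holds only one data type, NumPy picks a type that can represent every value it is given.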
Index values in NumPy arrays¶
Like lists, we can find any value in an array by using its indices.
We can also extract parts of an array using index slicing.
Perhaps we only want the first three values out of array x.
x[0:3]
array([0. , 0.6981317, 1.3962634])
Nice! Note that in this case, the range of index values for the first 3 values is written as 0:3.
The data extracted will start at index 0 and go up to, but not include, index 3.
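A few more indexing and slicing patterns, sketched on the same array x. These follow the same rules as Python lists.

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 10)

print(x[0])      # first value
print(x[-1])     # last value (negative indices count from the end)
print(x[:3])     # same as x[0:3]
print(x[7:])     # from index 7 to the end
print(x[::2])    # every second value
```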
Useful functions¶
Basic math¶
Like normal variables, array variables can also be used for various mathematical operations.
doublex = x * 2.0
print(doublex)
[ 0. 1.3962634 2.7925268 4.1887902 5.58505361 6.98131701
8.37758041 9.77384381 11.17010721 12.56637061]
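Operations between two arrays of the same shape also work element by element. The sketch below uses names of our own (total, squared, identity) to show a few cases.

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 10)
doublex = x * 2.0

total = x + doublex                          # element-by-element sum, equal to 3 * x
squared = x ** 2                             # each value squared
identity = np.sin(x) ** 2 + np.cos(x) ** 2   # should be 1 for every value

print(np.allclose(total, 3 * x))
print(np.allclose(identity, 1.0))
```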
NumPy methods¶
In addition to the attributes we saw previously for NumPy ndarray variables, there are also many methods that are part of the ndarray data type.
print(x.mean())
print(doublex.mean())
3.141592653589793
6.283185307179586
No surprises here. If we think of variables as nouns, methods are verbs, actions for the variable values.
NOTE: When using methods, you always include the parentheses () to be clear we are referring to a method and not an attribute.
There are many other useful ndarray methods, such as x.min(), x.max(), and x.std() (standard deviation).
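A short sketch of these methods in use. It also shows argmax(), a method the lesson does not mention, which returns the index of the largest value.

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 10)

print(x.min(), x.max())   # smallest and largest values
print(x.std())            # standard deviation

# Many methods also exist as functions in the np namespace
print(np.isclose(np.mean(x), x.mean()))

# argmax() gives the index of the largest value rather than the value itself
print(x.argmax())
```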
NumPy methods on data slices¶
Methods can also be applied to a slice of an array, for example to find the mean of the first five values of x.
print(x[0:5].mean())
1.3962634015954634
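As a further illustration, a slice can be stored in its own variable before a method is called on it. The names first_half and second_half are hypothetical.

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 10)

first_half = x[:5]     # values at indices 0 to 4
second_half = x[5:]    # values at indices 5 to 9

print(first_half.mean())
print(second_half.mean())

# The halves have equal length, so their means average to the overall mean
print(np.isclose((first_half.mean() + second_half.mean()) / 2, x.mean()))
```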
Arrays of zeros or ones¶
It is pretty common that you will need to create arrays full of zeros or ones to store output from calculations. NumPy includes well-named functions for doing this.
zeros = np.zeros(10)
print(zeros)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
ones = np.ones(10)
print(ones)
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
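One way this might look in practice is sketched below: an array of zeros is created first and then filled with results. The names result and grid are ours, and the loop is only there to show the storage pattern.

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 10)

# Create an output array of zeros with the same shape as x
result = np.zeros(x.shape)

# Fill it one value at a time (for this simple case, x ** 2 alone
# would give the same answer without a loop)
for i in range(x.size):
    result[i] = x[i] ** 2

print(result)

# Shapes with more than one dimension are given as a tuple
grid = np.ones((2, 3))
print(grid.shape)
```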
Caution: Copying NumPy arrays¶
A word of caution about copying arrays. As with lists and other mutable Python objects, assigning an existing NumPy array to a new variable does not create a copy of the array; it simply creates a second reference (a pointer) to the original array. Consider the example below.
a = np.ones(10)
b = a
a += 4
print(a)
[5. 5. 5. 5. 5. 5. 5. 5. 5. 5.]
print(b)
[5. 5. 5. 5. 5. 5. 5. 5. 5. 5.]
Oh no!
Here, we can see that even after assigning the values of array a to array b, changes to a will affect b.
This is because array b is simply a reference to a.
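We can check whether two names refer to the same array with Python's is operator. The sketch below also shows a related subtlety: a += 4 changes the existing array in place, whereas a = a + 1 builds a new array and gives it the name a.

```python
import numpy as np

a = np.ones(3)
b = a                  # b refers to the same array object as a
print(b is a)

a += 4                 # in-place update: the shared array changes
print(b)

a = a + 1              # builds a NEW array and rebinds the name a
print(b is a)          # b still refers to the old array
print(a, b)
```

After the last step, a and b are two separate arrays, so later changes to one no longer show up in the other.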
But what if we want to save the values of a to another array without having them change when a changes?
For this we need to use np.copy().
c = np.copy(a)
a += 3
print(a)
[8. 8. 8. 8. 8. 8. 8. 8. 8. 8.]
print(c)
[5. 5. 5. 5. 5. 5. 5. 5. 5. 5.]
np.copy() creates a complete copy of the referenced array that is independent of its source.
Making a complete copy costs extra memory and time, so plain assignment only creates a reference, and a copy is made only when we ask for one.
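The same caution applies to slices: basic slicing of an array gives a view onto the original data rather than a copy. The sketch below uses np.shares_memory() to check this; the names view and saved are our own.

```python
import numpy as np

a = np.ones(5)

view = a[0:3]           # basic slicing gives a view onto the data of a
saved = a[0:3].copy()   # .copy() is the method form of np.copy()

a += 4                  # in-place change to the original array

print(view)             # follows the change to a
print(saved)            # keeps the original values
print(np.shares_memory(a, view), np.shares_memory(a, saved))
```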
Exercise - Mean of the cosine¶
Create a NumPy array x3 with a range of -π to +π (inclusive) with 20 increments.
Calculate the cosine of x3 and store it as cosine.
What is the mean value of cosine? Is this value what you expect?
What happens if you use a larger number of increments for x3?
# As was the case before, use this cell to complete the exercise