# Python Data Analysis and Visualization – Week One/Unit One/Introduction to NumPy

## Dimensions of Data

• How a set of data is organized expresses the relationships within it
• One-dimensional data -> set, list or array
• List: elements may have different types
• Array: all elements have the same type
• Two-dimensional data -> two-dimensional list (a list of lists)
• Three-dimensional data -> extends along a further dimension, such as time
• High-dimensional data -> expresses complex structure using only basic binary relationships between data items -> organized as key-value pairs (like JSON)
• Stored as a dictionary or in JSON/XML/YAML
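
As a rough illustration of the organizations listed above, the sketch below builds data of each dimensionality with plain Python objects. All variable names and values are made up for the example.

```python
import json

# One-dimensional data: a Python list may mix element types.
mixed_list = [3.14, "pi", 3]

# Two-dimensional data: a list of equally long lists (rows x columns).
table = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

# Three-dimensional data: the same kind of table recorded at two points in time.
table_over_time = [table, [[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]]

# High-dimensional data: nested key-value pairs, serializable as JSON.
record = {
    "name": "sensor-1",
    "location": {"lat": 39.9, "lon": 116.4},
    "readings": [1.0, 2.0],
}
record_json = json.dumps(record)

print(len(table), len(table[0]))                   # rows, columns
print(json.loads(record_json)["location"]["lat"])  # value looked up by key
```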

## NumPy Array Object: ndarray

• NumPy: the open-source foundation library for scientific computing in Python
• An N-dimensional array object: ndarray
• Broadcasting functions, used for calculations between arrays
• Tools for integrating C/C++/Fortran code
• Linear algebra, Fourier transforms and random number generation
• NumPy is the foundation of SciPy and Pandas

• Import NumPy with `import numpy as np`
• Example: calculate `A**2 + B**3`, where A and B are arrays

• An array object removes the explicit loop over elements, so an array can be handled like a single value
• A dedicated array object speeds up calculation (the underlying implementation is in C)
• In scientific computing, all the data in a set usually have the same type
• Storing elements of a single type in an array saves space
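
The example mentioned above can be sketched both ways to show what "removing the loop" means. The function names `py_sum` and `np_sum` and the sample values are chosen here for illustration only.

```python
import numpy as np

def py_sum(a, b):
    """Plain Python: visit the elements one pair at a time."""
    return [x**2 + y**3 for x, y in zip(a, b)]

def np_sum(a, b):
    """NumPy: the arrays are written as if they were single values."""
    a = np.array(a)
    b = np.array(b)
    return a**2 + b**3

A = [0, 1, 2, 3, 4]
B = [9, 8, 7, 6, 5]

print(py_sum(A, B))  # [729, 513, 347, 225, 141]
print(np_sum(A, B))  # same values, returned as an ndarray
```

Both versions give the same numbers; the NumPy version leaves the element-wise loop to compiled code.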

• ndarray is a multidimensional array object made up of two parts:
• The actual data
• Metadata describing the data (dimensions, element type)
• Indices start from 0

• Elements are separated by spaces when an array is printed
• axis: a dimension of the data
• rank: the number of axes

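| Attribute | Description |
| --- | --- |
| `.ndim` | number of dimensions (axes) |
| `.shape` | size along each dimension, e.g. n rows and m columns |
| `.size` | number of elements, `n * m` for n rows and m columns |
| `.dtype` | element type |
| `.itemsize` | size of each element in bytes |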
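
A small sketch of these attributes on a 2-by-5 array. The values are arbitrary, and `dtype=np.int32` is set explicitly so that the output does not depend on the platform default.

```python
import numpy as np

a = np.array([[0, 1, 2, 3, 4],
              [9, 8, 7, 6, 5]], dtype=np.int32)

print(a.ndim)      # 2
print(a.shape)     # (2, 5)
print(a.size)      # 10
print(a.dtype)     # int32
print(a.itemsize)  # 4
```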

### Type of Element

| Type | Description |
| --- | --- |
| `bool` | Boolean, True or False (1/0) |
| `intc` | same as `int` in C, int32 or int64 |
| `intp` | integer used for indexing, similar to `ssize_t` in C, int32 or int64 |
| `int8` | 8-bit (one byte) integer, [-128, 127] |
| `int16` | 16-bit integer |
| `int32` | 32-bit integer |
| `int64` | 64-bit integer |
| `uint8` | 8-bit unsigned integer, [0, 255] |
| `uint16` | 16-bit unsigned integer |
| `uint32` | 32-bit unsigned integer |
| `uint64` | 64-bit unsigned integer |
| `float16` | 16-bit half-precision floating-point number |
| `float32` | 32-bit single-precision floating-point number |
| `float64` | 64-bit double-precision floating-point number |
| `complex64` | complex number whose real and imaginary parts are both 32-bit floats (`.real` + `.imag`) |
| `complex128` | complex number whose real and imaginary parts are both 64-bit floats |

For elements that cannot share one common type:

• They are stored as Python objects, shown as dtype `O`
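
The sketch below illustrates a few of the element types and the object fallback. The sample values are arbitrary. `dtype=object` is passed explicitly for the ragged input, since recent NumPy versions raise an error for ragged nested sequences unless it is requested.

```python
import numpy as np

# Homogeneous data gets a compact numeric element type.
small = np.array([-128, 0, 127], dtype=np.int8)

# complex64 stores two 32-bit floats per element.
z = np.array([1 + 2j], dtype=np.complex64)

# Rows of different lengths cannot form a regular numeric array,
# so each row is kept as a reference to a Python object.
ragged = np.array([[0, 1, 2], [9, 8]], dtype=object)

print(small.dtype, small.itemsize)   # int8 1
print(z.real.dtype, z.imag.dtype)    # float32 float32
print(ragged.dtype, ragged.shape)    # object (2,)
```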

## Creation and Transformation of ndarray

### Methods of Creation

• Create from a Python list or tuple
• `x = np.array(list/tuple)`
• `x = np.array(list/tuple, dtype=np.float32)`
• Without `dtype`, NumPy infers the element type from the data

• Lists and tuples may be mixed as long as the inner sequences have the same number of elements
• Create with NumPy functions such as `arange`, `ones` and `zeros`

| Function | Description |
| --- | --- |
| `np.arange(n)` | similar to `range()`, returns an ndarray with elements from 0 to n-1 |
| `np.ones(shape)` | `shape` is a tuple; returns an array of that shape filled with ones |
| `np.zeros(shape)` | `shape` is a tuple; returns an array of that shape filled with zeros |
| `np.full(shape, val)` | returns an array of that shape with every element equal to `val` |
| `np.eye(n)` | returns an n-by-n array with ones on the main diagonal and zeros elsewhere |
| `np.ones_like(a)` | array of ones with the same shape as `a` |
| `np.zeros_like(a)` | array of zeros with the same shape as `a` |
| `np.full_like(a, val)` | array filled with `val`, with the same shape as `a` |
| `np.linspace(start, stop, num, endpoint=True)` | returns `num` evenly spaced values over the interval; `endpoint` sets whether `stop` is included as the last value |
| `np.concatenate()` | joins two or more arrays into a new array |

• Create from raw bytes
• Create from a file
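
A brief sketch of the list-based and function-based creation methods above. The shapes and values are arbitrary examples, and creation from bytes or files is not shown.

```python
import numpy as np

from_mixed = np.array([[1, 2], (3, 4)])        # list and tuple mixed, same length
as_float = np.array([1, 2, 3], dtype=np.float32)

r = np.arange(5)                               # 0, 1, 2, 3, 4
ones = np.ones((2, 3))                         # float64 unless dtype is given
zeros = np.zeros((2, 3), dtype=np.int32)
sevens = np.full((2, 2), 7)
identity = np.eye(3)
like = np.zeros_like(from_mixed)               # same shape and type as from_mixed

with_end = np.linspace(1, 10, 4)                     # 1, 4, 7, 10
without_end = np.linspace(1, 10, 4, endpoint=False)  # 1, 3.25, 5.5, 7.75

joined = np.concatenate((with_end, without_end))     # 8 elements

print(from_mixed.shape, ones.dtype, joined.size)
```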

### Transformation Methods

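• Dimension and type transformations
• `reshape()` needs the total number of elements to stay the same; `resize()` is used under the same constraint here
• Pay attention to whether a method modifies the original array
• `np.int` left the choice between int32 and int64 to the program; it was an alias for Python's `int` and was removed in NumPy 1.24, so use `int` or an explicit type such as `np.int64`
• `astype()` returns a new array by default, even when the type is unchanged, so it can also be used to copy an array

| Method | Description |
| --- | --- |
| `.reshape(shape)` | returns an array with the given shape; the original array is unchanged |
| `.resize(shape)` | changes the original array to the given shape in place |
| `.swapaxes(ax1, ax2)` | swaps two of the array's dimensions |
| `.flatten()` | returns a new one-dimensional copy; the original array is unchanged |
| `.astype(new_type)` | returns a new array whose elements are converted to `new_type` |
| `.tolist()` | converts the array to a (nested) Python list |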
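
The following sketch runs the methods from the table on a 24-element array, mainly to show which calls leave the original untouched and which modify it. The shapes are arbitrary choices for the example.

```python
import numpy as np

a = np.arange(24)

b = a.reshape((2, 3, 4))   # new array object; a keeps its shape (24,)
c = b.swapaxes(0, 2)       # shape becomes (4, 3, 2)
flat = b.flatten()         # one-dimensional copy of b
f = a.astype(np.float64)   # new array with float elements
as_list = b.tolist()       # nested Python lists

d = np.arange(24)
result = d.resize((4, 6))  # modifies d in place and returns None

print(a.shape, b.shape, c.shape, flat.shape)
print(f.dtype, d.shape, result)
```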

## Operations of ndarray

• Indexing and slicing
• Index: gets the element at a specific position
• Slice: gets a subset of the array

### One Dimension Array

• Index: works as for a Python list
• Slice: `[arg1 : arg2 : arg3]`
• `arg1`: start position
• `arg2`: end position (not included)
• `arg3`: step size
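
A minimal example of indexing and slicing a one-dimensional array, using arbitrary values:

```python
import numpy as np

a = np.array([9, 8, 7, 6, 5])

print(a[2])      # 7
print(a[-1])     # 5, negative indices count from the end
print(a[1:4:2])  # [8 6], from position 1 up to (not including) 4, step 2
```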

### Multiple Dimensions Array

• `::step` slices with a stride, taking every `step`-th element
• `:` selects all elements along one dimension
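
For a multidimensional array, one index or slice is given per dimension, separated by commas. The sketch below assumes a 2 x 3 x 4 array filled with 0 to 23 so that the selected values are easy to check by hand.

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))

print(a[1, 2, 3])          # 23
print(a[-1, -2, -3])       # 17
print(a[:, 1, -3])         # [ 5 17], all blocks, row 1, third column from the end
print(a[:, 1:3, :].shape)  # (2, 2, 4)
print(a[:, :, ::2].shape)  # (2, 3, 2), every second column
```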

## Calculations of ndarray

### Calculate with Scalar

• Equivalent to operation of each element
• .mean() to get arithmetical average

| Function | Description |
| --- | --- |
| `np.abs(x)` `np.fabs(x)` | absolute value of each element |
| `np.sqrt(x)` | square root of each element |
| `np.square(x)` | square of each element |
| `np.log(x)` | natural logarithm of each element |
| `np.log10(x)` | base-10 logarithm of each element |
| `np.log2(x)` | base-2 logarithm of each element |
| `np.ceil(x)` | rounds each element up to the nearest integer |
| `np.floor(x)` | rounds each element down to the nearest integer |
| `np.rint(x)` | rounds each element to the nearest integer |
| `np.modf(x)` | splits each element into its fractional and integer parts, returned as two arrays |
| `np.cos(x)` `np.cosh(x)` `np.sin(x)` `np.sinh(x)` `np.tan(x)` `np.tanh(x)` | trigonometric and hyperbolic functions of each element |
| `np.exp(x)` | exponential of each element |
| `np.sign(x)` | sign of each element: 1, 0 or -1 |
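
A short sketch of a scalar operation and a few of the element-wise functions from the table. The input values are arbitrary and chosen so that the rounding direction is visible for a negative number.

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))
mean = a.mean()            # 11.5
scaled = a / mean          # every element divided by the scalar

x = np.array([-1.5, 0.0, 2.25])
absolute = np.abs(x)       # 1.5, 0.0, 2.25
squares = np.square(x)     # 2.25, 0.0, 5.0625
up = np.ceil(x)            # -1.0, 0.0, 3.0 (rounds toward +infinity)
down = np.floor(x)         # -2.0, 0.0, 2.0 (rounds toward -infinity)
signs = np.sign(x)         # -1.0, 0.0, 1.0
frac, whole = np.modf(x)   # fractional parts first, then integer parts

print(mean, scaled.shape)
print(up, down, signs)
print(frac, whole)
```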

| Function | Description |
| --- | --- |
| `+` `-` `/` `*` | arithmetic between corresponding elements of two arrays |
| `np.maximum(x,y)` `np.fmax()` | maximum of each pair of corresponding elements |
| `np.minimum(x,y)` `np.fmin()` | minimum of each pair of corresponding elements |
| `np.mod(x,y)` | remainder of dividing each element of x by the corresponding element of y |
| `np.copysign(x,y)` | gives each element of x the sign of the corresponding element of y |
| `>` `<` `>=` `<=` `!=` | compares corresponding elements and returns a Boolean array |

• `np.maximum` returns a float array when one array holds integers and the other holds floats
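
The two-array functions can be tried on one integer array and one float array, as sketched below. The arrays are arbitrary; `b` is the square root of `a` so that the comparison result is easy to verify.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))   # integer array: 0..5
b = np.sqrt(a)                     # float array of the same shape

total = a + b                      # element-wise sum
bigger = np.maximum(a, b)          # float result from int and float inputs
remainder = np.mod(a, 4)           # 0, 1, 2, 3, 0, 1
signed = np.copysign(np.array([1.0, 2.0]), np.array([-1.0, 1.0]))
mask = a > b                       # Boolean array

print(bigger.dtype)   # float64
print(signed)         # [-1.  2.]
print(mask)
```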

## Unit Summary

• Dimensions of data
• ndarray attributes, creation and transformation
• Indexing and slicing
• Operations