Numpy tutorial#
This tutorial shows how pynapple interacts with numpy.
import numpy as np
import pynapple as nap
import pandas as pd
Multiple time series objects are available depending on the shape of the data.
TsdTensor: for data with more than 2 dimensions, typically movies.
TsdFrame: for column-based data. It can be easily converted to a pandas.DataFrame. Columns can be labelled and selected as in pandas.
Tsd: for one-dimensional time series. It can be converted to a pandas.Series.
Ts: for timestamp data only.
Initialization#
tsdtensor = nap.TsdTensor(t=np.arange(100), d=np.random.rand(100, 5, 5), time_units="s")
tsdframe = nap.TsdFrame(t=np.arange(100), d=np.random.rand(100, 3), columns = ['a', 'b', 'c'])
tsd = nap.Tsd(t=np.arange(100), d=np.random.rand(100))
ts = nap.Ts(t=np.arange(100))
print(tsdtensor)
Time (s)
---------- -----------------------------
0.0 [[0.24537 ... 0.715385] ...]
1.0 [[0.148378 ... 0.670756] ...]
2.0 [[0.819463 ... 0.710617] ...]
3.0 [[0.222862 ... 0.556872] ...]
4.0 [[0.556037 ... 0.073276] ...]
5.0 [[0.411727 ... 0.999752] ...]
6.0 [[0.974068 ... 0.534545] ...]
...
93.0 [[0.30904 ... 0.271504] ...]
94.0 [[0.924983 ... 0.558904] ...]
95.0 [[0.160939 ... 0.030145] ...]
96.0 [[0.275429 ... 0.651706] ...]
97.0 [[0.921915 ... 0.327881] ...]
98.0 [[0.978571 ... 0.681109] ...]
99.0 [[0.416072 ... 0.805106] ...]
dtype: float64, shape: (100, 5, 5)
Tsd and Ts can be converted to a pandas.Series.
print(tsd.as_series())
0.0 0.074969
1.0 0.780884
2.0 0.216066
3.0 0.875730
4.0 0.397126
...
95.0 0.367210
96.0 0.969868
97.0 0.702120
98.0 0.096609
99.0 0.821799
Length: 100, dtype: float64
TsdFrame can be converted to a pandas.DataFrame.
print(tsdframe.as_dataframe())
a b c
0.0 0.330856 0.623613 0.072056
1.0 0.737853 0.586844 0.844424
2.0 0.366937 0.493470 0.920038
3.0 0.269006 0.558888 0.705196
4.0 0.785051 0.337719 0.210883
... ... ... ...
95.0 0.997709 0.977151 0.152795
96.0 0.211646 0.673782 0.115556
97.0 0.261757 0.834962 0.875465
98.0 0.425633 0.324967 0.903085
99.0 0.252657 0.985066 0.172473
[100 rows x 3 columns]
Attributes#
The numpy array is accessible with the attributes .values and .d, and with the methods .as_array() and .to_numpy().
The time index array is a TsIndex object accessible with .index or .t.
.shape and .ndim are also accessible.
print(tsdtensor.ndim)
print(tsdframe.shape)
print(len(tsd))
3
(100, 3)
100
Slicing#
Slicing works much as it does for a numpy array. The first dimension is always time, and the time support is passed on whenever a pynapple object is returned.
First 10 elements. Returns a TsdTensor.
print(tsdtensor[0:10])
Time (s)
---------- -----------------------------
0 [[0.24537 ... 0.715385] ...]
1 [[0.148378 ... 0.670756] ...]
2 [[0.819463 ... 0.710617] ...]
3 [[0.222862 ... 0.556872] ...]
4 [[0.556037 ... 0.073276] ...]
5 [[0.411727 ... 0.999752] ...]
6 [[0.974068 ... 0.534545] ...]
7 [[0.876753 ... 0.486251] ...]
8 [[0.8444 ... 0.868316] ...]
9 [[0.05606 ... 0.191265] ...]
dtype: float64, shape: (10, 5, 5)
First column. Returns a Tsd.
print(tsdframe[:,0])
Time (s)
---------- -----------
0.0 0.330856
1.0 0.737853
2.0 0.366937
3.0 0.269006
4.0 0.785051
5.0 0.99862
6.0 0.000885152
...
93.0 0.572158
94.0 0.802243
95.0 0.997709
96.0 0.211646
97.0 0.261757
98.0 0.425633
99.0 0.252657
dtype: float64, shape: (100,)
First element. Returns a numpy ndarray.
print(tsdtensor[0])
[[0.24537038 0.00835834 0.54201879 0.29477876 0.71538453]
[0.21909817 0.09944647 0.3109991 0.47655955 0.07249306]
[0.72828855 0.53596209 0.87653356 0.32258114 0.83538258]
[0.53991437 0.99765981 0.28428238 0.95939346 0.48076486]
[0.56214296 0.2556339 0.18444782 0.93938065 0.53003794]]
Slicing along the time axis never changes the time support.
print(tsd.time_support)
print(tsd[0:20].time_support)
index start end
0 0 99
shape: (1, 2), time unit: sec.
index start end
0 0 99
shape: (1, 2), time unit: sec.
TsdFrame offers special slicing similar to pandas.DataFrame.
Only TsdFrame supports column labelling and label-based indexing.
print(tsdframe.loc['a'])
print(tsdframe.loc[['a', 'c']])
Time (s)
---------- -----------
0.0 0.330856
1.0 0.737853
2.0 0.366937
3.0 0.269006
4.0 0.785051
5.0 0.99862
6.0 0.000885152
...
93.0 0.572158
94.0 0.802243
95.0 0.997709
96.0 0.211646
97.0 0.261757
98.0 0.425633
99.0 0.252657
dtype: float64, shape: (100,)
Time (s) a c
---------- ------- -------
0.0 0.33086 0.07206
1.0 0.73785 0.84442
2.0 0.36694 0.92004
3.0 0.26901 0.7052
4.0 0.78505 0.21088
5.0 0.99862 0.33833
6.0 0.00089 0.08783
... ... ...
93.0 0.57216 0.12339
94.0 0.80224 0.15736
95.0 0.99771 0.1528
96.0 0.21165 0.11556
97.0 0.26176 0.87547
98.0 0.42563 0.90309
99.0 0.25266 0.17247
dtype: float64, shape: (100, 2)
Arithmetic#
Arithmetic operations work as in numpy.
tsd = nap.Tsd(t=np.arange(5), d=np.ones(5))
print(tsd + 1)
Time (s)
---------- --
0 2
1 2
2 2
3 2
4 2
dtype: float64, shape: (5,)
It is possible to do array operations on the time series provided that the dimensions match. The output will still be a time series object.
print(tsd - np.ones(5))
Time (s)
---------- --
0 0
1 0
2 0
3 0
4 0
dtype: float64, shape: (5,)
Nevertheless, operations like this one are not permitted:
try:
    tsd + tsd
except Exception as error:
    print(error)
operand type(s) all returned NotImplemented from __array_ufunc__(<ufunc 'add'>, '__call__', Time (s)
---------- --
0 1
1 1
2 1
3 1
4 1
dtype: float64, shape: (5,), Time (s)
---------- --
0 1
1 1
2 1
3 1
4 1
dtype: float64, shape: (5,)): 'Tsd', 'Tsd'
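The message above comes from numpy's `__array_ufunc__` dispatch protocol: when every operand's `__array_ufunc__` returns `NotImplemented`, numpy raises a `TypeError`. The sketch below reproduces that mechanism with a hypothetical `ToySeries` class. It is not pynapple's implementation; the rule "refuse any ufunc that mixes two series" is an assumption made for illustration.

```python
import numpy as np
from numpy.lib.mixins import NDArrayOperatorsMixin


class ToySeries(NDArrayOperatorsMixin):
    """Hypothetical stand-in for a time series (NOT pynapple's implementation)."""

    def __init__(self, t, d):
        self.t = np.asarray(t, dtype=float)  # timestamps
        self.d = np.asarray(d)               # data values

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Assumption of this sketch: refuse any ufunc that mixes two series,
        # because it is ambiguous which time index the result should carry.
        n_series = sum(isinstance(x, ToySeries) for x in inputs)
        if method != "__call__" or n_series > 1 or "out" in kwargs:
            return NotImplemented
        # Unwrap the series, apply the ufunc to the raw data, and re-wrap.
        arrays = [x.d if isinstance(x, ToySeries) else x for x in inputs]
        return ToySeries(self.t, ufunc(*arrays, **kwargs))


series = ToySeries(np.arange(5), np.ones(5))

shifted = series + 1          # scalar operand: handled, returns a ToySeries
diff = series - np.ones(5)    # array operand: handled as well

try:
    series + series           # two series: the handler returns NotImplemented
except TypeError as error:
    failed = True
    print(type(error).__name__)
else:
    failed = False
```

With pynapple objects, a workaround in the spirit of the array example above would be to pass the raw array of one operand, for instance `tsd + tsd.values`, assuming both series share the same timestamps.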
Array operations#
Most common numpy functions return a time series if the first dimension of the output matches the length of the time index.
Here the TsdTensor is averaged along the time axis. The output is a numpy array.
print(np.mean(tsdtensor, 0))
[[0.48494882 0.5111643 0.56130274 0.4698273 0.5088474 ]
[0.4780015 0.42811804 0.50466815 0.44712152 0.44792283]
[0.48922405 0.50726671 0.5499199 0.49761412 0.4808511 ]
[0.50046877 0.5163811 0.53252114 0.50255274 0.49313765]
[0.48238684 0.51385714 0.4952585 0.51156949 0.45739187]]
Here averaging across the second dimension returns a TsdFrame.
print(np.mean(tsdtensor, 1))
Time (s) 0 1 2 3 4
---------- ------- ------- ------- ------- -------
0.0 0.45896 0.37941 0.43966 0.59854 0.52681
1.0 0.32131 0.301 0.52342 0.46412 0.62977
2.0 0.64149 0.54673 0.85443 0.47582 0.4893
3.0 0.36458 0.32478 0.48464 0.55528 0.48935
4.0 0.60018 0.22503 0.34867 0.51778 0.5737
5.0 0.43093 0.69253 0.39125 0.47602 0.3781
6.0 0.59021 0.41705 0.71806 0.46433 0.32637
... ... ... ... ... ...
93.0 0.69771 0.30509 0.45486 0.47371 0.25078
94.0 0.65236 0.55726 0.83663 0.68645 0.39349
95.0 0.23487 0.58809 0.59251 0.5653 0.28296
96.0 0.36042 0.23933 0.51268 0.43652 0.53896
97.0 0.50616 0.48037 0.67764 0.48498 0.42664
98.0 0.41575 0.59456 0.56618 0.56897 0.49006
99.0 0.64419 0.46024 0.47705 0.26557 0.52188
dtype: float64, shape: (100, 5)
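The rule stated above depends only on the shape of the result, so it can be checked on a plain numpy array of the same shape. The following is a minimal sketch; `keeps_time_axis` is a hypothetical helper written for this illustration and is not part of pynapple.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((100, 5, 5))   # same shape as the TsdTensor above
n_time = data.shape[0]           # length of the time index


def keeps_time_axis(output, n_time):
    """True if the first dimension of `output` still matches the time index."""
    output = np.asarray(output)
    return output.ndim >= 1 and output.shape[0] == n_time


over_time = np.mean(data, 0)     # shape (5, 5): the time axis is collapsed
over_rows = np.mean(data, 1)     # shape (100, 5): the time axis is preserved

print(over_time.shape, keeps_time_axis(over_time, n_time))
print(over_rows.shape, keeps_time_axis(over_rows, n_time))
```

In the first case the time axis is gone, which matches the plain array returned above; in the second it survives, which matches the TsdFrame.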
This does not hold for FFT functions, however.
try:
    np.fft.fft(tsd)
except Exception as error:
    print(error)
no implementation found for 'numpy.fft.fft' on types that implement __array_function__: [<class 'pynapple.core.time_series.Tsd'>]
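Since `np.fft.fft` is not dispatched on pynapple objects, one option is to run the transform on the underlying numpy array (for example the one returned by `.values`) and keep track of the sampling rate separately. The sketch below uses plain arrays as stand-ins; the 100 Hz sampling rate and the 5 Hz sine wave are made up for illustration.

```python
import numpy as np

# Plain arrays standing in for `tsd.t` and `tsd.values` (illustrative data).
sampling_rate = 100.0                    # Hz, assuming uniform sampling
t = np.arange(100) / sampling_rate       # one second of timestamps
values = np.sin(2 * np.pi * 5.0 * t)     # 5 Hz sine wave

spectrum = np.fft.rfft(values)           # FFT of a real-valued signal
freqs = np.fft.rfftfreq(values.size, d=1.0 / sampling_rate)

# Frequency bin with the largest magnitude.
peak_frequency = freqs[np.argmax(np.abs(spectrum))]
print(peak_frequency)
```

The result is a plain numpy array indexed by frequency, not by time, which is consistent with FFT output not being wrapped as a time series.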
Concatenating#
It is possible to concatenate time series provided that they don’t overlap, meaning that the time indexes must already be sorted across all the time series to concatenate.
tsd1 = nap.Tsd(t=np.arange(5), d=np.ones(5))
tsd2 = nap.Tsd(t=np.arange(5)+10, d=np.ones(5)*2)
tsd3 = nap.Tsd(t=np.arange(5)+20, d=np.ones(5)*3)
print(np.concatenate((tsd1, tsd2, tsd3)))
Time (s)
---------- --
0.0 1
1.0 1
2.0 1
3.0 1
4.0 1
10.0 2
11.0 2
...
13.0 2
14.0 2
20.0 3
21.0 3
22.0 3
23.0 3
24.0 3
dtype: float64, shape: (15,)
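The ordering condition can be checked on the timestamps before concatenating. The sketch below uses a hypothetical helper, `is_time_sorted`, that is not part of pynapple; requiring strictly increasing timestamps is an assumption of this illustration.

```python
import numpy as np


def is_time_sorted(*time_arrays):
    """Return True if the arrays, joined in the given order, strictly increase.

    Hypothetical helper for illustration; it is not part of pynapple.
    """
    joined = np.concatenate([np.asarray(t, dtype=float) for t in time_arrays])
    return bool(np.all(np.diff(joined) > 0))


t1 = np.arange(5)
t2 = np.arange(5) + 10
t3 = np.arange(5) + 20

in_order = is_time_sorted(t1, t2, t3)            # non-overlapping, in order
overlapping = is_time_sorted(t1, np.arange(5) + 3)  # second array overlaps t1
reversed_order = is_time_sorted(t2, t1)          # arrays given in the wrong order

print(in_order, overlapping, reversed_order)
```

Only the first call satisfies the condition; the other two correspond to inputs that would break the sorting of the combined time index.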
It’s also possible to concatenate along the second axis (columns) if the time indexes match up to pynapple float precision.
tsdframe = nap.TsdFrame(t=np.arange(5), d=np.random.randn(5, 3))
print(np.concatenate((tsdframe, tsdframe), 1))
Time (s) 0 1 2 3 4 ...
---------- -------- -------- -------- -------- -------- -----
0 0.33183 -0.53476 -0.54328 0.33183 -0.53476 ...
1 1.84076 -1.3551 0.84971 1.84076 -1.3551 ...
2 -0.98628 0.71696 0.0618 -0.98628 0.71696 ...
3 1.88575 0.37527 -0.07462 1.88575 0.37527 ...
4 0.51714 -1.10206 0.21858 0.51714 -1.10206 ...
dtype: float64, shape: (5, 6)
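Matching "up to float precision" can be pictured as an absolute-tolerance comparison of the two time indexes. In the sketch below, `same_time_index` is a hypothetical helper and the tolerance of `1e-9` is an assumed value chosen for illustration; pynapple's actual precision setting may differ.

```python
import numpy as np

# Assumed tolerance for this sketch; pynapple's actual precision may differ.
TIME_TOLERANCE = 1e-9


def same_time_index(t_a, t_b, tol=TIME_TOLERANCE):
    """True if both time arrays have the same length and agree within `tol`."""
    t_a = np.asarray(t_a, dtype=float)
    t_b = np.asarray(t_b, dtype=float)
    if t_a.shape != t_b.shape:
        return False
    return bool(np.allclose(t_a, t_b, rtol=0.0, atol=tol))


t = np.arange(5, dtype=float)
jittered = t + 1e-12     # differs by far less than the tolerance
shifted = t + 0.5        # clearly different timestamps

print(same_time_index(t, jittered))
print(same_time_index(t, shifted))

# When the indexes agree, the data blocks can be joined column-wise.
block = np.zeros((5, 3))
joined = np.concatenate((block, block), 1)
print(joined.shape)
```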
Splitting#
Array split functions are also implemented.
print(np.array_split(tsdtensor[0:10], 2))
[Time (s)
---------- -----------------------------
0 [[0.24537 ... 0.715385] ...]
1 [[0.148378 ... 0.670756] ...]
2 [[0.819463 ... 0.710617] ...]
3 [[0.222862 ... 0.556872] ...]
4 [[0.556037 ... 0.073276] ...]
dtype: float64, shape: (5, 5, 5), Time (s)
---------- -----------------------------
5 [[0.411727 ... 0.999752] ...]
6 [[0.974068 ... 0.534545] ...]
7 [[0.876753 ... 0.486251] ...]
8 [[0.8444 ... 0.868316] ...]
9 [[0.05606 ... 0.191265] ...]
dtype: float64, shape: (5, 5, 5)]
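On plain arrays, `np.array_split` accepts a number of sections that does not divide the length evenly, whereas `np.split` raises an error in that case. The sketch below uses stand-in arrays for a time index and its data, and splits both with the same call so the pieces stay aligned.

```python
import numpy as np

t = np.arange(10)          # stand-in for a time index
d = np.arange(10) * 0.1    # stand-in for the data

# Split timestamps and data with the same call so the pieces stay aligned.
t_parts = np.array_split(t, 3)
d_parts = np.array_split(d, 3)

for t_part, d_part in zip(t_parts, d_parts):
    print(t_part.shape, d_part.shape)

# np.split insists on an equal division and fails for 10 elements in 3 parts.
try:
    np.split(t, 3)
except ValueError:
    equal_split_failed = True
else:
    equal_split_failed = False
```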
Modifying#
It is possible to modify a time series element-wise.
print(tsd1)
tsd1[0] = np.pi
print(tsd1)
Time (s)
---------- --
0 1
1 1
2 1
3 1
4 1
dtype: float64, shape: (5,)
Time (s)
---------- -------
0 3.14159
1 1
2 1
3 1
4 1
dtype: float64, shape: (5,)
It is also possible to modify a time series with a boolean mask.
tsd[tsd.values>0.5] = 0.0
print(tsd)
Time (s)
---------- --
0 0
1 0
2 0
3 0
4 0
dtype: float64, shape: (5,)
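The assignment above relies on numpy boolean indexing. A minimal sketch on a plain array, with made-up values, shows the mask itself and an out-of-place alternative using `np.where`:

```python
import numpy as np

values = np.array([0.1, 0.7, 0.4, 0.9, 0.5])

mask = values > 0.5    # boolean array, True where the test holds

# Out-of-place alternative: np.where builds a new array and leaves `values` intact.
clipped = np.where(mask, 0.0, values)

# In-place version, the same pattern as the assignment above.
values[mask] = 0.0

print(mask)
print(values)
```

Note that the comparison is strict, so an element equal to 0.5 is left unchanged.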
Sorting#
It is not possible to sort along the first dimension, as it would break the sorting of the time index.
tsd = nap.Tsd(t=np.arange(100), d=np.random.rand(100))
try:
    np.sort(tsd)
except Exception as error:
    print(error)
no implementation found for 'numpy.sort' on types that implement __array_function__: [<class 'pynapple.core.time_series.Tsd'>]
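If sorted values are needed anyway, one option is to sort the underlying numpy array rather than the time series. The sketch below uses plain arrays as stand-ins for the time index and for the array returned by `.values`; `np.argsort` keeps track of which timestamp each sorted value came from.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(100, dtype=float)    # stand-in for the time index
values = rng.random(100)           # stand-in for `tsd.values`

sorted_values = np.sort(values)    # plain array; the link to time is lost
order = np.argsort(values)         # indices that would sort the data
times_by_value = t[order]          # timestamps ordered by increasing value

print(sorted_values[:3])
print(times_by_value[:3])
```

The results are ordinary arrays whose first dimension no longer follows time, which is why they cannot be wrapped back into a time series with the original index.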