Introduction to pynapple#

The goal of this tutorial is to quickly learn enough about pynapple to get started with data analysis. This tutorial assumes familiarity with the basics functionalities of numpy.

You can check how to install pynapple here.

Important

By default, pynapple will assume a time units in seconds when passing timestamps array or time parameters such as bin size (unless specified with the time_units argument)


Importing pynapple#

The convention is to import pynapple with a namespace:

import pynapple as nap

Hide code cell content

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

custom_params = {"axes.spines.right": False, "axes.spines.top": False}
sns.set_theme(style="ticks", palette="colorblind", font_scale=1.5, rc=custom_params)

Instantiating pynapple objects#

nap.Tsd: 1-dimensional time series#

If you have a 1-dimensional time series, you use the nap.Tsd object. The arguments t and d are the arguments for timestamps and data.

tsd = nap.Tsd(t=np.arange(100), d=np.random.rand(100))

print(tsd)
Time (s)
----------  --------
0.0         0.981797
1.0         0.227075
2.0         0.174474
3.0         0.417824
4.0         0.589733
5.0         0.733552
6.0         0.131195
...
93.0        0.580855
94.0        0.397841
95.0        0.14179
96.0        0.328513
97.0        0.420532
98.0        0.910559
99.0        0.104565
dtype: float64, shape: (100,)

nap.TsdFrame: 2-dimensional time series#

If you have a 2-dimensional time series, you use the nap.TsdFrame object. The arguments t and d are the arguments for timestamps and data. You can add the argument columns to label each columns.

tsdframe = nap.TsdFrame(
    t=np.arange(100), d=np.random.rand(100, 3), columns=["a", "b", "c"]
)

print(tsdframe)
Time (s)             a          b         c
----------  ----------  ---------  --------
0.0         0.783146    0.0563863  0.29359
1.0         0.306906    0.374352   0.886374
2.0         0.200337    0.300978   0.218232
3.0         0.978273    0.785629   0.872128
4.0         0.00352579  0.47368    0.23737
5.0         0.602095    0.307176   0.311793
6.0         0.539067    0.570504   0.996532
...
93.0        0.284974    0.721807   0.944043
94.0        0.965133    0.0188204  0.674949
95.0        0.449351    0.879736   0.630792
96.0        0.768285    0.401177   0.951723
97.0        0.423065    0.781563   0.481195
98.0        0.236627    0.105326   0.30449
99.0        0.887129    0.604899   0.421909
dtype: float64, shape: (100, 3)

nap.TsdTensor: n-dimensional time series#

If you have larger than 2 dimensions time series (typically movies), you use the nap.TsdTensor object . The arguments t and d are the arguments for timestamps and data.

tsdtensor = nap.TsdTensor(
    t=np.arange(100), d=np.random.rand(100, 3, 4)
)

print(tsdtensor)
Time (s)
----------  -----------------------------
0.0         [[0.95059  ... 0.704851] ...]
1.0         [[0.138863 ... 0.103282] ...]
2.0         [[0.883581 ... 0.817639] ...]
3.0         [[0.749603 ... 0.935647] ...]
4.0         [[0.570674 ... 0.239928] ...]
5.0         [[0.783571 ... 0.726888] ...]
6.0         [[0.37214 ... 0.22156] ...]
...
93.0        [[0.898736 ... 0.88278 ] ...]
94.0        [[0.084502 ... 0.851466] ...]
95.0        [[0.607833 ... 0.672821] ...]
96.0        [[0.418456 ... 0.661719] ...]
97.0        [[0.613337 ... 0.727743] ...]
98.0        [[0.487731 ... 0.507664] ...]
99.0        [[0.742192 ... 0.179891] ...]
dtype: float64, shape: (100, 3, 4)

nap.IntervalSet: intervals#

The IntervalSet object stores multiple epochs with a common time unit in a table format. The epochs are strictly non-overlapping. Both start and end arguments are necessary.

epochs = nap.IntervalSet(start=[0, 10], end=[5, 15])

print(epochs)
  index    start    end
      0        0      5
      1       10     15
shape: (2, 2), time unit: sec.

nap.Ts: timestamps#

The Ts object stores timestamps data (typically spike times or reward times). The argument t for timestamps is necessary.

ts = nap.Ts(t=np.sort(np.random.uniform(0, 100, 10)))

print(ts)
Time (s)
16.601563865519964
37.15399494087342
51.20124357417386
53.623330665972965
65.27749925197122
66.24782354247736
69.0286764984123
72.30822044226672
89.72783614561473
95.95639246732205
shape: 10

nap.TsGroup: group of timestamps#

TsGroup is a dictionnary that stores multiple time series with different time stamps (.i.e. a group of neurons with different spike times from one session). The first argument data can be a dictionnary of Ts, Tsd or numpy 1d array.

data = {
    0: nap.Ts(t=np.sort(np.random.uniform(0, 100, 1000))),
    1: nap.Ts(t=np.sort(np.random.uniform(0, 100, 2000))),
    2: nap.Ts(t=np.sort(np.random.uniform(0, 100, 3000))),
}

tsgroup = nap.TsGroup(data)

print(tsgroup, "\n")
  Index     rate
-------  -------
      0  10.0009
      1  20.0017
      2  30.0026 

Interaction between pynapple objects#

Time support : attribute of time series#

A key feature of how pynapple manipulates time series is an inherent time support object defined for Ts, Tsd, TsdFrame and TsGroup objects. The time support object is defined as an IntervalSet that provides the time serie with a context. For example, the restrict operation will automatically update the time support object for the new time series. Ideally, the time support object should be defined for all time series when instantiating them. If no time series is given, the time support is inferred from the start and end of the time series.

In this example, a Tsd is instantiated with and without a time support of intervals 0 to 5 seconds. Notice how the shape of the Tsd varies.

time_support = nap.IntervalSet(start=0, end=2)

print(time_support)
  index    start    end
      0        0      2
shape: (1, 2), time unit: sec.

Without time support :

print(nap.Tsd(t=[0, 1, 2, 3, 4], d=[0, 1, 2, 3, 4]))
Time (s)
----------  --
0            0
1            1
2            2
3            3
4            4
dtype: int64, shape: (5,)

With time support :

print(
    nap.Tsd(
        t=[0, 1, 2, 3, 4], d=[0, 1, 2, 3, 4], 
        time_support = time_support
        )
    )
Time (s)
----------  --
0            0
1            1
2            2
dtype: int64, shape: (3,)

The time support object has become an attribute of the time series. Depending on the operation applied to the time series, it will be updated.

tsd = nap.Tsd(
    t=np.arange(10), d=np.random.randn(10), 
    time_support = time_support
    )

print(tsd.time_support)
  index    start    end
      0        0      2
shape: (1, 2), time unit: sec.

Restricting time series to epochs#

The central function of pynapple is the restrict method of Ts, Tsd, TsdFrame and TsGroup. The argument is an IntervalSet object. Only time points within the intervals of the IntervalSet object are returned. The time support of the time series is updated accordingly.

tsd = nap.Tsd(t=np.arange(10), d=np.random.randn(10))

ep = nap.IntervalSet(start=[0, 7], end=[3.5, 12.4])

print(ep)
  index    start    end
      0        0    3.5
      1        7   12.4
shape: (2, 2), time unit: sec.

From :

print(tsd)
Time (s)
----------  ---------
0            0.804629
1            1.26361
2            1.30974
3           -0.151615
4            1.20676
5           -1.24094
6            0.846706
7            0.196981
8           -0.170083
9           -0.863571
dtype: float64, shape: (10,)

to :

new_tsd = tsd.restrict(ep)

print(new_tsd)
Time (s)
----------  ---------
0            0.804629
1            1.26361
2            1.30974
3           -0.151615
7            0.196981
8           -0.170083
9           -0.863571
dtype: float64, shape: (7,)

Numpy & pynapple#

Pynapple relies on numpy to store the data. Pynapple objects behave very similarly to numpy and numpy functions can be applied directly

tsdtensor = nap.TsdTensor(t=np.arange(100), d=np.random.rand(100, 3, 4))

If a numpy function preserves the time axis, a pynapple object is returned.

In this example, averaging a TsdTensor along the second dimension returns a TsdFrame:

print(
    np.mean(tsdtensor, 1)
    )
Time (s)           0         1         2         3
----------  --------  --------  --------  --------
0.0         0.456529  0.486815  0.442358  0.824491
1.0         0.463704  0.451731  0.531552  0.669151
2.0         0.51206   0.485676  0.464334  0.613446
3.0         0.554352  0.437293  0.84252   0.411003
4.0         0.702253  0.208185  0.520198  0.433445
5.0         0.724006  0.764171  0.42113   0.690267
6.0         0.711306  0.330314  0.663318  0.248536
...
93.0        0.629412  0.482771  0.713345  0.356533
94.0        0.340105  0.495151  0.573317  0.748272
95.0        0.639254  0.826335  0.524838  0.490723
96.0        0.380355  0.160256  0.138811  0.221134
97.0        0.518439  0.64976   0.429102  0.629225
98.0        0.762987  0.512729  0.387474  0.522333
99.0        0.365394  0.669908  0.437793  0.586046
dtype: float64, shape: (100, 4)

Averaging along the time axis will return a numpy array object:

print(
    np.mean(tsdtensor, 0)
    )
[[0.52628581 0.49835007 0.50849513 0.53301711]
 [0.46935322 0.50595799 0.47134544 0.5316411 ]
 [0.47956317 0.4509429  0.46007184 0.47244292]]

Slicing objects#

Slicing time series and intervals#

Like numpy array#

Ts, Tsd, TsdFrame, TsdTensor and IntervalSet can be sliced similar to numpy array:

tsdframe = nap.TsdFrame(t=np.arange(10)/10, d=np.random.randn(10,4))
print(tsdframe)
Time (s)              0          1          2            3
----------  -----------  ---------  ---------  -----------
0           -1.51935      0.547754  -1.13527    1.83891
0.1          0.282107     0.577314   0.744491   0.870785
0.2         -0.173759     0.545507  -1.76398    1.28289
0.3         -0.00757362  -1.59744   -0.511918  -0.290852
0.4         -1.82648      1.14726    0.110428  -1.80471
0.5         -1.60242     -0.740818   2.30154   -0.385621
0.6         -0.040088    -1.00082    0.386546   0.705828
0.7         -0.162878     0.640762   1.01921   -0.00110099
0.8          0.620065    -0.871657   0.765684   1.27653
0.9          1.02289      0.532443  -1.42729    0.548201
dtype: float64, shape: (10, 4)
print(tsdframe[4:7])
Time (s)            0          1         2          3
----------  ---------  ---------  --------  ---------
0.4         -1.82648    1.14726   0.110428  -1.80471
0.5         -1.60242   -0.740818  2.30154   -0.385621
0.6         -0.040088  -1.00082   0.386546   0.705828
dtype: float64, shape: (3, 4)
print(tsdframe[:,0])
Time (s)
----------  -----------
0           -1.51935
0.1          0.282107
0.2         -0.173759
0.3         -0.00757362
0.4         -1.82648
0.5         -1.60242
0.6         -0.040088
0.7         -0.162878
0.8          0.620065
0.9          1.02289
dtype: float64, shape: (10,)
ep = nap.IntervalSet(start=[0, 10, 20], end=[4, 15, 25])
print(ep)
  index    start    end
      0        0      4
      1       10     15
      2       20     25
shape: (3, 2), time unit: sec.
print(ep[0:2])
  index    start    end
      0        0      4
      1       10     15
shape: (2, 2), time unit: sec.
print(ep[1])
  index    start    end
      0       10     15
shape: (1, 2), time unit: sec.

Like pandas DataFrame#

Important

This page references all the way to slice TsdFrame

TsdFrame can be sliced like pandas DataFrame when the columns have been labelled with strings :

tsdframe = nap.TsdFrame(t=np.arange(10), d=np.random.randn(10,3), columns=['a', 'b', 'c'])
print(tsdframe['a'])
Time (s)
----------  ----------
0           -1.3251
1           -0.114843
2           -1.28809
3            0.0641798
4            0.439171
5           -0.682909
6           -0.928745
7           -0.750153
8           -0.386185
9           -0.856761
dtype: float64, shape: (10,)

but integer-indexing only works like numpy if a list of integers is used to label columns :

tsdframe = nap.TsdFrame(t=np.arange(4), d=np.random.randn(4,3), columns=[3, 2, 1])
print(tsdframe, "\n")
print(tsdframe[3])
Time (s)            3          2          1
----------  ---------  ---------  ---------
0           -1.39729    0.365866  0.360347
1            0.758081  -1.60525   0.0020692
2            2.82341    0.456726  1.22386
3           -0.505519   2.09828   0.073601
dtype: float64, shape: (4, 3) 

[-0.50551873  2.09828278  0.07360097]

The loc method can be used to slice column-based only:

print(tsdframe.loc[3])

Slicing TsGroup#

TsGroup object can be indexed to return directly the timestamp object or sliced to return a new TsGroup.

Indexing:

print(tsgroup[0], "\n")
Time (s)
0.007183573386704278
0.03025594799509479
0.12199619777989446
0.13235092381476132
0.13923446440824438
0.24720933126959332
0.31461038200787383
...
99.62618226047667
99.66830336991302
99.81799794207575
99.87411778248244
99.97082393009332
99.98872232740732
99.9911619053701
shape: 1000 

Slicing:

print(tsgroup[[0, 2]])
  Index     rate
-------  -------
      0  10.0009
      2  30.0026

Core functions#

Objects have methods that can help transform and refine time series. This is a non exhaustive list.

Binning: counting events#

Time series objects have the count method that count the number of timestamps. This is typically used when counting number of spikes within a particular bin over multiple intervals. The returned object is a Tsd or TsdFrame with the timestamps being the center of the bins.

count = tsgroup.count(1)

print(count)
Time (s)              0    1    2
------------------  ---  ---  ---
0.5007438125380954   12   25   29
1.500743813          11   28   40
2.500743813          12   10   29
3.500743813          11   15   32
4.500743813          18   19   35
5.500743813          13   19   31
6.500743813          12   28   24
...
93.500743813         10   18   27
94.500743813         11   19   33
95.500743813          9   22   31
96.500743813         11   24   28
97.500743813         13   20   25
98.500743813         12   20   23
99.500743813         11   15   38
dtype: int64, shape: (100, 3)

Thresholding#

Some time series have specific methods. The threshold method of Tsd returns a new Tsd with all the data above or below a given value.

tsd = nap.Tsd(t=np.arange(10), d=np.random.rand(10))

print(tsd)

print(tsd.threshold(0.5))
Time (s)
----------  ---------
0           0.229696
1           0.742656
2           0.937468
3           0.844134
4           0.510586
5           0.516754
6           0.336158
7           0.811269
8           0.737273
9           0.0791339
dtype: float64, shape: (10,)
Time (s)
----------  --------
1           0.742656
2           0.937468
3           0.844134
4           0.510586
5           0.516754
7           0.811269
8           0.737273
dtype: float64, shape: (7,)

An important aspect of the tresholding is that the time support get updated based on the time points remaining. To get the epochs above/below a certain threshold, you can access the time support of the returned object.

print(tsd.time_support)

print(tsd.threshold(0.5, "below").time_support)
  index    start    end
      0        0      9
shape: (1, 2), time unit: sec.
  index    start    end
      0      0      0.5
      1      5.5    6.5
      2      8.5    9
shape: (3, 2), time unit: sec.

Time-bin averaging of data#

Many analyses requires to bring time series to the same rates and same dimensions. A quick way to downsample a time series to match in size for example a count array is to bin average. The bin_average method takes a bin size in unit of time.

tsdframe = nap.TsdFrame(t=np.arange(0, 100)/10, d=np.random.randn(100,3))

print(tsdframe)
Time (s)             0           1           2
----------  ----------  ----------  ----------
0.0         -1.59876     0.0626272   0.0695619
0.1          0.176153   -0.996522   -1.83217
0.2         -0.967167   -0.366253   -0.744114
0.3         -0.974757    0.178446   -0.067902
0.4         -0.530757   -0.177611    0.334055
0.5         -0.433819   -1.2486     -0.524047
0.6          0.312946    2.89925     0.0538885
...
9.3         -0.247851    0.162006    0.0207975
9.4         -0.0406946   0.896613    0.296054
9.5          0.0616369  -0.556634   -0.112685
9.6          1.25613     0.725925   -0.869725
9.7          0.613995    0.422976    0.716562
9.8          0.306412   -0.641917   -0.247436
9.9          0.859191   -0.166973    0.374823
dtype: float64, shape: (100, 3)

Here we go from a timepoint every 100ms to a timepoint every second.

print(tsdframe.bin_average(1))
Time (s)             0           1           2
----------  ----------  ----------  ----------
0.5         -0.424407    0.0373967  -0.0331947
1.5         -0.0817346  -0.133237    0.236837
2.5         -0.459146    0.104964    0.117151
3.5          0.0877348  -0.0738502  -0.291741
4.5          0.279906    0.313906    0.195541
5.5          0.177091   -0.140188   -0.283464
6.5          0.0794043   0.317674    0.170849
7.5          0.62742     0.0650999   0.379449
8.5          0.46178     0.2797      0.147989
9.5         -0.0985257  -0.0155943   0.131447
dtype: float64, shape: (10, 3)

Loading data#

See here for more details about loading data.

Loading NWB#

Pynapple supports by default the NWB standard.

NWB files can be loaded with :

nwb = nap.load_file("path/to/my.nwb")

or directly with the NWBFile class:

nwb = nap.NWBFile("path/to/my.nwb")

print(nwb)
my.nwb
┍━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━┑
│ Keys            │ Type        │
┝━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━┥
│ units           │ TsGroup     │
┕━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━┙

The returned object behaves like a dictionnary. The first column indicates the keys. The second column indicate the object type.

print(nwb['units'])
  Index    rate  location      group
-------  ------  ----------  -------
      0    1.0  brain        0
      1    1.0  brain        0
      2    1.0  brain        0

Overview of advanced analysis#

The process module of pynapple contains submodules that group methods that can be applied for high level analysis. All of the method are directly available from the nap namespace.

Discrete correlograms & ISI

This module analyses discrete events, specifically correlograms (for example by computing the cross-correlograms of a population of neurons) and interspike interval (ISI) distributions.

Bayesian decoding

The decoding module perfoms bayesian decoding given a set of tuning curves and a TsGroup.

Filtering

Bandpass, lowpass, highpass or bandstop filtering can be done to any time series using either Butterworth filter or windowed-sinc convolution.

Perievent time histogram

The perievent module has a set of functions to center time series and timestamps data around a particular events.

Randomizing

The randomize module holds multiple technique to shuffle timestamps in order to create surrogate datasets.

Spectrum

The spectrum module contains the methods to return the (mean) power spectral density of a time series.

Tuning curves

Tuning curves of neurons based on spiking or calcium activity can be computed.

Wavelets

The wavelets module performs Morlet wavelets decomposition of a time series.

Phases & envelopes

This modules allows for computing analytic signals and extracting the phase and envelope.

Warping

This module provides methods for building trial-based tensors and time-warped trial-based tensors.