Computing perievent#
This tutorial demonstrates how we use Pynapple on various publicly available datasets in systems neuroscience to streamline analysis. In this tutorial, we will examine the dataset from Zheng et al (2022), which was used to generate Figure 4c in the publication.
The NWB file for the example used here is provided in this repository. The entire dataset can be downloaded here.
import matplotlib.pyplot as plt
import numpy as np
import pynapple as nap
import seaborn as sns
Stream the data from DANDI#
from pynwb import NWBHDF5IO
from dandi.dandiapi import DandiAPIClient
import fsspec
from fsspec.implementations.cached import CachingFileSystem
import h5py
# Enter the session ID and path to the file
dandiset_id, filepath = ("000207", "sub-4/sub-4_ses-4_ecephys.nwb")
with DandiAPIClient() as client:
asset = client.get_dandiset(dandiset_id, "draft").get_asset_by_path(filepath)
s3_url = asset.get_content_url(follow_redirects=1, strip_query=True)
# first, create a virtual filesystem based on the http protocol
fs = fsspec.filesystem("http")
# create a cache to save downloaded data to disk (optional)
fs = CachingFileSystem(
fs=fs,
cache_storage="nwb-cache", # Local folder for the cache
)
# next, open the file
file = h5py.File(fs.open(s3_url, "rb"))
io = NWBHDF5IO(file=file, load_namespaces=True)
/home/runner/.local/lib/python3.12/site-packages/hdmf/spec/namespace.py:535: UserWarning: Ignoring cached namespace 'hdmf-common' version 1.5.0 because version 1.8.0 is already loaded.
warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/home/runner/.local/lib/python3.12/site-packages/hdmf/spec/namespace.py:535: UserWarning: Ignoring cached namespace 'core' version 2.5.0 because version 2.7.0 is already loaded.
warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/home/runner/.local/lib/python3.12/site-packages/hdmf/spec/namespace.py:535: UserWarning: Ignoring cached namespace 'hdmf-experimental' version 0.1.0 because version 0.5.0 is already loaded.
warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
Parsing the data#
The first step is to load the data from the Neurodata Without Borders (NWB) file. This is done as follows:
custom_params = {"axes.spines.right": False, "axes.spines.top": False}
sns.set_theme(style="ticks", palette="colorblind", font_scale=1.5, rc=custom_params)
data = nap.NWBFile(io.read()) # Load the NWB file for this dataset
# What does this look like?
print(data)
4
┍━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━┑
│ Keys │ Type │
┝━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━┥
│ units │ TsGroup │
│ timediscrimination_table │ IntervalSet │
│ recognition_table │ IntervalSet │
│ encoding_table │ IntervalSet │
│ experiment_ids │ Tsd │
│ events │ Tsd │
┕━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━┙
Get spike timings
spikes = data["units"]
What does this look like?
print(spikes)
Index rate x y z imp location filtering ...
------- ------- ----- ------ ------ ----- ----------------- ----------- -----
0 7.0009 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
1 7.24478 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
2 6.09486 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
3 6.91548 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
4 0.40162 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
5 0.46117 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
6 1.81066 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
... ... ... ... ... ... ... ... ...
28 1.13432 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
29 0.60013 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
30 0.18433 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
31 2.3282 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
32 0.31052 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
33 0.69087 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
34 0.65578 26.63 -15.83 -16.49 nan hippocampus_right 300-3000Hz ...
This TsGroup has, among other information, the mean firing rate of the unit, the X, Y and Z coordinates, the brain region the unit was recorded from, and the channel number on which the unit was located.
Next, let’s get the encoding table of all stimulus times, as shown below:
encoding_table = data["encoding_table"]
# What does this look like?
print(encoding_table)
index start end fixcross_time ExperimentID boundary1_time boundary2_time boundary3_time stimCategory Clip_name
0 1.06846275 9.10133 0.05655475 70.0 5.06846275 nan nan 0 NB_24.mp4
1 10.18234225 18.1145585 9.116686249999999 70.0 12.88060075 14.43444675 nan 1 SB_23.mp4
2 19.071072 27.13738775 18.13049825 70.0 23.071071999999997 nan nan 2 HB_20.mp4
3 28.1604755 36.390234 27.15492625 70.0 32.1604755 nan nan 0 NB_26.mp4
4 37.470909 44.5844975 36.40934675 70.0 41.470909 nan nan 0 NB_25.mp4
5 45.5747065 53.5579785 44.6036855 70.0 49.5747065 nan nan 0 NB_11.mp4
6 54.58672125 62.78414525 53.577170249999995 70.0 56.074423249999995 nan nan 1 SB_6.mp4
... ... ... ... ... ... ... ... ... ...
83 1.06846275 9.10133 867.00219025 70.0 868.63873425 869.91904325 871.67373 1 SB_30.mp4
84 10.18234225 18.1145585 882.9728282499999 70.0 887.9325232499999 886.1784335 890.2121689999999 2 HB_6.mp4
85 19.071072 27.13738775 891.7479727499999 70.0 898.284193 nan nan 1 SB_2.mp4
86 28.1604755 36.390234 900.940121 70.0 905.9068504999999 nan nan 2 HB_27.mp4
87 37.470909 44.5844975 910.19093775 70.0 915.1383965 912.6979262499999 917.06715075 2 HB_10.mp4
88 45.5747065 53.5579785 919.3549025 70.0 921.90619325 923.47853075 924.9241169999999 1 SB_27.mp4
89 54.58672125 62.78414525 929.7180985 70.0 932.99710575 934.19136925 nan 1 SB_25.mp4
shape: (90, 2), time unit: sec.
This table has, among other things, the scene boundary times for which we will plot the peri-event time histogram (PETH).
There are 3 types of scene boundaries in this data. For the purposes of demonstration, we will use only the “No boundary” (NB) and the “Hard boundary” (HB conditions). The encoding table has a stimCategory field, which tells us the type of boundary corresponding to a given trial.
stimCategory = np.array(
encoding_table.stimCategory
) # Get the scene boundary type for all trials
# What does this look like?
print(stimCategory)
[0 1 2 0 0 0 1 0 2 1 2 0 2 0 1 1 1 1 0 0 2 0 0 2 0 1 0 2 0 0 2 0 0 0 0 2 2
2 0 0 1 1 1 1 0 2 1 1 0 2 1 0 2 2 2 0 1 0 1 1 2 2 0 2 2 2 1 1 2 1 0 2 2 1
0 1 2 0 2 2 1 1 1 1 2 1 2 2 1 1]
Trials marked 0 correspond to NB, while trials marked 2 correspond to HB. Let’s extract the trial numbers for NB and HB trials, as shown below:
indxNB = np.where(stimCategory == 0) # NB trial indices
indxHB = np.where(stimCategory == 2) # HB trial indices
The encoding table also has 3 types of boundary times. For the purposes of our demonstration, we will focus on boundary1 times, and extract them as shown below:
boundary1_time = np.array(encoding_table.boundary1_time) # Get timings of Boundary1
# What does this look like?
print(boundary1_time)
[ 5.06846275 12.88060075 23.071072 32.1604755 41.470909
49.5747065 56.07442325 72.803867 82.1299925 92.77667275
99.9925845 109.0787315 118.0778575 133.12435825 140.94827125
147.8236585 160.726736 170.441769 183.29262575 191.11327175
199.63382425 208.5142425 217.6186295 232.62687825 241.7270365
250.58327775 259.5545145 268.6661045 278.026902 287.04517875
301.2426685 310.21093375 319.1822215 328.13037175 337.16751875
363.737004 372.785663 381.86749625 391.2704085 400.5077995
407.28660475 416.0272855 431.792285 441.75272875 448.88401175
456.97766275 463.88864475 472.13035175 490.53584775 499.7064625
507.6917495 518.78793775 528.140675 537.1412335 552.5116415
562.4287995 569.44012625 580.99649325 591.974217 597.74463025
608.3638655 624.44389125 633.4175285 642.3969695 651.776966
660.699774 676.9439885 681.3602465 692.799878 699.53861175
711.341688 720.4819275 729.43782175 772.99961425 780.19490625
788.1175045 798.34264525 807.54519 817.25781375 835.1736475
842.54760125 852.294991 861.40924725 868.63873425 887.93252325
898.284193 905.9068505 915.1383965 921.90619325 932.99710575]
This contains the timings of all boundaries in this block of trials. Note that we also have the type of boundary for each trial. Let’s store the NB and HB boundary timings in separate variables, as Pynapple Ts objects:
NB = nap.Ts(boundary1_time[indxNB]) # NB timings
HB = nap.Ts(boundary1_time[indxHB]) # HB timings
Peri-Event Time Histogram (PETH)#
A PETH is a plot where we align a variable of interest (for example, spikes) to an external event (in this case, to boundary times). This visualization helps us infer relationships between the two.
For our demonstration, we will align the spikes of the first unit, which is located in the hippocampus, to the times of NB and HB. You can do a quick check to verify that the first unit is indeed located in the hippocampus, we leave it to you.
With Pynapple, PETHs can be computed with a single line of code!
NB_peth = nap.compute_perievent(
spikes[0], NB, minmax=(-0.5, 1)
) # Compute PETH of unit aligned to NB, for -0.5 to 1s windows
HB_peth = nap.compute_perievent(
spikes[0], HB, minmax=(-0.5, 1)
) # Compute PETH of unit aligned to HB, for -0.5 to 1s windows
Let’s plot the PETH
plt.figure(figsize =(15,8))
plt.subplot(211) # Plot the figures in 2 rows
plt.plot(NB_peth.to_tsd(), "o",
color=[102 / 255, 204 / 255, 0 / 255],
markersize=4)
plt.axvline(0, linewidth=2, color="k", linestyle="--") # Plot a line at t = 0
plt.yticks([0, 30]) # Set ticks on Y-axis
plt.gca().set_yticklabels(["1", "30"]) # Label the ticks
plt.xlabel("Time from NB (s)") # Time from boundary in seconds, on X-axis
plt.ylabel("Trial Number") # Trial number on Y-axis
plt.subplot(212)
plt.plot(HB_peth.to_tsd(), "o",
color=[255 / 255, 99 / 255, 71 / 255],
markersize=4) # Plot PETH
plt.axvline(0, linewidth=2, color="k", linestyle="--") # Plot a line at t = 0
plt.yticks([0, 30]) # Set ticks on Y-axis
plt.gca().set_yticklabels(["1", "30"]) # Label the ticks
plt.xlabel("Time from HB (s)") # Time from boundary in seconds, on X-axis
plt.ylabel("Trial Number") # Trial number on Y-axis
plt.subplots_adjust(wspace=0.2, hspace=0.5, top=0.85)
Awesome! From the PETH, we can see that this neuron fires after boundary onset in HB trials. This is an example of what the authors describe here as a boundary cell.
PETH of firing rate for NB and HB cells#
Now that we have the PETH of spiking, we can go one step further. We will plot the mean firing rate of this cell aligned to the boundary for each trial type. Doing this in Pynapple is very simple!
Use Pynapple to compute binned spike counts
bin_size = 0.01
counts_NB = NB_peth.count(bin_size) # Spike counts binned in 10ms steps, for NB trials
counts_HB = HB_peth.count(bin_size) # Spike counts binned in 10ms steps, for HB trials
Compute firing rate for both trial types
fr_NB = counts_NB / bin_size
fr_HB = counts_HB / bin_size
Smooth the firing rate with a gaussian window with std=4*bin_size
counts_NB = counts_NB.smooth(bin_size*4)
counts_HB = counts_HB.smooth(bin_size*4)
Compute the mean firing rate for both trial types
meanfr_NB = fr_NB.mean(axis=1)
meanfr_HB = fr_HB.mean(axis=1)
Compute standard error of mean (SEM) of the firing rate for both trial types
from scipy.stats import sem
error_NB = sem(fr_NB, axis=1)
error_HB = sem(fr_HB, axis=1)
Plot the mean +/- SEM of firing rate for both trial types
plt.figure(figsize =(15,8))
plt.plot(
meanfr_NB, color=[102 / 255, 204 / 255, 0 / 255], label="NB"
) # Plot mean firing rate for NB trials
# Plot SEM for NB trials
plt.fill_between(
meanfr_NB.index.values,
meanfr_NB.values - error_NB,
meanfr_NB.values + error_NB,
color=[102 / 255, 204 / 255, 0 / 255],
alpha=0.2,
)
plt.plot(
meanfr_HB, color=[255 / 255, 99 / 255, 71 / 255], label="HB"
) # Plot mean firing rate for HB trials
# Plot SEM for NB trials
plt.fill_between(
meanfr_HB.index.values,
meanfr_HB.values - error_HB,
meanfr_HB.values + error_HB,
color=[255 / 255, 99 / 255, 71 / 255],
alpha=0.2,
)
plt.axvline(0, linewidth=2, color="k", linestyle="--") # Plot a line at t = 0
plt.xlabel("Time from boundary (s)") # Time from boundary in seconds, on X-axis
plt.ylabel("Firing rate (Hz)") # Firing rate in Hz on Y-axis
plt.legend(loc="upper right")
<matplotlib.legend.Legend at 0x7fde90b79dc0>
This plot verifies what we visualized in the PETH rasters above, that this cell responds to a hard boundary. Hence, it is a boundary cell. To learn more about these cells, please check out the original study here.
I hope this tutorial was helpful. If you have any questions, comments or suggestions, please feel free to reach out to the Pynapple Team!
Authors
Dhruv Mehrotra
Guillaume Viejo