Metadata#
Metadata can be added to TsGroup, IntervalSet, and TsdFrame objects at initialization or after an object has been created.
TsGroup metadata is information associated with each Ts/Tsd object, such as brain region or unit type.
IntervalSet metadata is information associated with each interval, such as a trial label or stimulus condition.
TsdFrame metadata is information associated with each column, such as a channel or position.
Adding metadata at initialization#
At initialization, metadata can be passed as a dictionary or pandas DataFrame using the keyword argument metadata. The metadata name is taken from the dictionary key or DataFrame column, and it can be any string, with a couple of class-specific exceptions.
Class-specific exceptions
If column names are supplied to TsdFrame, metadata cannot overlap with those names.
The rate attribute for TsGroup is stored with the metadata and cannot be overwritten.
The length of the metadata must match the length of the object it describes (see class examples below for more detail).
TsGroup#
Metadata added to TsGroup must match the number of Ts/Tsd objects, or the length of its index property.
metadata = {"region": ["pfc", "pfc", "hpc", "hpc"]}
tsgroup = nap.TsGroup(group, metadata=metadata)
print(tsgroup)
Index rate region
------- ------- --------
1 1.00015 pfc
2 2.00031 pfc
3 3.00046 hpc
4 4.00061 hpc
When initializing with a DataFrame, its index must match the keys of the input dictionary (this applies only when a dictionary is used to create the TsGroup).
metadata = pd.DataFrame(
    index=group.keys(),
    data=["pfc", "pfc", "hpc", "hpc"],
    columns=["region"]
)
tsgroup = nap.TsGroup(group, metadata=metadata)
print(tsgroup)
Index rate region
------- ------- --------
1 1.00015 pfc
2 2.00031 pfc
3 3.00046 hpc
4 4.00061 hpc
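When metadata rows arrive in a different order than the TsGroup keys, one option is to reorder them by label before passing them in. The pandas-only sketch below uses a hypothetical list named keys as a stand-in for group.keys(). It shows a user-side preparation step and makes no claim about how the index is checked internally.

```python
import pandas as pd

# Hypothetical keys standing in for group.keys()
keys = [1, 2, 3, 4]

# Metadata whose rows are listed in a different order than the keys
metadata = pd.DataFrame(
    {"region": ["hpc", "pfc", "hpc", "pfc"]},
    index=[3, 1, 4, 2],
)

# Reorder the rows by index label so that they line up with the keys
aligned = metadata.loc[keys]

# Confirm the alignment before handing the DataFrame to the constructor
print(aligned.index.tolist() == keys)
print(aligned["region"].tolist())
```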
IntervalSet#
Metadata added to IntervalSet must match the number of intervals, or the length of its index property.
metadata = {
    "reward": [1, 0, 1],
    "choice": ["left", "right", "left"],
}
intervalset = nap.IntervalSet(starts, ends, metadata=metadata)
print(intervalset)
index start end reward choice
0 0 30 1 left
1 35 65 0 right
2 70 100 1 left
shape: (3, 2), time unit: sec.
Metadata can be initialized as a DataFrame using the metadata argument, or it can be inferred when initializing an IntervalSet with a DataFrame.
df = pd.DataFrame(
    data=[[0, 30, 1, "left"], [35, 65, 0, "right"], [70, 100, 1, "left"]],
    columns=["start", "end", "reward", "choice"]
)
intervalset = nap.IntervalSet(df)
print(intervalset)
index start end reward choice
0 0 30 1 left
1 35 65 0 right
2 70 100 1 left
shape: (3, 2), time unit: sec.
TsdFrame#
Metadata added to TsdFrame must match the number of data columns, or the length of its columns property.
metadata = {
    "color": ["red", "blue", "green"],
    "position": [10, 20, 30],
    "label": ["x", "x", "y"]
}
tsdframe = nap.TsdFrame(d=d, t=t, columns=["a", "b", "c"], metadata=metadata)
print(tsdframe)
Time (s) a b c
---------- --- ---- -----
0.0 1 2 3
1.0 1 2 3
2.0 1 2 3
3.0 1 2 3
4.0 1 2 3
Metadata
color red blue green
position 10 20 30
... ... ... ...
dtype: int64, shape: (5, 3)
When initializing with a DataFrame, the DataFrame index must match the TsdFrame columns.
metadata = pd.DataFrame(
    index=["a", "b", "c"],
    data=[["red", 10, "x"], ["blue", 20, "x"], ["green", 30, "y"]],
    columns=["color", "position", "label"],
)
tsdframe = nap.TsdFrame(d=d, t=t, columns=["a", "b", "c"], metadata=metadata)
print(tsdframe)
Time (s) a b c
---------- --- ---- -----
0.0 1 2 3
1.0 1 2 3
2.0 1 2 3
3.0 1 2 3
4.0 1 2 3
Metadata
color red blue green
position 10 20 30
... ... ... ...
dtype: int64, shape: (5, 3)
Adding metadata after initialization#
After creation, metadata can be added using the set_info() method. Additionally, single metadata fields can be set using a dictionary-like key or as an attribute, with a few exceptions noted below.
Note
The remaining metadata examples will be shown on a TsGroup object; however, all examples can be directly applied to IntervalSet and TsdFrame objects.
set_info#
Metadata can be passed as a dictionary or pandas DataFrame as the first positional argument, or metadata can be passed as name-value keyword arguments.
tsgroup.set_info(unit_type=["multi", "single", "single", "single"])
print(tsgroup)
Index rate region unit_type
------- ------- -------- -----------
1 1.00015 pfc multi
2 2.00031 pfc single
3 3.00046 hpc single
4 4.00061 hpc single
Using dictionary-like keys (square brackets)#
Most metadata names can be set as a dictionary-like key (i.e. using square brackets). The only exceptions are for IntervalSet, where the names "start" and "end" are reserved for class properties.
tsgroup["depth"] = [0, 1, 2, 3]
print(tsgroup)
Index rate region unit_type depth
------- ------- -------- ----------- -------
1 1.00015 pfc multi 0
2 2.00031 pfc single 1
3 3.00046 hpc single 2
4 4.00061 hpc single 3
Using attribute assignment#
If the metadata name does not clash with other class attributes and methods, and it is formatted properly (i.e. only alphanumeric characters and underscores), it can be set as an attribute (i.e. using a . followed by the metadata name).
tsgroup.label = ["MUA", "good", "good", "good"]
print(tsgroup)
Index rate region unit_type depth label
------- ------- -------- ----------- ------- -------
1 1.00015 pfc multi 0 MUA
2 2.00031 pfc single 1 good
3 3.00046 hpc single 2 good
4 4.00061 hpc single 3 good
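To decide ahead of time whether a new name is a candidate for attribute assignment, a rough pre-check can be written in plain Python. The helper can_set_as_attribute and the Dummy class below are hypothetical and are not part of pynapple. The check uses str.isidentifier, which also rejects names starting with a digit and accepts some non-ASCII letters, so it only approximates the rule stated above.

```python
import keyword


class Dummy:
    """Hypothetical stand-in for an object that already has attributes and methods."""

    rate = None

    def count(self):
        return 0


def can_set_as_attribute(obj, name):
    """Rough pre-check for a new metadata name (an assumption, not pynapple's own validation)."""
    return (
        name.isidentifier()                # letters, digits, underscores; no leading digit
        and not keyword.iskeyword(name)    # e.g. "class" cannot follow a dot
        and not hasattr(obj, name)         # would clash with an existing attribute or method
    )


obj = Dummy()
for name in ["label", "unit type", "2nd_label", "count", "rate"]:
    print(name, can_set_as_attribute(obj, name))
```

Names that fail such a check can still be set with square brackets or set_info().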
Allowed data types#
As long as the length of the metadata container matches the length of the object (number of columns for TsdFrame and number of indices for IntervalSet and TsGroup), elements of the metadata can be any data type.
tsgroup.coords = [[1,0],[0,1],[1,1],[2,1]]
print(tsgroup)
Index rate region unit_type depth label coords ...
------- ------- -------- ----------- ------- ------- -------- -----
1 1.00015 pfc multi 0 MUA [1 0] ...
2 2.00031 pfc single 1 good [0 1] ...
3 3.00046 hpc single 2 good [1 1] ...
4 4.00061 hpc single 3 good [2 1] ...
Accessing metadata#
Metadata is stored as a pandas DataFrame, which can be previewed using the metadata attribute.
print(tsgroup.metadata)
rate region unit_type depth label coords
1 1.000153 pfc multi 0 MUA [1, 0]
2 2.000305 pfc single 1 good [0, 1]
3 3.000458 hpc single 2 good [1, 1]
4 4.000611 hpc single 3 good [2, 1]
Single metadata columns (or lists of columns) can be retrieved using the get_info() method:
print(tsgroup.get_info("region"))
1 pfc
2 pfc
3 hpc
4 hpc
Name: region, dtype: object
Similarly, metadata can be accessed using key indexing (i.e. square brackets):
print(tsgroup["region"])
1 pfc
2 pfc
3 hpc
4 hpc
Name: region, dtype: object
Note
Metadata names must be strings. Key indexing with an integer will produce different behavior based on object type.
Finally, metadata that can be set as an attribute can also be accessed as an attribute.
print(tsgroup.region)
1 pfc
2 pfc
3 hpc
4 hpc
Name: region, dtype: object
Overwriting metadata#
User-set metadata is mutable and can be overwritten.
print(tsgroup, "\n")
tsgroup.set_info(label=["A", "B", "C", "D"])
print(tsgroup)
Index rate region unit_type depth label coords ...
------- ------- -------- ----------- ------- ------- -------- -----
1 1.00015 pfc multi 0 MUA [1 0] ...
2 2.00031 pfc single 1 good [0 1] ...
3 3.00046 hpc single 2 good [1 1] ...
4 4.00061 hpc single 3 good [2 1] ...
Index rate region unit_type depth label coords ...
------- ------- -------- ----------- ------- ------- -------- -----
1 1.00015 pfc multi 0 A [1 0] ...
2 2.00031 pfc single 1 B [0 1] ...
3 3.00046 hpc single 2 C [1 1] ...
4 4.00061 hpc single 3 D [2 1] ...
Dropping metadata#
To drop metadata, use the drop_info() method. Multiple metadata columns can be dropped by passing a list of metadata names.
print(tsgroup, "\n")
tsgroup.drop_info("coords")
print(tsgroup)
Index rate region unit_type depth label coords ...
------- ------- -------- ----------- ------- ------- -------- -----
1 1.00015 pfc multi 0 A [1 0] ...
2 2.00031 pfc single 1 B [0 1] ...
3 3.00046 hpc single 2 C [1 1] ...
4 4.00061 hpc single 3 D [2 1] ...
Index rate region unit_type depth label
------- ------- -------- ----------- ------- -------
1 1.00015 pfc multi 0 A
2 2.00031 pfc single 1 B
3 3.00046 hpc single 2 C
4 4.00061 hpc single 3 D
Restricting metadata#
Instead of dropping multiple metadata fields, you may want to restrict to a set of specified fields, i.e. select which columns to keep. For this operation, use the restrict_info() method. Multiple metadata columns can be kept by passing a list of metadata names.
import copy
tsgroup2 = copy.deepcopy(tsgroup)
tsgroup2.restrict_info("region")
print(tsgroup2)
Index rate region
------- ------- --------
1 1.00015 pfc
2 2.00031 pfc
3 3.00046 hpc
4 4.00061 hpc
Note
The rate column will always be kept for a TsGroup.
Using metadata to slice objects#
Metadata can be used to slice or filter objects based on metadata values.
print(tsgroup[tsgroup.label == "A"])
Index rate region unit_type depth label
------- ------- -------- ----------- ------- -------
1 1.00015 pfc multi 0 A
groupby: Using metadata to group objects#
Similar to pandas, metadata can be used to group objects based on one or more metadata columns using the object method groupby, where the first argument is the metadata column name(s) to group by. This method returns a dictionary with keys corresponding to unique groups and values corresponding to the object indices belonging to each group.
print(tsgroup,"\n")
print(tsgroup.groupby("region"))
Index rate region unit_type depth label
------- ------- -------- ----------- ------- -------
1 1.00015 pfc multi 0 A
2 2.00031 pfc single 1 B
3 3.00046 hpc single 2 C
4 4.00061 hpc single 3 D
{'hpc': array([3, 4]), 'pfc': array([1, 2])}
To group by multiple metadata columns, pass the column names as a list.
tsgroup.groupby(["region","unit_type"])
{('hpc', 'single'): array([3, 4]),
('pfc', 'multi'): array([1]),
('pfc', 'single'): array([2])}
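Because metadata is stored as a pandas DataFrame, the shape of the returned dictionary can be illustrated with pandas alone. The sketch below is an analogy that uses DataFrame.groupby(...).groups on a hypothetical metadata table; it is not necessarily how the object method is implemented.

```python
import pandas as pd

# Hypothetical metadata table mirroring the columns used above
metadata = pd.DataFrame(
    {
        "region": ["pfc", "pfc", "hpc", "hpc"],
        "unit_type": ["multi", "single", "single", "single"],
    },
    index=[1, 2, 3, 4],
)

# One column: keys are the unique values, values are the matching index labels
single = {k: v.tolist() for k, v in metadata.groupby("region").groups.items()}

# Several columns: keys become tuples, one entry per observed combination
multi = {
    k: v.tolist()
    for k, v in metadata.groupby(["region", "unit_type"]).groups.items()
}

print(single)
print(multi)
```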
The optional argument get_group can be provided to return a new object corresponding to a specific group.
tsgroup.groupby("region", get_group="hpc")
Index rate region unit_type depth label
------- ------- -------- ----------- ------- -------
3 3.00046 hpc single 2 C
4 4.00061 hpc single 3 D
groupby_apply: Applying functions to object groups#
The groupby_apply object method allows a specific function to be applied to object groups. The first argument, same as groupby, is the metadata column(s) used to group the object. The second argument is the function to apply to each group. If only these two arguments are supplied, it is assumed that the grouped object is the first and only input to the applied function. This function returns a dictionary, where keys correspond to each unique group, and values correspond to the function output on each group.
print(tsdframe,"\n")
print(tsdframe.groupby_apply("label", np.mean))
Time (s) a b c
---------- --- ---- -----
0.0 1 2 3
1.0 1 2 3
2.0 1 2 3
3.0 1 2 3
4.0 1 2 3
Metadata
color red blue green
position 10 20 30
... ... ... ...
dtype: int64, shape: (5, 3)
{'x': np.float64(1.5), 'y': np.float64(3.0)}
If the applied function requires additional inputs, these can be passed as additional keyword arguments into groupby_apply.
feature = nap.Tsd(t=np.arange(100), d=np.repeat([0,1], 50))
tsgroup.groupby_apply(
    "region",
    nap.compute_tuning_curves,
    features=feature,
    bins=2,
)
{'hpc': <xarray.DataArray (unit: 2, 0: 2)> Size: 32B
array([[3.02, 2.88],
[4.1 , 3.82]])
Coordinates:
* unit (unit) int64 16B 3 4
* 0 (0) float64 16B 0.25 0.75
Attributes:
occupancy: [50. 50.]
bin_edges: [array([0. , 0.5, 1. ])]
fs: 1.0,
'pfc': <xarray.DataArray (unit: 2, 0: 2)> Size: 32B
array([[1.02, 0.94],
[1.72, 2.22]])
Coordinates:
* unit (unit) int64 16B 1 2
* 0 (0) float64 16B 0.25 0.75
Attributes:
occupancy: [50. 50.]
bin_edges: [array([0. , 0.5, 1. ])]
fs: 1.0}
Alternatively, an anonymous function that supplies the additional arguments can be passed instead.
func = lambda x: nap.compute_tuning_curves(x, features=feature, bins=2)
tsgroup.groupby_apply("region", func)
{'hpc': <xarray.DataArray (unit: 2, 0: 2)> Size: 32B
array([[3.02, 2.88],
[4.1 , 3.82]])
Coordinates:
* unit (unit) int64 16B 3 4
* 0 (0) float64 16B 0.25 0.75
Attributes:
occupancy: [50. 50.]
bin_edges: [array([0. , 0.5, 1. ])]
fs: 1.0,
'pfc': <xarray.DataArray (unit: 2, 0: 2)> Size: 32B
array([[1.02, 0.94],
[1.72, 2.22]])
Coordinates:
* unit (unit) int64 16B 1 2
* 0 (0) float64 16B 0.25 0.75
Attributes:
occupancy: [50. 50.]
bin_edges: [array([0. , 0.5, 1. ])]
fs: 1.0}
An anonymous function can also be used to apply a function where the grouped object is not the first input.
func = lambda x: nap.compute_tuning_curves(
    data=tsgroup,
    features=feature,
    bins=2,
    epochs=x,
)
intervalset.groupby_apply("choice", func)
{'left': <xarray.DataArray (unit: 4, 0: 2)> Size: 64B
array([[1.06451613, 1.03333333],
[1.70967742, 2.3 ],
[2.96774194, 2.93333333],
[4.06451613, 4.06666667]])
Coordinates:
* unit (unit) int64 32B 1 2 3 4
* 0 (0) float64 16B 0.25 0.75
Attributes:
occupancy: [31. 30.]
bin_edges: [array([0. , 0.5, 1. ])]
fs: 1.0,
'right': <xarray.DataArray (unit: 4, 0: 2)> Size: 64B
array([[0.86666667, 0.875 ],
[1.66666667, 1.875 ],
[3. , 3. ],
[3.73333333, 3.875 ]])
Coordinates:
* unit (unit) int64 32B 1 2 3 4
* 0 (0) float64 16B 0.25 0.75
Attributes:
occupancy: [15. 16.]
bin_edges: [array([0. , 0.5, 1. ])]
fs: 1.0}
Alternatively, the optional parameter input_key can be passed to specify which keyword argument the grouped object corresponds to. Other required arguments of the applied function need to be passed as keyword arguments.
intervalset.groupby_apply(
    "choice",
    nap.compute_tuning_curves,
    input_key="epochs",
    data=tsgroup,
    features=feature,
    bins=2,
)
{'left': <xarray.DataArray (unit: 4, 0: 2)> Size: 64B
array([[1.06451613, 1.03333333],
[1.70967742, 2.3 ],
[2.96774194, 2.93333333],
[4.06451613, 4.06666667]])
Coordinates:
* unit (unit) int64 32B 1 2 3 4
* 0 (0) float64 16B 0.25 0.75
Attributes:
occupancy: [31. 30.]
bin_edges: [array([0. , 0.5, 1. ])]
fs: 1.0,
'right': <xarray.DataArray (unit: 4, 0: 2)> Size: 64B
array([[0.86666667, 0.875 ],
[1.66666667, 1.875 ],
[3. , 3. ],
[3.73333333, 3.875 ]])
Coordinates:
* unit (unit) int64 32B 1 2 3 4
* 0 (0) float64 16B 0.25 0.75
Attributes:
occupancy: [15. 16.]
bin_edges: [array([0. , 0.5, 1. ])]
fs: 1.0}
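The calling conventions described in this section can be summarized with a small pure-Python sketch. The function groupby_apply_sketch, its select argument, and the toy data are all hypothetical; the sketch only mimics the documented behavior (grouped object passed first by default, or under the keyword named by input_key) and is not the library's implementation.

```python
def groupby_apply_sketch(groups, select, func, input_key=None, **func_kwargs):
    """Apply func to each group and collect the results in a dictionary.

    groups: dict mapping group name to the indices in that group
    select: callable returning the grouped object for a set of indices
    input_key: name of the keyword argument that should receive the grouped object
    """
    out = {}
    for name, idx in groups.items():
        grouped = select(idx)
        if input_key is None:
            # Default: the grouped object is the first positional input
            out[name] = func(grouped, **func_kwargs)
        else:
            # The grouped object is passed under the requested keyword instead
            out[name] = func(**{input_key: grouped}, **func_kwargs)
    return out


# Toy data standing in for an object and its groups
values = {1: 10.0, 2: 20.0, 3: 30.0, 4: 40.0}
groups = {"pfc": [1, 2], "hpc": [3, 4]}
select = lambda idx: [values[i] for i in idx]


def scaled_sum(scale, data):
    # The grouped object is NOT the first argument here
    return scale * sum(data)


print(groupby_apply_sketch(groups, select, sum))
print(groupby_apply_sketch(groups, select, scaled_sum, input_key="data", scale=2))
```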