1. Data wrangling
1.1. Working with data from the AllenSDK
The data we use in our connectivity models come from the AAV tracing experiments performed at the Allen Institute for Brain Science. Each experiment consists of injecting an adeno-associated virus (AAV) tracer into a region of the mouse brain and imaging the brain after the virus has propagated down the axons of the infected neurons. This imaging reveals how the group of infected neurons is structurally connected to other regions of the brain.
1.1.1. AllenSDK Package
allensdk is a Python package provided by the Allen Institute for retrieving and manipulating the data generated by the experiments they perform. We use the allensdk.core subpackage, which supports the data service for the viral tracing experiments described above. Specifically, we use the MouseConnectivityCache class (allensdk.core.mouse_connectivity_cache.MouseConnectivityCache) to pull experimental data and to register the data in the Allen 3D Reference Space. More information can be found in the AllenSDK documentation.
1.2. Core Package
1.2.1. VoxelModelCache
The VoxelModelCache class extends allensdk.core.mouse_connectivity_cache.MouseConnectivityCache to download and cache the latest iteration of our voxel model. Additionally, this class implements the get_experiment_data method, which pulls the injection and projection volumes of every experiment that satisfies all of the supplied parameters:
>>> from mcmodels.core import VoxelModelCache
>>> cache = VoxelModelCache(manifest_file='connectivity/voxel_model_manifest.json')
>>> # download and cache the latest voxel model
>>> # this method returns a tuple with object types:
>>> # (VoxelConnectivityArray, Mask, Mask)
>>> voxel_array, source_mask, target_mask = cache.get_voxel_connectivity_array()
>>> # download and cache a regionalized voxel model
>>> normalized_connection_density = cache.get_normalized_connection_density()
>>> # get all wildtype, cortical experiment data
>>> # this method returns a VoxelData object
>>> # (cre=False selects wildtype experiments; cre=None would return all experiments)
>>> cortex_data = cache.get_experiment_data(injection_structure_ids=[315], cre=False)
See VoxelModelCache and VoxelData for more information.
1.2.2. Mask class
In our package, the Mask class defines the methods for registering data into the 3D reference space. Specifically, we can:
- query only specific structures of the brain
- map masked vectors of the brain back to their corresponding locations in the 3D reference space
- determine to which structure each element of a masked vector belongs
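To make the first of these operations concrete, here is a minimal NumPy sketch on a toy annotation volume. It is an illustration only, not the Mask implementation: the helper toy_mask, the tiny volume, and the convention that the last axis runs left to right (with 1 = left, 2 = right, 3 = both hemispheres) are assumptions made for the example.

```python
import numpy as np

# Toy annotation volume: each voxel holds a structure id (0 = outside the brain).
# The ids 315 and 512 are used purely as labels here.
annotation = np.array([
    [[315, 315, 315, 315],
     [512, 512, 512, 512]],
    [[315, 315,   0,   0],
     [512,   0,   0, 512]],
])  # shape (2, 2, 4)


def toy_mask(annotation, structure_ids, hemisphere_id=3):
    """Boolean mask of voxels in `structure_ids`, optionally one hemisphere only.

    Hypothetical helper: assumes the last axis is left-right and that
    hemisphere_id is 1 (left half), 2 (right half) or 3 (both).
    """
    mask = np.isin(annotation, structure_ids)
    midline = annotation.shape[-1] // 2
    if hemisphere_id == 1:
        mask[..., midline:] = False
    elif hemisphere_id == 2:
        mask[..., :midline] = False
    return mask


mask = toy_mask(annotation, structure_ids=[315], hemisphere_id=2)

# A made-up data volume (e.g. an injection density) on the same grid.
volume = np.arange(annotation.size, dtype=float).reshape(annotation.shape)

# Boolean indexing flattens the selected voxels into a 1D vector (C order).
masked_vector = volume[mask]
```

Here masked_vector holds only the values at voxels labelled 315 in the right half of the grid, which is the kind of masked, flattened vector the rest of this section works with.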
Mask is most often initialized through the Mask.from_cache classmethod, using a VoxelModelCache object and optional keyword arguments for subsetting by hemisphere or structure. In the new voxel-scale model, we define the source to be the right hemisphere, restricted in this case to the cortex:
>>> from mcmodels.core import Mask
>>> source_mask = Mask.from_cache(cache, hemisphere_id=2, structure_ids=[315])
>>> source_mask
Mask(hemisphere_id=2, structure_ids=[315])
The get_experiment_data method of VoxelModelCache or VoxelData sets source and target matrices as attributes. Each row of these matrices holds the masked, flattened injection or projection volume of one experiment. One can determine the structure_id of a given column in either of these arrays using the get_key method of the Mask object:
>>> import numpy as np
>>> key = source_mask.get_key()
>>> key.shape
xxxx
>>> np.unique(key)
array([315])
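The relationship between the source matrix and the key can be sketched with plain NumPy. This is a toy illustration under stated assumptions, not the package's code: the 2D grid, the injection values, and the variable names are all made up for the example.

```python
import numpy as np

# Toy 2D "volume" of structure ids (0 = outside the brain).
annotation = np.array([[315, 315, 512],
                       [512, 315,   0]])
mask = np.isin(annotation, [315, 512])

# Two made-up injection volumes on the same grid, one per experiment.
injections = np.array([
    [[0.9, 0.1, 0.0],
     [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.2],
     [0.7, 0.0, 0.0]],
])

# Each experiment becomes one row of masked, flattened voxels.
source = injections[:, mask]

# Masking the annotation the same way gives one structure id per column.
key = annotation[mask]

# Structure id of the most strongly injected voxel in each experiment.
peak_structure = key[source.argmax(axis=1)]
```

Because source and key are built from the same mask, column j of source always corresponds to key[j].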
By default, the key includes only the structure ids specified when constructing the Mask object. However, we can pass specific structure_ids to the get_key method if we are interested in a finer or coarser level of the ontology:
>>> # get set of summary structures
>>> structure_tree = cache.get_structure_tree()
>>> summary_structures = structure_tree.get_structures_by_set_id([167587189])
>>> # the new ccf does not have structure 934 as a structure id
>>> structure_ids = [s['id'] for s in summary_structures if s['id'] != 934]
>>> key = source_mask.get_key(structure_ids=structure_ids)
>>> len(np.unique(key))
293
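Conceptually, changing the level of the key amounts to relabelling each voxel with whichever requested structure is its ancestor (or itself) in the ontology. The sketch below shows that idea on a hypothetical six-node ontology. The parent table, the helper names, and the choice of 0 for voxels with no matching ancestor are assumptions for illustration, not the behaviour of get_key.

```python
import numpy as np

# Hypothetical toy ontology: child id -> parent id (None marks the root).
parent = {1: None, 10: 1, 11: 1, 100: 10, 101: 10, 110: 11}


def ancestor_in(structure_id, targets):
    """Walk up the tree until reaching an id in `targets`; return 0 if none."""
    while structure_id is not None:
        if structure_id in targets:
            return structure_id
        structure_id = parent[structure_id]
    return 0


def key_at_level(leaf_key, structure_ids):
    """Relabel every voxel with its ancestor among `structure_ids`."""
    targets = set(structure_ids)
    return np.array([ancestor_in(int(s), targets) for s in leaf_key])


# A masked, flattened annotation holding leaf-level ids.
leaf_key = np.array([100, 101, 110, 100, 110])

coarse_key = key_at_level(leaf_key, [10, 11])   # coarser: group by parent
single_key = key_at_level(leaf_key, [100])      # finer: isolate one leaf
```

In this toy setup the number of unique values in the key changes with the level requested, which is why counting unique ids is a quick sanity check.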
The key array has length equal to the number of voxels in the cortex (right hemisphere), as that is the definition of our mask. However, if we just want the indices for a given structure:
>>> # get structure id corresponding to VISp
>>> visp_id = structure_tree.get_structures_by_acronym(["VISp"])[0]["id"]
>>> visp_idx = source_mask.get_structure_indices(structure_ids=[visp_id])
>>> len(visp_idx)
xxx
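Such indices are typically used to slice columns out of the source or target matrices. The following NumPy sketch shows the idea on a toy key. It assumes the indices are positions into the masked vector, and the ids and values are made up for the example.

```python
import numpy as np

# Toy per-column structure ids and a 2-experiment x 6-voxel source matrix.
key = np.array([315, 385, 385, 315, 385, 21])
source = np.arange(12, dtype=float).reshape(2, 6)

# Positions in the masked vector whose structure id is in the requested set.
idx = np.flatnonzero(np.isin(key, [385]))

# Use the positions to select that structure's columns for every experiment.
source_385 = source[:, idx]
```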
Given a masked injection/projection volume (a row in the source or target arrays) one can map the flattened vector back to the 3D reference space:
>>> # our key is a masked, flattened volume, let's map it back
>>> key_volume = source_mask.map_masked_to_annotation(key)
>>> key_volume.shape
(132, 80, 114)
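The mapping back to the reference space can be pictured as scattering the masked values into an empty volume with the same shape as the annotation. The sketch below is a toy NumPy illustration of that idea, assuming voxels outside the mask are filled with zeros. It is not the implementation of map_masked_to_annotation.

```python
import numpy as np

# Toy annotation volume, shape (2, 2, 2); 315 marks the structure of interest.
annotation = np.array([[[315,   0], [315, 512]],
                       [[  0, 315], [512, 512]]])
mask = annotation == 315

# One value per masked voxel, in the same (C) order that masking produces.
masked_vector = np.array([1.0, 2.0, 3.0])

# Scatter the values back; voxels outside the mask stay at zero.
volume = np.zeros(annotation.shape)
volume[mask] = masked_vector

# Masking the volume again recovers the original vector.
round_trip = volume[mask]
```

The result always has the shape of the annotation grid, regardless of how many voxels the mask selects.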