Introduction to pybids
pybids is a tool to query, summarize and manipulate data using the BIDS standard. In this tutorial we will use a pybids test dataset to illustrate some of the functionality of pybids.layout.
from bids import BIDSLayout
from bids.tests import get_test_data_path
import os
The BIDSLayout
¶
At the core of pybids is the BIDSLayout
object. A BIDSLayout
is a lightweight Python class that represents a BIDS project file tree and provides a variety of helpful methods for querying and manipulating BIDS files. While the BIDSLayout
initializer has a large number of arguments you can use to control the way files are indexed and accessed, you will most commonly initialize a BIDSLayout
by passing in the BIDS dataset root location as a single argument:
# Here we're using the ds000114 dataset stored in the workshop environment
data_path = '/home/neuro/workshop/data/ds000114/'
# Initialize the layout
layout = BIDSLayout(data_path)
# Print some basic information about the layout
layout
BIDS Layout: ...e/neuro/workshop/data/ds000114 | Subjects: 10 | Sessions: 20 | Runs: 0
Querying the BIDSLayout
¶
When we initialize a BIDSLayout
, all of the files and metadata found under the specified root folder are indexed. This can take a few seconds (or, for very large datasets, a minute or two). Once initialization is complete, we can start querying the BIDSLayout
in various ways. The workhorse method is .get()
. If we call .get()
with no additional arguments, we get back a list of all the BIDS files in our dataset:
all_files = layout.get()
print("There are {} files in the layout.".format(len(all_files)))
print("\nThe first 10 files are:")
all_files[:10]
There are 173 files in the layout.
The first 10 files are:
[<BIDSFile filename='/home/neuro/workshop/data/ds000114/CHANGES'>,
<BIDSJSONFile filename='/home/neuro/workshop/data/ds000114/dataset_description.json'>,
<BIDSFile filename='/home/neuro/workshop/data/ds000114/dwi.bval'>,
<BIDSFile filename='/home/neuro/workshop/data/ds000114/dwi.bvec'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/anat/sub-01_ses-retest_T1w.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/dwi/sub-01_ses-retest_dwi.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-fingerfootlips_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_bold.nii.gz'>,
<BIDSDataFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_events.tsv'>]
The returned object is a Python list. By default, each element in the list is a BIDSFile
object. We discuss the BIDSFile
object in much more detail below. For now, let’s simplify things and work with just filenames:
layout.get(return_type='filename')[:10]
['/home/neuro/workshop/data/ds000114/CHANGES',
'/home/neuro/workshop/data/ds000114/dataset_description.json',
'/home/neuro/workshop/data/ds000114/dwi.bval',
'/home/neuro/workshop/data/ds000114/dwi.bvec',
'/home/neuro/workshop/data/ds000114/sub-01/ses-retest/anat/sub-01_ses-retest_T1w.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-retest/dwi/sub-01_ses-retest_dwi.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-fingerfootlips_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_events.tsv']
layout.get_subjects()
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10']
layout.get_tasks()
['covertverbgeneration',
'fingerfootlips',
'linebisection',
'overtverbgeneration',
'overtwordrepetition']
layout.get_entities()
{'subject': <bids.layout.models.Entity at 0x7f77cc836950>,
'session': <bids.layout.models.Entity at 0x7f77cc836710>,
'task': <bids.layout.models.Entity at 0x7f77cc8a1910>,
'acquisition': <bids.layout.models.Entity at 0x7f77cc74fad0>,
'ceagent': <bids.layout.models.Entity at 0x7f77cc74fdd0>,
'reconstruction': <bids.layout.models.Entity at 0x7f77cc76bbd0>,
'direction': <bids.layout.models.Entity at 0x7f77cc76b1d0>,
'run': <bids.layout.models.Entity at 0x7f77cc76bb90>,
'proc': <bids.layout.models.Entity at 0x7f77cc76b190>,
'modality': <bids.layout.models.Entity at 0x7f77cc76be50>,
'echo': <bids.layout.models.Entity at 0x7f77cc50e9d0>,
'recording': <bids.layout.models.Entity at 0x7f77cc50ef90>,
'space': <bids.layout.models.Entity at 0x7f77cc50ecd0>,
'suffix': <bids.layout.models.Entity at 0x7f77cc836c10>,
'scans': <bids.layout.models.Entity at 0x7f77cc50ec50>,
'fmap': <bids.layout.models.Entity at 0x7f77cc50e8d0>,
'datatype': <bids.layout.models.Entity at 0x7f77cc836310>,
'extension': <bids.layout.models.Entity at 0x7f77cc836550>,
'EchoTime': <bids.layout.models.Entity at 0x7f77cc8a1410>,
'FlipAngle': <bids.layout.models.Entity at 0x7f77cc8a1650>,
'RepetitionTime': <bids.layout.models.Entity at 0x7f77cc8a1950>,
'SliceTiming': <bids.layout.models.Entity at 0x7f77cc8a1c10>,
'TaskName': <bids.layout.models.Entity at 0x7f77cc8a1e50>}
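With return_type='filename', we get back only the names of the files. The get_subjects(), get_tasks() and get_entities() calls above list the subjects, tasks and entities that the layout has indexed.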
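To make the idea of listing unique entity values more concrete, here is a rough stand-alone sketch that collects them from filenames with a regular expression. It does not use pybids: the helper `unique_entity_values` and the short list of paths are assumptions made for illustration only, and the real library relies on its own configurable patterns.

```python
import re

# Hypothetical helper, not part of pybids: collect the unique values of one
# "key-value" filename entity (e.g. "sub" or "task") across a list of paths.
def unique_entity_values(paths, key):
    pattern = re.compile(r"(?:^|[_/])" + re.escape(key) + r"-([a-zA-Z0-9]+)")
    values = set()
    for path in paths:
        match = pattern.search(path)
        if match:
            values.add(match.group(1))
    return sorted(values)

# Illustrative relative paths modelled on the listing above
paths = [
    "sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz",
    "sub-01/ses-test/func/sub-01_ses-test_task-linebisection_bold.nii.gz",
    "sub-02/ses-retest/anat/sub-02_ses-retest_T1w.nii.gz",
]

subjects = unique_entity_values(paths, "sub")
tasks = unique_entity_values(paths, "task")
print(subjects)  # ['01', '02']
print(tasks)     # ['fingerfootlips', 'linebisection']
```

Files that lack the requested entity (such as the T1w image, which has no task) simply contribute nothing to the result.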
Filtering files by entities¶
The utility of the BIDSLayout
would be pretty limited if all we could do was retrieve a list of all files in the dataset. Fortunately, the .get()
method accepts all kinds of arguments that allow us to filter the result set based on specified criteria. In fact, we can pass any BIDS-defined keywords (or, as they’re called in PyBIDS, entities) as constraints. For example, here’s how we would retrieve all BOLD runs with .nii.gz
extensions for subject ‘01’:
# Retrieve filenames of all BOLD runs for subject 01
layout.get(subject='01', extension='nii.gz', suffix='bold', return_type='filename')
['/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-fingerfootlips_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-overtverbgeneration_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-overtwordrepetition_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-covertverbgeneration_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-linebisection_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-overtverbgeneration_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-overtwordrepetition_bold.nii.gz']
layout.get(subject='02', return_type='file', task="linebisection")
['/home/neuro/workshop/data/ds000114/sub-02/ses-retest/func/sub-02_ses-retest_task-linebisection_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-02/ses-retest/func/sub-02_ses-retest_task-linebisection_events.tsv',
'/home/neuro/workshop/data/ds000114/sub-02/ses-test/func/sub-02_ses-test_task-linebisection_bold.nii.gz',
'/home/neuro/workshop/data/ds000114/sub-02/ses-test/func/sub-02_ses-test_task-linebisection_events.tsv']
If you’re wondering what entities you can pass in as filtering arguments, the answer is contained in the .json configuration files that ship with PyBIDS. To save you the trouble, here are a few of the most common entities:
suffix: The part of a BIDS filename just before the extension (e.g., ‘bold’, ‘events’, ‘physio’, etc.).
subject: The subject label
session: The session label
run: The run index
task: The task name
New entities are continually being defined as the spec grows, and in principle (though not always in practice), PyBIDS should be aware of all entities that are defined in the BIDS specification.
Filtering by metadata¶
All of the entities listed above are found in the names of BIDS files. But sometimes we want to search for files based not just on their names, but also based on metadata defined (per the BIDS spec) in JSON files. Fortunately for us, when we initialize a BIDSLayout
, all metadata files associated with BIDS files are automatically indexed. This means we can pass any key that occurs in any JSON file in our project as an argument to .get()
. We can combine these with any number of core BIDS entities (like subject
, run
, etc.).
For example, say we want to retrieve all files where (a) the value of SamplingFrequency
(a metadata key) is 100
, (b) the acquisition
type is 'prefrontal'
, and (c) the subject is '01'
or '02'
. Here’s how we can do that:
# Retrieve all files where SamplingFrequency (a metadata key) = 100
# and acquisition = prefrontal, for the first two subjects
layout.get(subject=['01', '02'], SamplingFrequency=100, acquisition='prefrontal')
[<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/anat/sub-01_ses-retest_T1w.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/dwi/sub-01_ses-retest_dwi.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-fingerfootlips_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_bold.nii.gz'>,
<BIDSDataFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-linebisection_events.tsv'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-overtverbgeneration_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-retest/func/sub-01_ses-retest_task-overtwordrepetition_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-test/anat/sub-01_ses-test_T1w.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-test/dwi/sub-01_ses-test_dwi.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-covertverbgeneration_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-linebisection_bold.nii.gz'>,
<BIDSDataFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-linebisection_events.tsv'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-overtverbgeneration_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-overtwordrepetition_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-retest/anat/sub-02_ses-retest_T1w.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-retest/dwi/sub-02_ses-retest_dwi.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-retest/func/sub-02_ses-retest_task-covertverbgeneration_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-retest/func/sub-02_ses-retest_task-fingerfootlips_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-retest/func/sub-02_ses-retest_task-linebisection_bold.nii.gz'>,
<BIDSDataFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-retest/func/sub-02_ses-retest_task-linebisection_events.tsv'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-retest/func/sub-02_ses-retest_task-overtverbgeneration_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-retest/func/sub-02_ses-retest_task-overtwordrepetition_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-test/anat/sub-02_ses-test_T1w.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-test/dwi/sub-02_ses-test_dwi.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-test/func/sub-02_ses-test_task-covertverbgeneration_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-test/func/sub-02_ses-test_task-fingerfootlips_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-test/func/sub-02_ses-test_task-linebisection_bold.nii.gz'>,
<BIDSDataFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-test/func/sub-02_ses-test_task-linebisection_events.tsv'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-test/func/sub-02_ses-test_task-overtverbgeneration_bold.nii.gz'>,
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-02/ses-test/func/sub-02_ses-test_task-overtwordrepetition_bold.nii.gz'>]
Notice that we passed a list in for subject
rather than just a string. This principle applies to all filters: you can always pass in a list instead of a single value, and this will be interpreted as a logical disjunction (i.e., a file must match any one of the provided values).
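The matching rule described here (every filter must hold, and a list-valued filter holds if any of its values matches) can be sketched in plain Python. This is only an illustration of the logic, not how pybids implements it; the `matches` helper and the records below are made up for the example.

```python
# Hypothetical records: each file is described by a flat dict of entities
# and metadata. The values are invented for illustration.
files = [
    {"subject": "01", "suffix": "physio", "SamplingFrequency": 100},
    {"subject": "02", "suffix": "physio", "SamplingFrequency": 100},
    {"subject": "03", "suffix": "physio", "SamplingFrequency": 100},
    {"subject": "01", "suffix": "bold", "RepetitionTime": 2.5},
]

def matches(record, **filters):
    """Return True if the record satisfies every filter (logical AND).

    A list-valued filter is satisfied when the record's value equals any
    one of the listed values (logical OR).
    """
    for key, wanted in filters.items():
        allowed = wanted if isinstance(wanted, (list, tuple)) else [wanted]
        if record.get(key) not in allowed:
            return False
    return True

selected = [f for f in files
            if matches(f, subject=["01", "02"], SamplingFrequency=100)]
print([f["subject"] for f in selected])  # ['01', '02']
```

The BOLD record for subject 01 is excluded because it has no SamplingFrequency value, and subject 03 is excluded because it is not in the subject list.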
Other return_type
values¶
While we’ll typically want to work with either BIDSFile
objects or filenames, we can also ask get()
to return unique values (or ids) of particular entities. For example, say we want to know which subjects have at least one T1w
file. We can request that information by setting return_type='id'
. When using this option, we also need to specify a target entity (or metadata keyword) called target
. This combination tells the BIDSLayout
to return the unique values for the specified target
entity. In the next example, we ask for all of the unique subject IDs that have at least one file with a T1w
suffix:
# Ask get() to return the ids of subjects that have T1w files
layout.get(return_type='id', target='subject', suffix='T1w')
['01', '02', '03', '04', '05', '06', '07', '08', '09', '10']
If our target
is a BIDS entity that corresponds to a particular directory in the BIDS spec (e.g., subject
or session
) we can also use return_type='dir'
to get all matching subdirectories:
layout.get(return_type='dir', target='subject')
[]
Other get()
options¶
The .get()
method has a number of other useful arguments that control its behavior. We won’t discuss these in detail here, but briefly, here are a couple worth knowing about:
regex_search: If you set this to True, string filter argument values will be interpreted as regular expressions.
scope: If your BIDS dataset contains BIDS-derivatives sub-datasets, you can specify where you wa…
The BIDSFile
¶
When you call .get()
on a BIDSLayout
, the default returned values are objects of class BIDSFile
. A BIDSFile
is a lightweight container for individual files in a BIDS dataset. It provides easy access to a variety of useful attributes and methods. Let’s take a closer look. First, let’s pick an arbitrary file from our existing layout.
# Pick the file at index 15 (the 16th file) in the dataset
bf = layout.get()[15]
# Print it
bf
<BIDSImageFile filename='/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz'>
Here are some of the attributes and methods available to us in a BIDSFile
(note that some of these are only available for certain subclasses of BIDSFile
; e.g., you can’t call get_image()
on a BIDSFile
that doesn’t correspond to an image file!):
.path: The full path of the associated file
.filename: The associated file’s filename (without directory)
.dirname: The directory containing the file
.get_entities(): Returns information about entities associated with this BIDSFile (optionally including metadata)
.get_image(): Returns the file contents as a nibabel image (only works for image files)
.get_df(): Get file contents as a pandas DataFrame (only works for TSV files)
.get_metadata(): Returns a dictionary of all metadata found in associated JSON files
.get_associations(): Returns a list of all files associated with this one in some way
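As a quick illustration of how the three path-related attributes relate to one another, the sketch below derives a filename and a directory from a full path using only the standard library. It assumes, based on the descriptions above, that the filename is the path's base name and the dirname is its containing directory; it does not call pybids.

```python
import os

# Full path of the example file used in this section
path = ("/home/neuro/workshop/data/ds000114/sub-01/ses-test/func/"
        "sub-01_ses-test_task-fingerfootlips_bold.nii.gz")

filename = os.path.basename(path)  # file name without the directory
dirname = os.path.dirname(path)    # directory that contains the file

print(filename)  # sub-01_ses-test_task-fingerfootlips_bold.nii.gz
print(dirname)   # /home/neuro/workshop/data/ds000114/sub-01/ses-test/func
```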
Let’s see some of these in action.
# Print all the entities associated with this file, and their values
bf.get_entities()
{'datatype': 'func',
'extension': 'nii.gz',
'session': 'test',
'subject': '01',
'suffix': 'bold',
'task': 'fingerfootlips'}
# Print all the metadata associated with this file
bf.get_metadata()
{'EchoTime': 0.05,
'FlipAngle': 90,
'RepetitionTime': 2.5,
'SliceTiming': [0.0,
1.2499999999999998,
0.08333333333333333,
1.333333333333333,
0.16666666666666666,
1.4166666666666663,
0.25,
1.4999999999999996,
0.3333333333333333,
1.5833333333333328,
0.41666666666666663,
1.666666666666666,
0.5,
1.7499999999999993,
0.5833333333333333,
1.8333333333333326,
0.6666666666666666,
1.9166666666666659,
0.75,
1.9999999999999991,
0.8333333333333333,
2.083333333333332,
0.9166666666666666,
2.1666666666666656,
1.0,
2.249999999999999,
1.0833333333333333,
2.333333333333332,
1.1666666666666665,
2.416666666666665],
'TaskName': 'finger_foot_lips'}
# We can get the union of both of the above in one shot like this
bf.get_entities(metadata='all')
{'EchoTime': 0.05,
'FlipAngle': 90,
'RepetitionTime': 2.5,
'SliceTiming': [0.0,
1.2499999999999998,
0.08333333333333333,
1.333333333333333,
0.16666666666666666,
1.4166666666666663,
0.25,
1.4999999999999996,
0.3333333333333333,
1.5833333333333328,
0.41666666666666663,
1.666666666666666,
0.5,
1.7499999999999993,
0.5833333333333333,
1.8333333333333326,
0.6666666666666666,
1.9166666666666659,
0.75,
1.9999999999999991,
0.8333333333333333,
2.083333333333332,
0.9166666666666666,
2.1666666666666656,
1.0,
2.249999999999999,
1.0833333333333333,
2.333333333333332,
1.1666666666666665,
2.416666666666665],
'TaskName': 'finger_foot_lips',
'datatype': 'func',
'extension': 'nii.gz',
'session': 'test',
'subject': '01',
'suffix': 'bold',
'task': 'fingerfootlips'}
Here are all the files associated with our target file in some way. In this case, we get back the JSON sidecar that applies to our target BOLD run.
bf.get_associations()
[<BIDSJSONFile filename='/home/neuro/workshop/data/ds000114/task-fingerfootlips_bold.json'>]
In cases where a file has a .tsv.gz
or .tsv
extension, it will automatically be created as a BIDSDataFile
, and we can easily grab the contents as a pandas DataFrame
:
# Use a different test dataset--one that contains physio recording files
data_path = os.path.join(get_test_data_path(), 'synthetic')
layout2 = BIDSLayout(data_path)
# Get the first physiological recording file
recfile = layout2.get(suffix='physio')[0]
# Get contents as a DataFrame and show the first few rows
df = recfile.get_df()
df.head()
| | onset | respiratory | cardiac |
|---|---|---|---|
| 0 | 0.0 | -0.757342 | 0.048933 |
| 1 | 0.1 | -0.796851 | 0.355185 |
| 2 | 0.2 | -0.833215 | 0.626669 |
| 3 | 0.3 | -0.866291 | 0.836810 |
| 4 | 0.4 | -0.895948 | 0.965038 |
While it would have been easy enough to read the contents of the file ourselves with pandas’ read_csv()
method, notice that in the above example, get_df()
saved us the trouble of having to read the physiological recording file’s metadata, pull out the column names and sampling rate, and add timing information.
Mind you, if we don’t want the timing information, we can ignore it:
recfile.get_df(include_timing=False).head()
| | respiratory | cardiac |
|---|---|---|
| 0 | -0.757342 | 0.048933 |
| 1 | -0.796851 | 0.355185 |
| 2 | -0.833215 | 0.626669 |
| 3 | -0.866291 | 0.836810 |
| 4 | -0.895948 | 0.965038 |
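To show what get_df() spares us, here is a hand-rolled sketch of the same idea: take column names and timing from the sidecar metadata and attach an onset to each row. The metadata keys (SamplingFrequency, StartTime, Columns) are the ones the BIDS specification defines for physiological recordings, but the values below are assumptions; the 10 Hz rate is inferred from the 0.1 s spacing of the onsets shown above. The `add_onsets` helper is hypothetical and is not the pybids implementation.

```python
# Assumed sidecar metadata for a physio recording (values are illustrative)
metadata = {
    "SamplingFrequency": 10.0,  # samples per second
    "StartTime": 0.0,           # seconds
    "Columns": ["respiratory", "cardiac"],
}

# Raw rows as they would appear in the headerless TSV file
rows = [
    (-0.757342, 0.048933),
    (-0.796851, 0.355185),
    (-0.833215, 0.626669),
]

def add_onsets(rows, metadata):
    """Label each row with its column names and a computed onset in seconds."""
    start = metadata["StartTime"]
    freq = metadata["SamplingFrequency"]
    table = []
    for i, row in enumerate(rows):
        record = {"onset": start + i / freq}
        record.update(zip(metadata["Columns"], row))
        table.append(record)
    return table

table = add_onsets(rows, metadata)
for record in table:
    print(record)
```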
Other utilities¶
Filename parsing¶
Say you have a filename, and you want to manually extract BIDS entities from it. The parse_file_entities
method provides the facility:
path = "/a/fake/path/to/a/BIDS/file/sub-01_run-1_T2w.nii.gz"
layout.parse_file_entities(path)
{'subject': '01', 'run': 1, 'suffix': 'T2w', 'extension': 'nii.gz'}
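For intuition, the parsing step can be approximated in a few lines of plain Python. The sketch below is a deliberately simplified stand-in: the `parse_entities` function and the `KEY_TO_ENTITY` table are hypothetical, cover only a handful of entities, and do not reflect how pybids actually matches filenames.

```python
# Simplified, hypothetical parser; pybids uses its own configurable patterns.
KEY_TO_ENTITY = {"sub": "subject", "ses": "session", "task": "task", "run": "run"}

def parse_entities(path):
    filename = path.rsplit("/", 1)[-1]
    # Everything after the first dot is treated as the extension
    stem, _, extension = filename.partition(".")
    parts = stem.split("_")
    entities = {}
    # All parts except the last are "key-value" pairs
    for part in parts[:-1]:
        key, _, value = part.partition("-")
        name = KEY_TO_ENTITY.get(key)
        if name is not None:
            # The run index is numeric; other entities stay as strings
            entities[name] = int(value) if name == "run" else value
    # The last part of the stem is the suffix
    entities["suffix"] = parts[-1]
    entities["extension"] = extension
    return entities

parsed = parse_entities("/a/fake/path/to/a/BIDS/file/sub-01_run-1_T2w.nii.gz")
print(parsed)
# {'subject': '01', 'run': 1, 'suffix': 'T2w', 'extension': 'nii.gz'}
```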
Path construction¶
You may want to create valid BIDS filenames for files that are new or hypothetical that would sit within your BIDS project. This is useful when you know what entity values you need to write out to, but don’t want to deal with looking up the precise BIDS file-naming syntax. In the example below, imagine we’ve acquired a new BOLD run and want to name its file per the BIDS naming conventions. All we need to do is define a dictionary with the name components, and build_path takes care of the rest (including injecting sub-directories!):
entities = {
'subject': '01',
'run': 2,
'task': 'nback',
'suffix': 'bold'
}
layout.build_path(entities)
'sub-01/func/sub-01_task-nback_run-2_bold.nii.gz'
You can also use build_path
in more sophisticated ways—for example, by defining your own set of matching templates that cover cases not supported by BIDS out of the box. For example, suppose you want to create a template for naming a new z-stat file. You could do something like:
# NBVAL_SKIP
# Define the pattern to build out of the components passed in the dictionary
pattern = "sub-{subject}[_ses-{session}]_task-{task}[_acq-{acquisition}][_rec-{reconstruction}][_run-{run}][_echo-{echo}]_{suffix<z>}.nii.gz"
entities = {
'subject': '01',
'run': 2,
'task': 'n-back',
'suffix': 'z'
}
# Notice we pass the new pattern as the second argument
layout.build_path(entities, pattern, validate=False)
'sub-01_task-n-back_run-2_z.nii.gz'
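The pattern syntax used above has two ingredients: `{entity}` placeholders and optional segments in square brackets that disappear when their entity is missing. The sketch below re-creates just that behaviour with the standard library. The function `build_from_pattern` is a hypothetical illustration, it ignores value constraints such as `<z>` rather than enforcing them, and it should not be read as the pybids implementation.

```python
import re

def build_from_pattern(entities, pattern):
    """Fill a path pattern with entity values.

    Segments in square brackets are optional and are dropped when an entity
    they reference is missing. A constraint such as {suffix<z>} is reduced
    to a plain placeholder; allowed values are not checked in this sketch.
    """
    # Strip value constraints: {suffix<z>} -> {suffix}
    pattern = re.sub(r"\{(\w+)<[^>]*>\}", r"{\1}", pattern)

    def keep_or_drop(match):
        segment = match.group(1)
        names = re.findall(r"\{(\w+)\}", segment)
        return segment if all(name in entities for name in names) else ""

    # Resolve each optional [...] segment
    pattern = re.sub(r"\[([^\]]*)\]", keep_or_drop, pattern)

    # Remaining placeholders are required; a missing one raises KeyError
    return pattern.format(**entities)

pattern = ("sub-{subject}[_ses-{session}]_task-{task}[_acq-{acquisition}]"
           "[_rec-{reconstruction}][_run-{run}][_echo-{echo}]_{suffix<z>}.nii.gz")
entities = {"subject": "01", "run": 2, "task": "n-back", "suffix": "z"}

built = build_from_pattern(entities, pattern)
print(built)  # sub-01_task-n-back_run-2_z.nii.gz
```

Only the run segment survives among the optional parts, because session, acquisition, reconstruction and echo are absent from the dictionary.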
Exporting a BIDSLayout
to a pandas DataFrame
¶
If you want a summary of all the files in your BIDSLayout
, but don’t want to have to iterate over BIDSFile
objects and extract their entities, you can get a nice bird’s-eye view of your dataset using the to_df()
method.
# Convert the layout to a pandas dataframe
df = layout.to_df()
df.head()
entity | path | datatype | extension | session | subject | suffix | task |
---|---|---|---|---|---|---|---|
0 | /home/neuro/workshop/data/ds000114/dataset_des... | NaN | json | NaN | NaN | description | NaN |
1 | /home/neuro/workshop/data/ds000114/dwi.bval | NaN | bval | NaN | NaN | dwi | NaN |
2 | /home/neuro/workshop/data/ds000114/dwi.bvec | NaN | bvec | NaN | NaN | dwi | NaN |
3 | /home/neuro/workshop/data/ds000114/sub-01/ses-... | anat | nii.gz | retest | 01 | T1w | NaN |
4 | /home/neuro/workshop/data/ds000114/sub-01/ses-... | dwi | nii.gz | retest | 01 | dwi | NaN |
We can also include metadata in the result if we like (which may blow up our DataFrame
if we have a large dataset). Note that in this case, most of our cells will have missing values.
layout.to_df(metadata=True).head()
entity | path | EchoTime | FlipAngle | RepetitionTime | SliceTiming | TaskName | datatype | extension | session | subject | suffix | task |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | /home/neuro/workshop/data/ds000114/dataset_des... | NaN | NaN | NaN | NaN | NaN | NaN | json | NaN | NaN | description | NaN |
1 | /home/neuro/workshop/data/ds000114/dwi.bval | NaN | NaN | NaN | NaN | NaN | NaN | bval | NaN | NaN | dwi | NaN |
2 | /home/neuro/workshop/data/ds000114/dwi.bvec | NaN | NaN | NaN | NaN | NaN | NaN | bvec | NaN | NaN | dwi | NaN |
3 | /home/neuro/workshop/data/ds000114/sub-01/ses-... | NaN | NaN | NaN | NaN | NaN | anat | nii.gz | retest | 01 | T1w | NaN |
4 | /home/neuro/workshop/data/ds000114/sub-01/ses-... | NaN | NaN | NaN | NaN | NaN | dwi | nii.gz | retest | 01 | dwi | NaN |
BIDSValidator¶
pybids
implicitly imports a BIDSValidator
class from the separate bids-validator
package. You can use the BIDSValidator
to determine whether a filepath is a valid BIDS filepath, as well as answering questions about what kind of data it represents. Note, however, that this implementation of the BIDS validator is not necessarily up-to-date with the JavaScript version available online. Moreover, the Python validator only tests individual files, and is currently unable to validate entire BIDS datasets. For that, you should use the online BIDS validator.
from bids import BIDSValidator
# Note that when using the bids validator, the filepath MUST be relative to the top level bids directory
validator = BIDSValidator()
validator.is_bids('/sub-01/ses-test/anat/sub-01_ses-test_T1w.nii.gz')
True
# Can decide if a filepath represents a file that is part of the specification
validator.is_file('/sub-01/ses-test/anat/sub-01_ses-test_T1w.nii.gz')
True
# Can check whether a file sits at the top level of the dataset
validator.is_top_level('/dataset_description.json')
True
# or subject (or session) level
validator.is_subject_level('/dataset_description.json')
False
validator.is_session_level('/sub-01/ses-test/anat/sub-01_ses-test_T1w.json')
False
# Can decide if a filepath represents phenotypic data
validator.is_phenotypic('/sub-01/ses-test/anat/sub-01_ses-test_T1w.nii.gz')
False
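Because the validator expects paths relative to the dataset root (written with a leading slash in the calls above), absolute paths need converting first. One way to do that with the standard library is sketched below; the helper name `to_validator_path` is hypothetical and not part of pybids or bids-validator.

```python
from pathlib import PurePosixPath

def to_validator_path(abs_path, dataset_root):
    """Hypothetical helper: express abs_path relative to dataset_root,
    with the leading slash used in the validator calls above.

    Raises ValueError if abs_path is not located under dataset_root.
    """
    relative = PurePosixPath(abs_path).relative_to(PurePosixPath(dataset_root))
    return "/" + str(relative)

root = "/home/neuro/workshop/data/ds000114"
full = root + "/sub-01/ses-test/anat/sub-01_ses-test_T1w.nii.gz"

rel = to_validator_path(full, root)
print(rel)  # /sub-01/ses-test/anat/sub-01_ses-test_T1w.nii.gz
```

The resulting string has the same shape as the paths passed to the validator methods above.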
Report generation¶
PyBIDS also allows you to automatically create data acquisition reports based on the available image and metadata information. Comparable to functionality in, e.g., fMRIPrep, this enables a new level of standardization and subsequent advancements (transparency, FAIR-ness, meta-analyses, etc.). Let’s import the BIDSReport class from the reports submodule and initialize a BIDSLayout for a BIDS dataset:
from bids.reports import BIDSReport
from bids.tests import get_test_data_path
layout = BIDSLayout(os.path.join(get_test_data_path(), 'synthetic'))
Great, now we only need to pass our layout to BIDSReport and generate our report. (We’ll get some warnings, as the example dataset is missing some parts.)
report = BIDSReport(layout)
counter = report.generate()
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-01/ses-02/anat/sub-01_ses-02_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-01/ses-02/anat/sub-01_ses-02_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-01/ses-02/anat/sub-01_ses-02_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-01/ses-02/anat/sub-01_ses-02_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-02/ses-01/anat/sub-02_ses-01_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-02/ses-01/anat/sub-02_ses-01_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-02/ses-02/anat/sub-02_ses-02_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-02/ses-02/anat/sub-02_ses-02_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-02/ses-01/anat/sub-02_ses-01_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-02/ses-01/anat/sub-02_ses-01_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-02/ses-02/anat/sub-02_ses-02_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-02/ses-02/anat/sub-02_ses-02_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-03/ses-01/anat/sub-03_ses-01_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-03/ses-01/anat/sub-03_ses-01_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-03/ses-02/anat/sub-03_ses-02_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-03/ses-02/anat/sub-03_ses-02_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-03/ses-01/anat/sub-03_ses-01_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-03/ses-01/anat/sub-03_ses-01_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-03/ses-02/anat/sub-03_ses-02_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-03/ses-02/anat/sub-03_ses-02_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-04/ses-01/anat/sub-04_ses-01_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-04/ses-01/anat/sub-04_ses-01_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-04/ses-02/anat/sub-04_ses-02_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-04/ses-02/anat/sub-04_ses-02_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-04/ses-01/anat/sub-04_ses-01_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-04/ses-01/anat/sub-04_ses-01_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-04/ses-02/anat/sub-04_ses-02_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-04/ses-02/anat/sub-04_ses-02_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-05/ses-01/anat/sub-05_ses-01_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-05/ses-01/anat/sub-05_ses-01_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-05/ses-02/anat/sub-05_ses-02_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-05/ses-02/anat/sub-05_ses-02_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-05/ses-01/anat/sub-05_ses-01_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-05/ses-01/anat/sub-05_ses-01_T1w.nii.gz
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-05/ses-02/anat/sub-05_ses-02_T1w.nii
WARNING:pybids.reports.parsing:No json file found for /opt/miniconda-latest/envs/neuro/lib/python3.7/site-packages/bids/tests/data/synthetic/sub-05/ses-02/anat/sub-05_ses-02_T1w.nii.gz
Number of patterns detected: 1
Remember to double-check everything and to replace <deg> with a degree symbol.
As we can see, one pattern was detected. We can now inspect the corresponding text, which is our report:
main_report = counter.most_common()[0][0]
print(main_report)
For session 01:
MR data were acquired using a UNKNOWN-Tesla MANUFACTURER MODEL MRI scanner.
Two runs of N-Back UNKNOWN-echo fMRI data were collected (64 slices; repetition time, TR=2500ms; echo time, TE=UNKNOWNms; flip angle, FA=UNKNOWN<deg>; field of view, FOV=128x128mm; matrix size=64x64; voxel size=2x2x2mm). Each run was 2:40 minutes in length, during which 64 functional volumes were acquired.
Zero runs of Rest UNKNOWN-echo fMRI data were collected (64 slices; repetition time, TR=2500ms; echo time, TE=UNKNOWNms; flip angle, FA=UNKNOWN<deg>; field of view, FOV=128x128mm; matrix size=64x64; voxel size=2x2x2mm). Each run was 2:40 minutes in length, during which 64 functional volumes were acquired.
For session 02:
MR data were acquired using a UNKNOWN-Tesla MANUFACTURER MODEL MRI scanner.
Two runs of N-Back UNKNOWN-echo fMRI data were collected (64 slices; repetition time, TR=2500ms; echo time, TE=UNKNOWNms; flip angle, FA=UNKNOWN<deg>; field of view, FOV=128x128mm; matrix size=64x64; voxel size=2x2x2mm). Each run was 2:40 minutes in length, during which 64 functional volumes were acquired.
Zero runs of Rest UNKNOWN-echo fMRI data were collected (64 slices; repetition time, TR=2500ms; echo time, TE=UNKNOWNms; flip angle, FA=UNKNOWN<deg>; field of view, FOV=128x128mm; matrix size=64x64; voxel size=2x2x2mm). Each run was 2:40 minutes in length, during which 64 functional volumes were acquired.
Dicoms were converted to NIfTI-1 format. This section was (in part) generated automatically using pybids (0.10.2).
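The expression counter.most_common()[0][0] used above relies on the standard-library Counter behaviour: most_common() returns (item, count) pairs ordered from most to least frequent. A minimal sketch with placeholder strings (not real report text) shows the indexing:

```python
from collections import Counter

# Stand-in for the counter used above; the keys are placeholders, not
# actual report text, and the counts are invented.
counter = Counter({"report text A": 5, "report text B": 1})

# most_common() returns (item, count) pairs sorted by descending count,
# so the first pair holds the most frequent report and how often it occurred.
text, count = counter.most_common()[0]
print(count)  # 5
print(text)   # report text A
```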