Your pipeline from A to Z
Environment Setup
We recommend using VS Code and the development container. Follow the setup documentation. You should be working in a folder containing the GitHub repo: nf-neuro-tutorial.
Explore the GitHub repository structure
The tutorial folder is pre-configured with the necessary files and directories. The config, tests, modules and subworkflows folders contain the pre-installed nf-neuro components, while the data folder contains the data provided for the tutorial. Before starting to play with the data and the code, we will review some of the major files and folders in the structure: nextflow.config, main.nf and the data folder.
Here is the current structure:
```
nf-neuro-tutorial/
├── .devcontainer/
├── config/
│   └── …
├── data/
│   └── …
├── modules/
│   └── …
├── subworkflows/
│   └── …
├── tests/
│   └── …
├── .gitignore
├── .nf-core.yml
├── README.md
├── main.nf
├── modules.json
├── nextflow.config
└── nf-test.config
```
nextflow.config
The nextflow.config file defines execution parameters and default configurations. It also contains parameters that users can change when calling your pipeline (prefixed with params.). Here is an example of a basic nextflow.config file:
```groovy
profiles {
    docker {
        docker.enabled       = true
        conda.enabled        = false
        singularity.enabled  = false
        podman.enabled       = false
        shifter.enabled      = false
        charliecloud.enabled = false
        apptainer.enabled    = false
        docker.runOptions    = '-u $(id -u):$(id -g)'
    }
}

manifest {
    name        = 'scilus/nf-neuro-tutorial'
    description = """nf-neuro-tutorial is a Nextflow pipeline for processing neuroimaging data."""
    version     = '0.1dev'
}

params.input  = false
params.output = 'result'
```
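With the docker profile defined above, users can run every process of the pipeline inside Docker containers by selecting the profile at launch with Nextflow's built-in -profile option:

```bash
nextflow run main.nf -profile docker
```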
The parameters defined with params. can be changed at execution by another nextflow.config file or by supplying them as arguments when calling the pipeline with nextflow run:

```bash
nextflow run main.nf --input /path/to/input --output /path/to/output
```
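The overrides can also live in a separate configuration file supplied with Nextflow's -c option. A minimal sketch, assuming a file named custom.config (the file name and the paths are placeholders):

```groovy
// custom.config — hypothetical override file; adjust the paths to your data
params.input  = '/path/to/input'
params.output = '/path/to/output'
```

```bash
nextflow run main.nf -c custom.config
```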
main.nf
This file is your pipeline execution file. It contains all the modules and subworkflows you want to run, and the channels that define how data passes between them. This is also where you define how to fetch your pipeline's input files. This can be done using a workflow definition called get_data.
```nextflow
#!/usr/bin/env nextflow

workflow get_data {
    main:
        if ( !params.input ) {
            log.info "You must provide an input directory containing all files using:"
            log.info ""
            log.info "    --input=/path/to/[input]   Input directory containing the file needed"
            log.info "                        |"
            log.info "                        └-- Input"
            log.info "                            └-- participants.*"
            log.info ""
            error "Please resubmit your command with the previous file structure."
        }

        input = file(params.input)

        // ** Loading all files. ** //
        participants_channel = Channel.fromFilePairs("$input/participants.*", flat: true) { "participants_files" }

    emit:
        participants = participants_channel
}

workflow {
    // ** Now call your input workflow to fetch your files ** //
    data = get_data()
    data.participants.view()
}
```
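You can already try this skeleton against the tutorial's data folder; view() prints each item emitted by the participants channel (the exact output depends on the files matched):

```bash
nextflow run main.nf --input data
```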
Data
To keep things simple, we'll assume you want to process a BIDS dataset that contains, for one subject and session, a DWI and a T1 image, as follows:
```
data/
├── dataset_description.json
├── participants.json
├── participants.tsv
└── sub-003/
    └── ses-01/
        ├── anat/
        │   ├── sub-003_ses-01_T1w.json
        │   └── sub-003_ses-01_T1w.nii.gz
        └── dwi/
            ├── sub-003_ses-01_dir-AP_dwi.bval
            ├── sub-003_ses-01_dir-AP_dwi.bvec
            ├── sub-003_ses-01_dir-AP_dwi.json
            └── sub-003_ses-01_dir-AP_dwi.nii.gz
```
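As a preview, here is a minimal sketch of how such a layout could be globbed into channels inside get_data. The glob patterns and the size: 3 grouping are illustrative assumptions for this dataset, not the tutorial's finished code:

```nextflow
// Hypothetical globs over the BIDS tree shown above.
input = file(params.input)

// One T1w image per subject/session.
t1_channel = Channel.fromPath("$input/sub-*/ses-*/anat/*_T1w.nii.gz")

// Group each DWI image with its .bval/.bvec companions (3 files per key).
dwi_channel = Channel.fromFilePairs(
    "$input/sub-*/ses-*/dwi/*_dwi.{nii.gz,bval,bvec}",
    size: 3, flat: true
)

dwi_channel.view()
```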