Test data infrastructure
nf-neuro provides an infrastructure to host and distribute test data, freely available to all contributors. Access to this data is done through the dedicated subworkflow LOAD_TEST_DATA. Test data packages are listed in tests/config/test_data.json. Introspection of their content is available in VS Code only, using the Test Data Explorer extension. Once installed, it adds the Test Data Explorer to the explorer panel (Ctrl+Shift+E), which you can use to browse and download test data packages and their content.
To download test data inside your test workflows, first include the LOAD_TEST_DATA subworkflow in their main.nf:
include { LOAD_TEST_DATA } from '../../../../../subworkflows/nf-neuro/load_test_data/main'
The subworkflow has two inputs:
- A channel containing a list of package names to download.
- A name for the temporary directory where the data will be put.
To call it, use the following syntax:
archives = Channel.from( [ "<archive1>", "<archive2>", ... ] )
LOAD_TEST_DATA( archives, "<directory>" )
Important
This will download the archives and unpack them under the directory specified, using the archives' names as sub-directories to unpack to.
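As a minimal sketch of this behaviour, assuming two hypothetical package names (heavy_dataset and light_dataset, not actual nf-neuro entries) and a temporary directory named test_data, the call and resulting layout would look like this:
// Hypothetical package names — replace with real entries from tests/config/test_data.json
archives = Channel.from( [ "heavy_dataset", "light_dataset" ] )
LOAD_TEST_DATA( archives, "test_data" )
// Following the behaviour described above, each archive is unpacked into its
// own sub-directory of the temporary directory :
//   test_data/heavy_dataset/...
//   test_data/light_dataset/...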
The archives' contents are accessed using the output parameter of the subworkflow, LOAD_TEST_DATA.out.test_data_directory. To create the test input from it for a given PROCESS to test, use the .map operator:
input = LOAD_TEST_DATA.out.test_data_directory
    .map{ test_data_directory -> [
        [ id:'test', single_end:false ], // meta map
        file("${test_data_directory}/<file for input 1>"),
        file("${test_data_directory}/<file for input 2>"),
        ...
    ] }
Then feed it to the process:
PROCESS( input )
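Putting the pieces together, a complete test workflow could look like the sketch below. The module name EXAMPLE_PROCESS, the package name heavy_dataset and the file name image.nii.gz are hypothetical placeholders, not actual nf-neuro entries; substitute the module, package and files your test needs.
// Hypothetical end-to-end test workflow sketch
include { LOAD_TEST_DATA } from '../../../../../subworkflows/nf-neuro/load_test_data/main'
include { EXAMPLE_PROCESS } from '../../../../../modules/nf-neuro/example/process/main'

workflow test_example_process {
    // Download the test data package into the temporary directory "test_data"
    archives = Channel.from( [ "heavy_dataset" ] )
    LOAD_TEST_DATA( archives, "test_data" )

    // Build the process input from the unpacked archive contents
    input = LOAD_TEST_DATA.out.test_data_directory
        .map{ test_data_directory -> [
            [ id:'test', single_end:false ], // meta map
            file("${test_data_directory}/image.nii.gz") // hypothetical file from the package
        ] }

    EXAMPLE_PROCESS( input )
}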
Info
The subworkflow must be called individually in each test workflow, even if they download the same archives, since there is no mechanism to pass data channels to them from the outside. Nevertheless, downloaded archives are cached locally to ensure efficiency and preserve bandwidth.