Writing a pipeline interface


If you want to use looper to run samples in a PEP through an arbitrary shell command, you will need to write a pipeline interface. This walkthrough shows how to write a simple interface file. Once you've been through it, you can consult the formal pipeline interface format specification for further details and reference.


Let's start with a simple example from the hello_looper repository:

pipeline_name: count_lines
pipeline_type: sample
var_templates:
  pipeline: '{looper.piface_dir}/'
command_template: >
  {pipeline.var_templates.pipeline} {sample.file}

You can edit this to start your own interface.

First, think of a unique name for your pipeline and put it in pipeline_name. This will be used for messaging and identification.

Next, choose a pipeline_type, which can be either "sample" or "project". Most likely, you're writing a sample pipeline, but you can read more about sample and project pipelines if you like.

Next, set the path to the pipeline script in the pipeline entry under var_templates. This path is relative to the pipeline interface file, so the interface file must sit at a known location relative to the script, for example in the same folder or in a parent folder. Note: previous versions used the path variable instead of var_templates: pipeline:. However, path is slated for deprecation, so prefer var_templates in new interfaces.
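To make the path resolution concrete, here is a minimal Python sketch. It is not looper's actual code. It assumes that {looper.piface_dir} expands to the directory containing the pipeline interface file. The function name resolve_pipeline_path, the file locations, and the script name scripts/my_script.sh are all hypothetical.

```python
from pathlib import PurePosixPath


def resolve_pipeline_path(piface_file: str, template: str) -> str:
    """Expand {looper.piface_dir} to the folder holding the interface file."""
    piface_dir = PurePosixPath(piface_file).parent
    return template.replace("{looper.piface_dir}", str(piface_dir))


# Hypothetical locations, for illustration only
resolved = resolve_pipeline_path(
    "/home/user/project/pipeline_interface.yaml",
    "{looper.piface_dir}/scripts/my_script.sh",
)
print(resolved)
```

Because the script path is anchored to the interface file rather than to the working directory, the interface and the script can be moved together without editing the path.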

Finally, populate the command_template. You can use the full power of Jinja2 Python templates here, but most likely you'll only need a few variables in curly braces. In this case, we refer to the script with {pipeline.var_templates.pipeline}, which points directly to the pipeline variable defined above. Then, we use {sample.file} to refer to the file column in the sample table specified in the PEP. This pipeline thus takes a single positional command-line argument. You can make the command template much more complicated and refer to any sample or project attributes, as well as other variables that looper makes available.
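The following Python sketch illustrates how a command template of this kind turns into a shell command. It is a simplified stand-in that uses only the standard library, not looper's Jinja2-based rendering. The helper render_command, the sample values, and the script path are assumptions made for the example.

```python
import re


def render_command(template, namespaces):
    """Replace each {namespace.attr.subattr} placeholder with its value."""

    def lookup(match):
        value = namespaces
        for key in match.group(1).split("."):
            value = value[key]  # KeyError if the template names an unknown variable
        return str(value)

    return re.sub(r"\{([\w.]+)\}", lookup, template)


# Hypothetical values standing in for what looper would provide
namespaces = {
    "looper": {"piface_dir": "/home/user/project"},
    "sample": {"sample_name": "sample1", "file": "data/sample1.txt"},
}

# Resolve the var_templates entry first, then expose it to the command template
namespaces["pipeline"] = {
    "var_templates": {
        "pipeline": render_command(
            "{looper.piface_dir}/scripts/my_script.sh", namespaces
        )
    }
}

command = render_command(
    "{pipeline.var_templates.pipeline} {sample.file}", namespaces
)
print(command)
```

Each sample in the PEP would yield one such command, with the sample values swapped in for that row of the sample table.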

Now, you have a basic functional pipeline interface. There are many more advanced features you can use to make your pipeline more powerful, such as providing a schema to specify inputs or outputs, making input-size-dependent compute settings, and more. For complete details, consult the formal pipeline interface format specification.

Example Pipeline Interface Using Pipestat

pipeline_name: example_pipestat_pipeline
pipeline_type: sample
output_schema: pipestat_output_schema.yaml
command_template: >
  python {looper.piface_dir}/ {sample.file} {sample.sample_name} {pipestat.results_file}
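The command template above passes three positional arguments to the pipeline script: the input file, the sample name, and the pipestat results file. As a rough sketch of the receiving side, here is a hypothetical Python script that accepts those arguments in the same order. It does not call the pipestat library; a real pipeline would report its results through pipestat according to the output schema. The function name main and the demo file names are assumptions.

```python
import argparse
import tempfile
from pathlib import Path


def main(argv):
    """Parse the three positional arguments and count lines in the input file."""
    parser = argparse.ArgumentParser(description="Hypothetical line-count pipeline")
    parser.add_argument("file", help="value of {sample.file}")
    parser.add_argument("sample_name", help="value of {sample.sample_name}")
    parser.add_argument("results_file", help="value of {pipestat.results_file}")
    args = parser.parse_args(argv)

    with open(args.file) as handle:
        n_lines = sum(1 for _ in handle)

    # A real pipeline would report n_lines via pipestat here,
    # keyed by args.sample_name and written to args.results_file.
    return args.sample_name, n_lines, args.results_file


# Demo with a temporary three-line input file
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "sample1.txt"
    data.write_text("a\nb\nc\n")
    result = main([str(data), "sample1", str(Path(tmp) / "results.yaml")])

print(result[0], result[1])
```

In an actual script, main would be called with sys.argv[1:] so that it picks up the arguments from the command looper generates.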