Add Submodule Support

Merged Peng Lian requested to merge submodule into master
13 files changed: +18063, -248
Changes in this file: +70, -10
# Astrocyte git based workflow
# Astrocyte Base Workflow
This is an example workflow using nextflow as a single runner by cloning a given repo and running it as is.
This is a minimal example that uses Astrocyte as a runner to run an existing workflow in a given git repo.
## The Workflow
## Requirements
- The Astrocyte base workflow, [`astrocyte_example_git_source`](https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source). This repo wraps the existing workflow listed below so that it can run on the Astrocyte platform.
- The existing workflow, [`astrocyte_example_git_source__slave`](https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source__slave). This repo contains an existing workflow written in bash.
The workflow `workflow/main.nf` has three processes:
## The existing workflow
The existing workflow contains a single bash script, `complicated_program.sh`. It captures arguments from the command line, displays them in the output, and performs a word count if the input file exists. The usage of this script is shown below.
- Convert all the text in the input files to uppercase
- Split the text so each word is on a separate line
- Sort, find and count the occurrences of unique words
./complicated_program.sh [-p1 A] [-p2 B] [data.txt]
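The behaviour described for this script can be sketched as a shell function. This is a hypothetical stand-in, not the actual `complicated_program.sh`: the function name `complicated_program_sketch` and the output labels (`p1:`, `p2:`, `input:`, `words:`) are assumptions made for illustration.

```shell
# Hypothetical sketch of a script with the interface
#   ./complicated_program.sh [-p1 A] [-p2 B] [data.txt]
# The real script may parse and report its arguments differently.
complicated_program_sketch() {
    local p1="" p2="" input=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -p1|-p2)
                # Each option takes exactly one value.
                if [ "$#" -lt 2 ]; then
                    echo "missing value for $1" >&2
                    return 2
                fi
                if [ "$1" = "-p1" ]; then p1="$2"; else p2="$2"; fi
                shift 2
                ;;
            *)
                # Anything else is treated as the input file.
                input="$1"
                shift
                ;;
        esac
    done

    # Display the captured arguments.
    echo "p1: $p1"
    echo "p2: $p2"
    echo "input: $input"

    # Count words only when the input file exists.
    if [ -n "$input" ] && [ -f "$input" ]; then
        echo "words: $(wc -w < "$input" | tr -d '[:space:]')"
    fi
}
```

Calling `complicated_program_sketch -p1 A -p2 B data.txt` would print the three captured values and, if `data.txt` exists, its word count.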
## Parameters
## The Astrocyte base workflow
The base workflow sits between the Astrocyte platform and the existing workflow, and contains the basic components that the platform requires. Once it is connected to your existing workflow, it is imported into the Astrocyte platform as an independent Astrocyte workflow. To adapt it, clone this repo, rename it, and edit the metadata in the `astrocyte_pkg.yml` file to reflect what your existing workflow does.
There is a single parameter `story`. This provides 1 or more files that the
workflow should run on.
## Critical files in the base workflow
### `astrocyte_pkg.yml`
This file contains the metadata and the parameter definitions of the Astrocyte workflow.
Parameters defined in the `workflow_parameters` section are received from the Astrocyte website and then passed to the existing workflow. The current example defines only the four parameters listed below, but you can define as many as the existing workflow requires.
- `source_entrypoint`. This is the command in the existing workflow repo that needs to be executed.
- `parameter1`. This is an example parameter that is passed to the above command.
- `parameter2`. This is an example parameter that is passed to the above command.
- `parameter_nonopt`. This is an example non-option parameter that is passed to the above command.
### `workflow/main.nf`
The parameters defined in this file, such as `params.source_entrypoint`, `params.parameter1`, `params.parameter2`, and `params.parameter_nonopt`, serve as default values for the Astrocyte workflow. They are overridden by the parameters defined in `astrocyte_pkg.yml` when the workflow runs on a local machine, or by the parameters received from the website when it runs on the Astrocyte website. If you define more parameters in `astrocyte_pkg.yml`, add them to this file as well.
It has two processes:
- `process parameters`. Shows the parameters it received.
- `process run_source`. Passes the parameters to the `source_entrypoint` command to run the existing workflow, and writes the output and errors to the `workflow/output/` folder.
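What `run_source` does can be pictured as the shell function below. It is a sketch under assumptions, not the contents of `workflow/main.nf`: the function name `run_source_sketch`, the mapping of the parameters onto `-p1` and `-p2` (borrowed from the example script's usage), and the `stdout.log` and `stderr.log` file names are all hypothetical.

```shell
# Hypothetical sketch: run the entrypoint with the received parameters
# and keep its output and errors in separate files under the output folder.
run_source_sketch() {
    local entrypoint="$1" p1="$2" p2="$3" nonopt="$4" outdir="$5"
    mkdir -p "$outdir"
    # stdout and stderr are captured separately (file names are assumptions).
    "$entrypoint" -p1 "$p1" -p2 "$p2" "$nonopt" \
        > "$outdir/stdout.log" 2> "$outdir/stderr.log"
}
```

The function returns the exit status of the entrypoint, so a caller can tell whether the existing workflow succeeded.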
### `.gitmodules` and `workflow/external_repo/`
This is where the existing workflow is connected. Once `.gitmodules` is configured correctly, the existing repo is cloned into `workflow/external_repo/` as a submodule. Below is an example configuration that defines the submodule name, its path in this repo, the URL of the existing workflow, and the branch to use.

    [submodule "workflow/external_repo/astrocyte_example_git_source__slave"]
        path = workflow/external_repo/astrocyte_example_git_source__slave
        url = https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source__slave.git
        branch = master

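One way to double-check these settings is to read them back with `git config`, which can parse `.gitmodules` through its `-f` option. The helper name `show_submodule_settings` is hypothetical; it prints the path, URL, and branch recorded for one submodule.

```shell
# Hypothetical helper: print the settings stored for one submodule.
# $1 is the submodule name, $2 the config file (defaults to .gitmodules).
show_submodule_settings() {
    local name="$1" file="${2:-.gitmodules}" key
    for key in path url branch; do
        # A key that is missing from the file is reported as "(unset)".
        printf '%s = %s\n' "$key" \
            "$(git config -f "$file" --get "submodule.$name.$key" || echo '(unset)')"
    done
}
```

For the configuration above, this would be called from the base repo as `show_submodule_settings workflow/external_repo/astrocyte_example_git_source__slave`.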
Run the following commands in the base repo to update the submodule and check its status whenever the remote repo or the `.gitmodules` file changes.

    # update changes from the remote repo to the submodule
    git submodule update --remote
    # check the status of the submodule
    git submodule status

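On a fresh clone made without `--recurse-submodules`, the submodule directory stays empty until it is initialised, and after the URL in `.gitmodules` is edited the local configuration may also need `git submodule sync`. The hypothetical helper `refresh_submodules` strings these standard git commands together with the update and status commands; treat it as one possible routine rather than a required procedure.

```shell
# Hypothetical helper: bring all submodules of the current repo up to date.
refresh_submodules() {
    # Refuse to run outside a git working tree.
    if ! git rev-parse --is-inside-work-tree > /dev/null 2>&1; then
        echo "not inside a git working tree" >&2
        return 1
    fi
    # Copy URL changes from .gitmodules into the local git config.
    git submodule sync --recursive || return 1
    # Initialise missing submodules and move each to the tip of its tracked branch.
    git submodule update --init --remote || return 1
    # Show the commit each submodule is now checked out at.
    git submodule status
}
```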
### `docs/index.md`
This is the documentation file for your workflow. Its content is displayed on the website once the workflow is imported into the Astrocyte platform.
### `test_data/`
This folder holds test data only. When you run `astrocyte_cli test YOUR_WORKFLOW_FOLDER`, the CLI looks in this folder for the test data defined in `main.nf` or `astrocyte_pkg.yml`.
### `vizapp/`
This folder contains the downstream visualization files for the Astrocyte workflow. Currently, only R-shiny is supported.
## Publish your workflow to the Astrocyte platform
- Check the submodule status and make sure it is up to date.
  `git submodule status`
- Check your Astrocyte workflow with `astrocyte_cli`.
  `astrocyte_cli check YOUR_WORKFLOW_FOLDER`
- Test your Astrocyte workflow with `astrocyte_cli`.
  `astrocyte_cli test YOUR_WORKFLOW_FOLDER`
- Commit and push your changes to your GitLab repo.
- Create a `publish_x.x.x` or `test_x.x.x` tag for your repo from the GitLab page, where `x.x.x` is the version number you defined.
- Contact the BioHPC team at biohpc-help@utsouthwestern.edu.
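The first three steps can be chained so that a failure stops the sequence early. The sketch below assumes `astrocyte_cli` is on the `PATH` and exits with a non-zero status on failure, which this document does not confirm. `prepublish_checks` and `is_release_tag` are hypothetical helper names, and the tag pattern assumes that `x.x.x` means three dot-separated numbers.

```shell
# Hypothetical helper: run the local checks before committing and tagging.
prepublish_checks() {
    local workflow_dir="$1"
    # Prints the recorded commit of each submodule; a leading '+' or '-'
    # marks one that is out of sync or not yet initialised.
    git -C "$workflow_dir" submodule status || return 1
    astrocyte_cli check "$workflow_dir" || return 1
    astrocyte_cli test "$workflow_dir" || return 1
    echo "checks passed for $workflow_dir"
}

# Hypothetical helper: accept names such as publish_1.0.0 or test_0.2.1.
is_release_tag() {
    printf '%s\n' "$1" | grep -Eq '^(publish|test)_[0-9]+\.[0-9]+\.[0-9]+$'
}
```

Committing, pushing, and creating the tag from the GitLab page remain manual steps after `prepublish_checks YOUR_WORKFLOW_FOLDER` succeeds.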