Commit 50fdc1f9 authored by Peng Lian

Reorg the slave repo as submodule

parent b35bc652
Tags test_0.0.2
1 merge request: !1 Add Submodule Support
Pipeline #11640 failed
[submodule "workflow/external_repo/astrocyte_example_git_source__slave"]
path = workflow/external_repo/astrocyte_example_git_source__slave
url = https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source__slave.git
branch = master
\ No newline at end of file
Copyright © 2016. The University of Texas Southwestern Medical Center
Copyright © 2022. The University of Texas Southwestern Medical Center
5323 Harry Hines Boulevard Dallas, Texas, 75390 Telephone 214-648-3111
# Example Wordcount Package
# Astrocyte Base Workflow
[![Build
Status](https://git.biohpc.swmed.edu/BioHPC/astrocyte_example_wordcount/badges/master/build.svg)](https://git.biohpc.swmed.edu/BioHPC/astrocyte_example_git_source/commits/master)
[![Astrocyte](https://img.shields.io/badge/astrocyte-%E2%89%A50.1.0-blue.svg)](https://astrocyte-test.biohpc.swmed.edu/static/docs/index.html)
This is a minimal example that uses Astrocyte as a runner to run an existing workflow in a given git repo.
# Astrocyte git based workflow
## Requirements
- The Astrocyte base workflow, [`astrocyte_example_git_source`](https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source). This repo wraps the existing workflow listed below so that it can run on the Astrocyte platforms.
- The existing workflow, [`astrocyte_example_git_source__slave`](https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source__slave). This repo contains an existing workflow written in bash.
This is an example workflow using nextflow as a single runner by cloning a given repo and running it as is.
## The existing workflow
There is only one bash script, `complicated_program.sh`, in the existing workflow. It captures arguments from the command line, displays them in the output, and performs a word count if the input file exists. The usage of this script is as follows.
## The Workflow
./complicated_program.sh [-p1 A] [-p2 B] [data.txt]
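The usage line above can be read as two optional flags followed by an optional positional file. The following is a minimal sketch of how such a script might parse them; it is not the actual content of `complicated_program.sh`, and the function names `parse_args` and `run_program` are hypothetical. A manual `case` loop is used because `getopts` does not handle multi-character flags such as `-p1`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real complicated_program.sh may differ.

# Parse "[-p1 A] [-p2 B] [data.txt]" into the variables P1, P2 and INPUT.
parse_args() {
    P1="" P2="" INPUT=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            # Consume the flag and its value (or only the flag if no value follows).
            -p1) P1="${2:-}"; shift $(( $# > 1 ? 2 : 1 )) ;;
            -p2) P2="${2:-}"; shift $(( $# > 1 ? 2 : 1 )) ;;
            # Anything else is treated as the positional input file.
            *)   INPUT="$1"; shift ;;
        esac
    done
}

# Display the arguments, then count words only when the input file exists.
run_program() {
    parse_args "$@"
    echo "p1=${P1} p2=${P2} input=${INPUT}"
    if [ -f "$INPUT" ]; then
        echo "words=$(wc -w < "$INPUT" | tr -d ' ')"
    fi
}
```

Calling `run_program -p1 A -p2 B data.txt` prints the parsed values first, and adds a `words=` line only if `data.txt` is present.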
The workflow `workflow/main.nf` has three processes:
## The Astrocyte base workflow
The base workflow interacts with the Astrocyte platform and the existing workflow. It contains some basic components that are required by the Astrocyte platform. It will be imported to the Astrocyte platform as an independent Astrocyte workflow after connecting to the existing workflow you have. So, please clone this repo, rename it, and edit the meta information in the `astrocyte_pkg.yml` file to reflect the function of your existing workflow.
- Convert all the text in the input files to uppercase
- Split the text so each word is on a separate line
- Sort, find and count the occurrences of unique words
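The three steps listed above map naturally onto a short shell pipeline. This is only an illustrative sketch using standard tools; the function name `wordcount` and the tab-separated output format are assumptions, not taken from `workflow/main.nf`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads text on stdin, prints "WORD<TAB>COUNT" per unique word.
wordcount() {
    tr '[:lower:]' '[:upper:]' |    # 1. convert all text to uppercase
    tr -cs '[:alnum:]' '\n' |       # 2. put each word on its own line
    sed '/^$/d' |                   #    drop the empty line left by leading punctuation
    sort | uniq -c |                # 3. sort, then count each unique word
    awk '{print $2 "\t" $1}'        #    normalize the padded output of uniq -c
}
```

For example, `wordcount < test_data/mobydick.txt` would list every distinct word in the test file with its count, sorted alphabetically.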
## Critical files in the base workflow
### `astrocyte_pkg.yml`
This file contains the meta information and the parameter definitions of the Astrocyte workflow.
Parameters defined in the `workflow_parameters` section are received from the Astrocyte website and then passed to the existing workflow. The current example defines only the four parameters shown below, but you can define as many as the existing workflow requires.
## Parameters
- `source_entrypoint`. This is the command in the existing workflow repo that needs to be executed.
- `parameter1`. This is an example parameter that needs to be passed to the above command.
- `parameter2`. This is an example parameter that needs to be passed to the above command.
- `parameter_nonopt`. This is an example non-option parameter that needs to be passed to the above command.
There is a single parameter `story`. This provides 1 or more files that the
workflow should run on.
### `workflow/main.nf`
The parameters defined in this file, such as `params.source_entrypoint`, `params.parameter1`, `params.parameter2`, and `params.parameter_nonopt`, work as default values for the Astrocyte workflow. They are overridden by the parameters defined in `astrocyte_pkg.yml` when the workflow runs on a local machine, or by the parameters received from the website when it runs on the Astrocyte website. If you define more parameters in `astrocyte_pkg.yml`, add them to this file as well.
It has two processes:
- `process parameters`. Shows the parameters it received.
- `process run_source`. Passes the parameters to the `source_entrypoint` command to run the existing workflow and writes the output and errors to the `workflow/output/` folder.
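The core of `run_source` is a single command line that calls the entrypoint inside the submodule folder and redirects its two output streams. Below is a rough shell sketch of that step, with the output file names taken from `main.nf`. The function name `run_entrypoint` and its argument order are hypothetical. Unlike the `eval` line in `main.nf`, this sketch quotes each argument, which is a deliberate variation so that values containing spaces are passed through unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the command that the run_source process executes.
# Arguments: submodule folder, entrypoint script, parameter1, parameter2,
# and the optional non-option parameter (for example, an input file).
run_entrypoint() {
    repo_dir="$1"; entrypoint="$2"; p1="$3"; p2="$4"; nonopt="${5:-}"
    # ${nonopt:+...} drops the last argument entirely when it is empty.
    "${repo_dir}/${entrypoint}" -p1 "$p1" -p2 "$p2" ${nonopt:+"$nonopt"} \
        1> astrocyte_run.out 2> astrocyte_run.err
}
```

After the call, `astrocyte_run.out` holds the standard output and `astrocyte_run.err` holds the standard error of the existing workflow, and the function returns the entrypoint's exit status.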
### `.gitmodules` and `workflow/external_repo/`
These are the places where the magic happens. Once `.gitmodules` is configured correctly, the existing repo is cloned and saved to `workflow/external_repo/` as a submodule. Below is an example configuration. It defines the submodule name, its path in this repo, the URL of the existing workflow, and the branch to use.
[submodule "workflow/external_repo/astrocyte_example_git_source__slave"]
path = workflow/external_repo/astrocyte_example_git_source__slave
url = https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source__slave.git
branch = master
Run the following commands in the base repo to update the submodule and check its status every time you change the remote repo or the `.gitmodules` file.
# update changes from the remote repo to the submodule
git submodule update --remote
# check the status of the submodule
git submodule status
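Before updating, it can help to confirm what `.gitmodules` actually declares. The sketch below reads a single setting with `git config -f`, which parses any file written in git's config syntax. The helper name `submodule_setting` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one setting (path, url or branch) of a submodule.
# Arguments: path to the .gitmodules file, submodule name, setting key.
submodule_setting() {
    gitmodules_file="$1"; name="$2"; key="$3"
    git config -f "$gitmodules_file" --get "submodule.${name}.${key}"
}

# Example (run from the root of the base repo):
#   submodule_setting .gitmodules \
#       workflow/external_repo/astrocyte_example_git_source__slave branch
#
# On a fresh clone of the base repo, the submodule folder starts out empty;
# it is usually populated with:
#   git submodule update --init --recursive
```

If the key is not set for that submodule, `git config --get` prints nothing and returns a non-zero status, so the helper can also be used in an `if` test.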
### `docs/index.md`
The documentation file of your workflow. The content of this file is displayed on the website once the workflow is imported to the Astrocyte platforms.
### `test_data/`
This folder stores test data only. When you run `astrocyte_cli test YOUR_WORKFLOW_FOLDER`, it checks this folder for the test data defined in `main.nf` or `astrocyte_pkg.yml`.
### `vizapp/`
This folder contains the downstream visualization files for the Astrocyte workflow. Currently, only R-shiny is supported.
## Publish your workflow to Astrocyte platform
- Check the submodule status and make sure it is up to date.
`git submodule status`
- Check your Astrocyte workflow with the astrocyte_cli.
`astrocyte_cli check YOUR_WORKFLOW_FOLDER`
- Test your Astrocyte workflow with the astrocyte_cli.
`astrocyte_cli test YOUR_WORKFLOW_FOLDER`
- Commit and push your changes into your GitLab repo.
- Create a publish_x.x.x/test_x.x.x tag of your repo from the GitLab page (x.x.x is the version number you defined).
- Contact the BioHPC team at biohpc-help@utsouthwestern.edu.
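The local checks in the list above can be chained so that a failure stops the sequence early. The sketch below is one possible wrapper, not part of the repo: the function names are hypothetical, and the tag check assumes that `x.x.x` means three dot-separated integers, as in the `test_0.0.2` tag of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical pre-publish helpers.

# Succeeds only for tags shaped like publish_1.2.3 or test_0.0.2.
# Assumption: x.x.x is three dot-separated integers.
is_release_tag() {
    printf '%s\n' "$1" | grep -Eq '^(publish|test)_[0-9]+\.[0-9]+\.[0-9]+$'
}

# Run the submodule status, check and test steps in order,
# stopping at the first command that fails.
prepublish_check() {
    folder="$1"
    git submodule status &&
    astrocyte_cli check "$folder" &&
    astrocyte_cli test "$folder"
}
```

A typical call would be `prepublish_check YOUR_WORKFLOW_FOLDER && is_release_tag test_0.0.2`, followed by committing, pushing, and creating the tag in GitLab.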
#
# metadata for the example astrocyte ChipSeq workflow package
# metadata for the example astrocyte git source workflow package
#
# -----------------------------------------------------------------------------
@@ -9,14 +9,14 @@
# A unique identifier for the workflow package, text/underscores only
name: 'example_git_source'
# Who wrote this?
author: 'Erand Smakaj'
author: 'Peng Lian, Erand Smakaj'
# A contact email address for questions
email: 'biohpc-help@utsouthwestern.edu'
# A more informative title for the workflow package
title: 'Example git Based Workflow'
# A summary of the workflow package in plain text
description: |
This is a minimal example that takes as parameter a git url, uses nextflow to clone and run it as is.
This is a minimal example that uses Astrocyte as a runner to run an existing workflow in a given git repo.
#### New Features in Astrocyte 0.4.0 and above ####
citation: |
@@ -97,26 +97,32 @@ workflow_modules:
workflow_parameters:
- id: source_git_repo
- id: source_entrypoint
type: string
required: true
default: "https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source__slave.git"
default: "complicated_program.sh"
description: |
The URL to the repo you wish to run via astrocyte. Must be a public repo
The command in the other repo to execute.
- id: source_git_branch
- id: parameter1
type: string
required: true
default: "astrocyte_target"
required: false
default: "A"
description: |
The target branch or tag to checkout from the repo.
A parameter that needs to be passed to the above command
- id: source_entrypoint
- id: parameter2
type: string
required: true
default: "sh complicated_program.sh"
required: false
default: "B"
description: |
A parameter that needs to be passed to the above command
- id: parameter_nonopt
type: string
required: false
description: |
The command to execute after cloning the repo.
A non-option parameter that needs to be passed to the above command
# -----------------------------------------------------------------------------
# SHINY APP CONFIGURATION
# Astrocyte git based workflow
# Astrocyte Base Workflow
This is an example workflow using nextflow as a single runner by cloning a given repo and running it as is.
This is a minimal example that uses Astrocyte as a runner to run an existing workflow in a given git repo.
## The Workflow
## Requirements
- The Astrocyte base workflow, [`astrocyte_example_git_source`](https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source). This repo wraps the existing workflow listed below so that it can run on the Astrocyte platforms.
- The existing workflow, [`astrocyte_example_git_source__slave`](https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source__slave). This repo contains an existing workflow written in bash.
The workflow `workflow/main.nf` has three processes:
## The existing workflow
There is only one bash script, `complicated_program.sh`, in the existing workflow. It captures arguments from the command line, displays them in the output, and performs a word count if the input file exists. The usage of this script is as follows.
- Convert all the text in the input files to uppercase
- Split the text so each word is on a separate line
- Sort, find and count the occurrences of unique words
./complicated_program.sh [-p1 A] [-p2 B] [data.txt]
## Parameters
## The Astrocyte base workflow
The base workflow interacts with the Astrocyte platform and the existing workflow. It contains some basic components that are required by the Astrocyte platform. It will be imported to the Astrocyte platform as an independent Astrocyte workflow after connecting to the existing workflow you have. So, please clone this repo, rename it, and edit the meta information in the `astrocyte_pkg.yml` file to reflect the function of your existing workflow.
There is a single parameter `story`. This provides 1 or more files that the
workflow should run on.
## Critical files in the base workflow
### `astrocyte_pkg.yml`
This file contains the meta information and the parameter definitions of the Astrocyte workflow.
Parameters defined in the `workflow_parameters` section are received from the Astrocyte website and then passed to the existing workflow. The current example defines only the four parameters shown below, but you can define as many as the existing workflow requires.
- `source_entrypoint`. This is the command in the existing workflow repo that needs to be executed.
- `parameter1`. This is an example parameter that needs to be passed to the above command.
- `parameter2`. This is an example parameter that needs to be passed to the above command.
- `parameter_nonopt`. This is an example non-option parameter that needs to be passed to the above command.
### `workflow/main.nf`
The parameters defined in this file, such as `params.source_entrypoint`, `params.parameter1`, `params.parameter2`, and `params.parameter_nonopt`, work as default values for the Astrocyte workflow. They are overridden by the parameters defined in `astrocyte_pkg.yml` when the workflow runs on a local machine, or by the parameters received from the website when it runs on the Astrocyte website. If you define more parameters in `astrocyte_pkg.yml`, add them to this file as well.
It has two processes:
- `process parameters`. Shows the parameters it received.
- `process run_source`. Passes the parameters to the `source_entrypoint` command to run the existing workflow and writes the output and errors to the `workflow/output/` folder.
### `.gitmodules` and `workflow/external_repo/`
These are the places where the magic happens. Once `.gitmodules` is configured correctly, the existing repo is cloned and saved to `workflow/external_repo/` as a submodule. Below is an example configuration. It defines the submodule name, its path in this repo, the URL of the existing workflow, and the branch to use.
[submodule "workflow/external_repo/astrocyte_example_git_source__slave"]
path = workflow/external_repo/astrocyte_example_git_source__slave
url = https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source__slave.git
branch = master
Run the following commands in the base repo to update the submodule and check its status every time you change the remote repo or the `.gitmodules` file.
# update changes from the remote repo to the submodule
git submodule update --remote
# check the status of the submodule
git submodule status
### `docs/index.md`
The documentation file of your workflow. The content of this file is displayed on the website once the workflow is imported to the Astrocyte platforms.
### `test_data/`
This folder stores test data only. When you run `astrocyte_cli test YOUR_WORKFLOW_FOLDER`, it checks this folder for the test data defined in `main.nf` or `astrocyte_pkg.yml`.
### `vizapp/`
This folder contains the downstream visualization files for the Astrocyte workflow. Currently, only R-shiny is supported.
## Publish your workflow to Astrocyte platform
- Check the submodule status and make sure it is up to date.
`git submodule status`
- Check your Astrocyte workflow with the astrocyte_cli.
`astrocyte_cli check YOUR_WORKFLOW_FOLDER`
- Test your Astrocyte workflow with the astrocyte_cli.
`astrocyte_cli test YOUR_WORKFLOW_FOLDER`
- Commit and push your changes into your GitLab repo.
- Create a publish_x.x.x/test_x.x.x tag of your repo from the GitLab page (x.x.x is the version number you defined).
- Contact the BioHPC team at biohpc-help@utsouthwestern.edu.
singularity {
enabled = true
runOptions = '--disable-cache' // use this one for production
// runOptions = '--disable-cache --bind /vagrant:/vagrant' // use this one for vagrant development env only
cacheDir = "$baseDir/images/singularity"
}
process {
executor = 'local'
withName:parameters {
container = 'docker://ubuntu:latest'
}
withName:uppercase {
container = 'docker://ubuntu:latest'
}
withName:tolines {
container = 'docker://ubuntu:latest'
}
withName:wordcounts {
container = 'docker://centos:centos8'
}
executor = 'slurm'
queue = 'super'
}
Subproject commit b9d4ab49a3e88ea0071e71a1dd5d55c9b770c493
/*
* Copyright (c) 2021. The University of Texas Southwestern Medical Center
* Copyright (c) 2022. The University of Texas Southwestern Medical Center
*
* This is a minimal test workflow package that uses another git repo to run as is.
*
* @authors
* Erand Smakaj
* Peng Lian, Erand Smakaj
*
*/
// This workflow needs at least these 3 parameters
// the source git that will be cloned
params.source_git_repo = "https://git.biohpc.swmed.edu/biohpc/astrocyte_example_git_source__slave.git"
params.source_git_branch = "astrocyte_target"
params.source_entrypoint = "sh complicated_program.sh"
// other parameters may be added by author of source_git_repo
params.other_param_a = "hello world"
params.other_param_b = "foo"
params.other_param_c = "bar=3"
// The executable script in the other repo
params.source_entrypoint = "complicated_program.sh"
// Parameters may be passed to the executable script
params.parameter1 = "A"
params.parameter2 = "B"
params.parameter_nonopt = "$baseDir/../test_data/mobydick.txt"
process parameters {
@@ -32,29 +28,27 @@ process parameters {
echo ""
echo "Test Parameters Provided..."
echo "Source git repo: ${params.source_git_repo}"
echo "Source git branch: ${params.source_git_branch}"
echo "Entrypoint command: ${params.source_entrypoint}"
echo "Parameter 1: ${params.parameter1}"
echo "Parameter 2: ${params.parameter2}"
echo "Parameter non-option: ${params.parameter_nonopt}"
"""
}
process clone_and_run_source {
process run_source {
// Publish the outputs we create here into the workflow output directory
publishDir "$baseDir/source_git/output", mode: 'copy'
publishDir "$baseDir/output", mode: 'copy'
executor 'slurm'
queue 'super'
cpus 1
"""
# clone the repo
git clone ${params.source_git_repo} ./source_git
cd ./source_git
git checkout ${params.source_git_branch}
# transfer vizapp files
rm -rf $baseDir/../vizapp
cp -R vizapp $baseDir/../
output:
file "astrocyte_run.out"
file "astrocyte_run.err"
# run the program
eval ${params.source_entrypoint} 1> $baseDir/output/slave_run.out 2> $baseDir/output/slave_run.err
"""
# run the program with received parameters
eval $baseDir/../workflow/external_repo/astrocyte_example_git_source__slave/${params.source_entrypoint} -p1 ${params.parameter1} -p2 ${params.parameter2} ${params.parameter_nonopt} 1> astrocyte_run.out 2> astrocyte_run.err
"""
}
\ No newline at end of file