
Example Wordcount Package

This is an example workflow package for astrocyte. It contains a workflow to count the occurrences of each word in a text file.

The Workflow

The workflow workflow/main.nf has four processes:

  • Convert all the text in the input files to uppercase
  • Split the text so each word is on a separate line
  • Remove all the empty lines generated from the split
  • Sort the words and count the occurrences of each unique word
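
The four steps above can be approximated with a plain shell pipeline. This is only a sketch for illustration: the function name wordcount is hypothetical, the commands are standard POSIX tools chosen here as an assumption, and the actual processes in workflow/main.nf may be implemented differently (for example, this sketch does not strip punctuation).

```shell
#!/bin/sh
# Hypothetical stand-in for the four workflow steps.
# This is NOT the code in workflow/main.nf; it only mirrors the listed steps.
wordcount() {
    tr '[:lower:]' '[:upper:]' |   # 1. convert all text to uppercase
    tr -s '[:space:]' '\n' |       # 2. put each word on its own line
    grep -v '^$' |                 # 3. remove empty lines left by the split
    sort | uniq -c                 # 4. sort, then count each unique word
}

# Example: read text from standard input and print "count WORD" lines.
printf 'the whale and the sea\n' | wordcount
```

To try it on a file, redirect the file into the function, for example `wordcount < mobydick-1.txt`.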

This example submits its calculation tasks through SLURM, so the run time depends on the current SLURM workload on BioHPC and may be slow when the cluster is busy.

Parameters

There is a single parameter, story, which provides one or more text files for the workflow to run on.

Example Usage

Check the workflow package and Astrocyte configurations.

module load astrocyte/2.0.0
astrocyte_cli check astrocyte_example_wordcount

Test run this workflow package with default parameters

astrocyte_cli test astrocyte_example_wordcount

Run this workflow package with default parameters

astrocyte_cli run astrocyte_example_wordcount

Run this workflow package with your own txt file (e.g. mobydick-1.txt)

astrocyte_cli run astrocyte_example_wordcount --param="story,mobydick-1.txt"

Prepare the containers and libraries for Vizapp.

astrocyte_cli viz-prepare astrocyte_example_wordcount

Run Vizapp

astrocyte_cli viz astrocyte_example_wordcount
# Go to http://127.0.0.1:8123 in your browser to access the vizapp
# Use `Ctrl-C` to terminate the R shiny server once finished

Containers

Astrocyte workflow containers

The workflow_containers section in astrocyte_pkg.yml defines the URLs of the containers used by this workflow. When astrocyte_cli runs the workflow, it pulls the images listed in this section and saves them to workflow/images/singularity/ for later use. The default configuration in workflow/configs/biohpc.config defines how Nextflow uses Singularity for the uppercase, toLines, cleanlines, and word count processes.

Vizapp containers

astrocyte_cli version 2.0.0 supports a containerized vizapp, which makes the vizapp more independent and able to run on multiple platforms. Astrocyte workflow developers do not need to change the vizapp_* settings in astrocyte_pkg.yml to use this feature. The viz-prepare command with the --with-container option pulls the right version of the R container from the BioHPC "Astrocyte Container Images" registry and saves it to the vizapp folder for later use. It also creates the folder vizapp/.rlibrary, where the dependency packages are installed.

Questions

If you have any questions about this workflow example, or about Astrocyte in general, please contact the BioHPC team at biohpc-help@utsouthwestern.edu