process_align
Purpose: Align trimmed reads to the reference genome
Input:
- Experiment.csv
  - Species
- Experiment Settings.csv
  - Paired_End (boolean)
  - Has_Strand_Specific_Information (boolean)
  - Used_Spike_Ins (boolean)
- Trimmed *.fastq.gz
Reference:
- Human = GRCh38p12
- Mouse = GRCm38p6
- Spike-in = ERCC92 (if spike-ins used)
Outputs for later processes:
- *.bam (sorted)
- *.bai
- *.alignment.summary.txt
Process:
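- HISAT2: Single End
  - -x: Reference Index
  - -U: fastq file
  - --rna-strandness: R or F (omit the option for unstranded libraries)
- HISAT2: Paired End
  - -x: Reference Index
  - -1: fastq read 1 file
  - -2: fastq read 2 file
  - --rna-strandness: RF or FR (omit the option for unstranded libraries)
  - --no-mixed: Only align paired reads (no alignments for individual mates)
  - --no-discordant: Only align properly paired reads (no discordant alignments)
- Samtools: sam to bam
  - samtools view -bS -F 4 -F 8 -F 256
  - -F 4: exclude unmapped reads
  - -F 8: exclude reads whose mate is unmapped
  - -F 256: exclude secondary alignments
- Samtools: index
  - samtools index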
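
The options above can be assembled into command lines as in the following minimal Python sketch. It only builds argument lists and does not run HISAT2 or Samtools. The function names, the file names, and the use of `-o` to name the BAM output are assumptions made for illustration; they are not taken from the pipeline itself.

```python
import shlex
from typing import List, Optional

# SAM flag bits excluded in the "sam to bam" step.
READ_UNMAPPED = 4    # the read itself is unmapped
MATE_UNMAPPED = 8    # the read's mate is unmapped
SECONDARY = 256      # secondary alignment
EXCLUDE_MASK = READ_UNMAPPED | MATE_UNMAPPED | SECONDARY  # combined mask: 268


def hisat2_args(index: str, reads: List[str], paired_end: bool,
                strandness: Optional[str] = None) -> List[str]:
    """Build a HISAT2 argument list (hypothetical helper).

    strandness: "F"/"R" for single end, "FR"/"RF" for paired end,
    or None to omit --rna-strandness for unstranded libraries.
    """
    args = ["hisat2", "-x", index]
    if paired_end:
        read1, read2 = reads  # expects exactly two fastq files
        args += ["-1", read1, "-2", read2, "--no-mixed", "--no-discordant"]
    else:
        (read,) = reads       # expects exactly one fastq file
        args += ["-U", read]
    if strandness is not None:
        args += ["--rna-strandness", strandness]
    return args


def samtools_view_args(sam: str, bam: str) -> List[str]:
    """SAM to BAM, dropping unmapped, mate-unmapped and secondary records."""
    # "-o" for the output file is an assumption; the flags match the list above.
    return ["samtools", "view", "-bS", "-F", "4", "-F", "8", "-F", "256",
            "-o", bam, sam]


def samtools_index_args(bam: str) -> List[str]:
    """Index a BAM file (writes the *.bai next to it)."""
    return ["samtools", "index", bam]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in (
        hisat2_args("ref_index", ["s_R1.fastq.gz", "s_R2.fastq.gz"], True, "RF"),
        samtools_view_args("s.sam", "s.bam"),
        samtools_index_args("s.bam"),
    ):
        print(shlex.join(cmd))
```

The three `-F` values are separate bits of the SAM flag field, so their union is 268; a single `-F 268` describes the same filter.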
Tools: HISAT2, Samtools