Running scGNN also requires an [installation of R](https://www.r-project.org/). Below are the packages required for running scGNN. (R version >= 4.1 is highly recommended.)
To get the protein predictions from sciPENN, run the notebook **./sciPENN/preprocessing_data.ipynb**. To save time, we randomly chose 8000 cells instead of using the entire dataset. The gene and protein expressions are stored in **./scGNN/sample_Data/pmbc/Gene/original_top_expression.csv** and **./scGNN/sample_Data/pmbc/Protein/original_top_expression.csv**, respectively.
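
The random selection of 8000 cells can be illustrated with a minimal sketch. It is not the code from the notebook: the function name `subsample_cells`, the fixed seed, and the synthetic cell IDs are assumptions made for illustration only.

```python
import random


def subsample_cells(cell_ids, n_cells=8000, seed=0):
    """Return a reproducible random subset of cell IDs, keeping the original order.

    `seed` is a hypothetical parameter; the notebook may sample differently.
    """
    if n_cells >= len(cell_ids):
        # Nothing to subsample: keep every cell.
        return list(cell_ids)
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(cell_ids)), n_cells))
    return [cell for i, cell in enumerate(cell_ids) if i in chosen]


# Synthetic cell IDs standing in for the real barcodes.
all_cells = [f"cell_{i}" for i in range(20000)]
subset = subsample_cells(all_cells)
print(len(subset))
```

Fixing the seed keeps the same cells across runs, so the gene and protein tables can be subset consistently.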
## Run LTMG
scGNN requires the input data to have corresponding LTMG results. To generate them, run the file **LTMG.R**.
**Note**: Because sciPENN predictions can contain negative values, some of the genes and cells in the protein prediction will be filtered out (this does not happen when using the original protein expression data). Therefore, we restricted the gene expression data to the same cells as the protein expression data.
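
Restricting both tables to the same cells amounts to intersecting their cell IDs. The sketch below is only an illustration: it assumes each table is a dictionary mapping a cell ID to a list of expression values, which is a hypothetical layout and not the format used by **LTMG.R**. The function name `restrict_to_shared_cells` is likewise made up.

```python
def restrict_to_shared_cells(gene_expr, protein_expr):
    """Keep only cells present in both tables, in the protein table's order.

    Both arguments map cell ID -> list of expression values
    (a hypothetical layout chosen for this example).
    """
    shared = [cell for cell in protein_expr if cell in gene_expr]
    gene_out = {cell: gene_expr[cell] for cell in shared}
    protein_out = {cell: protein_expr[cell] for cell in shared}
    return gene_out, protein_out


# Toy data: cell "c2" was filtered out of the protein prediction.
gene = {"c1": [0.0, 1.2], "c2": [3.1, 0.0], "c3": [0.5, 0.7]}
protein = {"c3": [2.0], "c1": [1.5]}

gene_kept, protein_kept = restrict_to_shared_cells(gene, protein)
print(list(gene_kept))
```

After this step both tables list the same cells in the same order, so rows can be paired directly.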
**Note**: Usually the argument "--load_dataset_1" is used to load gene-related data, and "--load_dataset_2" is used to load protein-related data. For additional arguments and details, visit [scGNN2.0 documentation](https://github.com/OSU-BMBL/scGNN2.0) and [scGNN documentation](https://github.com/juexinwang/scGNN).
## Methods Used

The major change we made compared to scGNN's original method is adding protein expression as an input. To merge the two inputs, we use Seurat after scGNN has computed the graph embeddings used for cell clustering. The two inputs are trained separately with scGNN and are merged only when predicting the cell clusters. The merging is done in the **Seurat.R** file.
## Result Comparison
To compare our new method to sciPENN, we used the correlation of the predicted protein expression as our metric. For more details, run the notebook **./sciPENN/comparsion.ipynb**.
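
As a rough illustration of a correlation-based metric, the sketch below computes a Pearson correlation for each protein between predicted and measured values across cells. Pearson is an assumption here, since the text above does not say which correlation the notebook uses. The function names and the toy values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def per_protein_correlation(predicted, measured):
    """Map each protein found in both tables to its predicted-vs-measured correlation.

    Both arguments map protein name -> list of values across the same cells
    (a hypothetical layout chosen for this example).
    """
    return {p: pearson(predicted[p], measured[p]) for p in predicted if p in measured}


# Toy values for two proteins across four cells.
predicted = {"CD3": [1.0, 2.0, 3.0, 4.0], "CD19": [4.0, 3.0, 2.0, 1.0]}
measured = {"CD3": [2.0, 4.0, 6.0, 8.0], "CD19": [1.0, 2.0, 3.0, 4.0]}
scores = per_protein_correlation(predicted, measured)
print(scores)
```

Averaging the per-protein values gives one summary number per method, which is one possible way to compare two predictors on the same cells.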
## Acknowledgements
- Ruanfeng Pei
- Yang Xie
- Guanghua Xiao
## Citations
- Lakkis, J., Schroeder, A., Su, K. et al. A multi-use deep learning method for CITE-seq and single-cell RNA-seq data integration with cell surface protein prediction and imputation. Nat Mach Intell 4, 940–952 (2022). https://doi.org/10.1038/s42256-022-00545-w
- Gu, H., Cheng, H., Ma, A. et al. scGNN 2.0: a graph neural network tool for imputation and clustering of single-cell RNA-Seq data. Bioinformatics 38(23), 5322–5325 (2022). https://doi.org/10.1093/bioinformatics/btac684
- Wang, J., Ma, A., Chang, Y. et al. Author Correction: scGNN is a novel graph neural network framework for single-cell RNA-Seq analyses. Nat Commun 13, 2554 (2022). https://doi.org/10.1038/s41467-022-30331-6