Ahmed Abbas / CNN-ChIPr
Commit 9296b7ca, authored 1 year ago by Ahmed Abbas
Update Readme.md
Parent: 39c15ec7
Showing 1 changed file: HiC_prediction/Readme.md (36 additions, 2 deletions)
@@ -8,8 +8,42 @@ https://cloud.biohpc.swmed.edu/index.php/s/yyp8DidNzzH4tgb
1. In a terminal, go to the downloaded folder `IMR90_data_CNN-ChIPr`
2. Compile the C++ source file `prepare_chipseq_data.cpp` by typing: `g++ prepare_chipseq_data.cpp -o imr90.out`
3. Run the program by typing: `./imr90.out`
4. It may take more than 1 day to finish
5. An example file showing how to compile and run the program is `run_cpp.sh`. You can adjust the file as necessary for your working environment
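
The compile-and-run steps above can also be driven from Python. The following is only a minimal sketch and is not the content of `run_cpp.sh`; the helper names `build_commands` and `compile_and_run` are hypothetical, and it assumes `g++` is on the `PATH` and that the source file sits directly inside the data folder.

```python
import subprocess
from pathlib import Path

# File names taken from the steps above.
SOURCE = "prepare_chipseq_data.cpp"
BINARY = "imr90.out"


def build_commands(source=SOURCE, binary=BINARY):
    """Return the compile and run commands as argument lists."""
    compile_cmd = ["g++", source, "-o", binary]
    run_cmd = [f"./{binary}"]
    return compile_cmd, run_cmd


def compile_and_run(data_dir):
    """Compile and run the program inside data_dir (hypothetical helper)."""
    data_dir = Path(data_dir)
    if not (data_dir / SOURCE).is_file():
        raise FileNotFoundError(f"{SOURCE} not found in {data_dir}")
    compile_cmd, run_cmd = build_commands()
    # check=True raises CalledProcessError if g++ or the program fails.
    subprocess.run(compile_cmd, cwd=data_dir, check=True)
    # The run step may take more than a day, as noted above.
    subprocess.run(run_cmd, cwd=data_dir, check=True)
```

Calling `compile_and_run("IMR90_data_CNN-ChIPr")` from the parent directory of the downloaded folder would perform steps 1 to 3 in one go.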
## Prepare the sequence files needed for training the model
1. The script used for this step is: `get_seq_ip_files.R`
2. To run the script, type in a terminal: `Rscript get_seq_ip_files.R`
3. It needs an active R environment with the `bedtoolsr` library installed
4. This step may take more than 1 day to finish
5. An example file showing how to run the script is `get_seq_files.sh`
## Prepare the CTCF orientation flags and TADs flags files
1. The script used for this step is: `get_CTCF_ORI_TADS_IMR90.R`
2. To run the script, type in a terminal: `Rscript get_CTCF_ORI_TADS_IMR90.R`
3. It needs an active R environment with the `bedtoolsr` library installed
4. An example file showing how to run the script is `get_ctcf_tads_ori_IMR90.sh`
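
Both R steps above depend on `Rscript` and the `bedtoolsr` library, so it can help to check for them before starting a run that may take more than a day. The sketch below is an assumption-laden illustration, not part of the repository: the helper names are hypothetical, and it assumes both scripts are run from the current directory in the order they are listed.

```python
import shutil
import subprocess

# Script names taken from the two R steps above, in the order listed.
R_SCRIPTS = ["get_seq_ip_files.R", "get_CTCF_ORI_TADS_IMR90.R"]

# R one-liner that exits with status 1 when bedtoolsr is not installed.
CHECK_EXPR = 'if (!requireNamespace("bedtoolsr", quietly = TRUE)) quit(status = 1)'


def check_command():
    """Command that tests whether bedtoolsr can be loaded."""
    return ["Rscript", "-e", CHECK_EXPR]


def script_commands(scripts=R_SCRIPTS):
    """One Rscript command per script, matching the commands in the README."""
    return [["Rscript", name] for name in scripts]


def run_r_steps():
    """Check the R prerequisites, then run both scripts (hypothetical helper)."""
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript was not found on PATH")
    if subprocess.run(check_command()).returncode != 0:
        raise RuntimeError("the R library bedtoolsr does not appear to be installed")
    for cmd in script_commands():
        # check=True stops at the first script that exits with an error.
        subprocess.run(cmd, check=True)
```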
## The files needed for training are now ready and stored
1. All the files needed for training the model are now ready and stored in the previously created folder: `Hi-C_data`
## Training the model
1. In the extracted folder `CNN-ChIPr-Data`, create a new folder and name it: `RAD21_model_GM12878`
2. The file used for training the model is: `train_for_GM12878_RAD21.py`
3. To train the model, you need an active Python environment with all the libraries imported at the top of `train_for_GM12878_RAD21.py` installed
4. To train the model, type: `python train_for_GM12878_RAD21.py`
5. Training may take several hours to finish. **I used a computer with 256 GB of memory in this step.**
6. After this step finishes, the trained model will be in: `RAD21_model_GM12878/RAD21_all_inputs_trained_GM12878.h5`
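
The folder creation in step 1 and the output check in step 6 can be sketched in Python as follows. This is only an illustration under assumptions: the helper names are hypothetical, and it assumes the model path in step 6 is relative to the `CNN-ChIPr-Data` folder.

```python
from pathlib import Path

# Names taken from the training steps above.
MODEL_DIR = "RAD21_model_GM12878"
MODEL_FILE = "RAD21_all_inputs_trained_GM12878.h5"


def prepare_model_dir(data_root):
    """Create the model folder inside data_root if it does not exist yet."""
    model_dir = Path(data_root) / MODEL_DIR
    model_dir.mkdir(parents=True, exist_ok=True)
    return model_dir


def trained_model_path(data_root):
    """Expected location of the trained model (assumed relative to data_root)."""
    return Path(data_root) / MODEL_DIR / MODEL_FILE


def training_finished(data_root):
    """True once the trained model file is present."""
    return trained_model_path(data_root).is_file()
```

For example, `prepare_model_dir("CNN-ChIPr-Data")` could be called before training, and `training_finished("CNN-ChIPr-Data")` afterwards to confirm that the `.h5` file was written.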
## Testing the model
1. In the downloaded data, the folder `RAD21_inputs_K562` contains data to test the model and get results for the K562 cell line
2. The K562 data was prepared in the same way as the GM12878 data in the previous steps
3. The file used for testing the model is: `get_results_K562_RAD21.py`
4. To run the script, type in a terminal: `python get_results_K562_RAD21.py`
5. It should print the Pearson correlation values between the predictions and the original K562 interactions for each chromosome.
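
For readers unfamiliar with the metric in step 5, here is a minimal pure-Python sketch of a per-chromosome Pearson correlation. It does not reproduce `get_results_K562_RAD21.py`, whose implementation is not shown here; the function names and the toy values are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def per_chromosome(predicted, observed):
    """Map each chromosome name to the correlation of its two value lists."""
    return {chrom: pearson(predicted[chrom], observed[chrom]) for chrom in predicted}


# Toy values only, not real K562 data.
toy_predicted = {"chr1": [1.0, 2.0, 3.0], "chr2": [1.0, 2.0, 3.0]}
toy_observed = {"chr1": [2.0, 4.0, 6.0], "chr2": [3.0, 2.0, 1.0]}
for chrom, r in per_chromosome(toy_predicted, toy_observed).items():
    print(chrom, round(r, 3))
```

A value near 1 means the predictions rise and fall with the original interactions, while a value near -1 means they move in opposite directions.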
## For questions, comments, or bug reporting
- Please contact ahmed.abbaselmahdi@utsouthwestern.edu