Yin Xi committed
# Radiomics Pipeline

This is a set of Python and R routines for converting images and masks into radiomic features and performing a simple classification task.

There are three independent compartments in this repository:

1. Reading and storing ROIs
   - Input: a folder containing DICOM data.
   - Output: (1) a folder of images in NRRD format, (2) a folder of ROIs in NRRD format, and (3) a CSV meta file in which each row represents a unique ROI with its corresponding image.
2. Radiomics extraction
   - Input: a CSV meta file in which each row represents a unique ROI with its corresponding image (a parameter file for the extraction configuration may also be needed).
   - Output: a CSV dataset with the patient ID followed by the extracted radiomic features in subsequent columns.
3. Analysis
   - Input: a vector of the binary dependent variable and a matrix of independent variables (the extracted radiomic features).
   - Output: a list of two objects: (1) the cross-validated predicted probabilities and (2) the selected features in descending order of selection percentage.

These compartments are designed to run independently, so if NRRD files are already available, step 1 can be skipped. Compartment 3 can also be applied to any binary classification problem, not only radiomics.

Details:
1. Reading and storing ROIs

There are currently two ways of drawing ROIs that can be used for radiomics: (1) saving ROIs into the DICOM header via the pyOsirix plug-in, and (2) MINT (limited access). A separate Python routine was developed for each of these cases. The end products are the same: a folder of images (.NRRD), a folder of masks (.NRRD), and a CSV list of paths pairing each mask with its corresponding image.
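
As an illustration of that end product, the following is a minimal sketch of how a meta CSV could be assembled from an image folder and a mask folder. It is not the repository's routine: the function name `build_meta_csv`, the column names (`ID`, `ROI`, `Image`, `Mask`), the lowercase `.nrrd` extension, and the mask naming rule `<image stem>_<roi label>.nrrd` are all assumptions made for this example.

```python
import csv
from pathlib import Path


def build_meta_csv(image_dir, mask_dir, out_csv):
    """Pair each mask with its image and write one CSV row per ROI.

    Assumption (hypothetical naming rule, not taken from the repository):
    a mask file is named "<image stem>_<roi label>.nrrd".
    """
    # Index the images by file stem, e.g. "pt01" -> Path(".../pt01.nrrd").
    images = {p.stem: p for p in Path(image_dir).glob("*.nrrd")}

    rows = []
    for mask in sorted(Path(mask_dir).glob("*.nrrd")):
        # Split "pt01_lesion1" into ("pt01", "lesion1") at the last underscore.
        image_stem, _, roi_label = mask.stem.rpartition("_")
        image = images.get(image_stem)
        if image is None:
            continue  # skip masks that have no matching image
        rows.append(
            {"ID": image_stem, "ROI": roi_label, "Image": str(image), "Mask": str(mask)}
        )

    with open(out_csv, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["ID", "ROI", "Image", "Mask"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `build_meta_csv("images", "masks", "meta.csv")` would write one row per mask whose image was found; masks without a matching image are silently skipped in this sketch, which may or may not be the desired behavior.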

For extracting ROIs from the DICOM header, both classic and enhanced DICOM formats were considered. The code is well tested for classic DICOM but not extensively tested for enhanced DICOM. For DWI, where multiple b-values are in the same series, ROIs are ordered the same way as the images.

2. Radiomics extraction

This is a thin wrapper for feature extraction via PyRadiomics. A parameter file may be needed.
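
A wrapper of this kind might look roughly like the sketch below. It is an illustration under assumptions, not the repository's code: the function names `pyradiomics_extract` and `extract_to_csv` and the meta CSV column names (`ID`, `Image`, `Mask`) are hypothetical. The PyRadiomics calls used are `featureextractor.RadiomicsFeatureExtractor` and its `execute` method; the library is imported lazily so the CSV logic can be tried without it installed.

```python
import csv


def pyradiomics_extract(image_path, mask_path, params_file=None):
    """Extract features for one ROI with PyRadiomics (imported lazily)."""
    from radiomics import featureextractor  # requires the pyradiomics package

    if params_file:
        extractor = featureextractor.RadiomicsFeatureExtractor(params_file)
    else:
        extractor = featureextractor.RadiomicsFeatureExtractor()
    result = extractor.execute(image_path, mask_path)
    # Drop the "diagnostics_*" entries so that only feature values remain.
    return {k: v for k, v in result.items() if not k.startswith("diagnostics_")}


def extract_to_csv(meta_csv, out_csv, extract=pyradiomics_extract):
    """Run `extract` on every row of the meta CSV and write a feature table.

    Assumed (hypothetical) meta CSV columns: "ID", "Image", "Mask".
    """
    with open(meta_csv, newline="") as fh:
        meta = list(csv.DictReader(fh))

    records = []
    for row in meta:
        features = extract(row["Image"], row["Mask"])
        # Patient ID first, extracted features in the subsequent columns.
        records.append({"ID": row["ID"], **features})

    fieldnames = list(records[0]) if records else ["ID"]
    with open(out_csv, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return records
```

To use a parameter file with this sketch, the extractor can be bound beforehand, for example `extract_to_csv("meta.csv", "features.csv", extract=functools.partial(pyradiomics_extract, params_file="params.yaml"))`, where `params.yaml` is a placeholder name.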

3. Analysis

It has two modules: one for ROC analysis using LASSO logistic regression, and a descriptive one using heatmaps and volcano plots.

LASSO logistic regression: This is a wrapper around the cv.glmnet function from the glmnet package in R for repeated nested cross-validation. It currently supports only a binary dependent variable, i.e. logistic regression. Variable selection is done via LASSO (glmnet package). The LASSO hyperparameter, lambda, is selected by maximizing the cross-validated AUC (with a small sample size and/or a small number of events, the deviance metric may be used instead). Example code for subsequent analyses includes drawing the ROC curve, calculating the AUC, and displaying the top selected features.
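
The two outputs of this module (cross-validated predicted probabilities and per-feature selection percentages) lend themselves to simple post-processing. The sketch below is written in Python for illustration only and is not the repository's R code: `auc` computes the area under the ROC curve with the rank-sum (Mann-Whitney) identity, and `selection_frequency` ranks features by how often they were selected across repeated fits. Both function names and the input layout (one list of selected feature names per fit) are assumptions.

```python
from collections import Counter


def auc(y_true, y_prob):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute the AUC")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def selection_frequency(selected_per_fit):
    """Rank features by the percentage of fits in which they were selected.

    `selected_per_fit` is assumed to hold one collection of feature names
    per cross-validation fit (hypothetical layout for this example).
    """
    counts = Counter(name for fit in selected_per_fit for name in set(fit))
    n_fits = len(selected_per_fit)
    # Sort by descending count, breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(name, 100.0 * k / n_fits) for name, k in ranked]


if __name__ == "__main__":
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
    print(selection_frequency([["a", "b"], ["a"], ["a", "c"], ["b"]]))
    # [('a', 75.0), ('b', 50.0), ('c', 25.0)]
```

The pairwise AUC is quadratic in the sample size, which is usually acceptable for the small cohorts mentioned above; a rank-based implementation would be preferable for large datasets.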

Heatmaps and volcano plots: a direct application of heatmaply with its built-in clustering.