Module: Advanced Expression Survival
Step 1: Select Cancer type
ACC: Adrenocortical Carcinoma (n=92)
BLCA: Bladder Urothelial Carcinoma (n=412)
BRCA: Breast Invasive Carcinoma (n=1098)
CESC: Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (n=307)
CHOL: Cholangiocarcinoma (n=51)
COAD: Colon Adenocarcinoma (n=461)
DLBC: Lymphoid Neoplasm Diffuse Large B-cell Lymphoma (n=58)
ESCA: Esophageal Carcinoma (n=185)
GBM: Glioblastoma Multiforme (n=617)
HNSC: Head and Neck Squamous Cell Carcinoma (n=528)
KICH: Kidney Chromophobe (n=113)
KIRC: Kidney Renal Clear Cell Carcinoma (n=537)
KIRP: Kidney Renal Papillary Cell Carcinoma (n=291)
LAML: Acute Myeloid Leukemia (n=200)
LGG: Brain Lower Grade Glioma (n=516)
LIHC: Liver Hepatocellular Carcinoma (n=377)
LUAD: Lung Adenocarcinoma (n=585)
LUSC: Lung Squamous Cell Carcinoma (n=504)
MESO: Mesothelioma (n=87)
OV: Ovarian Serous Cystadenocarcinoma (n=608)
PAAD: Pancreatic Adenocarcinoma (n=185)
PCPG: Pheochromocytoma and Paraganglioma (n=179)
PRAD: Prostate Adenocarcinoma (n=500)
READ: Rectum Adenocarcinoma (n=172)
SARC: Sarcoma (n=261)
SKCM: Skin Cutaneous Melanoma (n=470)
STAD: Stomach Adenocarcinoma (n=443)
TGCT: Testicular Germ Cell Tumors (n=150)
THCA: Thyroid Carcinoma (n=507)
THYM: Thymoma (n=124)
UCEC: Uterine Corpus Endometrial Carcinoma (n=560)
UCS: Uterine Carcinosarcoma (n=57)
UVM: Uveal Melanoma (n=80)


 

Step 2: Input Genes



BP: Biological Process
CC: Cellular Component
MF: Molecular Function


Step 3: Analysis Options




Survival endpoint: Overall, Progression-free, Disease-specific, Disease-free

Option 1: Single gene
Runtime 1 gene: ~40 secs
Runtime 10 genes: ~40 secs
Runtime 100 genes: ~64 secs

Option 2: Composite gene expression score
Runtime 10 genes: ~54 secs
Runtime 40 genes: ~55 secs
Runtime 100 genes: ~66 secs
Methods to correct Overfitting:
Permutation 100  Cross Validation

Option 3: Feature selection then selective inference
thresh: Convergence threshold for coordinate descent. Each inner coordinate-descent loop continues until the maximum change in the objective after any coefficient update is less than thresh times the null deviance.
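
The stopping rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code this module runs: it fits a lasso-penalised least-squares model (the module's actual model is not stated here), the function names lasso_coordinate_descent and soft_threshold are made up for the example, and the weighted squared coefficient change is used as a stand-in for "change in the objective".

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; return 0.0 when |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, thresh=1e-7, max_iter=1000):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of rows. Columns of X and y are assumed to be centred,
    so no intercept is fitted (an assumption of this sketch).
    Returns the coefficient list and the number of sweeps performed.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)                       # y - X b, starting from b = 0
    null_dev = sum(v * v for v in y) / n     # deviance of the empty model
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    sweeps = 0
    for sweeps in range(1, max_iter + 1):
        max_change = 0.0
        for j in range(p):
            if col_ss[j] == 0.0:
                continue                     # constant column, nothing to fit
            # Covariance of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_ss[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                beta[j] = new_bj
                # Proxy for the change in the objective caused by this update.
                max_change = max(max_change, col_ss[j] * delta * delta)
        # Stop once the largest change in a sweep falls below thresh * null deviance.
        if max_change < thresh * null_dev:
            break
    return beta, sweeps
```

A smaller thresh means more sweeps and a more precise solution; a larger value stops earlier at the cost of accuracy.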

LASSO
Lasso is a regularization technique for linear regression that predicts an outcome from a subset of the variables chosen to minimize prediction error. Lasso uses a penalty term that constrains the size of the estimated coefficients, and in this respect it resembles ridge regression. Lasso is a shrinkage estimator: it generates coefficient estimates that are biased toward zero. Nevertheless, a lasso estimator can have a smaller mean squared error than an ordinary least-squares estimator when applied to new data, because the reduction in variance can outweigh the added bias. This is known as the bias-variance tradeoff. Unlike ridge regression, lasso sets more coefficients exactly to zero as the penalty term increases, which results in a smaller model with fewer predictors.
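
The contrast with ridge regression is easiest to see in the textbook special case of an orthonormal design, where both estimators have closed forms in terms of the ordinary least-squares coefficients. The sketch below assumes that special case and the objectives (1/2)||y - Xb||^2 + lam*||b||_1 for lasso and (1/2)||y - Xb||^2 + (lam/2)*||b||_2^2 for ridge; the function names and example coefficients are hypothetical.

```python
import math


def lasso_orthonormal(ols_coefs, lam):
    """Lasso solution for an orthonormal design: soft-threshold each OLS coefficient."""
    return [math.copysign(max(abs(b) - lam, 0.0), b) for b in ols_coefs]


def ridge_orthonormal(ols_coefs, lam):
    """Ridge solution for an orthonormal design: scale each OLS coefficient by 1/(1+lam)."""
    return [b / (1.0 + lam) for b in ols_coefs]


def count_zeros(coefs):
    """Number of coefficients that are exactly zero (predictors dropped from the model)."""
    return sum(1 for c in coefs if c == 0.0)


# Hypothetical OLS coefficients for four predictors.
ols = [3.0, -1.5, 0.4, -0.1]

for lam in (0.5, 2.0):
    lasso = lasso_orthonormal(ols, lam)
    ridge = ridge_orthonormal(ols, lam)
    print(lam, count_zeros(lasso), count_zeros(ridge))
```

As lam grows, lasso drops more predictors entirely, while ridge only scales every coefficient down and never reaches exactly zero.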
Elastic Net
Elastic net is a hybrid of ridge regression and lasso regularization. The main difference between lasso and ridge is the penalty term they use. Ridge uses an L2 penalty, which limits the size of the coefficient vector. Lasso uses an L1 penalty, which imposes sparsity among the coefficients and thus makes the fitted model more interpretable. Elastic net was introduced as a compromise between these two techniques and uses a penalty that mixes the L1 and L2 norms. Like lasso, elastic net can perform variable selection by shrinking some coefficients to zero. If your study has two highly correlated variables, ridge regression shrinks their coefficients towards one another. Lasso, by contrast, generally picks one variable over the other, and one cannot know in advance which variable gets picked. Elastic net attempts to shrink and do a sparse selection simultaneously. Empirical studies have suggested that elastic net can outperform lasso on data with highly correlated predictors.
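
A short sketch of the mixed penalty may help. It assumes the parameterisation used by the glmnet package, lam * ((1 - alpha)/2 * ||b||_2^2 + alpha * ||b||_1), where alpha = 1 gives lasso and alpha = 0 gives ridge; whether this module exposes alpha in exactly this form is an assumption, and the function name is made up for the example.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Elastic-net penalty: lam * ((1 - alpha)/2 * sum(b^2) + alpha * sum(|b|)).

    alpha = 1.0 is the pure lasso (L1) penalty, alpha = 0.0 is pure ridge (L2).
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * ((1.0 - alpha) / 2.0 * l2_squared + alpha * l1)


# Two ways to give a pair of perfectly correlated predictors a total effect of 2:
shared = [1.0, 1.0]   # split the effect evenly between the two predictors
single = [2.0, 0.0]   # put the whole effect on one predictor

for alpha in (1.0, 0.5, 0.0):
    print(alpha,
          elastic_net_penalty(shared, 1.0, alpha),
          elastic_net_penalty(single, 1.0, alpha))
```

With alpha = 1 the two allocations cost the same, which is why lasso has no preference and ends up picking one predictor. Any alpha below 1 makes the shared allocation cheaper, so the L2 part pulls correlated coefficients towards one another.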

Use for analysis option 3 only

 



Citation

AESA paper https://www.ncbi.nlm.nih.gov/pubmed/31607216
Advancing Pan-cancer Gene Expression Survival Analysis by Inclusion of Non-coding RNA. RNA Biology, 2019