Clinical application of genomic data in cancer
Issuing time: 2018-10-23 00:00

With the application of high-throughput experimental techniques such as sequencing in tumor research, a growing body of genomic data has been used in recent years to diagnose and treat tumor patients and to predict their prognosis. The number of methods for analyzing and interpreting these data has grown as well. The main genome analysis steps and tools are shown in Figure 1 [1]:

1. Identification of mutations: aligning sequencing reads to the reference genome (Alignment); identifying mutations (Variant Calling) and distinguishing somatic from germline mutations; and annotating the mutation data.
2. Interpretation of mutations: identifying molecular alterations that can guide targeted treatment, and dividing patients into risk groups through survival analysis to achieve individualized monitoring and treatment [2].
In this article, we describe two analytical approaches that use genomic data to predict prognosis and guide therapy.

1. Application of genomic data to predict prognosis in cancer patients

(1) Survival analysis with survival time as a continuous variable

All samples are first divided into a training cohort (used to screen prognosis-related variables and build the survival model) and a test cohort (used to test the survival model built in the training cohort). In general, 80% of the samples are randomly assigned to the training cohort (training set) and the remaining 20% to the test cohort (test set). In the training cohort, prognostic variables are first selected by univariate Cox survival analysis (P < 0.05). Two modeling approaches are then applied to the selected variables: 1. Cox: the multivariate Cox regression model and the LASSO model; 2. RSF: Random Survival Forest. Finally, the predictive validity of the resulting prognostic model is evaluated with the C-index (see Figure 2).

The main tools in this workflow are: Cox univariate and multivariate analysis: the "survival" package in R; LASSO analysis: the R "glmnet" package; Random Survival Forest analysis: the R "randomSurvivalForest" package or its newer successor, the "randomForestSRC" package.
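The cohort split and the C-index evaluation described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the code of the R packages named above: the function names `split_cohorts` and `concordance_index` are hypothetical, the C-index shown is the common Harrell formulation, and pairs with tied survival times are simply skipped for brevity.

```python
import random
from itertools import combinations


def split_cohorts(sample_ids, train_fraction=0.8, seed=0):
    """Randomly assign samples to a training and a test cohort (80/20 by default)."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times       -- follow-up time of each sample
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- model output; a higher score means a worse predicted prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # the earlier sample is censored, so the true order is unknown
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0  # the sample that failed earlier was scored as higher risk
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable
```

A C-index of 0.5 corresponds to random ranking and 1.0 to a perfect ordering of patients by risk. In practice the risk scores would come from the Cox, LASSO or Random Survival Forest model fitted on the training cohort and applied to the test cohort.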
(2) Analysis of prognostic status as a categorical variable

Setting a cutoff turns the continuous survival time into a categorical variable with two classes, "good prognosis" and "poor prognosis". Prognostic variables can first be selected in two ways: ANOVA and Shrunken Centroids. Eight algorithms are then available for classification on the selected variables: Diagonal Discriminant Analysis (DDA), K-Nearest Neighbor (KNN), Discriminant Analysis (DA), Logistic Regression (LR), Nearest Centroid (NC), Partial Least Squares (PLS), Random Forest (RF) and Support Vector Machine (SVM). 10-fold cross-validation is used to assess performance, and the predictive power of the resulting prognostic model is evaluated with the C-index.

2. Identifying somatic mutations in clinically relevant genes

Precision Heuristics for Interpreting the Alteration Landscape (PHIAL): The PHIAL algorithm classifies somatic mutations according to their clinical relevance and thereby identifies potential therapeutic targets.

Clinically relevant genes are genes whose somatic mutations can cause resistance or response to treatment and/or affect diagnosis or prognosis. A clinically actionable gene is any gene in cancer whose somatic mutations can predict efficacy of, or resistance to, a particular cancer treatment, or that has diagnostic or prognostic value.

First, literature search, manual curation and expert advice yielded 121 actionable genes, which are stored in the TARGET database (http://www.broadinstitute.org/cancer/cga/target). Genes in the database can be used to guide treatment, predict prognosis and assist diagnosis. PHIAL is built around the 121 TARGET genes on three principles: 1. ranking by clinical and biological relevance; 2. linking TARGET genes to other biologically relevant pathways or gene sets; 3. demoting mutations of unknown significance.

To distinguish and rank mutations as finely as possible, further criteria are applied in addition to membership in TARGET. These include: alterations in genes listed in the Cancer Gene Census; alterations in genes that share an MSigDB-curated pathway with an actionable gene altered in the same sample; alterations in genes that belong to MSigDB cancer pathways or gene sets; and mutations recorded in COSMIC (Figure 3). All PHIAL code is available as an R package (http://www.broadinstitute.org/cancer/cga/phial/) [3, 4].
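To make the ranking idea concrete, the following is a toy sketch of a tiered heuristic in the spirit of the criteria listed above. It is not the PHIAL R package: the tier order, the function names `score_alteration` and `rank_sample`, and the shape of the inputs (plain Python sets of gene symbols) are all assumptions made for illustration.

```python
# Tier labels, highest clinical priority first. The order is an assumption.
TIERS = [
    "TARGET actionable gene",
    "Cancer Gene Census gene",
    "shares a pathway with an altered actionable gene",
    "member of a cancer pathway or gene set",
    "recorded in COSMIC",
    "unknown significance",
]


def score_alteration(gene, sample_genes, target, census, pathways, cancer_sets, cosmic):
    """Return the TIERS index for one altered gene (0 = highest priority)."""
    if gene in target:
        return 0
    if gene in census:
        return 1
    # Actionable genes that are also altered in this same sample.
    altered_actionable = (sample_genes & target) - {gene}
    if any(gene in members and members & altered_actionable
           for members in pathways.values()):
        return 2
    if gene in cancer_sets:
        return 3
    if gene in cosmic:
        return 4
    return 5  # demoted: nothing links this alteration to known cancer biology


def rank_sample(sample_genes, **resources):
    """Sort the altered genes of one sample from most to least clinically relevant."""
    sample_genes = set(sample_genes)
    scored = sorted((score_alteration(g, sample_genes, **resources), g)
                    for g in sample_genes)
    return [(gene, TIERS[tier]) for tier, gene in scored]
```

Here `target`, `census`, `cancer_sets` and `cosmic` are sets of gene symbols and `pathways` maps a pathway name to the set of its member genes. A call such as `rank_sample(altered_genes, target=..., census=..., pathways=..., cancer_sets=..., cosmic=...)` returns the genes of one sample with the highest-priority tier first.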
The methods above can be used both to interpret an individual patient's genomic data and to analyze genomic data in retrospective studies. We hope they offer some useful pointers.