The following is part of a previously presented poster. The full content is available upon request.
Prognosis vs. Prediction in Precision Medicine
Prognostic Biomarker: provides information about the patients’ overall outcome, regardless of therapy.
Statistical test: is Marker X associated with an efficacy endpoint?
Predictive Biomarker: provides information about the effect of a therapeutic intervention; can be a target for therapy
Statistical test: is Marker X associated with the differential effect between treatments on an efficacy endpoint (treatment comparison)?
Statistical methods to evaluate a biomarker’s prognosis and prediction.
Clinical trial designs based on prognostic and predictive biomarkers
Prognostic Enrichment: to identify patients with a greater likelihood of having the event (or a large change in a continuous measure) of interest in a trial
Advantages: to increase the power of a study to detect any given level of risk reduction.
Predictive Enrichment: to identify patients more likely to respond to a particular intervention
Advantages: to improve study efficiency or feasibility, or to enhance the benefit-risk relationship
Defining a fit-for-purpose threshold for a biomarker on a continuous scale
To determine the optimization goal:
the maximal difference in an outcome between the marker-selected and unselected populations defined by the cutoff
the maximal or a targeted difference between treatments in the marker-selected population
the maximal interaction effect between treatment and a categorized biomarker
a target efficacy outcome, e.g. response rate, median survival
an optimal sensitivity and specificity combination, e.g. Youden index by ROC
prevalence consideration
concordance with a reference biomarker
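As one illustration of the sensitivity/specificity goal listed above, the sketch below scans candidate cutoffs on a continuous marker and keeps the one with the largest Youden index (J = sensitivity + specificity - 1). The function name `youden_cutoff`, the convention that marker values at or above the cutoff count as marker-positive, and the toy data are assumptions made for illustration, not part of the poster.

```python
import numpy as np

def youden_cutoff(marker, outcome):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Assumption: patients with marker >= cutoff are called marker-positive,
    and outcome is 1/True for responders (or events), 0/False otherwise.
    """
    marker = np.asarray(marker, dtype=float)
    outcome = np.asarray(outcome, dtype=bool)
    best_cutoff, best_j = None, -np.inf
    for c in np.unique(marker):            # every observed value is a candidate
        positive = marker >= c
        sensitivity = np.mean(positive[outcome])     # true positive rate
        specificity = np.mean(~positive[~outcome])   # true negative rate
        j = sensitivity + specificity - 1.0
        if j > best_j:                     # ties keep the lowest cutoff
            best_cutoff, best_j = c, j
    return best_cutoff, best_j

# Toy, made-up data for illustration only.
marker = [0.2, 0.4, 0.5, 0.9, 1.1, 1.3, 1.8, 2.0]
responder = [0, 0, 1, 0, 1, 1, 1, 1]
cutoff, j = youden_cutoff(marker, responder)
print(cutoff, round(j, 3))
```

In practice the Youden-optimal cutoff would still be weighed against the other goals in the list, such as prevalence of the marker-positive group and concordance with a reference biomarker.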
Finding Subgroups of Patients Responsive to Treatment
There is growing interest in modern medicine in identifying subgroups of patients, defined by particular characteristics, who can receive individualized treatment. Finding the subgroups that respond to a treatment can draw on a range of classification methods, including logistic regression, discriminant analysis, nearest neighbor, and CART. Traditional methods, specifically, examine whether the treatment effect varies across subgroups by including the characteristic factor as a moderator in a multiple regression model. For example, the regression model Y = a + b*X + c*T + d*X*T + e is commonly used to examine whether the effect of a treatment (T) differs significantly between subgroups (X). If the interaction between X and T (the coefficient d) is significant, the effect of T differs between subgroups, and the subgroup with a positive treatment effect can be identified as responsive to T.
This method seems straightforward. In practice, however, it runs into three problems as the number of characteristic factors grows. First, testing numerous characteristic factors as moderators of the treatment effect inflates the Type I error rate, while a Bonferroni adjustment can be overly conservative and leave no treatment-by-subgroup effect significant. Second, a sequence of separate moderator-by-treatment tests does not examine multiple moderators simultaneously, which makes the results difficult to interpret. Third, when the model is cross-validated, the limited sample sizes in the training and test datasets can lead to a large error rate (2-fold) or to inconsistent moderator-by-treatment interactions across training sets (n-fold). It is therefore necessary to develop methodology, based on multivariate modeling techniques, that identifies subgroups from numerous characteristics.
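The traditional moderator approach can be sketched as an ordinary least squares fit of Y = a + b*X + c*T + d*X*T + e, where the t-statistic of d measures the treatment-by-subgroup interaction. This is a minimal sketch using NumPy only; the function name `fit_interaction` and the simulated data (including the true coefficients) are assumptions for illustration.

```python
import numpy as np

def fit_interaction(y, x, t):
    """OLS fit of Y = a + b*X + c*T + d*X*T + e.

    Returns (coef, tstat), each ordered as (a, b, c, d).
    """
    y, x, t = (np.asarray(v, dtype=float) for v in (y, x, t))
    design = np.column_stack([np.ones_like(x), x, t, x * t])
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = len(y) - design.shape[1]            # residual degrees of freedom
    sigma2 = resid @ resid / dof              # residual variance estimate
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef, coef / np.sqrt(np.diag(cov))

# Simulated (made-up) data: binary subgroup X, binary treatment T,
# and a true interaction effect d = 1.5.
rng = np.random.default_rng(0)
n = 400
x = rng.integers(0, 2, n)
t = rng.integers(0, 2, n)
y = 1.0 + 0.5 * x + 0.2 * t + 1.5 * x * t + rng.normal(0.0, 1.0, n)

coef, tstat = fit_interaction(y, x, t)
print("estimated d:", coef[3], "t-statistic:", tstat[3])
```

With m candidate moderators tested one at a time, a Bonferroni adjustment would compare each interaction p-value with alpha / m, which is where the conservativeness mentioned above comes from.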
The method used here is derived from principal components and Bayesian statistics. It allows multiple characteristic factors to be pulled into a single principal component simultaneously, with the goal of maximizing the posterior probability of a characteristic factor given a certain group membership. The number of subgroups can be determined by comparing models on criteria such as AIC, BIC, and aBIC. More technical details can be found at http://methodology.psu.edu/ra/lca. The method has also been applied to another study in diagnostic testing, Concordance between Gambling Disorder Diagnoses in the DSM-IV and DSM-5, where it provided readily interpretable results. The difference here is that this work adds a distal outcome (responsive/non-responsive to treatment) that is associated with group membership.
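The model-comparison step can be illustrated with the standard definitions AIC = -2 logL + 2k, BIC = -2 logL + k ln(n), and the sample-size adjusted BIC, aBIC = -2 logL + k ln((n + 2) / 24). The sketch below assumes a latent class model with binary indicators, so that a model with C classes and J items has (C - 1) + C*J free parameters. The log-likelihood values, sample size, and number of items are hypothetical placeholders, not results from this study.

```python
import math

def lca_n_params(n_classes, n_binary_items):
    """Free parameters of a latent class model with binary items (assumed):
    (C - 1) class proportions plus C * J item-response probabilities."""
    return (n_classes - 1) + n_classes * n_binary_items

def information_criteria(log_lik, n_params, n_obs):
    """AIC, BIC, and sample-size adjusted BIC for one fitted model."""
    deviance = -2.0 * log_lik
    return {
        "AIC": deviance + 2.0 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "aBIC": deviance + n_params * math.log((n_obs + 2) / 24.0),
    }

# Hypothetical log-likelihoods for 1-, 2-, and 3-class models (illustration only).
hypothetical_fits = {1: -3200.0, 2: -3050.0, 3: -3040.0}
n_obs, n_items = 500, 6

table = {
    c: information_criteria(ll, lca_n_params(c, n_items), n_obs)
    for c, ll in hypothetical_fits.items()
}
best_by_bic = min(table, key=lambda c: table[c]["BIC"])
best_by_aic = min(table, key=lambda c: table[c]["AIC"])
print(best_by_bic, best_by_aic)
```

Lower values indicate a better trade-off between fit and complexity. The criteria need not agree, since BIC penalizes additional classes more heavily than AIC at this sample size.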
The outcome of the study was a positive/non-positive change in patients' lipid profiles under pharmaceutical treatment. As seen from the plot above, the entire patient population was classified into 2 subgroups: a responsive group and a non-responsive group. Responders were more likely to be male and to have high pre-treatment levels of cholesterol and LDL, and less likely to have high systolic blood pressure and low levels of HDL prior to treatment. The finding about male responders was consistent with another study that applied a traditional regression method to the same dataset. The patients' levels of cholesterol, LDL, HDL, and blood pressure appear to be associated with the distal outcome, the change in lipid profile. Although further interpretation in terms of the drugs' functional mechanisms is needed, this classification method depicts a full profile of the patients who respond differently to a given type of treatment, together with the likelihood of each individual characteristic factor, and it may serve as a Bayesian-based multivariate machine learning method to be applied elsewhere.
Merry Christmas~!
The Space of Common Psychiatric Disorders in Adolescents: Comorbidity Structure and Individual Latent Liabilities
A novel way of understanding psychiatric disorders in adolescents is mapping the disorders into a geometric space with a limited number of dimensions and no disorder aligning along one single dimension. In addition, it has also been found that the geometric dimensions are hierarchically organized, allowing for analyses at different levels of the organization. Furthermore, individuals with psychiatric disorders present with a broad range of liabilities, reflecting the diversity of their clinical presentations.
Method
Exploratory factor analyses of data from the National Comorbidity Survey Adolescent Supplement (NCS-A), with the psychiatric diagnoses as indicators, were used to identify the major latent psychopathological dimensions. The loadings of the disorders on those dimensions served as "coordinates" for calculating the Euclidean distances between disorders. The distribution of individuals in the space was based on the latent factor scores reflecting the major psychopathological conditions. A second-order factor analysis was also conducted to show that these common psychiatric disorders are hierarchically organized.
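The "coordinates" idea can be sketched as follows: each disorder is a point whose coordinates are its factor loadings, and the distance between two disorders is the Euclidean distance between those points. The function name `loading_distances`, the placeholder disorder names, and the loading values are hypothetical and are not estimates from the NCS-A.

```python
import numpy as np

def loading_distances(loadings):
    """Pairwise Euclidean distances between rows of a loading matrix.

    loadings: array of shape (n_disorders, n_factors), one row per disorder.
    Returns a symmetric (n_disorders, n_disorders) distance matrix.
    """
    loadings = np.asarray(loadings, dtype=float)
    diff = loadings[:, None, :] - loadings[None, :, :]   # all row pairs
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical loadings of three placeholder disorders on two factors.
names = ["disorder_A", "disorder_B", "disorder_C"]
loadings = [[0.8, 0.1],
            [0.7, 0.2],
            [0.1, 0.9]]

dist = loading_distances(loadings)
for i, name in enumerate(names):
    print(name, np.round(dist[i], 3))
```

In this toy example disorder_A and disorder_B load on the same factor and sit close together, while disorder_C lies far from both.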
Hierarchical structure of common psychiatric disorders: results of 2nd-order factor analysis of the 16 common psychiatric disorders in adolescents - a paper publication version.
Hierarchical structure of common psychiatric disorders: results of 2nd-order factor analysis of the 16 common psychiatric disorders in adolescents - a web clickable version.