Invented by Antoaneta Petkova VLADIMIROVA, Yogesh P. PANDIT, Vishakha SHARMA, Tod M. Klingler, Hari SINGHAL, Roche Molecular Systems Inc
The Roche Molecular Systems Inc invention works as follows
The invention relates to methods and systems for performing a clinical prediction. In one example, the method includes receiving first biopsy image data and first molecular data of a patient. A machine learning model then processes the first biopsy image data and the first molecular data to predict the patient's response to a treatment.
Background for Multimodal machine learning-based clinical predictor
Cancer is a frightening disease. Many cancer types are difficult to detect early, and the cancer may already have reached an advanced stage by the time it is diagnosed. Ovarian cancer, for example, is responsible for about 3% of deaths in women and has been dubbed the "silent killer" because its symptoms can be mistaken for benign conditions. Likewise, early signs of lung cancer, such as coughing and chest pain, can be confused with more benign conditions (e.g., a cold). In addition, many cancer patients who follow the standard of care (such as surgery followed by platinum-based chemotherapy) may develop resistance to the platinum treatment. The ability to predict which patients will relapse and become resistant could allow other therapeutic options, such as clinical trials, to be considered earlier in the patient's journey, increasing the chances of recovery and survival. However, there is currently no predictor that accurately predicts a cancer patient's response to treatment.
Disclosed are techniques for performing a clinical prediction based on multimodal clinical data, using a machine learning model. Multimodal clinical data refers to clinical data of different types, such as molecular data and biopsy image data. A clinical prediction can include, for example, predicting the response to a treatment, predicting the risk that a patient will have a disease, outcome prediction, biomarker extraction, or target identification for drug research. As one example, the clinical prediction could include predicting whether a patient is likely to be sensitive or resistant to platinum drug treatment of ovarian cancer. Predicting a patient's survival rate under different treatments (e.g., immunotherapy, chemotherapy) for lung cancer is another example. The techniques can also be used for other disease areas and clinical hypotheses.
In some embodiments, the techniques include receiving multimodal clinical data of a patient. This data may include first molecular data and first biopsy image data. The techniques also include using a machine learning model to process the first molecular data and the first biopsy image data in order to make a clinical prediction. The machine learning model can be generated or updated using second molecular data and second biopsy image data of multiple patients. The techniques also include generating an output of the clinical prediction.
The molecular data can include feature vectors that represent, for example, RNA-seq (ribonucleic acid (RNA) sequencing) data, microRNA-seq (miRNA sequencing) data, protein expression data (for example, PD-L1 antibody), gene mutation data, deoxyribonucleic acid (DNA) methylation data, or copy number variation (CNV) data. RNA-seq and miRNA-seq data can include gene expression patterns found in RNAs and miRNAs. The biopsy image data may include histopathology images, such as hematoxylin and eosin (H&E) stained images. The machine learning model may include, for instance, a Naive Bayes (NB) model, a logistic regression (LR) model, a random forest (RF) model, a support vector machine (SVM) model, an artificial neural network (ANN) model, a multilayer perceptron (MLP) model, a convolutional neural network (CNN), or other machine learning and deep learning models. The machine learning model may be trained or updated using supervised or unsupervised learning techniques.
Below are detailed descriptions of these and other embodiments. Other embodiments include, for example, systems, devices and computer-readable media that are associated with the methods described in this document.
The following detailed description, together with the accompanying drawings, will help you to better understand the nature and benefits of embodiments of this invention.
Disclosed are methods for making a clinical prediction. A clinical prediction can include predicting the response to a drug treatment, or predicting the likelihood of a patient developing a particular disease. One example is predicting whether a patient will be sensitive or resistant to platinum drug treatment of ovarian cancer. The disclosed techniques can be applied to other cancer types and clinical questions.
More precisely, multimodal clinical data, such as a patient's molecular data and biopsy image data, can be collected and processed by a machine learning model in order to make a clinical prediction about the patient's response to treatment. The molecular data may include, for instance, RNA-seq data, miRNA-seq data, gene mutation data, DNA methylation data, and copy number variation (CNV) data. The molecular data can be represented numerically by feature vectors based, for instance, on a mapping of the molecular data (e.g., gene expressions) to pre-determined codes.
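The mapping of molecular data to pre-determined codes can be sketched as follows. This is only an illustration under stated assumptions: the gene panel, the expression thresholds, and the integer code book are all hypothetical and are not specified in the source.

```python
from typing import Dict, List

# Hypothetical fixed gene panel; its order defines the feature-vector layout.
GENE_PANEL: List[str] = ["BRCA1", "BRCA2", "TP53", "CD274"]

# Hypothetical code book: expression level -> pre-determined integer code.
LOW, MEDIUM, HIGH, MISSING = 0, 1, 2, -1


def encode_expression(value: float, low_cut: float = 1.0, high_cut: float = 10.0) -> int:
    """Map one expression value to a code (the cut-offs are assumptions)."""
    if value < low_cut:
        return LOW
    if value < high_cut:
        return MEDIUM
    return HIGH


def molecular_feature_vector(expression: Dict[str, float]) -> List[int]:
    """Build a fixed-length vector with one code per gene in GENE_PANEL."""
    return [
        encode_expression(expression[gene]) if gene in expression else MISSING
        for gene in GENE_PANEL
    ]


# Made-up expression values for one patient (BRCA2 was not measured).
patient_expression = {"BRCA1": 0.4, "TP53": 25.0, "CD274": 3.2}
vector = molecular_feature_vector(patient_expression)
print(vector)  # [0, -1, 2, 1]
```

Keeping the panel order fixed means every patient yields a vector of the same length and layout, which is what a downstream model would need.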
The biopsy image data can include numerical feature vectors that are extracted from a biopsy image, such as an image of a primary tumor biopsy. The biopsy image can be pre-processed and can include histopathology (H&E-stained) data. The numerical feature vectors can be extracted from the raw biopsy image data by a second machine learning model, which can include a CNN model. The CNN model can include, for instance, at least one of the following: VGG (Visual Geometry Group), ResNet, DenseNet, GAN, etc. In some cases, the CNN model is derived using transfer learning techniques: the weights of the lower layers are developed for other tasks, such as identifying a shape or boundary, whereas the weights of the higher layers are trained on the outputs of the lower layers in order to identify features of cells and tissues that can be used to predict a patient's response to treatment. In some cases, the second machine learning model can also be configured to process radiology images.
As part of the extraction process, the data from the biopsy images can be segmented by tissue type. In addition, cells such as lymphocytes that serve as biomarkers for measuring a treatment response can be detected. The segmentation of the biopsy image data and the detection of the lymphocytes, along with their location, can be used to create biopsy image feature data that can then be fed into the machine learning model.
The clinical prediction can include predicting whether the patient is sensitive or resistant to a platinum drug treatment for a specific cancer type. It could also include predicting the survival rate of patients who receive a certain treatment for that cancer type. In some cases, the machine learning model can be created or updated using supervised learning techniques based on labeled molecular data and labeled biopsy image data of a plurality of patients. The plurality of patients can include one group of cancer patients who are sensitive to a drug treatment and another group who are resistant to that drug treatment. The machine learning model can be trained, for example, to compute a score from biopsy image features and molecular features.
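As a rough illustration of the supervised case, the sketch below trains a logistic regression model (one of the model types the disclosure lists) on concatenated biopsy image features and molecular features, then computes a score for new patients. The feature values, labels, learning rate, and epoch count are synthetic assumptions made for this example; the source does not specify a training procedure.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_score(w: List[float], b: float, x: List[float]) -> float:
    """Score in (0, 1); higher means more likely sensitive (label 1)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X: List[List[float]], y: List[int],
                   lr: float = 0.5, epochs: int = 2000) -> Tuple[List[float], float]:
    """Plain batch gradient descent on the logistic loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_score(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic labeled cohort: two image features + one molecular feature each.
image_features = [[0.9, 0.8], [0.8, 0.7], [0.7, 0.9],   # sensitive group
                  [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]   # resistant group
molecular_features = [[1.0], [0.9], [0.8], [0.1], [0.2], [0.0]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = sensitive, 0 = resistant

# Multimodal input: concatenate both modalities per patient.
X = [img + mol for img, mol in zip(image_features, molecular_features)]
w, b = train_logistic(X, labels)

score_sensitive = predict_score(w, b, [0.85, 0.8] + [0.9])
score_resistant = predict_score(w, b, [0.15, 0.2] + [0.1])
print(score_sensitive > 0.5, score_resistant < 0.5)
```

Concatenation is the simplest way to combine the two modalities; other fusion strategies are possible and the choice here is purely illustrative.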
In certain examples, the machine learning model can be created or updated using unsupervised learning techniques. The correlation between biopsy image features and molecular features can be calculated, and highly correlated pairs of biopsy image features and molecular features can be identified. The highly correlated biopsy image features and molecular features of a pool of patients can be classified into groups according to their values. Patients whose biopsy image features and molecular features fall into the same group can then be assigned to a cohort. A clinical prediction for a new patient can be made based on the group into which the new patient's biopsy image features and molecular features fall, and on the treatment response of the cohort associated with that group.
The disclosed embodiments allow a machine learning (ML)-based clinical predictor to predict a patient's response to treatment using molecular data and biopsy image data. These data may contain known or hidden information about the patient, such as pathological or biological information, that could determine the response to treatment. The predictor can learn from the molecular data and biopsy image data of patients together with their treatment responses, and can refine its predictions using various machine learning techniques. With embodiments of the disclosure, a prediction of a treatment response can be made earlier and more accurately. Corrective actions, such as considering additional therapeutic options or available clinical trials, can also be taken earlier. All of these factors can increase the chances of recovery and survival for the patient.
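The unsupervised flow described above can be sketched in a few steps: compute correlations between image and molecular features, keep a highly correlated pair, group patients by the values of that pair, and predict for a new patient from the response of the matching cohort. Everything concrete here is an assumption for illustration: the feature names, the synthetic patient pool, the 0.9 correlation threshold, and the median-split grouping rule are not taken from the source.

```python
from collections import defaultdict
from statistics import mean, median


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


IMG_FEATURES = ["lymph_density", "nuclei_area"]  # hypothetical names
MOL_FEATURES = ["gene_a", "gene_b"]              # hypothetical names

# Synthetic patient pool with known treatment responses.
pool = [
    {"lymph_density": 0.9, "nuclei_area": 0.30, "gene_a": 8.0, "gene_b": 1.0, "responded": True},
    {"lymph_density": 0.8, "nuclei_area": 0.60, "gene_a": 7.5, "gene_b": 3.0, "responded": True},
    {"lymph_density": 0.7, "nuclei_area": 0.40, "gene_a": 7.0, "gene_b": 2.0, "responded": False},
    {"lymph_density": 0.2, "nuclei_area": 0.50, "gene_a": 2.0, "gene_b": 1.5, "responded": False},
    {"lymph_density": 0.3, "nuclei_area": 0.35, "gene_a": 2.5, "gene_b": 3.5, "responded": False},
    {"lymph_density": 0.1, "nuclei_area": 0.55, "gene_a": 1.0, "gene_b": 2.5, "responded": False},
]


def column(name):
    return [p[name] for p in pool]


# Step 1: keep (image, molecular) pairs whose |correlation| is high.
pairs = [(i, m) for i in IMG_FEATURES for m in MOL_FEATURES
         if abs(pearson(column(i), column(m))) >= 0.9]

# Step 2: group patients by the values of the first correlated pair
# (assumed rule: at or above vs. below the pool median).
img_name, mol_name = pairs[0]
img_cut, mol_cut = median(column(img_name)), median(column(mol_name))


def group_of(patient):
    return (patient[img_name] >= img_cut, patient[mol_name] >= mol_cut)


# Step 3: each group defines a cohort with observed responses.
cohorts = defaultdict(list)
for p in pool:
    cohorts[group_of(p)].append(p["responded"])

# Step 4: predict for a new patient from the matching cohort.
new_patient = {"lymph_density": 0.85, "gene_a": 7.8}
cohort = cohorts[group_of(new_patient)]
predicted_rate = sum(cohort) / len(cohort)
print(pairs, round(predicted_rate, 3))
```

A real system would likely use a clustering method over many correlated pairs rather than a median split on one pair; the split is used here only to keep the example short.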
I. Patient Treatment Response Analysis Over Time
FIG. 1A and FIG. 1B show examples of mechanisms for analyzing a patient's response to a treatment over time. FIG. 1A shows a chart 100 that illustrates an example timeline for determining whether a patient with ovarian cancer is resistant or sensitive to platinum drug treatment. As shown in FIG. 1A, the patient can start the platinum drug treatment at time T1 and stop it at time T2. The treatment can have several outcomes. The primary tumor may recur or progress during the platinum drug treatment, between time T1 and time T2. The primary tumor may also recur or progress within the 180-day period between time T2, when the platinum drug treatment ends, and time T3. In both cases, the patient can be determined to be resistant to the platinum drug treatment. If the primary tumor does not recur or progress between T2 and T3 or after T3, the patient can be determined to be sensitive to the platinum drug treatment. If the primary tumor recurs or progresses only after time T3, the patient can likewise be determined to be sensitive to the platinum drug treatment.
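The labeling rule described for FIG. 1A can be written as a small function. This is a minimal sketch: the function name, the use of day numbers for T1 and T2, and the choice to treat a recurrence exactly on day T3 as resistant are assumptions, while the 180-day window comes from the text.

```python
from typing import Optional

RESISTANCE_WINDOW_DAYS = 180  # length of the T2-to-T3 window stated in the text


def platinum_response(t1_day: int, t2_day: int, recurrence_day: Optional[int]) -> str:
    """Label a patient as 'resistant' or 'sensitive' to platinum treatment.

    t1_day / t2_day: start and end of treatment, as day numbers.
    recurrence_day: day the primary tumor recurred or progressed,
                    or None if no recurrence/progression was observed.
    """
    if t2_day < t1_day:
        raise ValueError("treatment must end on or after the day it starts")
    if recurrence_day is None:
        return "sensitive"  # no recurrence or progression observed
    if recurrence_day < t1_day:
        raise ValueError("recurrence before treatment start is out of scope")
    t3_day = t2_day + RESISTANCE_WINDOW_DAYS
    # Recurrence during treatment (T1..T2) or within 180 days after it
    # (T2..T3) indicates resistance. Boundary handling is an assumption.
    if recurrence_day <= t3_day:
        return "resistant"
    return "sensitive"  # recurrence only after T3


print(platinum_response(0, 120, 90))    # resistant (during treatment)
print(platinum_response(0, 120, 250))   # resistant (within 180 days of T2)
print(platinum_response(0, 120, 400))   # sensitive (after T3 = day 300)
print(platinum_response(0, 120, None))  # sensitive (no recurrence)
```

Labels produced this way are the kind of ground truth that a supervised model of treatment response could be trained on.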
FIG. 1B shows a chart 150 that illustrates an example of a Kaplan-Meier (K-M) plot, which is used to study survival statistics for patients with a particular type of cancer, such as lung cancer, who have received a specific treatment. A K-M plot displays how the survival rate of a group of patients who received a specific treatment changes over time. As time passes, some patients die and the survival rate decreases. Other patients may be censored (removed) from the plot because of events that are not related to the event being studied; these unrelated events appear as tick marks on the K-M plot. The length of each horizontal line represents the survival duration for that interval, and each survival estimate at a given point represents the cumulative probability of surviving to that point.
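For readers unfamiliar with how a K-M curve is computed, the following sketch implements the standard Kaplan-Meier product-limit estimator. The follow-up times and event flags are made up for illustration and do not come from the patent's figure.

```python
from typing import List, Tuple


def kaplan_meier(durations: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return [(time, survival_probability)] at each distinct death time.

    events[i] is True if a death was observed at durations[i],
    False if the patient was censored at that time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


# Synthetic follow-up in months; False marks a censored patient.
durations = [2, 3, 3, 5, 8, 10]
events = [True, True, False, True, False, True]

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored patients lower the number at risk for later steps without causing a drop in the curve, which is why they show up only as tick marks on a K-M plot.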
As shown in FIG. 1B, chart 150 can include K-M plots for different cohorts. In FIG. 1B, the median survival (the time at which 50 percent of patients are estimated to survive) is approximately 11 months in cohort A, whereas in cohort B it is approximately 6.5 months. Results of K-M analyses are often reported using a hazard ratio, which expresses the chance (or hazard) of an event occurring in a cohort of patients who receive a certain treatment. A low hazard ratio can indicate that the cohort has a higher survival rate than another cohort that received a different treatment or no treatment.
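Median survival can be read off a K-M step curve as the first time at which the estimated survival probability falls to 50 percent or below. The sketch below shows this; the two curves are invented values chosen only to echo the approximate medians quoted in the text, and are not data from the figure.

```python
from typing import List, Optional, Tuple

Curve = List[Tuple[float, float]]  # (time in months, survival probability)


def median_survival(curve: Curve) -> Optional[float]:
    """First time at which S(t) <= 0.5, or None if the curve never gets there."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Invented step curves for two cohorts (illustrative values only).
cohort_a: Curve = [(3, 0.9), (7, 0.7), (11, 0.5), (15, 0.3)]
cohort_b: Curve = [(2, 0.8), (6.5, 0.45), (9, 0.2)]
cohort_c: Curve = [(4, 0.9), (12, 0.8)]  # median not reached

print(median_survival(cohort_a))  # 11
print(median_survival(cohort_b))  # 6.5
print(median_survival(cohort_c))  # None
```

When more than half of a cohort is still alive at the end of follow-up, the median is not reached, which the function reports as `None`.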
FIG. 2 shows a chart 200 that illustrates an improved method for predicting a patient's response to platinum drug treatment and performing a therapy based on that prediction. As shown in FIG. 2, the method begins at time T0.
At block 202, a treatment response prediction can be performed to predict the patient's response to medical treatment A. The treatment response prediction can be performed before medical treatment A starts at time T1. The prediction can be used to determine whether or not medical treatment A should be recommended. The prediction may include, for instance, whether the patient is likely to be responsive or resistant to medical treatment A, the predicted survival of the patient, and whether medical treatment A will have an impact on the patient's predicted survival.
The prediction of the patient's response to medical treatment A made at block 202 is used at block 204 to determine the medical treatment recommended for patient 201.