Harnessing Each Patient’s Data to Help Many More
At Cancer Commons, we don’t just help people navigate cancer treatment; we learn from everyone we help. Here, our Curious Dr. George asks Cancer Commons Clinical Scientist Kaumudi Bhawe, PhD, to share how new knowledge can be captured from every patient to help many more.
Curious Dr. George: Cancer Commons has accumulated in-depth data on many hundreds of patients with various cancers. Because of the molecular nature of cancers, many are very complex—even unique. How might these data be best studied and reported to help inform diagnostic and therapeutic decisions for many additional patients, who are in need of information to guide their own journeys in precision oncology?
Dr. Bhawe: We are more than just the sum of our body parts.
We are stories, waiting to be told.
This is especially true for cancer patients who are suddenly put face-to-face with the fear of death. Most of the time, their stories remain untold, and this wealth of information and insights may be lost to the medical community. Most organizations developing new oncology drugs and preventative screens realize the need for deep “N-of-1” studies (essentially, rigorous analyses of individual patients’ testing and treatment outcomes), the nuances of which can be mined for novel, testable insights. But many organizations are not equipped to employ an N-of-1 approach.
Enter Cancer Commons.
Here at Cancer Commons, we communicate in depth with each person who contacts us to understand their cancer-care possibilities, considering their own unique cancer context. Because we place high value on such a holistic understanding, we can tailor our recommendations of clinical trials and diagnostic testing possibilities in a truly personalized manner. Most patients come back to us when they are faced with a new decision point in their journey. As a result, we have longitudinal information on these patients, capturing a specific window of time in their lives. Cancer Commons has been helping patients for over ten years now, so we have accumulated a wealth of such information that can be mined for important insights.
Our patients’ interests always come first, so data de-identification to preserve patient anonymity is a must. Once that is done, there are multiple ways to use patient data in order to better understand the relationships between variables that have been captured in the dataset.
One very powerful way to conduct this kind of analysis is to employ a Bayesian machine-learning approach. I will explain below, but in technical terms, this is a statistical strategy in which current knowledge in the field is used to construct an initial multifactorial hypothesis, represented as a directed acyclic graphical model. That hypothesis is itself a set of nested, interconnected sub-hypotheses. Both the overall hypothesis (the model) and each nested sub-hypothesis (a relationship between two variables in the model) start with a 50/50 probability of being true. The beauty of such an approach is that the initial model (termed the “prior”) can be tested and modified continually in real time as real-world data are added to the dataset, and as new clinical information is added to the larger body of medical literature. See this presentation by Cancer Commons founder Marty Tenenbaum, PhD, for a deeper understanding of the larger conceptual framework of using Bayesian machine learning and artificial intelligence to beat cancer.
So what does the above paragraph really mean?
Let’s take a realistic but fictional example to understand this better. Suppose our question is: “What is an ovarian cancer patient’s likelihood of benefitting from treatment with a novel KRAS inhibitor?” The factors influencing patient benefit might include: cancer stage, grade, histological subtype, number of prior treatments, type of prior treatments, KRAS mutation type, and tumor mutational burden (TMB) status, among others. Our initial directed acyclic graph model (the prior) might state that each of the known factors feeds independently, and with equal weight, into a binary output called “benefit from trial” with two possible states, “yes” and “no”. We assign an initial 50/50 probability to the belief that our proposed prior is the best representative model of the situation. We can then feed real-world data into a machine-learning algorithm that tests the model (the prior) and proposes a “better fit” model based on the data we have at hand. The “better fit” model becomes the prior for the next step. As new clinical information becomes available, it can also be accounted for; in our fictional example, let’s suppose that the novel KRAS inhibitor in question showed benefit to lung cancer patients with a G12D mutation. The process is thus carried out iteratively, and has the potential to generate novel, testable insights, such as the following fictional example: “KRAS G12D mutations are associated with low TMB status in some third-line platinum-resistant high-grade serous ovarian cancer patient tumors, and our historic-data-based model predicts these patients as having a higher probability of experiencing greater than one year of progression-free survival from the novel KRAS inhibitor in question as compared to patients with any other KRAS mutations or TMB status.”
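For readers who think in code, the iterative step in this fictional example can be sketched roughly as follows. This is a deliberately simplified illustration, not the method Cancer Commons or any particular tool uses: it only updates the belief in one relationship of the graph with Bayes' rule, whereas a real system would also learn the graph structure. The patient outcomes, the two benefit rates (0.6 and 0.3), and all names such as `edge_belief` and `update` are invented for the sketch.

```python
from math import prod

# Factors named in the fictional example; each gets one edge into the outcome.
FACTORS = [
    "cancer stage", "grade", "histological subtype",
    "number of prior treatments", "type of prior treatments",
    "KRAS mutation type", "TMB status",
]
OUTCOME = "benefit from trial"

# Prior: every relationship (sub-hypothesis) starts at a 50/50 belief.
edge_belief = {(factor, OUTCOME): 0.5 for factor in FACTORS}


def update(prior, outcomes, p_if_true, p_if_false):
    """Apply Bayes' rule to one sub-hypothesis given yes/no benefit outcomes.

    p_if_true / p_if_false are the assumed benefit rates when the
    relationship is real versus when it is not (both hypothetical).
    """
    def likelihood(p):
        return prod(p if benefited else 1 - p for benefited in outcomes)

    numerator = prior * likelihood(p_if_true)
    return numerator / (numerator + (1 - prior) * likelihood(p_if_false))


# Invented data: benefit (True) or no benefit (False) for patients with a
# G12D mutation, arriving in two batches over time.
batches = [[True, True, False], [True, False, True, True]]

edge = ("KRAS mutation type", OUTCOME)
for batch in batches:
    # The posterior from one batch becomes the prior for the next.
    edge_belief[edge] = update(edge_belief[edge], batch, 0.6, 0.3)

belief = edge_belief[edge]
# Updating batch by batch gives the same answer as using all data at once.
combined = update(0.5, batches[0] + batches[1], 0.6, 0.3)
print(f"Belief that KRAS mutation type influences benefit: {belief:.3f}")
```

The loop is the part that mirrors the text: each pass treats the previous result as the new prior, so the belief can keep shifting as more patient data arrive.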
In order for researchers to use computers to carry out the above sort of modeling, data from structured and unstructured sources must be parsed, tabulated, and processed before it can be used as input. But it is important to remember that the above fictional scenario also describes an analytical process that has been used for centuries by practicing physicians and caregivers.
At Cancer Commons, we are a dedicated team of professionals helping each patient navigate next-step possibilities in real time, while curating our collective learning, so as to help future patients navigate their own experiences with precision oncology.
Dr. Bhawe can be reached at kaumudi.bhawe@cancercommons.org.