"Just Another Statistic." Oncologist. 1998; 3(3):III-IV.
On returning from a medical meeting, we learned, sadly, that a patient, "Mr. B.," had passed away. His death was completely unexpected. He had been doing well nine months after a course of intensive radiotherapy for a locally advanced head and neck cancer; in his most recent follow-up notes, he was described as being in "complete remission." Nonetheless, he apparently died peacefully in his sleep from a cardiac arrest one night and was found the next day by a concerned neighbor. In our absence, after Mr. B. expired, his death certificate was filled out by a physician who didn't know him in detail, but did know why he had recently been treated in our department. The cause of death was listed as head and neck cancer. It wasn't long after his death that we began to receive those notorious "requests for additional information" letters from the statistical office of a well-known cooperative group. Mr. B., as it turns out, was on a clinical trial, and it was "vital" to know further details of the circumstances of his passing. Perhaps this very large cancer had been controlled and Mr. B. succumbed to old age (helped along by the tobacco industry). On the other hand, maybe the residual "fibrosis" in his neck was actually packed with active tumor and his left carotid artery was finally 100% pinched off, or maybe he suffered a massive pulmonary embolism from cancer-related hypercoagulability. The forms and requests were completed with a succinct "cause of death uncertain," adding, "please have the Study Chairs call to discuss this difficult case." Clinical reports of outcomes often emphasize the endpoint "disease-specific survival" (DSS). Like overall survival (OS), the DSS can be calculated by actuarial methods, with patients who have incomplete follow-up "censored" at the time of last follow-up pending further information.
In the DSS, however, deaths unrelated to the index cancer of interest are censored at the time of death; thus, a death from intercurrent disease is considered a "success" (to the investigator, that is; obviously, not to the patient and his or her family). The DSS rate will therefore always be at least as high as the OS rate. For any OS curve, if one waits long enough, it will ultimately come to zero. There is thus a very logical rationale for reporting the DSS separately, particularly in diseases where death from intercurrent disease is expected to be common. Analyzing the DSS allows researchers to better compare the biologic efficacy of two or more cancer treatments, since it does not necessarily come to zero. Unlike some other endpoints, including local-regional control or freedom from progression, it takes into account the possibility of salvage therapy. DSS also focuses on an endpoint of interest to the public: death from cancer. In a recent popular media survey in which people were asked how they would choose to die if they could, 0% selected cancer. However, there are two serious potential problems with heavy dependence on the DSS. First, since patients who die from intercurrent disease are considered "cured," the DSS seriously inflates the apparent effectiveness of a cancer treatment. Given the same biologic disease and the same treatment, the DSS as calculated in an old, sick population at high risk of intercurrent death will be better than the DSS in a younger, healthier population whose major risk is from their cancer. This problem has been discussed with respect to early-stage prostate cancer, in which the conservative approach of observation has been criticized. The studies at issue rely heavily on the DSS, suggesting that the DSS with "watchful waiting" (90% at 10 years) is comparable to other researchers' results with aggressive therapy.
The problem is that these series of conservative management focus on a patient population (as opposed to individuals) with a high risk of competing causes of mortality, which is very different from the population of patients generally treated with aggressive therapy (in which some have shown overall survival superior to that of age-matched controls). It is fallacious to compare nonrandomized series of observation with those of aggressive therapy. In addition to the above problem, the use of DSS introduces another potential issue, which we will call the bias of cause-of-death-interpretation. All statistical endpoints (e.g., response rates, local-regional control, freedom from brain metastases), except OS, are known to depend heavily on the methods used to define the endpoint and are often subject to significant interobserver variability. There is no reason to believe that this problem does not occasionally occur with respect to defining a death as due to the index cancer or to intercurrent disease, even though this issue has been poorly studied. In many oncologic situations (for example, metastatic lung cancer), this form of bias does not exist. In some situations, such as head and neck cancer, this could be an intermediate problem (Was that lethal chest tumor a second primary or a metastasis? Would the fatal aspiration pneumonia have occurred if he still had a tongue? And what about Mr. B. described above?). In some situations, particularly relatively "good prognosis" neoplasms, this could be a substantial problem, particularly if the adjudication of whether or not a death is cancer-related is performed solely by researchers who have an "interest" in demonstrating a good DSS. Our greatest concern about this form of bias relates to recent series on observation, such as in early prostate cancer.
It is interesting to note that although only 10% of the "observed" patients die from prostate cancer, many develop distant metastases by 10 years (approximately 40% among patients with intermediate grade tumors). The implication is that prostate cancer metastases are often not in themselves lethal, which anyone experienced in taking care of prostate cancer patients will recognize as a misconception. It is also inconsistent with U.S. studies of metastatic prostate cancer in which the median survival is two to three years. It is possible that many deaths attributed to intercurrent disease in "watchful waiting" series were in fact prostate cancer-related, perhaps related to failure to thrive, urosepsis, or pulmonary emboli. We will not know without an independent review of the medical records of individual patients; in some cases, even the most detailed review, sometimes even an autopsy, will not be conclusive. Few data are available describing the problems created by cause-of-death-interpretation bias. One small study, presented only in abstract form, assessed the cause of death in 50 randomly selected prostate cancer patients who died. Five experts in prostate cancer were asked to assign the cause of death as due to or not due to prostate cancer. The DSS varied from 21% to 35% among the five reviewers, a relative difference of 66%. Studies of autopsies, which are now rarely done in the U.S., have shown that fatal malignant tumors were occasionally missed by clinicians and, even more sobering, that an occasional patient thought to have died from metastatic cancer is found to have no tumor but to have died from a "benign" cause such as TB. One study suggested an error rate of approximately 8%. Clearly, the use of DSS is here to stay and is a useful adjunct to OS in analyzing randomized trials. There needs to be more research on the validity and interobserver reproducibility of the DSS.
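To make the mechanics of this bias concrete, the following is a minimal Python sketch of the actuarial (Kaplan-Meier product-limit) calculation of OS and DSS. It is an illustration under stated assumptions, not a reanalysis of any study: the ten-patient cohort, the status labels, and the function names are all hypothetical and invented for this example. It shows how the same set of deaths yields one OS but two different DSS values, depending only on how deaths of "uncertain" cause are adjudicated.

```python
from typing import List, Tuple

# HYPOTHETICAL cohort, invented for illustration only (not study data).
# Each entry: (months of follow-up, status), where status is one of
# "alive", "cancer", "other" (intercurrent death), or "uncertain".
COHORT: List[Tuple[float, str]] = [
    (6, "cancer"), (9, "uncertain"), (12, "other"), (15, "cancer"),
    (18, "uncertain"), (24, "other"), (30, "alive"), (36, "alive"),
    (48, "alive"), (60, "alive"),
]


def kaplan_meier(records: List[Tuple[float, bool]], horizon: float) -> float:
    """Product-limit survival estimate at `horizon`.

    Each record is (time, event); event=True is a counted death,
    event=False means the patient is censored at that time.
    """
    survival = 1.0
    event_times = sorted({t for t, event in records if event and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u, _ in records if u >= t)
        deaths = sum(1 for u, event in records if event and u == t)
        survival *= 1.0 - deaths / at_risk
    return survival


def overall_survival(cohort: List[Tuple[float, str]], horizon: float) -> float:
    # OS: every death counts as an event, whatever its cause.
    return kaplan_meier([(t, s != "alive") for t, s in cohort], horizon)


def disease_specific_survival(
    cohort: List[Tuple[float, str]], horizon: float, uncertain_is_cancer: bool
) -> float:
    # DSS: only deaths attributed to the index cancer count as events;
    # all other deaths are censored at the time of death.
    cancer_labels = {"cancer", "uncertain"} if uncertain_is_cancer else {"cancer"}
    return kaplan_meier([(t, s in cancer_labels) for t, s in cohort], horizon)


os_rate = overall_survival(COHORT, 60)
dss_favorable = disease_specific_survival(COHORT, 60, uncertain_is_cancer=False)
dss_strict = disease_specific_survival(COHORT, 60, uncertain_is_cancer=True)

print(f"OS at 60 months:                      {os_rate:.2f}")        # 0.40
print(f"DSS, uncertain deaths censored:       {dss_favorable:.2f}")  # 0.77
print(f"DSS, uncertain deaths count as cancer: {dss_strict:.2f}")    # 0.57
```

In this invented cohort, OS is fixed at 40%, while DSS moves between roughly 57% and 77% solely because two deaths are classified differently. The sketch assumes that censoring is unrelated to the risk of cancer death, which is exactly the assumption that is questionable when the cause of death is uncertain.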
In the meantime, researchers should not report DSS without also reporting OS, and the reasons for intercurrent deaths should be described; peer reviewers should enforce this. As with so many other problems with statistics in the medical literature, it is the job of the reader to remain skeptical. The rate of intercurrent deaths in a study should reflect the age and demographics of the study population. If the DSS is far superior to the OS, the population being studied may be unusually sick (and thus unrealistic), or there may be a bias in classifying the causes of death. Similarly, if the DSS and OS are identical (unless a highly virulent malignancy is being studied), it may suggest the researchers have only included an unusually healthy (and thus unrealistic) patient population. Finally, we would also be a bit suspicious of a sizeable series that did not have any deaths that were considered of "uncertain" cause, unless the researchers specifically included them as being due to the cancer. We honestly think that everybody has a few patients like Mr. B.