AANS Neurosurgeon | Volume 28, Number 2, 2019

Why Bigger May Not Be Better for Neurosurgery Patients

Few drumbeats are more emphatically sounded in today’s climate of academic neurosurgery than the value and importance of big data. Although definitions and techniques vary, the core principles of big data analyses rely on drawing large patient samples from sources that are at least one level of abstraction removed from primary medical records. These sources may include administrative databases, such as insurer records or the National Inpatient Sample (NIS) database, or multicenter registries such as the American College of Surgeons National Surgical Quality Improvement Program (ACS NSQIP) or the Surveillance, Epidemiology, and End Results (SEER) database.

In principle, the purported strengths of big data techniques include the ability to achieve much larger sample sizes than are accessible via single-institution studies while providing a standardized framework for capturing variables and outcomes across multiple centers. As a result, many regard these methods as a cornerstone of the science of practice. While these characteristics of big data approaches have distinct advantages, particularly in the study of rare diseases or complications, they are also compromised by a number of limitations that are rarely publicized.

Big Data Concerns

These concerns are particularly relevant in a niche field such as neurosurgery, characterized by small patient volumes, diverse disease presentations and highly variable local and individual treatment paradigms. Correspondingly, we urge the community of academic neurosurgeons developing, assessing and implementing big data research to remain skeptical, particularly with regard to the reliability and validity of the underlying information.

A nearly universal vulnerability of big data applications to neurosurgical patients is the inconsistency between the clinical questions being asked and the variables used to derive an answer. To their credit, most big data publications in neurosurgery journals are appropriate and interesting in terms of their focus and study design; however, in many instances, the databases utilized were created for purposes other than neurosurgery research, such as actuarial analysis by insurers, Medicare cost analyses or general surgery research. Correspondingly, many neurosurgical outcomes of interest are rarely captured, and even the underlying diagnoses are often recorded without sufficient specificity.

A Poignant Illustration

As an example, a vestibular schwannoma (VS) investigator may use big data to investigate risk factors for postoperative cerebrospinal fluid (CSF) leak, a clinically important outcome that is sufficiently uncommon that a very large sample might reveal risk factors that would not reach significance in a modestly sized single-institution cohort. NSQIP is generally agreed to be the largest and most reliable big data source for clinical questions pertinent to early postoperative outcomes; however, due to the coding structure, this question is effectively impossible to pose in the terms framed by the investigator.

In the NSQIP coding structure, VS are not reliably captured via a single ICD-9/-10 code; rather, the diagnosis code for “benign neoplasm of cranial nerve” must be used, which encompasses a wide range of pathologies with potentially different risk profiles. Attempts to refine the search using CPT procedure codes may partially address the issue, for example by isolating patients with codes suggesting a possible posterior fossa operation. Even then, however, at least 14 different procedure codes include the terminology “posterior fossa” or “infratentorial,” and because any operation may include up to 10 CPT codes, there is no reassurance that an individual with one of those codes in their record underwent an operation predominantly focused on the cerebellopontine angle (CPA).

Complicating matters further, CSF leak is not captured in NSQIP; instead, the investigator would have to identify readmissions or reoperations within 30 days associated with a wound-related ICD-9/-10 code and extrapolate from there, with no reliable mechanism to definitively distinguish wound infections or dehiscences associated with CSF leaks from those that occurred secondary to other, unrelated risk factors.

Even More Challenges Await

Many other similar considerations arise when more complicated questions or difficult-to-diagnose diseases or complications are invoked; for the vast majority of research applications, big data techniques lack the granularity demanded in an age of individualized medicine.

Beyond the limitation that big data resources almost universally lack the robust detail and rigor required of clinical study in neurosurgery, two other core issues have sweeping implications for the utility of big data in our specialty:

  • Data integrity and
  • Distinction between statistical and clinical significance.

Perhaps the most important and concerning problem faced by neurosurgeons assessing the potential for integration of big data findings is the absence of a mechanism for independent validation of the data’s fidelity outside one’s home institution. NSQIP is frequently cited as the most accurate national registry available for big data research, with the ACS citing validity rates of approximately 70 percent, as compared to 60 percent when tested head-to-head against NIS in the same patient sample. These rates are potentially adequate for broad questions (e.g., overall readmission rates in more generalized populations with particular risk factors, or socioeconomic analyses). However, the roughly 30 percent margin of error completely eclipses the differences typically observed between the groups being compared in routine neurosurgical studies, dramatically undermining the interpretation of, and conclusions drawn from, such data sets.

False Conceit

This concern is amplified by another false conceit frequently noted in big data publications: the conflation of statistical and clinical significance. Mathematically, a large sample size is capable of powerfully driving statistical testing. Correspondingly, even if there were no concerns regarding accurate measurement of the diagnoses, risk factors and outcomes, or regarding data fidelity, many big data analyses will observe a statistically significant difference (e.g., p < 0.05) even when the observed difference between groups is very small. This is frequently observed as a trivial increase in a patient characteristic, laboratory result, vital sign or other parameter, where the two mean values would be considered functionally equivalent in clinical practice.
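The phenomenon described above is easy to demonstrate with simulated data. The sketch below (all numbers hypothetical, chosen only for illustration) compares two cohorts of 50,000 patients whose mean systolic blood pressures differ by 0.5 mmHg, a difference no clinician would act on, yet a standard two-sample test declares the difference significant:

```python
import math
import random

random.seed(42)

def simulate(n, mean, sd):
    """Draw n values from a normal distribution (hypothetical cohort)."""
    return [random.gauss(mean, sd) for _ in range(n)]

# Two cohorts whose true mean systolic BP differs by only 0.5 mmHg --
# functionally equivalent at the bedside.
a = simulate(50_000, 120.0, 15.0)
b = simulate(50_000, 120.5, 15.0)

def welch_z(x, y):
    """Two-sample z test (normal approximation; reasonable at this n)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    vx = sum((v - mx) ** 2 for v in x) / (len(x) - 1)
    vy = sum((v - my) ** 2 for v in y) / (len(y) - 1)
    se = math.sqrt(vx / len(x) + vy / len(y))
    z = (mx - my) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p

z, p = welch_z(a, b)
print(f"difference in means: {abs(sum(a)/len(a) - sum(b)/len(b)):.2f} mmHg")
print(f"p-value: {p:.2e}")  # crosses p < 0.05 despite a clinically trivial difference
```

At this sample size the standard error shrinks to roughly 0.1 mmHg, so even a half-point difference yields a z statistic near 5 and a vanishingly small p-value, which is precisely the statistical-versus-clinical gap the passage describes.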

Further, many big data analyses ask a large number of questions during the execution phase of the research, increasing the probability of a significant result being observed due to chance alone. Many other analytic techniques that utilize similarly large sample sizes (e.g., genome-wide association studies) employ a range of methods, such as the Bonferroni correction, to adjust significance thresholds in order to reflect the influence of these statistical considerations. Unfortunately, many big data publications in neurosurgery continue to employ the unadjusted alpha level of 0.05, a decision that dramatically increases the probability of mistaken significance and a practice that should be broadly rejected by journal editors and readers alike.
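The arithmetic behind this concern is simple to state. A minimal sketch (the number of tests is hypothetical) of the familywise error rate at an unadjusted alpha of 0.05, and of how the Bonferroni correction restores it:

```python
# Multiple-comparisons arithmetic: if every null hypothesis is true and
# tests are independent, each test still has a 5% chance of a false positive.
m = 40          # hypothetical number of questions asked of the dataset
alpha = 0.05    # conventional unadjusted significance threshold

# Probability of at least one spurious "significant" result across m tests:
familywise_error = 1 - (1 - alpha) ** m
print(f"chance of >=1 false positive at unadjusted alpha: {familywise_error:.0%}")  # ~87%

# Bonferroni correction: divide alpha by the number of tests performed.
alpha_bonferroni = alpha / m
familywise_bonf = 1 - (1 - alpha_bonferroni) ** m
print(f"adjusted per-test alpha: {alpha_bonferroni:.5f}")
print(f"familywise error after correction: {familywise_bonf:.0%}")  # ~5%
```

With 40 unadjusted tests, a spurious finding is the expected outcome rather than the exception; the correction trades per-test power for control of the overall false-positive rate, which is exactly the trade-off the unadjusted publications decline to make.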

Buyer (Neurosurgeon) Beware

Like so many emergent research techniques, big data analyses have been adopted and extrapolated with extraordinary enthusiasm, largely fueled by the excitement of dedicated neurosurgeons whose laudable goal is to improve care for their patients. Unscrupulous applications to nuanced, granular neurosurgical questions may be more likely to yield unreliable results—particularly when compared to traditional techniques such as the cohort or case-control study, to say nothing of randomized trials. Yet there may be optimal roles for big data in the research armamentarium. Until big data finds its proverbial nail, we encourage investigators and readers alike to remain scrupulous and periodically remind themselves that it is just one tool in the toolkit.
