Increasing rates of VCF are shown with each 1-point increase in score, ranging from 6% with a score of 0 to 52% with a score of 4.
Low-risk cases (score, 0-1) had a 2-year freedom from VCF of 92% compared with 80% for intermediate-risk (score, 2; 99 [21.3%]) and 56% for high-risk cases (score, 3-4; 92 [19.8%]) (P < .001).
Stratification by individual score (A) and risk group (B) demonstrated statistical significance using the Gray test (P = .001).
eTable 1. Hazard Ratios and P values From the Final Cox Hazards
eTable 2. Rates of Vertebral Compression Fracture (VCF) With Scores and Risk Groups
eTable 3. Comparisons Between the Three Risk Groups
eTable 4. Rates of Vertebral Compression Fracture (VCF) With Scores and Risk Groups, Excluding Multilevel Treatments
eTable 5. Cumulative Incidence of Vertebral Compression Fracture (VCF) at 2 Years for Risk Cohorts and 9 Individual Scores
eFigure. Cumulative Incidence Curves for Local Failure and Vertebral Compression Fracture (VCF), With Death as a Competing Risk
Kowalchuk RO, Johnson-Tesch BA, Marion JT, et al. Development and Assessment of a Predictive Score for Vertebral Compression Fracture After Stereotactic Body Radiation Therapy for Spinal Metastases. JAMA Oncol. 2022;8(3):412–419. doi:10.1001/jamaoncol.2021.7008
Can we develop and assess a risk stratification model for vertebral compression fracture (VCF) after stereotactic body radiation therapy to identify patients at highest risk of VCF?
In this cohort study of 331 patients who received treatment with stereotactic body radiation therapy, 4 key predictors of VCF were identified, with 1 point assigned for each: epidural tumor extension, lumbar location, a gross tumor volume of more than 10 cc, and a spinal instability neoplastic score of more than 6. Each 1-point increase in score was associated with increasing rates of VCF, and a high-risk group was found with scores of 3 to 4.
This study identified a group of patients at high risk for VCF who may potentially benefit from undergoing prophylactic spinal stabilization or vertebroplasty.
Vertebral compression fracture (VCF) is a potential adverse effect following treatment with stereotactic body radiation therapy (SBRT) for spinal metastases.
To develop and assess a risk stratification model for VCF after SBRT.
Design, Setting, and Participants
This retrospective cohort study conducted at a high-volume referral center included 331 patients who had undergone 464 spine SBRT treatments from December 2007 through October 2019. Data analysis was conducted from November 1, 2020, to August 17, 2021. Exclusions included proton therapy, prior surgical intervention, vertebroplasty, or missing data.
One- and 3-fraction spine SBRT treatments were most commonly delivered. Single-fraction treatments generally involved prescribed doses of 16 to 24 Gy (median, 20 Gy; range, 16-30 Gy) to gross disease, whereas multifraction treatments delivered a median of 30 Gy (range, 21-50 Gy).
Main Outcomes and Measures
The VCF and radiography components of the spinal instability neoplastic score were determined by a radiologist. Recursive partitioning analysis was conducted using separate training (70%), internal validation (15%), and test (15%) sets. The log-rank test was the criterion for node splitting.
Of the 331 participants, 88 were women (27%), and the mean (IQR) age was 63 (59-72) years. With a median follow-up of 21 months (IQR, 11-39 months), we identified 84 VCFs (18%), including 65 (77%) de novo and 19 (23%) progressive fractures. There was a median of 9 months (IQR, 3-21 months) to developing a VCF. From 15 candidate variables, 6 were identified using the backward selection method, feature importance testing, and a correlation heatmap. Four were selected via recursive partitioning analysis: epidural tumor extension, lumbar location, gross tumor volume of more than 10 cc, and a spinal instability neoplastic score of more than 6. One point was assigned to each variable, and the resulting multivariable Cox model had a concordance of 0.760. The hazard ratio per 1-point increase for VCF was 1.93 (95% CI, 1.62-2.30; P < .001). The cumulative incidence of VCF at 2 years (with death as a competing risk) was 6.7% (95% CI, 4.2%-10.7%) for low-risk (score, 0-1; 273 [58.8%]), 17.0% (95% CI, 10.8%-26.7%) for intermediate-risk (score, 2; 99 [21.3%]), and 35.4% (95% CI, 26.7%-46.9%) for high-risk cases (score, 3-4; 92 [19.8%]) (P < .001). Similar results were observed for freedom from VCF using stratification.
Conclusions and Relevance
The results of this cohort study identify a subgroup of patients with high risk for VCF following treatment with SBRT who may potentially benefit from undergoing prophylactic spinal stabilization or vertebroplasty.
Stereotactic body radiation therapy (SBRT), also known as stereotactic ablative radiotherapy, is an advanced radiotherapy technique that enables the delivery of highly conformal ablative doses of therapeutic radiation. Stereotactic body radiation therapy is commonly used to manage spinal metastasis, with high rates of tumor control (80%-90%) and low rates of substantial toxic effects.1-5 With the substantial morbidity and mortality of progressive or uncontrolled spinal metastasis, spine SBRT is an effective means for palliation and supporting a patient’s quality of life.
Potential late adverse effects of therapy include radiation myelopathy and vertebral compression fracture (VCF). Radiation myelopathy is an extremely rare but potentially serious adverse effect, and minimizing the spinal cord maximum dose significantly decreases its risk.6 However, VCF is much more common, with reported rates of approximately 15% to 40%.7,8 Many potential predictive factors have been identified, including radiographic tumor features (eg, lytic disease or baseline VCF), patient-specific characteristics (eg, sex and age), and radiotherapy dose and fractionation (eg, dose per fraction ≥20 Gy).7,9-19 A thorough, multiinstitutional study also analyzed the Spinal Instability Neoplastic Scoring (SINS) system and its individual components, finding 3 of the 6 criteria to be important predictors of VCF.7
We aim to build on these prior studies by developing and assessing a risk stratification model for VCF to synthesize initial assessments of predictive factors. This score may serve as a predictive tool to select patients to receive treatment with prophylactic or early spinal stabilization.
A large, single-institutional database of 680 spine SBRT treatments was collected that comprised all spine SBRT treatments administered from December 2007 through October 2019. After obtaining institutional review board approval and a waiver of informed consent from Mayo Clinic Rochester, this information was obtained via retrospective health record review in accordance with best ethical research practices. Exclusion criteria from this initial data set included treatments with proton beam therapy, benign disease, prior spine surgery or vertebroplasty, lack of imaging follow-up, and missing data points. Few instances of missing data points were present, with the most common involving 9 instances of missing the time from primary diagnosis to SBRT. Prior surgical intervention included laminectomy, corpectomy, fusion, and decompression. Biopsy alone was insufficient for exclusion. Patients who received treatment with multiple courses of SBRT and had direct overlap with prior courses of radiation therapy were included in the data set.
Gross tumor volume (GTV), clinical target volume, planning treatment volume (PTV), and organs at risk delineation and prescription doses were determined by the treating physician. Our institutional SBRT practice uses 2 clinical target volumes, with a low-risk PTV (PTV_low) and simultaneous integrated boost to GTV plus or minus a 1-mm to 2-mm margin (PTV_high).20 These target structures closely resemble the Radiation Therapy Oncology Group guidelines.21 One- and 3-fraction treatments were most commonly administered. Single-fraction treatments generally involved prescribed doses of 16 to 24 Gy (median, 20 Gy; range, 16-30 Gy) for gross disease, whereas multifraction treatments administered a median of 30 Gy (range, 21-50 Gy) for gross disease.
The primary end point was VCF. All available computed tomography, magnetic resonance, and positron emission tomography images of the treated spine level were reviewed by 1 of 2 radiologists on a picture archiving and communication system workstation. Timing compared with SBRT and degree of VCF were documented based on imaging review. De novo fractures were defined as VCFs that occurred after treatment with SBRT, and VCF progression was defined as fractures that were present before treatment with SBRT but progressed afterwards. The de novo and progressive VCFs were considered treatment-related fractures in the statistical analysis, which is consistent with prior literature.7
The individual radiographic SINS score components were recorded by a radiologist. Other health record review was undertaken by a radiation oncologist, including the pain component of the SINS score. The SBRT treatments that involved contiguous multilevel vertebral bodies were considered single treatments. Disease characteristics were recorded as describing the entirety of the disease, as opposed to attempting to distinguish these components between the multiple involved vertebral segments. A planned subset analysis intended to consider the risk stratification model after excluding these multilevel treatments from the data set.
Statistical methods included Cox proportional hazards regression analysis and the Kaplan-Meier method. Multivariable Cox models were generated to report hazard ratios, along with 95% CIs. Comparative testing using the Kaplan-Meier method was undertaken using the log-rank test. In all cases, P < .05 was used as the threshold for statistical significance. Two-sided statistical testing was used throughout the analysis, and analyses were conducted using Python, version 3.8.0, and SAS, version 9.4 (SAS Institute).
Fifteen initial variables were selected for consideration for multivariable modeling: age, sex, time from primary diagnosis, cervical location, thoracic location, lumbar location, prostate primary disease, kidney primary disease, dose per fraction, vertebral body involvement, soft tissue involvement (paraspinal involvement), Bilsky score of greater than 0, existing VCF, GTV (≤10 vs >10 cc), and SINS score (≤6 vs >6). These were selected per prior reported predictive factors of VCF.7,9-19 The backward selection method was used, with an initial threshold of P < .50 and final threshold of P < .20. Feature importance testing was also performed to reduce the initial set of 15 candidate variables. Next, a correlation heatmap was generated to exclude variables with correlation coefficients of 0.4 or greater. Afterwards, the backward selection method was used again. Overall concordance of the Cox model was recorded as a relative measure of the loss of fidelity associated with variable selection.
Recursive partitioning analysis (RPA) was performed using a decision-tree analysis with open-source packages in Python (version 3.8.0) according to prior efforts in this respect.22-26 Additional analysis and figure production were conducted using SAS, version 9.4 (SAS Institute). Two separate end points were analyzed: VCF as a binary end point and freedom from VCF as a time-dependent variable. Cutoff values for continuous variables identified by RPA were rounded to increase clinical utility. Data were first randomly split into training (70%), validation (15%), and test sets (15%) without further stratification by patient characteristics. Model development was undertaken using the training and validation sets, and assessment was performed using the independent test set. The highest-fidelity models for the validation and test sets were selected. Assessment was performed for each potential RPA model by reporting the accuracy of the model’s predictive power for VCF in the independent test set. Further details concerning variable selection and RPA are discussed in the eMethods in the Supplement.
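The 70%/15%/15% random split described above can be sketched as follows. This is an illustration only, not the study code: the function name `split_indices`, the fixed seed, and the rounding of set sizes are assumptions, and the 464 treatments are represented here simply by their row indices.

```python
import random


def split_indices(n, train=0.70, val=0.15, seed=0):
    """Randomly split n row indices into training, validation, and test sets.

    No stratification is applied, mirroring the description in the text.
    The rounding rule and seed are assumptions made for this sketch.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # reproducible shuffle
    n_train = round(n * train)
    n_val = round(n * val)
    train_idx = idx[:n_train]
    val_idx = idx[n_train:n_train + n_val]
    test_idx = idx[n_train + n_val:]  # remainder goes to the test set
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(464)
print(len(train_idx), len(val_idx), len(test_idx))
```

Keeping the test indices untouched until the candidate models are fixed is what makes the reported test-set accuracy an independent assessment.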
Of the 680 candidate treatments, 92 (13.5%) were excluded for prior surgical intervention, 54 (7.9%) for lack of imaging for the assessment of VCF, 51 (7.5%) for the use of proton therapy, 16 (2.4%) for missing data, and 3 (0.4%) for treatment of benign tumors. These exclusions resulted in a final data set of 464 treatments in 331 unique patients.
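As a quick consistency check, the exclusion counts reported above can be tallied against the 680 candidate treatments. The dictionary keys are shorthand labels chosen for this sketch; the counts are those stated in the text.

```python
candidates = 680

# Exclusion counts as reported in the text (labels are shorthand).
exclusions = {
    "prior surgical intervention": 92,
    "no imaging follow-up": 54,
    "proton therapy": 51,
    "missing data": 16,
    "benign tumor": 3,
}

final_treatments = candidates - sum(exclusions.values())

# Percentage of the 680 candidate treatments, rounded to 1 decimal place.
percent_excluded = {
    reason: round(100 * count / candidates, 1)
    for reason, count in exclusions.items()
}

print(final_treatments)
print(percent_excluded)
```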
At a median of 21 months of follow-up (IQR, 11.4-39.3 months), VCF was identified after 84 treatments (18%). Of these, 64 (76%) were de novo VCF, whereas 19 (23%) were progression of existing fractures. Fractures were categorized as severe (>40% vertebral body height loss), moderate (25%-40%), or mild (20%-25%). Severe fractures were the most common (41 of 83 [49%]) compared with moderate (18 of 83 [21%]) or mild fractures (24 of 83 [29%]). The VCFs occurred at a median of 9 months (IQR, 3-21 months) after treatment with SBRT.
Patient characteristics were generally similar between treatments for those with VCF and those without, including age, body mass index (calculated as weight in kilograms divided by height in meters squared), and smoking history. However, women demonstrated a higher rate of VCF (25% vs 16%; P = .03), and treatments of asymptomatic lesions demonstrated a decreased rate of VCF (16% vs 30%; P = .03) (Table 1). Lesion-specific factors included spinal location, multilevel disease, and primary histology. Notably, primary prostate disease had lower rates of VCF compared with other histologies (11% vs 24%; P < .001), whereas kidney primary disease had increased rates of VCF (30% vs 16%; P = .04). Cervical location was associated with a reduced rate of VCF (6% vs 20%; P < .001), whereas treatment for lumbar disease showed higher rates of VCF (30% vs 13%; P < .001).
Significant radiographic predictors of increased rates of VCF included vertebral body involvement (21% vs 3%; P < .001), soft tissue involvement (34% vs 15%; P = .002), epidural extension (34% vs 15%; P < .001), existing VCF (33% vs 16%; P = .01), and a SINS score of more than 6 (vs ≤6) (27% vs 11%; P < .001). The individual components of the SINS score were also considered separately (Table 2). Five components (location, pain, bone lesion, spinal alignment, and vertebral body collapse) demonstrated increasing rates of VCF as points in the SINS score were awarded. Of these, spinal alignment and pain appeared to be associated with an increased VCF risk with each added point of the SINS score; alternatively, location, bone lesion, and vertebral body collapse were not. The sixth SINS component, posterior spinal element involvement, was not associated with an increased rate of VCF.
While prescribed RT dose was comparable between treatments resulting in VCF and those without VCF, the biologically effective dose (BED) was higher in treatments with VCF compared with those without (mean [IQR], 72 [60-82] Gy vs 67 [60-82] Gy; P = .01), using an α/β ratio of 10 Gy. The mean BED (using α/β of 3) was 179 Gy (IQR, 130-216 Gy) vs 166 Gy (IQR, 130-216 Gy) for treatments with vs without fracture, respectively (P = .01). Treatments with a BED of more than 60 Gy (with an α/β of 10) were associated with an increased rate of VCF (25% vs 13%; P < .001). The GTVs were also a significant predictor of VCF. Treatments with a GTV of more than 10 cc had a 29% rate of VCF compared with 9% for tumors with a smaller GTV (P < .001).
Fifteen initial variables were selected for consideration for multivariable modeling: age, sex, time from primary diagnosis, cervical location, thoracic location, lumbar location, prostate primary disease, kidney primary disease, dose per fraction, vertebral body involvement, soft tissue involvement, a Bilsky score of more than 0, existing VCF, GTV, and SINS score. Adjusting for all 15 variables produced an initial concordance value of 0.798. A SINS score of more than 6 (vs ≤6) was selected as a threshold, per prior analyses.7 A GTV greater than 10 cc (vs ≤10 cc) was identified by RPA as a relevant threshold and verified for statistical significance on univariable analysis as well. Afterwards, these thresholds were applied to the Cox model.
The backward selection method was used. This resulted in the elimination of time from primary diagnosis, kidney primary disease, age, and thoracic location. Feature importance testing was implemented, and a correlation heatmap was generated. Sex and prostate primary were correlated, so prostate primary was eliminated because sex had the higher feature importance value. The backward selection method was again used, with a tighter threshold of P < .20.
Six remaining variables were selected: vertebral body involvement, GTV greater than 10 cc, lumbar location, soft tissue involvement, epidural extension, and a SINS score greater than 6. A final Cox model was developed, with each variable meeting statistical significance (eTable 1 in the Supplement). The final concordance value was 0.78.
These 6 variables were then assessed using RPA. Five high-fidelity models were generated, demonstrating 73% to 75% accuracy for the validation set and 73% to 81% accuracy for the test set. Four variables were selected by these models: GTV greater than 10 cc, lumbar location, epidural extension, and SINS score greater than 6. A multivariable Cox proportional hazards model using only these 4 variables demonstrated a concordance of 0.762. Because these variables demonstrated comparable hazard ratios on the final Cox hazards model, each was assigned 1 point in the resulting score. A Cox model was developed using only this resulting score, which showed a concordance of 0.760. This simpler and comparably predictive model was used for further study. Treatments demonstrated an association of increased rates of VCF with each 1-point increase in score, ranging from 6% with a score of 0 to 52% with a score of 4 (Figure 1; eTable 2 in the Supplement). The hazard ratio per 1-point increase for VCF was 1.93 (95% CI, 1.62-2.30; P < .001).
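The scoring rule described above (1 point for each of the 4 predictors, with risk groups of 0 to 1, 2, and 3 to 4 points) can be written as a short sketch. The function and argument names are hypothetical and chosen for illustration; the thresholds are those reported in the text.

```python
def vcf_risk_score(epidural_extension, lumbar_location, gtv_cc, sins_score):
    """Return the 0-4 point score: 1 point per predictor present.

    Predictors and thresholds follow the text: epidural extension,
    lumbar location, GTV > 10 cc, and SINS score > 6.
    """
    return (
        int(bool(epidural_extension))
        + int(bool(lumbar_location))
        + int(gtv_cc > 10)
        + int(sins_score > 6)
    )


def vcf_risk_group(score):
    """Map a score to the risk groups used in the text."""
    if score <= 1:
        return "low"
    if score == 2:
        return "intermediate"
    return "high"


# Hypothetical example: lumbar lesion with epidural extension,
# GTV of 14 cc, and a SINS score of 5.
score = vcf_risk_score(True, True, gtv_cc=14.0, sins_score=5)
print(score, vcf_risk_group(score))
```

Note that both thresholds are strict: a GTV of exactly 10 cc or a SINS score of exactly 6 contributes no point.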
The VCFs were also assessed as a time-dependent outcome, and comparable results were demonstrated. The low-risk cohort (risk score, 0-1) had a 2-year freedom from VCF of 92% compared with 80% for the intermediate-risk (score, 2) and 56% (score, 3-4) for the high-risk cohorts, respectively (P < .001) (Figure 2; eTable 3 in the Supplement). When multilevel disease was excluded, the same trend in VCF was demonstrated (eTable 4 in the Supplement).
Cumulative incidence of VCF (with death as a competing risk) was 10.9% (95% CI, 8.4%-14.2%), 14.6% (95% CI, 11.7%-18.3%), and 18.2% (95% CI, 14.9%-22.4%) at 1, 2, and 3 years, respectively (P < .001) (Figure 3). The cumulative incidence of VCF at 2 years was 6.7% (95% CI, 4.2%-10.7%) for low-risk, 17.0% (95% CI, 10.8%-26.7%) for intermediate-risk, and 35.4% (26.7%-46.9%) for high-risk cases (eTable 5 in the Supplement). The cumulative incidence of local failure (with death as a competing risk) was 16.2% (95% CI, 13.2%-20.0%), 21.9% (95% CI, 18.4%-26.1%), and 25.3% (95% CI, 21.5%-29.8%) at 1, 2, and 3 years, respectively (eFigure in the Supplement).
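To illustrate what a cumulative incidence with death as a competing risk measures, the sketch below implements a basic nonparametric (Aalen-Johansen-type) estimator. It is not the authors' SAS analysis: the function name, the event coding (0 = censored, 1 = VCF, 2 = death), and the toy data are assumptions made for this example.

```python
def cumulative_incidence(times, events, cause, horizon):
    """Estimate the cumulative incidence of `cause` by time `horizon`.

    events: 0 = censored, any other integer = an event type.
    At each distinct time, the incidence increases by
    (event-free survival just before t) * (events of `cause` at t) / (at risk).
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    event_free = 1.0  # probability of no event of any type so far
    cif = 0.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        n_here = d_any = d_cause = 0
        while i < len(data) and data[i][0] == t:
            n_here += 1
            if data[i][1] != 0:
                d_any += 1
            if data[i][1] == cause:
                d_cause += 1
            i += 1
        cif += event_free * d_cause / n_at_risk
        event_free *= 1 - d_any / n_at_risk
        n_at_risk -= n_here
    return cif


# Toy data (months): VCF at 1, death at 2, censored at 3, VCF at 4.
toy_times = [1, 2, 3, 4]
toy_events = [1, 2, 0, 1]
print(cumulative_incidence(toy_times, toy_events, cause=1, horizon=4))
```

Unlike 1 minus a Kaplan-Meier estimate that censors deaths, this estimator does not treat patients who died as still able to fracture, which is why it is preferred when death is common.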
This analysis describes the development and assessment of an RPA-based score that was predictive of the incidence of VCF after treatment with SBRT. Four key predictive factors were identified and included in the model: GTV greater than 10 cc, lumbar location, epidural extension, and SINS score greater than 6. Other predictive factors were also considered and tabulated to compare with prior literature. We also analyzed the individual components of the SINS score, demonstrating that 5 of its 6 components were associated with an increased risk of VCF. This exploratory analysis provides further assessment of the association of SINS scores with VCF. The high-risk cohort (score, 3-4) had a high 35% cumulative risk of VCF at 2 years, suggesting that additional intervention may be beneficial for this subgroup of patients. Additionally, low-risk cases (score, 0-1) had a low 7% cumulative risk of VCF at 2 years, suggesting the safety of treatment with spine SBRT in carefully selected patients.
The concordance index of the final Cox model using only the risk score was quite high (0.76) and only minimally decreased from that using all considered variables (0.80). This slight difference suggests that most of the predictive power of all variables is contained within the cumulative risk score. Of the 4 predictors in the score, the SINS score has been the most studied regarding VCF.10,14 Using a multiinstitutional data set, the association between SINS score and VCF was verified, but only 3 of the original SINS criteria were individually predictive: baseline VCF, lytic tumor, and spinal deformity.7 Other analyses have also confirmed the importance of individual components of the SINS score.9,11-13,15,18,19 Overall, while the importance of each individual component appears variable, there is consensus that the SINS score is associated with VCF.
The GTV and epidural extension have not been rigorously analyzed as predictors of VCF. It is intuitive that a larger tumor volume would be associated with increased susceptibility to fracture, as has been shown at other tumor sites.27 The threshold of 10 cc correlates to a large tumor, and it can be considered in the context of the tumor volume estimate as half of the product of the length, width, and height (ie, abc/2).28 We hypothesize that epidural extension also weakens the bone secondary to the breakthrough of disease through the cortex. The resulting alterations in the bone microstructure may be associated with increased susceptibility to VCF, as supported by animal models.29,30
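For a sense of scale, the abc/2 estimate mentioned above can be computed directly. The function name and the example dimensions are hypothetical; with dimensions in centimeters, the result is in cubic centimeters (cc).

```python
def abc_over_2_volume_cc(length_cm, width_cm, height_cm):
    """Approximate volume as half the product of three orthogonal diameters."""
    return length_cm * width_cm * height_cm / 2


# Hypothetical lesion measuring 3.0 x 3.0 x 2.5 cm.
volume = abc_over_2_volume_cc(3.0, 3.0, 2.5)
print(volume, volume > 10)
```

Under this approximation, a lesion of roughly 3 cm in each dimension sits near the 10-cc threshold, whereas a 2 x 2 x 2 cm lesion (4 cc) falls well below it.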
Lumbar location was the final variable included in the model. In contrast, cervical and sacral locations were protective factors for the development of VCF, and there was no difference with thoracic location. This result matches studies concerning vertebral compression fractures from nononcologic etiologies (most commonly involving osteoporosis). In this setting, the midthoracic spine and the thoracolumbar transition zone are the most common locations for fracture, with as many as 60% to 75% of fractures occurring around the thoracolumbar region.31,32 This transition from the more rigid thoracic spine to the more mobile lumbar region increases susceptibility to fracture, and we hypothesize that the same principle applies in the setting of VCF and spine SBRT.
Potential surgical intervention, including vertebroplasty and kyphoplasty, has demonstrated efficacy in treating vertebral compression fractures.33-35 Generally, intervention for painful pathologic fractures is recommended as early as possible.33 For this reason, careful follow-up of high-risk patients may be particularly beneficial. Prophylactic intervention has also been explored, and benefits to prophylaxis may outweigh the risks in high-risk groups. In the present data set, the 92 patients excluded for undergoing prior surgical intervention or vertebroplasty demonstrated a lower rate of VCF (11%); however, further study into prophylactic intervention is needed.36,37
This analysis notably reports a longer median time to VCF compared with prior studies. Sahgal et al7 reported a median time to VCF of 2.5 months (range, 0.03 to 43.01 months) compared with 9.1 months (IQR, 3.2-20.7 months) in the present study, which also had a much longer median follow-up time of 21 months compared with 11.5 months in Sahgal et al.7 This difference likely explains the increased median time to fracture as well as the slightly increased risk of VCF reported in our analysis (18% vs 14%).7 The IQR demonstrates that 25% of VCF were identified more than 20 months after treatment with SBRT. As patient selection for spine SBRT and survival in the setting of oligometastatic disease continue to improve, longer-term follow-up will become increasingly important after treatment with spine SBRT.38,39
Interestingly, cases with a BED of more than 60 Gy (with an α/β of 10) had higher rates of VCF while cases with a BED of 60 Gy or less did not. Treatment courses of 20 Gy (with an α/β of 10) in 1 fraction and 30 Gy in 3 fractions are equivalent to a BED of 60 Gy. Overall, these findings are consistent with the moderate level of evidence, suggesting that VCF risk increases with a dose per fraction of more than 20 Gy.7
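The equivalence stated above follows from the standard linear-quadratic expression BED = n × d × (1 + d / (α/β)), where n is the number of fractions and d is the dose per fraction. A minimal sketch, with a hypothetical function name:

```python
def biologically_effective_dose(total_dose_gy, fractions, alpha_beta_gy=10.0):
    """Linear-quadratic BED: n * d * (1 + d / (alpha/beta))."""
    dose_per_fraction = total_dose_gy / fractions
    return total_dose_gy * (1 + dose_per_fraction / alpha_beta_gy)


# The two regimens named in the text, with alpha/beta = 10 Gy.
print(biologically_effective_dose(20, 1))  # 20 Gy in 1 fraction
print(biologically_effective_dose(30, 3))  # 30 Gy in 3 fractions
```

Both regimens evaluate to 60 Gy, so single-fraction doses above 20 Gy, or 3-fraction courses above 30 Gy, fall in the group with a BED of more than 60 Gy.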
This study is chiefly limited by the retrospective analysis of data from a single institution; however, a strength of the analysis was the large data set available for assessment. Additionally, independent review of the imaging results by a radiologist for VCF and SINS score delineation is a strength of the underlying methods. Further, RPA modeling involved the use of independent validation of all generated models, supporting our statistical approach. Our resulting model also predicts for VCF as a time-dependent and binary end point. We strongly favor further assessment of the model via external or prospective validation; however, in the absence of such data, we would suggest considering the potential clinical use of this model to select for patients at high risk for developing VCF after treatment with spine SBRT.
This study developed and assessed a score for predicting the development of VCF following treatment with spine SBRT. Four variables were included in this model: GTV greater than 10 cc, lumbar location, epidural extension, and a SINS score of more than 6. This score identifies a significant subgroup of patients with high risk for VCF (score, 3-4) who may benefit from undergoing spinal stabilization or vertebroplasty.
Accepted for Publication: October 22, 2021.
Published Online: January 27, 2022. doi:10.1001/jamaoncol.2021.7008
Corresponding Author: Kenneth W. Merrell, MD, Department of Radiation Oncology, Mayo Clinic, 200 First St SW, Rochester, MN 55905 (firstname.lastname@example.org).
Author Contributions: Dr Kowalchuk had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Kowalchuk, Marion, Mullikin, Rose, Morris, Gao, Sio, Trifiletti, Merrell.
Acquisition, analysis, or interpretation of data: Kowalchuk, Johnson-Tesch, Marion, Mullikin, Harmsen, Siontis, Kim, Costello, Morris, Shiraishi, Lucido, Sio, Trifiletti, Olivier, Owen, Stish, Waddle, Laack, Park, Brown, Merrell.
Drafting of the manuscript: Kowalchuk, Johnson-Tesch, Marion, Mullikin, Morris, Sio, Waddle, Brown, Merrell.
Critical revision of the manuscript for important intellectual content: Kowalchuk, Mullikin, Harmsen, Rose, Siontis, Kim, Costello, Morris, Gao, Shiraishi, Lucido, Sio, Trifiletti, Olivier, Owen, Stish, Waddle, Laack, Park, Merrell.
Statistical analysis: Kowalchuk, Harmsen, Morris, Merrell.
Administrative, technical, or material support: Kowalchuk, Johnson-Tesch, Mullikin, Kim, Morris, Shiraishi, Lucido, Laack, Brown, Merrell.
Supervision: Mullikin, Rose, Siontis, Costello, Morris, Sio, Trifiletti, Waddle, Laack, Park, Merrell.
Conflict of Interest Disclosures: Dr Kowalchuk reported that his spouse is a senior technical product manager for GE Healthcare. Dr Trifiletti reported personal fees from Boston Scientific and Springer and grants from Novocure outside the submitted work. Dr Park reported grants from MacroGenic and the National Cancer Institute and honorarium from AstraZeneca outside the submitted work. Dr Brown reported personal fees from UpToDate outside the submitted work. Dr Merrell reported grants from Varian Medical Education, AstraZeneca, Novartis, and Pfizer Medical Education outside the submitted work. No other disclosures were reported.
Meeting Presentation: The abstract for this work was presented at the 26th Annual Meeting of the Society for Neuro-Oncology; Boston, Massachusetts; November 18-21, 2021.