Introduction

Asymptomatic prostate cancers (PCa) can be managed without definitive local therapy [1, 2]. This is reflected in the National Comprehensive Cancer Network (NCCN) guidelines [3]. However, in randomized trials designed to assess the benefits of prostatectomy and radiotherapy [1, 2, 4], observation was associated with an increased risk of disease progression and metastasis. Furthermore, PCa remains the second most common cause of cancer death in the United States [5]. Therefore, curative therapies remain an important option for patients at greatest risk for death from PCa, and better strategies to risk-stratify PCa are needed.

The majority of prostate cancers can be managed with active surveillance (AS), avoiding overtreatment of low-risk patients, particularly in the era of early detection. Yet, most men in the US continue to pursue active therapy. In a large retrospective study utilizing the US national hospital-based oncology database, Parikh et al. showed that only 14.2% of patients pursued AS in a cohort of 40,839 patients with very low-risk disease (defined as ≤ T1c; GS ≤ 6, PSA < 10; and positive biopsy cores < 33%) [6]. In this study, 85.8% underwent definitive local therapy. The underutilization of AS is due, in part, to concerns about misclassifying disease risk when using standard clinical parameters [7, 8]. This concern is highlighted by the observation that ~36% of low-grade cancers based on biopsy have high-grade disease following prostatectomy [9]. Patel et al. reported that 25% of patients with low volume GS 3 + 4 on biopsy harbor adverse pathology (AP), defined as GS at least 4 + 3, advanced local stage (pT3b or greater) or lymph node involvement (LNI), upon radical prostatectomy (RP) [7].

Molecular biomarkers are projected to play an increasing role in clinical decision-making for newly diagnosed, localized PCa [10,11,12,13]. Qualification studies provide additional validation and expand indications for previously reported biomarkers. Decipher® is a 22-feature RNA biomarker assay that was developed to predict metastasis following prostatectomy [14]. Decipher has been shown to predict metastasis and PCa-specific mortality from biopsy tissue [12, 13]. This study assesses a role for Decipher in predicting AP, which would make patients poor candidates for AS.

Materials & methods

Study Cohort

Patient selection

First, a prospective cohort of 16,806 RP samples (referred to as RP Cohort) was used to illustrate the stratification of Decipher post-RP by pathological features. Prospectively collected patient samples from the Decipher PCa classifier test (GenomeDx Biosciences Laboratory, San Diego, CA) in the GRID™ were de-identified and aggregated for analysis. The GRID™ collects genomic expression data when the commercial Decipher test is ordered and provides the aggregated data for research use (NCT02609269). A waiver of informed consent was obtained from Western IRB (protocol #20172337).

Second, we identified 266 NCCN-very-low/low or favorable-intermediate risk PCa patients who underwent diagnostic prostate biopsy between 2000 and 2014 and were treated with RP in six community or academic practices (referred to as biopsy cohort): University of Calgary, Cedars-Sinai, Spectrum Health, Cleveland Clinic, MD Anderson Cancer Center, and Johns Hopkins. Sample size was maximized; no a priori sample size estimation was performed. Patients with complete tumor pathology from biopsy and prostatectomy and Decipher genomic expression profiles generated from diagnostic biopsy specimens were selected for analysis. Low-risk PCa was defined as cT1c or cT2a, Gleason score (GS) ≤ 6, and PSA < 10 ng/ml; favorable-intermediate risk was defined as no greater than predominant GS 3, percent positive biopsy cores < 50%, and either cT2b-cT2c or PSA 10–20 ng/ml. Institutional review boards (IRB) at the participating institutions approved the research protocol.

Specimen collection and processing

Specimen selection and processing were performed by GenomeDx Biosciences Laboratory (San Diego, CA, USA) using their CLIA-certified commercial platform as described previously [12]. RNA was extracted from the needle biopsy core with the highest GS and percentage of tumor involvement. Priority was given to the cancer nodule with the highest GS. Following microarray quality control using the Affymetrix Power Tools packages [15], probeset summarization and normalization were performed using the single channel array normalization (SCAN) algorithm [16]. Microarray data were deposited in the NCBI GEO Microarray repository (GSE119616).

Decipher, a 22-marker prognostic gene-expression score, was determined from the Decipher PCa classifier assay (GenomeDx Biosciences Laboratory, San Diego, CA, USA) as previously described [12, 14, 17]. Cancer of the Prostate Risk Assessment (CAPRA) scores were calculated as previously described [12, 18]. Risk-group categorizations of Decipher and CAPRA were based on prior publications [12, 18, 19].

Statistical analysis

Box plots of Decipher scores and p-values from Wilcoxon’s test were used in both cohorts to demonstrate the association of Decipher with pathology features. The remaining analyses were based on the retrospective biopsy cohort. Descriptive statistics were reported as medians and ranges for continuous variables and as frequencies and proportions for categorical variables.

The primary objective of the study was to evaluate Decipher as an independent predictor of AP (defined as pT3b or greater, and/or primary Gleason pattern 4 or greater, and/or LNI) and to explore Decipher as a tool to identify a subgroup of patients likely to be free from AP. Univariable (UVA) and multivariable (MVA) logistic regression models were fitted with CAPRA and Decipher to evaluate Decipher as a prognostic indicator of AP compared with CAPRA. Firth’s penalization method was performed to account for the small number of events [20]. Odds ratio (OR), the corresponding 95% confidence interval and p-value were used for performance assessment. An area under the curve (AUC) [21] and bootstrapped 95% confidence interval were calculated through 1000 resamplings for each model. The incremental benefit of adding Decipher to a model with CAPRA was quantified by the difference in the AUCs. Optimism adjustment was performed on all MVAs to avoid overfitting in the models [22].
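The AUC and its bootstrapped confidence interval can be illustrated with a minimal sketch. It assumes a Mann-Whitney estimate of the AUC and a simple percentile bootstrap; the function names, the random seed, and the synthetic scores are hypothetical and are not the study data or the authors' R code. Firth penalization and optimism adjustment are not reproduced here.

```python
import random


def auc(scores, labels):
    """Mann-Whitney estimate of the AUC: the probability that a patient
    with the event scores higher than one without; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both outcome classes
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[round((alpha / 2) * n_resamples)]
    upper = stats[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic illustration only (NOT study data): 234 patients without AP
# and 32 with AP, with scores drawn from two assumed normal distributions.
rng = random.Random(1)
labels = [0] * 234 + [1] * 32
scores = ([rng.gauss(0.27, 0.15) for _ in range(234)]
          + [rng.gauss(0.34, 0.15) for _ in range(32)])

point = auc(scores, labels)
low, high = bootstrap_auc_ci(scores, labels)
print(f"AUC {point:.2f} (95% CI {low:.2f} to {high:.2f})")
```

The incremental benefit of a second predictor would then be the difference between the AUCs of the two fitted models evaluated in the same way.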

Generalized linear mixed models were used, treating institutions as random intercepts [23], to account for a potential confounding institutional effect. The primary analysis was repeated with time from biopsy to RP as a covariate to adjust for biological progression over time. Decipher was compared to individual clinical risk factors and NCCN using univariable and multivariable logistic regression models and AUCs.

Test characteristics of Decipher were summarized using various cutoffs to dichotomize Decipher into a binary variable. Agreement metrics, specifically the negative predictive value (NPV), were used to investigate discriminatory ability for every Decipher score incremented by 0.05, which included the previously established Decipher low-risk cut-point of 0.45 [12, 13] and a lower-risk cut-point of 0.2 identified by Nguyen et al. [13]. All NPVs were adjusted for 10% prevalence of AP [7] and the Wilson method was used to construct the 95% confidence intervals [24]. A logistic regression using the 0.2 cut-point to predict AP was performed as an example to demonstrate the discrimination provided by the genomic classifier.
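A prevalence-adjusted NPV and a Wilson interval can be sketched as follows. The adjustment shown is the standard Bayes formula applied to the rounded sensitivity and specificity reported in the Results; whether this is exactly how the adjustment was implemented is an assumption. The function names are hypothetical, and the counts passed to the Wilson function (84 of 88) are reconstructed from reported percentages for illustration only.

```python
import math


def adjusted_npv(sensitivity, specificity, prevalence):
    """NPV at an assumed prevalence, from sensitivity and specificity."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Rounded sensitivity/specificity from the Results, 10% AP prevalence.
npv_020 = adjusted_npv(0.88, 0.36, 0.10)  # Decipher cut-point 0.2
npv_045 = adjusted_npv(0.28, 0.84, 0.10)  # Decipher cut-point 0.45
print(f"NPV at 0.2: {npv_020:.2f}, NPV at 0.45: {npv_045:.2f}")

# Illustrative counts (assumption): about 88 patients scored <= 0.2,
# of whom about 84 were free of AP, reconstructed from 33.1% and 4.5%.
lo, hi = wilson_interval(84, 88)
print(f"Wilson 95% CI for 84/88: {lo:.3f} to {hi:.3f}")
```

With these inputs the adjusted NPVs round to 0.96 and 0.91, matching the values reported for the 0.2 and 0.45 thresholds.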

All statistical tests were two-sided. P-values less than 0.05 were considered statistically significant. Analyses were performed in R, version 3.3.1 (R Foundation for Statistical Computing, Vienna, Austria).

Results

Association of Decipher score and AP for RP cohort

The distribution of Decipher was assessed from prospectively collected RPs to determine if Decipher is able to discriminate based on stage or grade (Supp. Figure 1). Decipher distributions were different when stratified by stage ( ≤ pT3a vs ≥ pT3b), primary Gleason grade ( ≤ 3 vs ≥ 4), LN status (negative vs positive) or the presence of any of these adverse pathologic features (p-values < 0.001). The median Decipher of patients with AP (n = 9356) and without AP (n = 6694) at RP were 0.62 and 0.45, respectively (p-value < 0.001).

Patient characteristics for biopsy cohort

Decipher scores in the biopsy cohort were examined in 266 men with favorable NCCN risk (64.7% with very low/low-risk disease and 35.3% with favorable-intermediate risk, Table 1). The goal was to determine if biopsy Decipher can predict prostatectomy pathology in men with very low/low and favorable-intermediate PCa who are candidates for AS. The median age was 62 years and the median PSA at diagnosis was 5.4 ng/mL (interquartile range [IQR] 4.16–7.19 ng/mL). The majority of patients (84.6%) were diagnosed with cT1 PCa and 75.6% of patients were in biopsy grade group 1. Of the patients, 186 (69.9%) and 76 (28.6%) were classified as CAPRA low and intermediate risk, respectively. At prostatectomy, 32 (12%) had AP (pT3b/N1 or primary Gleason pattern 4 or higher). The rate of AP was 11% (19/172) and 14% (13/94) for the NCCN-very-low/low and favorable-intermediate patients, respectively. The median time from biopsy to RP was 2.2 months (IQR 1.35–3.63). Twenty-eight (10.5%) patients had grade group 3–5; 27 (10.2%) harbored primary Gleason pattern 4 or higher. Seventy-one (26.7%) were pT3a and 5 (1.9%) were pT3b. Positive LNs were found in three patients (1.1%). Median Decipher in this population was 0.28 (IQR 0.17–0.39) and was significantly higher among men with AP (0.34, IQR 0.25–0.47 vs 0.27, IQR 0.15–0.37; p-value < 0.001, Supp. Figure 2).

Table 1 Clinical characteristics of active surveillance candidates in biopsy cohort

Performance of Decipher and CAPRA for predicting AP in biopsy cohort

Decipher was an independent predictor of AP (Table 2). In UVA, Decipher was a predictor of AP with an OR of 1.32 (95% CI 1.07–1.63, p-value 0.011). In MVA when adjusting for CAPRA, Decipher was an independent predictor with an OR of 1.29 (95% CI 1.03–1.61, p-value 0.025). CAPRA was not a significant predictor of AP in either UVA (p-value 0.109) or MVA (p-value 0.239). When used alone, CAPRA had an AUC of 0.57 (95% CI 0.47–0.68). The MVA model of CAPRA and Decipher had an AUC of 0.65 (95% CI 0.58–0.70) after adjusting for optimism. Adding Decipher improved the AUC by 0.08 (Table 2). Receiver Operating Characteristics (ROC) curves of the models are shown in Supp. Figure 3.

Table 2 Logistic regression analysis for predicting adverse pathology in biopsy cohort

Similar results were found when the performance of Decipher for predicting AP accounted for institution (MVA OR 1.29, 95% CI 1.02–1.64, p-value 0.034, Supp. Table 1) and time from biopsy to RP (MVA OR 1.29, 95% CI 1.03–1.62, p-value 0.027, Supp. Table 2); similar effect sizes indicate the robustness of our models. Moreover, Decipher increased the AUC of NCCN from 0.53 to 0.64 when added to the NCCN model (Supp. Table 3; ROC curves in Supp. Figure 4). Additional UVA and MVA results comparing Decipher with individual clinical risk factors in CAPRA and NCCN can be found in Supp. Table 3; only Decipher was a statistically significant predictor of AP in both UVA and MVA.

Decipher for predicting AP in biopsy cohort

The sensitivities and specificities of various Decipher thresholds were evaluated to predict AP at radical prostatectomy (Table 3). For example, 17.7% of patients had Decipher > 0.45 and 19.1% had AP in this group, compared to 10.5% with AP for patients with Decipher ≤ 0.45. The sensitivity and specificity for predicting AP with this cutoff were 28% and 84%, respectively. At a threshold of 0.2, 66.9% of patients were in the high-risk group for AP, with a sensitivity of 88% and specificity of 36%. For patients with Decipher > 0.2, the AP rate was 15.7%, whereas for patients with a score ≤ 0.2, the AP rate was only 4.5%.

Table 3 Sensitivity and specificity of Decipher risk thresholds for predicting AP in biopsy cohort

When considering AS, the NPV is useful because it indicates the degree of confidence that a specific patient has no AP at RP. Fig. 1 provides a histogram of NPVs as a function of various Decipher thresholds; for thresholds of 0.45 and 0.2, the NPVs were 91% and 96%, respectively. Given the high NPV associated with a threshold of 0.2, a Decipher ≤ 0.2 would provide a strong case for patients considering AS. A Decipher risk group defined by a cut-point of 0.2 showed prognostic potential (Table 4) and was statistically significant in both UVA (p-value 0.006) and MVA (p-value 0.016) for predicting AP. Patients with Decipher greater than 0.2 were more likely to have AP (OR 3.17, 95% CI 1.22–10.26) than patients with Decipher ≤ 0.2.

Fig. 1 Negative predictive value (NPV) for the absence of AP at varying Decipher risk thresholds in biopsy cohort

Table 4 Logistic regression analysis for predicting adverse pathology using Decipher cut-point of 0.2 in biopsy cohort

Discussion

Molecular classification tools are helpful if they assist with patient decision-making by providing information beyond what is available from clinical variables. Although AS is now a well-accepted option for the majority of newly diagnosed low grade prostate cancers, the risk of understaging and undergrading prostate cancer remains a concern. The NCCN guidelines recommend that AS be considered for even favorable-intermediate risk patients. Therefore, we focused on the risk of pathologic findings at prostatectomy that would place a patient in a higher risk group. We used a previously characterized definition for AP that is universally accepted as being inappropriate for AS: pT3b or higher, GS at least 4 + 3 = 7 or LN metastasis [7]. The objective of this study was to determine if Decipher, which is commercially available and developed for predicting clinical metastasis following prostatectomy, can be used to stratify the risk for AP using diagnostic biopsy tissue.

When the Decipher test was applied to prostatectomy tissue, there was a strong association between the genomic score and each of the individual pathologic features defining AP. Therefore, the 22-transcript Decipher test reflects the biology that produces AP. However, the question remains whether Decipher can be applied to the diagnostic biopsy tissue to predict AP. Therefore, we identified a retrospective cohort of men who underwent RP based on historic treatment standards, but who are considered AS candidates by contemporary standards. In this group, Decipher from the prostate biopsy was a significant predictor of AP when used alone, or in MVA with CAPRA, NCCN, or the individual elements of these clinical risk-stratification tools. In contrast, none of the clinical variables or risk-stratification tools were significant predictors of AP, but this is not surprising since our cohort is homogeneous, including only AS candidates at low risk for harboring aggressive PCa. Accordingly, we observed that the overall 12% AP event rate was not significantly different between NCCN very low/low risk (11%) and favorable-intermediate risk (14%) subgroups. We did, however, show that the AUC for Decipher was better than that of both CAPRA and NCCN. For such a homogeneous cohort, an AUC of 0.65 for Decipher can be considered meaningful.

For newly diagnosed patients, a molecular test that can increase confidence in the absence of AP can increase acceptance of AS. To support this binary clinical decision, Decipher can be converted to a binary variable by applying a threshold. Of all patients in our biopsy cohort, 88% were free of AP. A useful molecular test should be able to identify a low-risk group where the likelihood of being free of AP is greater than 88%. In other words, the NPV should be above 88%. Thresholds of 0.45 and 0.2 have been reported and therefore their use avoids the risk of overfitting associated with exploring new cutoffs. A cutoff of ≤ 0.45 had an NPV of 91% and included 82.3% of the cohort. A cutoff of ≤ 0.2 had an NPV of 96% but included only 33.1% of the cohort. A practical way to apply these cutoffs is to strongly recommend AS for scores ≤ 0.2 since the risk of AP is only 5%, and recommend definitive local therapy for scores > 0.45 since the risk of AP is 19%. Therefore, the Decipher test would result in a strong recommendation for or against AS in ~50% of patients (33.1% with scores ≤ 0.2 plus 17.7% with scores > 0.45) considering AS solely based on clinical parameters.

Our study has several important strengths. It qualifies Decipher for prediction of AP from the prostate biopsy specimen. It uses a well-characterized commercial assay readily available for clinical use, and cutoffs for clinical decision-making that have been reported. It is a multi-institutional study that reduces the risk of bias that can result from institution-specific protocols for specimen handling and storage. We take advantage of tissue collected from an era where it was acceptable to recommend prostatectomy to all men with PCa and a cohort of modern AS candidates where both biopsy tissue and prostatectomy pathology are available. In the modern era, a prospective study of this type would not be ethical. Finally, rather than predict any degree of upgrading or upstaging, we predict a very high-risk group where there is universal acceptance that AS is inappropriate. As limitations, this study did not have long-term follow-up to consider survival outcomes, and the sample size and low number of events did not allow Decipher to be assessed in individual NCCN risk groups (e.g., favorable-intermediate only). An ongoing multi-institutional study of favorable-intermediate risk patients aims to address this limitation.

Conclusion

Decipher can be applied to prostate biopsies from NCCN-very-low/low and favorable-intermediate risk patients to predict AP found in prostatectomy pathology that would make a patient an inappropriate candidate for AS. Decipher’s high NPV for identifying patients without AP can provide reassurance for AS.