Key Points for Decision Makers

The suitability of the published research and development (R&D) cost estimates used to represent the actual R&D costs of the pharmaceutical industry varies. We found a trade-off between transparency and replicability of the analysis and between specificity and reliability of the source data.

Average costs may obscure important differences, such as by therapeutic area, small/large molecules, orphan/non-orphan, original/licensed-in, and firm size; in particular, estimations suggest higher mean costs for oncological drugs.

Moreover, most R&D cost studies do not address current trends in pharmaceutical R&D (e.g., the complexities of drug discovery and clinical trials), nor do they consider regulatory changes over time for a new medicine’s approval.

1 Introduction

The escalation of research and development (R&D) expenditures, along with a corresponding decline in new molecular entities (NMEs) reaching world markets, has created concerns about the sustainability of the biopharmaceutical industry’s business model [1]. Part of the discussion has been based on the increase in the R&D costs of an NME in the 1990s and early 2000s, triggered mainly by increasing attrition rates and duration of clinical trials [1]. Analysts at that time suggested that the model for developing new medicines was becoming unaffordable [2]. A report prepared for the European Commission in 2000 [3] stressed the crucial role of research productivity in the competitiveness of the pharmaceutical industry.

More recent studies suggest that the decline in productivity has not persisted [4, 5]. Given the dissatisfaction of researchers, patient groups, and policy makers with the pharmaceutical industry’s pricing policies [6,7,8], the focus of the debate has now shifted to the importance of R&D costs for drug pricing. For instance, several researchers and organizations have raised concerns about the substantial increase in cancer drugs’ acquisition costs and its effects on affordability and accessibility [8,9,10], concerns that we share and that have motivated us to conduct this study. Some researchers are skeptical of using R&D costs as a justification for higher cancer drug prices, for instance considering the significant differences in prices among various countries [11, 12]. Even in the case of orphan drugs, researchers have expressed doubt that small patient numbers justify higher prices of NME-based products [13, 14]. From a health economics perspective, and one that endorses fair access to innovative value-added products, we believe that high R&D costs alone should not justify high medicine prices. Indeed, there is a trend towards a ‘value-based pricing’ model [15] where R&D costs alone do not necessarily play a crucial role. However, the inherent risks of pharmaceutical R&D cannot be ignored, and the value-based pricing model must contend with ‘if’ and ‘how’ successful medicines compensate for drug failures.

At the core of the discussion is whether the average R&D cost estimates of bringing an NME to market are valid. For instance, some methods used in these estimates are controversial. Particularly criticized are the studies based on the database created by the Tufts Center for the Study of Drug Development (hereafter termed ‘Tufts data’) [16,17,18]. The controversy stems from the magnitude of the estimates (often used to justify higher drug prices) [19] and the alleged close relationship the Tufts Center has with the pharmaceutical industry [20,21,22,23]. The debates around the Tufts studies, which also apply to other studies, mainly center on four issues. First, the transparency and the breadth of coverage of data used are called into question given the data are shielded from external scrutiny [20, 24,25,26,27,28,29,30]. Second, results might be overestimated [24,25,26,27,28,29,30] because the focus is on self-originated NMEs, which account for a minor proportion of all NMEs approved and tend to cost more [27, 28]. Third, some critics argue that the opportunity costs should not be considered and that the discount rates (cost of capital [COC]) applied were too high [20, 24, 26, 27, 31]. Fourth, the estimates ignored that drug R&D activities receive a considerable amount of public funding [20, 21, 23, 24, 26,27,28,29,30]. The authors from the Tufts group [32,33,34,35,36] have countered these criticisms, defending the representativeness of their samples, the use of opportunity costs, the reasons for focusing on self-originated NMEs, and their exclusion of public funding. However, they have not offered sufficient justification for the inaccessibility of their data. Ultimately, their results cannot be substantiated.

The much lower estimates provided by Prasad and Mailankody [37] have also been criticized but for potential downward biases [38]. First, the authors used a sample of successful drugs from small companies. Second, 90% of the products were orphan medicines, which studies suggest can have 50% lower R&D costs than non-orphan medicines [39,40,41]. Third, 50% of the products were approved after phase II trials, thus excluding the costly phase III trials. Fourth, the COC for start-up technology companies appeared to be underestimated [42].

The estimation of R&D costs is a highly contentious research topic that needs an unbiased examination. The present review provides a systematic compilation and a critical assessment of all published estimates of the (pre-launch) average R&D costs per NME. Our assessment uses criteria addressing three key domains: (1) how the drugs’ success rates and development time used for cost estimation were obtained; (2) if the study considered potential sources contributing to the variation in R&D costs; and (3) what the components of the cost estimation were. Based on these domains, our main objective was to create a framework for assessing the comprehensiveness of cost estimates (e.g., what factors are to be considered and to what extent these factors are incorporated). This will help stakeholders understand what a particular R&D estimation is (and is not) capturing.

2 Methods

2.1 Systematic Literature Review

As a starting point for collecting studies pertaining to the R&D costs of bringing a new medicine to the market (pre-launch), we included all articles from a previous systematic review by Morgan et al. [43] that included literature until 19 January 2010. We then conducted a full systematic literature review with search dates from 1 January 2010 to 5 March 2020. Inclusion criteria as outlined by Morgan et al. included original research providing an estimation of the total R&D costs, studies describing the source of data and research methods in detail, and articles written in English.

Three academic search engines were used: PubMed, Embase (OvidSP) and EconLit (EBSCO). Additionally, we searched Google Scholar to ensure the inclusion of grey literature. The search terms used were the combination of the concepts of “drug research and development” and “costs” or “drug research and development” and “expenditure”. Finally, we conducted a ‘snowballing’ of references (i.e., used the reference lists of selected articles to identify additional relevant articles).

Eleven articles from our search fulfilled all criteria. Combining these articles with the 11 articles previously identified by Morgan et al. [43], the analysis included a total of 22 articles (for additional information regarding the search criteria, selection process and data extraction, including the PRISMA flow diagram, see electronic supplementary information 1).

We extracted the average R&D cash and capitalized values estimated per successful drug from the selected articles. We also collected the total R&D costs and, when available, the R&D costs per phase (i.e., discovery, preclinical, clinical, and submission for market approval). When only R&D costs per phase were reported, we calculated the total R&D costs by summing the R&D costs of all reported phases. We extracted the currency and year for which the authors stated the R&D costs were reported. Based on this information, we adjusted the R&D costs to 2019 prices by using the gross domestic product (GDP) price deflator obtained from the US Bureau of Economic Analysis (BEA) [44]. Additionally, two articles reported results in non-US dollar (US$) currency. For the first article, Chit et al. [45], in which results were shown in Canadian dollars (CAN$) at 2011 prices, we used the 5 July 2011 exchange rate reported by the International Monetary Fund [46] to convert their estimates into US$. For the second article, Årdal et al. [47], in which the results were reported in Euros at 2015 prices, we converted their values by using the 1 July 2015 exchange rate [46]. All R&D costs are shown in 2019 US$ prices. Details on the total and per phase R&D costs, and details on all other extracted variables, can be found in electronic supplementary information 2.
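The price adjustment described above can be illustrated with a minimal sketch. It assumes that a non-US$ value is first converted at the stated exchange rate and then inflated with the US GDP price deflator; the function name `to_2019_usd` and all numeric inputs are hypothetical placeholders, not the BEA or IMF figures used in the review.

```python
def to_2019_usd(amount, deflator_price_year, deflator_2019, fx_to_usd=1.0):
    """Express a reported cost in 2019 US$.

    amount              -- cost in the original currency and price year
    deflator_price_year -- GDP price deflator index for the original price year
    deflator_2019       -- GDP price deflator index for 2019
    fx_to_usd           -- US$ per unit of original currency (1.0 if already US$)
    """
    usd_nominal = amount * fx_to_usd  # currency conversion (assumed to come first)
    return usd_nominal * deflator_2019 / deflator_price_year  # price-level adjustment


# Placeholder inputs only: NOT actual BEA deflators or IMF exchange rates.
example = to_2019_usd(amount=100.0, deflator_price_year=80.0,
                      deflator_2019=100.0, fx_to_usd=1.25)
print(example)  # 156.25
```

With these placeholder inputs, a reported value of 100 becomes 125 after conversion and 156.25 after inflating to the 2019 price level.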

2.2 Suitability Score

The methods used to estimate the (average) cost of drug R&D in the literature are heterogeneous and simply comparing the cost estimates from each study without examining how the estimates are calculated can be misleading. Instead of straight comparisons, an assessment is needed of the methods for calculating costs in order to determine their suitability (i.e., how well the estimates represent actual R&D costs of the drug industry). To this end, we designed a ‘suitability score’ framework to assess how comprehensively a study identifies the appropriate factors required for the estimation of R&D costs and to what extent these factors are incorporated into the analysis.

We identified 16 relevant factors (see electronic supplementary information 3) from the literature [20, 25, 28, 30, 43, 48] to form the framework. For each factor, six categories ranked on a scale from one to six were created. These categories denote the extent to which the studies have considered each of these factors. Total scores range from 16 to 96. Higher scores indicate that studies considered a wider range of factors and addressed them more comprehensively, and thus the final cost estimates may be considered a more suitable estimation of the actual R&D costs in the pharmaceutical industry.
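As a rough illustration of how the stated score range arises, the sketch below assumes that the total suitability score is the unweighted sum of the 16 factor categories, which is consistent with the reported range of 16 to 96. The function name is hypothetical.

```python
N_FACTORS = 16                        # number of factors in the framework
MIN_CATEGORY, MAX_CATEGORY = 1, 6     # each factor is rated from one to six


def total_suitability_score(categories):
    """Sum the category ratings of all factors (assumed unweighted)."""
    if len(categories) != N_FACTORS:
        raise ValueError(f"expected {N_FACTORS} ratings, got {len(categories)}")
    for c in categories:
        if not MIN_CATEGORY <= c <= MAX_CATEGORY:
            raise ValueError(f"rating {c} outside {MIN_CATEGORY}-{MAX_CATEGORY}")
    return sum(categories)


lowest = total_suitability_score([MIN_CATEGORY] * N_FACTORS)
highest = total_suitability_score([MAX_CATEGORY] * N_FACTORS)
print(lowest, highest)  # 16 96
```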

The 16 factors are classified into three domains (details are reported in electronic supplementary information 3): (1) how drugs’ success rates and development time used for cost estimation were obtained; (2) if the study considered potential sources contributing to the variation in R&D costs; and (3) what the components of the cost estimation were. Considering the previously cited criticisms of studies lacking in breadth of coverage (meaning only a fraction of the pharmaceutical industry pipeline being represented by the R&D cost estimate) as well as in transparency of the drug and cost data [20, 25, 28, 30], we emphasized the importance of replicability of the data in the categories. Specifically, we created categories to assess the replicability of sampling of the drugs used for estimating costs, success rates and development time, and for the replicability of cost data collection at the project, firm, or industry level.

For most factors, the consequences of a factor being rated category 1 or 2 are similar; both signify that the factor was not considered in the R&D cost estimation (the framework’s terminology is summarized in electronic supplementary information 4). However, we assigned category 2 to those articles where the authors mentioned the significance of the factor for the R&D costs but did not consider the factor in the R&D cost estimation (e.g., authors stated that the factor is important but the information available does not allow its consideration in the estimation). On the other hand, category 1 was assigned to articles where the factor was completely ignored. This distinction aims to capture the authors’ views when deciding whether to include these factors. Although it does not impact the suitability, we believe it is crucial to include the authors’ rationale for dismissing certain factors affecting the cost estimation. A complete description of factors and categories is provided in electronic supplementary information 3.

Additionally, we assessed how the studies considered four variables identified in the literature as impacting future trends in R&D costs and processes, including the role of public funding [20, 21, 23, 27]. To explore this aspect, for each variable we developed additional categories similar to those used to assess the 16 factors in the suitability score. However, the scores from these additional categories are not considered part of our evaluation of the suitability of cost estimates; they are used to illustrate the perspective of the authors. Details can be found in electronic supplementary information 5.

2.3 Drug Inclusion Period and Research and Development (R&D) Costs

In order to explore if the drug inclusion period was related to the magnitude of these estimates, a series of ordinary least squares (OLS) models were estimated, using as the dependent variable four measures of average costs: (1) total capitalized R&D costs; (2) total non-capitalized R&D costs; (3) capitalized clinical costs (phases I–III); and (4) non-capitalized clinical costs. These costs were regressed against the middle year of the drug sample inclusion period, controlling for three categorical variables: (1) phases included (preclinical, clinical, and period of submission for market approval); (2) therapeutic area (mixed vs. unique); and (3) method used (Tufts method, i.e., methodology used by DiMasi and colleagues [16,17,18] vs. other). Additionally, we considered the possibility of a nonlinear relationship between the drug sample inclusion period and the R&D costs.
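A regression of this form could be set up as in the following sketch, which uses NumPy least squares on synthetic data. All values, the coefficient vector `true_beta`, and the variable names are illustrative assumptions; the phases-included control is omitted for brevity, and nothing here reproduces the review's data or results.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30  # hypothetical number of cost estimates

# Synthetic regressors (illustrative only).
mid_year = np.linspace(1970.0, 2014.0, n)            # middle year of inclusion period
tufts = (np.arange(n) % 2).astype(float)             # 1 = Tufts method, 0 = other
mixed_area = (np.arange(n) % 3 == 0).astype(float)   # 1 = mixed therapeutic areas

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n), mid_year, tufts, mixed_area])

# Synthetic costs generated from assumed coefficients plus noise.
true_beta = np.array([-50000.0, 25.0, 100.0, -50.0])
cost = X @ true_beta + rng.normal(0.0, 50.0, size=n)

# Ordinary least squares fit.
beta_hat, *_ = np.linalg.lstsq(X, cost, rcond=None)

# Coefficient of determination.
residuals = cost - X @ beta_hat
r_squared = 1.0 - residuals @ residuals / np.sum((cost - cost.mean()) ** 2)

print("slope on middle year:", beta_hat[1])
print("R-squared:", r_squared)
```

A nonlinear relationship could be explored in the same setup by adding, for example, a squared (centered) year term as a further column of `X`.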

3 Results

Table 1 shows the main information extracted from the 22 selected articles. Some of the articles reported more than one cost estimation, yielding a total of 45 different estimates (without considering estimates from the sensitivity analysis). Estimates of the total average capitalized (pre-launch) R&D costs needed to bring a new compound to the market varied widely, from $161 million to $4.5 billion, depending on which research phases were included in the analysis (e.g. discovery and preclinical phases were not included in 13 estimations), the therapeutic class, the drug sample inclusion period, the actual annual COC, and the methodology, among other factors (details on the methods, databases, data sources, and results of the selected studies are reported in electronic supplementary information 2).

Table 1 General characteristics and main estimates of the selected studies (costs in 2019 million US dollars), ordered by year of publication

Two articles did not consider the COC [24, 47], which implies that these papers did not provide a capitalized estimate. One study did not include phase III within clinical development (Table 1) [49]. Prasad and Mailankody [37] deviated from the other studies by using a lower COC assumption. Additionally, the Jayasundara et al. [40] key estimate was for both ‘true’ NMEs (i.e. approved by the US FDA for its first indication) and drugs that received an additional FDA-approved indication. Table 1 presents both R&D cost estimates of Jayasundara et al. [40], which include the full sample (NMEs and non-NMEs), as well as the data for ‘true’ NMEs only (which are of primary interest in our context).

3.1 Suitability Assessment

The suitability scores for the R&D cost estimations are presented in Table 2 and Fig. 1, while the rankings of the articles according to the total scores, from highest to lowest, are presented in Fig. 1, Chart A. The highest score of 81 [19], 96 being the maximum, suggests that even the cost estimate with the highest suitability still omitted some factors.

Table 2 Score assigned to the selected factors of the R&D cost estimation

Fig. 1 Article ranking based on the total suitability scores of the R&D cost estimation. † Excluding the factor(s) that is (are) plotted separately. †† Part of the “Drug sample characteristic group”. ††† Part of the “Possible sources of variation in R&D costs group”. ‡ Part of the “Monetary values group”. The number next to the reference represents the ranking of the estimation in accordance with the value of the suitability score. Estimations that share the same suitability score have the same ranking. Source: Authors’ elaboration

Five articles received scores higher than 70 [17, 19, 37, 48, 54]. The most recent study by Wouters et al. [19] stands out with the highest number of factors scoring over five, with the highest score in the domain ‘monetary values’. Prasad and Mailankody [37] did not include success probabilities or any method to consider the risk of failure, which led to a low rating in the ‘drug sample characteristics’ domain (Table 2). Nevertheless, higher ratings for the factors in the domain ‘possible sources of variation in R&D costs’ led to a high overall score. This domain was largely neglected in many of the included articles.

The article by Wiggins [59] had the lowest score, followed by Young and Surrusco [24] and Årdal et al. [47] (Fig. 1). None of the R&D costs estimated by these authors reflected success probabilities: two did not consider the risk of failure and two did not include costs in the discovery and preclinical phases (Table 2). Årdal et al. [47] estimated the R&D costs for only one therapeutic class (antibiotics) and thereby scored lower on the factors of breadth of coverage and replicability. Moreover, Årdal et al. [47] did not include the costs of phase III trials, nor did they consider the risk of failure or cost of capital adjustments (see electronic supplementary information 2) (Fig. 2).

Fig. 2 Average capitalized R&D costs estimated per successful drug (considering failures): total. Blue lines: R&D cost estimations for the clinical phases. Red lines: R&D cost estimations that include an approximation for the discovery and preclinical phases. Green lines: R&D cost estimations that include an approximation for the discovery and preclinical phases as well as the R&D during the period of submission for market approval. * A thicker line represents a higher value in the suitability score, thus higher suitability of the R&D cost estimation. The length of the lines corresponds to the drug inclusion period. This is the time period considered by the authors for the selection of the drug sample. In most articles, it is the period in which the drug was first tested in humans. Nevertheless, some authors applied different definitions. For more details, see electronic supplementary information 2. Dashed line: OLS regression (excluding Jayasundara et al. [40], Chit et al. [45], and Wouters et al. [19], oncology): \(\text{R\&D costs} = -64{,}480.30\ (p\text{-value} = 0.0) + 32.87\ (p\text{-value} = 0.0) \times \text{Year}\). Year corresponds to the middle point of the drug inclusion period; additional details are in electronic supplementary information 6. Abx anti-infectives, All the estimation includes all the observations in the sample, Anes analgesics/anesthetic, CNS central nervous system, CV cardiovascular, At&Me metabolism and endocrinology, Large large enterprises, mAbs monoclonal antibodies, Max maximum reported value, Medium medium enterprises, Min minimum reported value, Neuro neuropharmacological, NSAID nonsteroidal anti-inflammatory drugs, Small small enterprises, TB tuberculosis. Note: (1) Each line corresponds to one main R&D estimate. When more than one R&D cost estimate is reported, we refer to each by including the reference of the corresponding article and a keyword that describes the main characteristic of the R&D cost estimate. Wouters et al. [19] categorized each selected data point as high, medium, or low quality, depending on the availability and consistency of reported data. ‘High quality’ corresponds only to the estimations considered high quality observations. (2) With the exception of Falconi et al. [50], all the R&D values are capitalized until market approval. (3) DiMasi and Grabowski [53] also considered therapeutic recombinant proteins. Source: Authors’ elaboration

‘Possible sources of variation in R&D costs’ is the domain in which the selected articles showed the lowest scores (Table 2). This domain aims to capture whether the estimated R&D cost represents only an average of the R&D costs per new compound, or if it reflects the heterogeneity that exists in the drug discovery and development process. Most of the studies did not address the impact of orphan drug status and consideration of tax-deductible costs.

It is important to stress the difference between the two ‘Replicability’ factors. First, the factor in the domain ‘Drug Sample Characteristics’ refers to the access to the database used to select the drug sample. For instance, the articles based on the ‘Tufts data’ [16, 18, 53, 55, 57, 58] were assigned a category ‘3’ (the database used to select the drug sample is not publicly available, but is collected by a third party with strict confidentiality arrangements where only individual companies could access their own data submitted). Even if there is the possibility of working with the Tufts group and replicating part of the methodology, there is no option to access the whole database directly. Regarding the factor ‘Replicability’ in the domain ‘Monetary Values’, we refer to the source of the monetary values (e.g., the cash expended in R&D, by phase), rather than sample selection. In this domain, we considered that suitability exists when (1) it is possible to calculate the final R&D cost estimation based on the original monetary data (i.e., data by company and/or selected drugs); or (2) it is possible to replicate the data collection fully (e.g., by interviewing the same set of companies using the same form and considering the same drugs). For instance, articles based on confidential surveys of multinational pharma companies [16,17,18, 47, 48, 53, 55, 57, 58, 60] are assigned a ‘3’ (the information used to estimate the cash expended in R&D during the clinical phases is not publicly available, but is collected by a third party with strict confidentiality arrangements where only individual companies could access their own data submitted).

Considering previous critiques of the methodology used to estimate R&D costs [20, 25, 27], we identified four factors that dominated the debate on the appropriateness of their inclusion: (1) risk of failure; (2) orphan drug status; (3) opportunity costs; and (4) tax-deductible costs. In Fig. 1, Chart B, a separate ranking of articles excluding these factors is presented. A comparison of Charts A and B shows that the suitability scores of the articles by Jayasundara et al. [40] (a 37% decrease), Sertkaya et al. [49] (a 30% decrease) and Young and Surrusco [24] (a 30% decrease) changed the most. Additionally, seven articles had no change in their comparative ranking and five changed their position by one place only. The comparative rankings of two [45, 48] of the remaining 10 studies are two places lower in Chart B than in Chart A. An additional article, Jayasundara et al. [40], which focuses on the differences in R&D costs between orphan and non-orphan drugs, decreased by three places. The latter three articles [40, 45, 48] are among the few that mentioned tax-deductible costs. Other comparative rankings improved by three to four places [47, 55, 58, 60], attributed to the fact that both orphan drug status and tax-deductible costs were considered in these studies’ analyses.

In addition to our analysis presented in Fig. 1, we conducted a sensitivity analysis by excluding different combinations of factors to observe the effect on the suitability score and on the rankings (additional details are provided in electronic supplementary information 6). While for some articles (e.g. Årdal et al. [47], Falconi et al. [50], Wiggins [59], and Wouters et al. [19]) the ranking is highly consistent regardless of the excluded factors, rankings of other articles (e.g., Adams and Brantner [51], Prasad et al. [22], and Jayasundara et al. [40]) vary noticeably.

3.2 R&D Cost Estimates

We extracted both capitalized and non-capitalized R&D costs per successful medicine. The estimates of the average capitalized R&D costs are displayed in Fig. 2. A thicker line represents higher suitability scores. Blue lines are estimations that considered only the clinical phases; red lines additionally incorporated an approximation of the discovery and preclinical phases; and green lines additionally considered the R&D costs invested during the period of submission for marketing authorization. Capitalized values were reported for all but three of the selected articles [24, 47, 49], giving a total of 38 capitalized estimations of the average R&D costs. Only the R&D cost estimates that could be linked to a specified drug inclusion period are represented in Fig. 2 (36 of 38). The estimates from the Global Alliance for TB Drug Development [56] did not require the definition of a drug inclusion period and are therefore not presented in Fig. 2. Sixty-one percent (22 of 36) of the remaining estimated capitalized R&D costs were under $1 billion. Particularly remarkable was the capitalized R&D cost estimated by Wouters et al. [19] for oncological drugs ($4.5 billion), not only because of its magnitude (by far the highest) but also because it was more than double the earlier estimate by Falconi et al. [50] for oncological drugs ($2.1 billion). The estimates presented by Adams and Brantner [51] and Falconi et al. [50] included only the clinical phases, although they are among the highest in the sample and are at a similar level to those in studies that also considered the discovery, preclinical, and submission to approval phases.

Figure 2 shows the average reported values of Wouters et al. [19], although they also reported median R&D costs, which, in the case of oncology, were equal to $2.8 billion. This suggests a positive skew in the oncological drugs sample selected by Wouters et al. [19], which highlights how much the choice between mean and median estimates matters. Notably, among the included articles, only Wouters et al. [19] and Prasad et al. [22] reported median costs as the main results. The average capitalized costs with 7% COC reported by Prasad et al. [22] were $944 million, while the median costs were $788 million. The difference between the mean and median is more than $150 million, although it is not as drastic as that reported by Wouters et al. [19], who found a difference of around $1.7 billion. Three of the main papers from DiMasi et al. [16,17,18] also reported median costs, but only for cash expenses for each phase (in addition to mean costs). Given that they used the mean phase length to calculate capitalized costs, only mean capitalized costs were reported. Adams and Brantner [54] reported in a similar way, whereas the other articles reported only average costs.
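The mean-median gaps quoted above can be checked with a short calculation. The figures are those reported in the text, expressed in million US$ (the billion-dollar values are rounded as in the text); the helper name and dictionary layout are illustrative.

```python
# Reported capitalized R&D costs in million US$ (values taken from the text above).
reported = {
    "Wouters et al. [19], oncology": {"mean": 4500.0, "median": 2800.0},
    "Prasad et al. [22], 7% COC": {"mean": 944.0, "median": 788.0},
}


def mean_median_gap(mean, median):
    """Return the absolute gap and the gap relative to the median."""
    gap = mean - median
    return gap, gap / median


gaps = {}
for label, values in reported.items():
    gap, relative = mean_median_gap(values["mean"], values["median"])
    gaps[label] = gap
    # A mean above the median is consistent with a positively skewed sample.
    print(f"{label}: gap = {gap:.0f} million US$ ({relative:.0%} of the median)")
```

The gaps come out at 1700 and 156 million US$, matching the "around $1.7 billion" and "more than $150 million" stated in the text.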

In general, publications with the most recent drug inclusion periods tended to report higher estimates (Fig. 2). Nevertheless, there were exceptions. While Adams and Brantner [54] and DiMasi and Grabowski [53] considered similar inclusion periods and similar R&D phases, their estimates differed by around $330 million. Similarly, the articles by Jayasundara et al. [40] and Chit et al. [45] (the latter presenting estimates for an influenza vaccine), which had the most recent drug inclusion periods, reported R&D values that were lower than would be expected from a consistently increasing trend in R&D costs over time.

The non-capitalized average R&D costs per successful drug are shown in Fig. 3. Thirty-two reported R&D cost estimations were adjusted for the risk of failure but not for opportunity costs. Wouters et al. [19], Falconi et al. [50], and the Global Alliance for TB Drug Development [56] reported only capitalized values and values not adjusted for risk of failure or opportunity costs, while Årdal et al. [47] calculated R&D costs without adjusting for risk of failure or opportunity costs. Of the 32 estimates, the three estimates from Sertkaya et al. [49] could not be linked to a drug inclusion period; therefore, only 29 observations are shown in Fig. 3. The highest estimate (DiMasi et al. [17]), which considered both preclinical and clinical phases, was around $500 million higher than the Paul et al. [52] and Mestre-Ferrandiz et al. [48] estimates, which also included the submission phase. The lowest reported R&D costs corresponded to the approximation for particular therapeutic classes: two from DiMasi et al. [58] (drugs for the cardiovascular and nervous systems) and one from Chit et al. [45] (seasonal influenza vaccine).

Fig. 3 Average cash spent on R&D estimated per successful drug (considering failures): total. Blue lines: R&D cost estimations for the clinical phases. Red lines: R&D cost estimations that include an estimation for the discovery and preclinical phases. Green lines: R&D cost estimations that include an estimation for the discovery and preclinical phases as well as the R&D during the period of submission for market approval. * A thicker line represents a higher value in the suitability score, thus higher suitability of the R&D cost estimation. The length of the lines corresponds to the drug inclusion period. This is the time period considered by the authors for the selection of the drug sample. In most articles, it is the period in which the drug was first tested in humans. Nevertheless, some authors applied different definitions. For more details, see electronic supplementary information 2. Dashed line: OLS regression (excluding Jayasundara et al. [40], Chit et al. [45], and Wouters et al. [19], oncology): \(\text{R\&D costs} = -49{,}122.78\ (p\text{-value} = 0.0) + 24.95\ (p\text{-value} = 0.0) \times \text{Year}\). Year corresponds to the middle point of the drug inclusion period; additional details are in electronic supplementary information 6. Abx anti-infectives, All the estimation includes all the observations in the sample, Anes analgesics/anesthetic, CNS central nervous system, CV cardiovascular, At&Me metabolism and endocrinology, Large large enterprises, mAbs monoclonal antibodies, Max maximum reported value, Medium medium enterprises, Min minimum reported value, Neuro neuropharmacological, NSAID nonsteroidal anti-inflammatory drugs, Small small enterprises, TB tuberculosis. Note: (1) Each line corresponds to one main R&D estimate. When more than one R&D cost estimate is reported, we refer to each by including the reference of the corresponding article and a keyword that describes the main characteristic of the R&D cost estimate. Wouters et al. [19] categorized each selected data point as high, medium, or low quality, depending on the availability and consistency of reported data. ‘High quality’ corresponds only to the estimations considered high quality observations. (2) The methodology of Young and Surrusco [24] included the period of submission (R&D spending over 7 years and drugs approved during the preceding 7 years). However, it did not consider the discovery and preclinical phases; therefore, it is presented as a blue line. (3) DiMasi and Grabowski [53] also considered therapeutic recombinant proteins. Source: Authors’ elaboration

The OLS models suggest a positive and significant relationship between the drug inclusion period and the R&D costs regardless of the estimated equation. However, the R-squared is low (around 0.5), as is the number of degrees of freedom (DF; between 30 and 34 DF for the regressions on capitalized R&D values, and between 19 and 27 DF for those on non-capitalized values). For detailed results, see electronic supplementary information 7.

The OLS results indicate no significant differences in R&D costs between an estimation that corresponds to one therapeutic area and an estimation that represents multiple therapeutic areas. However, the limited number of observations hinders the possibility of comparing R&D costs between different therapeutic areas. In this regard, it is interesting to observe that the data used in the articles regarding development times vary considerably by therapeutic area: Wouters et al. [19], from 7 years in oncology to 9.2 years in metabolism and endocrinology; DiMasi et al. [55], from 5.2 years in analgesics and anesthetic to 9.6 years in the central nervous system; DiMasi et al. [58], from 6.4 years in anti-infectives to 9.7 years in neuropharmacologicals.

Given the possibility of outliers (e.g., Wouters et al. [19] for the highest estimate for oncology, and Jayasundara et al. [40] and Chit et al. [45] in terms of their lower values), the regression analysis was repeated omitting such extreme values. This increased the R-squared (to around 0.8 and 0.6 for equations including non-capitalized and capitalized costs, respectively) but reduced the number of DFs (results reported in electronic supplementary information 6). Results including the three control variables suggested that when the Tufts method was applied, the capitalized R&D costs for the clinical phases were significantly lower and the estimated total average cash spent in R&D was significantly higher. Moreover, there was no significant effect when considering a mix of therapeutic areas versus considering only one. Dashed lines in Figs. 2 and 3 correspond to the OLS models without outliers, with R&D costs as the dependent variable and the drug inclusion period as the only independent variable. Slopes of both equations were positive and significant. While the regression results with or without the outliers should be treated with caution, they show a (positive) relationship between the drug inclusion period and the magnitude of the R&D estimates. This suggests that R&D costs have increased over time.
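As a rough illustration of the single-regressor specification described above, the following Python sketch fits an ordinary least squares line of R&D cost estimates on the midpoint year of the drug inclusion period. The data points are invented placeholders, not values from the reviewed studies, and the function name `fit_ols` is an assumption made for illustration only; it is not the authors' code.

```python
from statistics import mean


def fit_ols(years, costs):
    """Closed-form OLS for: cost = intercept + slope * year.

    Returns (intercept, slope, r_squared).
    """
    if len(years) != len(costs) or len(years) < 2:
        raise ValueError("need at least two paired observations")
    year_bar = mean(years)
    cost_bar = mean(costs)
    # Sum of squares of the regressor and cross-products with the outcome.
    sxx = sum((y - year_bar) ** 2 for y in years)
    if sxx == 0:
        raise ValueError("all years are identical; slope is undefined")
    sxy = sum((y - year_bar) * (c - cost_bar) for y, c in zip(years, costs))
    slope = sxy / sxx
    intercept = cost_bar - slope * year_bar
    # R-squared: share of the variance in costs explained by the fitted line.
    ss_res = sum((c - (intercept + slope * y)) ** 2 for y, c in zip(years, costs))
    ss_tot = sum((c - cost_bar) ** 2 for c in costs)
    r_squared = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return intercept, slope, r_squared


# Hypothetical inputs: midpoint of each drug inclusion period and the
# corresponding cost estimate (arbitrary monetary units).
example_years = [1975, 1985, 1990, 1995, 2005, 2012]
example_costs = [300.0, 600.0, 650.0, 1100.0, 1500.0, 1400.0]

intercept, slope, r_squared = fit_ols(example_years, example_costs)
print(f"slope per year: {slope:.2f}, R-squared: {r_squared:.2f}")
```

With so few observations, dropping one or two points can move both the slope and the R-squared noticeably, which is consistent with the caution urged above.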

Additionally, Figs. 2 and 3 show that there was no link between the magnitudes of the estimated R&D costs and the score assigned (additional figures showing the results for preclinical and clinical phases are presented in electronic supplementary information 8). Nevertheless, a comparison between Charts A and B in Figs. 2 and 3 suggests a lower score for those estimations linked to one particular therapeutic area. Note that the scores proposed here focused mainly on the validity of the R&D cost estimates for representing the actual overall R&D costs.

We are aware that therapeutic area-specific R&D estimates usually rely on the best evidence possible for that particular area, and some of the methods used to generate that data might not be replicable across the entire industry. Naturally, these papers will have lower ratings in some of the dimensions.

Twenty-seven of the 45 estimates presented data with and without capitalization. Figure 4 shows the opportunity costs as a percentage of the total capitalized R&D costs. For instance, for DiMasi et al. [16] and Paul et al. [52], the non-capitalized R&D costs were around 50% lower than the capitalized R&D costs, implying ‘time’ represented around 50% of the total capitalized costs in these papers. The study by Chit et al. [45] was excluded from the estimates since the non-capitalized value excluded the preclinical and discovery phases, while the capitalized estimation included both. In addition, the capitalized cost estimation was in 2022 CAN$, for which we do not have a proper method to deflate to 2019 US$ to compare it with other studies. In most cases, opportunity costs represented a percentage that ranged between 35% and 51% (17 of 24 estimations). In general, the estimations that excluded preclinical and discovery (blue lines) show lower percentages than those that included them (red lines), with two clear exceptions: Prasad and Mailankody [37] and Adams and Brantner [51]. The former shows the smallest percentage value (21%) in the sample. This can be explained by the fact that the authors applied the lowest annual COC and one of the shortest clinical development times.
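The share reported in Fig. 4 is the difference between the capitalized and the cash (non-capitalized) cost, divided by the capitalized cost. The short Python sketch below restates that arithmetic and its inverse; the function names and the input values are illustrative assumptions, not figures from any reviewed study.

```python
def time_cost_share(capitalized, cash):
    """Share of the capitalized cost attributable to the cost of time."""
    if capitalized <= 0 or cash < 0 or cash > capitalized:
        raise ValueError("expected 0 <= cash <= capitalized and capitalized > 0")
    return (capitalized - cash) / capitalized


def capitalized_to_cash_ratio(share):
    """Inverse relation: capitalized cost as a multiple of cash cost."""
    if not 0 <= share < 1:
        raise ValueError("share must lie in [0, 1)")
    return 1 / (1 - share)


# Hypothetical values in arbitrary monetary units.
print(time_cost_share(capitalized=200.0, cash=100.0))  # 0.5
print(capitalized_to_cash_ratio(0.5))                  # 2.0
```

Under this definition, a time share of 50% means the capitalized cost is twice the cash cost, not 1.5 times it.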

Fig. 4.

Costs of time as proportion of the average capitalized R&D costs. Blue lines: R&D costs estimation for the clinical phases. Red lines: R&D costs estimations that include an approximation for the discovery and preclinical phases. Green lines: R&D costs estimations that include an approximation for the discovery and preclinical phases as well as the R&D during the period of submission for market approval. The percentage represented by the costs related to time equals: \(\frac{\text{Average capitalized R\&D costs per successful drug} - \text{Average cash spent in R\&D per successful drug}}{\text{Average capitalized R\&D costs per successful drug}}\). * A thicker line represents a higher value in the suitability score, thus higher suitability of the R&D cost estimation. The length of the lines corresponds to the drug inclusion period. Dashed line: OLS regression (excluding Jayasundara et al. [40], Chit et al. [45], and Wouters et al. [19] for oncology): \(\text{R\&D costs} = 397.97\ (p\text{-value} = 0.2) - 0.18\ (p\text{-value} = 0.2) \times \text{Year}\). Year corresponds to the middle point of the drug inclusion period; additional details in electronic supplementary information 6. Abx anti-infectives, All the estimation includes all the observations in the sample, Anes analgesics/anesthetic, CNS central nervous system, CV cardiovascular, At&Me metabolism and endocrinology, Large large enterprises, mAbs monoclonal antibodies, Max maximum reported value, Medium medium enterprises, Min minimum reported value, Neuro neuropharmacological, NSAID nonsteroidal anti-inflammatory drugs, Small small enterprises, TB tuberculosis. Note: (1) Each line corresponds to one main R&D estimate. When more than one R&D costs estimate is reported, we refer to each by including the reference of the corresponding article and a keyword that describes the main characteristic of the R&D costs estimate. Wouters et al. [19] categorized each selected data point as high, medium, or low quality, depending on the availability and consistency of reported data. ‘High quality’ corresponds to the estimate that considered only high-quality observations. (2) The drug inclusion period corresponds to the time period considered by the authors for the selection of the drug sample. In most articles, it is the period in which the drug was first tested in humans. Nevertheless, some authors applied different definitions. For more details, see electronic supplementary information 2. (3) DiMasi and Grabowski [53] also considered therapeutic recombinant proteins. (4) Chit et al. [45] was excluded from this figure because their capitalized cost estimation was reported in 2022 Canadian dollars, for which we do not have a proper method to deflate to 2019 values. Source: Authors’ elaboration.

3.3 Impact on Future Estimates

Some experts have challenged the estimations presented in the articles used in our analysis, in respect of how four variables impact the validity of existing and future estimates [20, 25, 27]:

  1. The role of public investment: How much of the R&D costs come from public resources and not from private investments?

  2. Post-authorization R&D costs: After the drug is approved by the corresponding regulator (e.g., FDA or European Medicines Agency [EMA]), how much R&D is still required to assure reimbursement and market success?

  3. Disparities in regulations and length of time for approval across time.

  4. Variations in the complexity of clinical trials (e.g., protocol design) across time.

Although these variables do not have a main role in estimating the R&D costs of bringing an NME to the market, they still warrant consideration. The first (the role of public investment) is not related to the R&D cost level but to the funding sources. The second (post-authorization R&D costs) captures the authors’ consideration of potential R&D costs occurring after market approval, which are particularly relevant for drugs approved in the earlier stages of development. The latter two are crucial if the aim is to analyze possible future trends in R&D costs. Table 3 illustrates that our selected articles mostly ignored these four variables.

Table 3 Variables with an impact on future R&D cost estimates. To what extent are these variables considered by the authors?

4 Discussion

We systematically reviewed studies that estimated the (pre-launch) R&D cost per NME and examined how the studies addressed key factors that might influence the estimate. The main components of the R&D cost per NME are: (1) direct cash expenses; (2) the future value at the time of launch (and thus return on investment), which is to be adjusted to the high risk of development failure (i.e., success or attrition rates); and (3) the long development times resulting in substantial opportunity costs (i.e., COC). Results showed a wide range of R&D cost estimates, from $161 million [58] to $4.54 billion [19]. There was evidence that more recent drug samples resulted in higher cost estimates; however, this should be interpreted with caution due to the (1) exceptions, (2) limited number of observations, and (3) heterogeneity in the methods and data used. Additionally, our assessment showed neither a relationship between the suitability scores and magnitude of cost estimates nor between the scores and the recency of investigated drug samples.
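To make the interplay of the three components concrete, the sketch below combines per-phase cash costs, success rates, and a cost of capital (COC) into a cost per approved drug. It is a deliberately simplified illustration, not the method of any reviewed study: the `Phase` structure, the assumption that spending occurs at the midpoint of each phase, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    name: str
    cash_cost: float     # out-of-pocket cost per candidate entering the phase
    success_prob: float  # probability of advancing past this phase
    years: float         # duration of the phase


def cost_per_approved_drug(phases, coc):
    """Return (cash, capitalized) expected cost per approved drug.

    Each phase cost is divided by the probability that a candidate entering
    that phase is eventually approved (accounting for failures), then
    compounded at the annual cost of capital until launch.
    """
    cash_total = 0.0
    capitalized_total = 0.0
    for i, phase in enumerate(phases):
        remaining = phases[i:]
        p_approval = 1.0
        for later in remaining:
            if not 0 < later.success_prob <= 1:
                raise ValueError("success_prob must lie in (0, 1]")
            p_approval *= later.success_prob
        # Assumption: spending is concentrated at the midpoint of the phase.
        years_to_launch = sum(p.years for p in remaining) - phase.years / 2
        expected_cash = phase.cash_cost / p_approval
        cash_total += expected_cash
        capitalized_total += expected_cash * (1 + coc) ** years_to_launch
    return cash_total, capitalized_total


# Hypothetical pipeline; costs in arbitrary monetary units.
pipeline = [
    Phase("Phase I", cash_cost=20.0, success_prob=0.60, years=1.5),
    Phase("Phase II", cash_cost=50.0, success_prob=0.35, years=2.5),
    Phase("Phase III", cash_cost=200.0, success_prob=0.60, years=3.0),
]
cash, capitalized = cost_per_approved_drug(pipeline, coc=0.10)
print(f"cash: {cash:.1f}, capitalized: {capitalized:.1f}")
```

In this setup, lower success rates inflate every upstream phase cost, and longer development times widen the gap between the cash and the capitalized figure, which is why the three components cannot be assessed in isolation.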

To our knowledge, this study is the first attempt to evaluate and systematically quantify the suitability of the R&D estimates to represent the actual R&D costs of the pharmaceutical industry. Our intention is to disentangle the complexity behind the average R&D cost estimations, rather than how any particular estimation should be used to set prices. Our proposed scoring system does not assess the quality of an article; rather, it is intended to address three critical issues for evaluating R&D cost estimates: the breadth of coverage, replicability, and reliability. We also do not intend to score the credibility of the R&D cost estimates since our suitability score depends on many factors, some of which are not related to the ‘credibility’ of the study. For instance, a paper might score low overall but high in some specific dimensions because it is therapy-specific (e.g., oncology drugs); this would not imply that the estimation is ‘less credible’ only because it covers less breadth of the industry pipeline.

We evaluated 16 factors covering three domains related to (1) how the selection of drugs was made and how the related success rates and development times used for cost estimation were obtained; (2) consideration of drivers of R&D costs variations; and (3) elements of the estimation, as shown in Table 2. Domains (1) and (3) evaluate the database(s) used to extract the different cost drivers, including monetary values (cash investments), success rates, and development times. Given the fundamental differences among the databases, scoring the monetary data’s suitability is not straightforward. For example, the Tufts group’s monetary values are differentiated by project and firm [16, 18, 53, 55, 57, 58] (i.e., high specificity) but were collected via surveys under strict confidentiality arrangements. Moreover, despite the authors’ efforts to validate their estimates with publicly available aggregate data [17], to the best of our knowledge the data have not been audited by any external parties. In contrast, Young and Surrusco [24] used the publicly available data reported by each company to the Pharmaceutical Research and Manufacturers of America (PhRMA), allowing higher transparency, but to link it to individual new chemical entities (NCEs) approved annually, the authors assumed a time lag between the R&D spending reported and the number of compounds approved. However, the R&D process is complex and assuming an equal time lag for all compounds does not account for the enormous variability in project duration [62, 63]. This variability is reflected in the wide range of development times depending on the therapeutic area.
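The aggregate, time-lag style of calculation can be sketched as follows: total reported R&D spending over a window of years is divided by the number of approvals in a window shifted by a fixed lag. This is only a schematic reading of the approach described above; the function name, the window and lag parameters, and the data are hypothetical and are not taken from Young and Surrusco [24].

```python
def lagged_cost_per_approval(spending, approvals, start_year, window, lag):
    """Aggregate spending over a window divided by approvals `lag` years later.

    `spending` and `approvals` are dicts keyed by calendar year; a missing
    year raises KeyError rather than being silently treated as zero.
    """
    spend_years = range(start_year, start_year + window)
    approval_years = range(start_year + lag, start_year + lag + window)
    total_spending = sum(spending[year] for year in spend_years)
    total_approvals = sum(approvals[year] for year in approval_years)
    if total_approvals == 0:
        raise ValueError("no approvals in the lagged window")
    return total_spending / total_approvals


# Hypothetical industry-level data in arbitrary monetary units.
spending = {2000: 10.0, 2001: 12.0, 2002: 14.0}
approvals = {2003: 2, 2004: 3, 2005: 4}
print(lagged_cost_per_approval(spending, approvals, start_year=2000, window=3, lag=3))  # 4.0
```

Because a single lag is applied to every compound, the result is sensitive to the chosen lag whenever spending or approvals trend over time, which mirrors the limitation noted above.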

Ultimately there is a trade-off between transparency and specificity. Therefore, we included two factors to assess the monetary values used to estimate clinical R&D costs: (1) replicability, and (2) specificity. For the replicability factor, a higher value was assigned to the use of publicly available data, while a higher specificity score was given to articles that included information by project and firm. With few exceptions [24, 37, 47, 59], the scores for specificity were higher than for replicability, suggesting a trend to sacrifice transparency in favor of more accurate estimates.

The most overlooked domain was the ‘possible source of variation in the R&D costs’, which reflects the heterogeneity of the pharmaceutical sector and the potential implications of this heterogeneity on the estimated value. Several factors affect the investment in R&D, and we selected the five factors most often mentioned in the literature: therapeutic class; orphan or non-orphan drugs; firm size; NCE or new biological entity (NBE); and self-originated or licensed-in (acknowledging it is not meant to be an exhaustive list).

Only seven papers considered differences by therapeutic area. One article [19] reported an estimate for oncology (total R&D capitalized costs) of over $4 billion, higher than the R&D costs for any other estimated therapeutic category. In addition to these seven articles, seven others focused on one therapeutic area only, thereby not considering possible differences among disease categories. Moreover, there were differences regarding therapy-specific estimates. For example, in oncology, Falconi et al. [50] estimated R&D costs of $2.1 billion, one of the highest in the sample, while the Prasad and Mailankody [37] estimate was considerably lower at $944 million.

Although only a few articles considered differences by therapeutic area, such R&D cost differences are highly plausible given the differences in project success rates between disease areas and types of molecules [1, 4, 63]. For example, empirical evidence suggests that monoclonal antibodies for cancer have the highest probability of success [1]. Additionally, technological advancements have led to considerable growth in research opportunities in some disease areas, increasing the number of possible compounds that can be tested. This is reflected in the differences in the number of compounds in the pipeline among therapeutic areas [64, 65], where the largest proportion corresponds to oncological products [65]. Moreover, improvements in preclinical testing in some therapeutic areas have facilitated early discontinuation when warranted, avoiding unnecessary clinical trial costs [5].

Despite the increase in the number of new orphan drugs approved and launched into the market [66, 67], the methodologies used to estimate R&D costs mostly ignored possible differences between orphan and non-orphan drugs. Some studies suggest that R&D costs for orphan drugs might be about half of those for non-orphan medicines [39,40,41]. This is partly explained by the fact that trials related to rare diseases include fewer participants, are less likely to be randomized or double-blinded, and are more likely to assess disease response instead of overall survival [66]. Additionally, regulatory innovations incentivizing research on rare diseases—partly by reducing companies’ out-of-pocket R&D costs, reducing the time from submission to approval, and shifting risk to the post-authorization period—have led to the authorization of many treatments based on fewer data and surrogate measures only [66]. This might explain a negative effect on the pre-approval R&D spending while potentially impacting post-launch R&D costs. However, some factors suggest potentially higher average R&D costs of bringing an orphan drug into the market. For instance, the R&D process for orphan drugs is marked by difficulties in recruiting patients for clinical trials, thus increasing costs per patient. Furthermore, relatively less medical research is conducted on rare diseases, resulting in a limited clinical understanding of such disease processes. Moreover, for ultra-rare diseases, the generation of robust clinical evidence is hindered by the limited availability of validated instruments to measure disease severity and progression [43, 68].

An additional source of heterogeneity in the R&D process is the size of the pharmaceutical company. The evidence is inconclusive on whether smaller firms are more efficient than larger firms in bringing new drugs into the market. Backfisch [69] suggests that this disagreement is due to disparities among selected samples and periods, emphasizing that new projects from small firms have been growing strongly. Despite the increase in R&D efforts of small and medium firms, only three studies considered the firm size in their estimations [51, 54, 57]. Additionally, some of the included articles used samples solely or mainly from large firms. Moreover, medium and small firms tend to produce NBEs rather than NCEs [4], with the former having lower attrition than the latter. Thus, excluding small and medium companies would also exclude NBEs, potentially biasing the results.

There is a lack of data measuring the R&D costs of licensed-in products; accordingly, only one article included this component [37]. Ignoring these products might bias the results, since evidence suggests licensed-in products have higher success rates than self-originated molecules [4, 70]. Because small firms produce a higher proportion of licensed-in products, omitting them might also bias the results.

It is debatable whether the trend in current R&D cost estimates can predict future development. Over recent decades, regulatory authorities have implemented new mechanisms allowing earlier access to new medicines [71, 72]. For instance, the FDA implemented the Fast-Track Program in 1987 and the Accelerated Approval Program in 1992 [66], while the EMA has been issuing conditional marketing authorizations since 2006 [73, 74]. These changes imply a shift in R&D costs from pre- to post-marketing authorization. Studies undertaken as post-launch research should count as R&D costs, especially if required by regulatory authorities. However, no data are available on how this shift impacts total R&D costs when R&D includes pre- and post-authorization activities.

Additionally, most R&D cost estimates do not adequately capture the complexities of drug discovery and clinical trials. For instance, although changes in cancer classification and treatment strategies have resulted in clinical trial design improvements [75], none of the oncological drug studies mentioned the potential effects of the increasing complexity of clinical trials.

Another indicator that current estimates may not predict future expenditures per NME is the evolution of the industry’s pipeline. Given the increasing impact of personalized or precision medicines [76], there is a transition towards targeted treatments that are more effective or better tolerated in smaller groups of patients. This generates a need to co-develop diagnostic tools to identify individuals most likely to benefit. The total effect on R&D costs is unclear but it also merits further investigation.

High list prices of many recently launched anticancer drugs, orphan products, and immune and gene therapies [6, 7, 77] have prompted calls for ‘cost-plus’ pricing approaches. One such proposal began with a call for transparency on the actual R&D costs for a cancer drug and resulted in an algorithm for a ‘cost price of new [cancer] treatment’ [78]. Similarly, a proposal for reasonable prices of orphan drugs suggested increased thresholds for cost effectiveness by considering R&D costs and treatment populations [79]. The authors of that proposal argued that society might be willing to sacrifice some efficiency as long as the profitability of the manufacturers does not exceed that of non-orphan drugs [79]. In this context, calls for transparency of the cost to develop an NME have gained popularity [80], for instance, the 2019 publication of the resolution on price transparency of the World Health Assembly (WHA) [81, 82]. Interestingly, while the resolution focused on transparency of net prices, no consensus was achieved on a proposed requirement for pharmaceutical companies to disclose internal R&D and other cost data [81]. However, proposals for increased transparency of R&D costs face practical difficulties. For instance, accounting for the actual cost may remove incentives for manufacturers to accelerate and efficiently manage development. Furthermore, it may be challenging to obtain the desired transparency (e.g., the actual cost of failures).

By definition, average R&D cost estimates, even when adjusted by potential differences (e.g., by therapeutic area), do not account for the added value offered by a new treatment modality. From an economic perspective, we believe that social value should be the primary determinant of reasonable drug prices [83,84,85]. If there are important elements of social value not captured by the logic of cost effectiveness, as implemented in many health technology assessment (HTA) processes, the completeness of the value component of the cost-benefit equation needs to be examined. In fact, there is increasing evidence that social norms and preferences are not adequately included in conventional health economic evaluation models [41, 86].

4.1 Study Limitations

First, personal experience and knowledge could have influenced our scoring. We addressed potential biases by basing our scoring criteria on previous literature and incorporating the experts’ concerns reflected in the literature in the ranking and descriptions given to the categories in the 1 to 6 scales. However, the heterogeneity in experts’ opinions (i.e., points of view and priorities) made this a difficult task that leads to an unavoidable degree of subjectivity. We addressed this subjectivity by basing the categories’ descriptions on extensive deliberation among our team members. A key touchstone was establishing well-defined, commonly understood definitions of each category to reduce personal interpretation of the data. Additionally, we distributed the scoring among four persons, first independently and subsequently by consensus. Finally, we conducted a sensitivity analysis to test the robustness of the suitability scores when between one and four factors were excluded from the analysis. However, the only way to eliminate subjectivity is through broader discussions on a common ground where there is an agreement on at least what needs to be discussed; our framework aims to be this common ground.

Second, we applied equal weighting to each factor. For a study to include a meaningful set of weights, it should use a focus group of experts and relevant stakeholders to incorporate systematically different points of view (e.g., the Delphi method). Such an undertaking is part of the next steps of a research plan that will follow this study. Our framework is the first step to initiate and encourage discussion of the future development of such a set of weights.
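The equal weighting described here, together with the exclusion-based sensitivity analysis mentioned under the first limitation, can be sketched in a few lines of Python. The aggregation as a (weighted) mean, the function names, and the factor labels are assumptions for illustration; the study's actual aggregation rule and factor definitions are given in its Table 2 and supplementary material.

```python
from itertools import combinations


def suitability_score(factor_scores, weights=None):
    """Weighted mean of factor scores; equal weights when none are supplied."""
    names = list(factor_scores)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(factor_scores[name] * weights[name] for name in names) / total_weight


def leave_out_range(factor_scores, max_excluded=4):
    """Lowest and highest equal-weight score when 1..max_excluded factors are dropped."""
    names = list(factor_scores)
    lowest = highest = suitability_score(factor_scores)
    for k in range(1, max_excluded + 1):
        for dropped in combinations(names, k):
            kept = {n: s for n, s in factor_scores.items() if n not in dropped}
            if not kept:
                continue  # nothing left to score
            score = suitability_score(kept)
            lowest = min(lowest, score)
            highest = max(highest, score)
    return lowest, highest


# Hypothetical scores on a 1 to 6 scale for a handful of placeholder factors.
scores = {"sample_selection": 5, "success_rates": 4, "replicability": 2,
          "specificity": 6, "therapeutic_area": 3, "firm_size": 1}
print(suitability_score(scores))
print(leave_out_range(scores))
```

Passing a `weights` dictionary, for example one elicited from an expert panel, would replace the equal-weight default without changing the rest of the calculation.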

Third, a quantitative analysis is hindered by the differences in the models and the low number of observations. Therefore, it is impossible to verify which factor(s) had the strongest influence on the R&D cost estimates, or determine causality between the factors and the final R&D costs. Accordingly, despite statistical significance, the positive relationship identified between the drug inclusion period and the magnitude of the estimates should be treated with caution. Similarly, the lack of transparency due to the confidentiality of the data hindered us from determining to what degree the sample data affected the level of R&D costs estimated.

Fourth, we included articles that report the methodology used to collect information and the estimates of R&D costs. Articles that lacked this information were excluded [87,88,89]; however, they may provide further insight into this research topic.

Lastly, the complexity of the topic prevented the inclusion of an exhaustive list of factors. Nevertheless, we feel that our careful analysis of the literature has captured the main concerns expressed by experts and other stakeholders interested in this topic.

5 Conclusions and Future Perspectives

There is no simple answer to our original question of how much it costs (on average) to research and develop a new medicine, specifically an NME. Average R&D costs mask essential sources of heterogeneity. For example, there are differences by therapeutic area, with new cancer drugs showing the highest R&D costs. Other sources of heterogeneity among estimates are less well-documented, for example the impact of firm size, orphan versus non-orphan, chemical versus biological compounds (i.e., small versus large molecules), the origin of a new drug candidate (in-house versus licensed-in), and the potential role of public versus private funding. Future studies should include previously neglected variables and carefully consider the trade-off between the transparency and public accessibility of data and their specificity. Our scoring system may provide valuable guidance in that respect.

We detected a trend indicating that capitalized costs per NME are increasing. If the trend continues, it might have implications for the viability of the research-based biopharmaceutical industry’s business model. However, even this interpretation relies on the assumption that current or past trends are indicators of future trends, which is debatable. Future increases in R&D costs might reflect the growing complexity of target diseases, but ultimately these increases will be a function of the evolution of direct costs, attrition rates, and development times. These factors might be influenced by technological advances, the emergence of precision medicine, the resulting ‘orphanization’ of indications, and the development of companion diagnostics allowing effective stratification of patient subpopulations.

Regarding our included studies, we do not intend to deny any of their merits. Instead, we assessed how comprehensively the studies reported factors that play a role in the R&D cost estimation. We created a multifactorial framework because we believe that the validity of R&D cost estimates rests on multiple conditions. Estimations are difficult or almost impossible to compare because too many researchers tend to favor some factors to the detriment of others. Moreover, R&D cost estimates are used without considering the assumptions underlying the estimation. In this regard, our framework can serve as a guideline of the minimum set of factors that should be considered in future R&D cost estimations. If some proposed factors are not taken into account, they should at least be discussed in terms of the potential effects on the estimation. There is still a long way to go before a commonly agreed framework to evaluate R&D cost estimations is established, particularly when considering that the R&D of new molecular entities is far from static. We believe our framework can play an important role in providing clarity about what a particular R&D cost estimation captures.