How to optimize real-world study: concept, opportunities, and evidence quality

Kaiping Zhang1,2, Daoyuan Wang2, Jianrong Zhang3

1School of Public Health, Imperial College London, London, UK;2Editorial Office, Translational Breast Cancer Research, AME Publishing Company, Hong Kong, China;3Department of General Practice, Melbourne Medical School, Cancer in Primary Care Research Group, Primary Care Collaborative Cancer Clinical Trials Group (PC4), Centre for Cancer Research, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Victorian Comprehensive Cancer Centre, Melbourne, Victoria, Australia

Correspondence to: Jianrong Zhang, MD, MPH. Department of General Practice, Melbourne Medical School, Cancer in Primary Care Research Group, Primary Care Collaborative Cancer Clinical Trials Group (PC4), Centre for Cancer Research, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Victorian Comprehensive Cancer Centre, Level 10/305 Grattan St, Melbourne VIC 3000, Australia. Email:

Provenance and Peer Review: This article was commissioned by the Editorial Office, Translational Breast Cancer Research. The article did not undergo external peer review.

Received: 16 June 2020; Accepted: 10 July 2020; Published: 30 July 2020.

doi: 10.21037/tbcr-20-30

World Health Statistics 2020 indicated that, although both life expectancy and healthy life expectancy increased by 8% between 2000 and 2016, non-communicable diseases, which account for 71% of all global deaths, remain a considerable health burden (1). For example, breast cancer, the most common cause of cancer death among females in 2017 (2), was responsible for an estimated 626,679 deaths globally in 2018 (3). Owing to contributions from researchers, health-care practitioners (HCPs), and stakeholders, significant progress has been achieved in breast cancer prevention and treatment. Critical to this success has been the use of randomized controlled trials (RCTs), which are considered to sit at the pinnacle of the medical evidence pyramid. However, the uncertain external validity of RCTs and the extensive costs of conducting them have become significant concerns.

Due to these issues, real-world study (RWS), being more economically feasible and possessing greater external validity, has received increasing attention in the past decade. Still, many misunderstandings and uncertainties concerning RWS, including the opportunities it offers and its capacity to provide high-quality evidence, remain unresolved. This article aims to discuss and clarify the definitions relevant to RWS, the opportunities RWS brings, and ways to ensure high-quality evidence.

What is real-world study?

The “real world” concept is not new, and its origins can be traced back more than 50 years. Many do not appreciate this relatively long history, as “real world” only began to receive more attention two decades ago, and its true value has only been appreciated in recent years. Using “(real world[Title]) OR (real-world[Title])” to search PubMed (search date: June 10th, 2020), we found 9,031 articles published from 1966 to 2020. The number of articles has increased year by year, surging to hundreds per year beginning in 2005 and to thousands per year since 2017.
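The yearly counts described above could in principle be re-derived programmatically. The following is a minimal sketch, assuming NCBI's public E-utilities ESearch endpoint is queried with publication-date limits; the helper name `build_count_url` is hypothetical, the sketch only constructs the request URL and does not contact the server, and counts retrieved today would be expected to differ from those obtained on the original search date.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (public API).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Title-only query quoted in the text.
TITLE_QUERY = "(real world[Title]) OR (real-world[Title])"


def build_count_url(year: int, term: str = TITLE_QUERY) -> str:
    """Return an ESearch URL asking for the number of PubMed hits in one year.

    Hypothetical helper: it only builds the URL; fetching and parsing the
    XML response is left to the reader.
    """
    params = {
        "db": "pubmed",        # search the PubMed database
        "term": term,          # the title query
        "datetype": "pdat",    # filter on publication date
        "mindate": str(year),  # start of the date window
        "maxdate": str(year),  # end of the date window
        "rettype": "count",    # return only the hit count
    }
    return f"{ESEARCH}?{urlencode(params)}"


url_2017 = build_count_url(2017)
print(url_2017)
```

Looping this helper over 1966 to 2020 and reading the count from each response would give a per-year series comparable to the trend reported here.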

When discussing RWS, the two related concepts of real-world data (RWD) and real-world evidence (RWE) should be clarified as well. Table 1 summarizes the most authoritative definitions and classifications (4-7). Among them, we consider the definitions of the Food and Drug Administration (FDA) to be appropriate and accurate. According to the FDA, RWS refers to a study design that includes, but is not limited to, randomized and non-randomized trials (such as pragmatic clinical trials and large simple trials) in addition to observational studies; RWD is “data relating to patient health status and the delivery of healthcare routinely collected from a variety of sources, including registries, collections of EHRs (electronic health records), administrative and medical claims databases etc.”; RWE means “clinical evidence about the usage and potential benefits or risks of a medical product derived from analysis of RWD” (4). As described, these three concepts refer to different things: study design, data, and evidence. However, they can easily be misunderstood or misinterpreted. In one study, the authors reviewed 53 documents and conducted 20 interviews with stakeholders. They found that 53% of RWD definitions described data collected in a non-RCT setting, 24% described data collected in a non-controlled or non-interventional setting, 13% described data collected in a non-experimental setting, and 11% defined RWD as something else (8).

Table 1 Definitions and classifications of RWD, RWE, and RWS

It should be emphasized that RWS is an umbrella term. RWS includes studies with both randomized and non-randomized designs. In addition, the criteria for identifying RWS go beyond design or methodology and also consider the use of RWD in a real-world implementation scenario (4-6). A common misunderstanding about RWS and RCTs is that RCTs do not reflect real-world settings and that all observational studies are real world (8). In fact, some RCTs, such as pragmatic trials, are conducted in real-world settings, whereas some observational studies, such as those with intensified follow-up, may not be situated in real-world care settings (7). Therefore, it is inappropriate to completely separate RWS from RCTs. As in pragmatic trials, RWS can adapt the tools and methods of traditional trials and apply them to real-world settings by selecting appropriate analysis methods and designs, such as prospective plans and randomization.

What opportunities could real-world study bring?

RWS not only offers benefits of potential economic feasibility and greater external validity but also provides precious opportunities for optimizing evidence generation and verification.

First, RWS can be used for post-market surveillance to further confirm the efficacy and safety of interventions approved on the basis of pivotal RCTs (9). For instance, many recent cancer drugs investigated in pivotal RCTs have been approved more quickly than before, owing to the Accelerated Approval or Designated Breakthrough programs, the use of surrogate endpoints instead of the traditional gold standard of overall survival (OS), or both (10-13). However, the correlation between treatment effects on surrogate endpoints and on OS has generally been low or modest (14), surrogate endpoints are still widely used in confirmatory trials (15), and the majority of approved oncology drugs ultimately fail to demonstrate the desired clinical benefit after several years of marketing authorization (16,17). These circumstances make RWS a rational choice for confirmatory studies that further evaluate the efficacy and safety of interventions approved on the basis of pivotal RCTs. In addition, this use of RWS is supported by the reported feasibility of using RWD to replicate clinical trial evidence (18). Moreover, with larger sample sizes, RWS can supply sufficient statistical power to investigate the long-term impact of outcomes that are rare or suboptimal in pivotal RCTs, including adverse effects, patient-reported outcomes, and quality of life. Based on these outcomes, RWS can simultaneously support post-market digital pharmacovigilance, especially drug safety surveillance (19). Taken together, these points demonstrate the potential value of RWS in the post-market confirmation of approved interventions.
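The point about sample size and rare outcomes can be illustrated with a simple calculation. The sketch below is not a formal power analysis; it assumes independent patients and a fixed per-patient event rate, and it uses the probability of observing at least one event as a simplified stand-in for the ability to detect a rare adverse effect. The event rate and cohort sizes are hypothetical.

```python
def prob_at_least_one_event(event_rate: float, n_patients: int) -> float:
    """Probability of seeing one or more events among n independent patients."""
    if not 0.0 <= event_rate <= 1.0:
        raise ValueError("event_rate must lie between 0 and 1")
    if n_patients < 0:
        raise ValueError("n_patients must be non-negative")
    # Complement of "no patient experiences the event".
    return 1.0 - (1.0 - event_rate) ** n_patients


# Hypothetical adverse effect occurring in 1 of every 1,000 patients.
RATE = 0.001

for n in (500, 10_000, 100_000):
    p = prob_at_least_one_event(RATE, n)
    print(f"n={n:>7,}: P(at least one event) = {p:.4f}")
```

Under these assumptions, a trial-sized cohort of 500 patients would observe such an event with a probability of roughly 39%, whereas a registry-sized cohort of 10,000 patients would almost certainly observe it.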

Conducting RWS also allows researchers, HCPs and stakeholders to investigate the effectiveness of interventions in the clinical practice population, including in excluded/under-represented subgroups or those with different molecular characteristics not investigated in previous RCTs (20,21). Similar research can be conducted in different geographic or economic contexts to verify the efficacy/effectiveness of the interventions among the specific population. Based on these efforts, cost-effectiveness studies tailored to the given geographic or economic contexts can be conducted, providing evidence and indications to optimize the allocation of resources in health-care services (21).

Data linkage technology, applied to datasets from health records (in primary and subsequent care), cancer registries, insurance claims, etc. (22,23), makes longitudinal studies of patients’ cancer experiences feasible. Such investigations can examine any given point along the treatment pathway or any segment of the cancer continuum, including primary prevention (e.g., tobacco control and cessation, risk and protective factors for cancer), secondary prevention and diagnosis (e.g., cancer screening, novel diagnostic techniques/strategies), treatments (e.g., advanced surgical strategies, radiotherapies, targeted therapies, immunotherapies), recovery or survivorship (e.g., prevention and early detection of recurrence, metastasis, or secondary cancer), and end-of-life care (e.g., palliative care). The vast quantity of these linked datasets can provide ample patient information to answer a wide range of research questions related to general practice, surgery, oncology, epidemiology, health services research, health policy, health economics, and even social science (6,9,19,20,22,23).
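The idea of assembling a longitudinal patient history from linked sources can be sketched in a few lines. This is a minimal illustration only: it assumes deterministic linkage on a shared identifier (the hypothetical `patient_id` field), whereas real linkage projects often rely on probabilistic matching and privacy-preserving procedures. All records shown are fabricated for illustration.

```python
from collections import defaultdict
from datetime import date

# Fabricated example records; "patient_id" is an assumed shared linkage key.
primary_care = [
    {"patient_id": "P1", "date": date(2015, 3, 2), "event": "smoking cessation advice"},
    {"patient_id": "P2", "date": date(2016, 7, 9), "event": "screening referral"},
]
cancer_registry = [
    {"patient_id": "P1", "date": date(2017, 1, 20), "event": "breast cancer diagnosis"},
]
claims = [
    {"patient_id": "P1", "date": date(2019, 6, 1), "event": "palliative care claim"},
    {"patient_id": "P1", "date": date(2017, 2, 14), "event": "surgery claim"},
]


def link_records(**sources):
    """Merge records from named sources into one dated timeline per patient."""
    timelines = defaultdict(list)
    for source_name, records in sources.items():
        for rec in records:
            timelines[rec["patient_id"]].append(
                (rec["date"], source_name, rec["event"])
            )
    # Sort each patient's events chronologically.
    return {pid: sorted(events) for pid, events in timelines.items()}


timelines = link_records(
    primary_care=primary_care, cancer_registry=cancer_registry, claims=claims
)
for when, source, event in timelines["P1"]:
    print(when.isoformat(), source, event)
```

The resulting timeline for patient P1 runs from primary prevention through diagnosis and treatment to end-of-life care, which mirrors the segments of the cancer continuum listed above.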

How can high-quality evidence from RWS be ensured?

Although RWS offers considerable research opportunities, whether it has played an essential role in decision-making is still controversial (24). The main reason for this doubt is that the nature of RWS precludes it from having a study design and methodology as rigorous as those of RCTs, a rigor that enhances internal validity. Accordingly, the strength of evidence from RWS has become a primary concern.

These issues related to study design and methodology are directly reflected in the current evidence-based system. Evidence-based medicine, whose philosophical origins date back to the mid-19th century, has become a cornerstone of medical research (25), and, in the medical evidence pyramid, RWS (mainly observational studies) sits lower than RCTs. This judgement is reflected in the Grading of Recommendations Assessment, Development and Evaluation (GRADE) guidelines (26), a widely recognized approach to evidence grading. According to the guidelines, the quality of evidence is classified into four levels: high, moderate, low, and very low. Normally, evidence from observational studies (such as cohort studies and case-control studies, the most common study designs in RWS) is classified as low quality; however, its credibility can be increased when a dose-response relation, plausible bias control, or other such factors are present. In contrast, evidence from RCTs is generally classified as high quality, but its credibility can be reduced in cases of inconsistent results or reporting bias (26).
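The grading logic described above can be expressed as a short sketch. It is a deliberately simplified illustration rather than the GRADE procedure itself: the function name `grade_quality` is hypothetical, each upgrade or downgrade is assumed to move the rating by one level, and the judgement of whether a factor applies is left entirely to the assessor.

```python
# The four GRADE quality levels, ordered from weakest to strongest.
LEVELS = ["very low", "low", "moderate", "high"]

# Starting level by study design: RCTs start high, observational studies low.
START_INDEX = {"rct": 3, "observational": 1}


def grade_quality(design: str, downgrades: int = 0, upgrades: int = 0) -> str:
    """Return a quality level after applying one-step upgrades and downgrades.

    Simplified sketch: `downgrades` counts concerns such as inconsistent
    results or reporting bias; `upgrades` counts strengths such as a
    dose-response relation.
    """
    if design not in START_INDEX:
        raise ValueError(f"unknown design: {design!r}")
    index = START_INDEX[design] - downgrades + upgrades
    # Clamp so the rating never leaves the four defined levels.
    index = max(0, min(len(LEVELS) - 1, index))
    return LEVELS[index]


print(grade_quality("observational"))              # cohort study, no modifiers
print(grade_quality("observational", upgrades=1))  # e.g., dose-response relation
print(grade_quality("rct", downgrades=1))          # e.g., inconsistent results
```

In this sketch, an observational study with a dose-response relation and an RCT with inconsistent results both end at the moderate level, which shows why study design alone does not settle the strength of evidence.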

Apart from study design and methodology, concerns about the evidence from RWS also involve reporting. In academia, standards for transparent reporting are a near-constant consideration. It is thus crucial to follow the principles established in the GRADE guidelines (26), the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement (27), the extension of the CONSORT statement for pragmatic trials (28), and other relevant guidelines available through the EQUATOR Network (29). However, the quality of reporting in published studies is not always up to standard. For instance, studies that investigated the reporting quality of observational studies after the introduction of the STROBE statement found that there was still room for improvement (30,31).

Therefore, guaranteeing high-quality evidence from RWS is essential. In fact, quality control is critical at each stage of the research lifecycle. Specifically, it should be emphasized that conducting valid RWS requires data quality assurance, proper methodology design, and good reporting. Regarding data quality, prospective data collection of RWE should be foundational to preserving internal validity (19), especially for population-level studies. In addition, the items in datasets are subject to the design of the record systems, and the recorded contents are subject to the accuracy and completeness of the information provided by clinical practitioners and patients. In population-level research, the contents are also subject to the accuracy, completeness, and feasibility of data linkage. Because of these limitations, vigilance, together with administrative, financial, and technical support, is particularly warranted to assure data quality. Proper study design and methodology are also crucial.

RWS researchers encounter the risk of potential biases caused by confounders throughout the study and methodology design process (9). This risk can be mitigated through approaches ranging from logic-based [e.g., pragmatic clinical trials (32), directed acyclic graphs (DAG) (33), stratification or matching (9)] to statistical [e.g., multivariable regression models (9), propensity-score matching (34)] (19), which can augment internal and external validity and better ensure generalizability.
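To make the role of stratification concrete, the sketch below compares a crude odds ratio with a Mantel-Haenszel odds ratio pooled across strata of a confounder. The Mantel-Haenszel estimator is one standard way to combine stratum-specific results and is chosen here for illustration; the text does not prescribe a particular estimator. The counts are fabricated: treated patients are concentrated in a high-risk stratum, so baseline risk confounds the crude comparison.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table.

    a: treated with event,   b: treated without event,
    c: untreated with event, d: untreated without event.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata) -> float:
    """Mantel-Haenszel pooled odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Fabricated counts, one 2x2 table per stratum of baseline risk.
strata = [
    (10, 90, 20, 180),   # low-risk stratum: 10% events in both arms
    (100, 100, 20, 20),  # high-risk stratum: 50% events in both arms
]

# Collapsing the strata ignores the confounder.
crude_table = tuple(sum(column) for column in zip(*strata))
crude = odds_ratio(*crude_table)
adjusted = mantel_haenszel_or(strata)

print(f"crude OR    = {crude:.2f}")
print(f"adjusted OR = {adjusted:.2f}")
```

In this fabricated example the treatment has no effect within either stratum, yet the crude odds ratio is far above 1; the stratified estimate removes the distortion introduced by the unequal distribution of baseline risk.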

As for reporting, it is strongly recommended that future RWS comply with the established principles of the guidelines and statements mentioned above. Adherence to these principles is required not only of article authors but of editors and reviewers as well. Indeed, more studies that investigate the quality of published RWS are expected (30,31,35), along with studies and explanations that educate authors, editors, and reviewers on how to meet quality standards for study design and methodology and how to comply with reporting guidelines and statements.


In summary, RWS is a type of study conducted using RWD in a real-world implementation scenario. Its study designs include not only observational studies but also randomized and non-randomized clinical trials. RWS offers researchers and stakeholders unprecedented opportunities for optimizing clinical evidence generation and verification. To maximize the potential value of RWS, the quality of evidence is key. Ensuring reliable data quality, proper research design and methodology, and standardized and transparent reporting all help increase the strength of evidence from RWS.


Funding: None.


Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at KZ and DW declare that they are full-time employees of AME Publishing Company (publisher of Translational Breast Cancer Research). JZ is the section editor of Annals of Translational Medicine, Annals of Cancer Epidemiology, and Journal of Hospital Management and Health Policy, which are all managed by AME Publishing Company.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See:


  1. WHO. World Health Statistics 2020: Monitoring health for the SDGs. Available online:
  2. Compare GBD. Available online:
  3. GCO CANCER TODAY. Available online:
  4. FDA. Framework for FDA’s real-world evidence program. Available online:
  5. Sherman RE, Anderson SA, Dal Pan GJ, et al. Real-World Evidence - What Is It and What Can It Tell Us? N Engl J Med 2016;375:2293-7.
  6. Chinese Thoracic Oncology Group and Wu Jieping Medical Foundation. Real world study guideline. Available online:
  7. Sun X, Tan J, Tang L, et al. Real world evidence: experience and lessons from China. BMJ 2018;360:j5262.
  8. Makady A, de Boer A, Hillege H, et al. What Is Real-World Data? A Review of Definitions Based on Literature and Stakeholder Interviews. Value Health 2017;20:858-65.
  9. Skovlund E, Leufkens HGM, Smyth JF. The use of real-world data in cancer drug development. Eur J Cancer 2018;101:69-76.
  10. Johnson JR, Ning YM, Farrell A, et al. Accelerated approval of oncology products: the food and drug administration experience. J Natl Cancer Inst 2011;103:636-44.
  11. Hwang TJ, Franklin JM, Chen CT, et al. Efficacy, Safety, and Regulatory Approval of Food and Drug Administration-Designated Breakthrough and Nonbreakthrough Cancer Medicines. J Clin Oncol 2018;36:1805-12.
  12. Zhang J, Pilar MR, Wang X, et al. Endpoint surrogacy in oncology Phase 3 randomised controlled trials. Br J Cancer 2020;123:333-4.
  13. Chen EY, Joshi SK, Tran A, et al. Estimation of Study Time Reduction Using Surrogate End Points Rather Than Overall Survival in Oncology Clinical Trials. JAMA Intern Med 2019;179:642-7.
  14. Haslam A, Hey SP, Gill J, et al. A systematic review of trial-level meta-analyses measuring the strength of association between surrogate end-points and overall survival in oncology. Eur J Cancer 2019;106:196-211.
  15. Gyawali B, Hey SP, Kesselheim AS. Assessment of the Clinical Benefit of Cancer Drugs Receiving Accelerated Approval. JAMA Intern Med 2019;179:906-13.
  16. Grössmann N, Robausch M, Rosian K, et al. Monitoring evidence on overall survival benefits of anticancer drugs approved by the European Medicines Agency between 2009 and 2015. Eur J Cancer 2019;110:1-7.
  17. Tibau A, Molto C, Ocana A, et al. Magnitude of Clinical Benefit of Cancer Drugs Approved by the US Food and Drug Administration. J Natl Cancer Inst 2018;110:486-92.
  18. Bartlett VL, Dhruva SS, Shah ND, et al. Feasibility of Using Real-World Data to Replicate Clinical Trial Evidence. JAMA Netw Open 2019;2:e1912869.
  19. Khozin S, Blumenthal GM, Pazdur R. Real-world Data for Clinical Evidence Generation in Oncology. J Natl Cancer Inst 2017;109:10.
  20. Di Maio M, Perrone F, Conte P. Real-World Evidence in Oncology: Opportunities and Limitations. Oncologist 2020;25:e746-e752.
  21. Al-Refaie WB, Vickers SM, Zhong W, et al. Cancer trials versus the real world in the United States. Ann Surg 2011;254:438-42; discussion 442-3.
  22. Emery J, Boyle D. Data linkage. Aust Fam Physician 2017;46:615-9.
  23. McDonald L, Lambrelli D, Wasiak R, et al. Real-world data in the United Kingdom: opportunities and challenges. BMC Med 2016;14:97.
  24. Evans K. Real World Evidence: Can We Really Expect It to Have Much Influence? Drugs Real World Outcomes 2019;6:43-5.
  25. Sackett DL, Rosenberg WM, Gray JA, et al. Evidence based medicine: what it is and what it isn't. BMJ 1996;312:71-2.
  26. Guyatt GH, Oxman AD, Vist GE, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008;336:924-6.
  27. von Elm E, Altman DG, Egger M, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. PLoS Med 2007;4:e296.
  28. Zwarenstein M, Treweek S, Gagnier JJ, et al. Improving the reporting of pragmatic trials: an extension of the CONSORT statement. BMJ 2008;337:a2390.
  29. The EQUATOR Network. Enhancing the quality and transparency of health research. Available online:
  30. Pouwels KB, Widyakusuma NN, Groenwold RH, et al. Quality of reporting of confounding remained suboptimal after the STROBE guideline. J Clin Epidemiol 2016;69:217-24.
  31. Avery L, Rotondi M. More comprehensive reporting of methods in studies using respondent driven sampling is required: a systematic review of the uptake of the STROBE-RDS guidelines. J Clin Epidemiol 2020;117:68-77.
  32. Groenwold RHH, Dekkers OM. Designing pragmatic trials-what can we learn from lessons learned? J Clin Epidemiol 2017;90:3-5.
  33. Shrier I, Platt RW. Reducing bias through directed acyclic graphs. BMC Med Res Methodol 2008;8:70.
  34. Reiffel JA. Propensity Score Matching: The 'Devil is in the Details' Where More May Be Hidden than You Know. Am J Med 2020;133:178-81.
  35. Wang X, Johnson KJ, Zhang J. Evaluating Generalizability of Secondary Data. SAGE Research Methods Cases. London: 2020.

(English Language Editor: J. Gray)

Cite this article as: Zhang K, Wang D, Zhang J. How to optimize real-world study: concept, opportunities, and evidence quality. Transl Breast Cancer Res 2020;1:12.
