2. To determine the value of undertaking a full systematic review.
3. To summarize and disseminate research findings.
4. To identify research gaps in the existing literature [6, p. 21].
Researchers can undertake a scoping study to examine the extent, range, and nature of research activity, determine the value of undertaking a full systematic review, summarize and disseminate research findings, or identify gaps in the existing literature [6]. As such, researchers can use scoping studies to clarify a complex concept and refine subsequent research inquiries [1]. Scoping studies may be particularly relevant to disciplines with emerging evidence, such as rehabilitation science, in which the paucity of randomized controlled trials makes it difficult for researchers to undertake systematic reviews. In these situations, scoping studies are ideal because researchers can incorporate a range of study designs in both published and grey literature, address questions beyond those related to intervention effectiveness, and generate findings that can complement the findings of clinical trials.
In an effort to provide guidance to authors undertaking scoping studies, Arksey and O'Malley [6] developed a six-stage methodological framework: identifying the research question; searching for relevant studies; selecting studies; charting the data; collating, summarizing, and reporting the results; and consulting with stakeholders to inform or validate study findings (Table 2). While this framework provided an excellent methodological foundation, published scoping studies continue to lack sufficient methodological description or detail about the data analysis process, making it challenging for readers to understand how study findings were determined [1]. Arksey and O'Malley [6] encouraged other authors to refine their framework in order to enhance the methodology.
Overview of the Arksey and O'Malley methodological framework for conducting a scoping study
Arksey and O'Malley Framework Stage | Description |
---|---|
1: Identifying the research question | Identifying the research question provides the roadmap for subsequent stages. Relevant aspects of the question must be clearly defined as they have ramifications for search strategies. Research questions are broad in nature as they seek to provide breadth of coverage. |
2: Identifying relevant studies | This stage involves identifying the relevant studies and developing a decision plan for where to search, which terms to use, which sources are to be searched, time span, and language. Comprehensiveness and breadth are important in the search. Sources include electronic databases, reference lists, hand searching of key journals, and organizations and conferences. Breadth is important; however, so are the practicalities of the search. Time, budget, and personnel resources are potential limiting factors, and decisions need to be made upfront about how these will impact the search. |
3: Study selection | Study selection involves inclusion and exclusion criteria. These criteria are based on the specifics of the research question and on new familiarity with the subject matter through reading the studies. |
4: Charting the data | A data-charting form is developed and used to extract data from each study. A 'narrative review' or 'descriptive analytical' method is used to extract contextual or process-oriented information from each study. |
5: Collating, summarizing, and reporting results | An analytic framework or thematic construction is used to provide an overview of the breadth of the literature but not a synthesis. A numerical analysis of the extent and nature of studies using tables and charts is presented. A thematic analysis is then presented. Clarity and consistency are required when reporting results. |
6: Consultation (optional) | Provides opportunities for consumer and stakeholder involvement to suggest additional references and provide insights beyond those in the literature. |
In this paper, we draw on our experiences using the Arksey and O'Malley framework to build on the existing methodological framework. Specifically, we propose recommendations for each stage of the framework, followed by considerations for the advancement, application, and relevance of scoping studies in health research. Continual refinement of the framework stages may provide greater clarity about scoping study methodology, encourage researchers and clinicians to engage in this process, and help to enhance the methodological rigor with which authors undertake and report scoping studies [1].
We each completed a scoping study in separate areas of rehabilitation using the Arksey and O'Malley framework [6]. Goals of these studies included: identifying research priorities within HIV and rehabilitation [7], applying motor learning strategies within pediatric physical and occupational therapy intervention approaches [8], and exploring the use of theory within studies of knowledge translation [9]. The amount of literature reviewed in our studies ranged from 31 (DL) to 146 (KO) publications. Upon discovering that we had similar challenges implementing the scoping study methodology, we decided to use our experiences to further develop the existing framework. We conducted an informal literature search on scoping study methodology. We searched CINAHL, MEDLINE, PubMed, ERIC, PsycInfo, and Web of Science databases using the search terms 'scoping,' 'scoping study,' 'scoping review,' and 'scoping methodology' for papers published in English between January 1990 and May 2010. Reference lists of pertinent papers were also searched. This search yielded seven citations that reflected on scoping study methodology, which were reviewed by one author (DL). After independently considering our own experiences using the Arksey and O'Malley [6] framework, we met on seven occasions to discuss the challenges and develop recommendations for each stage of the methodological framework.
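For illustration only, the combined search string can be expressed programmatically. This is a hedged sketch, since the exact field tags and syntax varied across the database interfaces and are not reported here:

```python
# Sketch of the boolean query built from the four reported search terms.
# Exact syntax differs across CINAHL, MEDLINE, PubMed, ERIC, PsycInfo, and
# Web of Science; limits were English-language papers, January 1990-May 2010.
terms = ["scoping", "scoping study", "scoping review", "scoping methodology"]
query = " OR ".join(f'"{t}"' if " " in t else t for t in terms)
print(query)  # scoping OR "scoping study" OR "scoping review" OR "scoping methodology"
```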
We outline the challenges and recommendations associated with each stage of the methodological framework (Table 3).
Summary of challenges and recommendations for scoping studies
Framework Stage | Challenges | Recommendations for clarification or additional steps |
---|---|---|
#1 Identifying the research question | 1. Scoping study questions are broad. 2. Establishing scoping study purpose is not associated with a framework stage. 3. The four purposes of scoping studies lack clarity. | 1. Clearly articulate the research question that will guide the scope of inquiry. Consider the concept, target population, and health outcomes of interest to clarify the focus of the scoping study and establish an effective search strategy. 2. Consider the purpose of the scoping study together with the research question. Envision the intended outcome (e.g., framework, list of recommendations) to help determine the purpose of the study. 3. Consider the rationale for conducting the scoping study to help clarify the purpose. |
#2 Identifying relevant studies | 1. Balancing breadth and comprehensiveness of the scoping study with feasibility of resources can be challenging. | 1a. Research question and purpose should guide decision-making around the scope of the study. 1b. Assemble a suitable team with content and methodological expertise that will ensure successful completion of the study. 1c. When limiting scope is unavoidable, justify decisions and acknowledge the potential limitations to the study. |
#3 Study selection | 1. The linearity of this stage is misleading. 2. The process of decision making for study selection is unclear. | 1. This stage should be considered an iterative process involving searching the literature, refining the search strategy, and reviewing articles for study inclusion. 2a. At the beginning of the process, the team should meet to discuss decisions surrounding study inclusion and exclusion. At least two reviewers should independently review abstracts for inclusion. 2b. Reviewers should meet at the beginning, midpoint and final stages of the abstract review process to discuss challenges and uncertainties related to study selection and to go back and refine the search strategy if needed. 2c. Two researchers should independently review full articles for inclusion. 2d. When disagreements on study inclusion occur, a third reviewer can determine final inclusion. |
#4 Charting the data | 1. The nature and extent of data to extract from included studies is unclear. 2. The 'descriptive analytical method' of charting data is poorly defined. | 1a. The research team should collectively develop the data-charting form and determine which variables to extract in order to answer the research question. 1b. Charting should be considered an iterative process in which researchers continually extract data and update the data-charting form. 1c. Two authors should independently extract data from the first five to ten included studies using the data-charting form and meet to determine whether their approach to data extraction is consistent with the research question and purpose. 2. Process-oriented data may require extra planning for analysis. A qualitative content analysis approach is suggested. |
#5 Collating, summarizing, and reporting the results | 1. Little detail provided and multiple steps are summarized as one framework stage. | Researchers should break this stage into three distinct steps: 1a. Analysis (including descriptive numerical summary analysis and qualitative thematic analysis); 1b. Reporting the results and producing the outcome that refers to the overall purpose or research question; 1c. Consider the meaning of the findings as they relate to the overall study purpose; discuss implications for future research, practice and policy. |
#6 Consultation | 1. This stage is optional. 2. Lack of clarity exists about when, how and why to consult with stakeholders and how to integrate the information with study findings. | 1. Consultation should be an essential component of scoping study methodology. 2a. Clearly establish a purpose for the consultation. 2b. Preliminary findings can be used as a foundation to inform the consultation. 2c. Clearly articulate the type of stakeholders to consult and how data will be collected, analyzed, reported and integrated within the overall study outcome. 2d. Incorporate opportunities for knowledge transfer and exchange with stakeholders in the field. |
Scoping study research questions are broad in nature as the focus is on summarizing breadth of evidence. Arksey and O'Malley [6] acknowledge the need to maintain a broad scope to research questions; however, we found our research questions lacked the direction, clarity, and focus needed to inform subsequent stages of the research process, such as identifying studies and making decisions about study inclusion. To clarify this stage, we recommend that researchers combine a broad research question with a clearly articulated scope of inquiry. This includes defining the concept, target population, and health outcomes of interest to clarify the focus of the scoping study and establish an effective search strategy. For example, in one author's (KO) scoping study, the research question was broadly 'what is known about HIV and rehabilitation?' Defining the concept of 'rehabilitation' was essential in order to establish a clear scope to the study, guide the search strategy, and establish parameters around study selection in subsequent stages of the process [7].
Although Arksey and O'Malley [6] outline four main purposes for undertaking a scoping study, they do not tie the specification of purpose to any particular framework stage. We recommend that researchers consider the purpose of the scoping study at the same time as they articulate the research question. Linking a clear purpose for undertaking a scoping study to a well-defined research question at the first stage of the framework will help to provide a clear rationale for completing the study and facilitate decision making about study selection and data extraction later in the methodological process. A helpful strategy is to envision the content and format of the intended outcome, which can assist researchers to clearly determine the purpose at the beginning of a study. In the abovementioned HIV study, the authors linked the broadly stated research question with a more specific purpose: 'to identify the key research priorities in HIV and rehabilitation to advance policy and practice for people living with HIV in Canada' [7]. The envisioned outcome was a thematic framework that represented strengths and opportunities in HIV rehabilitation research, followed by a list of the key research priorities to pursue in future work.
Finally, the purposes put forth by Arksey and O'Malley [6] require more debate. We concur with Anderson et al. [2] and Davis et al. [1], who state that researchers may benefit from further clarification of the purposes for undertaking a scoping study. The first purpose, as articulated by Arksey and O'Malley [6], is to summarize the extent, range, and nature of research activity; however, researchers are not required to reflect on their underlying motivation for doing so. We recommend that researchers consider the rationale for why they should summarize the activity in a field and the implications that this will have on research, practice, or policy. The second purpose is to assess the need for a full systematic review. However, it is difficult to determine whether a systematic review is advantageous when a scoping study does not involve methodological quality assessment of included studies. Furthermore, it is unclear how this purpose differs from existing methods of determining feasibility for a systematic review. The third purpose is to summarize and disseminate research findings, but we question how this differs from other narrative or systematic literature reviews. Lastly, the fourth purpose of undertaking a scoping study -- to identify gaps in the existing literature -- may yield false conclusions about the nature and extent of those gaps if the quality of the evidence is not assessed. The purpose 'to identify the key research priorities in HIV and rehabilitation to advance policy and practice for people living with HIV in Canada' does not explicitly align with one of the four Arksey and O'Malley purposes [7]. However, it appears the authors first summarized the extent, range, and nature of research (purpose one) and identified gaps in the existing literature (purpose four) in order to subsequently identify the key research priorities in HIV and rehabilitation (the authors' own purpose). This suggests that authors may pursue an overall study purpose through multiple objectives, drawn from those articulated by Arksey and O'Malley, that help achieve that overall purpose.
A strength of scoping studies includes the breadth and depth, or comprehensiveness, of evidence covered in a given field [1]. However, practical issues related to time, funding, and access to resources often require researchers to consider the balance between feasibility, breadth, and comprehensiveness. Brien et al. [5] reported that their search strategy yielded a vast amount of literature, making it difficult to determine how in depth to carry out the information synthesis. Although Arksey and O'Malley [6] identify these concerns and provide some suggestions to support these decisions, we also struggled with the trade-off between breadth and comprehensiveness and feasibility in our scoping studies. As such, we recommend that researchers ensure decisions surrounding feasibility do not compromise their ability to answer the research question or achieve the study purpose. Second, we recommend that a scoping study team be assembled whose members provide the methodological and content expertise needed for decisions regarding breadth and comprehensiveness. When limiting scope is unavoidable, researchers should justify their decisions and acknowledge the potential limitations of their study.
Arksey and O'Malley [6] provide suggestions to manage the time-consuming process of determining which studies to include in a scoping study. We found this stage to be more iterative, and to require more steps, than the original framework implies. While Arksey and O'Malley [6] do not indicate that a team approach is imperative, we agree with others and suggest that scoping studies involve multidisciplinary teams using a transparent and replicable process [2, 10]. In two of our studies (HC and DL), where decision making was primarily completed by a single author, we faced several challenges, including uncertainty about which studies to include, which variables to extract on the data-charting form, and the nature and extent of detail required in the data extraction process. This raised questions related to rigor and led to our recommendations for undertaking a systematic team approach to conducting a scoping study.
Specifically, we recommend that the team meet to discuss decisions surrounding study inclusion and exclusion at the beginning of the scoping process. Refining the search strategy based on abstracts retrieved from the search and reviewing full articles for study inclusion are also critical steps. We recommend that at least two researchers independently review the abstracts yielded by the search strategy for study selection. Reviewers should meet at the beginning, midpoint, and final stages of the abstract review process to discuss any challenges or uncertainties related to study selection and to go back and refine the search strategy if needed. This can help to alleviate potential ambiguity with a broad research question and to ensure that abstracts selected are relevant for full article review. Next, two reviewers should independently review the full articles for inclusion. When disagreements occur, a third reviewer can be consulted to determine final inclusion.
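The framework does not prescribe how reviewer decisions should be reconciled or how agreement should be quantified. The following minimal Python sketch, with hypothetical data, shows one way to flag abstracts for third-reviewer adjudication and to compute Cohen's kappa as an agreement check (kappa is our suggestion, not part of the original framework):

```python
from collections import Counter

def cohens_kappa(r1, r2):
    """Chance-corrected agreement between two reviewers' include/exclude calls."""
    n = len(r1)
    p_o = sum(a == b for a, b in zip(r1, r2)) / n                 # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    p_e = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / n ** 2  # agreement expected by chance
    return (p_o - p_e) / (1 - p_e)

# Hypothetical screening decisions for five abstracts.
reviewer1 = ["include", "exclude", "include", "include", "exclude"]
reviewer2 = ["include", "exclude", "exclude", "include", "exclude"]

# Disagreements go to a third reviewer for final adjudication.
conflicts = [i for i, (a, b) in enumerate(zip(reviewer1, reviewer2)) if a != b]
print(f"kappa = {cohens_kappa(reviewer1, reviewer2):.2f}; adjudicate abstracts {conflicts}")
```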
This stage involves extracting data from included studies. Based on our experiences, we were uncertain about the nature and extent of information to extract from the included studies. To clarify this stage, we recommend that the research team collectively develop the data-charting form to determine which variables to extract to help answer the research question. Secondly, we recommend that charting be considered an iterative process in which researchers continually update the data-charting form. This is particularly true for process-oriented data, such as understanding how a theory or model has been used within a study. Uncertainty about the nature and extent of data that should be extracted may be resolved by researchers beginning the charting process and becoming familiar with study data, and then meeting again to refine the form. We recommend an additional step to charting the data in which two researchers independently extract data from the first five to ten studies using the data-charting form and meet to determine whether their approach to data extraction is consistent with the research question and purpose. Researchers may review one study several times within this stage. The number of researchers involved in the data extraction process will likely depend upon the number of included studies. For example, in one study, the authors had difficulty developing one data-charting form that could apply to all included studies, which represented a range of study designs, reviews, reports, and commentaries [7]. As a preliminary step, the authors decided to classify the included studies into three areas -- HIV disability, interventions, and roles of rehabilitation professionals in HIV care -- to help determine the nature and extent of information to extract from each type of study [7].
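As a concrete illustration, a data-charting form can be represented as a structured record. The fields and example values below are hypothetical; in practice the team would choose, and iteratively revise, the variables needed to answer the research question:

```python
from dataclasses import dataclass, field

@dataclass
class ChartingRecord:
    # Illustrative charting variables only; refine iteratively as familiarity grows.
    citation: str
    publication_year: int
    country: str
    study_design: str              # e.g., cohort, RCT, qualitative, commentary
    population: str
    intervention: str = ""
    key_findings: str = ""
    process_notes: list = field(default_factory=list)  # process-oriented data

# Hypothetical example entry.
record = ChartingRecord("Smith 2008", 2008, "Canada", "cohort", "adults living with HIV")
record.process_notes.append("theory referenced but not explicitly applied")
```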
Arksey and O'Malley [6] refer to a 'descriptive analytical method' that involves summarizing process information, such as the use of a theory or model, in a meaningful format. Our experiences indicated that this is a highly valuable, though challenging, aspect of scoping studies, as we struggled to chart and summarize complex concepts in a meaningful way. Arksey and O'Malley [6] indicate that synthesis of material is critical, as scoping studies are not a short summary of many articles. We agree, and feel that additional direction in the framework might help to navigate this crucial but challenging stage. Synthesizing process information may benefit from the use of qualitative content analysis approaches to make sense of the wealth of extracted data [11]. This issue also highlights the overlap with the next analytical stage. The role and relevance of analyzing process data and using qualitative content analysis within scoping study methodology requires further discussion.
Stage five is the most extensive in the scoping process, yet it lacks detail in the Arksey and O'Malley framework. Scoping studies have been criticized for rarely providing methodological detail about how results were achieved [1]. We appreciate the importance of breaking the analysis phase into meaningful and systematic steps so that researchers can undertake scoping studies and report on findings in a rigorous manner. As a result, we recommend three distinct steps in framework stage five to increase the consistency with which researchers undertake and report scoping study methodology: analyzing the data, reporting results, and applying meaning to the results. As described in the existing framework, analysis (otherwise referred to as collating and summarizing) should involve a descriptive numerical summary and a thematic analysis. Arksey and O'Malley [6] describe the need to provide a descriptive numerical summary, stating that researchers should describe the characteristics of included studies, such as the overall number of studies included, types of study design, years of publication, types of interventions, characteristics of the study populations, and countries where studies were conducted. However, the description of thematic analysis requires additional detail to assist authors in understanding and completing this step. In our experience, this analytical stage resembled qualitative data analysis, and researchers may consider using qualitative content analytical techniques [10] and qualitative software to facilitate this process.
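Once charted data are exported, a short script can produce the descriptive numerical summary described above. This is a minimal sketch assuming a hypothetical charted_studies.csv with one row per included study and illustrative column names:

```python
import pandas as pd

df = pd.read_csv("charted_studies.csv")  # hypothetical export of the charting form

print(f"{len(df)} studies included")
print(df["study_design"].value_counts())                   # types of study design
print(df["publication_year"].value_counts().sort_index())  # years of publication
print(pd.crosstab(df["country"], df["study_design"]))      # countries by design
```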
Second, when reporting results, we recommend that researchers consider the best approach to stating the outcome or end product of the study and how the scoping study findings will be articulated to readers (e.g., through themes, a framework, or a table of strengths and gaps in the evidence). This product should be tied to the purpose of the scoping study as recommended in framework stage one.
Finally, in order to advance the legitimacy of scoping study methodology, we must consider the implications of findings within the broader context. As a result, we recommend that researchers consider the meaning of their scoping study results and the broader implications for research, policy, and practice. For example, for the question 'how are motor-learning strategies used within contemporary physical and occupational therapy intervention approaches for children with neuromotor conditions?', the author (DL) presented themes that described strategy use. Results yielded insights into how researchers should better describe interventions in their publications and provided further considerations for clinicians to make informed decisions about which therapeutic approach might best fit their clients' needs. Considering the overall implications of the results as an explicit framework stage will help to ensure that scoping study results have practical implications for future clinical practice, research, and policy. This recommendation leads to the final stage of the framework.
Arksey and O'Malley [6] suggest that consultation is an optional stage in conducting a scoping study. Although only one of our three scoping studies incorporated this stage, we argue that it adds methodological rigor and should be considered a required component. Arksey and O'Malley [6] suggest that the purposes of consulting with stakeholders are to offer additional sources of information, perspectives, meaning, and applicability to the scoping study. However, it is unclear when, how, and why to consult with stakeholders, and how to analyze and integrate these data with the findings. We recommend researchers clearly establish a purpose for the consultation, which may include sharing preliminary findings with stakeholders, validating the findings, or informing future research. We suggest researchers use preliminary findings from stage five (either in the form of a framework, themes, or list of findings) as a foundation from which to inform the consultation. This will enable stakeholders to build on the evidence and offer a higher level of meaning, content expertise, and perspective to the preliminary findings. We also recommend that researchers clearly articulate the type of stakeholders with whom they wish to consult, how they will collect the data (e.g., focus groups, interviews, surveys), and how these data will be analyzed, reported, and integrated within the overall study outcome.
Finally, given that consultation requires researchers to orient stakeholders on the scoping study purpose, research question, preliminary findings, and plans for dissemination, we recommend that this stage additionally be considered a knowledge transfer mechanism. This may address Brien et al.'s [5] concern about the usefulness of scoping studies for stakeholders and how to translate knowledge about scoping studies. Given the importance of knowledge transfer and exchange in the uptake of research evidence [12, 13], the consultation stage can be used to specifically translate the preliminary scoping study findings and develop effective dissemination strategies with stakeholders in the field, offering additional value to a scoping study.
One scoping study included a consultation phase comprising focus groups and interviews with 28 stakeholders, including people living with HIV, researchers, educators, clinicians, and policy makers [7]. The authors shared preliminary findings from the literature review phase of the scoping study with stakeholders and asked whether they could identify any additional emerging issues related to HIV and rehabilitation not yet published in the evidence. The team proceeded to conduct a second consultation with 17 new and returning stakeholders, in which the team presented a preliminary framework of HIV and rehabilitation research and stakeholders refined the framework to further identify six key research priorities on HIV and rehabilitation. This series of consultations engaged community members in the development of the study outcome and provided opportunities for knowledge transfer about HIV and rehabilitation research. This process offered an ideal mechanism to enhance the validity of the study outcome while translating findings with the community. Nevertheless, further development of steps for undertaking knowledge translation as a part of the scoping study framework is required.
Scoping study terminology.
Discrepancies in nomenclature between 'scoping reviews,' 'scoping studies,' 'scoping literature reviews,' and 'scoping exercises' lead to confusion. Despite our collective use of the Arksey and O'Malley framework, two authors (DL, HC) titled their studies 'scoping reviews' while the other used 'scoping study.' In this paper, we use 'scoping studies' for consistency with Arksey and O'Malley's original framework. Nevertheless, the potential differences (if any) among the terms merit clarification. The lack of a universal definition for scoping studies is also problematic for researchers trying to clearly articulate their reasons for undertaking a scoping study. Finally, we advocate for labeling the methodology as the 'Arksey and O'Malley framework' to provide consistency for future use.
Another consideration for scoping study methodology is the potential need to assess included studies for methodological quality. Brien et al. [5] state that the lack of quality assessment makes the results of scoping studies more challenging to interpret. Grant and Booth [4] imply that a lack of quality assessment limits the uptake of scoping study findings into policy and practice. While our research questions did not directly relate to the quality assessment debate, we recognize the challenges in assessing quality among the vast range of published and grey literature that may be included in scoping studies. This also raises the question of whether and how evidence from stakeholder consultation is evaluated in the scoping study process. It remains unclear whether the lack of quality assessment impacts the uptake and relevance of scoping study findings.
A final consideration for legitimization of scoping study methodology includes the development of a critical appraisal tool for scoping study quality [5]. Anderson et al. [2] offer criteria for assessing the value and utility of a commissioned scoping study in health policy contexts, but these criteria are not necessarily applicable to scoping studies in other areas of health research. Developing a critical appraisal tool would require the elements of a methodologically rigorous scoping study to be defined. This could include, but would not be limited to, the minimum level of analysis required and the requirements for reporting results. Overall, the issues surrounding quality assessment of included studies and subsequent scoping studies require further discussion.
This paper responds to Arksey and O'Malley's [6] request for feedback on their proposed methodological framework. However, the recommendations that we propose are derived from our subjective experiences undertaking scoping studies of varying sizes in the rehabilitation field, and we recognize that they may not represent the opinions of all scoping study authors. Other than our individual experiences with our own studies, we have not yet implemented the full framework recommendations. Hence, readers can determine how strongly to interpret and implement these recommendations in their scoping study research. We invite others to trial our recommendations and continue the process of refining and improving this methodology.
Scoping studies present an increasingly popular option for synthesizing health evidence. Brien et al. [5] argue that guidelines are required to facilitate scoping review reporting and transparency. In this paper, we build on the existing methodological framework for scoping studies outlined by Arksey and O'Malley [6] and provide recommendations to clarify and enhance each stage, which may increase the consistency with which researchers undertake and report scoping studies. Recommendations include: clarifying and linking the purpose and research question; balancing feasibility with breadth and comprehensiveness of the scoping process; using an iterative team approach to selecting studies and extracting data; incorporating a numerical summary and qualitative thematic analysis; identifying the implications of the study findings for policy, practice, or research; and adopting consultation as a required component of scoping study methodology. Ongoing considerations include: establishing a commonly accepted definition and purpose(s) of scoping studies; defining methodological rigor for the assessment of scoping study quality; debating the need for quality assessment of included studies; and formalizing knowledge translation as a required element of scoping methodology. Continued debate and development of scoping study methodology will help to maximize the usefulness of scoping study findings within healthcare research and practice.
The authors declare that they have no competing interests.
DL and HC conceived of this paper. DL undertook the literature review process. DL, HC and KO developed challenges and recommendations. All authors drafted the manuscript. All authors read and approved the final manuscript.
DL is a physical therapist and doctoral candidate in the School of Rehabilitation Science at McMaster University. HC is an occupational therapist and doctoral candidate in the School of Rehabilitation Science at McMaster University. KO is a clinical epidemiologist, physical therapist, and postdoctoral fellow in the School of Rehabilitation Science at McMaster University. She is also a Lecturer in the Department of Physical Therapy at the University of Toronto.
DL is supported by a Doctoral Award from the Canadian Child Health Clinician Scientist Program, a strategic training initiative of the Canadian Institutes of Health Research (CIHR), and the McMaster Child Health Research Institute. HC is supported by a Doctoral Award from the CIHR, the CIHR Quality of Life Strategic Training Program in Rehabilitation Research and the Canadian Occupational Therapy Foundation. KO is supported by a Fellowship from the CIHR, HIV/AIDS Research Program and a Michael DeGroote Postdoctoral Fellowship (McMaster University). The authors acknowledge the helpful feedback of Dr. Cheryl Missiuna on an earlier draft of this manuscript.
When you’re working on your first piece of academic research, there are many different things to focus on, and it can be overwhelming to stay on top of everything. This is especially true of budding or inexperienced researchers.
If you’ve never put together a research proposal before or find yourself in a position where you need to explain your research methodology decisions, there are a few things you need to be aware of.
Once you understand the ins and outs, handling academic research in the future will be less intimidating. We break down the basics below:
A research methodology encompasses the way in which you intend to carry out your research. This includes how you plan to tackle things like collection methods, statistical analysis, participant observations, and more.
You can think of your research methodology as being a formula. One part will be how you plan on putting your research into practice, and another will be why you feel this is the best way to approach it. Your research methodology is ultimately a systematic plan to resolve your research problem.
In short, you are explaining how you will take your idea and turn it into a study, which in turn will produce valid and reliable results that are in accordance with the aims and objectives of your research. This is true whether your paper plans to make use of qualitative methods or quantitative methods.
The purpose of a research methodology is to explain the reasoning behind your approach to your research: you'll need to support your collection methods, methods of analysis, and other key points of your work.
Think of it like writing a plan or an outline for what you intend to do.
When carrying out research, it can be easy to go off-track or depart from your standard methodology.
Tip: Having a methodology keeps you accountable and on track with your original aims and objectives, and gives you a suitable and sound plan to keep your project manageable, smooth, and effective.
With all that said, how do you write out your standard approach to a research methodology?
As a general plan, your methodology should include the following information:
In any dissertation, thesis, or academic journal, you will always find a chapter dedicated to explaining the research methodology of the person who carried out the study, also referred to as the methodology section of the work.
A good research methodology will explain what you are going to do and why, while a poor methodology will lead to a messy or disorganized approach.
In this section, you should also justify your reasoning for carrying out your research in a particular way, especially if your method is unusual or unconventional.
Having a sound methodology in place can also help you with the following:
A research instrument is a tool you will use to help you collect, measure and analyze the data you use as part of your research.
The choice of research instrument will usually be yours to make as the researcher and will be whichever best suits your methodology.
There are many different research instruments you can use in collecting data for your research.
Generally, they can be grouped as follows:
These are the most common ways of carrying out research, but it is really dependent on your needs as a researcher and what approach you think is best to take.
It is also possible to combine a number of research instruments if this is necessary and appropriate in answering your research problem.
There are three different types of methodologies, and they are distinguished by whether they focus on words, numbers, or both.
Data type | What is it? | Methodology |
---|---|---|
Quantitative | This methodology focuses on measuring and testing numerical data. | Surveys, tests, existing databases. |
Qualitative | Qualitative research is a process of collecting and analyzing non-numerical data, such as words and other textual material. | Observations, interviews, focus groups. |
Mixed-method | A mixed-method approach combines both of the above approaches. | Using a mixed method can produce notably rich results, because it provides data that are both precise and exploratory at the same time. |
➡️ Want to learn more about the differences between qualitative and quantitative research, and how to use both methods? Check out our guide for that!
If you've done your due diligence, you'll have an idea of which methodology approach is best suited to your research.
It’s likely that you will have carried out considerable reading and homework before you reach this point and you may have taken inspiration from other similar studies that have yielded good results.
Still, it is important to consider different options before setting your research in stone. Exploring different options available will help you to explain why the choice you ultimately make is preferable to other methods.
If answering your research problem requires you to gather large volumes of numerical data to test hypotheses, a quantitative research method is likely to provide you with the most usable results.
If instead you’re looking to try and learn more about people, and their perception of events, your methodology is more exploratory in nature and would therefore probably be better served using a qualitative research methodology.
It helps to always bring things back to the question: what do I want to achieve with my research?
Once you have conducted your research, you need to analyze it. Here are some helpful guides for qualitative data analysis:
➡️ How to do a content analysis
➡️ How to do a thematic analysis
➡️ How to do a rhetorical analysis
Research methodology refers to the techniques used to find and analyze information for a study, ensuring that the results are valid and reliable and that they address the research objective.
Data can typically be organized into four different categories or methods: observational, experimental, simulation, and derived.
Writing a methodology section is a process of introducing your methods and instruments, discussing your analysis, providing more background information, addressing your research limitations, and more.
Your research methodology section will need a clear research question and proposed research approach. You'll need to add a background, introduce your research question, write your methodology, and add the works you cited during your data collection phase.
The research methodology section of your study will indicate how valid your findings are and how well-informed your paper is. It also assists future researchers planning to use the same methodology, who want to cite your study or replicate it.
Delimitations in Research – Types, Examples and Writing Guide
Definition:
Delimitations refer to the specific boundaries or limitations that are set in a research study in order to narrow its scope and focus. Delimitations may be related to a variety of factors, including the population being studied, the geographical location, the time period, the research design, and the methods or tools being used to collect data.
Here are some reasons why delimitations are important in research studies:
Here are some types of delimitations in research and their significance:
Time delimitations refer to the time frame in which the research will be conducted. They are important because they help to narrow down the scope of the study and ensure that the research is feasible within the given time constraints.
Geographical delimitations refer to the geographic boundaries within which the research will be conducted. These delimitations are significant because they help to ensure that the research is relevant to the intended population or location.
Population delimitations refer to the specific group of people that the research will focus on. These delimitations are important because they help to ensure that the research is targeted to a specific group, which can improve the accuracy of the results.
Data delimitations refer to the specific types of data that will be used in the research. These delimitations are important because they help to ensure that the data is relevant to the research question and that the research is conducted using reliable and valid data sources.
Scope delimitations refer to the specific aspects or dimensions of the research that will be examined. These delimitations are important because they help to ensure that the research is focused and that the findings are relevant to the research question.
In order to write delimitations in research, you can follow these steps:
Here are some situations when you may need to write delimitations in research:
Examples of Delimitations in Research are as follows:
Research Title: “Impact of Artificial Intelligence on Cybersecurity Threat Detection”
Delimitations:
Research Title: “The Effects of Social Media on Academic Performance: A Case Study of College Students”
Delimitations:
Some Purposes of Delimitations are as follows:
Here are some common applications of delimitations:
Some Advantages of Delimitations are as follows:
Due to the need for generalizable and rapidly delivered evidence to inform healthcare decision-making, real-world data have grown increasingly important to answer causal questions. However, causal inference using observational data poses numerous challenges, and relevant methodological literature is vast. We endeavored to identify underlying unifying themes of causal inference using real-world healthcare data and connect them into a single schema to aid in observational study design, and to demonstrate this schema using a previously published research example. A multidisciplinary team (epidemiology, biostatistics, health economics) reviewed the literature related to causal inference and observational data to identify key concepts. A visual guide to causal study design was developed to concisely and clearly illustrate how the concepts are conceptually related to one another. A case study was selected to demonstrate an application of the guide. An eight-step guide to causal study design was created, integrating essential concepts from the literature, anchored into conceptual groupings according to natural steps in the study design process. The steps include defining the causal research question and the estimand; creating a directed acyclic graph; identifying biases and design and analytic techniques to mitigate their effect, and techniques to examine the robustness of findings. The cardiovascular case study demonstrates the applicability of the steps to developing a research plan. This paper used an existing study to demonstrate the relevance of the guide. We encourage researchers to incorporate this guide at the study design stage in order to elevate the quality of future real-world evidence.
Approximately 50 new drugs are approved each year in the United States (Mullard 2022). For all new drugs, randomized controlled trials (RCTs) are the gold standard by which potential effectiveness (“efficacy”) and safety are established. However, RCTs cannot guarantee how a drug will perform in a less controlled context. For this reason, regulators frequently require observational, post-approval studies using “real-world” data, sometimes even as a condition of drug approval. The “real-world” data requested by regulators are often derived from insurance claims databases and/or healthcare records. Importantly, these data are recorded during routine clinical care without concern for potential use in research. Yet, in recent years, there has been increasing use of such data for causal inference and regulatory decision making, presenting a variety of methodologic challenges for researchers and stakeholders to consider (Arlett et al. 2022; Berger et al. 2017; Concato and ElZarrad 2022; Cox et al. 2009; European Medicines Agency 2023; Franklin and Schneeweiss 2017; Girman et al. 2014; Hernán and Robins 2016; International Society for Pharmacoeconomics and Outcomes Research (ISPOR) 2022; International Society for Pharmacoepidemiology (ISPE) 2020; Stuart et al. 2013; U.S. Food and Drug Administration 2018; Velentgas et al. 2013).
Current guidance for causal inference using observational healthcare data articulates the need for careful study design (Berger et al. 2017; Cox et al. 2009; European Medicines Agency 2023; Girman et al. 2014; Hernán and Robins 2016; Stuart et al. 2013; Velentgas et al. 2013). In 2009, Cox et al. described common sources of bias in observational data and recommended specific strategies to mitigate these biases (Cox et al. 2009). In 2013, Stuart et al. emphasized counterfactual theory and trial emulation, offered several approaches to address unmeasured confounding, and provided guidance on the use of propensity scores to balance confounding covariates (Stuart et al. 2013). In 2013, the Agency for Healthcare Research and Quality (AHRQ) released an extensive, 200-page guide to developing a protocol for comparative effectiveness research using observational data (Velentgas et al. 2013). The guide emphasized development of the research question, with additional chapters on study design, comparator selection, sensitivity analyses, and directed acyclic graphs (Velentgas et al. 2013). In 2014, Girman et al. provided a clear set of steps for assessing study feasibility, including examination of the appropriateness of the data for the research question (i.e., ‘fit-for-purpose’), empirical equipoise, and interpretability, stating that comparative effectiveness research using observational data “should be designed with the goal of drawing a causal inference” (Girman et al. 2014). In 2017, Berger et al. described aspects of “study hygiene,” focusing on procedural practices to enhance confidence in, and credibility of, real-world data studies (Berger et al. 2017). Currently, the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP) maintains a guide on methodological standards in pharmacoepidemiology which discusses causal inference using observational data and includes an overview of study designs, a chapter on methods to address bias and confounding, and guidance on writing statistical analysis plans (European Medicines Agency 2023). In addition to these resources, the “target trial framework” provides a structured approach to planning studies for causal inferences from observational databases (Hernán and Robins 2016; Wang et al. 2023b). This framework, published in 2016, encourages researchers to first imagine a clinical trial for the study question of interest and then design the observational study to reflect the hypothetical trial (Hernán and Robins 2016).
While the literature addresses critical issues collectively, there remains a need for a framework that puts key components, including the target trial approach, into a simple, overarching schema (Loveless 2022) so they can be more easily remembered and communicated to all stakeholders, including (new) researchers, peer-reviewers, and other users of the research findings (e.g., practicing providers, professional clinical societies, regulators). For this reason, we created a step-by-step guide for causal inference using administrative health data, which aims to integrate these various best practices at a high level and complements existing, more specific guidance, including that from the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) and the International Society for Pharmacoepidemiology (ISPE) (Berger et al. 2017; Cox et al. 2009; Girman et al. 2014). We demonstrate the application of this schema using a previously published paper in cardiovascular research.
This work involved a formative phase and an implementation phase to evaluate the utility of the causal guide. In the formative phase, a multidisciplinary team with research expertise in epidemiology, biostatistics, and health economics reviewed selected literature (peer-reviewed publications, including those mentioned in the introduction, as well as graduate-level textbooks) related to causal inference and observational healthcare data from the pharmacoepidemiologic and pharmacoeconomic perspectives. The potential outcomes framework served as the foundation for our conception of causal inference (Rubin 2005). Information was grouped into the following four concepts: (1) Defining the Research Question; (2) Defining the Estimand; (3) Identifying and Mitigating Biases; (4) Sensitivity Analysis. A step-by-step guide to causal study design was developed to distill the essential elements of each concept, organizing them into a single schema so that the concepts are clearly related to one another. References for each step of the schema are included in the Supplemental Table.
In the implementation phase we tested the application of the causal guide to previously published work (Dondo et al. 2017). The previously published work utilized data from the Myocardial Ischaemia National Audit Project (MINAP), the United Kingdom’s national heart attack register. The goal of the study was to assess the effect of β-blockers on all-cause mortality among patients hospitalized for acute myocardial infarction without heart failure or left ventricular systolic dysfunction. We selected this paper for the case study because of its clear descriptions of the research goal and methods, and the explicit and methodical consideration of potential biases and use of sensitivity analyses to examine the robustness of the main findings.
The step-by-step guide to causal inference comprises eight distinct steps (Fig. 1) across the four concepts. As scientific inquiry and study design are iterative processes, the various steps may be completed in a different order than shown, and steps may be revisited.
A step-by-step guide for causal study design
Abbreviations: GEE: generalized estimating equations; IPC/TW: inverse probability of censoring/treatment weighting; ITR: individual treatment response; MSM: marginal structural model; TE: treatment effect
Please refer to the Supplemental Table for references providing more in-depth information.
1 Ensure that the exposure and outcome are well-defined based on literature and expert opinion.
2 More specifically, measures of association are not affected by issues such as confounding and selection bias because they do not intend to isolate and quantify a single causal pathway. However, information bias (e.g., variable misclassification) can negatively affect association estimates, and association estimates remain subject to random variability (and are hence reported with confidence intervals).
3 This list is not exhaustive; it focuses on frequently encountered biases.
4 To assess bias in a nonrandomized study following the target trial framework, use of the ROBINS-I tool is recommended ( https://www.bmj.com/content/355/bmj.i4919 ).
5 Only a selection of the most popular approaches is presented here. Other methods exist; e.g., g-computation and g-estimation for both time-invariant and time-varying analysis; instrumental variables; and doubly-robust estimation methods. There are also program evaluation methods (e.g., difference-in-differences, regression discontinuities) that can be applied to pharmacoepidemiologic questions. Conventional outcome regression analysis is not recommended for causal estimation due to issues determining covariate balance, correct model specification, and interpretability of effect estimates.
6 Online tools include, among others, an E-value calculator for unmeasured confounding (https://www.evalue-calculator.com/) and the P95 outcome misclassification estimator (http://apps.p-95.com/ISPE/).
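For context on note 6: the E-value referenced there is computed from the published formula of VanderWeele and Ding, shown below for an observed risk ratio RR ≥ 1 (for RR < 1, the reciprocal is taken first). It gives the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need with both exposure and outcome to fully explain away the estimate:

```latex
\text{E-value} = RR + \sqrt{RR \times (RR - 1)}
```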
The process of designing a study begins with defining the research question. Research questions typically center on whether a causal relationship exists between an exposure and an outcome. This contrasts with associative questions, which, by their nature, do not require causal study design elements because they do not attempt to isolate a causal pathway from a single exposure to an outcome under study. It is important to note that the phrasing of the question itself should clarify whether an association or a causal relationship is of interest. The study question “Does statin use reduce the risk of future cardiovascular events?” is explicitly causal and requires that the study design addresses biases such as confounding. In contrast, the study question “Is statin use associated with a reduced risk of future cardiovascular events?” can be answered without control of confounding since the word “association” implies correlation. Too often, however, researchers use the word “association” to describe their findings when their methods were created to address explicitly causal questions (Hernán 2018). For example, a study that uses propensity score-based methods to balance risk factors between treatment groups is explicitly attempting to isolate a causal pathway by removing confounding factors. This is different from a study that intends only to measure an association. In fact, some journals may require that the word “association” be used when causal language would be more appropriate; however, this is beginning to change (Flanagin et al. 2024).
The estimand is the causal effect of research interest and is described in terms of required design elements: the target population for the counterfactual contrast, the kind of effect, and the effect/outcome measure.
In Step 2, the study team determines the target population of interest, which depends on the research question. For example, we may want to estimate the effect of the treatment in the entire study population, i.e., the hypothetical contrast between all study patients taking the drug of interest versus all study patients taking the comparator (the average treatment effect; ATE). Other effects can be examined, including the average treatment effect in the treated or untreated (ATT or ATU). When covariate distributions are the same across the treated and untreated populations and there is no effect modification by covariates, these effects are generally the same (Wang et al. 2017). In RCTs, this occurs naturally due to randomization, but in non-randomized data, careful study design and statistical methods must be used to mitigate confounding bias.
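In the potential-outcomes notation that the guide builds on (Rubin 2005), these estimands can be written as follows, where Y^a denotes the outcome that would occur under treatment level a, and A is the treatment actually received:

```latex
\mathrm{ATE} = \mathbb{E}\left[Y^{a=1} - Y^{a=0}\right], \qquad
\mathrm{ATT} = \mathbb{E}\left[Y^{a=1} - Y^{a=0} \mid A = 1\right], \qquad
\mathrm{ATU} = \mathbb{E}\left[Y^{a=1} - Y^{a=0} \mid A = 0\right]
```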
In Step 3, the study team decides whether to measure the intention-to-treat (ITT), per-protocol, or as-treated effect. The ITT approach is also known as “first-treatment-carried-forward” in the observational literature (Lund et al. 2015). In trials, the ITT measures the effect of treatment assignment rather than the treatment itself, and in observational data the ITT can be conceptualized as measuring the effect of treatment as started. To compute the ITT effect from observational data, patients are placed into the exposure group corresponding to the treatment that they initiate, and treatment switching or discontinuation are purposely ignored in the analysis. Alternatively, a per-protocol effect can be measured from observational data by classifying patients according to the treatment that they initiated but censoring them when they stop, switch, or otherwise change treatment (Danaei et al. 2013; Yang et al. 2014). Finally, “as-treated” effects are estimated from observational data by classifying patients according to their actual treatment exposure during follow-up, for example by using multiple time windows to measure exposure changes (Danaei et al. 2013; Yang et al. 2014).
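As a minimal sketch, with hypothetical column names and values, the same table of treatment episodes can yield both the ITT and per-protocol definitions described above:

```python
import pandas as pd

episodes = pd.DataFrame({
    "patient": [1, 2, 3],
    "initiated": ["drug_A", "drug_A", "drug_B"],  # first treatment started
    "deviation_day": [200.0, None, 90.0],         # first stop/switch, if any
    "end_of_followup_day": [365, 365, 365],
})

episodes["itt_group"] = episodes["initiated"]  # ITT: classify by treatment as started
# Per-protocol: same groups, but censor follow-up at the first deviation.
episodes["pp_censor_day"] = episodes["deviation_day"].fillna(
    episodes["end_of_followup_day"]
)
print(episodes[["patient", "itt_group", "pp_censor_day"]])
```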
Step 4 is the final step in specifying the estimand, in which the research team determines the effect measure of interest. This determination has two parts. First, the team must consider how the outcome of interest will be measured. Risks, rates, hazards, odds, and costs are common ways of measuring outcomes, and each measure may be best suited to a particular scenario. For example, risks assume patients across comparison groups have equal follow-up time, while rates allow for variable follow-up time (Rothman et al. 2008). Costs may be of interest in studies focused on economic outcomes, including as inputs to cost-effectiveness analyses. After deciding how the outcome will be measured, it is necessary to consider whether the resulting quantity will be compared across groups using a ratio or a difference. Ratios convey the effect of exposure in a way that is easy to understand, but they do not provide an estimate of how many patients will be affected. Differences, on the other hand, provide a clearer estimate of the potential public health impact of exposure, for example by allowing the calculation of the number of patients that must be treated to cause or prevent one instance of the outcome of interest (Tripepi et al. 2007).
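The following arithmetic sketch (with invented counts) shows how the same two-by-two table yields a ratio measure, a difference measure, and the number needed to treat:

```python
exposed_events, exposed_n     = 30, 1_000
unexposed_events, unexposed_n = 50, 1_000

risk_exposed   = exposed_events / exposed_n        # 0.030 (risks assume equal follow-up)
risk_unexposed = unexposed_events / unexposed_n    # 0.050

risk_ratio      = risk_exposed / risk_unexposed    # 0.60: relative effect
risk_difference = risk_exposed - risk_unexposed    # -0.020: absolute effect
nnt             = 1 / abs(risk_difference)         # 50 treated per event prevented

print(f"RR={risk_ratio:.2f}  RD={risk_difference:.3f}  NNT={nnt:.0f}")
```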
Observational, real-world studies can be subject to multiple potential sources of bias, which can be grouped into confounding, selection, measurement, and time-related biases (Prada-Ramallal et al. 2019).
In Step 5, as a practical first approach to developing strategies that address threats to causal inference, researchers should create a visual map of the factors that may be related to the exposure, the outcome, or both; this is a directed acyclic graph, or DAG (Pearl 1995). While creating a high-quality DAG can be challenging, guidance is increasingly available to facilitate the process (Ferguson et al. 2020; Gatto et al. 2022; Hernán and Robins 2020; Rodrigues et al. 2022; Sauer 2013). The types of inter-variable relationships depicted by DAGs include confounders, colliders, and mediators. Confounders are variables that affect both exposure and outcome, and it is necessary to control for them in order to isolate the causal pathway of interest. Colliders are variables affected by two other variables, such as the exposure and the outcome (Griffith et al. 2020); colliders should not be conditioned on, since doing so distorts the association between exposure and outcome. Mediators are variables that are affected by the exposure and go on to affect the outcome. As such, mediators lie on the causal pathway between exposure and outcome and should also not be conditioned on; otherwise, a path between exposure and outcome is closed and the total effect of the exposure on the outcome cannot be estimated. Mediation analysis is a separate type of analysis that aims to distinguish between direct and indirect (mediated) effects of exposure on outcome and may be applied in certain cases (Richiardi et al. 2013). Overall, the process of creating a DAG can yield valuable insights about the hypothesized underlying data-generating process and the biases that are likely to be encountered (Digitale et al. 2022). Finally, an extension of DAGs that incorporates counterfactual theory is available in the form of Single World Intervention Graphs (SWIGs), as described in a 2013 primer (Richardson and Robins 2013).
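As an illustration, the toy DAG below for the statin example (our construction, assuming the networkx package) encodes one variable of each type; the direct-edge checks are a simplification of full path-based (d-separation) reasoning but suffice for this three-role example:

```python
import networkx as nx

g = nx.DiGraph([
    ("age", "statin_use"), ("age", "cv_event"),                          # common cause
    ("statin_use", "ldl"), ("ldl", "cv_event"),                          # causal pathway
    ("statin_use", "study_selection"), ("cv_event", "study_selection"),  # common effect
])

exposure, outcome = "statin_use", "cv_event"
for v in sorted(set(g) - {exposure, outcome}):
    if g.has_edge(v, exposure) and g.has_edge(v, outcome):
        print(v, "-> confounder: adjust for it")
    if g.has_edge(exposure, v) and g.has_edge(v, outcome):
        print(v, "-> mediator: do not condition on it when estimating the total effect")
    if g.has_edge(exposure, v) and g.has_edge(outcome, v):
        print(v, "-> collider: do not condition on it")
```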
In Step 6, researchers comprehensively assess the possibility of different types of bias in their study, above and beyond what the creation of the DAG reveals. Many potential biases have been identified and summarized in the literature (Berger et al. 2017; Cox et al. 2009; European Medicines Agency 2023; Girman et al. 2014; Stuart et al. 2013; Velentgas et al. 2013). Every study can be subject to one or more biases, each of which can be addressed using one or more methods. The study team should thoroughly and explicitly identify all possible biases, with consideration for the specifics of the available data and the nuances of the population and health care system(s) from which the data arise. Once the potential biases are identified and listed, the team can consider potential solutions using a variety of study design and analytic techniques.
In Step 7, the study team considers solutions to the biases identified in Step 6. “Target trial” thinking serves as the basis for many of these solutions by requiring researchers to consider how observational studies can be designed to ensure comparison groups are similar and produce valid inferences by emulating RCTs (Labrecque and Swanson 2017; Wang et al. 2023b). Designing studies to include only new users of a drug and an active comparator group is one way of increasing the similarity of patients across both groups, particularly in terms of treatment history. Careful attention must be paid to the specification of the time periods and their relationship to inclusion/exclusion criteria (Suissa and Dell’Aniello 2020). For instance, if a drug is used intermittently, a longer wash-out period is needed to ensure adequate capture of prior use and avoid bias (Riis et al. 2015). The study team should also consider how to approach confounding adjustment, and whether both time-invariant and time-varying confounding may be present. Many methods have been developed to address these potential biases and improve causal estimation from observational data, and several of them, such as propensity score estimation, can be enhanced by machine learning (Athey and Imbens 2019; Belthangady et al. 2021; Mai et al. 2022; Onasanya et al. 2024; Schuler and Rose 2017; Westreich et al. 2010). Machine learning has many potential applications in the causal inference discipline and, like other tools, must be used with careful planning and intentionality. To aid in the assessment of potential biases, especially time-related ones, and in the development of a plan to address them, the study design should be visualized (Gatto et al. 2022; Schneeweiss et al. 2019). We also note the opportunity for collaboration across research disciplines, e.g., the application of difference-in-differences methods (Zhou et al. 2016) to the estimation of comparative drug effectiveness and safety.
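As one concrete instance of these solutions, the sketch below (on simulated data; not the implementation of any study cited here) estimates propensity scores with logistic regression, applies score trimming, and computes an inverse probability of treatment weighted (IPTW) estimate of the ATE. A flexible learner such as gradient boosting could be swapped in for the logistic regression, which is one of the entry points for machine learning mentioned above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def iptw_ate(X, treated, y, trim=(0.1, 0.9)):
    """IPTW estimate of the ATE with optional propensity score trimming."""
    ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]
    keep = (ps > trim[0]) & (ps < trim[1])     # drop extreme scores (cf. Crump et al. 2009)
    ps, treated, y = ps[keep], treated[keep], y[keep]
    w = np.where(treated == 1, 1 / ps, 1 / (1 - ps))
    return (np.average(y[treated == 1], weights=w[treated == 1])
            - np.average(y[treated == 0], weights=w[treated == 0]))

rng = np.random.default_rng(1)
X = rng.normal(size=(5_000, 3))                                  # measured confounders
treated = rng.binomial(1, 1 / (1 + np.exp(-X @ np.array([0.8, -0.5, 0.3]))))
y = X @ np.array([1.0, 1.0, 0.0]) + 2.0 * treated + rng.normal(size=5_000)
print(iptw_ate(X, treated, y))                                   # close to the true effect of 2.0
```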
Causal study design concludes with Step 8, which includes planning quality control and sensitivity analyses to improve the internal validity of the study. Quality control begins with reviewing study output for prima facie validity: patient characteristics (e.g., distributions of age, sex, region) should align with expected values from the researchers’ intuition and the literature, and researchers should assess the reasons for any discrepancies. Sensitivity analyses should be conducted to determine the robustness of study findings. Researchers can test the stability of study estimates using a different estimand or type of model than was used in the primary analysis; sensitivity analysis estimates that are similar to those of the primary analysis increase confidence that the primary estimates are appropriate. The research team may also be interested in how changes to study inclusion/exclusion criteria affect study findings, or may wish to address uncertainties in measuring the exposure or outcome in administrative data by modifying the algorithms used to identify them (e.g., requiring hospitalization with a diagnosis code in a principal position rather than counting any claim with the diagnosis code in any position). As feasible, existing validation studies for the exposure and outcome should be referenced, or new validation efforts undertaken; the results of such validation studies can inform study estimates via quantitative bias analysis (Lanes and Beachler 2023). The study team may also consider biases arising from unmeasured confounding and plan quantitative bias analyses to explore how unmeasured confounding may impact estimates. Quantitative bias analysis can assess the directionality, magnitude, and uncertainty of errors arising from a variety of limitations (Brenner and Gefeller 1993; Lash et al. 2009, 2014; Leahy et al. 2022).
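As a minimal sketch of one such quantitative bias analysis, the code below corrects claims-identified event counts using positive predictive values from a hypothetical validation study, in the spirit of the PPV-based correction of Brenner and Gefeller (1993); all numbers are invented:

```python
def corrected_risk(observed_events, n, ppv):
    # Keep only the fraction of claims-identified events confirmed on
    # chart review; assumes near-perfect sensitivity (few false negatives).
    return observed_events * ppv / n

risk_treated   = corrected_risk(observed_events=120, n=10_000, ppv=0.85)
risk_untreated = corrected_risk(observed_events=150, n=10_000, ppv=0.80)
print(f"bias-adjusted risk ratio = {risk_treated / risk_untreated:.2f}")

# Repeating the calculation over a plausible PPV range (e.g., 0.70-0.95)
# shows how sensitive the estimate is to outcome misclassification.
```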
To demonstrate how the guide can be used to plan a research study utilizing causal methods, we turn to a previously published study (Dondo et al. 2017) that assessed the causal relationship between the use of β-blockers and mortality after acute myocardial infarction in patients without heart failure or left ventricular systolic dysfunction. The investigators sought to answer a causal research question (Step 1), so we proceed to Step 2: considering treatment for whom, both the ATE and the ATT were evaluated. For the kind of effect (Step 3), use (or no use) of β-blockers was determined after discharge without taking discontinuation or subsequent treatment changes into consideration (i.e., an intention-to-treat approach). Since survival was the primary outcome, an absolute difference in survival time was chosen as the effect measure (Step 4). While no explicit directed acyclic graph was provided (Step 5), the investigators specified a list of confounders.
The investigators established robust methodology by considering possible sources of bias and addressing them with viable solutions (Steps 6 and 7). Table 1 lists the identified potential biases and their corresponding solutions as implemented. For example, to minimize potential biases including prevalent-user bias and selection bias, the sample was restricted to patients with no previous use of β-blockers, no contraindication for β-blockers, and no prescription of loop diuretics. To improve balance across the comparator groups in terms of baseline confounders, i.e., those that could influence both the exposure (β-blocker use) and the outcome (mortality), propensity score-based inverse probability of treatment weighting (IPTW) was employed. We note, however, that the baseline look-back period used to assess measured covariates was not explicitly reported in the paper.
Quality control and sensitivity analyses (Step 8) are described extensively. The overlap of propensity score distributions between comparator groups was tested, and confounder balance was assessed. Since observations in the tails of the propensity score distribution may violate the positivity assumption (Crump et al. 2009), a sensitivity analysis was conducted that included only cases with propensity scores between 0.1 and 0.9. Although not mentioned by the authors, the tails of the propensity score distribution can also be influenced by unmeasured confounders (Sturmer et al. 2021); reassuringly, the findings were robust with and without trimming. An assessment of extreme IPTW weights, while not included, would further increase confidence in the robustness of the analysis. An instrumental variable approach was employed to assess potential bias due to unmeasured confounding, using hospital rates of guideline-indicated prescribing as the instrument. Additionally, potential bias caused by missing data was attenuated through the use of multiple imputation, and separate models were built for complete cases only and for imputed plus complete cases.
We have described a conceptual schema for designing observational real-world studies to estimate causal effects. Applying this schema to a previously published study illuminates the methodologic structure of the study, revealing how each structural element relates to the potential bias it is meant to address. Real-world evidence is increasingly accepted by healthcare stakeholders, including the FDA (Concato and Corrigan-Curay 2022; Concato and ElZarrad 2022), and its use for comparative effectiveness and safety assessments requires appropriate causal study design; our guide is meant to facilitate this design process and to complement existing, more specific, guidance.
Existing guidance for causal inference using observational data includes components that map clearly onto the schema we have developed. For example, in 2009 Cox et al. described common sources of bias in observational data and recommended specific strategies to mitigate them, corresponding to steps 6–8 of our step-by-step guide (Cox et al. 2009). In 2013, the AHRQ emphasized development of the research question, corresponding to steps 1–4 of our guide, with additional chapters on study design, comparator selection, and sensitivity analyses (corresponding to step 7) and on directed acyclic graphs (corresponding to step 5) (Velentgas et al. 2013). Much of Girman et al.’s manuscript (Girman et al. 2014) corresponds to steps 1–4 of our guide, and its treatment of equipoise and interpretability corresponds specifically to steps 3 and 7–8. The current ENCePP guide on methodological standards in pharmacoepidemiology contains a section on formulating a meaningful research question, corresponding to step 1, and describes strategies to mitigate specific sources of bias, corresponding to steps 6–8 (European Medicines Agency 2023). Recent works by the FDA Sentinel Innovation Center (Desai et al. 2024) and the Joint Initiative for Causal Inference (Dang et al. 2023) provide more advanced expositions of many of the steps in our guide. The target trial framework contains guidance on developing seven components of the study protocol: eligibility criteria, treatment strategies, assignment procedures, follow-up period, outcome, causal contrast of interest, and analysis plan (Hernán and Robins 2016). Our work places the target trial framework into a larger context, illustrating its relationship with other important study planning considerations, including the creation of a directed acyclic graph and the incorporation of prespecified sensitivity and quantitative bias analyses.
Ultimately, the feasibility of estimating causal effects relies on the capabilities of the available data. Real-world data sources are complex, and the investigator must carefully consider whether the data on hand are sufficient to answer the research question. For example, a study that relies solely on claims data for outcome ascertainment may suffer from outcome misclassification bias (Lanes and Beachler 2023). This bias can be addressed through medical record validation for a random subset of patients, followed by quantitative bias analysis (Lanes and Beachler 2023). If instead the investigator wishes to apply a previously published, claims-based algorithm validated in a different database, they must carefully consider the transportability of that algorithm to their own study population. In this way, causal inference from real-world data requires the ability to think creatively and resourcefully about how various data sources and elements can be leveraged, with consideration for the strengths and limitations of each source. The heart of causal inference is the pairing of humility and creativity: the humility to acknowledge what the data cannot do, and the creativity to address those limitations as best as one can at the time.
As with any attempt to synthesize a broad array of information into a single, simplified schema, our work has several limitations. Space and usability constraints necessitated simplification of the complex source material and selection among many available methodologies, and information about the relative importance of each step is not currently included. It is also important to consider the context of our work: this step-by-step guide emphasizes analytic techniques (e.g., propensity scores) that are used most frequently within our own research environment and may not include less familiar study designs and analytic techniques. However, one strength of the guide is that additional designs, techniques, or concepts can easily be incorporated into the existing schema. The benefit of a schema is that new information can be added and is more readily accessed due to its association with previously sorted information (Loveless 2022). Finally, we approached causal inference as a broad, overarching concept defined by the totality of the research, from start to finish, rather than through the lens of a particular analytic technique; we view this as a strength rather than a limitation.
Finally, the focus of this guide was on the methodologic aspects of study planning. As a result, we did not include steps for drafting or registering the study protocol in a public database or for communicating results. We strongly encourage researchers to register their study protocols and communicate their findings with transparency. A protocol template endorsed by ISPOR and ISPE for studies using real-world data to evaluate treatment effects is available (Wang et al. 2023a). Additionally, the steps described above are intended to illustrate an order of thinking in the study planning process, and these steps are often iterative. The guide is not intended to reflect the order of study execution; specifically, quality control procedures and sensitivity analyses should also be formulated up-front at the protocol stage.
We outlined steps and described key conceptual issues of importance in designing real-world studies to answer causal questions, and created a visually appealing, user-friendly resource to help researchers clearly define and navigate these issues. We hope this guide serves to enhance the quality, and thus the impact, of real-world evidence.
No datasets were generated or analysed during the current study.
Arlett, P., Kjaer, J., Broich, K., Cooke, E.: Real-world evidence in EU Medicines Regulation: Enabling Use and establishing value. Clin. Pharmacol. Ther. 111 (1), 21–23 (2022)
Athey, S., Imbens, G.W.: Machine learning methods that economists should know about. Annu. Rev. Econ. 11, 685–725 (2019)
Belthangady, C., Stedden, W., Norgeot, B.: Minimizing bias in massive multi-arm observational studies with BCAUS: Balancing covariates automatically using supervision. BMC Med. Res. Methodol. 21 (1), 190 (2021)
Berger, M.L., Sox, H., Willke, R.J., Brixner, D.L., Eichler, H.G., Goettsch, W., Madigan, D., Makady, A., Schneeweiss, S., Tarricone, R., Wang, S.V., Watkins, J., Mullins, C.D.: Good practices for real-world data studies of treatment and/or comparative effectiveness: Recommendations from the joint ISPOR-ISPE Special Task Force on real-world evidence in health care decision making. Pharmacoepidemiol Drug Saf. 26(9), 1033–1039 (2017)
Brenner, H., Gefeller, O.: Use of the positive predictive value to correct for disease misclassification in epidemiologic studies. Am. J. Epidemiol. 138 (11), 1007–1015 (1993)
Concato, J., Corrigan-Curay, J.: Real-world evidence - where are we now? N Engl. J. Med. 386 (18), 1680–1682 (2022)
Concato, J., ElZarrad, M.: FDA Issues Draft Guidances on Real-World Evidence, Prepares to Publish More in Future [accessed on 2022]. (2022). https://www.fda.gov/drugs/news-events-human-drugs/fda-issues-draft-guidances-real-world-evidence-prepares-publish-more-future
Cox, E., Martin, B.C., Van Staa, T., Garbe, E., Siebert, U., Johnson, M.L.: Good research practices for comparative effectiveness research: Approaches to mitigate bias and confounding in the design of nonrandomized studies of treatment effects using secondary data sources: The International Society for Pharmacoeconomics and Outcomes Research Good Research Practices for Retrospective Database Analysis Task Force Report–Part II. Value Health. 12 (8), 1053–1061 (2009)
Crump, R.K., Hotz, V.J., Imbens, G.W., Mitnik, O.A.: Dealing with limited overlap in estimation of average treatment effects. Biometrika. 96 (1), 187–199 (2009)
Danaei, G., Rodriguez, L.A., Cantero, O.F., Logan, R., Hernan, M.A.: Observational data for comparative effectiveness research: An emulation of randomised trials of statins and primary prevention of coronary heart disease. Stat. Methods Med. Res. 22 (1), 70–96 (2013)
Dang, L.E., Gruber, S., Lee, H., Dahabreh, I.J., Stuart, E.A., Williamson, B.D., Wyss, R., Diaz, I., Ghosh, D., Kiciman, E., Alemayehu, D., Hoffman, K.L., Vossen, C.Y., Huml, R.A., Ravn, H., Kvist, K., Pratley, R., Shih, M.C., Pennello, G., Martin, D., Waddy, S.P., Barr, C.E., Akacha, M., Buse, J.B., van der Laan, M., Petersen, M.: A causal roadmap for generating high-quality real-world evidence. J. Clin. Transl Sci. 7 (1), e212 (2023)
Desai, R.J., Wang, S.V., Sreedhara, S.K., Zabotka, L., Khosrow-Khavar, F., Nelson, J.C., Shi, X., Toh, S., Wyss, R., Patorno, E., Dutcher, S., Li, J., Lee, H., Ball, R., Dal Pan, G., Segal, J.B., Suissa, S., Rothman, K.J., Greenland, S., Hernan, M.A., Heagerty, P.J., Schneeweiss, S.: Process guide for inferential studies using healthcare data from routine clinical practice to evaluate causal effects of drugs (PRINCIPLED): Considerations from the FDA Sentinel Innovation Center. BMJ. 384 , e076460 (2024)
Digitale, J.C., Martin, J.N., Glymour, M.M.: Tutorial on directed acyclic graphs. J. Clin. Epidemiol. 142 , 264–267 (2022)
Dondo, T.B., Hall, M., West, R.M., Jernberg, T., Lindahl, B., Bueno, H., Danchin, N., Deanfield, J.E., Hemingway, H., Fox, K.A.A., Timmis, A.D., Gale, C.P.: beta-blockers and Mortality after Acute myocardial infarction in patients without heart failure or ventricular dysfunction. J. Am. Coll. Cardiol. 69 (22), 2710–2720 (2017)
European Medicines Agency: ENCePP Guide on Methodological Standards in Pharmacoepidemiology [accessed on 2023]. (2023). https://www.encepp.eu/standards_and_guidances/methodologicalGuide.shtml
Ferguson, K.D., McCann, M., Katikireddi, S.V., Thomson, H., Green, M.J., Smith, D.J., Lewsey, J.D.: Evidence synthesis for constructing directed acyclic graphs (ESC-DAGs): A novel and systematic method for building directed acyclic graphs. Int. J. Epidemiol. 49 (1), 322–329 (2020)
Flanagin, A., Lewis, R.J., Muth, C.C., Curfman, G.: What does the proposed causal inference Framework for Observational studies Mean for JAMA and the JAMA Network Journals? JAMA (2024)
U.S. Food and Drug Administration: Framework for FDA’s Real-World Evidence Program [accessed on 2018]. (2018). https://www.fda.gov/media/120060/download
Franklin, J.M., Schneeweiss, S.: When and how can Real World Data analyses substitute for randomized controlled trials? Clin. Pharmacol. Ther. 102 (6), 924–933 (2017)
Gatto, N.M., Wang, S.V., Murk, W., Mattox, P., Brookhart, M.A., Bate, A., Schneeweiss, S., Rassen, J.A.: Visualizations throughout pharmacoepidemiology study planning, implementation, and reporting. Pharmacoepidemiol Drug Saf. 31 (11), 1140–1152 (2022)
Girman, C.J., Faries, D., Ryan, P., Rotelli, M., Belger, M., Binkowitz, B., O’Neill, R., Drug Information Association CER Scientific Working Group: Pre-study feasibility and identifying sensitivity analyses for protocol pre-specification in comparative effectiveness research. J. Comp. Eff. Res. 3(3), 259–270 (2014)
Griffith, G.J., Morris, T.T., Tudball, M.J., Herbert, A., Mancano, G., Pike, L., Sharp, G.C., Sterne, J., Palmer, T.M., Davey Smith, G., Tilling, K., Zuccolo, L., Davies, N.M., Hemani, G.: Collider bias undermines our understanding of COVID-19 disease risk and severity. Nat. Commun. 11 (1), 5749 (2020)
Hernán, M.A.: The C-Word: Scientific euphemisms do not improve causal inference from Observational Data. Am. J. Public Health. 108 (5), 616–619 (2018)
Hernán, M.A., Robins, J.M.: Using Big Data to emulate a target Trial when a Randomized Trial is not available. Am. J. Epidemiol. 183 (8), 758–764 (2016)
Hernán, M., Robins, J.: Causal Inference: What if. Chapman & Hall/CRC, Boca Raton (2020)
International Society for Pharmacoeconomics and Outcomes Research (ISPOR): Strategic Initiatives: Real-World Evidence [accessed on 2022]. (2022). https://www.ispor.org/strategic-initiatives/real-world-evidence
International Society for Pharmacoepidemiology (ISPE): Position on Real-World Evidence [accessed on 2020]. (2020). https://pharmacoepi.org/pub/?id=136DECF1-C559-BA4F-92C4-CF6E3ED16BB6
Labrecque, J.A., Swanson, S.A.: Target trial emulation: Teaching epidemiology and beyond. Eur. J. Epidemiol. 32 (6), 473–475 (2017)
Lanes, S., Beachler, D.C.: Validation to correct for outcome misclassification bias. Pharmacoepidemiol Drug Saf. (2023)
Lash, T.L., Fox, M.P., Fink, A.K.: Applying Quantitative bias Analysis to Epidemiologic data. Springer (2009)
Lash, T.L., Fox, M.P., MacLehose, R.F., Maldonado, G., McCandless, L.C., Greenland, S.: Good practices for quantitative bias analysis. Int. J. Epidemiol. 43 (6), 1969–1985 (2014)
Leahy, T.P., Kent, S., Sammon, C., Groenwold, R.H., Grieve, R., Ramagopalan, S., Gomes, M.: Unmeasured confounding in nonrandomized studies: Quantitative bias analysis in health technology assessment. J. Comp. Eff. Res. 11 (12), 851–859 (2022)
Loveless, B.: A Complete Guide to Schema Theory and its Role in Education [accessed on 2022]. (2022). https://www.educationcorner.com/schema-theory/
Lund, J.L., Richardson, D.B., Sturmer, T.: The active comparator, new user study design in pharmacoepidemiology: Historical foundations and contemporary application. Curr. Epidemiol. Rep. 2 (4), 221–228 (2015)
Mai, X., Teng, C., Gao, Y., Governor, S., He, X., Kalloo, G., Hoffman, S., Mbiydzenyuy, D., Beachler, D.: A pragmatic comparison of logistic regression versus machine learning methods for propensity score estimation. Supplement: Abstracts of the 38th International Conference on Pharmacoepidemiology: Advancing Pharmacoepidemiology and Real-World Evidence for the Global Community, August 26–28, 2022, Copenhagen, Denmark. Pharmacoepidemiology and Drug Safety 31(S2). (2022)
Mullard, A.: 2021 FDA approvals. Nat. Rev. Drug Discov. 21 (2), 83–88 (2022)
Onasanya, O., Hoffman, S., Harris, K., Dixon, R., Grabner, M.: Current applications of machine learning for causal inference in healthcare research using observational data. International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Atlanta, GA. (2024)
Pearl, J.: Causal diagrams for empirical research. Biometrika. 82 (4), 669–688 (1995)
Prada-Ramallal, G., Takkouche, B., Figueiras, A.: Bias in pharmacoepidemiologic studies using secondary health care databases: A scoping review. BMC Med. Res. Methodol. 19 (1), 53 (2019)
Richardson, T.S., Robins, J.M.: Single World Intervention Graphs: A Primer [accessed on 2013]. (2013). https://www.stats.ox.ac.uk/~evans/uai13/Richardson.pdf
Richiardi, L., Bellocco, R., Zugna, D.: Mediation analysis in epidemiology: Methods, interpretation and bias. Int. J. Epidemiol. 42 (5), 1511–1519 (2013)
Riis, A.H., Johansen, M.B., Jacobsen, J.B., Brookhart, M.A., Sturmer, T., Stovring, H.: Short look-back periods in pharmacoepidemiologic studies of new users of antibiotics and asthma medications introduce severe misclassification. Pharmacoepidemiol Drug Saf. 24 (5), 478–485 (2015)
Rodrigues, D., Kreif, N., Lawrence-Jones, A., Barahona, M., Mayer, E.: Reflection on modern methods: Constructing directed acyclic graphs (DAGs) with domain experts for health services research. Int. J. Epidemiol. 51 (4), 1339–1348 (2022)
Rothman, K.J., Greenland, S., Lash, T.L.: Modern Epidemiology. Wolters Kluwer Health/Lippincott Williams & Wilkins, Philadelphia (2008)
Rubin, D.B.: Causal inference using potential outcomes. J. Am. Stat. Assoc. 100 (469), 322–331 (2005)
Sauer, B., VanderWeele, T.J.: Use of directed acyclic graphs. In: Velentgas, P., Dreyer, N., Nourjah, P. (eds.) Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide. Agency for Healthcare Research and Quality (US) (2013)
Schneeweiss, S., Rassen, J.A., Brown, J.S., Rothman, K.J., Happe, L., Arlett, P., Dal Pan, G., Goettsch, W., Murk, W., Wang, S.V.: Graphical depiction of longitudinal study designs in Health Care databases. Ann. Intern. Med. 170 (6), 398–406 (2019)
Schuler, M.S., Rose, S.: Targeted maximum likelihood estimation for causal inference in Observational studies. Am. J. Epidemiol. 185 (1), 65–73 (2017)
Stuart, E.A., DuGoff, E., Abrams, M., Salkever, D., Steinwachs, D.: Estimating causal effects in observational studies using Electronic Health data: Challenges and (some) solutions. EGEMS (Wash DC) 1 (3). (2013)
Sturmer, T., Webster-Clark, M., Lund, J.L., Wyss, R., Ellis, A.R., Lunt, M., Rothman, K.J., Glynn, R.J.: Propensity score weighting and trimming strategies for reducing Variance and Bias of Treatment Effect estimates: A Simulation Study. Am. J. Epidemiol. 190 (8), 1659–1670 (2021)
Suissa, S., Dell’Aniello, S.: Time-related biases in pharmacoepidemiology. Pharmacoepidemiol Drug Saf. 29 (9), 1101–1110 (2020)
Tripepi, G., Jager, K.J., Dekker, F.W., Wanner, C., Zoccali, C.: Measures of effect: Relative risks, odds ratios, risk difference, and ‘number needed to treat’. Kidney Int. 72 (7), 789–791 (2007)
Velentgas, P., Dreyer, N., Nourjah, P., Smith, S., Torchia, M.: Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide. Agency for Healthcare Research and Quality (AHRQ) Publication 12(13). (2013)
Wang, A., Nianogo, R.A., Arah, O.A.: G-computation of average treatment effects on the treated and the untreated. BMC Med. Res. Methodol. 17 (1), 3 (2017)
Wang, S.V., Pottegard, A., Crown, W., Arlett, P., Ashcroft, D.M., Benchimol, E.I., Berger, M.L., Crane, G., Goettsch, W., Hua, W., Kabadi, S., Kern, D.M., Kurz, X., Langan, S., Nonaka, T., Orsini, L., Perez-Gutthann, S., Pinheiro, S., Pratt, N., Schneeweiss, S., Toussi, M., Williams, R.J.: HARmonized Protocol Template to enhance reproducibility of hypothesis evaluating real-world evidence studies on treatment effects: A good practices report of a joint ISPE/ISPOR task force. Pharmacoepidemiol Drug Saf. 32 (1), 44–55 (2023a)
Wang, S.V., Schneeweiss, S., RCT-DUPLICATE Initiative, Franklin, J.M., Desai, R.J., Feldman, W., Garry, E.M., Glynn, R.J., Lin, K.J., Paik, J., Patorno, E., Suissa, S., D’Andrea, E., Jawaid, D., Lee, H., Pawar, A., Sreedhara, S.K., Tesfaye, H., Bessette, L.G., Zabotka, L., Lee, S.B., Gautam, N., York, C., Zakoul, H., Concato, J., Martin, D., Paraoan, D., Quinto, K.: Emulation of randomized clinical trials with nonrandomized database analyses: Results of 32 clinical trials. JAMA 329(16), 1376–1385 (2023b)
Westreich, D., Lessler, J., Funk, M.J.: Propensity score estimation: Neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression. J. Clin. Epidemiol. 63 (8), 826–833 (2010)
Yang, S., Eaton, C.B., Lu, J., Lapane, K.L.: Application of marginal structural models in pharmacoepidemiologic studies: A systematic review. Pharmacoepidemiol Drug Saf. 23 (6), 560–571 (2014)
Zhou, H., Taber, C., Arcona, S., Li, Y.: Difference-in-differences method in comparative Effectiveness Research: Utility with unbalanced groups. Appl. Health Econ. Health Policy. 14 (4), 419–429 (2016)
The authors received no financial support for this research.
Authors and affiliations.
Carelon Research, Wilmington, DE, USA
Sarah Ruth Hoffman, Nilesh Gangan, Joseph L. Smith, Arlene Tave, Yiling Yang, Christopher L. Crowe & Michael Grabner
Elevance Health, Indianapolis, IN, USA
Xiaoxue Chen
University of Maryland School of Pharmacy, Baltimore, MD, USA
Susan dosReis
SH, NG, JS, AT, CC, MG are employees of Carelon Research, a wholly owned subsidiary of Elevance Health, which conducts health outcomes research with both internal and external funding, including a variety of private and public entities. XC was an employee of Elevance Health at the time of study conduct. YY was an employee of Carelon Research at the time of study conduct. SH, MG, and JLS are shareholders of Elevance Health. SdR receives funding from GlaxoSmithKline for a project unrelated to the content of this manuscript and conducts research that is funded by state and federal agencies.
Correspondence to Sarah Ruth Hoffman.
Competing interests.
The authors declare no competing interests.
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Below is the link to the electronic supplementary material.
Supplementary Material 2.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .
Hoffman, S.R., Gangan, N., Chen, X. et al. A step-by-step guide to causal study design using real-world data. Health Serv Outcomes Res Method (2024). https://doi.org/10.1007/s10742-024-00333-6
Received: 07 December 2023
Revised: 31 May 2024
Accepted: 10 June 2024
Published: 19 June 2024
Wilson, A. (Parexel International, Waltham, MA, USA); Krikov, S. (Parexel International, Lexington, MA, USA); Crockett, D. (Intermountain Health, Salt Lake City, UT, USA)
OBJECTIVES: Clinical research often requires collaboration and data sharing. Collaborative research can speed up discovery and improve findings, but the use of sensitive data such as patient information raises privacy concerns. These challenges can negate any potential time savings and, in fact, be entirely prohibitive. One promising solution comes from the emerging field of synthetic data generation (SDG).
METHODS: In this study, we established an evaluation framework to assess synthetic data quality by comparing target causal effect estimates across different estimation methods. Successful synthetic data was defined as preserving both effect relationships and confounding structures necessary for accurate causal inference.
RESULTS: The results (illustrated in Figure 2) indicate that advanced SDG methods are successful in obtaining accurate causal estimates and maintaining confounding structures in a kidney disease progression case study.
CONCLUSIONS: Synthetic data offers a pragmatic balance between data utility and privacy protection. It also enables broader data accessibility and collaboration while allowing for the inclusion of rare or underrepresented conditions in research, enhancing the scope and depth of studies.
Methodological & Statistical Research, Real World Data & Information Systems
Data Protection, Integrity, & Quality Assurance
No Additional Disease & Conditions/Specialized Treatment Areas, Urinary/Kidney Disorders
Scientific Reports, volume 14, Article number: 14517 (2024)
Technology offers substantial potential to improve the integrity and efficiency of infrastructure. Cracks are one of the major concerns that can affect the integrity or usability of any structure, and the use of manual inspection methods often leads to delays that worsen the situation. Automated crack detection has therefore become necessary for the efficient management and inspection of critical infrastructures. Previous research in crack detection employed classification- and localization-based models using Deep Convolutional Neural Networks (DCNNs). This study proposes and compares the effectiveness of transfer-learned DCNNs for crack detection, both as classification models and as feature extractors. The main objective of this paper is to present various methods of crack detection on surfaces and compare their performance over three different datasets. The experiments conducted in this work are threefold. Initially, the effectiveness of 12 transfer-learned DCNN models for crack detection is analyzed on three publicly available datasets: SDNET, CCIC, and BCD. With an accuracy of 53.40%, ResNet101 outperformed the other models on the SDNET dataset; EfficientNetB0 was the most accurate model (98.8%) on the BCD dataset; and ResNet50 performed best, with an accuracy of 99.8%, on the CCIC dataset. Secondly, two image enhancement methods are employed to enhance the images, and the 12 DCNNs are transfer-learned on the enhanced images in pursuit of improving performance on the SDNET dataset; the results show that the enhanced images improved the accuracy of the transfer-learned crack detection models significantly. Furthermore, deep features extracted from the last fully connected layer of the DCNNs are used to train a Support Vector Machine (SVM). The integration of deep features with the SVM enhanced detection accuracy across all DCNN-dataset combinations, according to analysis in terms of accuracy, precision, recall, and F1-score.
Cracks in concrete structures, resulting from factors like rust, chemical degradation, and unfavorable loading, serve as warning signs for tension, fragility, and wear. The length, width, depth, and position of these cracks impact their significance 1 . To ensure the long-term serviceability of infrastructures, monitoring structural health and performance is crucial 2 . Traditional manual inspection methods relying on eyesight are time-consuming, labor-intensive, and prone to subjective conclusions, and the high cost of labor and potential for human error make frequent manual inspections impractical. Efficiently identifying surface cracks within a specific timeframe is crucial for enhancing building maintenance protocols: swift detection allows for timely interventions, preventing the deterioration of structural issues and minimizing repair costs, while promptly addressing cracks mitigates potential safety hazards and ensures the longevity and structural integrity of the building. Recent advancements in science and technology have led to the development of automatic crack detection models employing image processing and machine learning (ML) techniques 3 , 4 , 5 , 6 .
Image processing-based techniques use statistical features from structural images to detect and locate cracks, treating them as regions with sudden pixel intensity changes. Machine learning (ML)-based models utilize hand-crafted features, such as edge, texture, and color, for automatic crack detection 7 , 8 . With the availability of massive datasets, researchers have turned to Deep Learning (DL), particularly Convolutional Neural Networks (CNNs), for more effective crack detection. The success of DL-based models, especially neural networks with multiple layers, has significantly improved feature learning. CNNs, whose varied filters highlight crucial features, extract basic image features in the initial layers and advanced, crack-specific features in the deeper layers. These features are then passed to a multi-layer perceptron classifier for crack detection. The accessibility of powerful computing resources and continuous advancements in training techniques on readily available datasets propel the rapid development of deep learning. Despite this success in feature extraction, there is a need to enhance the accuracy of these models in detecting concrete cracks.
In this research, we put forth a method of transfer-learning-based deep convolutional neural networks (DCNNs) with pre-trained weights, used both as classifiers and as feature extractors, which yields considerable gains in performance while mitigating the need for large training datasets and long training times. This paper also investigates the impact of ML classifiers trained on deep features for crack detection. Three publicly available datasets were used for the study: SDNET2018 9 , Concrete Crack Images for Classification (CCIC) 10 , and Bridge Crack Dataset (BCD) 11 . The experiments conducted in this work are threefold: (1) crack detection based on transfer-learned deep CNNs, in which 12 state-of-the-art CNN models pre-trained on ImageNet were used to classify the crack images; (2) crack detection using transfer-learned CNNs on enhanced crack images; and (3) examination of the DCNNs’ performance as feature extractors.
The features obtained from the deep CNNs’ final fully connected (FC) layers are classified and compared using ML algorithms. The major contributions of the proposed work are:
Classification of crack images using 12 transfer-learned DCNNs including VGG16, VGG19, Xception, ResNet50, ResNet101, ResNet152, InceptionV3, InceptionResNetV2, MobileNet, MobileNetV2, DenseNet121 and EfficientNetB0.
Analysis of the effectiveness of image enhancement techniques such as contrast enhancement and Local Binary Pattern (LBP) pre-processing on transfer learned DCNN models for crack detection.
Development of a Support Vector Machine (SVM) classification model on deep features extracted from the aforementioned DCNN models (a minimal sketch follows this list).
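The sketch below illustrates that third contribution, using a frozen ResNet50 backbone as a fixed feature extractor and an RBF-kernel SVM; the random arrays merely stand in for preprocessed crack/non-crack images, and global average pooling stands in for the final fully connected layer named in the text:

```python
import numpy as np
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input
from sklearn.svm import SVC

# Frozen ImageNet backbone used purely as a feature extractor.
backbone = ResNet50(weights="imagenet", include_top=False, pooling="avg",
                    input_shape=(224, 224, 3))

def deep_features(images):
    return backbone.predict(preprocess_input(images.astype("float32")), verbose=0)

images = np.random.rand(16, 224, 224, 3) * 255   # placeholder image batch
labels = np.array([0, 1] * 8)                    # placeholder crack/no-crack labels

svm = SVC(kernel="rbf").fit(deep_features(images), labels)
print(svm.predict(deep_features(images[:4])))
```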
The paper is organized as follows: the related crack detection research is covered in Section “Literature review”; the proposed system and the experiments carried out to categorize the images are described in Section “Proposed Methodology”; a description of the various datasets used is provided in Section “Dataset”; the outcomes and conclusions of the experiments are described in Section “Experimental Result and Analysis”; and the paper is concluded in Section “Conclusion and Future Scope”.
A thorough description of the most recent crack detection models is provided in this section. Crack detection models found in the literature can be divided into three major categories based on their workflow: (1) models based on traditional image processing algorithms, (2) models based on machine learning, and (3) models based on deep learning.
Crack detection using image processing methods has three major steps: image acquisition, pre-processing, and crack detection 12 . The target component is first photographed in high quality with a camera or another imaging instrument. The pre-processing step then eliminates noise and shadows from the images by applying filters, segmentation, and other techniques; if required by the particular crack detection technique being used, the image may be converted to gray-scale or binary format. The resulting image is then put through crack detection, which emphasizes or segments the cracked area using techniques such as edge detection, segmentation, or pixel analysis 13 . Lins et al. 14 developed a method to identify cracks using several color models, such as HSV (Hue-Saturation-Value) and RGB (Red-Green-Blue); they proposed a color feature extraction model that searches an image for certain color compositions relative to a standard query color, and further used their crack measurement algorithm to measure the length and width of the detected cracks. Shahrokhinasab et al. 15 analyzed various image processing methods, such as edge detection and thresholding, to classify cracks. Munawar et al. 16 analyzed different methods of fissure detection, including genetic programming, beamlet transformation, an Unmanned Aerial System-based approach, and the Shi-Tomasi algorithm. Zou et al. 17 introduced an automated crack detection system titled CrackTree, which uses a geodesic shadow removal algorithm to eliminate shadows from pavement images.
A crack probability map is produced using tensor voting, and a graph model is built by choosing crack seeds from the crack probability map; recursive edge pruning in the graph’s Minimum Spanning Tree (MST) is then used to find the final crack curves. Gabor filters were employed by Salman et al. 18 for crack detection. Niu et al. 19 introduced a method to find cracks in tunnels that involves a series of image processing, filtering, and feature extraction steps: uniform-light processing makes the cracks more visible, median and bilateral filtering remove noise, and a combination of Gabor filters and EMAP extracts the required features, which are then fed into the CEM algorithm to detect the cracks. Oliveira et al. 20 used a group of pixel-based and block-based image processing algorithms, including anisotropic diffusion (Perona and Malik’s algorithm), morphological smoothing, alternating sequential filtering, a combination of morphological erosion and dilation operators, Symlet decomposition filters, and UINTA and R-UINTA. Baltazart et al. 21 presented an improved version of the Minimum Spanning Tree algorithm to identify cracks, called MPS-VI, and analyzed the computational time of each model. An ACDS architecture was proposed by Jo et al. 22 , comprising an image acquisition block, a pre-processing block, and a classification block; in the pre-processing block, they used a Hessian-based method, a Gabor filter, Otsu thresholding, a Retinex filter, and a median filter to extract features used to train and classify a deep belief network. Classical image processing-based models for crack detection depend heavily on the quality of the images.
ML-based models for crack detection follow five steps: dataset collection, pre-processing of images, feature extraction, model training on the extracted features, and testing. Landstrom and Thurley 23 employed morphological operators to segment cracks from the image, and logistic regression was used to distinguish crack from non-crack images based on the segmented regions. Prasanna et al. 24 put forth a crack detection method called spatially tuned robust multi-feature (STRUM), exploring classifiers including SVM, AdaBoost, and Random Forest. Lin et al. 25 used hidden Markov random field-expectation-maximization (HMRF-EM) for automatic pavement crack detection, with two major modules: first, the hidden Markov random field model and its expectation-maximization are combined with an adaptive line detector to increase detection accuracy; second, the integrity and continuity of the detected cracks are improved by quantitatively describing the credibility and conditional connection of the crack region. Pratico et al. 26 provided a method for classifying the structural health condition of several vibro-acoustically different road pavement cracks (concealed bottom-up cracks) using supervised machine learning techniques; the technique gathers signatures using roadside acoustic sensors and categorizes the structural health status of the pavement using ML models. They compared various ML classifiers, including the random forest classifier (RFC), support vector classifier (SVC), and multi-layer perceptron (MLP); results indicate the SVC is the best-performing ML model, with an accuracy of 99.1%. Zhang et al. 27 suggested a new method for identifying surface fractures in coal mining sites using Unmanned Aerial Vehicle (UAV) imagery and ML.
The overall accuracy was increased to 88.99% by applying the V-SVM classifier. The authors also used Laplace sharpening to improve the color of the images and Principal Component Analysis (PCA) to reduce the full feature set while retaining 95% of the initial variance. An ML-computer vision pipeline was proposed by Zhang et al. 28 for detecting the formation of fatigue cracks: cracks were detected using an ML model, and vision-based algorithms were then used to examine the growth direction and length of the fatigue crack. The primary problem with ML-based models for crack detection is the selection and extraction of relevant features for classifier training.
Numerous crack detection models have been developed in the literature as a result of recent developments in deep learning (DL), particularly the evolution of convolutional neural networks (CNNs) 29 . DL-based models for crack detection follow steps analogous to the ML-based models described above; the major difference is that DL models perform feature extraction implicitly. A dataset of surface cracks must be gathered first to train the DL model. To minimize noise, eliminate shadows, and adjust other properties such as image size and brightness, the images are then pre-processed using image processing techniques. These images are then subjected to pixel-by-pixel annotation, or labeling, where the pixels corresponding to cracks are annotated either manually or with annotation tools; one example of labeling is setting the crack pixels to white (“1”) and the remaining pixels to black (“0”). Following this, a DL architecture, typically a CNN, must be chosen and applied to crack detection. Li et al. 30 proposed a deep neural architecture with a convolutional block, four dense connections, five deep supervision modules, three conversion modules, and one fusion module to identify cracked surfaces. Zhang et al. 31 introduced a CNN architecture with four convolutional layers and two fully connected layers; their network achieved a precision of 0.869 and a recall of 0.925 for crack detection. Meng et al. 32 proposed a deep residual neural network-based concrete crack identification method that identified concrete crack images at the pixel level. A transfer-learned EfficientNetB0 was employed by C. Su and W. Wang 33 for crack detection; they reported an accuracy better than that of a fully convolutional network proposed by Ye et al. 34 , which gave an accuracy of 93.6%. Feng et al. 35 used transfer learning on the InceptionV3 model to classify cracks, with crack, intact, spalling, seepage, and rebar exposure as the classes. A custom convolutional neural network with three convolutional layers was introduced by Kim et al. 36 for crack detection; the images were pre-processed using morphological filters and contrast enhancement operators and then used to train the CNN model. Cao et al. 37 used object-detection paradigms such as Faster RCNN and SSD models along with MobileNet, Inception, ResNet, and Inception-ResNet to detect road cracks, using mAP (mean average precision) as the performance metric; among all the combinations, Faster RCNN paired with Inception V2 gave the best results, with an mAP of 53.06%. A two-stage detection model comprising a DCNN and a segmentation module was proposed by Nguyen et al. 38 , who showed that segmentation of cracks at the pixel level significantly improves detection accuracy. In a study presented by Park et al. 39 , cracks on concrete structure surfaces were identified using DL and structured-light technology, which combines two laser sensors with vision.
The YOLO model was used to identify the cracks, and the size of each crack was calculated using the positions of the laser beams on the structural surface. Huyan et al. 40 presented a model named CrackU-net, which detects pavement cracks with a precision of 0.986. Kim et al. 41 proposed a crack detection technique using a shallow CNN architecture, optimizing the LeNet-5 model’s hyper-parameters to obtain a maximum accuracy of 99.8% with fewer parameters. Even though some of these models performed well in feature extraction and classification across various applications, their accuracy needs to be improved for concrete crack detection. In this paper, we evaluate the effectiveness of transfer-learned deep features for crack detection using raw and enhanced crack images, which yields a significant boost in performance.
This section introduces DCNNs and their application to crack detection in detail. DCNNs, first developed in the 1980s, are the most well-known, advanced, and popular DL algorithms 42 . Initially, researchers were not drawn to DCNNs because of the limited availability of computational resources, powerful processors, and large storage devices; the idea gained popularity as computing, database retrieval, and storage capacity expanded 43 . Later, CNNs were successfully applied to classification problems and came to excel at computer vision problems 44 . Figure 1 depicts a typical CNN structure. The initial layers of a DCNN extract basic image features such as edges, patterns, and textures. The middle layers extract object-level information like shape and color, whereas the deeper layers extract class-level features like the whole object. The feature extraction layers’ final output is passed into either a fully connected neural network 45 for classification or a bounding box and pixel classification layer for segmentation.
CNN Architecture.
CNN has emerged as the most widely used and successful DL architecture for various input data types including images, videos and texts, with several cutting-edge architectures reported in the literature. VGG16, VGG19 46 , Xception 47 , ResNet50, ResNet101, ResNet152 48 , InceptionV3 49 , InceptionResNetV2 50 , MobileNet 51 , MobileNetV2 52 , DenseNet121 53 , EfficientNetB0 54 are some of the well-known and leading-edge DCNN architectures for classification. DCNN varieties for classification, segmentation, or localization can be used to detect cracks in the input image.
This paper proposes transfer learning-based DL models for crack identification through classification. This work carried out three experiments: (1) transfer learning for crack detection without image enhancement, (2) transfer learning for crack detection with image enhancement, and (3) crack detection using an SVM on deep features. Figure 2 depicts the experiments carried out in the proposed model.
Proposed Transfer Learning Architecture for Crack Detection with Pre-trained CNN Models on ImageNet Weights.
Transfer learning is a machine learning technique in which a model created for one task is utilized as the basis for another task 55 , 56 , 57 . The use of pre-trained models as the foundation for computer vision and natural language processing tasks is a common strategy in DL research due to the massive computing time and resources required to develop neural network models from scratch 58 . The benefits of using a transfer-learned model over an end-to-end neural network include significant time and computation savings, and recent research reveals that transfer-learned models outperform traditional neural networks and can work with smaller amounts of data. Generally, for computer vision applications, the features extracted by the first and middle layers of a neural network are similar for similar inputs; it is the later layers, which extract high-level features, that make the difference. The proposed model therefore freezes the first and middle layers and makes only the final layers trainable: we retain the weights from the model trained on a comparatively large dataset and train only a few parameters.
Figure 3 illustrates the process of transfer learning applied to a Deep Convolutional Neural Network (DCNN) using pre-trained ImageNet weights. In this experiment, we adapted the DCNN model for crack detection by leveraging the weights learned from the ImageNet dataset.
Transfer Learning Pipeline used in the proposed model.
To accomplish this, we first removed the final layers of the pre-trained models. These layers were then replaced with a new architecture consisting of several components: a flatten layer to convert the 2D feature maps into a 1D feature vector, a batch normalization layer to stabilize and accelerate the training process, a dropout layer to prevent overfitting by randomly setting a fraction of input units to zero during training, and a dense layer with two neurons, each using a sigmoid activation function, to output the probability of the presence or absence of cracks.
Before training the model, the necessary datasets were collected. These datasets were then preprocessed by resizing the images to 224 × 224 pixels, a standard input size for many CNN architectures pre-trained on ImageNet. The resized dataset was subsequently split into three subsets: training, validation, and test sets. This division ensures that the model can be trained, validated, and tested on separate data to evaluate its performance accurately. After preparing the data, we loaded it into the pre-trained CNN model. As mentioned earlier, the model’s original final layers were replaced with a new set of custom layers, specifically designed to refine the pre-trained model’s capacity to detect cracks in images.
The transfer learning model was then trained, but with a specific focus on optimizing only a subset of parameters. Specifically, most parameters from the pre-trained layers were frozen, meaning they were not updated during training. Only the parameters from the newly added custom layers were fine-tuned. This approach allows the model to retain the general features learned from the ImageNet dataset while adapting its final layers to the specific task of crack detection with a smaller amount of data and computational resources.
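A minimal Keras sketch of this scheme is given below. The frozen base and the flatten / batch-normalization / dropout / two-neuron sigmoid head follow the description above, while the choice of MobileNetV2 as the base, the dropout rate, and the optimizer are illustrative assumptions; the paper's tuned hyper-parameters appear in Table 6.

```python
# Sketch of the transfer-learning model: frozen ImageNet base plus a
# trainable custom head. Base model, dropout rate, and optimizer are
# assumptions, not the tuned values from Table 6.
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.MobileNetV2(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained convolutional layers

model = models.Sequential([
    base,
    layers.Flatten(),             # 2D feature maps -> 1D feature vector
    layers.BatchNormalization(),  # stabilize and accelerate training
    layers.Dropout(0.5),          # guard against overfitting
    layers.Dense(2, activation="sigmoid"),  # crack / non-crack outputs
])

model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

With one-hot labels (as in the data sketch above), this trains only the head's parameters while the backbone weights stay fixed.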
Two image enhancement methods, Local Binary Pattern (LBP) and contrast enhancement, were employed to pre-process the input images used to train the DCNN models. The image enhancement modules were introduced on the assumption that, when trained on enhanced input images, ConvNets would converge more easily, lowering computational costs and improving accuracy. This assumption is supported by the benchmark evaluation metrics reported in the following sections. The image enhancement algorithms were selected based on the literature, following Wang et al. 59 and Chen et al. 60 .
Contrast enhancement in the image makes dark areas darker and light areas lighter, making cracks appear darker than other surfaces. This creates a significant difference between the dark and light areas, which will aid in subsequent classification 59 .
Algorithm 1 details the steps followed for contrast enhancement, and Fig. 4 shows the results of contrast enhancement on crack images selected randomly from the dataset, together with the histograms of the original and contrast-enhanced images. From the figure, it is evident that the histograms of the original crack images are not uniform (skewed towards the right), whereas those of the enhanced images are uniform.
Contrast enhancement on the crack images. (a) Original image (b) Histograms of original images (c) Contrast-enhanced image (d) Histograms of contrast-enhanced image.
LBP is a primitive texture operator that labels the pixels of an image by thresholding each pixel's neighborhood against the value of the pixel itself 61 . It is considered an efficient descriptor due to its resistance to changes in illumination, computational simplicity, and reliability in image classification. The LBP algorithm divides the image into smaller cells and uses the intensity of the center pixel as a threshold for the remaining pixels in the cell. Neighboring pixels greater than the threshold value are assigned 1; otherwise, they are assigned 0. A binary number is generated by visiting the neighborhood circularly, and the resulting binary number is converted to a decimal value that replaces the value of the center pixel.
Algorithm 1: Contrast Enhancement Algorithm | |
1 | Take an input image, a brightness value, and a contrast value |
2 | Check whether the brightness value is non-zero; if so, go to step 3, otherwise skip to step 5 |
3 | If the brightness value is greater than 0, assign the brightness value to shadow and 255 to highlight; otherwise assign 0 to shadow and 255 + brightness to highlight. Calculate the blending coefficients alpha_b and gamma_b from the highlight and shadow values using the formulas alpha_b = (highlight − shadow) / 255 and gamma_b = shadow |
4 | Using the image, alpha_b, and gamma_b as inputs, blend the image with itself using the add-weighted function |
5 | Create an extra copy of the image |
6 | If the contrast value is not 0, compute the coefficients alpha_c and gamma_c using the formulas alpha_c = 131*(contrast + 127) / (127*(131 − contrast)) and gamma_c = 127*(1 − alpha_c) |
7 | Using the image, alpha_c, and gamma_c as inputs, blend the image with itself using the add-weighted function |
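Algorithm 1 maps directly onto OpenCV's add-weighted blend. In the sketch below, the coefficient names alpha_b and gamma_b in steps 3–4 (elided in the source) are reconstructed by analogy with alpha_c and gamma_c and with the widely used OpenCV brightness/contrast recipe; the default brightness and contrast values are illustrative only.

```python
# Sketch of Algorithm 1 (contrast enhancement) using OpenCV.
# alpha_b/gamma_b follow the standard brightness/contrast recipe; the
# default argument values are illustrative assumptions.
import cv2

def enhance_contrast(img, brightness=0, contrast=64):
    out = img
    if brightness != 0:
        if brightness > 0:
            shadow, highlight = brightness, 255
        else:
            shadow, highlight = 0, 255 + brightness
        alpha_b = (highlight - shadow) / 255.0
        gamma_b = shadow
        # blend the image with itself using the brightness coefficients
        out = cv2.addWeighted(out, alpha_b, out, 0, gamma_b)
    if contrast != 0:
        alpha_c = 131.0 * (contrast + 127) / (127 * (131 - contrast))
        gamma_c = 127.0 * (1 - alpha_c)
        # blend again using the contrast coefficients
        out = cv2.addWeighted(out, alpha_c, out, 0, gamma_c)
    return out
```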
Algorithm 2: Local Binary Pattern (LBP) | |
1 | Take a center pixel from the given image |
2 | Compare the value of the central pixel to the values of the 8 pixels in the vicinity |
3 | If the neighboring pixel’s value is greater than that of the center pixel then that particular pixel is assigned the value 1, else it is assigned 0 |
4 | Replace the center pixel's value with the decimal value of the neighborhood code: C = Σ p_i * 2^i, where 0 ≤ i ≤ 7 and p_i is the thresholded (0/1) value of the i-th neighboring pixel |
5 | For each pixel in the provided image, repeat the preceding instructions |
The LBP feature descriptor is mathematically represented as follows:

LBP_{P,R} = Σ_{p=0}^{P−1} s(n_p − c_p) · 2^p, with s(x) = 1 if x ≥ 0 and s(x) = 0 otherwise,
where R is the radius and P denotes the number of pixels adjacent to it. c_p is the center pixel's grayscale value, and n_p is the grayscale value of the neighboring pixel. The LBP algorithm is detailed in Algorithm 2. Figure 5 compares results obtained from the image enhancement module (contrast enhancement and LBP pre-processing) for random images from SDNET 8 . From Fig. 5 , it is evident that the crack regions are more clearly visible in the contrast-enhanced images than in the original and LBP pre-processed images.
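The LBP transform of Algorithm 2 is available off the shelf. The sketch below, assuming scikit-image with an 8-neighbor, radius-1 configuration, is one way to reproduce the basic operator described above.

```python
# Sketch of LBP pre-processing per Algorithm 2, using scikit-image.
# P = 8 neighbors at radius R = 1; method="default" produces the basic
# 2^P-code LBP described in the text. The file path is a placeholder.
import cv2
from skimage.feature import local_binary_pattern

def lbp_transform(path, P=8, R=1):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return local_binary_pattern(gray, P, R, method="default")
```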
Image enhancement results on random images from SDNET 1 . (a) Original image (b) Contrast-enhanced images (c) LBP-processed images.
Although the LBP operator attempted to capture the underlying texture of the input image, it was unable to highlight the cracked regions. The same is demonstrated by the experimental results in terms of model accuracy on contrast-enhanced versus LBP pre-processed images, as shown in Section “Transfer Learning for Crack Detection with Image Enhancement”.
The effectiveness of deep features extracted from DCNNs for classification is described in this section. The generic CNN architecture comprises a wide range of filters, pooling operators (max pooling, average pooling), and nonlinear activations (ReLU, sigmoid, softmax). The filters are learned in either a supervised or unsupervised manner and extract relevant information from the input image. The pooling layers reduce the spatial dimension of the intermediate feature maps from the convolution layers, and the activations introduce nonlinearity. The initial layers of DCNNs extract basic image features such as edges, textures, and colors, whereas the deeper layers extract complex, class-specific features. This work proposes to use the activations of the deep layers of the CNN as the feature representation for the input images, also known as deep features. Pre-trained CNN models including VGG16, VGG19, ResNet50, MobileNet, etc. were employed to extract deep feature vectors that model the high-level representation of the inputs. The extracted deep feature vectors are then fed into an ML algorithm such as SVM for further classification, as depicted in Fig. 6 .
Crack Detection using ML models Based on Deep Features from DCNNs pipeline.
The choice of deep feature representations for classification with ML models is based on the assumption that ML models produce accurate results when trained on good feature representations, and that the deep features extracted from the final layers of DCNNs provide exactly such high-level representations, implying a symbiotic relationship.
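A minimal sketch of this pipeline, assuming VGG16 as the backbone and scikit-learn's SVC, is shown below. Because include_top=False removes the fully connected layers, globally average-pooled convolutional features stand in here for the fully connected activations described in the text.

```python
# Sketch: pre-trained CNN as a fixed feature extractor feeding an SVM.
# The VGG16 backbone and RBF-kernel SVC are illustrative choices.
import numpy as np
import tensorflow as tf
from sklearn.svm import SVC

extractor = tf.keras.applications.VGG16(
    weights="imagenet", include_top=False, pooling="avg",
    input_shape=(224, 224, 3))

def deep_features(images):
    # images: float array of shape (n, 224, 224, 3)
    x = tf.keras.applications.vgg16.preprocess_input(np.array(images))
    return extractor.predict(x)  # (n, 512) deep feature vectors

# Assuming X_train/y_train and X_test/y_test are prepared as above:
# svm = SVC(kernel="rbf").fit(deep_features(X_train), y_train)
# accuracy = svm.score(deep_features(X_test), y_test)
```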
This section details the datasets used for the experiments. Three publicly available datasets were used for the study: SDNET2018 9 , Concrete Crack Images for Classification (CCIC) 10 , and the Bridge Crack Dataset (BCD) 11 . We formatted each dataset to have an equal number of data points in all classes; note, however, that class imbalance [ 69 ] could lead to different results.
The SDNET dataset includes 56,092 images of cracked and non-cracked bridge, pavement, and wall surfaces. Images of bridge decks were obtained from the Systems, Materials, and Structural Health (SMASH) Laboratory at Utah State University, which houses a variety of full-scale bridge deck sections. Images of walls and pavements were taken on the premises of the Utah State University campus. All of the images are 256 × 256 pixels in size and in .jpg format. Table 1 summarizes the number of crack and non-crack images in each subclass of the SDNET dataset (bridge decks, walls, pavement).
The CCIC dataset includes images of concrete cracks and non-cracks. It contains more than 40,000 pictures gathered from different METU campus buildings. This dataset is balanced and covers only one type of concrete surface, with 20,000 images in each of the crack and non-crack classes. The images are 227 × 227 pixels in size.
Over 6070 images of cracked and non-cracked bridge surfaces are included in the Bridge Crack Dataset (BCD). The crack images were captured using the Phantom 4 Pro's 1024 × 1024 CMOS surface-array camera. The images were later reduced to 224 × 224 pixels to create the dataset. This dataset contains 4056 cracked images and 2014 non-cracked images. The details of the count of crack and non-crack images of the three datasets are provided in Table 2 .
Since all these datasets are quite large, we conducted the experiments with a smaller number of images from each of them. Table 3 summarizes the train and validation split of the images used for experiments for the three datasets and Fig. 7 shows sample images from the three datasets.
Sample crack and non-crack images from the three datasets. (a) Crack images from SDNET (b) Non-crack images from SDNET (c) Crack images from CCIC (d) Non-crack images from CCIC (e) Crack images from BCD (f) Non-crack images from BCD.
This section details the obtained results and their analysis using benchmark evaluation metrics. The performance of 12 image classification models (VGG16, VGG19, Xception, ResNet50, ResNet101, ResNet152, InceptionV3, InceptionResNetV2, MobileNet, MobileNetV2, DenseNet121, and EfficientNetB0) for crack detection was evaluated on three datasets (SDNET, CCIC, and BCD).
The models were implemented on Google Colaboratory and Jupyter Notebook using standard machine learning and deep learning packages. The hardware specifications used for the experiments are listed in Table 4 .
Accuracy, sensitivity, specificity, precision, recall, F1-score, and training duration were used to assess each model's performance. These metrics are computed from the confusion matrix (Table 5 ), which summarizes the model's predictions as true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).

Accuracy is the number of predictions made correctly by the model relative to the total predictions made:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision measures how good the model is at predicting a particular category:

Precision = TP / (TP + FP)

Recall (sensitivity) is the proportion of positive samples that were correctly identified as positive out of all positive samples:

Recall = TP / (TP + FN)

The F1-score, the harmonic mean of precision and recall, is given by:

F1 = 2 × (Precision × Recall) / (Precision + Recall)
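These metrics can be computed directly with scikit-learn; the sketch below uses small placeholder label arrays for illustration and assumes the positive class denotes a crack.

```python
# Sketch: computing the reported metrics with scikit-learn.
# y_true / y_pred are placeholder arrays, not experimental results.
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1]   # 1 = crack, 0 = non-crack
y_pred = [1, 0, 0, 1, 0, 1]

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)            # sensitivity
f1 = f1_score(y_true, y_pred)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)
```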
This subsection covers the study and findings of transfer learning without image enhancement. Using the ImageNet weights from the pre-trained models reduced the number of parameters that needed to be trained in this experiment. To achieve this, the top layers of all the pre-trained models were replaced with a flatten layer, a batch normalization layer, a dropout layer, and a dense layer with two neurons and sigmoid activation. The fine-tuned hyper-parameters of the transfer learning method are tabulated in Table 6 .
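In code, this experiment reduces to a standard fit/evaluate cycle on the model and datasets sketched earlier; the epoch count below is a placeholder, since the tuned hyper-parameter values are those reported in Table 6.

```python
# Continuation of the earlier sketches (model, train_ds, val_ds, test_ds).
# The epoch count is a placeholder, not the tuned value from Table 6.
history = model.fit(train_ds, validation_data=val_ds, epochs=20)
test_loss, test_accuracy = model.evaluate(test_ds)
```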
Figures 8 , 9 and 10 show the performance comparison (precision (%), recall (%), accuracy (%)) of state-of-the-art transfer learned DCNNs on SDNET, CCIC and BCD datasets, respectively.
Performance comparison of transfer learned DCNNs on SDNET without image pre-processing.
Performance of transfer learned DCNNs on CCIC without image pre-processing.
Performance of transfer learned DCNNs on BCD without image pre-processing.
From Fig. 8 , it is observed that ResNet101 is the best model on the SDNET dataset with respect to test accuracy: in 34.45 min of training, the model achieved an accuracy of 53.40%. The poorest-performing model with respect to test accuracy is MobileNetV2, with a test accuracy of 42.7%. The best model on the BCD dataset (Fig. 10 ), EfficientNetB0, has a test accuracy of 98.8% and a training time of 30.15 min, while InceptionV3 has the lowest test accuracy of 47.8% on that dataset. The best model for the CCIC dataset is ResNet50 (refer to Fig. 9 ), which achieved a test accuracy of 99.8% after 25.18 min of training; InceptionV3 has the lowest test accuracy compared to the other DCNNs on this dataset, with 38.6%.
Table 7 summarizes the precision, recall, and F1 score obtained for the transfer-learned DCNNs on the three publicly available datasets under study. From Table 7 , it is evident that all the transfer-learned DCNNs perform poorly on the SDNET dataset compared to CCIC and BCD in terms of the three benchmark evaluation metrics under consideration. Based on this observation, the SDNET dataset was selected for the second experiment, transfer learning on enhanced crack images.
Images from the SDNET dataset were pre-processed using the image enhancement algorithms, and the transfer-learned DCNNs were then trained on the enhanced images. Contrast enhancement and texture analysis using the LBP operator were employed to enhance the crack images. EfficientNetB0 achieved the highest test accuracy of 65.10% on contrast-enhanced images (an improvement of 16.8%), whereas MobileNetV2 achieved the lowest at 41.20%. Among the models trained on LBP pre-processed images, Xception fared best with a test accuracy of 60.80% (an improvement of 15.6%), whereas ResNet152 underperformed with a test accuracy of 42.40%. Figures 11 and 12 compare the performance of the transfer-learned DCNNs on contrast-enhanced images and LBP pre-processed images, respectively.
Performance of transfer learned DCNNs on SDNET with contrast enhancement.
Performance of transfer learned DCNNs on SDNET with LBP pre-processing.
Table 8 compares the performance without and with image enhancement on SDNET images, highlighting the improvements in recall, precision, and F1 score. It is evident from the table that contrast enhancement improved the performance of most of the deep CNN architectures under consideration for crack detection, since the enhanced images highlight the cracked regions better than the original images.
In this subsection, deep features extracted from the final fully connected layers of the DCNNs are combined with a Support Vector Machine (SVM) to categorize the images into crack and non-crack classes. The SVM is well suited to datasets with fewer samples of high-dimensional features, and the deep features extracted from the fully connected layers of DCNNs are high-dimensional in nature 62 . Deep features with an SVM increased the overall classification accuracy of the models, as tabulated in Table 9 . From Fig. 13 , it is understood that MobileNet produced the best accuracy of 83.16% on the SDNET dataset with deep features and SVM, while VGG16 achieved 77.16%. All 12 deep CNN models achieved an accuracy greater than 99% on the CCIC dataset, as shown in Fig. 14 . VGG16, ResNet152, MobileNet, MobileNetV2, and EfficientNetB0 were the most accurate in this category, each with 99.83% accuracy. Among these top five DCNNs, MobileNetV2 has the fewest training parameters (2,223,872). From these observations, it can be inferred that MobileNetV2 demonstrated the best trade-off between accuracy and trainable parameters on the CCIC dataset.
Performance of SDNET with deep features.
Performance of CCIC with deep features.
From Fig. 15 , it is observed that the best models on the BCD dataset are ResNet101 and EfficientNetB0, both with an accuracy of 99.83%. EfficientNetB0 is preferred over ResNet101 as it has nearly ten times fewer trainable parameters.
Performance of BCD with deep features.
Table 9 compares the improvements of the ML models based on deep features across all three datasets. MobileNet performed best among the SDNET models, with an increase in accuracy of between 20 and 30%. For the CCIC dataset the accuracy improvement is 10%, and for the BCD dataset it is 11%.
The proposed study employed 12 pre-trained CNN models to identify the best performer for distinguishing crack and non-crack surfaces, assessing both deep feature extraction and transfer learning with architectures such as InceptionV3, InceptionResNetV2, MobileNet, MobileNetV2, DenseNet121, and EfficientNetB0. Across the three datasets (SDNET, CCIC, and BCD), the models performed exceptionally well on CCIC and BCD. This is because each of these datasets is consistent, containing only one type of surface, whereas the SDNET dataset contains crack and non-crack images of several different surfaces, making it challenging for the models to achieve comparable accuracy. Among the transfer learning models, ResNet101 performed best on the SDNET dataset, while ResNet50 and EfficientNetB0 were the best-performing models on the CCIC and BCD datasets, respectively. Even though some models did well on CCIC and BCD, their accuracy could still be improved. The second experiment, in which deep features were extracted and classified with an SVM, produced better results than the first: MobileNet performed best among the SDNET models, with an accuracy increase of between 20 and 30%, and on the CCIC and BCD datasets each model's accuracy was close to 99%. The models' accuracy thus improves significantly when extracted deep features are fed to the SVM classifier. To further increase accuracy on the SDNET dataset, a third experiment assessed the models after training on pre-processed images. While the LBP texture operator did not significantly affect model accuracy, contrast enhancement proved to be a helpful pre-processing strategy that led to greater accuracy. This experiment outperformed the plain transfer learning models, but not the accuracy attained by deep features fed into the SVM classifier. From Tables 10 and 11 it is inferred that, of the three experiments, classifying images as crack or non-crack using deep features fed to the SVM classifier was the most successful, producing superior accuracies across all datasets (SDNET: MobileNet; CCIC: MobileNetV2; BCD: EfficientNetB0).
The proposed study compared the effectiveness of Deep Convolutional Neural Networks as a classifier and as a feature extractor for crack detection.
The performance of 12 different transfer-learned DCNN models for crack detection was evaluated and analyzed on three publicly available datasets: SDNET, CCIC and BCD. The effectiveness of image enhancement and deep features extracted from the final fully connected layers of CNN models for classification was also analyzed in terms of benchmark evaluation metrics.
ResNet101 (accuracy: 53.40%), EfficientNetB0 (accuracy: 98.8%), and ResNet50 (accuracy: 99.8%) produced the best accuracy on the original images from the SDNET, BCD, and CCIC datasets, respectively. Since the effectiveness of the transfer-learned deep models was minimal on the SDNET images, two image enhancement methods (contrast enhancement and Local Binary Pattern) were applied to those images.
The experimental results show that contrast-enhanced images significantly improved the accuracy of the transfer-learned crack detection models, whereas LBP pre-processing had little effect.
The effectiveness of deep features extracted from the final fully connected layers of the DCNNs was analyzed in terms of classification accuracy. The extracted deep features were fed into an SVM for classification, and the analysis in terms of accuracy, precision, recall, and F1-score revealed that integrating deep features with the SVM improved detection accuracy across all DCNN-dataset combinations.
Among the SDNET models, MobileNet was the finest, with an improvement in accuracy of between 20 and 30%. On the CCIC and BCD datasets, accuracies were close to 99%, with MobileNetV2 and EfficientNetB0 performing best, respectively.
The main takeaway is that these models can enhance the efficiency, accuracy, and decision-making processes of civil engineering applications. ML/DL models make structural health monitoring considerably easier and more efficient: they can identify potential structural issues at an early stage, contributing to faster maintenance and better safety. A custom ensemble model combining the best DCNNs for crack detection could be considered as the future scope of this study. There has been substantial research applying ML and DL models to related problems such as security 63 and resource allocation 64 . With models that detect cracks accurately enough, numerous consumer-facing use cases could be developed.
The datasets generated and/or analysed during the current study are available in the SDNET2018 repository, https://digitalcommons.usu.edu/cgi/viewcontent.cgi?article=4611&context=cee_facpub 9 , Concrete Crack Images for Classification (CCIC) repository, https://www.kaggle.com/datasets/arnavr10880/concrete-crack-images-for-classification 10 and Bridge Crack Dataset (BCD) repository, https://www.mdpi.com/2076-3417/9/14/2867 11 .
Yi, Y., Zhu, D., Guo, S., Zhang, Z. & Shi, C. A review on the deterioration and approaches to enhance the durability of concrete in the marine environment. Cement Concr. Compos. 113 , 103695 (2020).
Ham, Y., Han, K. K., Lin, J. J. & Golparvar-Fard, M. Visual monitoring of civil infrastructure systems via camera-equipped unmanned aerial vehicles (UAVs): A review of related works. Vis. Eng. 4 (1), 1–8 (2016).
Sharma, K. V. et al. Prognostic modeling of polydisperse SiO2/Aqueous glycerol nanofluids’ thermophysical profile using an explainable artificial intelligence (XAI) approach. Eng. Appl. Artif. Intell. 126 , 106967 (2023).
Kanti, P. K. et al. Thermophysical profile of graphene oxide and MXene hybrid nanofluids for sustainable energy applications: Model prediction with a Bayesian optimized neural network with K-cross fold validation. FlatChem 39 , 100501 (2023).
Kanti, P. et al. Properties of water-based fly ash-copper hybrid nanofluid for solar energy applications: Application of RBF model. Sol. Energy Mater. Sol. Cells 234 , 111423 (2022).
Kanti, P. K. et al. The stability and thermophysical properties of Al2O3-graphene oxide hybrid nanofluids for solar energy applications: application of robust autoregressive modern machine learning technique. Sol. Energy Mater. Sol. Cells 253 , 112207 (2023).
Hsieh, Y. A. & Tsai, Y. J. Machine learning for crack detection: Review and model performance comparison. J. Comput. Civ. Eng. 34 (5), 04020038 (2020).
Munawar, H. S., Hammad, A. W. A., Haddad, A., Soares, C. A. P. & Waller, S. T. Image-based crack detection methods: A review. Infrastructures 6 , 115. https://doi.org/10.3390/infrastructures6080115 (2021).
Dorafshan, S., Thomas, R. J. & Maguire, M. SDNET2018: An annotated image dataset for non-contact concrete crack detection using deep convolutional neural networks. Data Brief 21 , 1664–1668 (2018).
Çağlar, F., Özgenel, R.: Concrete crack images for classification. Mendeley Data 2 (2019)
Xu, H. et al. Automatic bridge crack detection using a convolutional neural network. Appl. Sci. 9 (14), 2867 (2019).
Harinath Reddy, C., Mini, K., Radhika, N.: Structural health monitoring—an integrated approach for vibration analysis with wireless sensors to steel structure using image processing. In: International Conference on ISMAC in Computational Vision and Bio-Engineering, pp. 1595–1610 (2018). Springer
Pauly, L., Hogg, D., Fuentes, R., Peel, H.: Deeper networks for pavement crack detection. In: Proceedings of the 34th ISARC, pp. 479–485 (2017). IAARC
Lins, R. G. & Givigi, S. N. Automatic crack detection and measurement based on image analysis. IEEE Trans. Instrum. Meas. 65 (3), 583–590. https://doi.org/10.1109/TIM.2015.2509278 (2016).
Shahrokhinasab, E., Hosseinzadeh, N., Monirabbasi, A. & Torkaman, S. Performance of image-based crack detection systems in concrete structures. J. Soft Comput. Civ. Eng. 4 (1), 127–139 (2020).
Munawar, H. S., Hammad, A. W., Haddad, A., Soares, C. A. P. & Waller, S. T. Image-based crack detection methods: A review. Infrastructures 6 (8), 115 (2021).
Zou, Q., Cao, Y., Li, Q., Mao, Q. & Wang, S. Cracktree: Automatic crack detection from pavement images. Pattern Recognit. Lett. 33 (3), 227–238 (2012).
Salman, M., Mathavan, S., Kamal, K. & Rahman, M. Pavement crack detection using the Gabor filter. In 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013) (eds Salman, M. et al. ) 2039–2044 (IEEE, 2013).
Niu, B., Wu, H. & Meng, Y. Application of cem algorithm in the field of tunnel crack identification. In 2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC) (eds Niu, B. et al. ) 232–236 (IEEE, 2020).
Chhabra, G. et al. Human emotions recognition, analysis and transformation by the bioenergy field in smart grid using image processing. Electronics 11 , 4059. https://doi.org/10.3390/electronics11234059 (2022).
Baltazart, V., Nicolle, P. & Yang, L. Ongoing tests and improvements of the mps algorithm for the automatic crack detection within grey level pavement images. In 2017 25th European Signal Processing Conference (EUSIPCO) (eds Baltazart, V. et al. ) 2016–2020 (IEEE, 2017).
Jo, J. & Jadidi, Z. A high precision crack classification system using multi-layered image processing and deep belief learning. Struct. Infrastruct. Eng. 16 (2), 297–305 (2020).
Landstrom, A. & Thurley, M. J. Morphology-based crack detection for steel slabs. IEEE J. Sel. Top. Signal Process. 6 (7), 866–875 (2012).
Prasanna, P. et al. Automated crack detection on concrete bridges. IEEE Trans. Autom. Sci. Eng. 13 (2), 591–599 (2014).
Lin, M., Zhou, R., Yan, Q. & Xu, X. Automatic pavement crack detection using hmrf-em algorithm. In 2019 International Conference on Computer, Information and Telecommunication Systems (CITS) (eds Lin, M. et al. ) 1–5 (IEEE, 2019).
Pratico, F. G., Fedele, R., Naumov, V. & Sauer, T. Detection and monitoring of bottom-up cracks in road pavement using a machine-learning approach. Algorithms 13 (4), 81 (2020).
Zhang, F. et al. A new identification method for surface cracks from uav images based on machine learning in coal mining areas. Remote Sens. 12 (10), 1571 (2020).
Zhang, L. et al. Machine learning-based real-time visible fatigue crack growth detection. Digit. Commun. Netw. 7 (4), 551–558 (2021).
Dharneeshkar, J. et al. Deep learning based detection of potholes in indian roads using yolo. In 2020 International Conference on Inventive Computation Technologies (ICICT) (eds Dharneeshkar, J. et al. ) 381–385 (IEEE, 2020).
Li, H., Zong, J., Nie, J., Wu, Z. & Han, H. Pavement crack detection algorithm based on densely connected and deeply supervised network. IEEE Access 9 , 11835–11842 (2021).
Zhang, L., Yang, F., Zhang, Y. D. & Zhu, Y. J. Road crack detection using deep convolutional neural network. In 2016 IEEE International Conference on Image Processing (ICIP) (eds Zhang, L. et al. ) 3708–3712 (IEEE, 2016).
Meng, X. Concrete crack detection algorithm based on deep residual neural networks. Sci. Program. 2021 , 1–7 (2021).
Su, C. & Wang, W. Concrete cracks detection using convolutional neural network based on transfer learning. Math. Problems Eng. 2020 , 1–10 (2020).
Ye, X.-W., Jin, T. & Chen, P.-Y. Structural crack detection using deep learning–based fully convolutional networks. Adv. Struct. Eng. 22 (16), 3412–3419 (2019).
Feng, C. et al. Structural damage detection using deep convolutional neural network and transfer learning. KSCE J. Civ. Eng. 23 (10), 4493–4502 (2019).
Kim, C. N., Kawamura, K., Nakamura, H. & Tarighat, A. Automatic crack detection for concrete infrastructures using image processing and deep learning. In IOP Conference Series: Materials Science and Engineering Vol. 829 (eds Kim, C. N. et al. ) 012027 (IOP Publishing, 2020).
Cao, M.-T., Tran, Q.-V., Nguyen, N.-M. & Chang, K.-T. Survey on performance of deep learning models for detecting road damages using multiple dashcam image resources. Adv. Eng. Inform. 46 , 101182 (2020).
Nguyen, N. H. T., Perry, S., Bone, D., Le, H. T. & Nguyen, T. T. Two-stage convolutional neural network for road crack detection and segmentation. Expert Syst. Appl. 186 , 115718 (2021).
Park, S. E., Eem, S.-H. & Jeon, H. Concrete crack detection and quantification using deep learning and structured light. Constr. Build. Mater. 252 , 119096 (2020).
Huyan, J., Li, W., Tighe, S., Xu, Z. & Zhai, J. CrackU-net: A novel deep convolutional neural network for pixelwise pavement crack detection. Struct. Control Health Monit. 27 (8), 2551 (2020).
Kim, B., Yuvaraj, N., Sri Preethaa, K. & Arun Pandian, R. Surface crack detection using deep learning with shallow cnn architecture for enhanced computation. Neural Computing Appl. 33 (15), 9289–9305 (2021).
Fukushima, K.: Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Networks 1 , 119–130 (1988).
Russakovsky, O. et al. Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. 115 (3), 211–252 (2015).
LeCun, Y. et al. Handwritten digit recognition with a back-propagation network. Adv. Neural Inf. Process. Syst. 2 , 396–404 (1989).
Arel, I., Rose, D. C. & Karnowski, T. P. Deep machine learning-a new frontier in artificial intelligence research [research frontier]. IEEE comput. Intel. Mag. 5 (4), 13–18 (2010).
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large- scale image recognition. arXiv preprint arXiv:1409.1556 (2014).
Chollet, F.: Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017).
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016).
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016).
Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.A.: Inception-v4, inception-resnet and the impact of residual connections on learning. In: Thirty-first AAAI Conference on Artificial Intelligence (2017).
Howard, A. G. et al. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint. Available: http://arxiv.org/abs/1704.04861 (2017).
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018).
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017).
Tan, M. & Le, Q. EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning (eds Tan, M. & Le, Q.) 6105–6114 (PMLR, 2019).
Sikha, O. & Bharath, B. Vgg16-random fourier hybrid model for masked face recognition. Soft Comput. 26 , 1–16 (2022).
Srihari, K. & Sikha, O. Partially supervised image captioning model for urban road views. In Intelligent Data Communication Technologies and Internet of Things (eds Srihari, K. & Sikha, O.) 59–73 (Springer, 2022).
Krishnan, G. & Sikha, O. Analysis on the Effectiveness of Transfer Learned Features for x-ray Image Retrieval. In Innovative Data Communication Technologies and Application (eds Krishnan, G. & Sikha, O.) 251–265 (Springer, 2022).
Brownlee, J. A Gentle Introduction to Transfer Learning for Deep Learning (Machine Learning Mastery, 2017).
Wang, Y. et al. Research on crack detection algorithm of the concrete bridge based on image processing. Proced. Comput. Sci. 154 , 610–616 (2019).
Chen, C., Seo, H., Jun, C. H. & Zhao, Y. Pavement crack detection and classification based on fusion feature of LBP and pca with SVM. Int. J. Pavement Eng. 23 (9), 3274–3283 (2022).
Ojala, T., Pietikainen, M. & Harwood, D. A comparative study of texture measures with classification based on featured distributions. Pattern Recognit. 29 (1), 51–59 (1996).
Sari, Y., Prakoso, P. B. & Baskara, A. R. Road crack detection using support vector machine (svm) and otsu algorithm. In 2019 6th International Conference on Electric Vehicular Technology (ICEVT) (eds Sari, Y. et al. ) 349–354 (IEEE, 2019).
Shafiq, M., Yadav, R., Javed, A. R. & Mohsin, S. A. H. CoopGBFS: A Federated Learning and Game-Theoretic Based Approach for Personalized Security, Recommendation in 5G Beyond IoT Environments for Consumer Electronics (IEEE, 2023).
Shafiq, M., Tian, Z., Liu, Y., Aljuhani, A. & Li, Y. ESC&RAO: Enabling seamless connectivity resource allocation in tactile IoT for consumer electronics. IEEE Trans. Consum. Electron. https://doi.org/10.1109/TCE.2023.3327136 (2023).
We thank everyone who supported the study in one way or another.
HQ thanks the support of NSF award 1761839 and 2200138.
Authors and affiliations.
Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Coimbatore, Tamil Nadu, 641112, India
K. S. Bhalaji Kharthik & O. K. Sikha
Department of Mathematics and Computer Science, Coal City University, Enugu, Nigeria
Edeh Michael Onyema
Adjunct Faculty, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India
Department of Environmental Health, Harvard T H Chan School of Public Health, Boston, MA, 02115, USA
Saurav Mallik
School of Engineering (CSE), Anurag University, Hyderabad, India
B. V. V. Siva Prasad
Department of Computer Science and Engineering, The University of Tennessee at Chattanooga, Chattanooga, TN, USA
Department of Computer Science and Engineering, Indian Institute of Information Technology, Kottayam, Kerala, 686635, India
Dept. of Information and Communication Technologies, BCN Medtech, Universitat Pompeu Fabra, Barcelona, Spain
O. K. Sikha
B.K.K.S., E.M.O., and S.M. developed the idea and implemented it; B.V.V.S.P. wrote the manuscript; H.Q., S.C., and S.O.K. reviewed and revised the manuscript. All authors consented to the publication of this paper.
Correspondence to Edeh Michael Onyema , Saurav Mallik or Hong Qin .
Competing interests.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .
Cite this article.
Bhalaji Kharthik, K.S., Onyema, E.M., Mallik, S. et al. Transfer learned deep feature based crack detection using support vector machine: a comparative study. Sci Rep 14 , 14517 (2024). https://doi.org/10.1038/s41598-024-63767-5
Received : 07 March 2024
Accepted : 31 May 2024
Published : 24 June 2024
DOI : https://doi.org/10.1038/s41598-024-63767-5