Qualitative case study data analysis: an example from practice


  • 1 School of Nursing and Midwifery, National University of Ireland, Galway, Republic of Ireland.
  • PMID: 25976531
  • DOI: 10.7748/nr.22.5.8.e1307

Aim: To illustrate an approach to data analysis in qualitative case study methodology.

Background: There is often little detail in case study research about how data were analysed. However, it is important that comprehensive analysis procedures are used because there are often large sets of data from multiple sources of evidence. Furthermore, the ability to describe in detail how the analysis was conducted ensures rigour in reporting qualitative research.

Data sources: The research example used is a multiple case study that explored the role of the clinical skills laboratory in preparing students for the real world of practice. Data analysis was conducted using a framework guided by the four stages of analysis outlined by Morse (1994): comprehending, synthesising, theorising and recontextualising. The specific strategies for analysis in these stages centred on the work of Miles and Huberman (1994), which has been successfully used in case study research. The data were managed using NVivo software.

Review methods: Literature examining qualitative data analysis was reviewed and strategies illustrated by the case study example provided.

Discussion: Each stage of the analysis framework is described, with illustration from the research example, to highlight the benefits of a systematic approach to handling large data sets from multiple sources.

Conclusion: By providing an example of how each stage of the analysis was conducted, it is hoped that researchers will be able to consider the benefits of such an approach to their own case study analysis.

Implications for research/practice: This paper illustrates specific strategies that can be employed when conducting data analysis in case study research and other qualitative research designs.

Keywords: Case study data analysis; case study research methodology; clinical skills research; qualitative case study methodology; qualitative data analysis; qualitative research.

  • Case-Control Studies*
  • Data Interpretation, Statistical*
  • Nursing Research / methods*
  • Qualitative Research*
  • Research Design


Doing Data Science: A Framework and Case Study

Today’s data revolution is not just about big data; it is about data of all sizes and types. While the issues of volume and velocity presented by the ingestion of massive amounts of data remain prevalent, it is the rapidly developing challenges presented by the third v, variety, that necessitate more attention. The need for a comprehensive approach to discover, access, repurpose, and statistically integrate all the varieties of data is what has led us to the development of a data science framework that forms our foundation of doing data science. Unique features in this framework include problem identification, data discovery, data governance and ingestion, and ethics. A case study is used to illustrate the framework in action. We close with a discussion of the important role of data acumen.

Keywords: data science framework, data discovery, ethics, data acumen, workforce

Media Summary

In the words of Thomas Jefferson, ‘knowledge is power,’ an adage that data scientists understand all too well, given that data science is quickly becoming the new currency of sound decision making and policy development. But not all data are created equal.

Today’s data revolution is not just about big data, but the emergence of all sizes and types of data. Advances in information technology, computation, and statistics now make it possible to access, integrate, and analyze massive amounts of data over time and space. Further, massive repurposing (using data for purposes other than those for which it was gathered) is becoming an increasingly common practice and concern. These data are often incomplete, flawed, challenging to access, and nonrepresentative.

That predicament is driving a significant need for a data-literate population able to move from simple data analytics to actually ‘doing data science.’ To bring this to bear, researchers from the University of Virginia’s (UVA) Biocomplexity Institute and Initiative have developed a research model and data science framework to help mature data science. The data science framework and associated research processes are fundamentally tied to practical problem solving; highlight data discovery as an essential but often overlooked step in most data science frameworks; and incorporate ethical considerations as a critical feature of the research. Finally, as data are becoming the new currency across our economy, the UVA research team emphasizes the obligation of data scientists to enlighten decision makers on data acumen (literacy). Helping consumers of research understand the data and the role they play in problem solving and policy development is important, as is building a data-savvy workforce interested in public-good applications, such as the Data Science for the Public Good young scholars program led by the Biocomplexity Institute team.

Today’s data revolution is more about how we are ‘doing data science’ than just ‘big data analytics,’ a buzzword with little value to policymakers or communities trying to solve complex social issues. With a proper research model and framework, it is possible to bring the ‘all data’ revolution to all organizations, from local, state, and federal governments to industry and nonprofit organizations, expanding its reach, application, understanding, and impact.

1. Introduction

Data science is the quintessential translational research field that starts at the point of translation—the real problem to be solved. It involves many stakeholders and fields of practice and lends itself to team science. Data science has evolved into a powerful transdisciplinary endeavor. This article shares our development of a framework to build an understanding of what it means to just do data science.

We have learned how to do data science in a rather unique research environment within the University of Virginia’s Biocomplexity Institute , one that is an intentional collection of statisticians and social and behavioral scientists with a common interest in channeling data science to improve the impact of decision making for the public good. Our data science approach to research is based on addressing real, very applied public policy problems. It is a research model that starts with translation by working directly with the communities or stakeholders and focusing on their problems. This results in a ‘research pull’ versus a ‘research push’ to lay the research foundation for data science. Research push is the traditional research paradigm. For example, research in biology and life sciences moves from basic bench science to bedside practice. For data science, it is through working several problems in multiple domains that the synergies and overarching research needs emerge, hence a research pull.

Through our execution of multiple and diverse policy-focused case studies, synergies and research needs across the problem domains have surfaced. A data science framework has emerged and is presented in the remainder of this article along with a case study to illustrate the steps. This data science framework warrants refining scientific practices around data ethics and data acumen (literacy). A short discussion of these topics concludes the article.

2. Data Science Framework

Conceptual models are being proposed for capturing the life cycle of data science, for example, Berkeley School of Information (2019) and Berman et al. (2018). A simple Google search of ‘data science’ brings forward pages and pages of images. These figures have overlapping features and are able to nicely summarize several components of the data science process. We find it critical to go beyond the conceptual framing and have created a framework that can be operationalized for the actual practice of data science.

Our data science framework (see Figure 1) provides a comprehensive approach to data science problem solving and forms the foundation of our research (Keller, Korkmaz, Robbins, & Shipp, 2018; Keller, Lancaster, & Shipp, 2017). The process is rigorous, flexible, and iterative in that learning at each stage informs prior and subsequent stages. There are four features of our framework that deviate from other frameworks and will be described in some detail. First, we specify the problem to be addressed and keep it ever-present in the framework, hence grounding the data science research in a problem to be solved. Second, we undertake data discovery, the search for existing data sources, as a primary activity and not an afterthought. Third, governance and data ingestion play a critical role in building trust and establishing data-sharing protocols. Fourth, we actively connect data science ethics to all components of the framework.


Figure 1. Data science framework. The data science framework starts with the research question, or problem identification, and continues through the following steps: data discovery—inventory, screening, and acquisition; data ingestion and governance; data wrangling—data profiling, data preparation and linkage, and data exploration; fitness-for-use assessment; statistical modeling and analyses; communication and dissemination of results; and ethics review.

In the following, we describe the components of the data science framework. Although the framework is described in a linear fashion, it is far from a linear process as represented by a circular arrow that integrates the process. We also provide a case study example for youth obesity and physical activity in Fairfax County, Virginia, that walks through the components of the framework to demonstrate how a disciplined implementation of the steps taken to do data science ensures transparency and reproducibility of the research.

2.1. Problem Identification

Data science brings together disciplines and communities to conduct transdisciplinary research that provides new insights into current and future societal challenges (Berman et al., 2018). Data become a common language for communication across disciplines (Keller, 2007; Keller et al., 2017). The data science process starts with the identification of the problem. Relevant theories and framing hypotheses are identified through traditional literature reviews, including reviews of the grey literature (e.g., government, industry, and nonprofit organization reports), to find best practices. Subject matter (domain) expertise also plays a role in translating the information acquired into an understanding of the underlying phenomena in the data (Box, Hunter, & Hunter, 1978). Domain knowledge provides the context to define, evaluate, and interpret the findings at each stage of the research (Leonelli, 2019; Snee, DeVeaux, & Hoerl, 2014).

Domain knowledge is critical to bringing data to bear on real problems. It can take many forms, from understanding the theory, the modeling, or the underlying changes observed in data. For example, when we repurpose local administrative data for analyses, community leaders can explain underlying factors and trends in the data that may not be apparent without contextual knowledge.

Case Study Application—Problem Identification

The Health and Human Services (HHS) of Fairfax County, Virginia, is interested in developing capacity for data-driven approaches to gain insights on current issues, such as youth obesity, by characterizing social and economic factors at the county and subcounty level and creating statistical models to inform policy options. Fairfax County is a large county (406 square miles) with 1.1 million people across all income groups and ethnicities. The obesity rate in the United States has steadily increased since the 1970s, owing to the growing availability of food and the decline in physical activity that occurs as people get older. The project aims are to identify trends and activities related to obesity across geographies of interest for local policy and program development.

The HHS sponsors provided insight and context in identifying geographic regions of interest for Fairfax County decision makers. Instead of using traditional census tracts to analyze subcounty trends, they requested that the analyses be based on Fairfax County high school attendance areas and political districts. As described in the following, this led to innovations in our research through the creation of synthetic information technology to align data by these geographic dimensions.

2.2. Data Discovery (Data Inventory, Screening, and Acquisition)

Data discovery is the identification of potential data sources that could be related to the specific topic of interest. Data pipelines and associated tools typically start at the point of acquisition or ingestion of the data (Weber, 2018). A unique feature of our data science framework is to start the data pipeline with data discovery. The goal of the data discovery process is to think broadly and imaginatively about all data, capturing the full potential variety of data (the third v of the data revolution) that could be useful for the problem at hand and literally assemble a list of these data sources.

An important component of doing data science is to first focus on massive repurposing of existing data in the conceptual development work. Data science methods provide opportunities to wrangle these data and bring them to bear on the research questions. In contrast to traditional research approaches, data science research allows researchers to explore all existing data sources before considering the design of new data collection. The advantage of this approach is that data collection can be directly targeted at current gaps in knowledge and information.

Khan, Uddin, and Gupta (2014) address the importance of variety in data science sources. Even within the same type of data, for example, administrative data, the problem (research question) drives its use and applicability of the information content to the issue being addressed. This level of variety drives what domain discoveries can be made (“Data Diversity,” 2019). Borgman (2019) notes that data are human constructs. Researchers and subject matter experts decide “what are data for a given purpose, how those data are to be interpreted, and what constitutes appropriate evidence.” A similar perspective is that data are “relational,” and their meaning relies on their history (how the data are born and evolve), their characteristics, and the interpretation of the data when analyzed (Leonelli, 2019).

Integrating data from disparate sources involves creating methods based on statistical principles that assess the usability of the data (United Nations Economic Commission for Europe, 2014, 2015). These integrated data sources provide the opportunity to observe the social condition and to answer questions that have been challenging to solve in the past. This highlights that the usefulness and applicability of the data vary depending on its use and domain. There are barriers to using repurposed data, which are often incomplete, challenging to access, not clean, and nonrepresentative. There may also exist restrictions on data access, data linkage, and redistribution that stem from the necessity of governance across multiple agencies and organizations. Finally, repurposed data may pose methodological issues in terms of inference or creating knowledge from data, often in the form of statistical, computational, and theoretical models (Japec et al., 2015; Keller, Shipp, & Schroeder, 2016).

When confronted over and over with data discovery and repurposing tasks, it becomes imperative to understand how data are born. To do this, we have found it useful to define data in four categories: designed, administrative, opportunity, and procedural. These definitions are given in Table 1 (Keller et al., 2017; Keller et al., 2018). The expected benefits of data discovery and repurposing are the use of timely and frequently low-cost (existing) data, large samples, and geographic granularity. The outcomes are a richer source of data to support the problem solving and a better-informed research plan. A caveat is the need to also weigh the costs of repurposing existing data against those of new data collection, questioning whether new experiments would provide faster and less biased results than finding and repurposing data. In our experience, the benefits of repurposing existing data sources often outweigh these costs and, more importantly, provide guidance on data gaps for cost-effective development of new data collection.

Table 1. Data types.

Note. Adapted from Keller et al. (2018).

The typology of data (designed, administrative, opportunity, and procedural) provides a systematic way to think about possible data sources and a foundation for the data discovery steps. Data inventory is the process by which the data sources are first identified through brainstorming, searching, and snowballing processes (see Figure 2).

A short set of data inventory questions is applied to assess the usefulness of the data sources in supporting the research objectives for a specific problem. The process is iterative, starting with the data inventory questions to assess whether the data source meets the basic criteria for the project with respect to the type of data, the recurring nature of the data, data availability for the time period needed, geographic granularity, and the unit of analysis required. If the data meet the basic criteria, then they undergo additional screening to document the provenance, purpose, frequency, gaps, use in prior research, and other uses of the data. We employ a ‘data map’ to help drive our data discovery process (see Figure 3). Throughout the course of the project, as new ideas and data sources are discovered, they are inventoried and screened for consideration.
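The basic screening criteria above can be sketched as a small filter. This is an illustration only: the field names, candidate sources, and criteria values below are hypothetical, not part of the published inventory.

```python
# Sketch of the data inventory screening step: apply the basic criteria
# (time coverage, geographic granularity, known unit of analysis) to each
# candidate source. All fields and thresholds are invented for illustration.

def screen_source(source, needed_years, needed_granularity):
    """Apply the basic inventory questions to one candidate data source.

    `source` is a dict with hypothetical keys: 'name', 'years',
    'granularity', and 'unit_of_analysis'.
    """
    checks = {
        "covers_years": needed_years.issubset(set(source["years"])),
        "fine_enough": source["granularity"] in needed_granularity,
        "unit_known": source.get("unit_of_analysis") is not None,
    }
    return all(checks.values()), checks

candidates = [
    {"name": "ACS", "years": [2015, 2016, 2017],
     "granularity": "census_tract", "unit_of_analysis": "household"},
    {"name": "national_survey", "years": [2016],
     "granularity": "state", "unit_of_analysis": "person"},
]

kept = [s["name"] for s in candidates
        if screen_source(s, {2016}, {"census_tract", "census_block"})[0]]
# Sources failing the geographic-granularity question are screened out,
# mirroring how several national sources were excluded in the case study.
```

Sources that pass these basic questions would then move on to the fuller screening (provenance, purpose, frequency, gaps) described above.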

The acquisition process for existing data sources depends on the type and source of the data being accessed and includes downloading data, scraping the Web, acquiring data directly from a sponsor, or purchasing data from aggregators or other sources. It also includes the development and initiation of data-sharing agreements, as necessary.


Figure 2. Data discovery filter. Data discovery is the open-ended and continuous process whereby candidate data sources are identified. Data inventory refers to the broadest, most far-reaching ‘wish list’ of information pertaining to the research questions. Data screening is an evaluative process by which eligible data sets are sifted from the larger pool of candidate data sets. Data acquisition is the process of acquiring the data from a sponsor, purchasing it, downloading it using an application programming interface (API), or scraping the web.

Case Study Application—Data Discovery

The creation of a data map highlights the types of data we want to ‘discover’ for this project (see Figure 3). This is guided by a literature review and by Fairfax County subject matter experts who are part of the Community Learning Data Driven Discovery (CLD3) team for this project (Keller, Lancaster, & Shipp, 2017). The data map immediately captures the multiple units of analysis that will need to be integrated in the analysis, and it helps the team identify potential implicit biases and ethical considerations.

Figure 3. Data map. The data map highlights the types of data desired for the study and is used as a guide for data discovery. The lists are social determinants and physical infrastructure that could affect teen behaviors. The map highlights the various units of analysis that will need to be captured and linked in the analyses: individuals, groups and networks of individuals, and geographic areas.

Data inventory, screening, and acquisition. The data map then guides our approach to identify, inventory, and screen the data. We screened each source to assess its relevance to this project, as follows.

For surveys and administrative data:

  • Are the data at the county or subcounty level? (Note: this question screened out several national sources of data that are not available at the geographic granularity needed for the study.)
  • What years are the data available? That is, are they for the same years as the American Community Survey (ACS) and the Fairfax Youth Survey?
  • Can we acquire and use the data in the timeframe of the project (e.g., March to September)?

For place-based data:

  • Is an address provided?
  • Can the type of establishment be identified?
  • Can we acquire and use the data in the timeframe of the project?

Following the data discovery step, we identified and acquired survey, administrative, and place-based (opportunity) data to be used in this study. These are summarized in Table 2.
The baseline data are the ACS, which provides demographic and economic data at the census block and census tract levels. We characterize the housing and rental stock in Fairfax County through the use of property tax assessment administrative records. Geo-coded place-based data are scraped from the Web and include location of grocery stores, convenience stores, restaurants (full-service and fast food), recreation centers, and other opportunities for physical activity. We also acquired Fairfax County Youth Survey aggregates (at the high school boundary level) and Fairfax Park Authority administrative data.

Table 2. Selected data sources.

2.3. Data Governance and Ingestion

Data governance is the establishment of and adherence to rules and procedures regarding data access, dissemination, and destruction. In our data science framework, access to and management of data sources is defined in consultation with the stakeholders and the university’s institutional review board (IRB). Data ingestion is the process of bringing data into the data management platform(s).

Combining disparate data sources can raise issues around privacy and confidentiality, frequently from conflicting interests among researchers and sponsors working together. For clarity, privacy refers to the amount of personal information individuals allow others to access about themselves and confidentiality is the process that data producers and researchers follow to keep individuals’ data private (National Research Council, 2007).

For some, it becomes intoxicating to think about the massive amounts of individual data records that can be linked and integrated, leading to ideas about following the behavioral patterns of specific individuals, as a social worker might want to do. This has led us to a data science guideline distinguishing between ensuring confidentiality of the data for research and policy analyses and supporting real-time activities such as casework (Keller et al., 2016). Casework requires identification of individuals and families for the data to be useful; policy analysis does not. For casework, information systems must be set up to ensure that only social workers have access to these private data and that approvals are granted for access. Our focus is policy analysis.

Data governance requires tools to identify, manage, interpret, and disseminate data (Leonelli, 2019). These tools are needed to facilitate decision making about different ways to handle and value data and to articulate conflicts among the data sources, shifting research priorities to consider not only publications but also data infrastructures and curation of data. Our best practices around data governance and ingestion are included as part of the training of all research team members and also captured in formal data management plans.

The resulting modified data, or the code that can regenerate the modified data from the original sources, are stored back to a secure server and are accessible only via secured remote access. For projects involving protected information, unless special authorization is given, researchers do not have direct access to data files. For those projects, data access is mediated by data analysis tools hosted on our own secure servers that connect to the data server via authenticated protocols (Keller, Shipp, & Schroeder, 2016).

Case Study Application—Data Governance and Ingestion

Selected variables from the data sources in Table 2 were profiled and cleaned (indicated by the asterisks). Two unique sets of data requiring careful governance were discovered and included in the study. The first is the Fairfax County Youth Survey, administered to 8th, 10th, and 12th graders every year. Access to these data requires adhering to specific governance requirements, which resulted in aggregate data being provided for each school. These data include information about time spent on activities (e.g., homework, physical activity, screen time); varieties of food eaten each week; family structure and other support; and risky behaviors, such as use of alcohol and drugs. The second is the Fairfax County Park Authority data, which include usage data for the authority's nine recreation centers: classes taken, services used, and recreation center location.

2.4. Data Wrangling

These next phases of the data science framework, data profiling to assess quality, data preparation, linkage, and exploration, can easily consume the majority of a project’s time and resources (Dasu & Johnson, 2003). Details of data wrangling are now readily available from many authors and are not repeated here (e.g., DeVeaux, Hoerl, & Snee, 2016; Wickham, 2014; Wing, 2019). Assessing the quality and representativeness of the data is an iterative and important part of data wrangling (Keller, Shipp, & Schroeder, 2016).
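As a minimal illustration of the profiling part of data wrangling, the sketch below counts missing and rule-violating values per field before any linkage is attempted. The records and validation rules are invented for the example, not drawn from the study's data.

```python
# Sketch of data profiling: tally missing values and out-of-range entries
# per field, so quality problems surface before preparation and linkage.

def profile(records, rules):
    """Return per-field counts of missing and rule-violating values."""
    report = {field: {"missing": 0, "invalid": 0} for field in rules}
    for row in records:
        for field, is_valid in rules.items():
            value = row.get(field)
            if value is None:
                report[field]["missing"] += 1
            elif not is_valid(value):
                report[field]["invalid"] += 1
    return report

records = [  # hypothetical rows repurposed from an administrative file
    {"tract": "4154.01", "median_rent": 1850},
    {"tract": None, "median_rent": 2100},
    {"tract": "4215.00", "median_rent": -5},   # impossible rent value
]
rules = {
    "tract": lambda v: isinstance(v, str),
    "median_rent": lambda v: v >= 0,
}
report = profile(records, rules)
# report["tract"]["missing"] == 1; report["median_rent"]["invalid"] == 1
```

A profile like this feeds directly into the iterative quality and representativeness assessment mentioned above.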

2.5. Fitness-for-Use Assessment

Fitness-for-use of data was introduced in the 1990s from a management and industry perspective (Wang & Strong, 1996) and then extended to official statistics by Brackstone (1999). Fitness-for-use starts with assessing the constraints imposed on the data by the particular statistical methods that will be used and, if inferences are to be made, whether the data are representative of the population to which the inferences extend. This assessment extends from straightforward descriptive tabulations and visualizations to complex analyses. Finally, fitness-for-use should characterize the information content in the results.
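One simple representativeness check in this spirit compares sample category proportions against benchmark (e.g., census) proportions. The counts, benchmark shares, and the 0.05 tolerance below are illustrative assumptions, not values from the study.

```python
# Sketch of a representativeness check for fitness-for-use: compute the
# largest absolute gap between sample proportions and benchmark proportions.

def max_margin_gap(sample_counts, benchmark_props):
    """Largest absolute gap between sample and benchmark proportions."""
    total = sum(sample_counts.values())
    gaps = {k: abs(sample_counts[k] / total - benchmark_props[k])
            for k in benchmark_props}
    return max(gaps.values()), gaps

counts = {"under_18": 120, "18_to_64": 600, "65_plus": 80}   # sample n=800
benchmark = {"under_18": 0.22, "18_to_64": 0.62, "65_plus": 0.16}
gap, by_group = max_margin_gap(counts, benchmark)

# If the worst gap exceeds a pre-agreed tolerance, the source is flagged
# for reweighting or restricted to descriptive use only.
flagged = gap > 0.05
```

In practice the tolerance and the benchmark source would be agreed with stakeholders as part of the fitness-for-use discussion, not fixed in code.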

Case Study Application—Fitness-for-Use

After linking and exploring the data sources, a subset of data was selected for the fitness-for-use analyses to benchmark the data. We were unable to gain access to individual student-level data or to important health information (even in aggregate) such as body mass index (BMI, a combination of height and weight data). An implicit bias discussion across the team ensued, and given these limitations, decisions on which data would be carried forward into the analyses were guided by a refocusing of the project to characterize the social, economic, and behavioral features of the individual high schools, their attendance areas, and county political districts. These characterizations could be used to target new programming and policy development.

2.6. Statistical Modeling and Analyses

Statistics and statistical modeling are key for drawing robust conclusions using incomplete information (Adhikari & DeNero, 2019). Statistics provide consistent and clear-cut words and definitions for describing the relationship between observations and conclusions. The appropriate statistical analysis is a function of the research question, the intended use of the data to support the research hypothesis, and the assumptions required for a particular statistical method (Leek & Peng, 2015). Ethical dimensions include ensuring accountability, transparency, and lack of algorithmic bias.

Case Study Application—Statistical Modeling and Analyses

We used the place-based data to calculate and map distances between home and locations of interest by political districts and high school attendance areas. The data include the availability of physical activity opportunities and access to healthy and unhealthy food. Figure 4 gives an example of the distances from home to locations of fast food versus farmers markets within each political district.

Figure 4. Exploratory analysis—direct aggregation of place-based data based on location of housing units. The box plots show the distance from each housing unit to farmers markets or fast food for each of the nine Fairfax County political districts. The take-away is that people live closer to fast food restaurants than to farmers markets.

Synthetic information methods. Unlike the place-based data, the survey data do not directly align with the geographies of interest, e.g., 9 Supervisor Districts and 24 School Attendance Areas. To realign the data and the subsequent composite indicators to the relevant geographies, we used synthetic information technology to impute social and economic characteristics and attach them to housing and rental units across the county. Multiple sets of representative synthetic information about the Fairfax population were constructed using iterative proportional fitting, allowing for estimation of margins of error (Beckman, Baggerly, & McKay, 1996). Some of the features of the synthetic data are an exact match to the ACS marginal tabulations, while others are generated statistically using survey data collected at varying levels of aggregation. Synthetic estimates across these multiple data sources can then be used to make inferences at resolutions not available in any single data source alone.

Creation of composite indicators.
Composite indicators are useful for combining data to create a proxy for a concept of interest, such as the relative economic position of vulnerable populations across the county (Atkinson, Cantillon, Marlier, & Nolan, 2002). Two composite indicators were created: the first to represent economically vulnerable populations and the second to represent schools that have a larger percentage of vulnerable students (see Figure 5). We defined the indicators as follows:

  • Economic vulnerability is the statistical combination of four factors: the percentage of households with housing burden greater than 50% of household income, with no vehicle, receiving Supplemental Nutrition Assistance Program (SNAP) benefits, and in poverty.
  • High school vulnerability is a statistical combination of the percentages of students enrolled in Limited English Proficiency programs, receiving free and reduced-price meals, on Medicaid, receiving Temporary Assistance for Needy Families, and with migrant or homelessness experiences.

Figure 5. School and economic vulnerability indicators for Fairfax County, Virginia. Economic vulnerability indicators are mapped by the 24 high school attendance areas and by color; the darker the color, the more vulnerable the area. The overlaid circles are high school vulnerability indicators geolocated at the high school locations. The larger the circle, the higher the vulnerability of the high school population.

Figure 6 presents correlations between factors that may affect obesity.

Figure 6. Correlations of factors that may affect obesity. The factors are levels of physical activity (none or 5+ days per week), food and drink consumed during the past week, unhealthy weight loss, and food insecurity. As an example, the bottom left-hand corner shows a positive correlation between no physical activity and food insecurity.
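One common way to combine components into a composite indicator is an equal-weight mean of z-scores, sketched below. The paper does not specify its exact combination method, and the attendance-area values here are invented, so treat this as one plausible construction rather than the study's.

```python
# Sketch of a composite indicator: standardize each component across areas
# (z-scores) and average them with equal weights, one score per area.
import statistics

def composite(rows, components):
    """Equal-weight mean of per-component z-scores, one score per row."""
    means = {c: statistics.mean(r[c] for r in rows) for c in components}
    sds = {c: statistics.pstdev(r[c] for r in rows) for c in components}
    scores = []
    for r in rows:
        zs = [(r[c] - means[c]) / sds[c] for c in components]
        scores.append(sum(zs) / len(zs))
    return scores

areas = [  # hypothetical high school attendance areas
    {"pct_housing_burden": 30, "pct_no_vehicle": 10,
     "pct_snap": 12, "pct_poverty": 9},
    {"pct_housing_burden": 18, "pct_no_vehicle": 4,
     "pct_snap": 5, "pct_poverty": 4},
    {"pct_housing_burden": 45, "pct_no_vehicle": 15,
     "pct_snap": 20, "pct_poverty": 14},
]
scores = composite(areas, ["pct_housing_burden", "pct_no_vehicle",
                           "pct_snap", "pct_poverty"])
# The area with uniformly higher percentages receives the highest score.
```

Equal weighting is a deliberate simplification; as the text notes later, stakeholder input is one way to set the relative importance of components instead.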
The next phase of the analyses was to build statistical models that would give insights into the relationships between physical activity and healthy eating based on information from the Youth Surveys. Based on the full suite of data, several machine learning models were used.

Fitness-for-use assessment revisited. While we were asked to examine youth obesity, we did not have access to obesity data at the subcounty or student level. We nonetheless decided to move from descriptive analysis to more complex statistical modeling to assess whether the existing data could still provide useful results. First, we used Random Forest, a supervised machine learning method that builds multiple decision trees and merges them to obtain a more accurate and robust prediction. Our Random Forest models did not yield reasonable or statistically significant predictions. Next, we used LASSO (least absolute shrinkage and selection operator), a regression method that performs both variable selection and regularization (the process of adding information) to enhance the prediction accuracy and interpretability of the statistical model it produces. However, LASSO consistently selected the model with zero predictors, suggesting that none are useful. We also tried partial least squares regression, which, instead of using the original predictors, reduces them to a smaller set of uncorrelated components and performs least squares regression on these components; it performed best when no components were used, mirroring LASSO. Our conclusion is that more complex statistical modeling does not provide additional information beyond the (still clearly useful) descriptive analysis. As noted below, BMI data and stakeholder input to identify the relative importance of composite indicator components are needed to extend the modeling.
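The 'model with zero predictors' outcome can be illustrated with a small pure-Python LASSO solved by coordinate descent. This is a pedagogical reimplementation, not the study's code: the noise data, penalty value, and dimensions are assumptions, and the study used cross-validation to choose the penalty rather than a fixed value.

```python
# Sketch of LASSO via coordinate descent with soft-thresholding. When the
# predictors carry no signal and the penalty is strong enough, every
# coefficient is shrunk exactly to zero, i.e., a model with zero predictors.
import random

def lasso_cd(X, y, lam, sweeps=50):
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            # correlation of predictor j with the partial residual
            rho = sum(X[i][j] * (y[i] - sum(X[i][k] * beta[k]
                      for k in range(p) if k != j)) for i in range(n))
            # soft-thresholding: exact zeros when |rho| <= lam
            if rho > lam:
                beta[j] = (rho - lam) / col_sq[j]
            elif rho < -lam:
                beta[j] = (rho + lam) / col_sq[j]
            else:
                beta[j] = 0.0
    return beta

random.seed(0)
n, p = 60, 5
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]   # outcome unrelated to X
beta = lasso_cd(X, y, lam=100.0)
selected = sum(b != 0.0 for b in beta)       # number of retained predictors
```

With uninformative predictors, `selected` comes out as zero, which is exactly the diagnostic signal the case study took as evidence that the available data could not support more complex models.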

2.7. Communication and Dissemination

Communication involves sharing data, well-documented code, and working papers, and disseminating findings through conference presentations, publications, and social media. These steps are critical to ensure that processes and findings are transparent, replicable, and reproducible (Berman et al., 2018). An important facet of this step is telling the story of the analysis by conveying the context, purpose, and implications of the research and findings (Berinato, 2019; Wing, 2019). Visuals, case studies, and other supporting evidence reinforce the findings.

Communication and dissemination are also important for building and maintaining a community of practice. They can include dissemination through portals, databases, and repositories; workshops and conferences; and the creation of new journals (e.g., Harvard Data Science Review). Underlying all communication and dissemination is the preservation of the privacy and ethical dimensions of the research.

Case Study Application—Communication and Dissemination

We summarized and presented our findings at each stage of the data science lifecycle, starting with the problem posed and proceeding through data discovery, profiling, exploratory analysis, fitness-for-use assessment, and statistical analysis. We provided new information to county officials about potential policy options and are continuing to explore how we might obtain data-sharing agreements for sensitive data, such as body mass index (BMI). The data used in this study are valuable for descriptive analyses, but the fitness-for-use assessment demonstrated that the statistical models required student-level data at a finer resolution, for example, BMI or height and weight data, to obtain better predictive measures. The exploratory analysis described earlier provided many useful insights for Fairfax County Health and Human Services about proximity to physical activity and healthy food options for each political district and high school attendance area. We encourage Fairfax County Health and Human Services to develop new data governance policies that allow researchers to access sensitive data while ensuring that privacy and confidentiality are maintained. Until we can access BMI or height and weight data, we propose to seek stakeholder input to develop composite indicators, such as the economic vulnerability indicator described in this example. These composite indicators would inform stakeholders and decision makers about where at-risk populations live and about changes over time in how those populations are faring from various perspectives, such as economic self-sufficiency, health, access to healthy food, and access to opportunities for physical activity.

2.8. Ethics Review

The ethics review provides a set of guiding principles to ensure dialogue on this topic throughout the lifecycle of the project. Because data science involves interdisciplinary teams, conversations around ethics can be challenging: each discipline has its own set of research integrity norms and practices. Harmonizing across these fields matters because data science ethics touches every component and step in the practice of data science, as shown in Figure 1 and as illustrated throughout the case study.

When acquiring and integrating data sources, ethical issues include considerations of mass surveillance, privacy, data sovereignty, and other potential consequences. Research integrity includes improving day-to-day research practices and the ongoing training of all scientists to achieve “better record keeping, vetting experimental designs, techniques to reduce bias, rewards for rigorous research, and incentives for sharing data, code, and protocols—rather than narrow efforts to find and punish a few bad actors” (“Editorial: Nature Research Integrity,” 2019, p. 5). Research integrity is advanced by embedding these practices throughout the entire research process, not just through the IRB process.

Salganik (2017) proposes a principles-based approach to ethics that includes standards and norms around the uses of data, analysis, and interpretation, similar to the steps associated with implementing a data science framework. Similarly, the “Community Principles on Ethical Data Sharing,” formulated at a Bloomberg conference in 2017, are based on four principles: fairness, benefit, openness, and reliability (Data for Democracy, 2018). A systematic approach to implementing these principles is to ensure that scientific data are FAIR :

‘ Findable ’ using common search tools;

‘ Accessible ’ so that the data and metadata can be explored;

‘ Interoperable ’ to compare, integrate, and analyze; and

‘ Reusable ’ by other researchers or the public through the availability of metadata, code, and usage licenses (Stall et al., 2019).

Underlying the FAIR principles is the commitment to give credit for curating and sharing data, and to count such contributions as equal in importance to journal publication citations (Pierce, Dev, Statham, & Bierer, 2019). The FAIR movement has taken hold in some scientific disciplines where issues surrounding confidentiality or privacy are less prevalent. The social sciences, on the other hand, face challenges because data access is often restricted for these reasons. Nevertheless, the aim should be to develop FAIR principles across all disciplines, adapting them as necessary. This requires creating repositories, infrastructures, and tools that make FAIR practices the norm rather than the exception at both national and international levels (Stall et al., 2019).

Building on these principles, we have developed a Data Science Project Ethics Checklist (see the Appendix for an example). We find two practices useful for instantiating ethics in every step of ‘doing data science.’ First, we require our researchers to take Institutional Review Board (IRB) and Responsible Conduct of Research training classes. Second, for each project, we develop a checklist to implement an ethical review at each stage of research that addresses the following criteria:

Balance simplicity and sufficient criteria to ensure ethical behavior and decisions.

Make ethical considerations and discussion of implicit biases an active and continuous part of the project at each stage of the research.

Seek expert help when ethical questions cannot be satisfactorily answered by the research team.

Ensure documentation, transparency, ongoing discussion, questioning, and constructive criticism throughout the project.

Incorporate ethical guidelines from relevant professional societies (for examples, see ACM Committee on Professional Ethics, 2018; American Physical Society, 2019; Committee on Professional Ethics of the American Statistical Association, 2018).

Creating the checklist is the first step for researchers to agree on a set of principles and serves as a reminder to have conversations throughout the project. This helps address the challenge of working with researchers from different disciplines and allows them to approach ethics through a variety of lenses. The Data Science Ethics Checklist given in the Appendix can be adapted to specific data science projects, with a focus on social science research. Responsible data science involves using a set of guiding principles and addressing the consequences of decisions across the data lifecycle.

Case Study Application—Ethics

Aspects of the ethics review, a continuous process, have been touched on in earlier steps of the case study, specifically the ethics examination of the methods used, including the choice of variables, the creation of synthetic populations, and the models used. In addition, our findings were scrutinized, vetted, and refined through internal discussions with the team, with our sponsors (Fairfax County officials), and with external experts. The primary question asked throughout was whether we were introducing implicit bias into our research. We agreed that some of the findings had the potential to appear biased, such as the finding about level of physical activity by race and ethnicity. In this case, however, these findings would be important to school officials and political representatives.

3. Data Acumen

In the process of doing data science, we have learned that many consumers of this research do not have sufficient data acumen and can thus be overwhelmed when trying to make use of data-driven insights. It is unrealistic to expect the majority of decision makers to be data scientists. Even with domain knowledge, some literacy in the data science domains is useful, including the underpinnings of probability and statistics that inform decision making under uncertainty (Kleinberg, Ludwig, Mullainathan, & Obermeyer, 2015).

Data acumen, traditionally referred to as data literacy, appears to have been first introduced in the 2000s as the social sciences began to embrace and use publicly open data (Prado & Marzal, 2013). We define data acumen as the ability to make good judgments about the use of data to support problem solutions. It is not only the basis of statistical and quantitative analysis; it is a critical mechanism for improving society and a necessary first step toward statistical understanding. The need for policy and other decision makers with data acumen is growing in parallel with the massive repurposing of all types of data sources (Bughin, Seong, Manyika, Chui, & Joshi, 2018).

We have found it useful to conceptualize data acumen across three levels or roles (Garber, 2019). The first are data scientists, trained in statistics, computer science, quantitative social sciences, or related fields. The second are researchers trained in a specific field, such as public health or political science, who also have some training in data science, obtained through a master’s degree, certificate programs, or hands-on programs such as the University of Virginia’s Data Science for the Public Good program (UVA, 2019). This second group plays a bridging role by bringing together multidisciplinary teams. The third group are the consumers of data science applications. The first and second groups may overlap with respect to skills, expertise, and application. The third group requires a basic understanding of data science; that is, they must be data literate (Garber, 2019).

Data acumen is both a baseline and an overarching concept. A data literate person should conceptually understand the basics of data science (the data science framework described in Figure 1 is a good guide) and be able to articulate questions that require data to provide evidence:

What is the problem?

What are the research questions to support the problem?

What data sources might inform the questions? Why?

How are these data born? What are the biases and ethical considerations?

What are the findings? Do they make sense? Do I trust them? How can I use them?

A data literate person understands the entire process, even if they do not have the skills to undertake the statistical research. Data acumen requires an understanding of how data are born and why that matters for evaluating the quality of the data for the research question being addressed. As many types of data are discovered and repurposed to address analytical questions, this aspect of data literacy becomes increasingly important. Being data literate also helps us understand why our intuition is often wrong (Kahneman, 2011). We believe that building the data capacity and acumen of decision makers is an important facet of data science.

4. Conclusion

Without applications (problems), doing data science would not exist. Our data science framework and research processes are fundamentally tied to practical problem solving and can be used in diverse settings. We provide a case study of using local data to address questions raised by county officials. Some contrasting examples that make formal use of the data science framework are the application to industry supply chain synchronization and the application to measuring the value and impact of open source software (Keller et al., 2018; Pires et al., 2017).

We have highlighted data discovery as a critical but often overlooked step in most data science frameworks. Without data discovery, we would fall back on data sources that are convenient. Data discovery expands the power of data science by considering many new data sources, not only designed sources. We are also developing new behaviors by adopting a principles-based approach to ethical considerations as a critical underlying feature throughout the data science lifecycle. Each step of the data science framework involves documentation of decisions made, methods used, and findings, allowing opportunity for data repurposing and reuse, sharing, and reproducibility.

Our data science framework provides a rigorous and repeatable, yet flexible, foundation for doing data science. The framework can serve as a continually evolving roadmap for the field of data science as we work together to embrace the ever-changing data environment. It also highlights the need for supporting the development of data acumen among stakeholders, subject matter experts, and decision makers.


Acknowledgments

We would like to acknowledge our colleagues who contributed to the research projects described in this paper: Dr. Vicki Lancaster and Dr. Joshua Goldstein, both with the Social & Decision Analytics Division, Biocomplexity Institute & Initiative (BII), University of Virginia, Dr. Ian Crandell, Virginia Tech, and Dr. Emily Molfino, U.S. Census Bureau. We would also like to thank Dr. Cathie Woteki, Distinguished Institute Professor, Biocomplexity Institute & Initiative, University of Virginia and Professor of Food Science and Human Nutrition at Iowa State University, who provided subject matter expertise and review of the research. Our sponsors, Michelle Gregory and Sophia Dutton, Office of Strategy Management, Fairfax County Health and Human Services, supported the research and provided context for many of the findings.

Disclosure Statement

This research was partially supported by the US Census Bureau under a contract with the MITRE Corporation; the National Science Foundation’s National Center for Science and Engineering Statistics under a cooperative agreement with the US Department of Agriculture’s National Agricultural Statistics Service; the U.S. Army Research Institute for the Behavioral and Social Sciences; and Fairfax County, Virginia.

References

ACM Committee on Professional Ethics. (2018). Association for Computing Machinery (ACM) code of ethics and professional conduct. Retrieved December 1, 2019, from https://www.acm.org/binaries/content/assets/about/acm-code-of-ethics-and-professional-conduct.pdf

Adhikari, A., & DeNero, J. (2019). The foundations of data science. Retrieved December 1, 2019, from https://www.inferentialthinking.com/chapters/intro#The-Foundations-of-Data-Science

American Physical Society. (2019). Ethics and values. Retrieved from https://www.aps.org/policy/statements/index.cfm

Atkinson, T., Cantillon, B., Marlier, E., & Nolan, B. (2002). Social indicators: The EU and social inclusion. Oxford, UK: Oxford University Press.

Beckman, R. J., Baggerly, K. A., & McKay, M. D. (1996). Creating synthetic baseline populations. Transportation Research Part A: Policy and Practice, 30 (6), 415–429. https://doi.org/10.1016/0965-8564(96)00004-3

Berinato, S. (2019). Data science and the art of persuasion: Organizations struggle to communicate the insights in all the information they’ve amassed. Here’s why, and how to fix it. Harvard Business Review, 97 (1). Retrieved from https://hbr.org/2019/01/data-science-and-the-art-of-persuasion

Borgman, C. L. (2019). The lives and after lives of data.  Harvard Data Science Review ,  1 (1). https://doi.org/10.1162/99608f92.9a36bdb6

Box, G. E. P., Hunter, W. G., & Hunter, J. S. (1978). Statistics for experimenters. Hoboken, NJ: Wiley, pp. 563–571.

Bughin, J., Seong, J., Manyika, J., Chui, M., & Joshi, R. (2018). Notes from the AI frontier: Modeling the impact of AI on the world economy. Stamford, CT: McKinsey Global Institute.

Berkeley School of Information. (2019). What is data science? Retrieved December 1, 2019, from https://datascience.berkeley.edu/about/what-is-data-science/

Berman, F., Rutenbar, R., Hailpern, B., Christensen, H., Davidson, S., Estrin, D.,…Szalay, A. (2018). Realizing the potential of data science.  Communications of the ACM ,  61 (4), 67–72. https://doi.org/10.1145/3188721

Brackstone, G. (1999). Managing data quality in a statistical agency. Survey Methodology, 25 (2), 139–150. https://repositorio.cepal.org//handle/11362/16457

Committee on Professional Ethics of the American Statistical Association. (2018). Ethical guidelines for statistical practice. Retrieved from https://www.amstat.org/asa/files/pdfs/EthicalGuidelines.pdf

Dasu, T., & Johnson, T. (2003). Exploratory data mining and data cleaning . Hoboken, NJ: Wiley.

Data diversity. (2019, January 11). Nature Human Behaviour, 3 , 1–2. https://doi.org/10.1038/s41562-018-0525-y

Data for Democracy. (2018). A community-engineered ethical framework for practical application in your data work. Global Data Ethics Project. Retrieved December 1, 2019, from https://www.datafordemocracy.org/documents/GDEP-Ethics-Framework-Principles-one-sheet.pdf

De Veaux, R., Hoerl, R., & Snee, R. (2016). Big data and the missing links. Statistical Analysis and Data Mining, 9 (6), 411–416. https://doi.org/10.1002/sam.11303

Editorial: Nature research integrity is much more than misconduct [Editorial]. (2019, June 6). Nature 570, 5. https://doi.org/10.1038/d41586-019-01727-0

Garber, A. (2019). Data science: What the educated citizen needs to know. Harvard Data Science Review, 1 (1). https://doi.org/10.1162/99608f92.88ba42cb

Japec, L., Kreuter, F., Berg, M., Biemer, P., Decker, P., Lampe, C., . . .Usher, A. (2015). Big data in survey research: AAPOR task force report. Public Opinion Quarterly, 79 , 839–880. https://doi.org/10.1093/poq/nfv039

Kahneman, D. (2011). Thinking, fast and slow . New York, NY: Farrar, Straus and Giroux.

Keller-McNulty, S. (2007). From data to policy: Scientific excellence is our future. Journal of the American Statistical Association, 102 (478), 395–399. https://doi.org/10.1198/016214507000000275

Keller, S. A., Shipp, S., & Schroeder, A. (2016). Does big data change the privacy landscape? A review of the issues. Annual Review of Statistics and Its Application, 3, 161–180. https://doi.org/10.1146/annurev-statistics-041715-033453

Keller, S., Korkmaz, G., Orr, M., Schroeder, A., & Shipp, S. (2017). The evolution of data quality: Understanding the transdisciplinary origins of data quality concepts and approaches. Annual Review of Statistics and Its Application, 4, 85–108. https://doi.org/10.1146/annurev-statistics-060116-054114

Keller, S., Korkmaz, G., Robbins, C., Shipp, S. (2018) Opportunities to observe and measure intangible inputs to innovation: Definitions, operationalization, and examples. Proceedings of the National Academy of Sciences (PNAS), 115 (50), 12638–12645. https://doi.org/10.1073/pnas.1800467115

Keller, S., Lancaster, V., & Shipp, S. (2017). Building capacity for data driven governance: Creating a new foundation for democracy. Statistics and Public Policy, 4 (1), 1–11. https://doi.org/10.1080/2330443X.2017.1374897

Khan, M. A., Uddin, M. F., & Gupta, N. (2014, April). Seven V's of Big Data: Understanding Big Data to extract value. In Proceedings of the 2014 Zone 1 Conference of the American Society for Engineering Education (ASEE). https://doi.org/10.1109/ASEEZone1.2014.6820689

Kleinberg, J., Ludwig, J., Mullainathan, S., & Obermeyer, Z. (2015). Prediction policy problems. American Economic Review, 105 (5), 491–495. https://doi.org/10.1257/aer.p20151023

Leek, J. T., & Peng, R. D. (2015). What is the question? Science, 347 (6228), 1314–1315. https://doi.org/10.1126/science.aaa6146

Leonelli, S. (2019). Data governance is key to interpretation: Reconceptualizing data in data science. Harvard Data Science Review, 1 (1). https://doi.org/10.1162/99608f92.17405bb6

National Research Council. (2007). Engaging privacy and information technology in a digital age . Washington, DC: National Academies Press.

Office for Human Research Protections. (2016). Institutional Review Board (IRB) Written Procedures: Guidance for Institutions and IRBs (draft guidance issued in August 2016). Department of Health and Human Services. Washington DC. Retrieved from: https://www.hhs.gov/ohrp/regulations-and-policy/requests-for-comments/guidance-for-institutions-and-irbs/index.html

Prado, J. C., & Marzal, M. Á. (2013). Incorporating data literacy into information literacy programs: Core competencies and contents. Libri, 63 (2), 123–134. https://doi.org/10.1515/libri-2013-0010

Pierce, H. H., Dev, A., Statham, E., & Bierer, B. E. (2019, June 6). Credit data generators for data reuse. Nature, 570 (7759), 30–32. https://doi.org/10.1038/d41586-019-01715-4

Pires, B., Goldstein, J., Higdon, D., Sabin, P., Korkmaz, G., Shipp, S., ... Reese, S. (2017). A Bayesian simulation approach for supply chain synchronization. In Proceedings of the 2017 Winter Simulation Conference (pp. 1571–1582). New York, NY: IEEE. https://doi.org/10.1109/WSC.2017.8247898

Salganik, M. J. (2017).  Bit by bit: Social research in the digital age . Princeton, NJ: Princeton University Press.

Snee, R. D., DeVeaux, R. D., & Hoerl, R. W. (2014). Follow the fundamentals. Quality Progress, 47 (1), 24–28. https://search.proquest.com/docview/1491963574

Stall, S., Yarmey, L., Cutcher-Gershenfeld, J., Hanson, B., Lehnert, K., Nosek, B., ... & Wyborn, L. (2019, June 6). Make all scientific data FAIR. Nature, 570 (7759), 27–29. https://doi.org/10.1038/d41586-019-01720-7

United Nations Economic Commission for Europe (UNECE). (2014). A suggested framework for the quality of big data. Retrieved December 1, 2019, from https://statswiki.unece.org/download/attachments/108102944/Big%20Data%20Quality%20Framework%20-%20final-%20Jan08-2015.pdf?version=1&modificationDate=1420725063663&api=v2

United Nations Economic Commission for Europe (UNECE). (2015). Using administrative and secondary sources for official statistics: A handbook of principles and practices. Retrieved December 1, 2019, from https://unstats.un.org/unsd/EconStatKB/KnowledgebaseArticle10349.aspx

University of Virginia (UVA). (2019). Data Science for the Public Good Young Scholars Program. Retrieved December 1, 2019, from https://biocomplexity.virginia.edu/social-decision-analytics/dspg-program

Wang, R. Y., & Strong, D. M. (1996). Beyond accuracy: What data quality means to data consumers. Journal of Management Information Systems, 12 (4), 5–33. https://doi.org/10.1080/07421222.1996.11518099

Weber, B. (2018, May 17). Data science for startups: Data pipelines (Part 3). Towards Data Science . Retrieved from https://towardsdatascience.com/data-science-for-startups-data-pipelines-786f6746a59a

Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59 (10), 1–23. https://doi.org/10.18637/jss.v059.i10

Wing, J. M. (2019). The data life cycle. Harvard Data Science Review, 1 (1). https://doi.org/10.1162/99608f92.e26845b4

Appendix: Data Science Project Ethics Checklist

6/8/20: The authors added description and questions about Data Wrangling to the Ethics Checklist in the Appendix.

©2020 Sallie Ann Keller, Stephanie S. Shipp, Aaron D. Schroeder, and Gizem Korkmaz. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.



What the Case Study Method Really Teaches

  • Nitin Nohria


Seven meta-skills that stick even if the cases fade from memory.

It’s been 100 years since Harvard Business School began using the case study method. Beyond teaching specific subject matter, the case study method excels in instilling meta-skills in students. This article explains the importance of seven such skills: preparation, discernment, bias recognition, judgment, collaboration, curiosity, and self-confidence.

During my decade as dean of Harvard Business School, I spent hundreds of hours talking with our alumni. To enliven these conversations, I relied on a favorite question: “What was the most important thing you learned from your time in our MBA program?”

Alumni responses varied but tended to follow a pattern. Almost no one referred to a specific business concept they learned. Many mentioned close friendships or the classmate who became a business or life partner. Most often, though, alumni highlighted a personal quality or skill like “increased self-confidence” or “the ability to advocate for a point of view” or “knowing how to work closely with others to solve problems.” And when I asked how they developed these capabilities, they inevitably mentioned the magic of the case method.

Harvard Business School pioneered the use of case studies to teach management in 1921. As we commemorate 100 years of case teaching, much has been written about the effectiveness of this method. I agree with many of these observations. Cases expose students to real business dilemmas and decisions. Cases teach students to size up business problems quickly while considering the broader organizational, industry, and societal context. Students recall concepts better when they are set in a case, much as people remember words better when used in context. Cases teach students how to apply theory in practice and how to induce theory from practice. The case method cultivates the capacity for critical analysis, judgment, decision-making, and action.

There is a word that aptly captures the broader set of capabilities our alumni reported they learned from the case method. That word is meta-skills, and these meta-skills are a benefit of case study instruction that those who’ve never been exposed to the method may undervalue.

Educators define meta-skills as a group of long-lasting abilities that allow someone to learn new things more quickly. When parents encourage a child to learn to play a musical instrument, for instance, beyond the hope of instilling musical skills (which some children will master and others may not), they may also appreciate the benefit the child derives from deliberate, consistent practice. This meta-skill is valuable for learning many other things beyond music.

In the same vein, let me suggest seven vital meta-skills students gain from the case method:

1. Preparation

There is no place for students to hide in the moments before the famed “cold call”— when the teacher can ask any student at random to open the case discussion. Decades after they graduate, students will vividly remember cold calls when they, or someone else, froze with fear, or when they rose to nail the case even in the face of a fierce grilling by the professor.

The case method creates high-powered incentives for students to prepare. Students typically spend several hours reading, highlighting, and debating cases before class, sometimes alone and sometimes in groups. The number of cases to be prepared can be overwhelming by design.

Learning to be prepared — to read materials in advance, prioritize, identify the key issues, and have an initial point of view — is a meta-skill that helps people succeed in a broad range of professions and work situations. We have all seen how the prepared person, who knows what they are talking about, can gain the trust and confidence of others in a business meeting. The habits of preparing for a case discussion can transform a student into that person.

2. Discernment

Many cases are long. A typical case may include history, industry background, a cast of characters, dialogue, financial statements, source documents, or other exhibits. Some material may be digressive or inessential. Cases often have holes — critical pieces of information that are missing.

The case method forces students to identify and focus on what’s essential, ignore the noise, skim when possible, and concentrate on what matters, meta-skills required for every busy executive confronted with the paradox of simultaneous information overload and information paucity. As one alumnus pithily put it, “The case method helped me learn how to separate the wheat from the chaff.”

3. Bias Recognition

Students often have an initial reaction to a case stemming from their background or earlier work and life experiences. For instance, people who have worked in finance may be biased to view cases through a financial lens. However, effective general managers must understand and empathize with various stakeholders, and if someone has a natural tendency to favor one viewpoint over another, discussing dozens of cases will help reveal that bias. Armed with this self-understanding, students can correct that bias or learn to listen more carefully to classmates whose different viewpoints may help them see beyond their own biases.

Recognizing and correcting personal bias can be an invaluable meta-skill in business settings when leaders inevitably have to work with people from different functions, backgrounds, and perspectives.

4. Judgment

Cases put students into the role of the case protagonist and force them to make and defend a decision. The format leaves room for nuanced discussion, but not for waffling: Teachers push students to choose an option, knowing full well that there is rarely one correct answer.

Indeed, most cases are meant to stimulate a discussion rather than highlight effective or ineffective management practice. Across the cases they study, students get feedback from their classmates and their teachers about when their decisions are more or less compelling. This helps them develop the judgment needed to make decisions under uncertainty, to communicate those decisions to others, and to gain buy-in, all of which are essential leadership skills. Leaders earn respect for their judgment, and it is something students of the case method get lots of practice honing.

5. Collaboration

It is better to make business decisions after extended give-and-take, debate, and deliberation. As in any team sport, people get better at working collaboratively with practice. Discussing cases in small study groups, and then in the classroom, helps students practice the meta-skill of collaborating with others. Our alumni often say they came away from the case method with better skills to participate in meetings and lead them.

The arc of any good case discussion is a collaborative exchange in which everyone contributes and every viewpoint is carefully considered, yet a thoughtful decision is made in the end. Although teachers play the primary role in orchestrating this process during students’ time at the school, it is an art that students of the case method internalize and get better at when they get to lead discussions.

6. Curiosity

Cases expose students to lots of different situations and roles. Across cases, they get to assume the role of entrepreneur, investor, functional leader, or CEO, in a range of different industries and sectors. Each case offers an opportunity for students to see what resonates with them, what excites them, what bores them, which role they could imagine inhabiting in their careers.

Cases stimulate curiosity about the range of opportunities in the world and the many ways that students can make a difference as leaders. This curiosity serves them well throughout their lives. It makes them more agile, more adaptive, and more open to doing a wider range of things in their careers.

7. Self-Confidence

Students must inhabit roles during a case study that far outstrip their prior experience or capability, often as leaders of teams or entire organizations in unfamiliar settings. “What would you do if you were the case protagonist?” is the most common question in a case discussion. Even though they are imaginary and temporary, these “stretch” assignments increase students’ self-confidence that they can rise to the challenge.

In our program, students can study 500 cases over two years, and the range of roles they are asked to assume increases the range of situations they believe they can tackle. Speaking up in front of 90 classmates feels risky at first, but students become more comfortable taking that risk over time. Knowing that they can hold their own in a highly curated group of competitive peers enhances student confidence. Often, alumni describe how discussing cases made them feel prepared for much bigger roles or challenges than they’d imagined they could handle before their MBA studies. Self-confidence is difficult to teach or coach, but the case study method seems to instill it in people.

There may well be other ways of learning these meta-skills, such as the repeated experience gained through practice or guidance from a gifted coach. However, under the direction of a masterful teacher, the case method can engage students and help them develop powerful meta-skills like no other form of teaching. This quickly became apparent when case teaching was introduced in 1921 — and it’s even truer today.

For educators and students, recognizing the value of these meta-skills can offer perspective on the broader goals of their work together. Returning to the example of piano lessons, it may be natural for a music teacher or their students to judge success by a simple measure: Does the student learn to play the instrument well? But when everyone involved recognizes the broader meta-skills that instrumental instruction can instill — and that even those who bumble their way through Bach may still derive lifelong benefits from their instruction — it may lead to a deeper appreciation of this work.

For recruiters and employers, recognizing the long-lasting set of benefits that accrue from studying via the case method can be a valuable perspective in assessing candidates and plotting their potential career trajectories.

And while we must certainly use the case method’s centennial to imagine yet more powerful ways of educating students in the future, let us be sure to assess these innovations for the meta-skills they might instill, as much as the subject matter mastery they might enable.

  • Nitin Nohria is a professor and former dean at Harvard Business School and the chairman of Thrive Capital, a venture capital firm based in New York.

Partner Center

Case Study Research in Software Engineering: Guidelines and Examples by Per Runeson, Martin Höst, Austen Rainer, Björn Regnell

5.1 Introduction

Once the data have been collected, the focus shifts to analysis. In this phase, the data are used to understand what actually happened in the studied case: the researcher works through the details of the case and seeks patterns in the data. Inevitably, some analysis also takes place during the data collection phase, for example when data from an interview are studied as they are transcribed. The understandings gained in these earlier phases are of course also valid and important, but this chapter concentrates on the separate analysis phase that starts after the data have been collected.

Data analysis is conducted differently for quantitative and qualitative data. Sections 5.2–5.5 describe how to analyze qualitative data and how to assess the validity of this type of analysis. In Section 5.6, a short introduction to quantitative analysis methods is given. Since quantitative analysis is covered extensively in textbooks on statistical analysis, and case study research to a large extent relies on qualitative data, this section is kept short.


5.2.1 Introduction

As case study research is a flexible research method, qualitative data analysis methods are commonly used [176]. The basic objective of the analysis is, as in any other analysis, to derive conclusions from the data, keeping a clear chain of evidence. The chain of evidence means that a reader ...



Case Study – Methods, Examples and Guide

Case Study Research

A case study is a research method that involves an in-depth examination and analysis of a particular phenomenon or case, such as an individual, organization, community, event, or situation.

It is a qualitative research approach that aims to provide a detailed and comprehensive understanding of the case being studied. Case studies typically involve multiple sources of data, including interviews, observations, documents, and artifacts, which are analyzed using various techniques, such as content analysis, thematic analysis, and grounded theory. The findings of a case study are often used to develop theories, inform policy or practice, or generate new research questions.

Types of Case Study

Types and Methods of Case Study are as follows:

Single-Case Study

A single-case study is an in-depth analysis of a single case. This type of case study is useful when the researcher wants to understand a specific phenomenon in detail.

For Example , A researcher might conduct a single-case study on a particular individual to understand their experiences with a particular health condition or a specific organization to explore their management practices. The researcher collects data from multiple sources, such as interviews, observations, and documents, and uses various techniques to analyze the data, such as content analysis or thematic analysis. The findings of a single-case study are often used to generate new research questions, develop theories, or inform policy or practice.

Multiple-Case Study

A multiple-case study involves the analysis of several cases that are similar in nature. This type of case study is useful when the researcher wants to identify similarities and differences between the cases.

For Example, a researcher might conduct a multiple-case study on several companies to explore the factors that contribute to their success or failure. The researcher collects data from each case, compares and contrasts the findings, and uses various techniques to analyze the data, such as comparative analysis or pattern-matching. The findings of a multiple-case study can be used to develop theories, inform policy or practice, or generate new research questions.

Exploratory Case Study

An exploratory case study is used to explore a new or understudied phenomenon. This type of case study is useful when the researcher wants to generate hypotheses or theories about the phenomenon.

For Example, a researcher might conduct an exploratory case study on a new technology to understand its potential impact on society. The researcher collects data from multiple sources, such as interviews, observations, and documents, and uses various techniques to analyze the data, such as grounded theory or content analysis. The findings of an exploratory case study can be used to generate new research questions, develop theories, or inform policy or practice.

Descriptive Case Study

A descriptive case study is used to describe a particular phenomenon in detail. This type of case study is useful when the researcher wants to provide a comprehensive account of the phenomenon.

For Example, a researcher might conduct a descriptive case study on a particular community to understand its social and economic characteristics. The researcher collects data from multiple sources, such as interviews, observations, and documents, and uses various techniques to analyze the data, such as content analysis or thematic analysis. The findings of a descriptive case study can be used to inform policy or practice or generate new research questions.

Instrumental Case Study

An instrumental case study is used to understand a particular phenomenon that is instrumental in achieving a particular goal. This type of case study is useful when the researcher wants to understand the role of the phenomenon in achieving the goal.

For Example, a researcher might conduct an instrumental case study on a particular policy to understand its impact on achieving a particular goal, such as reducing poverty. The researcher collects data from multiple sources, such as interviews, observations, and documents, and uses various techniques to analyze the data, such as content analysis or thematic analysis. The findings of an instrumental case study can be used to inform policy or practice or generate new research questions.

Case Study Data Collection Methods

Here are some common data collection methods for case studies:

Interviews involve asking questions of individuals who have knowledge or experience relevant to the case study. Interviews can be structured (where the same predetermined questions are asked of all participants) or semi-structured and unstructured (where the interviewer adapts to the responses and follows up with further questions). Interviews can be conducted in person, over the phone, or through video conferencing.


Observations involve watching and recording the behavior and activities of individuals or groups relevant to the case study. Observations can be participant (where the researcher actively participates in the activities) or non-participant (where the researcher observes from a distance). Observations can be recorded using notes, audio or video recordings, or photographs.

Documents can be used as a source of information for case studies. Documents can include reports, memos, emails, letters, and other written materials related to the case study. Documents can be collected from the case study participants or from public sources.

Surveys involve asking a set of questions to a sample of individuals relevant to the case study. Surveys can be administered in person, over the phone, through mail or email, or online. Surveys can be used to gather information on attitudes, opinions, or behaviors related to the case study.

Artifacts are physical objects relevant to the case study. Artifacts can include tools, equipment, products, or other objects that provide insights into the case study phenomenon.

How to conduct Case Study Research

Conducting case study research involves several steps that need to be followed to ensure the quality and rigor of the study. Here are the steps to conduct case study research:

  • Define the research questions: The first step in conducting case study research is to define the research questions. The research questions should be specific, answerable, and relevant to the case study phenomenon under investigation.
  • Select the case: The next step is to select the case or cases to be studied. The case should be relevant to the research questions and should provide rich and diverse data that can be used to answer the research questions.
  • Collect data: Data can be collected using various methods, such as interviews, observations, documents, surveys, and artifacts. The data collection method should be selected based on the research questions and the nature of the case study phenomenon.
  • Analyze the data: The data collected from the case study should be analyzed using various techniques, such as content analysis, thematic analysis, or grounded theory. The analysis should be guided by the research questions and should aim to provide insights and conclusions relevant to the research questions.
  • Draw conclusions: The conclusions drawn from the case study should be based on the data analysis and should be relevant to the research questions. The conclusions should be supported by evidence and should be clearly stated.
  • Validate the findings: The findings of the case study should be validated by reviewing the data and the analysis with participants or other experts in the field. This helps to ensure the validity and reliability of the findings.
  • Write the report: The final step is to write the report of the case study research. The report should provide a clear description of the case study phenomenon, the research questions, the data collection methods, the data analysis, the findings, and the conclusions. The report should be written in a clear and concise manner and should follow the guidelines for academic writing.
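The "analyze the data" step above can be illustrated with a minimal keyword-based coding tally — the kind of counting that software such as NVivo automates. This is only a sketch: the coding frame, theme labels, and interview excerpts below are hypothetical, and real thematic analysis involves interpretive judgement that simple keyword matching cannot replace.

```python
from collections import Counter

# Hypothetical coding frame: theme -> keywords that signal it
CODING_FRAME = {
    "workload": ["busy", "overtime", "staffing"],
    "communication": ["handover", "briefing", "email"],
}

def code_transcript(transcript: str) -> Counter:
    """Tally how often each theme's keywords appear in one transcript."""
    text = transcript.lower()
    counts = Counter()
    for theme, keywords in CODING_FRAME.items():
        counts[theme] = sum(text.count(kw) for kw in keywords)
    return counts

# Hypothetical interview excerpts
transcripts = [
    "The ward was busy and we relied on the morning briefing.",
    "Handover notes and email updates kept everyone informed.",
]

# Aggregate theme frequencies across all transcripts
totals = Counter()
for t in transcripts:
    totals += code_transcript(t)

print(totals)
```

In practice a researcher would move between such counts and the raw text, refining the coding frame as new themes emerge, rather than relying on frequencies alone.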

Examples of Case Study

Here are some examples of case study research:

  • The Hawthorne Studies : Conducted between 1924 and 1932, the Hawthorne Studies were a series of case studies conducted by Elton Mayo and his colleagues to examine the impact of the work environment on employee productivity. The studies were conducted at the Hawthorne Works plant of the Western Electric Company in Cicero, Illinois, near Chicago, and included interviews, observations, and experiments.
  • The Stanford Prison Experiment: Conducted in 1971, the Stanford Prison Experiment was a case study conducted by Philip Zimbardo to examine the psychological effects of power and authority. The study involved simulating a prison environment and assigning participants to the role of guards or prisoners. The study was controversial due to the ethical issues it raised.
  • The Challenger Disaster: The Challenger Disaster was a case study conducted to examine the causes of the Space Shuttle Challenger explosion in 1986. The study included interviews, observations, and analysis of data to identify the technical, organizational, and cultural factors that contributed to the disaster.
  • The Enron Scandal: The Enron Scandal was a case study conducted to examine the causes of the Enron Corporation’s bankruptcy in 2001. The study included interviews, analysis of financial data, and review of documents to identify the accounting practices, corporate culture, and ethical issues that led to the company’s downfall.
  • The Fukushima Nuclear Disaster : The Fukushima Nuclear Disaster was a case study conducted to examine the causes of the nuclear accident that occurred at the Fukushima Daiichi Nuclear Power Plant in Japan in 2011. The study included interviews, analysis of data, and review of documents to identify the technical, organizational, and cultural factors that contributed to the disaster.

Application of Case Study

Case studies have a wide range of applications across various fields and industries. Here are some examples:

Business and Management

Case studies are widely used in business and management to examine real-life situations and develop problem-solving skills. Case studies can help students and professionals to develop a deep understanding of business concepts, theories, and best practices.

Healthcare

Case studies are used in healthcare to examine patient care, treatment options, and outcomes. Case studies can help healthcare professionals to develop critical thinking skills, diagnose complex medical conditions, and develop effective treatment plans.

Education

Case studies are used in education to examine teaching and learning practices. Case studies can help educators to develop effective teaching strategies, evaluate student progress, and identify areas for improvement.

Social Sciences

Case studies are widely used in social sciences to examine human behavior, social phenomena, and cultural practices. Case studies can help researchers to develop theories, test hypotheses, and gain insights into complex social issues.

Law and Ethics

Case studies are used in law and ethics to examine legal and ethical dilemmas. Case studies can help lawyers, policymakers, and ethics professionals to develop critical thinking skills, analyze complex cases, and make informed decisions.

Purpose of Case Study

The purpose of a case study is to provide a detailed analysis of a specific phenomenon, issue, or problem in its real-life context. A case study is a qualitative research method that involves the in-depth exploration and analysis of a particular case, which can be an individual, group, organization, event, or community.

The primary purpose of a case study is to generate a comprehensive and nuanced understanding of the case, including its history, context, and dynamics. Case studies can help researchers to identify and examine the underlying factors, processes, and mechanisms that contribute to the case and its outcomes. This can help to develop a more accurate and detailed understanding of the case, which can inform future research, practice, or policy.

Case studies can also serve other purposes, including:

  • Illustrating a theory or concept: Case studies can be used to illustrate and explain theoretical concepts and frameworks, providing concrete examples of how they can be applied in real-life situations.
  • Developing hypotheses: Case studies can help to generate hypotheses about the causal relationships between different factors and outcomes, which can be tested through further research.
  • Providing insight into complex issues: Case studies can provide insights into complex and multifaceted issues, which may be difficult to understand through other research methods.
  • Informing practice or policy: Case studies can be used to inform practice or policy by identifying best practices, lessons learned, or areas for improvement.

Advantages of Case Study Research

There are several advantages of case study research, including:

  • In-depth exploration: Case study research allows for a detailed exploration and analysis of a specific phenomenon, issue, or problem in its real-life context. This can provide a comprehensive understanding of the case and its dynamics, which may not be possible through other research methods.
  • Rich data: Case study research can generate rich and detailed data, including qualitative data such as interviews, observations, and documents. This can provide a nuanced understanding of the case and its complexity.
  • Holistic perspective: Case study research allows for a holistic perspective of the case, taking into account the various factors, processes, and mechanisms that contribute to the case and its outcomes. This can help to develop a more accurate and comprehensive understanding of the case.
  • Theory development: Case study research can help to develop and refine theories and concepts by providing empirical evidence and concrete examples of how they can be applied in real-life situations.
  • Practical application: Case study research can inform practice or policy by identifying best practices, lessons learned, or areas for improvement.
  • Contextualization: Case study research takes into account the specific context in which the case is situated, which can help to understand how the case is influenced by the social, cultural, and historical factors of its environment.

Limitations of Case Study Research

There are several limitations of case study research, including:

  • Limited generalizability : Case studies are typically focused on a single case or a small number of cases, which limits the generalizability of the findings. The unique characteristics of the case may not be applicable to other contexts or populations, which may limit the external validity of the research.
  • Biased sampling: Case studies may rely on purposive or convenience sampling, which can introduce bias into the sample selection process. This may limit the representativeness of the sample and the generalizability of the findings.
  • Subjectivity: Case studies rely on the interpretation of the researcher, which can introduce subjectivity into the analysis. The researcher’s own biases, assumptions, and perspectives may influence the findings, which may limit the objectivity of the research.
  • Limited control: Case studies are typically conducted in naturalistic settings, which limits the control that the researcher has over the environment and the variables being studied. This may limit the ability to establish causal relationships between variables.
  • Time-consuming: Case studies can be time-consuming to conduct, as they typically involve a detailed exploration and analysis of a specific case. This may limit the feasibility of conducting multiple case studies or conducting case studies in a timely manner.
  • Resource-intensive: Case studies may require significant resources, including time, funding, and expertise. This may limit the ability of researchers to conduct case studies in resource-constrained settings.

About the author


Muhammad Hassan

Researcher, Academic Writer, Web developer


Continuing to enhance the quality of case study methodology in health services research

Shannon L. Sibbald

1 Faculty of Health Sciences, Western University, London, Ontario, Canada.

2 Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.

3 The Schulich Interfaculty Program in Public Health, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.

Stefan Paciocco

Meghan Fournie, Rachelle Van Asseldonk, Tiffany Scurr

Case study methodology has grown in popularity within Health Services Research (HSR). However, its use and merit as a methodology are frequently criticized due to its flexible approach and inconsistent application. Nevertheless, case study methodology is well suited to HSR because it can track and examine complex relationships, contexts, and systems as they evolve. Applied appropriately, it can help generate information on how multiple forms of knowledge come together to inform decision-making within healthcare contexts. In this article, we aim to demystify case study methodology by outlining its philosophical underpinnings and three foundational approaches. We provide literature-based guidance to decision-makers, policy-makers, and health leaders on how to engage in and critically appraise case study design. We advocate that researchers work in collaboration with health leaders to detail their research process with an aim of strengthening the validity and integrity of case study for its continued and advanced use in HSR.


The popularity of case study research methodology in Health Services Research (HSR) has grown over the past 40 years. 1 This may be attributed to a shift towards the use of implementation research and a newfound appreciation of contextual factors affecting the uptake of evidence-based interventions within diverse settings. 2 Incorporating context-specific information on the delivery and implementation of programs can increase the likelihood of success. 3 , 4 Case study methodology is particularly well suited for implementation research in health services because it can provide insight into the nuances of diverse contexts. 5 , 6 In 1999, Yin 7 published a paper on how to enhance the quality of case study in HSR, which was foundational for the emergence of case study in this field. Yin 7 maintains case study is an appropriate methodology in HSR because health systems are constantly evolving, and the multiple affiliations and diverse motivations are difficult to track and understand with traditional linear methodologies.

Despite its increased popularity, there is debate about whether a case study is a methodology (ie, a principle or process that guides research) or a method (ie, a tool to answer research questions). Some criticize case study for its high level of flexibility, perceiving it as less rigorous, and maintain that it generates inadequate results. 8 Others have noted issues with quality and consistency in how case studies are conducted and reported. 9 Reporting is often varied and inconsistent, using a mix of approaches such as case reports, case findings, and/or case study. Authors sometimes use incongruent methods of data collection and analysis or use the case study as a default when other methodologies do not fit. 9 , 10 Despite these criticisms, case study methodology is becoming more common as a viable approach for HSR. 11 An abundance of articles and textbooks is available to guide researchers through case study research, including field-specific resources for business, 12 , 13 nursing, 14 and family medicine. 15 However, there remains confusion and a lack of clarity on the key tenets of case study methodology.

Several common philosophical underpinnings have contributed to the development of case study research, 1 leading to different approaches to planning, data collection, and analysis. This presents challenges in assessing quality and rigour, both for researchers conducting case studies and for stakeholders reading the results.

This article discusses the various approaches and philosophical underpinnings to case study methodology. Our goal is to explain it in a way that provides guidance for decision-makers, policy-makers, and health leaders on how to understand, critically appraise, and engage in case study research and design, as such guidance is largely absent in the literature. This article is by no means exhaustive or authoritative. Instead, we aim to provide guidance and encourage dialogue around case study methodology, facilitating critical thinking around the variety of approaches and ways quality and rigour can be bolstered for its use within HSR.

Purpose of case study methodology

Case study methodology is often used to develop an in-depth, holistic understanding of a specific phenomenon within a specified context. 11 It focuses on studying one or multiple cases over time and uses an in-depth analysis of multiple information sources. 16 , 17 It is ideal for situations including, but not limited to, exploring under-researched and real-life phenomena, 18 especially when the contexts are complex and the researcher has little control over the phenomena. 19 , 20 Case studies can be useful when researchers want to understand how interventions are implemented in different contexts, and how context shapes the phenomenon of interest.

In addition to demonstrating coherency with the type of questions case study is suited to answer, there are four key tenets to case study methodologies: (1) be transparent in the paradigmatic and theoretical perspectives influencing study design; (2) clearly define the case and phenomenon of interest; (3) clearly define and justify the type of case study design; and (4) use multiple data collection sources and analysis methods to present the findings in ways that are consistent with the methodology and the study’s paradigmatic base. 9 , 16 The goal is to appropriately match the methods to empirical questions and issues and not to universally advocate any single approach for all problems. 21

Approaches to case study methodology

Three authors propose distinct foundational approaches to case study methodology positioned within different paradigms: Yin, 19 , 22 Stake, 5 , 23 and Merriam 24 , 25 ( Table 1 ). Yin is strongly post-positivist whereas Stake and Merriam are grounded in a constructivist paradigm. Researchers should locate their research within a paradigm that explains the philosophies guiding their research 26 and adhere to the underlying paradigmatic assumptions and key tenets of the appropriate author’s methodology. This will enhance the consistency and coherency of the methods and findings. However, researchers often do not report their paradigmatic position, nor do they adhere to one approach. 9 Although deliberately blending methodologies may be defensible and methodologically appropriate, more often it is done in an ad hoc and haphazard way, without consideration for limitations.

Table 1. Cross-analysis of three case study approaches, adapted from Yazan (2015)

The post-positivist paradigm postulates that there is one reality that can be objectively described and understood by “bracketing” oneself from the research to remove prejudice or bias. 27 Yin focuses on general explanation and prediction, emphasizing the formulation of propositions, akin to hypothesis testing. This approach is best suited for structured and objective data collection 9 , 11 and is often used for mixed-method studies.

Constructivism assumes that the phenomenon of interest is constructed and influenced by local contexts, including the interaction between researchers, individuals, and their environment. 27 It acknowledges multiple interpretations of reality 24 constructed within the context by the researcher and participants which are unlikely to be replicated, should either change. 5 , 20 Stake and Merriam’s constructivist approaches emphasize a story-like rendering of a problem and an iterative process of constructing the case study. 7 This stance values researcher reflexivity and transparency, 28 acknowledging how researchers’ experiences and disciplinary lenses influence their assumptions and beliefs about the nature of the phenomenon and development of the findings.

Defining a case

A key tenet of case study methodology often underemphasized in the literature is the importance of defining the case and phenomenon. Researchers should clearly describe the case with sufficient detail to allow readers to fully understand the setting and context and determine applicability. Trying to answer a question that is too broad often leads to an unclear definition of the case and phenomenon. 20 Cases should therefore be bound by time and place to ensure rigour and feasibility. 6

Yin 22 defines a case as “a contemporary phenomenon within its real-life context,” (p13) which may contain a single unit of analysis, including individuals, programs, corporations, or clinics 29 (holistic), or be broken into sub-units of analysis, such as projects, meetings, roles, or locations within the case (embedded). 30 Merriam 24 and Stake 5 similarly define a case as a single unit studied within a bounded system. Stake 5 , 23 suggests bounding cases by contexts and experiences where the phenomenon of interest can be a program, process, or experience. However, the line between the case and phenomenon can become muddy. For guidance, Stake 5 , 23 describes the case as the noun or entity and the phenomenon of interest as the verb, functioning, or activity of the case.

Designing the case study approach

Yin’s approach to a case study is rooted in a formal proposition or theory which guides the case and is used to test the outcome. 1 Stake 5 advocates for a flexible design and explicitly states that data collection and analysis may commence at any point. Merriam’s 24 approach blends both Yin’s and Stake’s, allowing the necessary flexibility in data collection and analysis to meet the needs of the study.

Yin 30 proposed three types of case study approaches—descriptive, explanatory, and exploratory. Each can be designed around single or multiple cases, creating six basic case study methodologies. Descriptive studies provide a rich description of the phenomenon within its context, which can be helpful in developing theories. To test a theory or determine cause and effect relationships, researchers can use an explanatory design. An exploratory model is typically used in the pilot-test phase to develop propositions (eg, Sibbald et al. 31 used this approach to explore interprofessional network complexity). Despite having distinct characteristics, the boundaries between case study types are flexible with significant overlap. 30 Each has five key components: (1) the research question; (2) propositions; (3) unit(s) of analysis; (4) the logic linking the data to the propositions; and (5) criteria for interpreting the findings.

Contrary to Yin, Stake 5 believes the research process cannot be planned in its entirety because research evolves as it is performed. Consequently, researchers can adjust the design of their methods even after data collection has begun. Stake 5 classifies case studies into three categories: intrinsic, instrumental, and collective/multiple. Intrinsic case studies focus on gaining a better understanding of the case. These are often undertaken when the researcher has an interest in a specific case. An instrumental case study is used when the case itself is not of the utmost importance, and the issue or phenomenon (ie, the research question) being explored becomes the focus instead (eg, Paciocco 32 used an instrumental case study to evaluate the implementation of a chronic disease management program). 5 Collective designs are rooted in an instrumental case study and include multiple cases to gain an in-depth understanding of the complexity and particularity of a phenomenon across diverse contexts. 5 , 23 In collective designs, studying similarities and differences between the cases allows the phenomenon to be understood more intimately (for examples of this in the field, see van Zelm et al. 33 and Burrows et al. 34 ; in addition, Sibbald et al. 35 present an example where a cross-case analysis method is used to compare instrumental cases).

Merriam’s approach is flexible (similar to Stake) as well as stepwise and linear (similar to Yin). She advocates for conducting a literature review before designing the study to better understand the theoretical underpinnings. 24 , 25 Unlike Stake or Yin, Merriam proposes a step-by-step guide for researchers to design a case study. These steps include performing a literature review, creating a theoretical framework, identifying the problem, creating and refining the research question(s), and selecting a study sample that fits the question(s). 24 , 25 , 36

Data collection and analysis

Using multiple data collection methods is a key characteristic of all case study methodology; it enhances the credibility of the findings by allowing different facets and views of the phenomenon to be explored. 23 Common methods include interviews, focus groups, observation, and document analysis. 5 , 37 By seeking patterns within and across data sources, a thick description of the case can be generated to support a greater understanding and interpretation of the whole phenomenon. 5 , 17 , 20 , 23 This technique is called triangulation and is used to explore cases with greater accuracy. 5 Although Stake 5 maintains case study is most often used in qualitative research, Yin 17 supports a mix of both quantitative and qualitative methods to triangulate data. This deliberate convergence of data sources (or mixed methods) allows researchers to find greater depth in their analysis and develop converging lines of inquiry. For example, case studies evaluating interventions commonly use qualitative interviews to describe the implementation process, barriers, and facilitators paired with a quantitative survey of comparative outcomes and effectiveness. 33 , 38 , 39

Yin 30 describes analysis as dependent on the chosen approach, whether it be (1) deductive and rely on theoretical propositions; (2) inductive and analyze data from the “ground up”; (3) organized to create a case description; or (4) used to examine plausible rival explanations. According to Yin’s 40 approach to descriptive case studies, carefully considering theory development is an important part of study design. “Theory” refers to field-relevant propositions, commonly agreed upon assumptions, or fully developed theories. 40 Stake 5 advocates for using the researcher’s intuition and impression to guide analysis through a categorical aggregation and direct interpretation. Merriam 24 uses six different methods to guide the “process of making meaning” (p178) : (1) ethnographic analysis; (2) narrative analysis; (3) phenomenological analysis; (4) constant comparative method; (5) content analysis; and (6) analytic induction.

Drawing upon a theoretical or conceptual framework to inform analysis improves the quality of a case study and avoids the risk of description without meaning. 18 Using Stake’s 5 approach, researchers rely on protocols and previous knowledge to help make sense of new ideas; theory can guide the research and assist researchers in understanding how new information fits into existing knowledge.

Practical applications of case study research

Columbia University has recently demonstrated how case studies can help train future health leaders. 41 Case studies encompass components of systems thinking—considering connections and interactions between components of a system, alongside the implications and consequences of those relationships—to equip health leaders with tools to tackle global health issues. 41 Greenwood 42 evaluated Indigenous peoples’ relationship with the healthcare system in British Columbia and used a case study to challenge and educate health leaders across the country to enhance culturally sensitive health service environments.

An important but often omitted step in case study research is an assessment of quality and rigour. We recommend using a framework or set of criteria to assess the rigour of the qualitative research. Suitable resources include Caelli et al., 43 Houghton et al., 44 Ravenek and Rudman, 45 and Tracy. 46

New directions in case study

Although “pragmatic” case studies (ie, utilizing practical and applicable methods) have existed within psychotherapy for some time, 47 , 48 only recently has the applicability of pragmatism as an underlying paradigmatic perspective been considered in HSR. 49 This is marked by the uptake of pragmatism in randomized controlled trials, recognizing that “gold standard” testing conditions do not reflect the reality of clinical settings 50 , 51 nor does a handful of epistemologically guided methodologies suit every research inquiry.

Pragmatism positions the research question as the basis for methodological choices, rather than a theory or epistemology, allowing researchers to pursue the most practical approach to understanding a problem or discovering an actionable solution. 52 Mixed methods are commonly used to create a deeper understanding of the case through converging qualitative and quantitative data. 52 Pragmatic case study is suited to HSR because its flexibility throughout the research process accommodates complexity, ever-changing systems, and disruptions to research plans. 49 , 50 Much like case study, pragmatism has been criticized for its flexibility and for its use when other approaches are seemingly ill-fitting. 53 , 54 Similarly, authors argue that this results from a lack of investigation and proper application rather than a reflection of validity, legitimizing the need for more exploration and conversation among researchers and practitioners. 55

Although occasionally misunderstood as a less rigorous research methodology, 8 case study research is highly flexible and allows for contextual nuances. 5 , 6 Its use is valuable when the researcher desires a thorough understanding of a phenomenon or case bound by context. 11 If needed, multiple similar cases can be studied simultaneously, or one case within another. 16 , 17 There are currently three main approaches to case study, 5 , 17 , 24 each with its own definitions of a case, ontological and epistemological paradigms, methodologies, and data collection and analysis procedures. 37

Individuals’ experiences within health systems are influenced heavily by contextual factors, participant experience, and intricate relationships between different organizations and actors. 55 Case study research is well suited for HSR because it can track and examine these complex relationships and systems as they evolve over time. 6 , 7 It is important that researchers and health leaders using this methodology understand its key tenets and how to conduct a proper case study. Although there are many examples of case study in action, they are often under-reported and, when reported, not rigorously conducted. 9 Thus, decision-makers and health leaders should use these examples with caution. The proper reporting of case studies is necessary to bolster their credibility in HSR literature and provide readers sufficient information to critically assess the methodology. We also call on health leaders who frequently use case studies 56 – 58 to report them in the primary research literature.

The purpose of this article is to advocate for the continued and advanced use of case study in HSR and to provide literature-based guidance for decision-makers, policy-makers, and health leaders on how to engage in, read, and interpret findings from case study research. As health systems progress and evolve, the application of case study research will continue to increase as researchers and health leaders aim to capture the inherent complexities, nuances, and contextual factors. 7


Case Study | Definition, Examples & Methods

Published on 5 May 2022 by Shona McCombes . Revised on 30 January 2023.

A case study is a detailed study of a specific subject, such as a person, group, place, event, organisation, or phenomenon. Case studies are commonly used in social, educational, clinical, and business research.

A case study research design usually involves qualitative methods, but quantitative methods are sometimes also used. Case studies are good for describing, comparing, evaluating, and understanding different aspects of a research problem.

Table of contents

  • When to do a case study
  • Step 1: Select a case
  • Step 2: Build a theoretical framework
  • Step 3: Collect your data
  • Step 4: Describe and analyse the case

A case study is an appropriate research design when you want to gain concrete, contextual, in-depth knowledge about a specific real-world subject. It allows you to explore the key characteristics, meanings, and implications of the case.

Case studies are often a good choice in a thesis or dissertation . They keep your project focused and manageable when you don’t have the time or resources to do large-scale research.

You might use just one complex case study where you explore a single subject in depth, or conduct multiple case studies to compare and illuminate different aspects of your research problem.


Once you have developed your problem statement and research questions, you should be ready to choose the specific case that you want to focus on. A good case study should have the potential to:

  • Provide new or unexpected insights into the subject
  • Challenge or complicate existing assumptions and theories
  • Propose practical courses of action to resolve a problem
  • Open up new directions for future research

Unlike quantitative or experimental research, a strong case study does not require a random or representative sample. In fact, case studies often deliberately focus on unusual, neglected, or outlying cases which may shed new light on the research problem.

If you find yourself aiming to simultaneously investigate and solve an issue, consider conducting action research . As its name suggests, action research conducts research and takes action at the same time, and is highly iterative and flexible. 

However, you can also choose a more common or representative case to exemplify a particular category, experience, or phenomenon.

While case studies focus more on concrete details than general theories, they should usually have some connection with theory in the field. This way the case study is not just an isolated description, but is integrated into existing knowledge about the topic. It might aim to:

  • Exemplify a theory by showing how it explains the case under investigation
  • Expand on a theory by uncovering new concepts and ideas that need to be incorporated
  • Challenge a theory by exploring an outlier case that doesn’t fit with established assumptions

To ensure that your analysis of the case has a solid academic grounding, you should conduct a literature review of sources related to the topic and develop a theoretical framework . This means identifying key concepts and theories to guide your analysis and interpretation.

There are many different research methods you can use to collect data on your subject. Case studies tend to focus on qualitative data using methods such as interviews, observations, and analysis of primary and secondary sources (e.g., newspaper articles, photographs, official records). Sometimes a case study will also collect quantitative data .

The aim is to gain as thorough an understanding as possible of the case and its context.

In writing up the case study, you need to bring together all the relevant aspects to give as complete a picture as possible of the subject.

How you report your findings depends on the type of research you are doing. Some case studies are structured like a standard scientific paper or thesis, with separate sections or chapters for the methods , results , and discussion .

Others are written in a more narrative style, aiming to explore the case from various angles and analyse its meanings and implications (for example, by using textual analysis or discourse analysis ).

In all cases, though, make sure to give contextual details about the case, connect it back to the literature and theory, and discuss how it fits into wider patterns or debates.

Research Methods | Definitions, Types, Examples

Research methods are specific procedures for collecting and analyzing data. Developing your research methods is an integral part of your research design . When planning your methods, there are two key decisions you will make.

First, decide how you will collect data . Your methods depend on what type of data you need to answer your research question :

  • Qualitative vs. quantitative : Will your data take the form of words or numbers?
  • Primary vs. secondary : Will you collect original data yourself, or will you use data that has already been collected by someone else?
  • Descriptive vs. experimental : Will you take measurements of something as it is, or will you perform an experiment?

Second, decide how you will analyze the data .

  • For quantitative data, you can use statistical analysis methods to test relationships between variables.
  • For qualitative data, you can use methods such as thematic analysis to interpret patterns and meanings in the data.

Table of contents

  • Methods for collecting data
  • Examples of data collection methods
  • Methods for analyzing data
  • Examples of data analysis methods
  • Frequently asked questions about research methods

Data is the information that you collect for the purposes of answering your research question . The type of data you need depends on the aims of your research.

Qualitative vs. quantitative data

Your choice of qualitative or quantitative data collection depends on the type of knowledge you want to develop.

For questions about ideas, experiences and meanings, or to study something that can’t be described numerically, collect qualitative data .

If you want to develop a more mechanistic understanding of a topic, or your research involves hypothesis testing , collect quantitative data .

You can also take a mixed methods approach , where you use both qualitative and quantitative research methods.

Primary vs. secondary research

Primary research is any original data that you collect yourself for the purposes of answering your research question (e.g. through surveys , observations and experiments ). Secondary research is data that has already been collected by other researchers (e.g. in a government census or previous scientific studies).

If you are exploring a novel research question, you’ll probably need to collect primary data . But if you want to synthesize existing knowledge, analyze historical trends, or identify patterns on a large scale, secondary data might be a better choice.

Descriptive vs. experimental data

In descriptive research , you collect data about your study subject without intervening. The validity of your research will depend on your sampling method .

In experimental research , you systematically intervene in a process and measure the outcome. The validity of your research will depend on your experimental design .

To conduct an experiment, you need to be able to vary your independent variable , precisely measure your dependent variable, and control for confounding variables . If it’s practically and ethically possible, this method is the best choice for answering questions about cause and effect.


Your data analysis methods will depend on the type of data you collect and how you prepare it for analysis.

Data can often be analyzed both quantitatively and qualitatively. For example, survey responses could be analyzed qualitatively by studying the meanings of responses or quantitatively by studying the frequencies of responses.

Qualitative analysis methods

Qualitative analysis is used to understand words, ideas, and experiences. You can use it to interpret data that was collected:

  • From open-ended surveys and interviews , literature reviews , case studies , ethnographies , and other sources that use text rather than numbers.
  • Using non-probability sampling methods .

Qualitative analysis tends to be quite flexible and relies on the researcher’s judgement, so you have to reflect carefully on your choices and assumptions and be careful to avoid research bias .

Quantitative analysis methods

Quantitative analysis uses numbers and statistics to understand frequencies, averages and correlations (in descriptive studies) or cause-and-effect relationships (in experiments).

You can use quantitative analysis to interpret data that was collected either:

  • During an experiment .
  • Using probability sampling methods .

Because the data is collected and analyzed in a statistically valid way, the results of quantitative analysis can be easily standardized and shared among researchers.


Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

A sample is a subset of individuals from a larger population . Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.
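The 100-students example above can be sketched in a few lines of Python; the population size and seed are invented for illustration:

```python
import random

# Hypothetical sampling frame: 5,000 student IDs
population = list(range(1, 5001))

random.seed(42)  # fix the seed so the draw is reproducible
sample = random.sample(population, 100)  # simple random sample, no repeats

print(len(sample), len(set(sample)))  # every sampled ID is distinct
```

`random.sample` draws without replacement, so no student appears twice; other designs (stratified or cluster sampling) need more structure than this sketch shows.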

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts and meanings, use qualitative methods .
  • If you want to analyze a large amount of readily-available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

Methodology refers to the overarching strategy and rationale of your research project . It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys , and statistical tests ).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section .

In a longer or more complex research project, such as a thesis or dissertation , you will probably include a methodology section , where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.


Data Analysis in Research: Types & Methods


Content Index

Why analyze data in research?

  • Types of data in research
  • Finding patterns in the qualitative data
  • Methods used for data analysis in qualitative research
  • Preparing data for analysis
  • Methods used for data analysis in quantitative research
  • Considerations in research data analysis
  • What is data analysis in research?

Definition of data analysis in research: According to LeCompte and Schensul, research data analysis is a process used by researchers to reduce data to a story and interpret it to derive insights. The data analysis process helps reduce a large body of data into smaller, meaningful fragments.

Three essential activities occur during the data analysis process. The first is data organization. The second is summarization and categorization, which together achieve data reduction and help identify and link patterns and themes in the data. The third is the analysis itself, which researchers conduct in both top-down and bottom-up fashion.


On the other hand, Marshall and Rossman describe data analysis as a messy, ambiguous, and time-consuming but creative and fascinating process through which a mass of collected data is brought to order, structure and meaning.

We can say that data analysis and data interpretation together represent the application of deductive and inductive logic to the research data.

Researchers rely heavily on data as they have a story to tell or research problems to solve. It starts with a question, and data is nothing but an answer to that question. But, what if there is no question to ask? Well! It is possible to explore data even without a problem – we call it ‘Data Mining’, which often reveals some interesting patterns within the data that are worth exploring.

Regardless of the type of data researchers explore, their mission and their audience's vision guide them in finding patterns that shape the story they want to tell. One of the essential things expected from researchers while analyzing data is to stay open and remain unbiased toward unexpected patterns, expressions, and results. Remember, sometimes data analysis tells the most unforeseen yet exciting stories, ones that were not anticipated when the analysis began. Therefore, rely on the data you have at hand and enjoy the journey of exploratory research.


Every kind of data describes something once a specific value is assigned to it. For analysis, these values must be organized, processed, and presented in a given context to make them useful. Data can come in different forms; here are the primary data types.

  • Qualitative data: When the data presented consists of words and descriptions, we call it qualitative data . Although you can observe this data, it is subjective and harder to analyze, especially for comparison. Example: anything describing taste, experience, texture, or an opinion counts as qualitative data. This type of data is usually collected through focus groups, personal qualitative interviews , qualitative observation, or open-ended questions in surveys.
  • Quantitative data: Any data expressed in numbers or numerical figures is called quantitative data . This type of data can be sorted into categories, grouped, measured, calculated, or ranked. Example: age, rank, cost, length, weight, scores, and so on all fall under this type of data. You can present such data in graphical formats or charts, or apply statistical analysis methods to it. The OMS (Outcomes Measurement Systems) questionnaires in surveys are a significant source of numeric data.
  • Categorical data: This is data presented in groups, where an item cannot belong to more than one group. Example: a person responding to a survey by stating their living style, marital status, smoking habit, or drinking habit provides categorical data. A chi-square test is a standard method used to analyze this data.


Data analysis in qualitative research

Data analysis in qualitative research works a little differently from numerical data analysis because qualitative data is made up of words, descriptions, images, objects and sometimes symbols. Getting insight from such complicated information is itself a complicated process; hence qualitative data is typically used for exploratory research and data analysis.

Although there are several ways to find patterns in textual information, a word-based method is the most relied-upon and widely used technique for research and data analysis. Notably, the data analysis process in qualitative research is largely manual: researchers read the available data and identify repetitive or commonly used words.

For example, while studying data collected from African countries to understand the most pressing issues people face, researchers might find that “food” and “hunger” are the most commonly used words, and highlight them for further analysis.
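As a sketch of this word-frequency approach, hypothetical open-ended responses can be tallied in a few lines of Python (the responses and stop-word list below are illustrative assumptions, not data from any real study):

```python
from collections import Counter
import re

# Hypothetical open-ended survey responses (illustrative only)
responses = [
    "Lack of food is the biggest problem in our village",
    "Hunger affects the children most",
    "We need food aid and clean water",
    "Hunger and food insecurity are daily struggles",
]

# Tokenize, lowercase, and drop common filler words
words = re.findall(r"[a-z]+", " ".join(responses).lower())
stopwords = {"the", "is", "in", "our", "and", "we", "are", "of", "a", "most"}
counts = Counter(w for w in words if w not in stopwords)

# The most frequent substantive words surface as candidate themes
print(counts.most_common(3))
```

Here “food” and “hunger” rise to the top of the tally, mirroring the example above.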


The keyword-in-context approach is another widely used word-based technique. In this method, the researcher tries to understand a concept by analyzing the context in which participants use a particular keyword.

For example, researchers studying the concept of ‘diabetes’ amongst respondents might analyze the context of when and how each respondent used or referred to the word ‘diabetes.’
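A minimal keyword-in-context routine might look like the following sketch (the transcript lines and window size are illustrative assumptions):

```python
# Hypothetical interview snippets (illustrative only)
transcripts = [
    "I was diagnosed with diabetes two years ago and changed my diet",
    "My mother managed her diabetes through daily walks",
]

def keyword_in_context(texts, keyword, window=3):
    """Collect the `window` words before and after each keyword occurrence."""
    contexts = []
    for text in texts:
        tokens = text.lower().split()
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                contexts.append((left, keyword, right))
    return contexts

for left, kw, right in keyword_in_context(transcripts, "diabetes"):
    print(f"... {left} [{kw}] {right} ...")
```

Reading the surrounding words shows whether the keyword appears in, say, a diagnosis story or a management story.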

The scrutiny-based technique is another highly recommended text analysis method used to identify patterns in qualitative data. Compare and contrast is the most widely used method under this technique, differentiating how specific pieces of text are similar to or different from each other.

For example, to examine the “importance of a resident doctor in a company,” the collected data can be divided into respondents who think it is necessary to hire a resident doctor and those who think it is unnecessary. Compare and contrast works best for analyzing polls with single-answer question types.

Metaphors can be used to reduce the data pile and find patterns in it so that it becomes easier to connect data with theory.

Variable Partitioning is another technique used to split variables so that researchers can find more coherent descriptions and explanations from the enormous data.


There are several techniques for analyzing data in qualitative research; here are some commonly used methods:

  • Content analysis: This is the most widely accepted and frequently employed technique for data analysis in research methodology. It can be used to analyze documented information in text, images and sometimes physical items. The research questions determine when and where to use this method.
  • Narrative analysis: This method is used to analyze content gathered from sources such as personal interviews, field observation and surveys. Most of the time, the stories and opinions people share are examined to find answers to the research questions.
  • Discourse analysis: Similar to narrative analysis, discourse analysis is used to analyze interactions with people. However, this method considers the social context in which the communication between researcher and respondent takes place. Discourse analysis also takes the respondent’s lifestyle and day-to-day environment into account when deriving conclusions.
  • Grounded theory: When you want to explain why a particular phenomenon happened, grounded theory is a strong choice for analyzing qualitative data. Grounded theory is applied to data about a host of similar cases occurring in different settings. Researchers using this method may alter their explanations or produce new ones until they arrive at a conclusion.


Data analysis in quantitative research

The first stage in quantitative research and data analysis is to prepare the data for analysis so that nominal data can be converted into something meaningful. Data preparation consists of the phases below.

Phase I: Data Validation

Data validation is done to check whether the collected data sample meets the pre-set standards or is a biased sample. It is divided into four stages:

  • Fraud: To ensure an actual human being records each response to the survey or the questionnaire
  • Screening: To make sure each participant or respondent is selected or chosen in compliance with the research criteria
  • Procedure: To ensure ethical standards were maintained while collecting the data sample
  • Completeness: To ensure that the respondent answered all the questions in an online survey, or that the interviewer asked all the questions in the questionnaire.

Phase II: Data Editing

Often, an extensive research data sample comes loaded with errors: respondents sometimes fill in fields incorrectly or skip them accidentally. Data editing is the process by which researchers confirm that the data is free of such errors, conducting basic checks and outlier checks to edit the raw data and make it ready for analysis.

Phase III: Data Coding

Of the three phases, this is the most critical phase of data preparation, associated with grouping and assigning values to survey responses. For example, if a survey has a sample size of 1,000, the researcher might create age brackets to group respondents by age. It then becomes easier to analyze small data buckets rather than deal with a massive data pile.
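The age-bracket coding step can be sketched as follows (the cut-off points and labels are illustrative assumptions):

```python
from collections import Counter

def age_bracket(age):
    """Code a raw age into a bracket (illustrative cut-offs)."""
    if age < 18:
        return "under 18"
    elif age < 35:
        return "18-34"
    elif age < 55:
        return "35-54"
    return "55+"

ages = [17, 22, 29, 41, 58, 34, 63]      # raw survey responses
coded = [age_bracket(a) for a in ages]   # coded values, one per respondent
print(Counter(coded))                    # bucket sizes are now easy to analyze
```

Once coded, the analysis works on a handful of buckets instead of every individual age value.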

After the data is prepared, researchers can use different research and data analysis methods to derive meaningful insights. Statistical analysis is the most favored approach for numerical data. Here, distinguishing between categorical and numerical data is essential: categorical data involves distinct categories or labels, while numerical data consists of measurable quantities. Statistical methods fall into two groups: descriptive statistics, used to describe data, and inferential statistics, which help in comparing and generalizing from data.

Descriptive statistics

This method is used to describe the basic features of the many types of data encountered in research. It presents data in such a meaningful way that patterns in the data start making sense. However, descriptive analysis does not allow conclusions beyond the data analyzed, or about any hypotheses the researchers have formulated. Here are the major types of descriptive analysis methods.

Measures of Frequency

  • Count, Percent, Frequency
  • It is used to denote how often a particular event occurs.
  • Researchers use it when they want to showcase how often a response is given.

Measures of Central Tendency

  • Mean, Median, Mode
  • The method is widely used to demonstrate distribution by various points.
  • Researchers use this method when they want to showcase the most commonly or averagely indicated response.

Measures of Dispersion or Variation

  • Range, Variance, Standard deviation
  • The range is the difference between the highest and lowest scores.
  • Variance and standard deviation measure how far observed scores deviate from the mean.
  • These measures identify the spread of scores by stating intervals.
  • Researchers use this method to show how spread out the data is, and how strongly that spread affects the mean.

Measures of Position

  • Percentile ranks, Quartile ranks
  • It relies on standardized scores helping researchers to identify the relationship between different scores.
  • It is often used when researchers want to compare scores with the average count.
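The four groups of descriptive measures above can all be computed with Python’s standard `statistics` module; the scores below are illustrative:

```python
import statistics

scores = [4, 8, 6, 5, 3, 8, 9, 5, 8]  # hypothetical survey scores

# Measures of frequency: count and percent
freq_8 = scores.count(8)
pct_8 = 100 * freq_8 / len(scores)

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion (population formulas)
value_range = max(scores) - min(scores)
stdev = statistics.pstdev(scores)

# Measure of position: percentile rank of a given score
def percentile_rank(data, x):
    return 100 * sum(1 for v in data if v <= x) / len(data)

print(mean, median, mode, value_range)
```

Each measure answers a different question about the same distribution: how often, where the center lies, how spread out, and where a score sits relative to the rest.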

For quantitative market research, descriptive analysis often gives absolute numbers, but such numbers alone are rarely sufficient to demonstrate the rationale behind them. It is therefore necessary to choose the research and data analysis method that best suits your survey questionnaire and the story researchers want to tell. For example, the mean is the best way to demonstrate students’ average scores in schools. It is better to rely on descriptive statistics when the researchers intend to keep the research outcome limited to the provided sample without generalizing it: for example, when comparing the average votes cast in two different cities, descriptive statistics are enough.

Descriptive analysis is also called a ‘univariate analysis’ since it is commonly used to analyze a single variable.

Inferential statistics

Inferential statistics are used to make predictions about a larger population after research and data analysis of a sample representing that population. For example, you can ask around 100 audience members at a movie theater whether they like the movie they are watching. Researchers then use inferential statistics on the collected sample to infer that about 80 to 90% of the population would like the movie.

Here are two significant areas of inferential statistics.

  • Estimating parameters: Takes statistics from the sample research data and uses them to say something about the population parameter.
  • Hypothesis testing: Uses sampled research data to answer survey research questions. For example, researchers might want to know whether a newly launched shade of lipstick is good or not, or whether multivitamin capsules help children perform better at games.
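The movie-theater example above can be sketched as a point estimate with a normal-approximation confidence interval (the sample counts are illustrative assumptions):

```python
import math

n = 100        # sampled viewers
liked = 85     # of whom said they like the movie

p_hat = liked / n                       # point estimate of the population proportion
z = 1.96                                # z-value for a 95% confidence level
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin

print(f"{p_hat:.0%} like the movie (95% CI: {low:.1%} to {high:.1%})")
```

The interval, roughly 78% to 92% here, is what licenses the “about 80–90% of people like the movie” generalization from a 100-person sample.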

These are sophisticated analysis methods used to showcase the relationship between different variables instead of describing a single variable. It is often used when researchers want something beyond absolute numbers to understand the relationship between variables.

Here are some of the commonly used methods for data analysis in research.

  • Correlation: When researchers are not conducting experimental or quasi-experimental research but are interested in understanding the relationship between two or more variables, they opt for correlational research methods.
  • Cross-tabulation: Also called contingency tables, cross-tabulation is used to analyze the relationship between multiple variables. Suppose the data has age and gender categories presented in rows and columns; a two-dimensional cross-tabulation supports seamless analysis by showing the number of males and females in each age category.
  • Regression analysis: To understand the strength of the relationship between two variables, researchers rarely look beyond regression analysis, a primary and commonly used method that is also a type of predictive analysis. In this method you have an essential factor called the dependent variable, plus one or more independent variables, and you estimate the impact of the independent variables on the dependent variable. The values of both independent and dependent variables are assumed to be measured in an error-free, random manner.
  • Frequency tables: This statistical procedure displays how often each value or category occurs in the data, summarizing the distribution of a variable.
  • Analysis of variance (ANOVA): This statistical procedure tests the degree to which two or more groups vary or differ in an experiment. A considerable degree of variation means the research findings were significant. In many contexts, ANOVA testing and variance analysis are synonymous.

When conducting such analyses, keep the following in mind:

  • Researchers must have the necessary research skills to analyze and manipulate the data, and should be trained to demonstrate a high standard of research practice. Ideally, researchers should possess more than a basic understanding of the rationale for selecting one statistical method over another to obtain better data insights.
  • Research and data analytics projects differ by scientific discipline; getting statistical advice at the beginning of a project therefore helps in designing the survey questionnaire, selecting data collection methods and choosing samples.
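A minimal least-squares sketch of the simple linear regression described above (the x and y values are illustrative; real projects would typically use a statistics library):

```python
xs = [1, 2, 3, 4, 5]            # independent variable (e.g., ad spend)
ys = [2.1, 3.9, 6.2, 8.0, 9.8]  # dependent variable (e.g., sales)

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = covariance(x, y) / variance(x); intercept from the means
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def predict(x):
    return intercept + slope * x

print(f"y = {intercept:.2f} + {slope:.2f}x")
```

The fitted slope quantifies the impact of the independent variable on the dependent one, which is exactly what the method is used for.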


  • The primary aim of research and data analysis is to derive insights that are unbiased. Any mistake, or any bias in collecting data, selecting an analysis method or choosing an audience sample, is liable to produce a biased inference.
  • No level of sophistication in research and data analysis can rectify poorly defined objectives or outcome measurements. Whether the design is at fault or the intentions are unclear, a lack of clarity can mislead readers, so avoid this practice.
  • The motive behind data analysis in research is to present accurate and reliable data. As far as possible, avoid statistical errors, and find a way to deal with everyday challenges such as outliers, missing data, data alteration, data mining and graphical representation.

The sheer amount of data generated daily is staggering, especially now that data analysis has taken center stage. In 2018 alone, the total data supply amounted to 2.8 trillion gigabytes. It is clear that enterprises wanting to survive in a hypercompetitive world must possess an excellent capability to analyze complex research data, derive actionable insights and adapt to new market needs.


QuestionPro is an online survey platform that empowers organizations in data analysis and research and provides them a medium to collect data by creating appealing surveys.


Open Access | Published: 04 September 2023

Research on industrial carbon emission prediction and resistance analysis based on CEI-EGM-RM method: a case study of Bengbu

  • Dawei Dai 1 ,
  • Biao Zhou 2 ,
  • Shuhang Zhao 3 ,
  • Kexin Li 2 &
  • Yuewen Liu 2  

Scientific Reports volume  13 , Article number:  14528 ( 2023 ) Cite this article

Metrics details

  • Environmental sciences
  • Environmental social sciences

This paper focuses on the development trend of industrial carbon emissions in Bengbu city, Anhui Province in the next ten years, and how to help the industry reach the carbon peak as soon as possible. The research process and conclusions are as follows: (1) Through literature review and carbon emission index method, five main factors affecting industrial carbon emission are identified. (2) The resistance model is used to analyze the main resistance factors of industrial carbon emission reduction in Bengbu city. (3) Based on the existing data of Bengbu city from 2011 to 2020, the grey prediction EGM (1,1) model is used to predict the industrial carbon emissions of Bengbu city from 2021 to 2030. The results show that among the five factors, the urbanization rate has the most significant impact on industrial carbon emissions, while energy intensity has the least impact. Bengbu’s industrial carbon emissions will continue to increase in the next decade, but the growth rate will be flat. Based on the findings of the analysis, specific recommendations on urbanization development, energy structure, and industrial structure of Bengbu city are put forward.


Since the industrial revolution, human activities, especially the industrialization of developed countries, have consumed a large amount of fossil energy, leading to a rapid increase in greenhouse gas emissions. Global warming is one of the biggest challenges facing the world in the twenty-first century. To cope with climate change and promote the building of a community with a shared future for humanity, all countries need to reduce greenhouse gas emissions jointly. According to the World Meteorological Organization (WMO), the Earth is now nearly one degree Celsius warmer than before industrialization began. At this rate, global temperatures will be 3 to 5 °C above pre-industrial levels by 2100 1 . If no action is taken, climate change will have a severe impact on economic and social development. Moreover, climate issues have large-scale spatial and temporal externalities that require a coordinated global response.

China is the world’s largest industrial country and carbon emitter 2 , 3 . Since the reform and opening up, China’s economic and social development has achieved remarkable results, but it has also led to the continuous growth of carbon emissions, causing significant damage to the natural environment. On September 22, 2020, the Chinese government solemnly pledged at the 75th session of the United Nations General Assembly that China would increase its intended nationally determined contributions (INDC), strive to peak its carbon dioxide emissions by 2030, and achieve carbon neutrality by 2060. Cities are essential parts of China’s carbon emissions, the main carrying space of the carbon sink function, and the critical administrative units for implementing dual carbon goals and policies. In November 2021, the National Development and Reform Commission (NDRC) issued the implementation plan for high-quality development of industrial transformation and upgrading demonstration zones in old industrial cities and resource-based cities during the 14th Five-Year Plan period, to support cities in promoting industrial restructuring and green and low-carbon transition, leading the revitalization and development of old industrial cities and resource-based cities nationwide 4 . On October 16, 2022, the report of the 20th National Congress of the Communist Party of China again emphasized promoting green development and actively yet prudently advancing carbon peaking and carbon neutrality 5 . The transformation and development of industrial cities is therefore imperative.

Since industry is the sector with the most significant CO 2 emissions, which factors affect industrial carbon emission intensity is a scientific question that must be understood to achieve structural emission reduction. In recent years, especially since the Chinese government put forward the goal of peaking carbon emissions by 2030, many experts and scholars have conducted much research on carbon emissions and peak prediction in industry. The main focus is on two aspects: identification of the factors affecting industrial carbon emissions, and prediction of carbon peaks.

The identification of factors influencing industrial carbon emissions is a focus of academic concern. In this regard, domestic and foreign scholars have conducted a large number of studies on the factors affecting CO 2 emissions, and found that they are mainly affected by economic growth 6 , 7 , 8 , energy efficiency 6 , 9 , carbon emission intensity 10 , 11 , industrial structure 8 , 12 , and urbanization rate 8 , 13 . In the prediction of industrial carbon peaking, standard research methods include gray prediction 14 , scenario analysis method 15 , and BP neural network 16 . The research perspective is mainly from the national overall macro perspective 15 , 17 , provincial perspective 18 , 19 , and regional 20 , 21 . Meanwhile, most research targets focus on resource-based cities 22 , 23 , 24 or low-carbon pilot cities 25 , 26 , 27 .

In summary, there are clearly many studies on industrial carbon emissions and carbon peaking in the existing literature, and relatively rich research results have been achieved. However, the following shortcomings remain. On the one hand, in constructing evaluation index systems for the factors influencing industrial carbon emissions, few scholars have evaluated comprehensively across four dimensions: population, structure, economy and technology. On the other hand, national energy conservation and emission reduction goals cannot be achieved without the support of each city’s carbon emission indicators. Unfortunately, current research on carbon emissions mainly takes the macro perspective of China as a whole or of individual provinces, with insufficient attention paid to small and medium-sized cities, especially industrial cities.

Therefore, to make up for this deficiency, this paper selects Bengbu city, an essential comprehensive industrial base in Anhui Province, as the research object. Based on previous research, a thorough evaluation index system for industrial carbon emissions is constructed from four aspects: population, structure, economy, and technology. Combining the carbon emission index (CEI) method and resistance model to thoroughly analyze the influencing factors and resistance factors of carbon emission reduction, the EGM(1,1) model and linear regression equation model are applied to predict the future carbon emission trend accurately. Furthermore, the analysis results are used to explore the realization path of carbon emission reduction in Bengbu city.

Bengbu city is an essential old industrial base in Anhui Province with a relatively complete industrial system focusing on light textile, heavy industry and chemical industry. At the same time, it is a provincial leading demonstration city of energy efficiency. In recent years, influenced by factors such as slow industrial transformation and upgrading and the green development strategy, the economic growth of Bengbu city has stalled. In 2020, industrial GDP increased by 0.03% compared to the previous year, but industrial carbon emissions remained relatively high (see the Results section for detailed calculations). It is therefore vital for Bengbu city to coordinate the relationship between economic development and carbon emissions by studying the current situation and future trends of industrial carbon emissions and exploring a path of green transformation. In view of this, we study its carbon emission resistance factors and carbon emission trend. Since fewer than ten years remain before China’s target year for the carbon emission peak, this paper focuses on predicting whether the city’s industry can peak its carbon emissions before 2030, and which factors can be regulated to promote the smooth realization of that target. Finally, according to the analysis results, targeted recommendations are provided for Bengbu’s carbon emission reduction, as a reference for carbon emission reduction actions in other similar industrial cities.

Materials and methods

Research area.

Bengbu city is located east of China, west of the Yangtze River Delta, and north of Anhui Province (Fig.  1 ), with a total area of 5,951 square kilometers. It is an important comprehensive industrial base in Anhui Province with a relatively complete industrial system focusing on light textile, heavy industry, and chemical industry. In 2020, the city’s permanent population was 3.3 million, with an urbanization rate of 41%. There are 1,061 industrial enterprises above designated size, with a total industrial output value of 67.46 billion yuan, accounting for 32% of Bengbu’s GDP. Meanwhile, industry accounts for nearly 50% of total carbon emissions. In conclusion, industry has a very important role in effectively achieving carbon emission reduction in Bengbu city, which needs further analysis.

figure 1

Location map of Anhui Province and Bengbu city in China. (a) Location map of Anhui Province in China. (b) Location map of Bengbu city in Anhui Province.

Data sources

The data in this paper mainly come from the Bengbu Statistical Yearbook, and carbon emissions are calculated using the measurement method in the IPCC National Greenhouse Gas Inventory Guide. The data of Bengbu city from 2011 to 2020 are selected as the basic research data; other relevant quantities are calculated using the CEI method and the resistance model, and predicted using the grey prediction model and a linear regression equation model.

Calculation method of industrial carbon emissions

The primary source of carbon dioxide emissions is the combustion of fossil energy, and energy-related carbon dioxide accounts for an absolute majority of China’s total carbon emissions. Therefore, the carbon emissions of Bengbu city measured in this paper refer specifically to those generated by the combustion of fossil energy.

According to the general method of IPCC guidelines, this paper calculates the carbon emissions generated when fossil energy is burned, as Eq. ( 1 ).

\({C}_{i}\) is the carbon emissions of industry \(i\) (ten thousand tons); \({B}_{j}\) is the consumption of energy type \(j\) converted to standard coal; \({c}_{j}\) is the carbon emission factor of energy type \(j\).
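Equation (1) itself is not rendered in this version; given the variable definitions above and the standard IPCC approach, it presumably takes the form

$$C_{i}=\sum_{j} B_{j} \times c_{j}$$

i.e., total emissions are the sum over energy types of standard-coal-equivalent consumption multiplied by the corresponding carbon emission factor.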

The coal coefficient and carbon emission coefficient of each energy are shown in Table 1 .

The carbon emission index method is used to determine the weight of indicators, which can avoid subjective factors in the weighting 28 . The specific calculation steps are as follows:

Step 1: Data standardization. Calculated by Eqs. ( 2 ) and ( 3 ).

where \(a_{ij}\) is the j-th index in the i-th year.

Step 2: Calculate the proportion of item j in the i-th year, namely Eq. ( 4 ).

Step 3: Calculate the entropy value of the j-th index. The formula is shown in Eq. ( 5 ).

Step 4: Calculate the coefficient of difference. The formula is as Eq. ( 6 ).

Step 5: Calculate the weight of the j-th index; the formula is Eq. ( 7 ).
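Equations (2)–(7) are not reproduced in this version, but the entropy-weight calculation they describe can be sketched in Python as follows (the data matrix is illustrative and assumed already standardized per Step 1):

```python
import math

# Rows = years, columns = indicators (illustrative, already standardized to [0, 1])
data = [
    [0.2, 0.9, 0.4],
    [0.5, 0.8, 0.6],
    [0.9, 0.1, 0.5],
]
m, k = len(data), len(data[0])

# Step 2: proportion of each year's value within its indicator column
col_sums = [sum(row[j] for row in data) for j in range(k)]
p = [[data[i][j] / col_sums[j] for j in range(k)] for i in range(m)]

# Step 3: entropy of each indicator (0 * ln 0 treated as 0)
e = [-sum(p[i][j] * math.log(p[i][j]) for i in range(m) if p[i][j] > 0)
     / math.log(m)
     for j in range(k)]

# Step 4: difference coefficients; Step 5: normalized weights
d = [1 - ej for ej in e]
w = [dj / sum(d) for dj in d]
print([round(x, 3) for x in w])
```

Indicators whose values vary more across years carry lower entropy and therefore receive higher weight, which is what makes the weighting objective.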

Resistance model

In analyzing the influencing factors of industrial carbon emissions, it is also necessary to diagnose the critical resistance factors in the carbon emission problem, in order to provide specific carbon emission reduction recommendations for the region. Therefore, this paper introduces a resistance model to study the influencing factors of industrial carbon emissions in Bengbu city and further explore the main resistance factors affecting the peak of industrial carbon emissions.

The resistance model is calculated using three indexes: factor contribution degree, index deviation degree and resistance value 29 . The calculation formula is Eq. ( 8 ).

Among them, the resistance value \({O}_{i}\) represents the influence degree of each index, and the index deviation \({S}_{i}=1-{X}_{ij}\) represents the difference between each index and its optimal value.
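Equation (8) is not rendered in this version. A common form of such a resistance (obstacle-degree) model, combining the factor contribution \(w_i\) (the indicator weight) with the deviation \(S_i\), would be

$$O_{i}=\frac{w_{i} \times S_{i}}{\sum_{j} w_{j} \times S_{j}} \times 100\%$$

This is a plausible reconstruction based on the three indexes named above, not a verbatim copy of the authors’ equation.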

Grey prediction model EGM (1,1)

In 1982, to solve analysis and prediction problems involving small data sets and uncertain systems, Professor Deng put forward grey system theory, which has since been widely used in many fields. Grey system theory studies and extracts the known part of the information to correctly describe the law of system evolution and predict future changes. It takes as its research object uncertain systems with partially known and partially unknown information, small samples and poor information. It can accurately describe system behavior and evolution, has the advantages of simple operation, high precision and easy testing, and is often used for energy index prediction 30 . The grey prediction model EGM(1,1) is one of the most widely used grey system models and performs well on data samples spanning few years. This paper uses ten years of data on industrial carbon emissions in Bengbu city, so the prediction results obtained with this method should be reliable.

The construction method of EGM (1,1) is as follows:

Step 1: Assume that the original data series is Eq. ( 9 ).

Using a one-time accumulated generating operation, the sequence \({X}^{(0)}\) can be transformed into a new sequence \({X}^{(1)}\) (shown in Eq. 10 ).

Step 2: Using the new sequence \({X}^{(1)}\) obtained, the general form of EGM (1,1) model is established, which is described by Eq. ( 11 ).

where a and b are model coefficients, which can be obtained by least-squares fitting.

Step 3: Build a grey prediction model: the solution of the differential equation can be obtained by Eq. ( 12 ).

Step 4: The restored values of the original data are obtained by Eq. ( 13 ).
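Equations (9)–(13) are not reproduced in this version, but the four construction steps can be sketched in Python as follows (a minimal illustration of the standard EGM(1,1) recipe, not the authors’ MATLAB code):

```python
import math

def egm11(x0, horizon=0):
    """Fit EGM(1,1) to series x0; return fitted values plus `horizon` forecasts."""
    n = len(x0)
    # Step 1: one-time accumulated generating series x1
    x1 = [x0[0]]
    for v in x0[1:]:
        x1.append(x1[-1] + v)
    # Background values: adjacent means of x1
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Step 2: least-squares estimates of a and b for x0(k) = -a*z(k) + b
    m = n - 1
    s_zz, s_z = sum(v * v for v in z), sum(z)
    s_zy, s_y = sum(zk * yk for zk, yk in zip(z, y)), sum(y)
    det = s_zz * m - s_z * s_z
    a = (s_z * s_y - m * s_zy) / det
    b = (s_zz * s_y - s_z * s_zy) / det
    # Step 3: time response function for x1
    def x1_hat(k):
        return (x0[0] - b / a) * math.exp(-a * k) + b / a
    # Step 4: restore x0 by first-order differencing of x1_hat
    return [x0[0]] + [x1_hat(k) - x1_hat(k - 1) for k in range(1, n + horizon)]

# Example: fit a short annual series and forecast two steps ahead
series = [100, 110, 121, 133.1, 146.41]
fitted = egm11(series, horizon=2)
```

On a smooth, near-exponential series like this one the fitted values track the data closely, which is why the method suits short annual emission series.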

Weight analysis of influencing factors

In recent years, many scholars have studied the factors influencing industrial carbon emissions. Based on previous studies, the actual situation of Bengbu city and the availability of data, industrial GDP, energy intensity, urbanization rate, proportion of secondary industry and carbon dioxide intensity were selected through literature review as indicators of industrial carbon emissions in Bengbu city across four aspects: population, structure, economy and technology (Table 2 ). Descriptive statistics for each indicator are shown in Table 3 .

Through the data collection from the Statistical Yearbook of Anhui Province and Bengbu Statistical Yearbook from 2011 to 2020, the relevant data are substituted into the calculation formula ( 1 ) of industrial carbon emissions to obtain the industrial carbon emissions of Bengbu city. Basic data are shown in Table 4 . The weights of each index are obtained by standardizing and non-dimensionalizing the data of the five major evaluation indexes. The calculation results are shown in Table 5 . According to the weight of each factor, the influencing factors of industrial carbon emissions in Bengbu city from large to small are urbanization rate, the proportion of secondary industry, industrial GDP, carbon dioxide intensity, and energy intensity.

Analysis of resistance factors

Based on the results of the CEI method, the resistance model is introduced to quantify the resistance degree of each influencing factor of industrial carbon emissions in Bengbu city from 2011 to 2020, and the factors are ranked by resistance value. The calculation results are shown in Fig.  2 . Urbanization rate was the main resistance factor in eight of the ten years, indicating that the accelerating urbanization process severely hampers carbon emission reduction in Bengbu city, aggravating the growth of carbon emissions and placing great pressure on the environment. Industrial GDP and the proportion of secondary industry are closely linked: as the economy grows quickly, industrial GDP rises and the share of the secondary industry remains relatively high. Because the secondary industry consumes far more energy than the other two industries, this makes it difficult to reduce carbon emissions. Compared with these three factors, energy intensity and carbon dioxide intensity have less influence, but they remain important resistance factors, indicating that the energy use efficiency of Bengbu city still needs to be improved.

Figure 2. Chart of the annual proportion of each index.
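The resistance-model formula itself is not reproduced in the text. In the obstacle-degree form widely used in this literature (an assumed reading, not necessarily the authors' exact specification), each indicator's resistance in a year is its weighted deviation from the best observed value, normalised across indicators:

```python
import numpy as np

def resistance_degrees(X, w):
    """Obstacle-degree (resistance) shares; rows of X are years, columns are indicators."""
    X = np.asarray(X, dtype=float)
    x_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))  # standardise to [0, 1]
    deviation = 1.0 - x_std              # distance from the ideal (best observed) value
    weighted = np.asarray(w) * deviation # weight each indicator's deviation
    return weighted / weighted.sum(axis=1, keepdims=True)  # normalise within each year
```

Each row of the result sums to 1, so the column with the largest share in a given year is that year's main resistance factor, matching the "frequency as main resistance factor" count used above.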

Analysis of grey prediction EGM (1,1)

To ensure reliable predictions, this paper uses MATLAB R2017a to simulate and predict the industrial carbon emissions of Bengbu from 2011 to 2030. A linear regression model is used for comparison, and the residuals and relative errors of the predicted versus actual industrial carbon emissions of Bengbu from 2011 to 2020 are summarized in Table 6 . The relative error of the grey model is similar to that of the linear regression model. Next, using the data from 2011 to 2019 as fitting data, the grey prediction model and the linear regression model are used to predict carbon emissions in 2020, yielding 340.91 and 344.24 million tons, respectively; for both models the error between the predicted and actual values is very small. Meanwhile, the posterior difference test was used to check the model's accuracy; the accuracy results are shown in Table 7 . The mean square deviation ratio is C  = 0.3143 and the small-error probability is P  = 1, so the model accuracy is level 1, indicating good simulation accuracy. An additional advantage of the grey prediction model is that it forecasts with time as the independent variable; the original data series of Bengbu city shows a consistent upward trend, and a grey prediction can be made once the series contains at least four data points.
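The posterior difference test compares the dispersion of the residuals with that of the original series: C = S2/S1, and P is the fraction of residuals lying within 0.6745·S1 of the mean residual; by the usual grading convention, level 1 requires C < 0.35 and P > 0.95. A minimal sketch (an illustrative implementation, not the authors' code):

```python
import numpy as np

def posterior_variance_test(actual, predicted):
    """Posterior-difference test for a grey model: returns (C, P)."""
    actual = np.asarray(actual, dtype=float)
    resid = actual - np.asarray(predicted, dtype=float)
    s1 = actual.std()                 # S1: dispersion of the original series
    s2 = resid.std()                  # S2: dispersion of the residuals
    C = s2 / s1                       # mean square deviation ratio
    P = float(np.mean(np.abs(resid - resid.mean()) < 0.6745 * s1))  # small-error probability
    return C, P
```

With the reported C = 0.3143 < 0.35 and P = 1 > 0.95, the fitted model falls in the level-1 band.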

The EGM (1,1) simulation and prediction of industrial carbon emissions in Bengbu from 2011 to 2030 are shown in Fig.  3 . Notably, Bengbu's industrial carbon emissions rose rapidly until 2019, when the growth rate began to slow, and showed a downward trend in 2020. This indicates that Bengbu's industrial transformation and green development strategy have achieved certain results in recent years and reduced carbon emissions.

Figure 3. Grey prediction of industrial carbon emissions in Bengbu city.

According to the results of the CEI method and the resistance model, urbanization rate has the most significant impact on industrial carbon dioxide emissions, the proportion of secondary industry has a considerable effect, and energy intensity has a relatively small impact, consistent with the conclusions of refs. 8 , 9 , 12 and 13 . This shows that urbanization is a serious challenge for Bengbu city: as the urban scale continues to expand, the growing urban population places great pressure on resources and the environment. Advocating energy-saving development and improving the quality of urbanization are therefore the main directions for low-carbon development in Bengbu city. Although energy intensity affects industrial carbon emissions less than urbanization rate and the proportion of secondary industry, reducing it remains an essential means of cutting industrial carbon emissions; low-carbon development is inseparable from lower energy intensity, an adjusted energy structure, and an optimized industrial structure.

According to the grey prediction results, if the current population policy, economic growth policy and energy consumption structure are maintained, the industrial carbon emissions of Bengbu city will keep growing over the next decade; although the growth rate is relatively slow, emissions will not peak before 2030. This also reflects a characteristic of the grey prediction model, which captures only a single trend. Industry occupies a large share of China's national economy, and it is difficult to change the existing industrial structure and energy consumption pattern, especially the dependence on coal, in a short period. Such a trajectory is highly unfavorable for China and for the global climate and environment. Through macro-control and technological innovation, Bengbu's industrial carbon emissions have fallen in recent years, but a large gap remains relative to China's dual-carbon targets, and continued effort is needed on all fronts.

This study also has limitations that future work can address. On the one hand, it predicts carbon emissions under only one scenario; follow-up studies could use scenario analysis to simulate multiple scenarios. On the other hand, since China is vigorously promoting digital transformation, subsequent research could examine how the digital economy affects, and can enable, carbon emission reduction in industrial cities, supporting their low-carbon digital transformation.

Conclusions and recommendations

In this paper, we collected industrial energy-related data for 2011 to 2020 for Bengbu, an old industrial city, and analyzed the influencing factors and main resistance factors of industrial carbon emissions across five indicators (urbanization rate, industrial GDP, energy intensity, CO 2 intensity, and the proportion of secondary industry) using the CEI method and the resistance model. Based on the industrial carbon emission data of Bengbu city from 2011 to 2020, the EGM (1,1) model and MATLAB R2017a were used to predict the city's industrial carbon emissions from 2021 to 2030. The results show that industrial carbon emissions in Bengbu city will continue to increase over the next ten years at a relatively slow growth rate, but will not peak before 2030. Among the influencing factors, urbanization rate has the most significant impact on industrial carbon emissions in Bengbu city, the proportion of secondary industry has a considerable impact, and energy intensity has the least. Based on the main resistance factors, recommendations for carbon emission reduction in old industrial cities are put forward to help such cities achieve a carbon peak before 2030 (Fig.  4 ).

Figure 4. Shortages, demands and pathways for industrial cities to achieve a carbon peak.

First, the pace of urbanization should be reasonably controlled and its quality improved. According to the CEI results, urbanization rate has the greatest impact on industrial carbon emissions. In future urbanization, therefore, construction should, on the one hand, adhere to people-centered new urbanization featuring urban-rural integration, ecological livability and harmonious development, vigorously promoting low-carbon cities and urban green development. On the other hand, the constraining effect of the carbon peak target should be brought into full play, guiding new urbanization through green low-carbon technologies and energy-saving and emission-reduction policies, so that the two ultimately develop in coordination at a higher level.

Second, the energy consumption structure should be adjusted and the use of green energy increased. Energy consumption in Bengbu is dominated by coal, and the share of clean energy such as natural gas, hydropower, nuclear power and wind power is relatively low; adjusting the energy consumption structure is therefore key to achieving carbon peak and carbon neutrality in industrial cities. In the future, the proportion of renewable energy consumption should be raised, with full use of clean energy such as solar, wind and hydrogen. Research into clean technologies for coal, oil and natural gas should be increased, and renewable energy should gradually shift from a supplementary to a mainstream energy source. Energy utilization technology should be optimized and energy efficiency improved to reduce carbon dioxide intensity, and the development of renewable and clean energy expanded to cut the share of coal in energy consumption.

Finally, the industrial structure should be optimized to help traditional industries upgrade. The resistance-model results show that the proportion of secondary industry has a considerable impact on industrial carbon emissions, and the energy intensity of the secondary industry is much higher than that of the primary and tertiary sectors. Industrial cities should therefore actively optimize their industrial structure and vigorously develop advanced manufacturing, the photovoltaic industry, and low-energy modern services. The construction of comprehensive digital infrastructure such as artificial intelligence and the industrial Internet should be expanded to promote the automation, informatization, digitalization and intelligent transformation of traditional industries, and digital technology should be used to green and modernize industry and advance the industrial structure.

Data availability

The data used in this study are available from the corresponding author on reasonable request.

China Meteorological Administration. There is a 50% chance that global temperatures will rise by 1.5 °C over pre-industrial levels in the next five years. http://www.cma.gov.cn/2011xwzx/2011xqxxw/2011xqxyw/202205/t20220517_4833511.html . Accessed 2 Feb 2023 (2023).

Gao, P., Yue, S. & Chen, H. Carbon emission efficiency of China’s industry sectors: From the perspective of embodied carbon emissions. J. Clean. Prod. 283 , 124655 (2021).


Yang, D. H. & Hu, Y. M. The impacts of input digitalization on China’s industrial carbon emission intensity: An empirical analysis. Urban Environ. Stud. 34 , 77–93 (2022).


Chinese government website. The implementation plan of the 14th five-year plan to support the high-quality development of industrial transformation and upgrading demonstration zones in old industrial cities and resource-based cities. https://www.gov.cn/xinwen/2021-12/07/content_5658016.htm . Accessed 5 July 2023 (2023).

Chinese government website. Xi jinping proposes to promote green development and harmonious coexistence between man and nature. http://www.gov.cn/xinwen/2022-10/16/content_5718812.htm . Accessed 10 Feb 2023 (2023).

Xu, B. & Lin, B. Q. Investigating spatial variability of CO 2 emissions in heavy industry: Evidence from a geographically weighted regression model. Energy Policy 149 , 112011 (2021).

Zhang, J. et al. Measuring energy and environmental efficiency interactions towards CO 2 emissions reduction without slowing economic growth in central and western Europe. J. Environ. Manag. 279 , 111704 (2021).


Sun, W. & Huang, C. Predictions of carbon emission intensity based on factor analysis and an improved extreme learning machine from the perspective of carbon emission efficiency. J. Clean. Prod. 338 , 130414 (2022).

Wen, H. X., Chen, Z., Yang, Q., Liu, J. Y. & Nie, P. Y. Driving forces and mitigating strategies of CO 2 emissions in China: A decomposition analysis based on 38 industrial sub-sectors. Energy 245 , 123262 (2022).

Fang, G. C., Gao, Z. Y., Tian, L. X. & Fu, M. What drives urban carbon emission efficiency?—Spatial analysis based on nighttime light data. Appl. Energy. 312 , 118772 (2022).

Rahman, M. M., Sultana, N. & Velayutham, E. Renewable energy, energy intensity and carbon reduction: Experience of large emerging economies. Renew. Energy 184 , 252–265 (2022).

Su, Y. L., Leng, C. X. & Jiang, Y. Y. Research on current situation and influencing factors of industrial carbon emission in Shaanxi province. J. Xi’an Univ. Finance Econ. 33 , 58–65 (2020).

Hu, J. B., Zhao, K. & Yang, Y. H. Prediction and control factors of industrial carbon emission peaking in China—Empirical analysis based on BP-LSTM neural network model. Guizhou Soc. Sci. 381 , 135–146 (2021).

Yu, Y. et al. To what extent can clean energy development advance the carbon peaking process of China?. J. Clean. Prod. 412 , 137424 (2023).

Wang, Y., Bi, Y. & Wang, E. D. Scene prediction of carbon emission peak and emission reduction potential estimation in Chinese industry. China Popul. Resour. Environ. 27 , 131–140 (2017).

Ran, Q. et al. When will China’s industrial carbon emissions peak? Evidence from machine learning. Environ. Sci. Pollut. Res. 30 , 57960–57974 (2023).

Yu, X., Lou, F. & Tan, C. A simulation study of the pathway of achieving the dual carbon goals in China’s industrial sectors based on the CIE-CEAM model. China Popul. Resour. Environ. 32 , 49–56 (2022).

Wang, S. B., Zhuang, G. Y. & Dou, X. M. Tiered division of peak carbon emissions and differentiated emission paths among Provinces in China-based on the dual perspectives of carbon emissions and economic development. Wuhan Univ. J. Philos. Soc. Sci. 76 , 136–150 (2023).

Jiang, H. Q., Li, Y. X., Chen, M. M. & Shao, X. X. Prediction and realization strategy of the carbon peak of the industrial sector in Zhejiang Province under the vision of carbon neutrality. Areal Res. Dev. 41 , 157–161 (2022).

Zou, X. Q., Sun, X. C., Ge, T. Y. & Xing, S. Carbon emission differences, influence mechanisms and carbon peak projections in Yangtze river delta region. Resour. Environ. Yangtze Basin 32 , 548–557 (2023).

Chen, N. & Zhuang, G. Y. Study on the critical path of regional carbon emission peak in China—A case study of the C-type region a round Bohai Sea. J. China Univ. Geosci. (Soc. Sci. Ed.) 23 , 81–95 (2023).

Yan, D., Kong, Y., Ren, X. H., Shi, Y. K. & Chiang, S. W. The determinants of urban sustainability in Chinese resource-based cities: A panel quantile regression approach. Sci. Total Environ. 686 , 1210–1219 (2019).


Wang, J. M. & Yu, Z. L. Multiple heterogeneity of carbon emission reduction driven by low-carbon technology progress in resource-based cities. China Popul. Resour. Environ. 32 , 156–170 (2022).

Wen, Q., Hou, K. Y., Zheng, D. Y. & Yang, R. L. Evaluation of industrial transformation capability and optimization path of growing resource-based cities: A case study of Yulin, China. Sci. Geogr. Sin. 42 , 682–691 (2022).

Zeng, S. B., Jin, G., Tan, K. Y. & Liu, X. Can low-carbon city construction reduce carbon intensity? Empirical evidence from low-carbon city pilot policy in China. J. Environ. Manag. 332 , 117363 (2023).

Qiu, S. L., Wang, Z. L. & Liu, S. The policy outcomes of low-carbon city construction on urban green development: Evidence from a quasi-natural experiment conducted in China. Sustain Cities Soc. 66 , 102699 (2021).

Wen, S. Y., Jia, Z. J. & Chen, X. Q. Can low-carbon city pilot policies significantly improve carbon emission efficiency? Empirical evidence from China. J. Clean. Prod. 346 , 131131 (2022).

Lu, M., Xu, H. & Chen, F. X. Pollution and carbon reduction effects of the carbon emissions trading mechanism in the context of the ‘dual carbon’ goals. China Popul. Resour. Environ. 32 , 121–133 (2022).

Shen, Z. J., Wang, J. Y., Yang, K. Y. & Liu, J. Y. Coupling coordination relationship between new urbanization and low-carbon development in Shandong Province. Urban Probl. 328 , 94–103 (2022).

Wu, W. Z. & Zhang, T. Improvement and application of GM (1, 1) model. Stat. Decis. 35 , 15–18 (2019).

Xiong, P. P., Cao, S. R. & Yang, Z. Grey correlation analysis of carbon emissions in East China. J. Dalian Univ. Technol. Soc. Sci. 42 , 36–44 (2021).

Wen, S. B. & Liu, H. M. Research on energy conservation and carbon emission reduction effects and mechanism: Quasi-experimental evidence from China. Energy Policy 169 , 113180 (2022).

Liu, Y. & Liu, H. B. Characteristics, influence factors, and prediction of agricultural carbon emissions in Shandong province. Chin. J. Eco-Agric. 30 , 558–569 (2022).

Liu, Y. X. & Deng, X. R. An empirical study on the influencing factors of carbon emission in China: Based on fixed effect panel quantile regression model. J. Shanxi Univ. Philos. Soc. Sci. Ed. 44 , 86–96 (2021).

He, Y. Y. & Wei, Z. X. The relationship between industrial carbon emissions and economic growth: A validated analysis based on the decoupling between speed and quantity. J. Nat. Sci. Hunan Norm. Univ. 44 , 19–29 (2021).

Tang, S., Fu, J. W. & Wu, J. L. Analysis of influencing factors of carbon emission in typical cities of China. Stat. Decis. 37 , 59–63 (2021).

Ahmadi, Y., Yamazaki, A. & Kabore, P. How do carbon taxes affect emissions? Plant-level evidence from manufacturing. Environ. Resour Econ. 82 , 285–325 (2022).

Guo, Y. J., Zhu, Y. L. & Zhang, Y. Q. Study on the mechanism and spatial characteristics of technological progress on industrial carbon emission intensity: An empirical analysis based on panel quantile regression. Enterp. Econ. 39 , 71–78 (2020).

Liu, Z., Wang, Z. L. & Yuan, C. J. Impact of independent technological innovation on industrial carbon emissions and trend prediction from the perspective of structure. China Popul. Resour. Environ. 32 , 12–21 (2022).


This paper is supported by the Key Projects of Research Topic on Innovative Development of Social Sciences in Anhui Province (2022CX525), the Mining Enterprise Safety Management of Humanities and Social Science Key Research Base in Anhui Province (MF2022003), the Special Topic of Spirit Research and Interpretation of the Sixth Plenary Session of the 19th CPC Central Committee of AUST (sjjlzqh2021-15), the Postgraduate Innovation Fund Project of AUST (2021CX1013) and the Scientific Research Education Demonstration Project of AUST (KYX202123).

Author information

Authors and Affiliations

State Key Laboratory of Mining Response and Disaster Prevention and Control in Deep Coal Mines, Anhui University of Science and Technology, Huainan, Anhui, China

School of Humanities and Social Sciences, Anhui University of Science and Technology, Huainan, Anhui, China

Biao Zhou, Kexin Li & Yuewen Liu

School of Economics and Management, Anhui University of Science and Technology, Huainan, Anhui, China

Shuhang Zhao



D.D., conceptualization, literature analysis and writing—review; B.Z., software, literature search, visualization and writing—original draft; S.Z., software, writing—review and editing; K.L., data curation; Y.L., typography and format checking. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Shuhang Zhao .

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Dai, D., Zhou, B., Zhao, S. et al. Research on industrial carbon emission prediction and resistance analysis based on CEI-EGM-RM method: a case study of Bengbu. Sci Rep 13 , 14528 (2023). https://doi.org/10.1038/s41598-023-41857-0

Download citation

Received : 23 March 2023

Accepted : 01 September 2023

Published : 04 September 2023

DOI : https://doi.org/10.1038/s41598-023-41857-0

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

By submitting a comment you agree to abide by our Terms and Community Guidelines . If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Quick links

  • Explore articles by subject
  • Guide to authors
  • Editorial policies

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

case study research data analysis


Science Inventory


BMD Analysis of Multiple Endpoints in Human Health Risk Assessment: Chloroprene Case Study

Farrar, D., T. Blessinger, Allen Davis, Jeff Gift, M. Wheeler, AND Lily Wang. BMD Analysis of Multiple Endpoints in Human Health Risk Assessment: Chloroprene Case Study. Society of Environmental Toxicology and Chemistry Europe Annual Meeting, Dublin, IRELAND, April 30 - May 04, 2023.


Due to the challenges of using current BMD methods to analyze and interpret toxicity data with multiple outcomes and complex relationships for risk assessment, such as the wide ranges and high variability that result from analyzing one endpoint at a time, there is increasing interest in and need for advanced BMD methods that integrate multiple outcomes into a single analysis. To promote scientific exchange on BMD method development and application, SETAC Europe 2023 organized a session on "Advanced Multivariate Benchmark Dose (BMD) Approach for the Application of High-throughput Toxicology Data to Chemical Risk Assessment", and I was selected as the primary chair to host the session and to present. The major goal of this session is to present advanced, effective multivariate BMD methods that can simultaneously model all the dose-response data obtained from a traditional in vivo animal study or in vitro high-throughput toxicity screening in a single analysis, yielding more robust results and valuable biological and toxicological information to support risk-assessment decision-making. As a professional in environmental toxicity data analysis and human health risk assessment, I would like to take this opportunity to learn new techniques, methods and advances in the field, expand my knowledge and skills, and find solutions to problems in our current research, such as the HERA advanced multivariate BMD modeling approach I am currently leading. The conference will also let me talk with scientists one-on-one for advice on how to enhance our own work, and to ask presenters and session participants about their work and the rationale behind it, which I cannot do when reading journal articles.


Human health risk assessment often evaluates multiple endpoint variables in order to form a comprehensive picture of the toxic effects of an environmental chemical or toxicant. Benchmark dose (BMD) modeling is a flexible method that takes into account the shape of the dose-response curve and important measures of uncertainty and variability in the data. While BMD modeling improves on earlier (e.g., NOAEL) methods, existing BMD models are designed to evaluate single or independent endpoints, not multiple related endpoints simultaneously. Development of an advanced BMD approach for efficient analysis of multiple related endpoints could therefore benefit risk assessment. This presentation covers our research on univariate, ordinal and multivariate analysis of multiple endpoints, including simultaneous analysis of multiple correlated endpoints using environmental chemical toxicity data. Our results demonstrate that multivariate BMD approaches produce comparable and stable BMDLs in a single analysis, with less variability and more robust estimation of adverse health outcomes. The presentation also reviews the major approaches currently used in multivariate modeling of multiple endpoints for human health risk assessment and the challenges in applying these models, such as model choice and model averaging, clustered data, Bayesian versus likelihood-based (frequentist) methods, and the analysis of aggregate versus individual data. (The information in this abstract has been subjected to review by the Center for Public Health and Environmental Assessment and approved for presentation. Approval does not signify that the contents reflect the views of the Agency, nor does mention of trade names or commercial products constitute endorsement or recommendation for use.)

Record Details:


  1. 11: Principle data analysis scheme of this case study

    case study research data analysis

  2. Qualitative Case Study Research Example

    case study research data analysis

  3. Case Study Qualitative Research Example

    case study research data analysis

  4. How to Customize a Case Study Infographic With Animated Data

    case study research data analysis

  5. Session 3 Research Methods

    case study research data analysis

  6. Single case study research

    case study research data analysis


  1. What are data analysis methods

  2. Case Study Research in Software Engineering

  3. Qualitative Approach

  4. case study in research methodology

  5. Research Methodology

  6. Research Methodology and Data Analysis-Refresher Course


  1. Qualitative case study data analysis: an example from practice

    Qualitative case study data analysis: an example from practice This paper illustrates specific strategies that can be employed when conducting data analysis in case study research and other qualitative research designs.

  2. Case Study Method: A Step-by-Step Guide for Business Researchers

    Case study method is the most widely used method in academia for researchers interested in qualitative research ( Baskarada, 2014 ). Research students select the case study as a method without understanding array of factors that can affect the outcome of their research.

  3. Four Steps to Analyse Data from a Case Study Method

    case study is one of the many qualitative and quantitative methods that can be adopted to collect data for research. Such methods represent part of what is referred to as the research strategy that details the design and data collection approaches to be used in the research (Fowler and Mangione, 1990).

  4. What Is a Case Study?

    Step 1: Select a case Step 2: Build a theoretical framework Step 3: Collect your data Step 4: Describe and analyze the case Other interesting articles When to do a case study A case study is an appropriate research design when you want to gain concrete, contextual, in-depth knowledge about a specific real-world subject.

  5. PDF Analyzing Case Study Evidence

    128CASESTUDYRESEARCH Tip:How do I start analyzing my case study data? Youmightstartwith questions(e.g.,thequestions inyourcasestudyprotocol) ratherthanwiththedata.Startwitha smallquestionfirst,thenidentifyyour evidencethataddressesthequestion.

  6. Doing Data Science: A Framework and Case Study

    Case Study Application—Data Governance and Ingestion. Selected variables from data sources in Table 2 were profiled and cleaned (indicated by the asterisks). Two unique sets of data requiring careful governance were discovered and included in the study. ... The appropriate statistical analysis is a function of the research question, the ...

  7. PDF Comparing the Five Approaches

    interviews in phenomenology, multiple forms in case study research to provide the in-depth case picture). At the data analysis stage, the differences are most pronounced. Not only is the distinction one of specificity of the analysis phase (e.g., grounded the-ory most specific, narrative research less defined) but the number of steps to be under-

  8. Writing a Case Analysis Paper

    Case analysis is a problem-based teaching and learning method that involves critically analyzing complex scenarios within an organizational setting for the purpose of placing the student in a "real world" situation and applying reflection and critical thinking skills to contemplate appropriate solutions, decisions, or recommended courses of action.


    Besides discussing case study design, data collection, and analysis, the refresher addresses several key features of case study research. First, an abbreviated definition of ... As briefly introduced in this chapter, case study research involves systematic data collection and analysis procedures, and case study findings can be generalized to

  10. What the Case Study Method Really Teaches

    Cases expose students to real business dilemmas and decisions. Cases teach students to size up business problems quickly while considering the broader organizational, industry, and societal ...

  11. PDF Kurt Schoch I

    Case study research involves a detailed and intensive analysis of a particular event, situation, orga- nization, or social unit. Typically, a case has a defined space and time frame: "a phenomenon of some sort in a bounded context" (Miles, Huberman, & Saldaña, 2014, p. 28).

  12. Case Study Methodology of Qualitative Research: Key Attributes and

    Case study is a research strategy, and not just a method/technique/process of data collection. 2. A case study involves a detailed study of the concerned unit of analysis within its natural setting. A de-contextualised study has no relevance in a case study research.

  13. Learning to Do Qualitative Data Analysis: A Starting Point

    The types of qualitative research included: 24 case studies, 19 generic qualitative studies, and eight phenomenological studies. Notably, about half of the articles reported analyzing their qualitative data via content analysis and a constant comparative method, which was also commonly referred to as a grounded theory approach and/or inductive ...

  14. UCSF Guides: Qualitative Research Guide: Case Studies

    According to the book Understanding Case Study Research, case studies are "small scale research with meaning" that generally involve the following: The study of a particular case, or a number of cases. That the case will be complex and bounded. That it will be studied in its context. That the analysis undertaken will seek to be holistic.


    As case study research is a flexible research method, qualitative data analysis methods are commonly used [176]. The basic objective of the analysis is, as in any other analysis, to derive conclusions from the data, keeping a clear chain of evidence. The chain of evidence means that a reader ...
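The "chain of evidence" idea can be made concrete: each coded finding is linked back to the raw excerpts and sources that support it, so a reader can trace a conclusion to its data. A minimal sketch, with invented source identifiers and quotations:

```python
from dataclasses import dataclass, field

@dataclass
class Evidence:
    source: str   # e.g. an interview transcript identifier (invented here)
    excerpt: str  # the raw quotation supporting the code

@dataclass
class Coded:
    code: str
    evidence: list[Evidence] = field(default_factory=list)

# Invented example: one coded finding traced back to its raw data.
chain = Coded(
    code="preparedness for practice",
    evidence=[
        Evidence("interview-03", "The lab meant I knew what to expect on the ward."),
        Evidence("focus-group-1", "We had rehearsed the procedure before meeting patients."),
    ],
)

def trace(coded: Coded) -> list[str]:
    """Render the chain of evidence as 'code <- source: excerpt' lines."""
    return [f"{coded.code} <- {e.source}: {e.excerpt}" for e in coded.evidence]

for line in trace(chain):
    print(line)
```

However the audit trail is stored, the point is the same: no conclusion should float free of the evidence that produced it.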

  16. Qualitative Case Study Data Analysis: An Example from Practice

    This is especially important in case study research, as data sets are often large and drawn from multiple sources of evidence, necessitating a rigorous analysis method to handle the data (Houghton et al ...

  17. Case Study

    A case study is a research method that involves an in-depth examination and analysis of a particular phenomenon or case, such as an individual, organization, community, event, or situation. It is a qualitative research approach that aims to provide a detailed and comprehensive understanding of the case being studied.

  18. Continuing to enhance the quality of case study methodology in health

    Purpose of case study methodology. Case study methodology is often used to develop an in-depth, holistic understanding of a specific phenomenon within a specified context. 11 It focuses on studying one or multiple cases over time and uses an in-depth analysis of multiple information sources. 16,17 It is ideal for situations including, but not limited to, exploring under-researched and real ...

  19. Case Study

    Revised on 30 January 2023. A case study is a detailed study of a specific subject, such as a person, group, place, event, organisation, or phenomenon. Case studies are commonly used in social, educational, clinical, and business research. A case study research design usually involves qualitative methods, but quantitative methods are sometimes ...

  20. Research Methods

    Research methods are specific procedures for collecting and analyzing data. Developing your research methods is an integral part of your research design. When planning your methods, there are two key decisions you will make. First, decide how you will collect data.

  21. Planning Qualitative Research: Design and Decision Making for New

    Data collected from a case study or an ethnography can undergo the same types of analyses since the data analysis requires researchers to triangulate the diversity of data. This triangulation strengthens the research findings because "various strands of data are braided together to promote a greater understanding of the case" ( Baxter ...
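The "braiding together" of data strands described above can be sketched as source triangulation: codes assigned independently to several data sources are combined, and a code is treated as corroborated only when more than one source supports it. The sources and codes below are invented for illustration.

```python
# Invented example: codes assigned independently to three data sources.
codes_by_source = {
    "interviews": {"realism", "confidence", "feedback"},
    "observations": {"realism", "feedback"},
    "documents": {"feedback", "curriculum alignment"},
}

def triangulate(codes_by_source: dict[str, set[str]],
                min_sources: int = 2) -> dict[str, list[str]]:
    """Map each corroborated code to the list of sources supporting it."""
    support: dict[str, list[str]] = {}
    for source, codes in codes_by_source.items():
        for code in codes:
            support.setdefault(code, []).append(source)
    # Keep only codes supported by at least `min_sources` sources.
    return {code: srcs for code, srcs in support.items()
            if len(srcs) >= min_sources}

print(sorted(triangulate(codes_by_source)))
```

Codes supported by a single source are not discarded in real analysis; they are simply flagged as uncorroborated and revisited, which is one way triangulation strengthens the findings.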

  22. Data Analysis in Research: Types & Methods

    The article covers: what data analysis in research is, why data are analysed in research, data analysis in qualitative research, finding patterns in qualitative data, and methods used for data analysis in qualitative research.

  26. Three ways to integrate social justice into mixed-methods research

    Social justice research needs a purposeful emphasis on justice throughout the research process. Justice is considered not only at the end but throughout the entire research design: conceptualization, creating tools, data collection, analysis, sharing findings, and continuing to engage with communities.

  27. How to Perform Case Study Using Excel Data Analysis

    Steps: click any cell in the dataset, then choose Home > Analyze Data. An Analyze Data pane appears on the right side of the Excel window, suggesting analyses such as pivot tables and pivot charts, for example a sample pivot table of Sales and Profit by Category.