Data Science Methodology, week-3 Assignments

 


Which topic did you choose to apply the data science methodology to? (2 marks)

Hospitals

Next, you will play the role of the client and the data scientist.

Using the topic that you selected, complete the Business Understanding stage by coming up with a problem that you would like to solve and phrasing it in the form of a question that you will use data to answer. (3 marks)

You are required to:

  1. Describe the problem, related to the topic you selected.

  2. Phrase the problem as a question to be answered using data.

For example, using the food recipes use case discussed in the labs, the question that we defined was, "Can we automatically determine the cuisine of a given dish based on its ingredients?".


As a hospital administrator, I am concerned about the rising cost of patient care. I want to know whether particular factors are linked to higher patient costs, because that would help me identify areas where spending can be reduced without affecting the quality of care.

Question: What factors are associated with increased patient costs?

This question can be answered using data from the hospital's electronic health records (EHRs), which contain information on patient demographics, diagnoses, procedures, medications, and other factors that may be associated with patient costs. As the data scientist, I would run a statistical analysis to determine which factors are most strongly associated with patient costs. The results would point to cost-cutting opportunities and help the administrator make better decisions about how to allocate resources and improve hospital efficiency.
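As a rough illustration of the statistical analysis mentioned above, the sketch below ranks a few numeric factors by the strength of their Pearson correlation with cost. The records, the field names (age, length_of_stay, num_procedures, cost), and the helper functions are all hypothetical and invented for this example; they are not taken from any real EHR. A strong correlation only flags a factor for closer study and does not show that the factor causes higher costs.

```python
import math

# Hypothetical, made-up records used only for illustration.
records = [
    {"age": 34, "length_of_stay": 2, "num_procedures": 1, "cost": 4200.0},
    {"age": 71, "length_of_stay": 9, "num_procedures": 4, "cost": 18900.0},
    {"age": 58, "length_of_stay": 5, "num_procedures": 2, "cost": 9800.0},
    {"age": 45, "length_of_stay": 3, "num_procedures": 3, "cost": 7600.0},
    {"age": 80, "length_of_stay": 7, "num_procedures": 2, "cost": 13100.0},
    {"age": 29, "length_of_stay": 1, "num_procedures": 1, "cost": 2500.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_factors(rows, target="cost"):
    """Return (factor, r) pairs sorted by |r|, strongest association first."""
    target_values = [row[target] for row in rows]
    factors = [name for name in rows[0] if name != target]
    scores = {
        name: pearson([row[name] for row in rows], target_values)
        for name in factors
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


ranking = rank_factors(records)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

Categorical fields such as diagnosis codes would need to be encoded numerically, or analysed with a different method, before they could be ranked this way.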


Briefly explain how you would complete each of the following stages for the problem that you described in the Business Understanding stage, so that you are ultimately able to answer the question that you came up with. (5 marks):

  1. Analytic Approach

  2. Data Requirements

  3. Data Collection

  4. Data Understanding and Preparation

  5. Modeling and Evaluation

You can always refer to the labs as a reference when describing how you would complete each stage for your problem.


1: Analytic Approach: Because the question asks which factors are associated with cost, I would use statistical analysis (for example, correlation or regression) to relate patient costs to patient demographics, diagnoses, procedures, medications, and other factors, and to identify the factors most strongly correlated with patient costs.

2: Data Requirements: I would need data from the hospital's electronic health records (EHRs): patient demographics, diagnoses, procedures, medications, and other factors that could be correlated with patient costs, together with the cost of each patient's care.

3: Data Collection: I would access the EHRs through the hospital's data warehouse. I would then format the data by removing records with missing values and converting the data into a format suitable for statistical analysis.

4: Data Understanding and Preparation: I would explore the data to identify patterns, trends, and any factors that are correlated with patient costs. I would also clean the data to remove errors and inconsistencies.

5: Modeling and Evaluation: I would choose a statistical method that is appropriate for the data, train the model on part of the data, and evaluate the model's performance on a holdout dataset.

This approach would allow me to answer the question that I posed in the Business Understanding stage by identifying the factors that are most strongly correlated with patient costs.
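The preparation and evaluation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method the hospital would necessarily use: the raw rows are made up, length_of_stay is a hypothetical single predictor, a one-variable least-squares line stands in for "a statistical method that is appropriate for the data", and mean absolute error is just one possible performance measure.

```python
import random

# Hypothetical raw rows as they might come out of an extract:
# values arrive as strings and some are missing or invalid.
raw_rows = [
    {"length_of_stay": "2", "cost": "4200"},
    {"length_of_stay": "9", "cost": "18900"},
    {"length_of_stay": "", "cost": "5100"},      # missing predictor
    {"length_of_stay": "5", "cost": "9800"},
    {"length_of_stay": "3", "cost": "7600"},
    {"length_of_stay": "7", "cost": "n/a"},      # invalid cost
    {"length_of_stay": "7", "cost": "13100"},
    {"length_of_stay": "1", "cost": "2500"},
    {"length_of_stay": "4", "cost": "8300"},
    {"length_of_stay": "6", "cost": "12000"},
]


def clean(rows):
    """Drop rows with missing or non-numeric values; convert the rest to floats."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append((float(row["length_of_stay"]), float(row["cost"])))
        except (KeyError, TypeError, ValueError):
            continue  # skip records that cannot be used
    return cleaned


def split_holdout(pairs, holdout_fraction=0.25, seed=0):
    """Shuffle reproducibly and set aside a fraction of rows as the holdout set."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def fit_line(pairs):
    """Ordinary least squares for cost = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def mean_absolute_error(pairs, slope, intercept):
    """Average absolute gap between predicted and actual cost."""
    return sum(abs(y - (slope * x + intercept)) for x, y in pairs) / len(pairs)


data = clean(raw_rows)
train, holdout = split_holdout(data)
slope, intercept = fit_line(train)
mae = mean_absolute_error(holdout, slope, intercept)
print(f"usable rows: {len(data)}, train: {len(train)}, holdout: {len(holdout)}")
print(f"holdout mean absolute error: {mae:.0f}")
```

Keeping the holdout rows out of fit_line is the point of the exercise: the error is measured on records the model has not seen, which gives a more honest picture of how well the chosen factors explain cost.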