VIVA

Visual Analytics

Data Analysis for US Hospital Inpatient Discharges in 2013.

Year

January - April, 2015

Type

Individual, Class, Analytics

Role

Visual Analyst, Data Wrangler

Tools

Tableau, Microsoft Excel

The analysis of this dataset was done for a Visual Analytics class in a certificate program held by the Vancouver Institute for Visual Analytics (VIVA). The dataset contains information about patients' conditions and backgrounds, how they paid for their treatment, dates of admission, and other details about their conditions and treatments. I used it to answer questions I posed about US hospital inpatient discharges in 2013.

Hospital Admissions

Did admissions or performance drop on certain days of the week, or did performance remain consistent throughout the week?

The scatterplots made it clear that admissions remained relatively consistent, but the data was not detailed enough to indicate performance drops on any day of the week.

Overall, how did each hospital facility perform under financial incentives, and how cost-effective was it?

Some information was not available, so the answer to this question remains partially speculative: many factors can skew the data, such as an expensive surgery with a short recovery period. Without more direct information about the length of surgeries or which conditions are tied to expensive surgeries, charges cannot be compared directly to length of stay. The dimensions I used to compare hospital facilities were total charges and length of stay. Although these don't show direct performance indicators between hospitals, they do show how much benefit an individual may gain for their expenses. For instance, a low total charge with a long length of stay may say something about how costly a hospital's service is.
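The comparison of total charges against length of stay can be sketched outside Tableau as a charges-per-day figure for each facility. This is only a minimal illustration, not the method used for the dashboards: the sample rows and the `charges_per_day` helper are hypothetical, and the field names are assumed to match the dimensions listed on this page.

```python
from collections import defaultdict

# Hypothetical sample rows; field names mirror the dimensions listed on this page.
discharges = [
    {"Facility Name": "Hospital A", "Length of Stay": 4, "Total Charges": 20000.0},
    {"Facility Name": "Hospital A", "Length of Stay": 6, "Total Charges": 30000.0},
    {"Facility Name": "Hospital B", "Length of Stay": 2, "Total Charges": 30000.0},
]


def charges_per_day(rows):
    """Return {facility: summed total charges / summed length of stay}."""
    charges = defaultdict(float)
    days = defaultdict(int)
    for row in rows:
        name = row["Facility Name"]
        charges[name] += row["Total Charges"]
        days[name] += row["Length of Stay"]
    # Skip facilities with zero recorded days to avoid dividing by zero.
    return {name: charges[name] / days[name] for name in charges if days[name] > 0}


per_day = charges_per_day(discharges)
for facility, cost in sorted(per_day.items()):
    print(f"{facility}: {cost:.2f} per day of stay")
```

A lower figure suggests more days of care for the same charges, with the same caveat as above: one expensive surgery followed by a short stay can dominate a facility's number.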

Risk Analysis

What were the types of conditions associated with the total and greatest risks and which age group was more likely to be at risk?

The dataset provided sufficient information for this question because these charts made it clear which age group had the most risk and which conditions had the highest total risk factor.

Financial Breakdown

What were the total charges for each hospital facility and how did individuals pay for their expenses?

Was it affordable for each individual or did individuals have to resort to another source of payment for coverage?

It was clear that many individuals covered their expenses with Medicare. I didn't create another dashboard showing alternative sources of payment, although it might have been useful for seeing what kind of financial situation individuals were in if their initial source of payment was insufficient.

Workflow

Collection

At first, I wanted to collect multiple .csv files and merge them into a more diverse dataset, creating opportunities for multivariate analysis. Originally, I intended to use a dataset about cancer research or one that covered common medical symptoms and their treatments, such as over-the-counter medications. However, I found a dataset about US hospital inpatient discharges on HealthData.gov with good variety in the data.

Wrangling

This dataset required little cleaning. Most of the cleaning I did consisted of hiding null values in dimensions and hiding diagnosed conditions that I thought were uncommon. I tried to visualize information that people would recognize and relate to.
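For readers who want to reproduce this kind of filtering in code rather than in Tableau, a rough sketch follows. It is an assumption-heavy illustration: the rows are made up, the `visible_rows` helper is hypothetical, and the `min_count` threshold stands in for my own judgment of which conditions were uncommon.

```python
from collections import Counter

# Hypothetical sample rows; values are illustrative only.
rows = [
    {"CCS Diagnosis Description": "Septicemia", "Age Group": "70 or Older"},
    {"CCS Diagnosis Description": "Septicemia", "Age Group": "50 to 69"},
    {"CCS Diagnosis Description": "Rare condition", "Age Group": "30 to 49"},
    {"CCS Diagnosis Description": "", "Age Group": "18 to 29"},
    {"CCS Diagnosis Description": "Septicemia", "Age Group": None},
]


def visible_rows(rows, min_count=2):
    """Return the rows to display; the input list itself is left untouched."""
    # Hide rows where any dimension is null or empty.
    complete = [r for r in rows if all(v not in (None, "") for v in r.values())]
    # Hide diagnoses that appear fewer than min_count times (assumed threshold).
    counts = Counter(r["CCS Diagnosis Description"] for r in complete)
    return [r for r in complete if counts[r["CCS Diagnosis Description"]] >= min_count]


shown = visible_rows(rows)
print(len(shown), "of", len(rows), "rows shown")
```

Because the function returns a new list, the underlying data stays unaltered, which matches hiding values in a view rather than deleting them.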

I planned on using Google Refine to group related values, such as heart diseases and other commonalities in the data, but kept the data unaltered because there was good variation in the dimensions. For instance, grouping diagnosed conditions into categories such as aortic diseases or cancers would have simplified the data too much. However, creating a classified copy of these dimensions in the future would be useful for an overall analysis of disease categories.
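A classified copy of that kind could look roughly like the sketch below, which adds a category field next to the original diagnosis instead of replacing it. The mapping, the diagnosis names in it, the "Diagnosis Category" field and the `add_category` helper are all hypothetical; a real mapping would have to be built from the dataset's actual values.

```python
# Hypothetical mapping from detailed diagnoses to broad categories.
CATEGORY_BY_DIAGNOSIS = {
    "Coronary atherosclerosis": "Heart disease",
    "Acute myocardial infarction": "Heart disease",
    "Cancer of breast": "Cancer",
}


def add_category(rows, mapping, default="Other"):
    """Return copies of the rows with an added 'Diagnosis Category' field."""
    return [
        {**row, "Diagnosis Category": mapping.get(row["CCS Diagnosis Description"], default)}
        for row in rows
    ]


# Hypothetical sample rows.
sample = [
    {"CCS Diagnosis Description": "Acute myocardial infarction"},
    {"CCS Diagnosis Description": "Cancer of breast"},
    {"CCS Diagnosis Description": "Septicemia"},
]

classified = add_category(sample, CATEGORY_BY_DIAGNOSIS)
for row in classified:
    print(row["CCS Diagnosis Description"], "->", row["Diagnosis Category"])
```

Keeping the detailed description alongside the category preserves the variation in the original dimension while still allowing a roll-up by disease category.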

Analysis

The analysis began with propositions and questions about general medical conditions, treatments and costs that needed to be answered or explored. After settling on some key questions, I produced relevant charts that share a common dimension within each dashboard, making it easier to relate information between the charts.

Audience

These visualizations could be aimed at US hospital inpatient personnel evaluating their overall performance, patients reviewing which hospitals offer the best performance and cost, and economists who want to see trends across US hospital inpatient facilities. Doctors, surgeons and pharmacists may also be interested in this type of data because it offers a breakdown of the types of conditions they could be researching or the types of patients they could be expecting.

Dimensions

HOSPITAL PERFORMANCE

Date of Admission

[Facility Name]
[Admit Day of the Week : Mon - Fri]
[Length of Stay : SUM]
[Total Charges : SUM]

[Admit Day of the Week : Mon - Fri]
[APR Risk of Mortality : Extreme]
[APR Medical Surgical Description : Surgery]
[Emergency Department Indicator]

Type of Admission

[Facility Name]
[Type of Admission : COUNT]

[Admit Day of the Week : Mon - Fri]
[APR Risk of Mortality : Extreme]
[APR Medical Surgical Description : Surgery]
[Type of Admission]

RISK ANALYSIS

Diagnosis Description Count

[Facility Name]
[CCS Diagnosis Description]
[APR Risk of Mortality : COUNT]

[CCS Procedure Description]
[Risk of Mortality]

Risk of Mortality Count for Age Groups

[Age Group]
[Admit Day of the Week]
[Type of Admission]
[APR Risk of Mortality : COUNT]

Risk of Mortalities for Age Groups

[Age Group]
[APR Risk of Mortality]

[APR Risk of Mortality : COUNT]

FINANCIAL BREAKDOWN

Total Charges for Medical Procedures

[CCS Procedure Description]
[Total Charges : SUM]

[APR Risk of Mortality]

Total Charges for Facilities

[Facility Name]
[APR Medical Surgical Description]

[Total Charges : SUM]

Source of Payments

[Source of Payment 1 : COUNT]
[Total Charges : SUM]

Lesson

Challenges

One issue with working from this single dataset was that I was limited in the number of things I could compare in a detailed analysis. Most of the dimensions in the dataset were irrelevant to my analysis, and the useful ones weren't detailed enough. For instance, a more detailed analysis would have been possible with dimensions such as the state where each inpatient admission took place, and with an explanation of what the minor, moderate, major and extreme levels of risk of mortality meant.

Reflection

Given more time with the project, I would have looked for more datasets that were related to this one so that I could perform a multivariate analysis across multiple dimensions, comparing data points across two datasets. This would have yielded richer and more detailed insights into the current state of inpatient admissions to US hospitals.
