The code below was produced to accompany the Cushing/Whitney Medical Library blog post, "Poison Yesterday and Today."
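In brief, the code below reads a CDC WONDER export of deaths from external causes in Connecticut, isolates the accidental-poisoning rows, drops the one row whose rate is flagged as unreliable, combines the three drug-related categories into a single group, and plots the resulting death counts as a bar chart.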
Data citation: Centers for Disease Control and Prevention, National Center for Health Statistics. National Vital Statistics System, Mortality 2018-2021 on CDC WONDER Online Database, released in 2021. Data are from the Multiple Cause of Death Files, 2018-2021, as compiled from data provided by the 57 vital statistics jurisdictions through the Vital Statistics Cooperative Program. Accessed at http://wonder.cdc.gov/ucd-icd10-expanded.html on Mar 16, 2023.
See caveats from CDC WONDER below as well.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn
#read in data
cdc_ct_death = pd.read_csv("CDCWONDER_underlying_death_cause_external_causes_2018-2021.csv",
                           dtype={"Cause of death": str,
                                  "Cause of death Code": str,
                                  "Deaths": int,
                                  "Population": int})
#personal style preference here for columns
def make_cols_snake_case(df):
    df.columns = [x.lower() for x in df.columns]
    df.columns = df.columns.str.replace("[ ]", "_", regex=True)
make_cols_snake_case(cdc_ct_death)
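The helper above renames the columns in place. As a minimal sketch of the same transformation on a toy frame (the frame `toy` and the function name `snake_case_columns` are illustrative and not part of the original notebook), a variant that returns a renamed copy leaves its input untouched and can be chained with other pandas calls:

```python
import pandas as pd


def snake_case_columns(df):
    """Return a copy of df with lower-cased, underscore-separated column names."""
    # Illustrative helper, not the notebook's own function.
    return df.rename(columns=lambda name: name.lower().replace(" ", "_"))


# Toy frame reusing two column names from the CSV's dtype mapping.
toy = pd.DataFrame({"Cause of death": ["Unspecified fall"],
                    "Cause of death Code": ["W19"]})

renamed = snake_case_columns(toy)
print(list(renamed.columns))
print(list(toy.columns))  # the original frame keeps its headers
```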
#check info and make sure function above took
cdc_ct_death.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 63 entries, 0 to 62
Data columns (total 5 columns):
 #   Column                 Non-Null Count  Dtype
---  ------                 --------------  -----
 0   cause_of_death         63 non-null     object
 1   cause_of_death_code    63 non-null     object
 2   deaths                 63 non-null     int64
 3   population             63 non-null     int64
 4   crude_rate_per_100000  63 non-null     object
dtypes: int64(2), object(3)
memory usage: 2.6+ KB
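The `info()` output lists `crude_rate_per_100000` as `object` rather than a numeric dtype, presumably because some rows hold the string `Unreliable` instead of a number. If a numeric column is needed, one option is `pd.to_numeric` with `errors="coerce"`, which turns non-numeric entries into `NaN`. The sketch below uses a three-row toy frame with values copied from this dataset; the names `rates` and `crude_rate_numeric` are illustrative:

```python
import pandas as pd

# Toy rows shaped like the CDC WONDER export: rates arrive as strings.
rates = pd.DataFrame({
    "cause_of_death_code": ["X44", "X45", "X40"],
    "crude_rate_per_100000": ["20.10", "1.60", "Unreliable"],
})

# Coerce to float; anything that is not a number (here "Unreliable") becomes NaN.
rates["crude_rate_numeric"] = pd.to_numeric(rates["crude_rate_per_100000"],
                                            errors="coerce")
print(rates.dtypes)
print(rates)
```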
#preview data
cdc_ct_death
| | cause_of_death | cause_of_death_code | deaths | population | crude_rate_per_100000 |
|---|---|---|---|---|---|
0 | Total | not applicable | 9192 | 10727890 | 85.70 |
1 | Accidental poisoning by and exposure to other ... | X44 | 2159 | 10727890 | 20.10 |
2 | Accidental poisoning by and exposure to narcot... | X42 | 1669 | 10727890 | 15.60 |
3 | Unspecified fall | W19 | 1215 | 10727890 | 11.30 |
4 | Person injured in unspecified motor-vehicle ac... | V89.2 | 459 | 10727890 | 4.30 |
... | ... | ... | ... | ... | ... |
58 | Urinary catheterization | Y84.6 | 11 | 10727890 | Unreliable |
59 | Pedestrian injured in collision with car, pick... | V03.1 | 10 | 10727890 | Unreliable |
60 | Motorcycle rider injured in noncollision trans... | V28.4 | 10 | 10727890 | Unreliable |
61 | Fall from, out of or through building or struc... | W13 | 10 | 10727890 | Unreliable |
62 | Contact with other and unspecified machinery | W31 | 10 | 10727890 | Unreliable |
63 rows × 5 columns
#peel off data slice where cause of death references 'accidental poison'
cdc_ct_accidental_poison = cdc_ct_death[cdc_ct_death["cause_of_death"].str.contains("Accidental poison")]
cdc_ct_accidental_poison
| | cause_of_death | cause_of_death_code | deaths | population | crude_rate_per_100000 |
|---|---|---|---|---|---|
1 | Accidental poisoning by and exposure to other ... | X44 | 2159 | 10727890 | 20.10 |
2 | Accidental poisoning by and exposure to narcot... | X42 | 1669 | 10727890 | 15.60 |
9 | Accidental poisoning by and exposure to alcohol | X45 | 168 | 10727890 | 1.60 |
19 | Accidental poisoning by and exposure to antiep... | X41 | 57 | 10727890 | 0.50 |
39 | Accidental poisoning by and exposure to other ... | X47 | 21 | 10727890 | 0.20 |
50 | Accidental poisoning by and exposure to nonopi... | X40 | 13 | 10727890 | Unreliable |
#drop row (index=[50]) with unreliable result
cdc_ct_accidental_poison = cdc_ct_accidental_poison.drop(index=[50])
cdc_ct_accidental_poison
| | cause_of_death | cause_of_death_code | deaths | population | crude_rate_per_100000 |
|---|---|---|---|---|---|
1 | Accidental poisoning by and exposure to other ... | X44 | 2159 | 10727890 | 20.10 |
2 | Accidental poisoning by and exposure to narcot... | X42 | 1669 | 10727890 | 15.60 |
9 | Accidental poisoning by and exposure to alcohol | X45 | 168 | 10727890 | 1.60 |
19 | Accidental poisoning by and exposure to antiep... | X41 | 57 | 10727890 | 0.50 |
39 | Accidental poisoning by and exposure to other ... | X47 | 21 | 10727890 | 0.20 |
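Dropping by the hard-coded label `50` works for this particular export, but if the query were rerun and the row order changed, the same call would either drop a different row or raise a `KeyError`. A hedged alternative, sketched on a toy frame that copies three rows from the output above (the name `sample` is illustrative), filters on the flag value itself:

```python
import pandas as pd

# Three rows copied from the accidental-poisoning slice, keeping their labels.
sample = pd.DataFrame(
    {"cause_of_death_code": ["X45", "X47", "X40"],
     "deaths": [168, 21, 13],
     "crude_rate_per_100000": ["1.60", "0.20", "Unreliable"]},
    index=[9, 39, 50],
)

# Keep only rows whose rate is not flagged, whatever their index label is.
reliable = sample[sample["crude_rate_per_100000"] != "Unreliable"]
print(reliable)
```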
#look at full cause of death strings in a list
ct_poison_cause_of_death_strings_list = cdc_ct_accidental_poison["cause_of_death"].to_list()
ct_poison_cause_of_death_strings_list
['Accidental poisoning by and exposure to other and unspecified drugs, medicaments and biological substances', 'Accidental poisoning by and exposure to narcotics and psychodysleptics [hallucinogens], not elsewhere classified', 'Accidental poisoning by and exposure to alcohol', 'Accidental poisoning by and exposure to antiepileptic, sedative-hypnotic, antiparkinsonism and psychotropic drugs, not elsewhere classified', 'Accidental poisoning by and exposure to other gases and vapours']
#make other columns lists for tiny df
ct_poison_deaths = cdc_ct_accidental_poison["deaths"].to_list()
ct_population = cdc_ct_accidental_poison["population"].to_list()
ct_crude_rate_per_100K = cdc_ct_accidental_poison["crude_rate_per_100000"].to_list()
#make new tiny df with only rows interested in and make new short titles for better viz
ct_poison = pd.DataFrame({'ct_poison_cause_of_death_short': ['Drugs, etc.', 'Narcotics, etc.', 'Alcohol', 'Antiepileptic drugs, etc.', "Gases and vapours"],
                          'ct_poison_cause_of_death_long': ct_poison_cause_of_death_strings_list,
                          'ct_poison_deaths': ct_poison_deaths,
                          'ct_population': ct_population,
                          'ct_poison_death_rate_per_100K': ct_crude_rate_per_100K
                          })
#preview data again
ct_poison
| | ct_poison_cause_of_death_short | ct_poison_cause_of_death_long | ct_poison_deaths | ct_population | ct_poison_death_rate_per_100K |
|---|---|---|---|---|---|
0 | Drugs, etc. | Accidental poisoning by and exposure to other ... | 2159 | 10727890 | 20.10 |
1 | Narcotics, etc. | Accidental poisoning by and exposure to narcot... | 1669 | 10727890 | 15.60 |
2 | Alcohol | Accidental poisoning by and exposure to alcohol | 168 | 10727890 | 1.60 |
3 | Antiepileptic drugs, etc. | Accidental poisoning by and exposure to antiep... | 57 | 10727890 | 0.50 |
4 | Gases and vapours | Accidental poisoning by and exposure to other ... | 21 | 10727890 | 0.20 |
#combine all drug rows into single variables to use in even smaller dataframe below
combined_drugs_poison_death_count = ct_poison.ct_poison_deaths[0] + ct_poison.ct_poison_deaths[1] + ct_poison.ct_poison_deaths[3]
combined_drugs_cause_of_death_long = ct_poison.ct_poison_cause_of_death_long[0] + " OR "+ ct_poison.ct_poison_cause_of_death_long[1] + " OR " + ct_poison.ct_poison_cause_of_death_long[3]
print("Total combined number of deaths from drug poisoning:", combined_drugs_poison_death_count)
print(" ")
print("Total combined text strings (each separated by 'OR') about cause of death from poison --", combined_drugs_cause_of_death_long)
Total combined number of deaths from drug poisoning: 3885

Total combined text strings (each separated by 'OR') about cause of death from poison -- Accidental poisoning by and exposure to other and unspecified drugs, medicaments and biological substances OR Accidental poisoning by and exposure to narcotics and psychodysleptics [hallucinogens], not elsewhere classified OR Accidental poisoning by and exposure to antiepileptic, sedative-hypnotic, antiparkinsonism and psychotropic drugs, not elsewhere classified
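The positional sums above can also be expressed as a grouping. The sketch below rebuilds the five reliable rows by hand and assumes a hypothetical mapping, `broad_group`, from ICD-10 code to a display label (the labels mirror the short titles used in this notebook). It is an alternative formulation, not the notebook's own method:

```python
import pandas as pd

# Codes and death counts copied from the accidental-poisoning table.
poison = pd.DataFrame({
    "cause_of_death_code": ["X44", "X42", "X45", "X41", "X47"],
    "deaths": [2159, 1669, 168, 57, 21],
})

# Hypothetical lookup: which broad category each ICD-10 code belongs to.
broad_group = {"X44": "Drugs", "X42": "Drugs", "X41": "Drugs",
               "X45": "Alcohol", "X47": "Gases and vapours"}

combined = (poison
            .assign(group=poison["cause_of_death_code"].map(broad_group))
            .groupby("group", sort=False)["deaths"]  # sort=False keeps first-seen order
            .sum())
print(combined)
```

Keeping the grouping in a dictionary means a category can be reassigned by editing one entry rather than several index positions.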
#making even smaller dataset for ideal viz
ct_poison_min = pd.DataFrame({'ct_poison_cause_of_death_short': ['Drugs', 'Alcohol', "Gases and vapours"],
                              'ct_poison_cause_of_death_long': [combined_drugs_cause_of_death_long, ct_poison_cause_of_death_strings_list[2], ct_poison_cause_of_death_strings_list[4]],
                              'ct_poison_deaths': [combined_drugs_poison_death_count, ct_poison.ct_poison_deaths[2], ct_poison.ct_poison_deaths[4]],
                              'ct_population': ct_population[0:3]
                              })
#use seaborn to plot - https://seaborn.pydata.org/generated/seaborn.barplot.html
fig = seaborn.barplot(data=ct_poison_min,
                      x="ct_poison_cause_of_death_short",
                      y="ct_poison_deaths",
                      palette="rocket",
                      orient="v")
fig.set_xlabel("Accidental poisoning types")
fig.set_ylabel("Number of deaths")
fig.set_title("Accidental poisoning deaths in Connecticut, 2018-2021*")
plt.savefig("ct_poison_2018-21.png")
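Because the drug total is far larger than the other two categories, the shorter bars can be hard to read off the axis. One option is to print the count on each bar. The sketch below uses plain matplotlib rather than seaborn so that it does not depend on a particular seaborn version, hard-codes the three counts computed above, and relies on `Axes.bar_label`, which is available in matplotlib 3.4 and later:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Death counts taken from the combined table in this notebook.
categories = ["Drugs", "Alcohol", "Gases and vapours"]
deaths = [3885, 168, 21]

fig_example, ax = plt.subplots()
bars = ax.bar(categories, deaths)

# Write each bar's height above it; returns one Text object per bar.
bar_texts = ax.bar_label(bars)

ax.set_xlabel("Accidental poisoning types")
ax.set_ylabel("Number of deaths")
```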
#simplified version of combined text above - also see caveats below
print("* See full definition of 'drugs' below, plus caveats below.")
print("Drugs include other and unspecified drugs, medicaments and biological substances; narcotics and psychodysleptics [hallucinogens], not elsewhere classified; antiepileptic, sedative-hypnotic, antiparkinsonism and psychotropic drugs, not elsewhere classified.")
* See full definition of 'drugs' below, plus caveats below.
Drugs include other and unspecified drugs, medicaments and biological substances; narcotics and psychodysleptics [hallucinogens], not elsewhere classified; antiepileptic, sedative-hypnotic, antiparkinsonism and psychotropic drugs, not elsewhere classified.
Caveats: Data are Suppressed when the data meet the criteria for confidentiality constraints.
Death rates are flagged as Unreliable when the rate is calculated with a numerator of 20 or less.
The population figures for year 2021 are single-race estimates of the July 1 resident population, from the Vintage 2021 postcensal series released by the Census Bureau on June 30, 2022. The population figures for year 2020 are single-race estimates of the July 1 resident population, from the Vintage 2020 postcensal series released by the Census Bureau on July 27, 2021. The population figures for year 2019 are single-race estimates of the July 1 resident population, from the Vintage 2019 postcensal series released by the Census Bureau on June 25, 2020. The population figures for year 2018 are single-race estimates of the July 1 resident population, from the Vintage 2018 postcensal series released by the Census Bureau on June 20, 2019.
The population figures used in the calculation of death rates for the age group 'under 1 year' are the estimates of the resident population that is under one year of age.
Beginning with the 2018 data, changes have been implemented that affect the counts for ICD-10 cause of death codes O00-O99 compared to previous practice. In addition, data for the cause of death codes O00-O99 for 2003 through 2017 reflect differences in information available to individual states and probable errors. Caution should be used in interpreting these data. More information can be found at: https://www.cdc.gov/nchs/maternal-mortality/.
On March 11, 2021, the 2019 mortality data on CDC WONDER was updated with the 2019 mortality data updated by NCHS on March 4, 2021 to include corrected information for residents of Texas affecting 5 records previously coded to cause code *U01.4, Terrorism involving firearms (homicide). The underlying and multiple cause of death codes for 5 records were corrected in the 2019 data. Underlying and multiple cause of death codes for those 5 records were recoded to Assault (homicide) by other and unspecified firearm discharge, ICD-10 code X95. The corrected final death records replaces the data released on December 22, 2020.
Changes to cause of death classification affect reporting trends.
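The reliability rule quoted in the caveats can be checked against the table values in this notebook. The sketch below assumes the crude rate is deaths divided by population, scaled to 100,000 and shown to one decimal place, and that a row is flagged when its death count is 20 or fewer, as the caveat states. The function name `crude_rate_label` and its parameters are illustrative and not part of CDC WONDER:

```python
def crude_rate_label(deaths, population, per=100_000, unreliable_at_or_below=20):
    """Return the crude rate as text, or 'Unreliable' for small numerators."""
    # Assumption: the threshold follows the caveat text (numerator of 20 or less).
    if deaths <= unreliable_at_or_below:
        return "Unreliable"
    return f"{deaths / population * per:.1f}"


population = 10_727_890  # population figure shown on every row of the export

# (ICD-10 code, deaths) pairs copied from the accidental-poisoning rows.
for code, deaths in [("X44", 2159), ("X45", 168), ("X47", 21), ("X40", 13)]:
    print(code, crude_rate_label(deaths, population))
```

With 21 deaths, X47 sits just above the threshold and keeps its rate, while X40 with 13 deaths is flagged, which matches the rows shown earlier.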
Help:
See Underlying Cause of Death, 2018-2021, Single Race Documentation for more information.
Query Date: Mar 16, 2023 2:51:43 PM
Query Criteria:
ICD-10 Codes: V01-Y89 (External causes of morbidity and mortality)
States: Connecticut (09)
Year/Month: 2019; 2020; 2021
Group By: Cause of death
Show Totals: True
Show Zero Values: False
Show Suppressed: False
Calculate Rates Per: 100,000
Rate Options: Default intercensal populations for years 2001-2009 (except Infant Age Groups)