Homework 5: Pivot Tables and Iteration
Please complete this notebook by filling in the cells provided. Before you begin, execute the following cell to load the provided tests. Each time you start your server, you will need to execute this cell again to load the tests.

Homework 5 is due Wednesday, 10/09 at 11:59pm. Start early so that you can come to office hours if you're stuck. Check the website for the office hours schedule.

Directly sharing answers is not okay, but discussing problems with the course staff or with other students is encouraged. Refer to the policies page to learn more about how to learn cooperatively.

For all problems that require you to write out explanations and sentences, you must provide your answer in the designated space. Moreover, throughout this homework and all future ones, please be sure not to re-assign variables throughout the notebook! For example, if you use max_temperature in your answer to one question, do not reassign it later on.

In [3]:
    # Don't change this cell; just run it.
    import numpy as np
    from datascience import *

    # These lines do some fancy plotting magic.
    import matplotlib
    %matplotlib inline
    import matplotlib.pyplot as plt
    plt.style.use('fivethirtyeight')
    import warnings
    warnings.simplefilter('ignore', FutureWarning)

    from client.api.notebook import Notebook
    ok = Notebook('hw05.ok')
    _ = ok.auth(inline=True)

1. Causes of Death by Year

This exercise is designed to give you practice using the Table method pivot. Here is a link to the Python reference page in case you need a quick refresher.
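To make concrete what a pivot computes, here is a minimal pure-Python sketch. It is not the datascience implementation: it only cross-tabulates records into one row per cause and one column per year, summing the counts and filling missing combinations with 0. The function name `pivot_sum` and the toy records are made up for illustration. On a datascience table `t`, the analogous call would presumably be `t.pivot('Year', 'Cause of Death', 'Count', sum)`, assuming the argument order is (columns, rows, values, collect).

```python
from collections import defaultdict


def pivot_sum(records):
    """Cross-tabulate (column_key, row_key, value) records, summing values.

    Returns {row_key: {column_key: total}}, with 0 for combinations
    that never appear in the records.
    """
    sums = defaultdict(int)
    row_keys, column_keys = set(), set()
    for column_key, row_key, value in records:
        sums[(row_key, column_key)] += value
        row_keys.add(row_key)
        column_keys.add(column_key)
    return {
        row: {col: sums.get((row, col), 0) for col in sorted(column_keys)}
        for row in sorted(row_keys)
    }


# Toy (year, cause, count) records, invented for this sketch; not real data.
toy_records = [
    (1999, "ALZ", 1),
    (1999, "ALZ", 1),
    (1999, "DIA", 1),
    (2000, "ALZ", 3),
    (2000, "DIA", 2),
    (2000, "DIA", 4),
    (2000, "SUI", 1),
]

for cause, by_year in pivot_sum(toy_records).items():
    print(cause, by_year)
```

Each distinct value of the row key becomes one output row, each distinct value of the column key becomes one output column, and every cell aggregates the values that share both keys.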
We'll be looking at a dataset from the California Department of Public Health that records the cause of death, as recorded on a death certificate, for everyone who died in California from 1999 to 2013. The data is in the file causes_of_death.csv.zip. Each row records the number of deaths by a specific cause in one year in one ZIP code.

To make the file smaller, we've compressed it; run the next cell to unzip and load it. The first line is not a Python statement. Anything appearing after ! on a line will be executed not by the Python kernel, but by the system command line. If you have a Windows machine, the first line might not work. If that's the case, you'll need to comment it out and unzip the file manually, giving it the name causes_of_death.csv.

In [4]:
    !unzip -o causes_of_death.csv.zip
    causes = Table.read_table('causes_of_death.csv')
    causes

The causes of death in the data are abbreviated. We've provided a table called abbreviations.csv to translate the abbreviations.
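On machines where the `!unzip` shell line does not work, the manual unzip step can also be done from Python with the standard-library `zipfile` module. This is a sketch of one possible alternative, not part of the assignment; the helper name `extract_archive` is made up for illustration.

```python
import zipfile


def extract_archive(zip_path, dest_dir="."):
    """Extract every member of zip_path into dest_dir.

    Existing files with the same names are overwritten, similar in
    spirit to `unzip -o`. Returns the list of extracted member names.
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# In the notebook, this call would stand in for the `!unzip -o ...` line:
# extract_archive("causes_of_death.csv.zip")
```

The call is left commented out so the block can be read and run without the archive present; after extraction, `Table.read_table('causes_of_death.csv')` would proceed as in the cell above.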
    Archive:  causes_of_death.csv.zip
      inflating: causes_of_death.csv

Out[4]:
    Year  ZIP Code  Cause of Death  Count  Location
    1999  90002     SUI             1      (33.94969, -118.246213)
    1999  90005     HOM             1      (34.058508, -118.301197)
    1999  90006     ALZ             1      (34.049323, -118.291687)
    1999  90007     ALZ             1      (34.029442, -118.287095)
    1999  90009     DIA             1      (33.9452, -118.3832)
    1999  90009     LIV             1      (33.9452, -118.3832)
    1999  90009     OTH             1      (33.9452, -118.3832)
    1999  90010     STK             1      (34.060633, -118.302664)
    1999  90010     CLD             1      (34.060633, -118.302664)
    1999  90010     DIA             1      (34.060633, -118.302664)
    ... (320142 rows omitted)

In [5]:
    abbreviations = Table.read_table('abbreviations.csv')
    abbreviations.show()

    Cause of Death  Cause of Death (Full Description)
    AID             Acquired Immune Deficiency Syndrome (AIDS)
    ALZ             Alzheimer's Disease
    CAN             Malignant Neoplasms (Cancers)
    CLD             Chronic Lower Respiratory Disease (CLRD)
    CPD             Chronic Obstructive Pulmonary Disease (COPD)
    DIA             Diabetes Mellitus
    HIV             Human Immunodeficiency Virus Disease (HIVD)
    HOM             Homicide
    HTD             Diseases of the Heart
    HYP             Essential Hypertension and Hypertensive Renal Disease
    INJ             Unintentional Injuries
    LIV             Chronic Liver Disease and Cirrhosis
    NEP             Kidney Disease (Nephritis)
    OTH             All Other Causes
    PNF             Pneumonia and Influenza
    STK             Cerebrovascular Disease (Stroke)
    SUI             Intentional Self Harm (Suicide)

The dataset is missing data on certain causes of death for certain years. It looks like those causes of death are relatively rare, so for some purposes it makes sense to drop them from consideration. Of course, we'll have to keep in mind that we're no longer looking at a comprehensive report on all deaths in California.

Question 1. Let's clean up our data. First, filter out the HOM, HYP, and NEP rows from the table for the reasons described above. Next, join together the abbreviations table and our causes of death table so that we have a more detailed description of each disease in each row. Lastly, drop the column which contains the acronym of the disease, and rename the column with the full description to 'Cause of Death'. Assign the variable cleaned_causes to the resulting table.

Hint: You should expect this to take more than one line. Use many lines and many intermediate tables to complete this question.

In [8]:
    # Filter out the three causes with missing data.
    cleaned_causes = (causes.where("Cause of Death", are.not_equal_to("HOM"))
                            .where("Cause of Death", are.not_equal_to("HYP"))
                            .where("Cause of Death", are.not_equal_to("NEP")))
    # Attach the full description of each abbreviation.
    cleaned_causes = cleaned_causes.join("Cause of Death", abbreviations, "Cause of Death")
    # Drop the abbreviation column, then rename the description column (index 4).
    cleaned_causes = cleaned_causes.drop("Cause of Death")
    cleaned_causes = cleaned_causes.relabeled(4, "Cause of Death")
    cleaned_causes

Out[8]:
    Year  ZIP Code  Count  Location                  Cause of Death
    1999  90006     1      (34.049323, -118.291687)  Alzheimer's Disease
    1999  90007     1      (34.029442, -118.287095)  Alzheimer's Disease
    1999  90012     1      (34.061396, -118.238479)  Alzheimer's Disease
    1999  90015     1      (34.043439, -118.271613)  Alzheimer's Disease
    1999  90017     1      (34.055864, -118.266582)  Alzheimer's Disease
    1999  90020     1      (34.066535, -118.302211)  Alzheimer's Disease
    1999  90031     1      (34.078349, -118.211279)  Alzheimer's Disease
    1999  90033     1      (34.048676, -118.208442)  Alzheimer's Disease
    1999  90042     1      (34.114527, -118.192902)

In [9]:
    answer_cleaned_causes = cleaned_causes.copy()
    _ = ok.grade('q1_1')
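The four steps in Question 1 (filter, join, drop, relabel) can also be traced on plain Python data. The sketch below is not the datascience solution; it mirrors the same steps with dictionaries on a few rows copied from the table preview, with the Location column left out for brevity. The names `EXCLUDED` and `clean_rows` are chosen for illustration, and skipping codes that have no description is an assumption meant to mimic an inner join.

```python
EXCLUDED = {"HOM", "HYP", "NEP"}

# A few abbreviation -> description pairs from abbreviations.csv.
abbreviations = {
    "ALZ": "Alzheimer's Disease",
    "DIA": "Diabetes Mellitus",
    "HOM": "Homicide",
    "SUI": "Intentional Self Harm (Suicide)",
}

# A few rows from the causes table preview (Location omitted).
rows = [
    {"Year": 1999, "ZIP Code": 90002, "Cause of Death": "SUI", "Count": 1},
    {"Year": 1999, "ZIP Code": 90005, "Cause of Death": "HOM", "Count": 1},
    {"Year": 1999, "ZIP Code": 90006, "Cause of Death": "ALZ", "Count": 1},
    {"Year": 1999, "ZIP Code": 90009, "Cause of Death": "DIA", "Count": 1},
]


def clean_rows(rows, abbreviations, excluded=EXCLUDED):
    """Filter excluded causes, then swap each abbreviation for its description."""
    cleaned = []
    for row in rows:
        code = row["Cause of Death"]
        # Filter step; codes without a description are skipped as well.
        if code in excluded or code not in abbreviations:
            continue
        # Drop step: copy every column except the abbreviation.
        new_row = {k: v for k, v in row.items() if k != "Cause of Death"}
        # Join and relabel steps: store the description under the old name.
        new_row["Cause of Death"] = abbreviations[code]
        cleaned.append(new_row)
    return cleaned


cleaned = clean_rows(rows, abbreviations)
for row in cleaned:
    print(row)
```

Because the description is added after the abbreviation column is dropped, 'Cause of Death' ends up as the last column, matching the column order shown in Out[8].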
About the document
Uploaded On
Apr 16, 2021
Number of pages
11