COGS 108 - Assignment 4: Data Analysis
Important Reminders

You must submit this file (A4_DataAnalysis.ipynb) to TritonED to finish the homework.

This assignment has hidden tests: tests that are not visible here, but that will be run on your submitted assignment for grading. This means that passing all the tests you can see in the notebook does not guarantee you have the right answer! In particular, many of the tests you can see simply check that the right variable names exist. Hidden tests check the actual values. It is up to you to check the values and make sure they seem reasonable.

A reminder: restart the kernel and re-run the code as a first-line check if things seem to go weird. For example, some cells can only be run once, because they re-write a variable (for example, your dataframe) and change it in a way that means a second execution will fail. Also, running some cells out of order might change the dataframe in ways that may cause an error, which can be fixed by re-running.

Notes - Assignment Outline

Parts 1-6 of this assignment are modeled on a minimal example of a project notebook. This mimics, and gets you working with, something like what you will need for your final project.

Parts 7 & 8 break from the project narrative, and are OPTIONAL (UNGRADED). They serve instead as a couple of quick one-offs to get you working with some other methods that might be useful to incorporate into your project.

Setup

Data: the responses collected from a survey of the COGS 108 class. There are 417 observations in the data, covering 10 different 'features'.

Research Question: Do students in different majors have different heights?

Background: Physical height has previously been shown to correlate with career choice and career success. More recently it has been demonstrated that these correlations can actually be explained by height in high school, as opposed to height in adulthood (1).
It is currently unclear whether height correlates with choice of major in university.

Reference: 1) http://economics.sas.upenn.edu/~apostlew/paper/pdf/short.pdf

Hypothesis: We hypothesize that there will be a relation between height and chosen major.

Part 1: Load & Clean the Data

Fixing messy data makes up a large amount of the work of being a Data Scientist.

# Run this cell to ensure you have the correct version of patsy
# You only need to do the installation once
# Once you have run it you can comment out these two lines so that the cell doesn't execute every time.
import sys
!conda install --yes --prefix {sys.prefix} patsy=0.5.1

Collecting package metadata: done
Solving environment: done

## Package Plan ##

  environment location: /Users/brianbarry/anaconda3

  added / updated specs:
    - patsy=0.5.1

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    conda-4.6.7                |           py36_0         1.6 MB
    openssl-1.1.1b             |       h1de35cc_0         3.4 MB
    patsy-0.5.1                |           py36_0         376 KB
    ------------------------------------------------------------
                                           Total:         5.5 MB

The following packages will be UPDATED:

    conda      4.6.4-py36_0 --> 4.6.7-py36_0
    openssl    1.1.1a-h1de35cc_0 --> 1.1.1b-h1de35cc_0
    patsy      0.5.0-py37_0 --> 0.5.1-py36_0

Downloading and Extracting Packages
conda-4.6.7          | 1.6 MB    | ##################################### | 100%
patsy-0.5.1          | 376 KB    | ##################################### | 100%
openssl-1.1.1b       | 3.4 MB    | ##################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

# Imports - These are all you need for the assignment: do not import additional packages
%matplotlib inline

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import patsy
import statsmodels.api as sm
import scipy.stats as stats
from scipy.stats import ttest_ind, chisquare, normaltest

# Note: the statsmodels import may print out a 'FutureWarning'. That's fine.
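As a rough illustration of how the imported `ttest_ind` could be applied to the research question, the sketch below compares heights between two majors. It is not the assignment's solution: the column names `major` and `height`, the two major labels, and every value are assumptions, and the data is synthetic, standing in for the real survey responses.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind

# Synthetic stand-in for the survey data. The column names 'major' and
# 'height', the major labels, and all values are assumptions for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "major": ["COGSCI"] * 50 + ["COMPSCI"] * 50,
    "height": np.concatenate([
        rng.normal(66.0, 3.0, 50),  # hypothetical heights, in inches
        rng.normal(68.0, 3.0, 50),
    ]),
})

# Pull out the heights for each group as plain arrays.
h_a = df.loc[df["major"] == "COGSCI", "height"].to_numpy()
h_b = df.loc[df["major"] == "COMPSCI", "height"].to_numpy()

# Independent two-sample t-test on the group means.
t_val, p_val = ttest_ind(h_a, h_b)
print(f"t = {t_val:.2f}, p = {p_val:.4f}")
```

A small p-value here would only say that the two synthetic groups differ in mean height; with the real data, checks such as `normaltest` on each group would be a sensible step before reading anything into the result.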
About the document: uploaded Jul 21, 2022; 20 pages.