Load the necessary libraries, then import and load the dataset.
Q1. Show the last 10 records of the dataset. (2 points)
Q2. Show the first 10 records of the dataset. (2 points)
Q3. Show the dimension of the dataset. (2 points)
Q5. Print the information about all the variables of the dataset. (2 points)
Q6. Check for missing values. (2 points)
Q7. How many missing values are present? (2 points)
Q8. Get the initial data with the NA values dropped (name it 'df'). (2 points)
Q9. Get the summary of the original data (before dropping the 'na' values). (2 points)
Q10. Check the information of the new dataframe. (2 points)
Q11. Get the unique start destinations. (2 points)
Note: This question is based on the dataframe with no 'na' values in the 'START' variable.
Q12. What is the total number of unique start destinations? (2 points)
Note: This question is based on the dataframe with no 'na' values in the 'START' variable.
Q13. Print the total number of unique stop destinations. (2 points)
Note: This question is based on the dataframe with no 'na' values in the 'STOP' variable.
Q14. Print all the Uber trips that have San Francisco as the starting point. (2 points)
Note: Use the original dataframe without dropping the 'na' values.
Out[18]:
     START_DATE*       END_DATE*         CATEGORY*  START*         STOP*       MILES*  PURPOSE*
362  05-09-2016 14:39  05-09-2016 15:06  Business   San Francisco  Palo Alto   20.5    Between Offices
440  6/14/2016 16:09   6/14/2016 16:39   Business   San Francisco  Emeryville  11.6    Meeting
836  10/19/2016 14:02  10/19/2016 14:31  Business   San Francisco  Berkeley    10.8    NaN
917  11-07-2016 19:17  11-07-2016 19:57  Business   San Francisco  Berkeley    13.2    Between Offices
919  11-08-2016 12:16  11-08-2016 12:49  Business   San Francisco  Berkeley    11.3    Meeting
927  11-09-2016 18:40  11-09-2016 19:17  Business   San Francisco  Oakland     12.7    Customer Visit
933  11-10-2016 15:17  11-10-2016 15:22  Business   San Francisco  Oakland     9.9     Temporary Site
966  11/15/2016 20:44  11/15/2016 21:00  Business   San Francisco  Berkeley    11.8    Temporary Site
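          START_DATE*         END_DATE* CATEGORY*         START*       STOP*  \
362  05-09-2016 14:39  05-09-2016 15:06  Business  San Francisco   Palo Alto
440   6/14/2016 16:09   6/14/2016 16:39  Business  San Francisco  Emeryville
836  10/19/2016 14:02  10/19/2016 14:31  Business  San Francisco    Berkeley
917  11-07-2016 19:17  11-07-2016 19:57  Business  San Francisco    Berkeley
919  11-08-2016 12:16  11-08-2016 12:49  Business  San Francisco    Berkeley
927  11-09-2016 18:40  11-09-2016 19:17  Business  San Francisco     Oakland
933  11-10-2016 15:17  11-10-2016 15:22  Business  San Francisco     Oakland
966  11/15/2016 20:44  11/15/2016 21:00  Business  San Francisco    Berkeley

     MILES*         PURPOSE*
362    20.5  Between Offices
440    11.6          Meeting
836    10.8              NaN
917    13.2  Between Offices
919    11.3          Meeting
927    12.7   Customer Visit
933     9.9   Temporary Site
966    11.8   Temporary Site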
dataset[dataset['START*']=='San Francisco']
print(dataset[dataset['START*']=='San Francisco'])
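The inspection and missing-value steps in Q1 to Q14 can be sketched as follows. This is a minimal illustration, not the notebook's own solution: it rebuilds only the eight San Francisco rows shown in the output above instead of reading the full file, so every count it produces applies to that sample only. The names `dataset` and `df` follow the questions; `rows`, `columns`, and the other variable names are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Assumption: only the eight rows printed above are used here. The real
# notebook would load the full Uber drives file with pd.read_csv instead.
columns = ["START_DATE*", "END_DATE*", "CATEGORY*", "START*", "STOP*", "MILES*", "PURPOSE*"]
rows = [
    (362, "05-09-2016 14:39", "05-09-2016 15:06", "Business", "San Francisco", "Palo Alto", 20.5, "Between Offices"),
    (440, "6/14/2016 16:09", "6/14/2016 16:39", "Business", "San Francisco", "Emeryville", 11.6, "Meeting"),
    (836, "10/19/2016 14:02", "10/19/2016 14:31", "Business", "San Francisco", "Berkeley", 10.8, np.nan),
    (917, "11-07-2016 19:17", "11-07-2016 19:57", "Business", "San Francisco", "Berkeley", 13.2, "Between Offices"),
    (919, "11-08-2016 12:16", "11-08-2016 12:49", "Business", "San Francisco", "Berkeley", 11.3, "Meeting"),
    (927, "11-09-2016 18:40", "11-09-2016 19:17", "Business", "San Francisco", "Oakland", 12.7, "Customer Visit"),
    (933, "11-10-2016 15:17", "11-10-2016 15:22", "Business", "San Francisco", "Oakland", 9.9, "Temporary Site"),
    (966, "11/15/2016 20:44", "11/15/2016 21:00", "Business", "San Francisco", "Berkeley", 11.8, "Temporary Site"),
]
dataset = pd.DataFrame([r[1:] for r in rows], index=[r[0] for r in rows], columns=columns)

print(dataset.tail(10))   # Q1: last 10 records (all 8 here, the sample is shorter)
print(dataset.head(10))   # Q2: first 10 records
print(dataset.shape)      # Q3: (rows, columns)
dataset.info()            # Q5: dtypes and non-null counts per column

# Q6/Q7: missing values per column, then the overall total
missing_per_column = dataset.isnull().sum()
total_missing = int(missing_per_column.sum())

# Q8: copy of the data with every row containing a missing value dropped
df = dataset.dropna()

# Q9/Q10: summary of the original data, information about the new frame
summary = dataset.describe()
df.info()

# Q11-Q13: unique destinations, ignoring rows where that column is missing
unique_starts = dataset.dropna(subset=["START*"])["START*"].unique()
n_unique_starts = dataset["START*"].dropna().nunique()
n_unique_stops = dataset["STOP*"].dropna().nunique()

# Q14: boolean mask on the original frame (no rows dropped)
sf_trips = dataset[dataset["START*"] == "San Francisco"]
print(sf_trips)
```

Note that `dropna()` with no arguments removes a row if any column is missing, whereas `dropna(subset=[...])` only looks at the named columns, which is what the per-question notes about 'START' and 'STOP' ask for.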
Q15. What is the most popular starting point for the Uber drivers? (2 points)
Note: This question is based on the dataframe with no 'na' values in the 'START' variable.
Q16. What is the most popular dropping point for the Uber drivers? (2 points)
Note: This question is based on the dataframe with no 'na' values in the 'STOP' variable.
Q17. List the most frequent route taken by Uber drivers. (3 points)
Note: This question is based on the dataframe with no 'na' values.
Q18. Print all types of purposes for the trip in an array. (3 points)
Note: This question is based on the dataframe with no 'na' values in the 'PURPOSE' variable.
Q19. Plot a bar graph of Purposes vs Distance. (3 points)
Q20. Print a dataframe of Purposes and the distance travelled for that particular Purpose. (3 points)
Q21. Plot number of trips vs Category of trips. (3 points)
Q22. What proportion of trips is Business and what proportion is Personal? (3 points)
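One possible approach to the aggregation questions (Q15 to Q22) is sketched below. It is an illustration under stated assumptions rather than the expected answer: the frame is rebuilt from the eight San Francisco rows printed for Q14, so the results describe that sample only, and all variable names other than `dataset` and `df` are hypothetical. The plots for Q19 and Q21 are only indicated in comments, since drawing them requires matplotlib.

```python
import numpy as np
import pandas as pd

# Assumption: the eight rows printed for Q14 stand in for the full file.
columns = ["START_DATE*", "END_DATE*", "CATEGORY*", "START*", "STOP*", "MILES*", "PURPOSE*"]
rows = [
    (362, "05-09-2016 14:39", "05-09-2016 15:06", "Business", "San Francisco", "Palo Alto", 20.5, "Between Offices"),
    (440, "6/14/2016 16:09", "6/14/2016 16:39", "Business", "San Francisco", "Emeryville", 11.6, "Meeting"),
    (836, "10/19/2016 14:02", "10/19/2016 14:31", "Business", "San Francisco", "Berkeley", 10.8, np.nan),
    (917, "11-07-2016 19:17", "11-07-2016 19:57", "Business", "San Francisco", "Berkeley", 13.2, "Between Offices"),
    (919, "11-08-2016 12:16", "11-08-2016 12:49", "Business", "San Francisco", "Berkeley", 11.3, "Meeting"),
    (927, "11-09-2016 18:40", "11-09-2016 19:17", "Business", "San Francisco", "Oakland", 12.7, "Customer Visit"),
    (933, "11-10-2016 15:17", "11-10-2016 15:22", "Business", "San Francisco", "Oakland", 9.9, "Temporary Site"),
    (966, "11/15/2016 20:44", "11/15/2016 21:00", "Business", "San Francisco", "Berkeley", 11.8, "Temporary Site"),
]
dataset = pd.DataFrame([r[1:] for r in rows], index=[r[0] for r in rows], columns=columns)
df = dataset.dropna()  # frame with no missing values, as in Q8

# Q15/Q16: most frequent value after dropping missing entries in that column
top_start = dataset["START*"].dropna().value_counts().idxmax()
top_stop = dataset["STOP*"].dropna().value_counts().idxmax()

# Q17: count each (start, stop) pair and take the most frequent one
route_counts = df.groupby(["START*", "STOP*"]).size()
top_route = route_counts.idxmax()

# Q18: distinct purposes as a NumPy array, in order of first appearance
purposes = dataset["PURPOSE*"].dropna().unique()

# Q19/Q20: total miles per purpose (groupby skips missing keys by default)
miles_by_purpose = dataset.groupby("PURPOSE*")["MILES*"].sum()
purpose_distance = miles_by_purpose.reset_index()
# With matplotlib installed, miles_by_purpose.plot(kind="bar") draws Q19.

# Q21/Q22: trip counts and proportions per category
trips_by_category = dataset["CATEGORY*"].value_counts()
proportions = dataset["CATEGORY*"].value_counts(normalize=True)
business_share = proportions.get("Business", 0.0)
personal_share = proportions.get("Personal", 0.0)  # absent in this sample
# With matplotlib installed, trips_by_category.plot(kind="bar") draws Q21.

print(top_start, top_stop, top_route)
print(purposes)
print(purpose_distance)
print(business_share, personal_share)
```

Using `value_counts(normalize=True)` returns the shares directly, which avoids dividing counts by the row total by hand.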