326_621a-2018fall

326.621A Homework 3

Due November 21, 2018 @ 11:59PM

Due November 25, 2018 @ 11:59PM

Q1 R for Data Science Chapter 3

Problem 3.7.1.5

Problem 3.8.1.3

Problem 3.9.1.4

Q2 R for Data Science Chapter 5

Problem 5.2.4.4

Problem 5.3.1.3

Problem 5.4.1.2

Problem 5.5.2.5

Problem 5.6.7.5

Problem 5.7.1.7

Q3 R for Data Science Chapter 10

Problem 10.5.3

Q4 R for Data Science Chapter 11

Problem 11.2.2.5

Problem 11.3.5.7

Q5 R for Data Science Chapter 13

Problem 13.2.1.4

Problem 13.3.1.2

Problem 13.4.6.1, 13.4.6.2, 13.4.6.3

Problem 13.5.1.1, 13.5.1.2, 13.5.1.4

Q6 LA City Parking War

The SQLite database /home/stat326_621a/data/la_parking/LA_Parking_Citations.sqlite on the teaching server contains information about parking tickets in the City of Los Angeles, U.S.A. It was downloaded from Kaggle; the original dataset is available from the LA Open Data Portal. Connect to the database and answer the following questions using plots and summary statistics. In this exercise, you are not allowed to load the whole data set into memory. Use the "transform in database, plot in R" strategy.

  1. How many tickets are in this data set? Which time period do these tickets span? Which years have the most data?

  2. When (which hour, weekday, day of the month, and month) are you most likely to get a ticket, and when are you least likely to get one?

  3. Which car makes received the most citations?

  4. How many different colors of cars were ticketed? Which color attracted the most tickets?

  5. What are the most common ticket types?

  6. How much money was collected on parking tickets in 2015 and 2016?

  7. Visualize any other information you are interested in.
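
The "transform in database, plot in R" strategy might look roughly like the sketch below. It is not a solution to any question above: it uses a tiny in-memory SQLite table as a stand-in for the course database, and the table name `citations` and the columns `issue_hour` and `fine_amount` are hypothetical. Inspect the real schema with `dbListTables()` and `dbListFields()` before adapting it. The sketch assumes the DBI, RSQLite, dplyr, dbplyr, and ggplot2 packages are installed.

```r
library(DBI)
library(dplyr)
library(ggplot2)

# Stand-in for the course database: a small in-memory SQLite table.
# Table and column names here are hypothetical, not the real schema.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "citations", data.frame(
  issue_hour  = c(8L, 8L, 9L, 12L, 12L, 12L),
  fine_amount = c(73, 63, 68, 73, 93, 58)
))

# tbl() creates a lazy reference; no rows are pulled into R yet.
citations <- tbl(con, "citations")

# Transform in database: dbplyr translates the grouped count to SQL,
# SQLite executes it, and collect() retrieves only the small summary.
tickets_by_hour <- citations %>%
  group_by(issue_hour) %>%
  summarise(n_tickets = n()) %>%
  arrange(issue_hour) %>%
  collect()

# Plot in R: the summary has one row per hour, so it fits in memory.
p <- ggplot(tickets_by_hour, aes(x = issue_hour, y = n_tickets)) +
  geom_col() +
  labs(x = "Hour of day", y = "Number of tickets")
print(p)

dbDisconnect(con)
```

With the real database, only the `dbConnect()` call would change: it would point at the SQLite file path given above instead of `":memory:"`, and the `dbWriteTable()` step would be dropped. Calling `show_query()` on the lazy pipeline before `collect()` is one way to check that the work is being done in SQL rather than in R.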