Lab - Internet Meter Anomaly Detection
Objectives
**Part 1: Feature Engineering**
**Part 2: Euclidean Anomaly Detection**
Scenario/Background
Anomaly-detection algorithms locate those datapoints that stand out from a pattern. For example, algorithms of this kind can be used to test the safety of airplane
engines by recording quantities such as fuel consumption, temperature, and so on. Whenever the measurements display extreme values, such as unusually high
temperature, anomaly detection alerts the operator, who can then take action to resolve potential issues. Constant improvement of safety standards is not unique
to the transport sector, and these algorithms find applications in all branches of industry, from food manufacturing to the production of toys for children.
Required Resources
1 PC with Internet access
Raspberry Pi version 2 or higher
Python libraries: numpy, pandas, matplotlib
Datafiles: rpi_data_processed.csv
Part 1: Feature Engineering
Step 1: Import Python Libraries.
In this step, you will import Python libraries.
# Code Cell 1
import numpy as np
import pandas as pd
Step 2: Create a Dataframe and modify the quantities.
The quantities that are recorded when gathering data, also known as features, may require some transformation before analysis. Consider, for example, the quantity called
'ping', obtained when measuring internet speed. This feature describes an interval of time, whereas the other quantities being monitored,
namely the download and upload rates, have dimensions of inverse time. Because of this mismatch, 'ping' is not the optimal choice for statistical analysis. Better results
are achieved using a related feature, which we will call 'ping rate'. It is calculated by applying the simple transformation
ping rate = 1 / ping time
This process of modifying quantities in preparation for analysis is termed 'feature engineering', and is generally an important part of the machine-learning workflow.
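As a quick numeric illustration of the ping-to-rate transformation, here is a minimal sketch. It assumes the ping times are recorded in milliseconds, which should be checked against the actual data, and the sample values are made up for illustration rather than taken from the lab file.

```python
import numpy as np

# Hypothetical ping times in milliseconds (illustrative values, not lab data).
ping_ms = np.array([20.0, 25.0, 50.0])

# 1 / (ping_ms / 1000) simplifies to 1000 / ping_ms:
# converting milliseconds to seconds and inverting gives a rate in 1/seconds.
ping_rate = 1000.0 / ping_ms

print(ping_rate)
```

A shorter ping therefore maps to a larger ping rate, so all three monitored features now share the property that larger values mean a better connection.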
Load the internet-speed data from the file rpi_data_processed.csv into a pandas dataframe named df. Using this as a starting point, generate another
dataframe, df_rates, whose three columns are download_rate, upload_rate and ping_rate, in that order. When computing this last feature, make sure that
the result is given in units of 1/seconds.
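One possible way to approach this step is sketched below. The helper `build_rates`, the source column names (`"Download (Mbit/s)"`, `"Upload (Mbit/s)"`, `"Ping (ms)"`) and the assumption that ping is stored in milliseconds are all hypothetical. Inspect `df.columns` on the real file and adjust accordingly. A small made-up dataframe stands in for the CSV so that the snippet runs on its own.

```python
import pandas as pd


def build_rates(df, download_col, upload_col, ping_col, ping_in_ms=True):
    """Return a dataframe with download_rate, upload_rate and ping_rate columns.

    ping_rate is expressed in 1/seconds. If ping_in_ms is True, the ping
    column is assumed to hold milliseconds and is converted to seconds first.
    """
    ping_seconds = df[ping_col] / 1000.0 if ping_in_ms else df[ping_col]
    return pd.DataFrame({
        "download_rate": df[download_col],
        "upload_rate": df[upload_col],
        "ping_rate": 1.0 / ping_seconds,
    })


# In the lab, load the real file instead of the stand-in below:
# df = pd.read_csv("rpi_data_processed.csv")

# Made-up stand-in data with assumed column names (not the lab's actual values).
df = pd.DataFrame({
    "Ping (ms)": [20.0, 25.0, 50.0],
    "Download (Mbit/s)": [90.1, 88.4, 91.7],
    "Upload (Mbit/s)": [14.2, 13.9, 14.5],
})

df_rates = build_rates(df, "Download (Mbit/s)", "Upload (Mbit/s)", "Ping (ms)")
print(df_rates.head())
```

Passing the column names as arguments keeps the sketch independent of how the CSV labels its columns. Setting `ping_in_ms=False` skips the unit conversion if the ping values turn out to be stored in seconds already.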