You can learn ML & DL on Coursera.
1 Supervised vs. Unsupervised Machine Learning
1.1 Supervised Machine Learning
Supervised learning learns from being given "right answers": a mapping from input x to output y.

- input -> output
- email -> spam
- audio -> transcript
- English -> Chinese

There are two main types (sketched in code below):

- Regression: predict a number from infinitely many possible outputs (e.g. house price prediction).
- Classification: predict a category from a small number of possible outputs (e.g. cancer detection).
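This sketch is only illustrative: the coefficients and the threshold are made-up values, not from the course.

# Regression: the output is a number; any value in a continuous range is possible.
def predict_price(size_in_1000_sqft):
    return 200 * size_in_1000_sqft + 100

# Classification: the output is one of a small set of categories.
def predict_tumor(size_in_cm):
    return "malignant" if size_in_cm > 3.0 else "benign"

print(predict_price(1.5))   # 400.0  -> a number
print(predict_tumor(2.0))   # benign -> a category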
1.2 Unsupervised Machine Learning
Unlike supervised learning, unsupervised machine learning is not given any labels. It has to find something interesting in unlabeled data on its own.
- Clustering: data only comes with inputs x, but not output labels y. The algorithm has to find structure in the data by grouping similar data points together (e.g. Google News, DNA microarrays, grouping customers); see the sketch after this list.
- Anomaly detection: find unusual data points.
- Dimensionality reduction: compress the data using fewer numbers.
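As a quick illustration of clustering, here is a minimal sketch using scikit-learn's KMeans; scikit-learn and the toy data points are my own additions, not part of the course labs.

import numpy as np
from sklearn.cluster import KMeans

# Four unlabeled 2-D points: two near (1, 1) and two near (5, 5)
X = np.array([[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9]])
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)  # the algorithm groups the points itself
print(labels)

A possible output is [0 0 1 1]: the two groups are found without ever seeing a label y.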
1.3 Jupyter Notebooks
We run Python in Jupyter notebooks.
#This is a 'Code' Cell
print("This is code cell")
This is code cell
We use the Python f-string style:
# print statements
variable = "right in the strings!"
print(f"hello, world")
print(f"{
variable}")
hello, world
right in the strings!
2 Regression Model
2.1 Linear Regression Model
- Training set:
  - x = "input" variable (feature)
  - y = "output" variable ("target")
  - m = number of training examples
- Learning algorithm: x (feature) -> f (model) -> y-hat (prediction / estimated y)
- Model f: f_w,b(x) = wx + b (see the sketch after this list)
  - w: weight
  - b: bias
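A minimal sketch of this model in plain Python; the values w = 200 and b = 100 are just example values (the next section happens to use the same ones).

def f_wb(x, w, b):
    return w * x + b          # prediction y-hat for input feature x

print(f_wb(1.5, 200, 100))    # 400.0, e.g. a 1.5 (1000 sqft) house predicted at 400 (1000s of dollars)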
2.2 Model Representation in Jupyter Notebooks
We use NumPy, a scientific computing library, and Matplotlib, a library for plotting data.
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('./deeplearning.mplstyle')
- How do we predict the house price from the following data?

size (1000 sqft)    price (1000s of dollars)
1                   300
2                   500

For size = 1.5, price = ?
Define the input feature x and the target y:
x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])
print(f"x_train = {
x_train}")
print(f"y_train = {
y_train}")
x_train = [1. 2.]
y_train = [300. 500.]
We use x_train.shape[0] to get the number m of training examples; .shape[n] is the length of the n-th dimension. You can also use len(x_train), which gives the same result here.
print(f"x_train.shape: {
x_train.shape}")
m = x_train.shape[0]
#m = len(x_train)
print(f"Number of training examples is: {
m}")
x_train.shape: (2,)
Number of training examples is: 2
We denote the i-th training example as (x^(i), y^(i)).
i = 0 # Change this to 1 to see (x^1, y^1)
x_i = x_train[i]
y_i = y_train[i]
print(f"(x^({
i}), y^({
i})) = ({
x_i}, {
y_i})")
(x^(0), y^(0)) = (1.0, 300.0)
Plot the data points: plt.scatter plots the points as a scatter plot; marker='x' sets the marker style to an x, and c='r' makes them red.
# Plot the data points
plt.scatter(x_train, y_train, marker='x', c='r')
# Set the title
plt.title("Housing Prices")
# Set the y-axis label
plt.ylabel('Price (in 1000s of dollars)')
# Set the x-axis label
plt.xlabel('Size (1000 sqft)')
plt.show()
Now we use our linear regression model f_w,b(x) = wx + b. The weight w and bias b are given:
w = 200
b = 100
print(f"w: {
w}")
print(f"b: {
b}")
w: 200
b: 100
We feed our training data into the linear regression model f_w,b(x) = wx + b. For a large number of data points, we need a loop:
def compute_model_output(x, w, b):
    m = x.shape[0]
    f_wb = np.zeros(m)
    for i in range(m):              # loop over the m training examples
        f_wb[i] = w * x[i] + b      # f_w,b(x^(i)) = w * x^(i) + b
    return f_wb
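A minimal sketch of how this function might be used with the x_train, y_train, w and b defined above (the plot styling mirrors the earlier scatter plot):

tmp_f_wb = compute_model_output(x_train, w, b)

# Plot our model prediction as a line over the actual data points
plt.plot(x_train, tmp_f_wb, c='b', label='Our Prediction')
plt.scatter(x_train, y_train, marker='x', c='r', label='Actual Values')
plt.title("Housing Prices")
plt.ylabel('Price (in 1000s of dollars)')
plt.xlabel('Size (1000 sqft)')
plt.legend()
plt.show()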