Course3_UnsupervisedLearning

Week 1

K-means clustering

cluster centroids

image-20240822150202574

K-means algorithm

image-20240822152224716
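As a quick reference, here is a minimal NumPy sketch of the two repeated K-means steps (assign every point to its closest centroid, then move each centroid to the mean of the points assigned to it); the toy data X and K = 2 are made up purely for illustration.

import numpy as np

def kmeans(X, K, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    # random initialization: pick K distinct training examples as the initial centroids
    centroids = X[rng.choice(len(X), K, replace=False)]
    for _ in range(iters):
        # assignment step: index of the closest centroid for every point
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: move each centroid to the mean of its assigned points
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    return centroids, labels

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + [5, 5]])   # two toy blobs
centroids, labels = kmeans(X, K=2)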

K-means optimization objective

image-20240822154355047

image-20240822154657010

random initialization

image-20240822162142918

elbow method

image-20240822170200049
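A rough sketch of the elbow method, reusing the kmeans helper from the sketch above: run K-means for several values of K, record the distortion (average squared distance from each point to its assigned centroid), and look for the "elbow" where the curve stops dropping quickly. matplotlib is assumed only for the plot.

import numpy as np
import matplotlib.pyplot as plt

Ks = range(1, 9)
distortions = []
for K in Ks:
    centroids, labels = kmeans(X, K)                              # kmeans and X from the sketch above
    J = np.mean(np.sum((X - centroids[labels]) ** 2, axis=1))     # distortion / cost J
    distortions.append(J)

plt.plot(Ks, distortions, marker='o')      # the "elbow" of this curve suggests a value for K
plt.xlabel('K (number of clusters)')
plt.ylabel('distortion J')
plt.show()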

How to choose the value of K

image-20240822170455122

anomaly detection

density estimation

image-20240823221504369

image-20240823225221422

image-20240824003555615

anomaly detection algorithm

image-20240824004627495
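A minimal sketch of the density-estimation algorithm: fit a Gaussian to every feature (mean and variance), multiply the per-feature densities to get p(x), and flag an example as an anomaly when p(x) < epsilon. The training data and the threshold below are made up for illustration; in practice epsilon is tuned on a labeled cross-validation set.

import numpy as np

def estimate_gaussian(X):
    # per-feature mean and variance
    return X.mean(axis=0), X.var(axis=0)

def density(X, mu, var):
    # p(x) = product over features of the univariate normal densities
    p = np.exp(-(X - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    return p.prod(axis=1)

X_train = np.random.randn(500, 2)
mu, var = estimate_gaussian(X_train)
epsilon = 0.02                                   # made-up threshold
is_anomaly = density(X_train, mu, var) < epsilon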

real-number evaluation

image-20240824010419901

algorithm evaluation (recall the skewed-dataset techniques from Course 2)

image-20240824011002605
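A small sketch of that skewed-data evaluation applied here: for a candidate epsilon, compute precision, recall and F1 on a labeled cross-validation set (y = 1 marks the known anomalies), then keep the epsilon with the best F1. The names p_cv and y_cv are just placeholders for the cross-validation densities and labels.

import numpy as np

def f1_for_threshold(p_cv, y_cv, epsilon):
    preds = (p_cv < epsilon).astype(int)          # predict "anomaly" when p(x) < epsilon
    tp = np.sum((preds == 1) & (y_cv == 1))
    fp = np.sum((preds == 1) & (y_cv == 0))
    fn = np.sum((preds == 0) & (y_cv == 1))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

# sweep epsilon over the range of p_cv values and keep the threshold with the highest F1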

how to choose between anomaly detection and supervised learning

image-20240824085315385

image-20240824085740327

For anomaly detection, it is hard for the algorithm to judge whether a feature is useful or not, so it is important to choose the features carefully.

image-20240824090849732
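One practical trick from the lectures: if a feature's histogram is heavily skewed, transform it (log(x + c), sqrt(x), x**(1/3), ...) until it looks roughly Gaussian before fitting p(x). A tiny illustration with made-up data:

import numpy as np

x = np.random.exponential(size=1000)   # a skewed, clearly non-Gaussian feature
x_new = np.log(x + 1.0)                # looks much closer to a bell curve in a histogram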

Week 2 recommendation system

what do we do if we have the features of the movie?

image-20240825002207064

Cost function for recommendation

image-20240825003244298

image-20240825003648743

Compare this with the cost function above: how to learn the features of the movie. We can actually learn the features of a given movie without those features being supplied externally.

image-20240825222455425

collaborative filtering algorithm

image-20240825223324971
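A hedged NumPy sketch of the combined collaborative filtering cost J(w, b, x): the squared error is summed only over the (movie, user) pairs that actually have a rating (r(i, j) = 1), with L2 regularization on both the user parameters W and the movie features X.

import numpy as np

def cofi_cost(X, W, b, Y, R, lam):
    # X: (num_movies, num_features) movie features
    # W: (num_users, num_features), b: (1, num_users) user parameters
    # Y: (num_movies, num_users) ratings, R: 1 where a rating exists
    err = (X @ W.T + b - Y) * R          # only rated pairs contribute to the error
    return 0.5 * np.sum(err ** 2) + (lam / 2) * (np.sum(W ** 2) + np.sum(X ** 2))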

How to solve the cost function?

image-20240825223630974

binary labels

cost function for binary labels

image-20240826113339861

mean normalization, i.e. row normalization

image-20240826120712875
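A small sketch of mean (row) normalization: subtract each movie's mean rating, computed only over the users who actually rated it; predictions then add the mean back, so a brand-new user defaults to the per-movie average.

import numpy as np

def normalize_ratings(Y, R):
    # Y: (num_movies, num_users) ratings, R: 1 where a rating exists
    Ymean = (np.sum(Y * R, axis=1) / (np.sum(R, axis=1) + 1e-12)).reshape(-1, 1)
    Ynorm = (Y - Ymean) * R              # unrated entries stay at 0
    return Ynorm, Ymean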

auto diff/grad

image-20240826141816774

implementation in TensorFlow of the collaborative filtering system

image-20240826143519322
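A sketch of what that TensorFlow training loop looks like: since X, W and b do not fit the usual Dense-layer structure, the cost is written out by hand and auto-diff (tf.GradientTape) plus the Adam optimizer handle the gradients. All sizes and ratings below are made up for illustration; the lab builds them from the MovieLens data.

import numpy as np
import tensorflow as tf

num_movies, num_users, num_features, lam = 50, 20, 10, 1.0
Y = tf.constant(np.random.randint(0, 6, (num_movies, num_users)).astype(np.float32))   # toy ratings 0-5
R = tf.constant((np.random.rand(num_movies, num_users) < 0.3).astype(np.float32))      # 1 where rated

X = tf.Variable(tf.random.normal((num_movies, num_features)))
W = tf.Variable(tf.random.normal((num_users, num_features)))
b = tf.Variable(tf.zeros((1, num_users)))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)
for step in range(200):
    with tf.GradientTape() as tape:
        # squared error only over rated entries, plus L2 regularization on X and W
        err = (tf.matmul(X, W, transpose_b=True) + b - Y) * R
        cost = 0.5 * tf.reduce_sum(err ** 2) + (lam / 2) * (tf.reduce_sum(W ** 2) + tf.reduce_sum(X ** 2))
    grads = tape.gradient(cost, [X, W, b])
    optimizer.apply_gradients(zip(grads, [X, W, b]))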

find related items

image-20240826145755181
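Finding related items then boils down to finding the items whose learned feature vectors are closest (smallest squared distance) to a given item's vector; a minimal sketch with made-up vectors:

import numpy as np

def find_related(item_vecs, i, top_k=5):
    d = np.sum((item_vecs - item_vecs[i]) ** 2, axis=1)   # squared distance to every other item
    d[i] = np.inf                                          # exclude the item itself
    return np.argsort(d)[:top_k]

item_vecs = np.random.randn(100, 32)   # e.g. the learned movie feature vectors
print(find_related(item_vecs, 0))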

limitations of collaborative filtering

image-20240826150424947

content-based filtering approach

collaborative filtering vs content-based filtering

image-20240826174239707

content-based filtering : learn to match

image-20240826174414041

deep learning for content-based filtering

image-20240826180020035

image-20240826202515868

learned user and item vector

image-20240826203323000

recommending from a large catalogue

image-20240826204300685

image-20240826204546755

retrieval step

image-20240826205130114

A neural network learns low-dimensional embedding representations of users and items, and predicts a user's preference for an item by computing the dot product of the two embeddings.

# GRADED_CELL
# UNQ_C1

import tensorflow as tf   # in the lab, this import and num_user_features / num_item_features come from earlier cells

num_outputs = 32
tf.random.set_seed(1)
user_NN = tf.keras.models.Sequential([
    ### START CODE HERE ###     
    tf.keras.layers.Dense(256, activation = 'relu'),
    tf.keras.layers.Dense(128, activation = 'relu'),
    tf.keras.layers.Dense(num_outputs, activation = 'linear')
  
  
    ### END CODE HERE ###  
])

item_NN = tf.keras.models.Sequential([
    ### START CODE HERE ###     
    tf.keras.layers.Dense(256, activation = 'relu'),
    tf.keras.layers.Dense(128, activation = 'relu'),
    tf.keras.layers.Dense(num_outputs, activation = 'linear')
  
  
    ### END CODE HERE ###  
])

# create the user input and point to the base network
input_user = tf.keras.layers.Input(shape=(num_user_features,))
vu = user_NN(input_user)
vu = tf.linalg.l2_normalize(vu, axis=1)

# create the item input and point to the base network
input_item = tf.keras.layers.Input(shape=(num_item_features,))
vm = item_NN(input_item)
vm = tf.linalg.l2_normalize(vm, axis=1)

# compute the dot product of the two vectors vu and vm
output = tf.keras.layers.Dot(axes=1)([vu, vm])

# specify the inputs and output of the model
model = tf.keras.Model([input_user, input_item], output)

model.summary()

This code builds a neural network model for the recommender system. Specifically, it is the content-based filtering model: it learns embedding representations for users and items and predicts a rating / match score by taking the dot product of the two embeddings. A detailed walk-through of the code:

1. Set the output dimension and the random seed:

num_outputs = 32
tf.random.set_seed(1)
  • num_outputs: the dimension of the embedding space, i.e. both the user and the item embedding vectors have size 32.
  • tf.random.set_seed(1): fixes the random seed so results are reproducible; the weight initialization is the same on every run.

2. Build the user embedding network (user_NN):

user_NN = tf.keras.models.Sequential([
    tf.keras.layers.Dense(256, activation = 'relu'),
    tf.keras.layers.Dense(128, activation = 'relu'),
    tf.keras.layers.Dense(num_outputs, activation = 'linear')
])
  • tf.keras.models.Sequential(): stacks the layers of the network in order.
  • First layer: a Dense layer with 256 units and ReLU activation.
  • Second layer: a Dense layer with 128 units and ReLU activation.
  • Third layer: a Dense layer with num_outputs units and a linear activation. A linear activation is typical for the last layer when the output should be a real-valued vector, here the user embedding vector.

3. Build the item embedding network (item_NN):

item_NN = tf.keras.models.Sequential([
    tf.keras.layers.Dense(256, activation = 'relu'),
    tf.keras.layers.Dense(128, activation = 'relu'),
    tf.keras.layers.Dense(num_outputs, activation = 'linear')
])
  • The structure is identical to user_NN; only the input data differs. item_NN produces the item (movie) embedding.

4. Create the user input and embedding:

input_user = tf.keras.layers.Input(shape=(num_user_features,))
vu = user_NN(input_user)
vu = tf.linalg.l2_normalize(vu, axis=1)
  • tf.keras.layers.Input(): defines the input layer; shape=(num_user_features,) is the size of the user feature vector.
  • user_NN(input_user): passes the input through the user network to produce the user embedding vector vu.
  • tf.linalg.l2_normalize(vu, axis=1): L2-normalizes the user embedding so that every vector has length 1, which keeps the scale of the later dot product under control.

5. Create the item input and embedding:

input_item = tf.keras.layers.Input(shape=(num_item_features,))
vm = item_NN(input_item)
vm = tf.linalg.l2_normalize(vm, axis=1)
  • Analogous to the user side: input_item is fed through item_NN to produce the item embedding vector vm, which is also L2-normalized.

6. Compute the dot product of the user and item vectors:

output = tf.keras.layers.Dot(axes=1)([vu, vm])
  • tf.keras.layers.Dot(axes=1): computes the dot product of vu and vm, which measures how similar, or how well matched, the two vectors are; axes=1 means the product is taken along the last dimension of each vector.

7. Build the model and summarize it:

model = tf.keras.Model([input_user, input_item], output)
model.summary()
  • tf.keras.Model([input_user, input_item], output): defines the final model, connecting the user and item inputs to the dot-product output. The inputs are [input_user, input_item] and the output is their dot product output.
  • model.summary(): prints the architecture, showing each layer's output shape and parameter count.

Summary:

The model learns low-dimensional embeddings of users and items with two neural networks and predicts a user's preference for an item from the dot product of the embeddings. L2 normalization keeps the vectors at unit length, so the dot product is driven by the vectors' directions rather than their scales. Once built, the model can be trained to learn the relationship between users and items and then used in the recommender system.

Principal component analysis

use fewer numbers to capture “size” feature

image-20240829191153734

from 3d to 2d

image-20240829191916255

preprocess data

image-20240829220819059

principal component

image-20240829222632178

more principal components

image-20240829223529428

PCA is not linear regression

image-20240829224251345

the reconstruction of PCA

image-20240829225151146
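A minimal scikit-learn sketch of the whole PCA round trip on toy data: project onto fewer principal components, then reconstruct an approximation of the original points.

import numpy as np
from sklearn.decomposition import PCA

X = np.random.randn(200, 3)              # toy 3-D data
pca = PCA(n_components=2)
Z = pca.fit_transform(X)                  # from 3-D down to 2 principal components
X_rec = pca.inverse_transform(Z)          # reconstruction: approximate the original 3-D points
print(pca.explained_variance_ratio_)      # fraction of variance captured by each component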

Week 3 reinforcement learning

reward function

image-20240830191833700

the 4-tuple (state, action, reward, next state)

image-20240830194116877

return && discount factor

image-20240831155216858
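A tiny sketch of the return: rewards further in the future are discounted by increasing powers of gamma. The example mimics a Mars-rover-style sequence where only the final state gives a reward.

def discounted_return(rewards, gamma=0.9):
    # return = R1 + gamma*R2 + gamma^2*R3 + ...
    return sum(r * gamma ** t for t, r in enumerate(rewards))

print(discounted_return([0, 0, 0, 100], gamma=0.5))   # 0.5**3 * 100 = 12.5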

Policy

image-20240831155928017

the concepts of reinforcement learning

image-20240831161135394

Markov decision process

image-20240831160926742

state action value function

image-20240901103025053

Picking actions

image-20240901103618300

bellman equation

image-20240901111213981
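A small sketch that iterates the Bellman equation Q(s, a) = R(s) + gamma * max_a' Q(s', a') to a fixed point on the 6-state Mars-rover line world from the lectures (reward 100 at the far left, 40 at the far right, gamma = 0.5):

import numpy as np

rewards = [100, 0, 0, 0, 0, 40]
gamma = 0.5
n = len(rewards)
Q = np.zeros((n, 2))                     # two actions: 0 = step left, 1 = step right

def best_future(s):
    # value obtainable from state s onwards; a terminal state just yields its own reward
    return rewards[s] if s in (0, n - 1) else Q[s].max()

for _ in range(50):                      # repeated Bellman updates converge to the true Q
    for s in range(1, n - 1):
        Q[s, 0] = rewards[s] + gamma * best_future(s - 1)   # Q(s, left)
        Q[s, 1] = rewards[s] + gamma * best_future(s + 1)   # Q(s, right)

print(Q[3])   # the 4th state: roughly [12.5, 10.], matching the values worked out in the lecture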

stochastic environment

expected return

image-20240901115637220

continuous state space

image-20240901172549823

lunar lander problem

image-20240901173754605

deep reinforcement learning

image-20240901174210985

DQN (Deep Q-Network) learning algorithm

image-20240904004340981

some refinements to the neural network architecture

image-20240904103342834

ε-greedy policy

how to choose actions while still learning

exploitation && exploration

image-20240904110549745
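A minimal sketch of ε-greedy action selection: with probability ε take a random action (exploration), otherwise take the action with the largest estimated Q value (exploitation). ε usually starts high (e.g. 1.0) and is decayed toward a small value (e.g. 0.01) during training.

import numpy as np

def pick_action(q_values, epsilon, rng=np.random.default_rng()):
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))   # explore
    return int(np.argmax(q_values))               # exploit

action = pick_action(np.array([0.1, 0.5, 0.2, 0.2]), epsilon=0.05)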

By the way, it is worth noting that reinforcement learning is more finicky than supervised learning: a poor choice of hyperparameters can make training take much longer, perhaps 10 times as long.

Mini-batch and soft updates

image-20240909175116923

Each iteration trains on only a small subset of a very large training set, which greatly speeds up training. The obvious downside is that the overall cost function does not always keep decreasing (as shown in the upper-right figure).

mini-batch in reinforcement learning

image-20240909184706893

soft update: make a gradual update to Q

image-20240909184630511
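A hedged sketch of the soft update, assuming q_net and target_net are two Keras models with the same architecture (as in the lunar-lander lab): instead of copying the Q-network's weights wholesale, the target network is nudged only a small fraction tau toward them on every update.

TAU = 0.001                                       # soft-update rate

def soft_update(q_net, target_net, tau=TAU):
    # target_weights <- tau * q_weights + (1 - tau) * target_weights
    for q_w, t_w in zip(q_net.weights, target_net.weights):
        t_w.assign(tau * q_w + (1.0 - tau) * t_w)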

limitations of reinforcement learning

image-20240912233734851

target network

image-20240913003028914

experience replay

image-20240913095623109
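A minimal sketch of a replay buffer: keep the most recent transitions and train on random mini-batches drawn from it, which breaks up the correlation between consecutive steps.

import random
from collections import deque, namedtuple

Experience = namedtuple("Experience", ["state", "action", "reward", "next_state", "done"])
buffer = deque(maxlen=100_000)                    # replay memory holding the most recent transitions

def store(state, action, reward, next_state, done):
    buffer.append(Experience(state, action, reward, next_state, done))

def sample(batch_size=64):
    return random.sample(buffer, batch_size)      # random mini-batch for one training step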

deep Q-learning algorithm with experience replay

Finally finished the whole course, hahahaha! (I have gone slightly insane.)
