- Blog (28)
Original: Cost Function and Backpropagation
Cost Function. Neural Network (Classification): given a training set of $m$ examples $\{(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\dots,(x^{(m)},y^{(m)})\}$; $L$ = total number of layers in the network…
2022-02-23 15:24:39 137
Original: Neural Networks
Model Representation I. How do we represent our hypothesis, or our model, when using neural networks? Neurons are cells in the brain; a neuron has a cell body and a number of input wires (dendrites, which receive inputs from other…
2022-02-06 23:33:49 708
Original: Logistic Regression Model
Cost Function. With the logistic function, the squared-error cost is not convex: the output is wavy, with many local optima. The cost function for logistic regression looks like: when $y=1$, $J(\theta)$ vs. $h_\theta(x)$…
2022-02-05 15:40:16 1289
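The convexity point above is why logistic regression uses the log (cross-entropy) cost rather than squared error; a minimal sketch in NumPy (function names and the tiny data set are my own illustration, not from the post):

```python
import numpy as np

def sigmoid(z):
    # logistic function: maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y):
    # cross-entropy cost J(theta); convex in theta, unlike squared error with a sigmoid
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# tiny illustrative example: one feature plus an intercept column
X = np.array([[1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0])
print(logistic_cost(np.zeros(2), X, y))  # log(2) ≈ 0.6931, since h = 0.5 everywhere
```

With $\theta = 0$ every prediction is 0.5, so the cost is exactly $\log 2$, matching the "cost when the model knows nothing" intuition.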
Original: Solving the Problem of Overfitting
The problem of overfitting. Underfitting: high bias. Overfitting: high variance. If we have too many features, the learned hypothesis may fit the training set very well but fail to generalize to new examples. Debugging…
2022-02-05 15:37:41 1013
Original: Classification and Representation
Classification. $y \in \{0,1\}$: 0 is the "Negative Class", 1 the "Positive Class". The training set is labeled arbitrarily as 0 or 1. Classification is not actually a linear function. In a binary classification problem, y can take on only…
2022-01-27 21:23:35 760
Original: Computing Parameters Analytically
Normal Equation: find the optimal theta without iteration. Minimize J by explicitly taking its derivatives with respect to the $\theta_j$'s and setting them to zero. Formula: $\theta=(X^TX)^{-1}X^Ty$. Oc…
2022-01-25 22:49:38 199
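The closed-form normal equation above can be checked numerically; a sketch in NumPy (the synthetic data is mine, for illustration only):

```python
import numpy as np

# synthetic data drawn from y = 2 + 3*x, with an intercept column in X
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])   # design matrix with a bias column
y = 2.0 + 3.0 * x

# normal equation: theta = (X^T X)^{-1} X^T y
# solving the linear system is numerically safer than forming the inverse
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # ≈ [2. 3.]
```

Because the data is exactly linear, the recovered parameters match the generating intercept and slope.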
Original: Multivariate Linear Regression
Multiple Features. Multivariate linear regression: linear regression with multiple variables. Notation: $n$ = number of features; $x^{(i)}$ = input (features) of the $i^{th}$ training example; $x_j^{(i)}$ = value of feature…
2022-01-25 21:55:20 128
Original: Linear Algebra Review
Matrices and Vectors. Matrix: a rectangular array of numbers. Dimension of a matrix: number of rows × number of columns. A matrix of a specific dimension: $\mathbb{R}^{2\times3}$. A specific element of the matrix: $A_{ij}$…
2022-01-23 23:41:44 302
Original: Parameter Learning
Gradient Descent. The outline of gradient descent: start with some $\theta_0,\theta_1$ (commonly $\theta_0=0,\theta_1=0$); keep changing $\theta_0,\theta_1$ to reduce $J(\theta_0,\theta_1)$…
2022-01-22 23:51:25 411
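The outline above (start at zero, keep updating to reduce J) can be sketched for one-variable linear regression; the learning rate and toy data are my own choices for illustration:

```python
import numpy as np

# toy data generated from y = 1 + 2*x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

theta0, theta1 = 0.0, 0.0   # start with theta_0 = 0, theta_1 = 0
alpha = 0.1                 # learning rate (illustrative; too large diverges)

for _ in range(2000):
    h = theta0 + theta1 * x                  # current hypothesis h(x)
    g0 = np.mean(h - y)                      # dJ/dtheta_0 for squared-error J
    g1 = np.mean((h - y) * x)                # dJ/dtheta_1
    # simultaneous update of both parameters
    theta0, theta1 = theta0 - alpha * g0, theta1 - alpha * g1

print(theta0, theta1)  # ≈ 1.0, 2.0
```

The key detail is the simultaneous update: both gradients are computed from the old parameters before either parameter changes.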
Original: Model and Cost Function
2-1 Model Representation. Supervised learning: a regression problem predicts real-valued output; a classification problem predicts discrete-valued output. Training set: $(x,y)$ is one training example; $(x^{(i)},y^{(i)})$…
2022-01-21 14:55:33 652
Original: Machine Learning
1-2 What is machine learning. A program learns if its performance on task T, as measured by performance measure P, improves with experience E. Types of machine learning algorithms: supervised learning → teaching; unsupervised learning → learning by itself; r…
2022-01-20 23:35:27 288
Original: Homework 4
Q1 (a) (b) False. Q2 (a) (b) Yes: at the line $-x_1+x_2=0$, moving along $x_1$, the optimal solution is found at the stop at $x_1=10$, $Z_m=-10$. (c) Yes, the optimal solution equals the one in (b). (d) No, it…
2022-01-19 01:34:33 322
Original: Probability Theory 2.1 The Definition of a Random Variable
If X is a measurable mapping from the measurable space $(\Omega,F)$ to $(\mathbb{R},\mathcal{B}(\mathbb{R}))$, X is called a random variable. Let $(\Omega,F,P)$ be the probability space of some random phenomenon and X a real-valued function on $\Omega$; then X is a random variable if and only if for every real number x, $\{\omega\in\Omega: X(\omega)\leq x\}\in F$. A random variable taking only two values is a Bernoulli random variable, e.g. $I_A(\omega)=1$ if $\omega\in A$; $0$ if $\omega\in$…
2022-01-16 18:04:17 1182
Original: Probability Theory 2.2 Probability Distributions
1. Discrete distributions. Suppose the random variable X takes finitely or countably many possible values, written $x_1,x_2,\dots$; then X is called a discrete random variable (X has a discrete distribution), and $p_k=P\lbrace X=x_k\rbrace\;(k=1,2,\dots)$ is called the distribution list, or probability function, of X. Single-point distribution: if X satisfies $P(X=c)=1$, i.e. its distribution function F is a degenerate distribution function, then X follows the single-point distribution, written $X\sim S_c$
2022-01-16 17:19:07 1680
Original: Probability Theory 1.2 The Definition of Probability
Common ways of assigning probabilities: the classical method, the frequency method, the geometric method, and the subjective method. Classical method: if $\Omega$ contains only finitely many sample points, each equally likely to occur, then $P(A)=\frac{|A|}{|\Omega|}$. Counting problems follow the addition and multiplication principles. If $m<0$ or $m>n$, then $C_n^m=0$. Frequency method: over a large number of repeated random trials, $f_n(A)=\frac{n(A)}{n}$
2022-01-15 14:30:17 1043 1
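The classical formula $P(A)=\frac{|A|}{|\Omega|}$ can be illustrated by direct enumeration; the two-dice example below is mine, not from the post:

```python
from itertools import product

# sample space: all ordered outcomes of rolling two fair dice
omega = list(product(range(1, 7), repeat=2))

# event A: the two faces sum to 7
A = [w for w in omega if sum(w) == 7]

# classical probability: P(A) = |A| / |Omega|
p = len(A) / len(omega)
print(p)  # 6/36 ≈ 0.1667
```

The multiplication principle gives $|\Omega| = 6 \times 6 = 36$, and listing the favorable outcomes gives $|A| = 6$.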
Original: Null Space, Dot Products and Duality
Null space: the set of vectors that land on the origin after the transformation is called the null space, or kernel, of the matrix. $A\vec{x}=\begin{bmatrix}0\\0\end{bmatrix}$: the null space is the set of all solutions of this vector equation. Nonsquare matrices: the coordinates of the transformed basis vectors form the columns of the matrix, and the dimension of the column space equals the dimension of the input space. A 3×2 matrix maps two-dimensional space into three-dimensional space: two columns mean the input space has two basis vectors; three rows mean each transformed basis vector is described by three independent coordinates. A 2×3 matrix: three dimensions → two. Dot products and duality: $\begin{bmatrix}2&7&1\end{bmatrix}$…
2021-12-05 22:01:15 558
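The null space described above can be computed numerically from the SVD: right-singular vectors whose singular values are (near) zero span the kernel. A sketch, using a rank-1 matrix I made up for illustration:

```python
import numpy as np

# a rank-1 2x2 matrix: its null space is one-dimensional
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# right-singular vectors with (near-)zero singular values span the null space
_, s, Vt = np.linalg.svd(A)
null_mask = s < 1e-10
null_basis = Vt[null_mask].T          # columns form a basis of the kernel

print(A @ null_basis)                 # ≈ the zero vector: A x = 0
```

Here the second row of A is twice the first, so every multiple of $(2,-1)$ (up to sign and normalization) is sent to the origin.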
Original: Data Structures and Algorithms 11_20
Red and Black. Tips: DFS; use recursion, mark the tiles already visited, and check the boundaries. Description: There is a rectangular room, covered with square tiles. Each tile is colored either red or black. A man is standing on a black tile. From a tile, he can move to one of four adjacent tiles. But he c…
2021-11-27 14:48:21 89
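The tips above (recursive DFS, mark visited tiles, check boundaries) can be sketched as follows; the small grid is a made-up example where '.' is a black tile, '#' a red tile, and '@' the starting position:

```python
import sys
sys.setrecursionlimit(10000)   # the recursive DFS can go deep on large grids

def count_reachable(grid):
    # count the black tiles reachable from '@' via adjacent black tiles
    rows, cols = len(grid), len(grid[0])
    seen = set()

    def dfs(r, c):
        # boundary check, red-tile check, and visited check
        if not (0 <= r < rows and 0 <= c < cols):
            return
        if grid[r][c] == '#' or (r, c) in seen:
            return
        seen.add((r, c))                       # mark this tile as visited
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            dfs(r + dr, c + dc)                # recurse into the four neighbours

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '@':
                dfs(r, c)
    return len(seen)

grid = [".#.",
        ".@.",
        "###"]
print(count_reachable(grid))  # 5: the start tile plus 4 reachable black tiles
```

Marking tiles in `seen` before recursing is what guarantees termination: each tile is expanded at most once.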
Original: Data Structures and Algorithms 11_13
A/B [modular inverse via fast exponentiation]. Description: compute (A/B) % 9973, but since A is very large, only n (n = A % 9973) is given (the given A is guaranteed to be divisible by B, and gcd(B, 9973) = 1). Input: the first line is T, the number of test cases; each case gives two numbers n (0 <= n < 9973) and B (1 <= B <= 10^9). Output: for each case, output (A/B) % 9973. Sample Input: 2; 1000 53; 87 123456789. Sam…
2021-11-21 13:34:46 79
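The fast-power inverse from the title works because 9973 is prime: by Fermat's little theorem, $B^{-1} \equiv B^{9971} \pmod{9973}$ whenever $\gcd(B, 9973)=1$. A sketch using Python's built-in three-argument `pow` for fast modular exponentiation:

```python
MOD = 9973  # a prime, so Fermat's little theorem applies

def solve(n, B):
    # (A/B) % MOD == n * B^(MOD-2) % MOD, since B^(MOD-2) is B's inverse mod MOD
    inv_B = pow(B, MOD - 2, MOD)   # fast modular exponentiation, O(log MOD)
    return n * inv_B % MOD

# the two sample cases from the problem statement
print(solve(1000, 53))        # 7922
print(solve(87, 123456789))   # 6060
```

Dividing by B directly is impossible because only A % 9973 is known; multiplying by the modular inverse sidesteps that.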
Original: Data Structures and Algorithms 10.30
Lake Counting. Description: Due to recent rains, water has pooled in various places in Farmer John's field, which is represented by a rectangle of N x M (1 <= N <= 100; 1 <= M <= 100) squares. Each square contains either water ('W')…
2021-11-13 22:50:56 450
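Lake Counting is the classic flood-fill exercise: repeatedly find a 'W' square and sink its whole 8-connected lake (this problem counts diagonal adjacency). A sketch on a small field I made up; an iterative DFS avoids recursion-depth issues:

```python
def count_lakes(field):
    # number of connected groups of 'W' squares, 8-directionally connected
    field = [list(row) for row in field]   # mutable copy, so we can sink water
    rows, cols = len(field), len(field[0])

    def sink(r, c):
        # iterative DFS: turn this entire lake into dry land
        stack = [(r, c)]
        while stack:
            i, j = stack.pop()
            if not (0 <= i < rows and 0 <= j < cols) or field[i][j] != 'W':
                continue
            field[i][j] = '.'              # mark as visited by drying the square
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    stack.append((i + di, j + dj))

    lakes = 0
    for r in range(rows):
        for c in range(cols):
            if field[r][c] == 'W':
                lakes += 1                 # a new, not-yet-seen lake
                sink(r, c)
    return lakes

field = ["W.W",
         "...",
         "WW."]
print(count_lakes(field))  # 3: two single squares and one pair
```

Each square is pushed a constant number of times, so the whole scan runs in O(N·M), comfortably within the 100×100 limits.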
Original: Probability Theory 1.1 Random Events
$A\cup B$ ⇒ at least one of A and B occurs (when $AB=\emptyset$, $A\cup B$ is written $A+B$). $A\cap B$ ⇒ A and B occur simultaneously. $A\setminus B$ ⇒ A occurs and B does not (when $A\supset B$, $A\setminus B$ is sometimes written $A-B$). $\overline{A}=\Omega$…
2021-11-06 16:04:36 256
Original: Probability Theory 1.5 Conditional Probability
Conditional probability: on $(\Omega,F,P)$, if $B\in F$ and $P(B)>0$, then for any $A\in F$ define $P(A|B)=\frac{P(AB)}{P(B)}$, the conditional probability of A given that B occurs. 1) If $P(B)=0$, $P(A|B)$ is undefined. 2) If $P(B)>0$ and A and B are independent, then $P(A|B)=P(A)$, the unconditional probability of A. Taking $B=\Omega$, an unconditional probability can be viewed as a conditional one: $P(A)=P(A|\Omega)$. Theorem: conditional probability is a probability. $(\Omega,F,P$…
2021-11-03 00:01:35 482
Original: Concepts of Linear Dependence
Vector concepts: linear combinations, span, and basis. A basis is, strictly, a linearly independent set that spans the space. Basis vectors $(\vec{i},\vec{j})$: describing a vector with numbers depends on the chosen basis. Span: the set of all vectors that can be written as linear combinations of the given vectors, $a\vec{v}+b\vec{w}$ (a, b ranging over all possible values), is called the span of those vectors; it asks which vectors are reachable using only vector addition and scalar multiplication. Linear dependence: $\vec{v},\vec{w},\vec{u}$…
2021-10-29 23:09:48 466
Original: Matrix Operation Rules
Matrix operation rules; several ways to compute a product. Multiplication: $\begin{bmatrix}--\\--\\--\end{bmatrix}\begin{bmatrix}|&|&|\\|&|&|\end{bmatrix}=\begin{bmatrix}|&|&|\\|&|&|\\|&|&|\end{bmatrix}$…
2021-10-29 23:09:20 289
Original: The Geometric Interpretation of Systems of Equations
$Ax=b$. Row picture: plot one equation (row) at a time; in the x-y plane, the intersection of the lines is the solution x. Linear means straight lines. Column picture: in the 3×3 case the picture becomes planes. Solve by elimination. Not every b makes $Ax=b$ solvable: if the columns A, B, C satisfy A+B=C, the attainable b's lie in the plane formed by A, B and C. Matrix multiplication (read from right to left). Scalar multiplication of a vector is, geometrically, just scaling…
2021-10-29 23:08:45 58
Original: Data Structures and Algorithms 10.23
Subsequence. Description: A sequence of N positive integers (10 < N < 100 000), each of them less than or equal to 10000, and a positive integer S (S < 100 000 000) are given. Write a program to find the minimal length of the subsequence…
2021-10-29 23:05:23 140
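This looks like the POJ 3061 "Subsequence" problem, usually solved with a two-pointer sliding window in O(N): extend the right end, then shrink from the left while the window sum still meets S. A sketch (the sample data is the commonly cited one for this problem):

```python
def min_subseq_len(a, S):
    # length of the shortest contiguous subsequence with sum >= S; 0 if none
    n = len(a)
    best = n + 1
    total = 0
    left = 0
    for right in range(n):
        total += a[right]                 # extend the window to the right
        while total >= S:                 # shrink from the left while still valid
            best = min(best, right - left + 1)
            total -= a[left]
            left += 1
    return 0 if best == n + 1 else best

print(min_subseq_len([5, 1, 3, 5, 10, 7, 4, 9, 2, 8], 15))  # 2, e.g. 10 + 7
```

Each element enters and leaves the window at most once, which is what makes the scan linear despite the nested loop.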