The figure below shows a Recurrent Neural Network (RNN) with one input unit $x$, one logistic hidden unit $h$, and one linear output unit $y$. The RNN is unrolled in time for $T = 0, 1,$ and $2$.
The network parameters are: $W_{xh} = -0.1$, $W_{hh} = 0.5$, $W_{hy} = 0.25$, $h_{\text{bias}} = 0.4$, and $y_{\text{bias}} = 0.0$.
If the input $x$ takes the values $18, 9, -8$ at time steps $0, 1, 2$ respectively, the hidden unit values will be $0.2, 0.4, 0.8$ and the output unit values will be $0.05, 0.1, 0.2$ (you can check these values as an exercise). A variable $z$ is defined as the total input to the hidden unit before the logistic nonlinearity.
If we are using the squared loss, with targets $t_0, t_1, t_2$, then the sequence of calculations required to compute the total error $E$ is as follows:
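$$
\begin{aligned}
z_0 &= W_{xh} x_0 + h_{\text{bias}} & z_1 &= W_{xh} x_1 + W_{hh} h_0 + h_{\text{bias}} & z_2 &= W_{xh} x_2 + W_{hh} h_1 + h_{\text{bias}} \\
h_0 &= \sigma(z_0) & h_1 &= \sigma(z_1) & h_2 &= \sigma(z_2) \\
y_0 &= W_{hy} h_0 + y_{\text{bias}} & y_1 &= W_{hy} h_1 + y_{\text{bias}} & y_2 &= W_{hy} h_2 + y_{\text{bias}} \\
E_0 &= \tfrac{1}{2}(t_0 - y_0)^2 & E_1 &= \tfrac{1}{2}(t_1 - y_1)^2 & E_2 &= \tfrac{1}{2}(t_2 - y_2)^2 \\
E &= E_0 + E_1 + E_2
\end{aligned}
$$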
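The forward calculations above can be checked numerically. The following is a minimal sketch in plain Python, not part of the original problem; the function and variable names (`sigmoid`, `forward`, `total_error`, `W_xh`, and so on) are chosen here for illustration, and the missing recurrent term at time step 0 is handled by skipping it, exactly as in the equation for $z_0$.

```python
import math

def sigmoid(z):
    """Logistic nonlinearity used by the hidden unit."""
    return 1.0 / (1.0 + math.exp(-z))

# Parameters as stated in the problem.
W_xh, W_hh, W_hy = -0.1, 0.5, 0.25
h_bias, y_bias = 0.4, 0.0

def forward(xs):
    """Run the unrolled RNN and return the lists (zs, hs, ys)."""
    zs, hs, ys = [], [], []
    h_prev = None  # no hidden state before time step 0
    for x in xs:
        z = W_xh * x + h_bias
        if h_prev is not None:
            z += W_hh * h_prev  # recurrent term, present only for t >= 1
        h = sigmoid(z)
        y = W_hy * h + y_bias
        zs.append(z)
        hs.append(h)
        ys.append(y)
        h_prev = h
    return zs, hs, ys

def total_error(ys, ts):
    """Squared loss summed over time: E = sum of 0.5 * (t - y)^2."""
    return sum(0.5 * (t - y) ** 2 for y, t in zip(ys, ts))

zs, hs, ys = forward([18.0, 9.0, -8.0])
print([round(h, 1) for h in hs])  # [0.2, 0.4, 0.8]
print([round(y, 2) for y in ys])  # [0.05, 0.1, 0.2]
```

The unrounded hidden values differ slightly from the rounded ones quoted in the problem (for instance, $h_0 = \sigma(-1.4) \approx 0.1978$), which is worth keeping in mind when comparing a hand calculation with a computed one.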
If the target output values are $t_0 = 0.1$, $t_1 = -0.1$, $t_2 = -0.2$ and the squared error loss is used, what is the value of the error derivative just before the hidden unit nonlinearity at $T = 1$ (i.e. $\frac{\partial E}{\partial z_1}$)? Write your answer to at least four decimal places.
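One way to approach this is backpropagation through time: the hidden unit at time step 1 affects $E$ both through $y_1$ and through the next hidden state, so $\frac{\partial E}{\partial h_1} = W_{hy}\,(y_1 - t_1) + W_{hh}\,\frac{\partial E}{\partial z_2}$ and $\frac{\partial E}{\partial z_1} = \frac{\partial E}{\partial h_1}\, h_1 (1 - h_1)$. The sketch below is an illustrative plain-Python version of that recursion, assuming the parameters, inputs, and targets stated in the problem; all names in it are hypothetical and it is not taken from the original material.

```python
import math

def sigmoid(z):
    """Logistic nonlinearity used by the hidden unit."""
    return 1.0 / (1.0 + math.exp(-z))

# Parameters, inputs, and targets as stated in the problem.
W_xh, W_hh, W_hy = -0.1, 0.5, 0.25
h_bias, y_bias = 0.4, 0.0
xs = [18.0, 9.0, -8.0]
ts = [0.1, -0.1, -0.2]

# Forward pass. Starting from h_prev = 0.0 is equivalent to having
# no recurrent term at time step 0.
hs, ys = [], []
h_prev = 0.0
for x in xs:
    h = sigmoid(W_xh * x + W_hh * h_prev + h_bias)
    hs.append(h)
    ys.append(W_hy * h + y_bias)
    h_prev = h

# Backward pass, from the last time step to the first.
dE_dz = [0.0] * len(xs)
dE_dz_next = 0.0  # there is no time step after the last one
for t in reversed(range(len(xs))):
    dE_dy = ys[t] - ts[t]                      # derivative of 0.5 * (t - y)^2 w.r.t. y
    dE_dh = W_hy * dE_dy + W_hh * dE_dz_next   # output path plus recurrent path
    dE_dz[t] = dE_dh * hs[t] * (1.0 - hs[t])   # logistic derivative is h * (1 - h)
    dE_dz_next = dE_dz[t]

print(f"dE/dz1 = {dE_dz[1]:.4f}")
```

Working by hand with the rounded values from the problem ($h_1 = 0.4$, $h_2 = 0.8$, $y_1 = 0.1$, $y_2 = 0.2$) gives $\frac{\partial E}{\partial z_2} = 0.25 \cdot 0.4 \cdot 0.16 = 0.016$ and then $\frac{\partial E}{\partial z_1} = (0.05 + 0.008) \cdot 0.24 = 0.01392$; the script uses unrounded activations, so its result may differ in the later decimal places.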