Hi, my name is Ramiro. I was reading the code and I have a question.
When you update the parameters related to the input layer and the hidden layer (W1, b1), you calculate the derivative of the activation function. I think it is done in this line (in ann.py):
dZ = pY_T.dot(self.W2.T) * (1 - Z*Z) # tanh
In the particular case of tanh, I think that (1 - Z*Z) is the derivative. If this is correct, why do we use Z? Recall what is stored in Z:
Z = np.tanh(X.dot(self.W1) + self.b1)
I think that we should evaluate the derivative at X.dot(self.W1) + self.b1 instead, which is the same as using np.arctanh(Z). So the result should be (1 - np.arctanh(Z)*np.arctanh(Z)).
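For reference, here is a small numerical sketch (not from the repo, just an illustration) comparing the expression written in terms of the stored activation Z with a finite-difference estimate of the tanh derivative at the pre-activation:

```python
import numpy as np

# If Z = tanh(a), where a plays the role of X.dot(W1) + b1,
# then dZ/da = 1 - tanh(a)**2 = 1 - Z*Z,
# i.e. the derivative can be written directly in terms of Z.
a = np.linspace(-2.0, 2.0, 5)   # stand-in for the pre-activation values
Z = np.tanh(a)

analytic = 1 - Z * Z            # derivative expressed via the stored activation

eps = 1e-6                      # central-difference check of d(tanh)/da
numeric = (np.tanh(a + eps) - np.tanh(a - eps)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-8))  # → True
```

This is only a check of the identity d(tanh(a))/da = 1 - tanh(a)**2; the variable names a, analytic, and numeric are placeholders, not code from ann.py.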
I'm probably wrong; I just want to know why.
Thanks!
R.