tanh
tanh is the standard abbreviation for Hyperbolic Tangent.
Hyperbolic Tangent
A mathematical function and a commonly used activation function in artificial neural networks, particularly in shallow networks or hidden layers of deep learning models. The hyperbolic tangent function is similar to the sigmoid function, as both are S-shaped and continuously differentiable. However, tanh has a range of output values between -1 and 1, whereas the sigmoid function’s output range is between 0 and 1.
The hyperbolic tangent function is defined as:
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

Where e is the base of the natural logarithm (approximately 2.71828), and x is the input value.
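The definition above can be sketched directly in Python; this toy version (function name chosen here for illustration) matches the standard library's math.tanh:

```python
import math

def tanh_from_definition(x: float) -> float:
    """Hyperbolic tangent computed straight from its definition:
    (e^x - e^-x) / (e^x + e^-x)."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

# Agrees with the standard library implementation across sample inputs.
for x in (-2.0, -0.5, 0.0, 0.5, 3.0):
    assert abs(tanh_from_definition(x) - math.tanh(x)) < 1e-12

print(tanh_from_definition(0.0))  # 0.0 -- the function passes through the origin
```

Note that for large |x| the direct formula can overflow where math.tanh would not, which is one reason libraries use a more careful implementation.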
The main properties of the tanh function are:
- Non-linearity: The tanh function introduces non-linearity into the neural network, enabling it to learn complex patterns and relationships in the data.
- Continuously differentiable: The tanh function has a continuous derivative, which is important for gradient-based optimization methods like backpropagation.
- Centered at zero: Unlike the sigmoid function, the tanh function’s output is centered at zero, which can help with faster convergence during the training process by reducing the chances of weights getting stuck in undesirable local minima.
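Two of these properties are easy to verify numerically: tanh has the well-known derivative identity d/dx tanh(x) = 1 - tanh(x)^2, and its output is symmetric around zero. A small sketch (helper name is illustrative):

```python
import math

def tanh_grad(x: float) -> float:
    # Standard identity: d/dx tanh(x) = 1 - tanh(x)^2
    return 1.0 - math.tanh(x) ** 2

# Zero-centered: outputs for +/-x are mirror images around 0.
print(math.tanh(-1.0), math.tanh(1.0))

# The gradient peaks at exactly 1.0 at x = 0 and decays toward 0
# as |x| grows, which is where saturation begins.
print(tanh_grad(0.0))   # 1.0
print(tanh_grad(3.0))
```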
However, the tanh function can also suffer from the vanishing gradient problem, especially for deep learning models with many layers. In this case, the gradients of the loss function become extremely small during backpropagation, leading to slow or ineffective learning. This issue has contributed to the increasing popularity of alternative activation functions like Rectified Linear Unit (ReLU) and its variants for deep learning models.
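The vanishing gradient effect can be illustrated with a toy example: propagating a gradient backward through a chain of tanh layers (weights and biases omitted purely for clarity, so this is not a full backpropagation implementation). Because each layer's local derivative 1 - tanh(x)^2 is at most 1, the product of derivatives shrinks as depth grows:

```python
import math

def tanh_grad(x: float) -> float:
    # Derivative of tanh: 1 - tanh(x)^2, always in (0, 1]
    return 1.0 - math.tanh(x) ** 2

# Simulate the chain rule through 10 stacked tanh "layers":
# the upstream gradient is multiplied by each layer's local derivative.
x, grad = 2.0, 1.0
for _ in range(10):
    grad *= tanh_grad(x)  # chain-rule factor for this layer
    x = math.tanh(x)      # activation passed to the next layer

print(grad)  # far smaller than the initial gradient of 1.0
```

With many more layers, or with inputs deep in the saturated region, this product approaches zero, which is the behavior described above.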