In machine learning and artificial neural networks, an activation function is a mathematical function that determines the output of a neural network node (or neuron) given its input. These functions introduce non-linearity into the network, allowing it to model complex relationships between inputs and outputs.

The activation function takes the weighted sum of a node's inputs (from the previous layer, or the raw input features for the first layer) and produces an output that serves as input to the next layer or as the final output of the neural network.
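
To make this concrete, here is a minimal sketch of a single neuron in Python. The input, weight, and bias values are illustrative assumptions, and ReLU is used as the example activation:

```python
import numpy as np

def neuron_output(inputs, weights, bias):
    """Compute activation(weighted sum of inputs + bias) for one neuron."""
    z = np.dot(weights, inputs) + bias  # weighted sum of inputs
    return max(0.0, z)                  # ReLU activation

# Illustrative values, assumed for this example
x = np.array([0.5, -1.2, 3.0])   # inputs from the previous layer
w = np.array([0.4, 0.7, -0.2])   # learned weights
b = 0.1                          # learned bias
print(neuron_output(x, w, b))    # 0.0 here, since the weighted sum is negative
```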

There are several types of activation functions used in neural networks. Some common ones include the following (each is also sketched in code after this list):

  • Sigmoid: The sigmoid function maps any input value to a value between 0 and 1. It's expressed as f(x) = 1 / (1 + e^(-x)).
  • ReLU (Rectified Linear Unit): ReLU returns 0 for any negative input and the input value for any positive input. It's expressed as f(x) = max(0, x).
  • Tanh (Hyperbolic Tangent): The tanh function is similar to the sigmoid function but maps input values to a range between -1 and 1. It's expressed as f(x) = (e^(x) - e^(-x)) / (e^(x) + e^(-x)).
  • Softmax: Softmax is often used in the output layer of neural networks for multi-class classification problems. It converts a vector of arbitrary real values into a probability distribution where the sum of probabilities equals 1.
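
The formulas above translate almost directly into code. Below is a short, self-contained Python sketch of all four functions using NumPy; subtracting the maximum inside softmax is a standard numerical-stability trick, not part of the mathematical definition:

```python
import numpy as np

def sigmoid(x):
    """f(x) = 1 / (1 + e^(-x)); maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """f(x) = max(0, x); zero for negative inputs, identity otherwise."""
    return np.maximum(0.0, x)

def tanh(x):
    """f(x) = (e^x - e^(-x)) / (e^x + e^(-x)); maps inputs into (-1, 1)."""
    return np.tanh(x)

def softmax(x):
    """Converts a vector of real values into a probability distribution."""
    exps = np.exp(x - np.max(x))  # shift by the max for numerical stability
    return exps / np.sum(exps)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))   # approximately [0.1192, 0.5, 0.8808]
print(relu(z))      # [0., 0., 2.]
print(tanh(z))      # approximately [-0.964, 0., 0.964]
print(softmax(z))   # sums to 1; the largest input gets the largest probability
```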

Here is an example of how some of these (and other) activation functions look graphically:

[Figure: graphs of common activation functions]