Learnitweb

Understanding Neurons, Weights, and Biases in Recurrent Neural Networks (RNNs)

Every neural network, whether it is a simple Feedforward Neural Network (FNN) or an advanced Recurrent Neural Network (RNN), is made up of small computational units known as neurons.

Each neuron performs a simple mathematical operation:
it takes some inputs, multiplies them by weights, adds a bias, and then applies an activation function to produce an output.

When thousands (or even millions) of such neurons are connected in layers, they can collectively learn to recognize patterns in data — from text and speech to stock prices or even human emotions.

To understand how RNNs process sequential data (like sentences or time series), you must clearly understand three core elements:

  1. Neuron – the basic processing unit
  2. Weights – determine the strength of input connections
  3. Bias – provides flexibility in learning

2. What Is a Neuron?

2.1 Definition

A neuron is the fundamental building block of a neural network. It is modeled after the biological neurons in the human brain, which receive signals, process them, and send output to other neurons.

In an artificial neural network, a neuron performs a simple mathematical transformation.

Given input values x_1, x_2, \dots, x_n, each neuron calculates:

 z = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b

 a = f(z)

Where:

  • w_1, w_2, \dots, w_n: weights of the neuron
  • b: bias
  • z: weighted sum of inputs plus bias
  • f: activation function (such as tanh, ReLU, sigmoid)
  • a: output (activation) of the neuron

The activation function f introduces non-linearity, enabling the network to learn complex patterns rather than just linear relationships.

2.2 Intuitive Analogy

Think of a neuron as a decision maker:

  • Each input has an associated importance (weight).
  • The neuron sums up all inputs after scaling them by importance.
  • It then adds a bias, which lets it make decisions even when all inputs are zero.
  • Finally, it passes this total through an activation function, deciding whether to “fire” or stay “inactive.”

2.3 Example

Suppose you have a neuron with two inputs:

 x_1 = 3, \quad x_2 = 2, \quad w_1 = 0.5, \quad w_2 = 0.4, \quad b = 0.1

Then:

z=(0.5)(3) + (0.4)(2) + 0.1 = 1.5 + 0.8 + 0.1 = 2.4

If the activation function is sigmoid,

 a = \frac{1}{1 + e^{-2.4}} \approx 0.917

So the neuron outputs approximately 0.917, a value between 0 and 1 representing the neuron’s “activation level.”
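The computation above can be checked with a few lines of Python (standard library only; the variable names mirror the formulas):

```python
import math

def neuron(x, w, b, f):
    """Weighted sum of inputs plus bias, passed through activation f."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Values from the example: x1 = 3, x2 = 2, w1 = 0.5, w2 = 0.4, b = 0.1
a = neuron([3, 2], [0.5, 0.4], 0.1, sigmoid)
print(round(a, 3))  # sigmoid(2.4) ≈ 0.917
```

Swapping `sigmoid` for `math.tanh` or a ReLU shows how the same weighted sum maps to different output ranges.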

3. Neurons in an RNN

An RNN neuron works a bit differently from a neuron in a standard feedforward network.

While a feedforward neuron only considers current inputs,
an RNN neuron also takes into account the previous hidden state — meaning it has a memory of what happened before.

At each time step t, the RNN neuron computes:

 h_t = f(W_{xh}x_t + W_{hh}h_{t-1} + b_h)

 y_t = W_{hy}h_t + b_y

Where:

  • x_t: input vector at time t
  • h_{t-1}: previous hidden state (memory)
  • h_t: current hidden state (output of the neuron)
  • y_t: output at time t
  • W_{xh}: input-to-hidden weight matrix
  • W_{hh}: hidden-to-hidden weight matrix
  • W_{hy}: hidden-to-output weight matrix
  • b_h, b_y: biases
  • f: activation function (usually tanh or ReLU)

Thus, an RNN neuron takes two types of input:

  1. The current input x_t
  2. The previous hidden state h_{t-1}

This recurrence gives the RNN its ability to remember past information, making it ideal for sequential data.
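This recurrence can be sketched in a few lines of Python. For readability the sketch uses scalar weights rather than matrices, and the sequence values are made up:

```python
import math

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One RNN time step: returns the new hidden state and the output."""
    h_t = math.tanh(W_xh * x_t + W_hh * h_prev + b_h)  # h_t = f(W_xh x_t + W_hh h_{t-1} + b_h)
    y_t = W_hy * h_t + b_y                              # y_t = W_hy h_t + b_y
    return h_t, y_t

# Run a short sequence; the hidden state h carries memory between steps.
h = 0.0
for x in [1.0, 2.0, 3.0]:
    h, y = rnn_step(x, h, W_xh=0.6, W_hh=0.3, W_hy=1.0, b_h=0.1, b_y=0.0)
    print(f"h = {h:.3f}, y = {y:.3f}")
```

Note that the same five parameters are reused at every time step; only the hidden state changes as the sequence unfolds.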

4. Understanding Weights in RNNs

4.1 Definition

Weights are learnable parameters that determine how much influence each input or previous state has on the current neuron’s output.

In RNNs, we have three main types:

  1. Input-to-Hidden Weights (W_{xh})
    Connect the input x_t to the hidden state h_t.
    These control how strongly the current input affects the current neuron’s activation.
  2. Hidden-to-Hidden Weights (W_{hh})
    Connect the previous hidden state h_{t-1} to the current hidden state h_t.
    These store memory and determine how much past information should influence the present.
  3. Hidden-to-Output Weights (W_{hy})
    Connect the hidden state to the output y_t.
    These map the neuron’s internal representation to the final predicted output.

4.2 Example

Suppose a single RNN neuron receives:

 x_t = [2], \quad h_{t-1} = [0.5], \quad W_{xh} = 0.6, \quad W_{hh} = 0.3, \quad b_h = 0.1

Then:

 h_t = \tanh(W_{xh}x_t + W_{hh}h_{t-1} + b_h) = \tanh(0.6 \times 2 + 0.3 \times 0.5 + 0.1) = \tanh(1.45) \approx 0.896

So the neuron outputs 0.896 at this time step, and that value becomes the hidden-state input for the next time step.
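The arithmetic above can be checked with a couple of lines of Python (scalar weights, as in the example):

```python
import math

x_t, h_prev = 2.0, 0.5
W_xh, W_hh, b_h = 0.6, 0.3, 0.1

z = W_xh * x_t + W_hh * h_prev + b_h  # 1.2 + 0.15 + 0.1 = 1.45
h_t = math.tanh(z)
print(round(z, 2), round(h_t, 3))  # 1.45 0.896
```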


5. Understanding Bias in RNNs

5.1 Definition

Bias acts like a constant offset that allows the neuron to activate even when all inputs are zero.
It helps the network fit data better by shifting the activation function left or right.

In RNNs, we have:

  • Hidden bias (b_h): added when computing the hidden state.
  • Output bias (b_y): added when computing the final output.

5.2 Why It’s Important

Without a bias term, a neuron whose inputs are all zero always computes z = 0, so its output is stuck at f(0) (0 for tanh or ReLU, 0.5 for sigmoid) no matter what weights it has learned.
The bias shifts z away from zero, so the neuron can still produce meaningful activations.
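A quick illustration of the point: with zero input, a tanh neuron without bias is pinned at tanh(0) = 0 regardless of its weight, while a bias term frees it (the weight and bias values here are made up):

```python
import math

def activation(x, w, b):
    """Single-input tanh neuron: f(w*x + b)."""
    return math.tanh(w * x + b)

x = 0.0  # all-zero input
print(activation(x, w=5.0, b=0.0))  # 0.0, no matter how large w is
print(activation(x, w=5.0, b=0.8))  # tanh(0.8) ≈ 0.664, thanks to the bias
```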


6. Relationship Between Neuron, Weight, and Bias

To summarize how they interact:

Component | Role
--------- | ----
Neuron    | Core computation unit that processes inputs and produces an output.
Weights   | Determine how strongly each input influences the neuron’s output.
Bias      | Allows flexibility by shifting the activation threshold.

In an RNN, these three elements combine to create a memory mechanism that links past inputs to future outputs through the hidden state.

7. How Weights and Bias Are Learned

RNNs learn weights and biases through a process called Backpropagation Through Time (BPTT).

Steps:

  1. Forward pass: Compute outputs for all time steps.
  2. Loss calculation: Compare predicted output with actual target.
  3. Backward pass: Propagate error backward through all time steps.
  4. Update weights and biases: Adjust W_{xh}, W_{hh}, W_{hy}, b_h, and b_y using gradient descent.

This learning ensures that the neuron’s parameters evolve to capture patterns in sequential data.
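The four steps can be sketched end to end for a scalar one-unit RNN like the one in the earlier examples. This is a minimal illustration of BPTT, not production code; the training sequence, targets, learning rate, and starting parameter values are all made up:

```python
import math

def bptt_train(xs, targets, lr=0.05, epochs=200):
    """Train a 1-unit scalar RNN with Backpropagation Through Time.

    Returns the loss measured on the final epoch's forward pass."""
    Wxh, Whh, Why, bh, by = 0.5, 0.1, 0.5, 0.0, 0.0  # hypothetical starting values
    for _ in range(epochs):
        # 1. Forward pass: hidden states and outputs for all time steps.
        hs, ys = [0.0], []
        for x in xs:
            h = math.tanh(Wxh * x + Whh * hs[-1] + bh)
            hs.append(h)
            ys.append(Why * h + by)
        # 2. Loss: squared error between predictions and targets.
        loss = sum(0.5 * (y - t) ** 2 for y, t in zip(ys, targets))
        # 3. Backward pass: propagate the error back through all time steps.
        dWxh = dWhh = dWhy = dbh = dby = 0.0
        dh_next = 0.0
        for t in reversed(range(len(xs))):
            dy = ys[t] - targets[t]
            dWhy += dy * hs[t + 1]
            dby += dy
            dh = dy * Why + dh_next          # gradient reaching h_t (output + future steps)
            dz = dh * (1 - hs[t + 1] ** 2)   # tanh'(z) = 1 - tanh(z)^2
            dWxh += dz * xs[t]
            dWhh += dz * hs[t]
            dbh += dz
            dh_next = dz * Whh               # gradient flowing to the previous step
        # 4. Update all five parameters by gradient descent.
        Wxh -= lr * dWxh; Whh -= lr * dWhh; Why -= lr * dWhy
        bh -= lr * dbh;   by -= lr * dby
    return loss

# The loss shrinks as training proceeds.
print(bptt_train(xs=[0.5, 1.0, -0.5], targets=[0.2, 0.4, -0.1]))
```

The key BPTT detail is the `dh_next` term: the gradient at step t includes not only the error from its own output but also the error carried back from every later step through W_{hh}.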