Use this page to see what a neural network is made of. Change the number of inputs,
hidden layers, and outputs, then watch how the parameter count changes. The page connects
those parameters back to the familiar maths idea y = ax + b, where weights are
like a and biases are like b. The examples below show how a tiny
classifier calculates a result, and how a larger network can recognise simple letters.
Network shape
Inputs, hidden layers, and outputs determine how many connections the model has.
Parameter cost
Each weight and bias must be stored, trained, and used whenever the model runs.
Maths connection
Neural networks reuse the idea of y = ax + b many times across many neurons.
Learning example
The letter demo shows how pixels become signals, scores, and a final prediction.
Change the network structure
Type layer sizes separated by commas. Example: 8, 16, 8
Cost message
Your current network is small.
Wider layers add many more connections.
More hidden layers increase depth.
Both usually increase training and inference cost.
For image or letter recognition, the input layer often grows quickly because each pixel can become an input value.
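To make the pixel point concrete, here is a small sketch of how the first layer grows with image size. It assumes one input per pixel and a fully connected first hidden layer. The grid sizes and the hidden width of 16 are hypothetical values chosen for illustration, and the function name is not taken from the page.

```python
def first_layer_parameters(width, height, hidden_units):
    """Parameters in the first dense layer when every pixel is one input."""
    inputs = width * height          # one input value per pixel
    weights = inputs * hidden_units  # every pixel connects to every hidden unit
    biases = hidden_units            # one bias per hidden unit
    return weights + biases


if __name__ == "__main__":
    # Hypothetical grid sizes; 16 hidden units is an arbitrary choice.
    for side in (5, 28):
        print(f"{side}x{side} grid:", first_layer_parameters(side, side, 16))
```

With these assumed sizes, a 5x5 grid needs 416 parameters in the first layer, while a 28x28 grid needs 12,560.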
Cost summary
Layer sizes, total parameters, approx memory (FP32), relative compute cost, depth, simple size label
Parameter = weight + bias
Weight is like a
Bias is like b
More parameters = more cost
Deeper network = more layers
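The totals in the cost summary can be reproduced by hand. The following is a minimal sketch, assuming a fully connected network in which the comma-separated list gives every layer size in order (input first, output last) and each parameter is stored as a 4-byte FP32 value. The names count_parameters and fp32_bytes are illustrative and are not taken from the page's own script.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    Assumption: layer_sizes lists every layer in order, input first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = n_in * n_out  # one weight per connection
        biases = n_out          # one bias per neuron in the receiving layer
        total += weights + biases
    return total


def fp32_bytes(parameter_count):
    """Approximate storage, assuming 4 bytes (32 bits) per parameter."""
    return parameter_count * 4


if __name__ == "__main__":
    sizes = [8, 16, 8]
    params = count_parameters(sizes)
    print(sizes, "->", params, "parameters,", fp32_bytes(params), "bytes")
```

Under these assumptions, 8, 16, 8 gives 8*16 + 16 = 144 parameters between the first two layers and 16*8 + 8 = 136 between the last two: 280 parameters in total, or about 1,120 bytes in FP32.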
Parameters and y = ax + b
In traditional maths, a straight line is often written as y = ax + b.
The a controls how strongly the input x changes the output y, and the
b shifts the result up or down.
A neural network neuron uses the same idea, just repeated many times:
each connection has a weight, and each neuron has a bias. During training,
the network searches for useful values for those weights and biases.
Traditional:
y = a*x + b
One neuron:
output = activation(w*x + b)
Many inputs:
output = activation(w1*x1 + w2*x2 + ... + b)
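The many-input formula can be written out directly. This is a minimal sketch: the function name neuron, the choice of sigmoid as the activation, and the example numbers are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Compute activation(w1*x1 + w2*x2 + ... + b) for one neuron."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the arithmetic:
    # z = 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0
    print(neuron([1.0, 2.0], [0.5, -0.25], bias=0.0))
```

With a single input and a single weight, the same function reduces to activation(w*x + b), the one-neuron case.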
Network picture
Lines represent weighted connections. Nodes represent units. The picture limits how much it draws, so large networks stay easy to view.
Small deep learning example: letter recognition
Dark cell = pixel turned on
Every pixel is connected to the next layer
Blue glow = stronger active signal
Predicted letter
1. Input image
A small grid stores the letter as pixel values.
2. Hidden layers
Every input pixel connects to the next layer, even white pixels.
3. Output scores
Each output node represents one possible letter.
4. Prediction
The highest score becomes the recognised letter.
Example shown: the network reads a small pixel grid and predicts which letter it is.
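The four steps above can be sketched as a forward pass. This is an illustration only: the 3x3 grid, the two-letter alphabet, the layer sizes, and the random placeholder weights are all assumptions, and the network is untrained, so the letter it picks is arbitrary. The point is the flow from pixels to scores to a prediction.

```python
import math
import random

LETTERS = ["L", "T"]  # hypothetical output classes, one per output node

# 3x3 grid flattened row by row; 1 = dark (on) pixel, 0 = white pixel.
LETTER_L = [1, 0, 0,
            1, 0, 0,
            1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_in, n_out, rng):
    """Random placeholder weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, layer):
    """Every input feeds every neuron, including pixels whose value is 0."""
    weights, biases = layer
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def predict(pixels, layers):
    signal = pixels                 # step 1: the input image as pixel values
    for layer in layers:            # steps 2 and 3: hidden layer, then output scores
        signal = dense(signal, layer)
    best = max(range(len(signal)), key=lambda i: signal[i])  # step 4: highest score
    return LETTERS[best], signal


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    layers = [make_layer(9, 4, rng), make_layer(4, len(LETTERS), rng)]
    letter, scores = predict(LETTER_L, layers)
    print(letter, scores)  # untrained weights: the letter shown is arbitrary
```

Training would replace the random weights and biases with learned values, so that the highest score lines up with the correct letter.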
Smallest useful network example
The smallest useful neural network can have just one input and one output neuron.
It is not deep yet, but it uses the same parameter idea as larger networks: one weight and one bias.
Here it predicts whether a student is likely to pass from the number of study hours.
Input:
x = study hours
Parameters learned by training:
weight w = 2
bias b = -5
Neuron calculation:
z = w*x + b
score = sigmoid(z)
Decision rule:
if score >= 0.5, predict Pass
otherwise, predict Not pass
What sigmoid means:
sigmoid(z) = 1 / (1 + e^(-z))
It squeezes any number into a score between 0 and 1.
Example 1:
x = 3 hours
z = 2*3 - 5 = 1
score = sigmoid(1)
score = 1 / (1 + e^(-1))
score ≈ 1 / (1 + 0.37) ≈ 0.73
result = Pass
Example 2:
x = 2 hours
z = 2*2 - 5 = -1
score = sigmoid(-1)
score = 1 / (1 + e^(1))
score ≈ 1 / (1 + 2.72) ≈ 0.27
result = Not pass
Parameter count:
1 weight + 1 bias = 2 parameters
Study hours x    z = 2*x - 5    sigmoid(z)    Prediction
3                1              0.73          Pass
2                -1             0.27          Not pass
This is the same shape as y = ax + b: the weight is like a, the bias is like b, and sigmoid turns the raw result z into a probability-like score between 0 and 1.
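The study-hours neuron can be checked with a few lines of code. This sketch uses the weight and bias from the worked example; the function names pass_score and predict are illustrative.

```python
import math

W = 2.0   # weight from the worked example
B = -5.0  # bias from the worked example


def sigmoid(z):
    """Squeeze any number into a score between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def pass_score(study_hours):
    """Score for one student: sigmoid(w*x + b)."""
    z = W * study_hours + B
    return sigmoid(z)


def predict(study_hours, threshold=0.5):
    """Apply the decision rule: Pass when the score reaches the threshold."""
    return "Pass" if pass_score(study_hours) >= threshold else "Not pass"


if __name__ == "__main__":
    for hours in (3, 2):
        print(hours, round(pass_score(hours), 2), predict(hours))
```

Because sigmoid(0) = 0.5, the rule score >= 0.5 is the same as z >= 0. With w = 2 and b = -5, the prediction therefore switches from Not pass to Pass at x = 2.5 study hours.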