The Perceptron Isn't a Neuron: It's a Weighted Vote With a Threshold

perceptron mcculloch-pitts neural-networks weights-bias supervised-learning

Before there were multilayer networks, there was a single unit trying to prove a point: that a machine could make a decision from weighted evidence. The McCulloch-Pitts neuron is the ancestor of every neuron in every network since. It does exactly two things: a function $g$ aggregates the inputs, and a function $f$ makes a decision based on the aggregated value. Inputs come in two flavors: excitatory (positive values that push toward firing) and inhibitory (negative values that push against it). The aggregation itself is nothing exotic:

$$y_{sum} = \sum_{i=1}^{n} w_i x_i$$
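As a minimal sketch of that two-function split, the aggregation $g$ and the decision $f$ might look like this in Python. The function names mirror the text, but the `threshold` parameter and the example weights are illustrative assumptions; nothing above fixes their values.

```python
def g(inputs, weights):
    """Aggregate: weighted sum of the inputs (y_sum in the formula above)."""
    return sum(w * x for w, x in zip(weights, inputs))


def f(y_sum, threshold):
    """Decide: fire (1) if the aggregate reaches the threshold, else 0."""
    return 1 if y_sum >= threshold else 0


# Hypothetical setup: two excitatory inputs (+1) and one inhibitory input (-1).
weights = [1, 1, -1]

print(f(g([1, 1, 0], weights), threshold=2))  # 1: sum is 2, reaches the threshold
print(f(g([1, 1, 1], weights), threshold=2))  # 0: the inhibitor pulls the sum to 1
```

Splitting the unit into `g` and `f` keeps the "vote" separate from the "verdict", which is the same separation the perceptron inherits.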

The Perceptron builds directly on this. It is the simplest form of neural network: a single unit that makes a decision by combining weighted inputs and running the result through an activation function. The one addition that matters is the bias term:

$$z = \sum_{i=1}^{n} w_i x_i + b$$

Weight and bias do different jobs, and it's worth being precise about the difference. Each weight controls how much its input influences the output: the larger the weight's magnitude, the more influence. The bias controls when the perceptron activates at all: it shifts the decision boundary (the set of inputs where $z = 0$) away from the origin, independent of the inputs. With zero bias, that boundary is forced through the origin no matter how good the weights are. Bias is what lets the boundary sit wherever the data actually needs it.
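To make the bias's job concrete, here is a small sketch with a single input and a fixed weight, so the only thing that changes is $b$. It assumes a step activation that fires when $z \geq 0$; the helper names `predict` and `boundary`, and the specific numbers, are hypothetical.

```python
def predict(x, w, b):
    """Single-input perceptron: fire (1) when z = w*x + b is non-negative."""
    z = w * x + b
    return 1 if z >= 0 else 0


def boundary(w, b):
    """Input value where z crosses zero: solve w*x + b = 0 (assumes w != 0)."""
    return (0.0 - b) / w


xs = [-1, 0, 1, 2, 3, 4]
w = 1.0  # the weight stays fixed; only the bias moves

print(boundary(w, 0.0), [predict(x, w, 0.0) for x in xs])
# 0.0 [0, 1, 1, 1, 1, 1]   zero bias: the flip is pinned to the origin

print(boundary(w, -3.0), [predict(x, w, -3.0) for x in xs])
# 3.0 [0, 0, 0, 0, 1, 1]   bias of -3: same weight, flip moved to x = 3
```

The weight never changed between the two runs, yet the point where the output flips moved from 0 to 3.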

A perceptron is trained with the Perceptron Rule, a clean three-step loop. First, initialize the weights, the bias, and a learning rate $\eta$. Second, for each training example: compute $z = w \cdot x + b$, threshold it into a prediction ($y_{pred} = 1$ if $z \geq 0$, else $0$), compute the error ($\text{Error} = y - y_{pred}$), then nudge the weights and bias in the direction that would have reduced that error:

$$w' = w + \eta \cdot \text{Error} \cdot x \qquad b' = b + \eta \cdot \text{Error}$$

Third, repeat this for multiple epochs until the errors stop moving the weights, which happens once every training example is classified correctly. There's no calculus here, no gradients, just: were you wrong, and if so, which direction should the weight have leaned. That simplicity is also the perceptron's ceiling: its decision boundary is always linear (a straight line in two dimensions, a hyperplane in more), so it can only separate classes that a single straight cut can divide. Everything after this in the neural network story is about breaking past that limit.
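Put together, the three steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the zero initialization, `eta=1.0`, the `max_epochs` cap, and the AND and XOR toy datasets are all choices made for the example.

```python
def predict(w, b, x):
    """Step activation: 1 if z = w . x + b is non-negative, else 0."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z >= 0 else 0


def train(samples, eta=1.0, max_epochs=100):
    """Perceptron Rule. Returns (w, b, epoch) where epoch is the first pass
    with zero mistakes, or None if max_epochs ran out first."""
    n_inputs = len(samples[0][0])
    w, b = [0.0] * n_inputs, 0.0          # step 1: initialize (assumed zeros)
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, y in samples:              # step 2: predict, compare, nudge
            error = y - predict(w, b, x)
            if error != 0:
                mistakes += 1
                w = [wi + eta * error * xi for wi, xi in zip(w, x)]
                b += eta * error
        if mistakes == 0:                 # step 3: stop once nothing moves
            return w, b, epoch
    return w, b, None


# Toy datasets (assumed for illustration): AND is linearly separable, XOR is not.
AND = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

w, b, epoch = train(AND)
print([predict(w, b, x) for x, _ in AND])  # [0, 0, 0, 1]: AND is learned

_, _, epoch_xor = train(XOR)
print(epoch_xor)  # None: no straight line separates XOR, so training never settles
```

The XOR run is the ceiling in miniature: the update rule keeps correcting one example at the cost of another, and no number of extra epochs fixes that.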