𝗜 𝗕𝘂𝗶𝗹𝘁 𝗔 𝗡𝗲𝘂𝗿𝗮𝗹 𝗡𝗲𝘁𝘄𝗼𝗿𝗸'𝘀 𝗙𝗶𝗿𝘀𝘁 𝗡𝗲𝘂𝗿𝗼𝗻

📅2 days ago⏱1 min read

In 1958, transformers did not exist, and backpropagation was not yet used to train networks. That year, Frank Rosenblatt introduced the Perceptron.

Build this neuron and you see the core unit of every deep network. This is Day 1 of DeepLearningFromZero. I build neural networks from scratch without using frameworks.

How a neuron works:

You take inputs.
You multiply each input by a weight.
You add a bias.
You apply an activation function.

The original Perceptron used a step function. It outputs +1 if the weighted sum plus the bias is 0 or more. Otherwise, it outputs -1.
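
The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the full post: the function names `step` and `neuron` and the example weights and bias are assumptions chosen for the demo.

```python
def step(z):
    """Step activation: +1 if z is 0 or more, otherwise -1."""
    return 1 if z >= 0 else -1


def neuron(inputs, weights, bias):
    """Multiply each input by its weight, add the bias, apply the step function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(z)


# Illustrative values only (not taken from the original post).
weights = [2.0, -1.0]
bias = -0.5

print(neuron([1.0, 0.5], weights, bias))  # 2.0 - 0.5 - 0.5 = 1.0, so +1
print(neuron([0.0, 1.0], weights, bias))  # -1.0 - 0.5 = -1.5, so -1
```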

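The math creates a decision boundary: the set of points where the weighted sum plus the bias equals exactly 0. With two inputs, this boundary is a straight line. One side represents class +1. The other side represents class -1. A neuron holds knowledge through the tilt and position of this line. The weights set the tilt. The bias sets the position.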
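
For two inputs, the line can be written out explicitly. The symbols here are assumed for illustration (weights w_1 and w_2, bias b, inputs x_1 and x_2); the original post does not name them.

```latex
w_1 x_1 + w_2 x_2 + b = 0
\quad\Longrightarrow\quad
x_2 = -\frac{w_1}{w_2}\,x_1 - \frac{b}{w_2}
\qquad (w_2 \neq 0)
```

The slope depends only on the ratio of the weights, and the intercept depends on the bias. Changing a weight rotates the line. Changing the bias slides it.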

Learning happens through error correction:

Predict the label for a point.
If the prediction is correct, do nothing.
If the prediction is wrong, nudge the weights.

This process rotates and shifts the line. Each update moves the line toward putting the misclassified point on the correct side. Rosenblatt proved that if a straight line can separate two classes, this method will find one in a finite number of updates.
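
The error-correction loop can be sketched as follows. This assumes the classic perceptron update rule (add `lr * label * input` to each weight and `lr * label` to the bias on a mistake); the post itself only says "nudge the weights", so the exact rule, the learning rate `lr`, and the toy data points are assumptions for this example.

```python
def predict(x, w, b):
    """Weighted sum plus bias, passed through the step function."""
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    return 1 if z >= 0 else -1


def train(points, labels, lr=1.0, max_epochs=100):
    """Classic perceptron rule: update only when a prediction is wrong."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            if predict(x, w, b) != y:
                # Nudge the line toward classifying this point correctly.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:
            break  # a full pass with no errors: every point is on the right side
    return w, b


# Hypothetical, linearly separable toy data.
points = [(2.0, 1.0), (3.0, 2.5), (-1.0, -1.5), (-2.0, -0.5)]
labels = [1, 1, -1, -1]

w, b = train(points, labels)
print([predict(x, w, b) for x in points])  # [1, 1, -1, -1]
```

Starting from zero weights is one common choice. Any starting point works on separable data, though the final line may differ.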

A single neuron has a limit. It can only draw one straight line. It cannot solve the XOR problem because no single line separates those points. Minsky and Papert highlighted this limitation in 1969, and neural network research stalled through the 1970s.
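
A short experiment makes the limit concrete. The sketch below trains the same kind of single neuron on AND (separable) and on XOR (not separable), using the classic perceptron update rule. The helper names, the ±1 label encoding, and the `max_epochs` cap are assumptions for this demo.

```python
def predict(x, w, b):
    """Weighted sum plus bias, passed through the step function."""
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    return 1 if z >= 0 else -1


def train(points, labels, max_epochs=1000):
    """Return (weights, bias, converged) after at most max_epochs passes."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            if predict(x, w, b) != y:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return w, b, True  # an error-free pass: a separating line exists
    return w, b, False  # gave up: no error-free pass within the cap


corners = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [-1, -1, -1, 1]  # +1 only when both inputs are 1
xor_labels = [-1, 1, 1, -1]   # +1 when exactly one input is 1

_, _, and_ok = train(corners, and_labels)
_, _, xor_ok = train(corners, xor_labels)
print("AND learned by one neuron:", and_ok)  # True
print("XOR learned by one neuron:", xor_ok)  # False
```

Raising `max_epochs` does not help with XOR. The weights keep cycling because no line can put all four corners on the correct side.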

We solved this by stacking neurons into layers. We replaced the step function with smooth activations. We added gradient descent and backpropagation. This series follows that path from one neuron to a transformer.

Train the neuron live and watch the boundary rotate: https://dev48v.infy.uk/dl/day1-perceptron.html

Full post: https://dev.to/dev48v/i-built-a-neural-networks-first-neuron-from-scratch-the-1958-perceptron-3gfg
