Introduction — From Biology to AI
A neuron is a cell in the nervous system — the basic unit of the brain, which processes information and transmits it to other nerve cells and to muscles. AI embeds similar behaviour in an Artificial Neural Network (ANN), which can adapt to changing inputs and produce the best possible outcome without being re-programmed, much as the human brain does.
🔹 Why Neural Networks Matter
- They extract features from data automatically, with no manual feature engineering by the programmer.
- Power chatbots, email auto-reply, spam filtering, Facebook image tagging, product recommendations on e-commerce sites.
- One of the best-known examples — Google's search algorithm.
🔹 Key Concepts You'll Learn
- Parts of a neural network
- Components of a neural network
- Working of a neural network
- Types of neural networks (feedforward, convolutional, recurrent, …)
- Impact of neural networks on society
Prerequisite: Basic understanding of machine-learning concepts.
1.1 Parts of a Neural Network — 3 Layers
- Input Layer — contains units representing the input fields. Each unit corresponds to a specific feature or attribute.
- Hidden Layer(s) — one or more layers between input and output. Each hidden layer holds nodes/artificial neurons that process input data. Nodes are interconnected and each connection has an associated weight.
- Output Layer — contains one or more units representing the target field(s). Generates the final predictions.
If a node's output exceeds a threshold value, the node is activated and passes its output onwards; otherwise, no data is transmitted.
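This threshold rule can be sketched in a few lines of Python (a minimal sketch, assuming a simple step activation; the function name `step_node` is illustrative, not from any library):

```python
def step_node(inputs, weights, bias, threshold=0.0):
    """Weighted sum plus bias; fire (return 1) only if the sum
    meets the threshold, otherwise transmit nothing (0)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= threshold else 0

# 0.5*2 + (-0.2)*3 + 0.1 = 0.5, which meets the default threshold of 0
print(step_node([2, 3], [0.5, -0.2], 0.1))                  # 1 (activated)
# Same sum, but a higher threshold keeps the node silent
print(step_node([2, 3], [0.5, -0.2], 0.1, threshold=1.0))   # 0 (no output)
```

Raising or lowering the threshold changes which inputs make the node fire, which is exactly what tuning does during training.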
🔹 Deep Neural Network (DNN) vs Basic Neural Network
- An ANN with two or more hidden layers is a Deep Neural Network — training it is called Deep Learning.
- A network with more than three layers (inclusive of input + output) is considered a deep-learning algorithm.
- A neural network with only three layers is just a basic neural network.
- "Deep" refers to the number of hidden layers (depth).
1.2 Components of a Neural Network — 7 Building Blocks
🧠 1. Neurons
(a.k.a. nodes) — fundamental building blocks. They receive inputs, compute a weighted sum, apply an activation function and produce an output.
⚖️ 2. Weights
Represent the strength of connection between neurons. Each synapse has a weight that conveys the importance of that feature in predicting the final output. During training, the network learns optimal weights to minimise error.
🔘 3. Activation Functions
Act like decision-makers for each neuron — deciding whether to fire based on its input. Types: Sigmoid · Tanh · ReLU (Rectified Linear Unit). They add non-linearity, letting the model capture complex patterns.
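The three functions named above can be sketched with only the standard library (illustrative definitions, not tied to any framework):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))   # squashes any input into (0, 1)

def tanh(x):
    return math.tanh(x)             # squashes any input into (-1, 1)

def relu(x):
    return max(0.0, x)              # passes positives through, zeroes negatives

print(sigmoid(0))   # 0.5
print(tanh(0))      # 0.0
print(relu(-2.5))   # 0.0
print(relu(3.0))    # 3.0
```

Because each of these curves is non-linear, stacking layers that use them lets the network model patterns a straight line cannot.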
➕ 4. Bias
A constant added to the weighted sum before the activation function is applied. It shifts the activation function left or right, letting a neuron fire (or stay silent) even when all of its inputs are zero.
🔗 5. Connections
Represent the synapses between neurons. Each has an associated weight controlling its influence. Biases (constants) affect the activation threshold.
📚 6. Learning Rule
Specifies how weights and biases are adjusted during training. Backpropagation — the common learning algorithm — computes gradients and updates weights to minimise the network's error.
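A toy illustration of the idea, assuming a one-weight linear model and squared error (the name `update_weight` is illustrative, not from any library): the gradient of the error is computed and the weight is nudged against it, which is the step backpropagation applies layer by layer.

```python
def update_weight(w, x, target, lr=0.1):
    pred = w * x                      # forward pass (no bias, for brevity)
    grad = 2 * (pred - target) * x    # derivative of (pred - target)^2 w.r.t. w
    return w - lr * grad              # step downhill along the gradient

w = 0.0
for _ in range(50):
    w = update_weight(w, x=2.0, target=6.0)  # learn w such that w * 2 = 6
print(round(w, 3))  # converges to 3.0
```

Repeating this tiny update over many examples and many weights is, in essence, how training minimises the network's error.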
↔️ 7. Propagation Functions
🔹 Forward Propagation
Input data flows through the network layers; activations are computed; the predicted output is compared with the actual target — resulting in an error (loss).
🔹 Back Propagation
The error from forward propagation is propagated backwards through the network, and the weights are adjusted (typically via gradient descent) to reduce it. Proper tuning of the weights lowers the error rate, improving generalisation and making predictions more reliable over time.
2.1 Working of a Neural Network — The Formula
Each node is a simple calculator: it takes inputs, multiplies by weights, adds a bias, and produces an output — fed to the next node.
Σ(wᵢ · xᵢ) + bias = w₁x₁ + w₂x₂ + w₃x₃ + bias
Output: f(x) = 1 if Σ(wᵢxᵢ) + b ≥ 0; f(x) = 0 otherwise (or compared against a chosen threshold).
🔹 Feedforward Network
This process — each node passing its output to the next layer — defines the network as a feedforward network.
2.2 Worked Example — Two Cases
🧮 CASE I — Hidden Layer
Inputs: x₁ = 2, x₂ = 3, x₃ = 1. Weights: w₁ = 0.4, w₂ = 0.2, w₃ = 0.6. Bias = 0.1. Threshold = 3.0.
Σwᵢxᵢ + bias = (0.4·2) + (0.2·3) + (0.6·1) + 0.1 = 0.8 + 0.6 + 0.6 + 0.1 = 2.1
2.1 < 3.0 → output = 0 (neuron INACTIVE)
🧮 CASE II — Output Layer
Weights: w₁ = 0.7, w₂ = 0.3. Bias = 0.2. The hidden-layer output from Case I (= 0) is fed to both inputs.
Output = (0.7·0) + (0.3·0) + 0.2 = 0.2
Threshold for output layer = 0.1
0.2 > 0.1 → final output = 1 (neuron ACTIVE)
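Both cases can be re-checked with a few lines of Python (the helper name `fire` is illustrative):

```python
def fire(inputs, weights, bias, threshold):
    """Weighted sum plus bias, compared against the layer's threshold."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= threshold else 0

# Case I: hidden-layer node (sum = 2.1, below threshold 3.0)
case1 = fire([2, 3, 1], [0.4, 0.2, 0.6], bias=0.1, threshold=3.0)
print(case1)  # 0 -> neuron inactive

# Case II: output node receives Case I's output (0) on both inputs
case2 = fire([case1, case1], [0.7, 0.3], bias=0.2, threshold=0.1)
print(case2)  # 1 -> 0.2 exceeds 0.1, neuron active
```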
2.3 Real-World Example — Should I Go Surfing?
Three decision factors affect whether you go surfing:
| Factor | Input | Weight |
|---|---|---|
| Wave Quality (good?) | x₁ = 1 | w₁ = 5 (large swells rare) |
| Line-up Congestion (empty?) | x₂ = 0 | w₂ = 2 (used to crowds) |
| No recent Shark Activity? | x₃ = 1 | w₃ = 4 (fear of sharks) |
Threshold = 3 → bias = −3 (folding the threshold into the bias). Plug into the formula:
ŷ = (1·5) + (0·2) + (1·4) − 3 = 5 + 0 + 4 − 3 = 6
6 > 0 → output = 1 → GO SURFING!
(ŷ = "y-hat" denotes the predicted value.) Adjusting weights or the threshold yields different outcomes — allowing the model to be tuned for personal preferences.
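The surfing decision is a tiny perceptron, with the threshold folded into the bias (threshold 3 → bias −3) as in the table; the helper name `decide` is illustrative.

```python
def decide(x, w, bias):
    """Perceptron decision: fire when the biased weighted sum is non-negative."""
    y_hat = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return 1 if y_hat >= 0 else 0

weights = [5, 2, 4]                    # wave quality, empty line-up, no sharks
print(decide([1, 0, 1], weights, -3))  # good waves, crowded, no sharks -> 1 (go)
print(decide([0, 0, 0], weights, -3))  # bad waves, crowded, sharks -> 0 (stay home)
```

Changing a weight (say, lowering w₁ if big swells matter less to you) re-tunes the same model to a different set of preferences.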
2.4 Types of Neural Networks — 5 Main Types
① 1. Standard Neural Network (Perceptron)
Created by Frank Rosenblatt in 1958 — a simple NN with a single layer of input nodes fully connected to output nodes. Uses Threshold Logic Units (TLUs) as artificial neurons.
Application: binary classification — spam detection, basic decision-making.
➡️ 2. Feed Forward Neural Network (FFNN)
Also known as Multi-Layer Perceptron (MLP). Has input + one or more hidden layers + output. Data flows one direction only, from input to output. Uses activation functions and weights to process information in a forward manner.
Applications: image recognition · NLP · regression. Efficient for handling noisy data.
🖼️ 3. Convolutional Neural Network (CNN)
Uses filters to extract features from images — incorporates a three-dimensional arrangement of neurons, ideal for visual data.
Applications: computer vision — object detection · image recognition · style transfer · medical imaging.
🔁 4. Recurrent Neural Network (RNN)
Designed for sequential data: feedback loops let information persist across time steps. If a prediction is wrong, small adjustments are made during backpropagation (their size governed by the learning rate), so the network gradually moves towards the right prediction.
Applications: NLP (language modelling, machine translation, chatbots) · speech recognition · time-series prediction · sentiment analysis.
🎨 5. Generative Adversarial Network (GAN)
Consists of two networks:
- Generator — creates new data instances.
- Discriminator — evaluates them for authenticity.
Trained simultaneously under unsupervised learning — used to generate realistic data such as images and videos.
Applications: synthetic-data generation · image generation · style transfer · data augmentation.
2.5 Future of NNs & Impact on Society
- Efficiency & productivity — automates tasks, optimises resources across manufacturing, finance and more.
- Personalisation — tailored recommendations and experiences based on huge datasets.
- Economic growth — creates new jobs in data science and AI.
- Ethical concerns — data privacy, algorithmic bias, job displacement — demand careful consideration and regulation.
3.1 Activity 1 — Machine Learning for Kids (Animals & Birds)
Visit machinelearningforkids.co.uk. Steps:
- Create a project — Identifying Animals & Birds.
- Add labels (classes like "Cat", "Dog", "Parrot") and upload sample contents for each.
- Click Train — the platform trains a model.
- Test with new images.
- Click Describe your model to view the underlying neural-network structure.
- Click Next to see deep-learning working step-by-step.
3.2 Activity 2 — Celsius → Fahrenheit with TensorFlow
Formula: f = c × 1.8 + 32. Instead of writing this as a Python function, train a single-neuron NN to learn the relationship.
```python
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# Training data
c = np.array([-40, -10, 0, 8, 15, 22, 38], dtype=float)
f = np.array([-40, 14, 32, 46, 59, 72, 100], dtype=float)

# Model: 1 dense layer with 1 neuron, input shape (1,)
model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1])])

# Compile
model.compile(loss='mean_squared_error',
              optimizer=tf.keras.optimizers.Adam(0.1),
              metrics=['mean_squared_error'])

# Train — 500 epochs
history = model.fit(c, f, epochs=500, verbose=False)
print("Finished training the model")

# Plot the loss curve
plt.xlabel('Epoch Number')
plt.ylabel('Loss Magnitude')
plt.plot(history.history['loss'])
plt.show()

# Predict
print(model.predict(np.array([100.0])))  # ~ 212.0
print((100 * 1.8) + 32)                  # 212.0 (formula check)
```
🔹 Key TensorFlow Concepts
- input_shape=[1] — single numeric input (1-D array with 1 value).
- units=1 — one neuron in the layer.
- Loss function — measures how far predictions are from the target (the "loss").
- Optimizer function — adjusts internal values to reduce the loss (e.g., Adam).
3.3 Activity 3 — Creating a Neural Network with Python (Advanced)
A deeper ANN for the same Celsius-to-Fahrenheit problem using a CSV dataset + 2 hidden layers:
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

# Load dataset
temp_df = pd.read_csv('cel_fah.csv')
temp_df.head()

# Visualise
plt.scatter(temp_df['Celsius'], temp_df['Fahrenheit'])

X_train = temp_df['Celsius']
y_train = temp_df['Fahrenheit']

# Sequential model — 1 input, 2 hidden layers (32 neurons each), 1 output
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(units=32, input_shape=(1,)))
model.add(tf.keras.layers.Dense(units=32))
model.add(tf.keras.layers.Dense(units=1))
model.summary()

# Compile + train (30 epochs, 20% validation)
model.compile(optimizer=tf.keras.optimizers.Adam(0.1), loss='mean_squared_error')
epochs_hist = model.fit(X_train, y_train, epochs=30, validation_split=0.2)

# Deploy
Temp_C = float(input("Enter temperature in Celsius: "))
Temp_F = model.predict(np.array([Temp_C]))
print('Temperature in Fahrenheit (ANN) =', Temp_F)
```
model.summary() shows the architecture — layers, output shape of each layer, and number of trainable parameters.
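As a quick sanity check on that summary, the trainable-parameter count for the 1 → 32 → 32 → 1 stack can be computed by hand: each Dense layer has (inputs × units) weights plus one bias per unit.

```python
# (inputs, units) for each Dense layer in the model above
layers = [(1, 32), (32, 32), (32, 1)]

# weights = inputs * units, biases = units
params = [n_in * n_out + n_out for n_in, n_out in layers]
print(params)        # [64, 1056, 33]
print(sum(params))   # 1153 trainable parameters in total
```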
3.4 Activity 4 — TensorFlow Playground
Visit playground.tensorflow.org — a browser-based web app where a real neural network runs live. Tweak parameters and see how algorithms behave.
🔹 The 8 Key Parameters
📊 1. Data
Six pre-loaded datasets: Circle, Exclusive OR (XOR), Gaussian, Spiral, Plane, Multi-Gaussian. First four — classification; last two — regression. Blue dots = positive; orange = negative.
🔢 2. Features
Seven inputs — X₁, X₂, X₁², X₂², X₁·X₂, sin(X₁), sin(X₂). Toggle on/off to see which are most important.
⚖️ 3. Weights
Lines between neurons — blue = positive weight · orange = negative weight. Thickness represents magnitude. Output dots change colour / intensity based on confidence of prediction.
🔄 4. Epoch
One complete iteration through the dataset.
🚀 5. Learning Rate
Alpha — the speed at which the model learns. Too high → overshoot; too low → slow convergence. Start at 0.03.
🔘 6. Activation Function
Choose one of: Tanh · ReLU · Sigmoid · Linear.
🛡️ 7. Regularization
L1 and L2 regularisation — reduce/remove overfitting by penalising large weights.
📈 8. Output
Check model performance after training — observe the Test loss and Training loss.
🔹 Sample Classification Walk-through (XOR)
- Select the Exclusive OR (XOR) dataset.
- Set ratio of training : test = 60 : 40.
- Noise = 5, batch size = 10.
- With features X₁ and X₂ only → Training loss ≈ 0.004, Test loss ≈ 0.002, Steps 255.
- Add the product X₁ · X₂ → Training loss ≈ 0.001, Test loss ≈ 0.001, Steps 102 — much better!
- Set learning rate = 0.03 and experiment.
- Select 2 hidden layers — 4 neurons in the first, 2 in the second, then output.
- Observe the weight thicknesses and final Train / Test loss.
- Neural network → ML model inspired by the human brain.
- Neurons → nodes that make up the layers of a neural network (also functions that process inputs).
- Role of activation functions → introduce non-linearity, allowing complex patterns.
- Backpropagation → adjusts weights and biases to minimise error.
- NN for image recognition → Convolutional Neural Network (CNN).
- Neural networks learn by adjusting weights and biases based on prediction error.
Quick Revision — Key Points to Remember
- Neural Network = ML model inspired by biological neurons; ANN = its AI form.
- Google's search algorithm is a famous NN application.
- 3 Layers: Input · Hidden (1+) · Output.
- Deep Neural Network = ANN with ≥ 2 hidden layers → Deep Learning.
- 7 Components: Neurons · Weights · Activation Functions (Sigmoid/Tanh/ReLU) · Bias · Connections · Learning Rule · Propagation Functions.
- Forward Propagation: input → activation → loss.
- Backpropagation = backward propagation of errors → adjusts weights via gradient descent each epoch.
- Formula: Σ(wᵢ · xᵢ) + bias → activation → output (1 if ≥ threshold, else 0).
- Worked example: Surfing → ŷ = 6 > threshold 3 → output 1 (GO!).
- 5 Types of NN: Perceptron (Rosenblatt 1958, TLUs) · FFNN/MLP · CNN (vision) · RNN (sequences/NLP) · GAN (Generator + Discriminator).
- CNN — 3-D neuron arrangement, filters, used in object detection & medical imaging.
- RNN — feedback loops, sequential data, NLP + speech + time-series.
- GAN — two networks trained together to produce realistic synthetic data.
- Society impact: efficiency · personalisation · new jobs + concerns on privacy, bias, displacement.
- Hands-on tools: Machine Learning for Kids · TensorFlow (Celsius→Fahrenheit) · TensorFlow Playground.
- TF Playground parameters: Data · Features · Weights · Epoch · Learning Rate · Activation (Tanh/ReLU/Sigmoid/Linear) · Regularization (L1/L2) · Output (Test/Train loss).