Introduction — From Biology to AI
A neuron is a cell in the nervous system — the basic unit of the brain, which processes information and transmits it to other nerve cells and to muscles. AI embeds similar behaviour in an Artificial Neural Network (ANN), which can adapt to changing inputs and produce the best possible outcome without being re-programmed, much as the human brain does.
🔹 Why Neural Networks Matter
- They extract features from data automatically, with no manual feature engineering by the programmer.
- Power chatbots, email auto-reply, spam filtering, Facebook image tagging, product recommendations on e-commerce sites.
- One of the best-known examples — Google's search algorithm.
🔹 Key Concepts You'll Learn
- Parts of a neural network
- Components of a neural network
- Working of a neural network
- Types of neural networks (feedforward, convolutional, recurrent, …)
- Impact of neural networks on society
Prerequisite: Basic understanding of machine-learning concepts.
1.1 Parts of a Neural Network — 3 Layers
- Input Layer — contains units representing the input fields. Each unit corresponds to a specific feature or attribute.
- Hidden Layer(s) — one or more layers between input and output. Each hidden layer holds nodes/artificial neurons that process input data. Nodes are interconnected and each connection has an associated weight.
- Output Layer — contains one or more units representing the target field(s). Generates the final predictions.
If a node's output exceeds a threshold value, the node is activated and passes its output onwards; otherwise, no data is transmitted.
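This threshold rule can be sketched in a few lines of Python (a minimal sketch, assuming a simple step activation; the function name `step_node` is illustrative, not from any library):

```python
def step_node(inputs, weights, bias, threshold=0.0):
    """Weighted sum plus bias; fire (return 1) only if the sum
    meets the threshold, otherwise transmit nothing (0)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= threshold else 0

# 0.5*2 + (-0.2)*3 + 0.1 = 0.5, which meets the default threshold of 0
print(step_node([2, 3], [0.5, -0.2], 0.1))                  # 1 (activated)
# Same sum, but a higher threshold keeps the node silent
print(step_node([2, 3], [0.5, -0.2], 0.1, threshold=1.0))   # 0 (no output)
```

Raising or lowering the threshold changes which inputs make the node fire, which is exactly what tuning does during training.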
🔹 Deep Neural Network (DNN) vs Basic Neural Network
- An ANN with two or more hidden layers is a Deep Neural Network — training it is called Deep Learning.
- A network with more than three layers (inclusive of input + output) is considered a deep-learning algorithm.
- A neural network with only three layers is just a basic neural network.
- "Deep" refers to the number of hidden layers (depth).
1.2 Components of a Neural Network — 7 Building Blocks
🧠 1. Neurons
(a.k.a. nodes) — fundamental building blocks. They receive inputs, compute a weighted sum, apply an activation function and produce an output.
⚖️ 2. Weights
Represent the strength of connection between neurons. Each synapse has a weight that conveys the importance of that feature in predicting the final output. During training, the network learns optimal weights to minimise error.
🔘 3. Activation Functions
Act like decision-makers for each neuron — deciding whether to fire based on its input. Types: Sigmoid · Tanh · ReLU (Rectified Linear Unit). They add non-linearity, letting the model capture complex patterns.
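The three functions named above can be sketched with only the standard library (illustrative definitions, not tied to any framework):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))   # squashes any input into (0, 1)

def tanh(x):
    return math.tanh(x)             # squashes any input into (-1, 1)

def relu(x):
    return max(0.0, x)              # passes positives through, zeroes negatives

print(sigmoid(0))   # 0.5
print(tanh(0))      # 0.0
print(relu(-2.5))   # 0.0
print(relu(3.0))    # 3.0
```

Because each of these curves is non-linear, stacking layers that use them lets the network model patterns a straight line cannot.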
➕ 4. Bias
A constant added to the weighted sum before the activation function is applied. It shifts the activation function left or right, letting a neuron fire (or stay silent) even when all of its inputs are zero.
🔗 5. Connections
Represent the synapses between neurons. Each has an associated weight controlling its influence. Biases (constants) affect the activation threshold.
📚 6. Learning Rule
Specifies how weights and biases are adjusted during training. Backpropagation — the common learning algorithm — computes gradients and updates weights to minimise the network's error.
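A toy illustration of the idea, assuming a one-weight linear model and squared error (the name `update_weight` is illustrative, not from any library): the gradient of the error is computed and the weight is nudged against it, which is the step backpropagation applies layer by layer.

```python
def update_weight(w, x, target, lr=0.1):
    pred = w * x                      # forward pass (no bias, for brevity)
    grad = 2 * (pred - target) * x    # derivative of (pred - target)^2 w.r.t. w
    return w - lr * grad              # step downhill along the gradient

w = 0.0
for _ in range(50):
    w = update_weight(w, x=2.0, target=6.0)  # learn w such that w * 2 = 6
print(round(w, 3))  # converges to 3.0
```

Repeating this tiny update over many examples and many weights is, in essence, how training minimises the network's error.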
↔️ 7. Propagation Functions
🔹 Forward Propagation
Input data flows through the network layers; activations are computed; the predicted output is compared with the actual target — resulting in an error (loss).
🔹 Back Propagation
The error from forward propagation is propagated backwards through the network, and the weights are adjusted (typically via gradient descent) to reduce it. Proper tuning of the weights lowers the error rate, improving generalisation and making predictions more reliable over time.
2.1 Working of a Neural Network — The Formula
Each node is a simple calculator: it takes inputs, multiplies by weights, adds a bias, and produces an output — fed to the next node.
Σ(wᵢ · xᵢ) + bias = w₁x₁ + w₂x₂ + w₃x₃ + bias
Output: f(x) = 1 if Σ(wᵢxᵢ) + b ≥ 0; f(x) = 0 otherwise (or compared against a chosen threshold).
🔹 Feedforward Network
This process — each node passing its output to the next layer — defines the network as a feedforward network.
2.2 Worked Example — Two Cases
🧮 CASE I — Hidden Layer
Inputs: x₁ = 2, x₂ = 3, x₃ = 1. Weights: w₁ = 0.4, w₂ = 0.2, w₃ = 0.6. Bias = 0.1. Threshold = 3.0.
Σwᵢxᵢ + bias = (0.4·2) + (0.2·3) + (0.6·1) + 0.1 = 0.8 + 0.6 + 0.6 + 0.1 = 2.1
2.1 < 3.0 → output = 0 (neuron INACTIVE)
🧮 CASE II — Output Layer
Weights: w₁ = 0.7, w₂ = 0.3. Bias = 0.2. The hidden-layer output from Case I (= 0) is fed to both inputs.
Output = (0.7·0) + (0.3·0) + 0.2 = 0.2
Threshold for output layer = 0.1
0.2 > 0.1 → final output = 1 (neuron ACTIVE)
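Both cases can be re-checked with a few lines of Python (the helper name `fire` is illustrative):

```python
def fire(inputs, weights, bias, threshold):
    """Weighted sum plus bias, compared against the layer's threshold."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= threshold else 0

# Case I: hidden-layer node (sum = 2.1, below threshold 3.0)
case1 = fire([2, 3, 1], [0.4, 0.2, 0.6], bias=0.1, threshold=3.0)
print(case1)  # 0 -> neuron inactive

# Case II: output node receives Case I's output (0) on both inputs
case2 = fire([case1, case1], [0.7, 0.3], bias=0.2, threshold=0.1)
print(case2)  # 1 -> 0.2 exceeds 0.1, neuron active
```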
2.3 Real-World Example — Should I Go Surfing?
Three decision factors affect whether you go surfing:
| Factor | Input | Weight |
|---|---|---|
| Wave Quality (good?) | x₁ = 1 | w₁ = 5 (large swells rare) |
| Line-up Congestion (empty?) | x₂ = 0 | w₂ = 2 (used to crowds) |
| No recent Shark Activity? | x₃ = 1 | w₃ = 4 (fear of sharks) |
Threshold = 3 → bias = −3 (folding the threshold into the bias). Plug into the formula:
ŷ = (1·5) + (0·2) + (1·4) − 3 = 5 + 0 + 4 − 3 = 6
6 > 0 → output = 1 → GO SURFING!
(ŷ = "y-hat" denotes the predicted value.) Adjusting weights or the threshold yields different outcomes — allowing the model to be tuned for personal preferences.
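The surfing decision is a tiny perceptron, with the threshold folded into the bias (threshold 3 → bias −3) as in the table; the helper name `decide` is illustrative.

```python
def decide(x, w, bias):
    """Perceptron decision: fire when the biased weighted sum is non-negative."""
    y_hat = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return 1 if y_hat >= 0 else 0

weights = [5, 2, 4]                    # wave quality, empty line-up, no sharks
print(decide([1, 0, 1], weights, -3))  # good waves, crowded, no sharks -> 1 (go)
print(decide([0, 0, 0], weights, -3))  # bad waves, crowded, sharks -> 0 (stay home)
```

Changing a weight (say, lowering w₁ if big swells matter less to you) re-tunes the same model to a different set of preferences.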
2.4 Types of Neural Networks — 5 Main Types
① 1. Standard Neural Network (Perceptron)
Created by Frank Rosenblatt in 1958 — a simple NN with a single layer of input nodes fully connected to output nodes. Uses Threshold Logic Units (TLUs) as artificial neurons.
Application: binary classification — spam detection, basic decision-making.
➡️ 2. Feed Forward Neural Network (FFNN)
Also known as Multi-Layer Perceptron (MLP). Has input + one or more hidden layers + output. Data flows one direction only, from input to output. Uses activation functions and weights to process information in a forward manner.
Applications: image recognition · NLP · regression. Efficient for handling noisy data.
🖼️ 3. Convolutional Neural Network (CNN)
Uses filters to extract features from images — incorporates a three-dimensional arrangement of neurons, ideal for visual data.
Applications: computer vision — object detection · image recognition · style transfer · medical imaging.
🔁 4. Recurrent Neural Network (RNN)
Designed for sequential data: feedback loops let information persist across time steps. If a prediction is wrong, small adjustments are made during backpropagation (their size governed by the learning rate), so the network gradually moves towards the right prediction.
Applications: NLP (language modelling, machine translation, chatbots) · speech recognition · time-series prediction · sentiment analysis.
🎨 5. Generative Adversarial Network (GAN)
Consists of two networks:
- Generator — creates new data instances.
- Discriminator — evaluates them for authenticity.
Trained simultaneously under unsupervised learning — used to generate realistic data such as images and videos.
Applications: synthetic-data generation · image generation · style transfer · data augmentation.
2.5 Future of NNs & Impact on Society
- Efficiency & productivity — automates tasks, optimises resources across manufacturing, finance and more.
- Personalisation — tailored recommendations and experiences based on huge datasets.
- Economic growth — creates new jobs in data science and AI.
- Ethical concerns — data privacy, algorithmic bias, job displacement — demand careful consideration and regulation.
3.1 Activity 1 — Machine Learning for Kids (Animals & Birds)
Visit machinelearningforkids.co.uk. Steps:
- Create a project — Identifying Animals & Birds.
- Add labels (classes like "Cat", "Dog", "Parrot") and upload sample contents for each.
- Click Train — the platform trains a model.
- Test with new images.
- Click Describe your model to view the underlying neural-network structure.
- Click Next to see deep-learning working step-by-step.
3.2 Activity 2 — Celsius → Fahrenheit with TensorFlow
Formula: f = c × 1.8 + 32. Instead of writing this as a Python function, train a single-neuron NN to learn the relationship.
```python
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# Training data
c = np.array([-40, -10, 0, 8, 15, 22, 38], dtype=float)
f = np.array([-40, 14, 32, 46, 59, 72, 100], dtype=float)

# Model: 1 dense layer with 1 neuron, input shape (1,)
model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1])])

# Compile
model.compile(loss='mean_squared_error',
              optimizer=tf.keras.optimizers.Adam(0.1),
              metrics=['mean_squared_error'])

# Train — 500 epochs
history = model.fit(c, f, epochs=500, verbose=False)
print("Finished training the model")

# Plot the loss curve
plt.xlabel('Epoch Number')
plt.ylabel('Loss Magnitude')
plt.plot(history.history['loss'])
plt.show()

# Predict
print(model.predict(np.array([100.0])))  # ~ 212.0
print((100 * 1.8) + 32)                  # 212.0 (formula check)
```
🔹 Key TensorFlow Concepts
- input_shape=[1] — single numeric input (1-D array with 1 value).
- units=1 — one neuron in the layer.
- Loss function — measures how far predictions are from the target (the "loss").
- Optimizer function — adjusts internal values to reduce the loss (e.g., Adam).
3.3 Activity 3 — Creating a Neural Network with Python (Advanced)
A deeper ANN for the same Celsius-to-Fahrenheit problem using a CSV dataset + 2 hidden layers:
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

# Load dataset
temp_df = pd.read_csv('cel_fah.csv')
temp_df.head()

# Visualise
plt.scatter(temp_df['Celsius'], temp_df['Fahrenheit'])

X_train = temp_df['Celsius']
y_train = temp_df['Fahrenheit']

# Sequential model — 1 input, 2 hidden layers (32 neurons each), 1 output
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(units=32, input_shape=(1,)))
model.add(tf.keras.layers.Dense(units=32))
model.add(tf.keras.layers.Dense(units=1))
model.summary()

# Compile + train (30 epochs, 20% validation)
model.compile(optimizer=tf.keras.optimizers.Adam(0.1), loss='mean_squared_error')
epochs_hist = model.fit(X_train, y_train, epochs=30, validation_split=0.2)

# Deploy
Temp_C = float(input("Enter temperature in Celsius: "))
Temp_F = model.predict(np.array([Temp_C]))
print('Temperature in Fahrenheit (ANN) =', Temp_F)
```
model.summary() shows the architecture — layers, output shape of each layer, and number of trainable parameters.
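As a quick sanity check on that summary, the trainable-parameter count for the 1 → 32 → 32 → 1 stack can be computed by hand: each Dense layer has (inputs × units) weights plus one bias per unit.

```python
# (inputs, units) for each Dense layer in the model above
layers = [(1, 32), (32, 32), (32, 1)]

# weights = inputs * units, biases = units
params = [n_in * n_out + n_out for n_in, n_out in layers]
print(params)        # [64, 1056, 33]
print(sum(params))   # 1153 trainable parameters in total
```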
3.4 Activity 4 — TensorFlow Playground
Visit playground.tensorflow.org — a browser-based web app where a real neural network runs live. Tweak parameters and see how algorithms behave.
🔹 The 8 Key Parameters
📊 1. Data
Six pre-loaded datasets: Circle, Exclusive OR (XOR), Gaussian, Spiral, Plane, Multi-Gaussian. First four — classification; last two — regression. Blue dots = positive; orange = negative.
🔢 2. Features
Seven inputs — X₁, X₂, X₁², X₂², X₁·X₂, sin(X₁), sin(X₂). Toggle on/off to see which are most important.
⚖️ 3. Weights
Lines between neurons — blue = positive weight · orange = negative weight. Thickness represents magnitude. Output dots change colour / intensity based on confidence of prediction.
🔄 4. Epoch
One complete iteration through the dataset.
🚀 5. Learning Rate
Alpha — the speed at which the model learns. Too high → overshoot; too low → slow convergence. Start at 0.03.
🔘 6. Activation Function
Choose one of: Tanh · ReLU · Sigmoid · Linear.
🛡️ 7. Regularization
L1 and L2 regularisation — reduce/remove overfitting by penalising large weights.
📈 8. Output
Check model performance after training — observe the Test loss and Training loss.
🔹 Sample Classification Walk-through (XOR)
- Select the Exclusive OR (XOR) dataset.
- Set ratio of training : test = 60 : 40.
- Noise = 5, batch size = 10.
- With features X₁ and X₂ only → Training loss ≈ 0.004, Test loss ≈ 0.002, Steps 255.
- Add the product X₁ · X₂ → Training loss ≈ 0.001, Test loss ≈ 0.001, Steps 102 — much better!
- Set learning rate = 0.03 and experiment.
- Select 2 hidden layers — 4 neurons in the first, 2 in the second, then output.
- Observe the weight thicknesses and final Train / Test loss.
- Neural network → ML model inspired by the human brain.
- Neurons → nodes that make up the layers of a neural network (also functions that process inputs).
- Role of activation functions → introduce non-linearity, allowing complex patterns.
- Backpropagation → adjusts weights and biases to minimise error.
- NN for image recognition → Convolutional Neural Network (CNN).
- Neural networks learn by adjusting weights and biases based on prediction error.
Quick Revision — Key Points to Remember
- Neural Network = ML model inspired by biological neurons; ANN = its AI form.
- Google's search algorithm is a famous NN application.
- 3 Layers: Input · Hidden (1+) · Output.
- Deep Neural Network = ANN with ≥ 2 hidden layers → Deep Learning.
- 7 Components: Neurons · Weights · Activation Functions (Sigmoid/Tanh/ReLU) · Bias · Connections · Learning Rule · Propagation Functions.
- Forward Propagation: input → activation → loss.
- Backpropagation = backward propagation of errors → adjusts weights via gradient descent each epoch.
- Formula: Σ(wᵢ · xᵢ) + bias → activation → output (1 if ≥ threshold, else 0).
- Worked example: Surfing → ŷ = 6 > threshold 3 → output 1 (GO!).
- 5 Types of NN: Perceptron (Rosenblatt 1958, TLUs) · FFNN/MLP · CNN (vision) · RNN (sequences/NLP) · GAN (Generator + Discriminator).
- CNN — 3-D neuron arrangement, filters, used in object detection & medical imaging.
- RNN — feedback loops, sequential data, NLP + speech + time-series.
- GAN — two networks trained together to produce realistic synthetic data.
- Society impact: efficiency · personalisation · new jobs + concerns on privacy, bias, displacement.
- Hands-on tools: Machine Learning for Kids · TensorFlow (Celsius→Fahrenheit) · TensorFlow Playground.
- TF Playground parameters: Data · Features · Weights · Epoch · Learning Rate · Activation (Tanh/ReLU/Sigmoid/Linear) · Regularization (L1/L2) · Output (Test/Train loss).