How Deep Learning works: understanding artificial neural networks and other Deep Learning architectures

Deep learning is a subfield of machine learning that uses artificial neural networks to learn from data. Neural networks are inspired by the structure and function of the human brain, and they are able to learn complex patterns and relationships in data. Deep learning has achieved impressive results in many different tasks, such as recognizing images, understanding and generating text, and translating languages.

I-Artificial neural networks (ANNs)

Artificial neural networks (ANNs) are a type of machine learning algorithm inspired by the structure and function of the human brain. ANNs consist of layers of interconnected nodes that process information. These layers typically include an input layer, one or more hidden layers, and an output layer. Each node receives input from nodes in the previous layer and produces an output signal, which is then passed on to the nodes in the next layer. The output is computed by applying a non-linear function, called the activation function, to the weighted sum of the node's inputs. The connections between nodes are called edges, and each edge has a weight that is adjusted as learning proceeds. The weight increases or decreases the strength of the signal at a connection.
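To make this concrete, here is a minimal sketch of what a single node computes. The input values, weights, bias, and the choice of the ReLU activation are illustrative assumptions, not values from any real network:

import numpy as np

# Illustrative inputs, weights, and bias for one node (arbitrary values)
inputs = np.array([0.5, -1.2, 3.0])
weights = np.array([0.8, 0.1, -0.4])
bias = 0.2

# Weighted sum of the inputs plus the bias
z = np.dot(inputs, weights) + bias

# Non-linear activation function (here ReLU, a common choice)
output = max(0.0, z)

print(output)  # the signal passed on to the next layer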

https://otexts.com/fpp2/nnet2.png

In this illustration, we have a simple artificial neural network. We call it 'simple' because it contains just one or two hidden layers. An ANN can also contain more than two hidden layers; in that case, we talk about a deep neural network, also known as a multilayer perceptron (MLP).

https://cdn-images-1.medium.com/max/1600/1*3fA77_mLNiJTSgZFhYnU0Q@2x.png

The main difference between a simple artificial neural network (ANN) and a deep neural network (DNN) is the number of hidden layers. DNNs can learn more complex patterns and relationships in data than simple ANNs because each hidden layer learns a different set of features, and stacking layers on top of each other allows the network to build increasingly complex features from simpler ones.
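For example, making a network deeper is simply a matter of stacking more hidden layers. The sketch below contrasts the two cases using the Keras API that appears later in this article; the layer sizes and the 784-dimensional input are illustrative assumptions:

import tensorflow as tf

# A 'simple' network with a single hidden layer (sizes are illustrative)
shallow = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')
])

# A deep network: the same idea, with several hidden layers stacked
deep = tf.keras.models.Sequential([
    tf.keras.layers.Dense(256, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])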

ANNs are trained using a process called backpropagation. In backpropagation, the network is given a set of input data and the desired output. The network makes a prediction, and the error between the prediction and the desired output is calculated. The error is then propagated back through the network, and the weights of the connections are adjusted to reduce it.

This process is repeated until the network is able to make accurate predictions for the training data. Once the network is trained, it can be used to make predictions for new data.
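As a rough illustration, here is a minimal sketch of one training step in TensorFlow: a forward pass, an error calculation, backpropagation of gradients, and a weight update. The tiny model, the made-up data, and the learning rate are assumptions chosen only for the example:

import tensorflow as tf

# A tiny one-layer model and one made-up training example (illustrative values)
tiny_model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(3,))])
x = tf.constant([[0.5, -1.2, 3.0]])   # input data
y = tf.constant([[1.0]])              # desired output

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

# One training step
with tf.GradientTape() as tape:
    prediction = tiny_model(x)                         # forward pass: make a prediction
    error = tf.reduce_mean((prediction - y) ** 2)      # error between prediction and desired output
gradients = tape.gradient(error, tiny_model.trainable_variables)              # propagate the error back
optimizer.apply_gradients(zip(gradients, tiny_model.trainable_variables))     # adjust the weights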

1- Use cases of ANNs:

Here are some examples of how ANNs are being used in the real world:

  • Image recognition: ANNs are used in image recognition tasks such as self-driving cars, facial recognition, and medical imaging.
  • Natural language processing: ANNs are used in natural language processing tasks such as machine translation, text summarization, and question answering.
  • Machine translation: ANNs are used in machine translation tasks to translate text from one language to another.
  • Speech recognition: ANNs are used in speech recognition tasks to convert speech to text.
  • Recommender systems: ANNs are used in recommender systems to recommend products, movies, and other items to users.

2- How to implement an artificial neural network (ANN)?

To implement an artificial neural network (ANN), we need to:

  1. Define the architecture of the network. This includes specifying the number of layers in the network, the number of neurons in each layer, and the activation function for each layer.
  2. Initialize the weights of the network. This can be done randomly or using a pre-trained model.
  3. Train the network. This involves feeding the network training data and adjusting the weights of the network to minimize the error between the predicted output and the desired output.
  4. Test the network. This involves feeding the network test data and evaluating the performance of the network on the test data.

Here is a simple example of how to implement an ANN in Python using the TensorFlow library:

import tensorflow as tf

# Load the MNIST dataset of handwritten digits, flatten each 28x28 image
# into a vector of 784 pixel values, and scale the values to the range [0, 1]
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784) / 255.0
x_test = x_test.reshape(-1, 784) / 255.0

# Define the architecture of the network
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10)

# Test the model
test_loss, test_accuracy = model.evaluate(x_test, y_test)

# Print the test accuracy
print('Test accuracy:', test_accuracy)

This code implements a simple ANN with one hidden layer of 128 neurons and a softmax output layer. The model is trained on MNIST, a dataset of handwritten digits, and is then evaluated on the test data to see how well it performs on unseen data.
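Once trained, the model can also be used to make predictions on new examples. Here is a short sketch that reuses the trained model and test data from the code above; the choice of the first five test images is just for illustration:

import numpy as np

# Predict the digit classes of a few unseen test images
probabilities = model.predict(x_test[:5])          # one probability per digit class
predicted_digits = np.argmax(probabilities, axis=1)
print('Predicted digits:', predicted_digits)
print('Actual digits:   ', y_test[:5])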

II-Other Deep Learning Architectures

In addition to artificial neural networks, there are a number of other deep learning architectures, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), Long Short-Term Memory networks (LSTMs), and Generative Adversarial Networks (GANs).

  • CNNs are specifically designed for image recognition tasks. They are able to learn spatial features in images, such as edges and corners (a minimal CNN definition is sketched after this list).
  • RNNs are designed for tasks that involve sequential data, such as natural language processing and machine translation. They are able to learn temporal relationships in the data.
  • Long Short-Term Memory (LSTM) Networks: An extension of RNNs, LSTMs are adept at capturing long-range dependencies in sequences. They are particularly valuable in tasks where contextual information matters, such as sentiment analysis and speech synthesis.
  • Generative Adversarial Networks (GANs) consist of two neural networks, a generator, and a discriminator, that compete with each other. GANs are renowned for their ability to generate realistic images, videos, and text. They have applications in art generation, data augmentation, and more.
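To give a sense of how such architectures look in practice, here is a minimal sketch of a small CNN defined with the same Keras API used above. The layer sizes and the assumption of 28x28 grayscale input images are illustrative choices, not a reference implementation:

import tensorflow as tf

# A small convolutional network for 28x28 grayscale images (illustrative sizes)
cnn = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(28, 28, 1)),    # learns local spatial features such as edges
    tf.keras.layers.MaxPooling2D((2, 2)),               # reduces the spatial resolution
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),  # builds more complex features on top
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax')     # one probability per class
])

cnn.compile(optimizer='adam',
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])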

Deep learning is a powerful subset of artificial intelligence that leverages artificial neural networks and specialized architectures to solve complex problems. By understanding the inner workings of these networks and staying abreast of the latest developments, we can harness the full potential of deep learning to drive innovation and transform industries. So, whether you’re an aspiring data scientist or simply curious about the technology shaping our future, deep learning is a captivating field worth exploring.

--

Mariam Kili Bechir/ Techgirl_235

All that you need to know about Data Science is here. Don't hesitate to read, share, and leave a comment.