With the rapid development of deep learning, a whole range of neural network architectures have been created to solve a wide variety of tasks. Although there are countless architectures, here are 11 that every deep learning engineer should know, divided into four major categories: standard networks, recurrent networks, convolutional networks, and autoencoders.
Standard Networks

1. Perceptron

The perceptron is the most basic of all neural networks and the basic building block of more complex architectures. It simply connects an input unit to an output unit.

2. Feedforward Network

A feedforward network is a collection of perceptrons organized into three basic types of layers: the input layer, hidden layers, and the output layer. At each connection, the signal from the previous layer is multiplied by a weight, added to a bias, and passed through an activation function. Feedforward networks use backpropagation to iteratively update the parameters until the desired performance is achieved.

3. Residual Network (ResNet)

One problem with deep feedforward networks is the vanishing gradient problem, which occurs when the network is too deep to backpropagate useful information through all of its layers. As the signal that updates the parameters propagates through the network, it gradually shrinks until the weights at the front of the network are barely changed at all. To address this, residual networks employ skip connections, which propagate signals across "skipped" layers. Because these connections are less susceptible to vanishing gradients, the problem is reduced. Over time, the network learns to recover the skipped layers as it learns the feature space, and it is more efficient to train because it suffers less from vanishing gradients and needs to explore less of the feature space.

Recurrent Networks

4. Recurrent Neural Network (RNN)

A recurrent neural network is a special type of network that contains loops and feeds back into itself, hence the name "recurrent". RNNs allow information to be stored in the network, using context from earlier in a sequence to make better, more informed decisions about what comes next. To do this, they use previous predictions as "contextual signals".
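The perceptron's "multiply by weights, add bias, apply activation" step can be sketched in a few lines of plain Python. This is a minimal illustration, not from the article: the step activation and the hand-picked AND weights are assumptions chosen for the example.

```python
def step(x):
    # Heaviside step activation used by the classic perceptron
    return 1 if x >= 0 else 0

def perceptron(inputs, weights, bias):
    # multiply each input by its weight, add the bias,
    # then pass the sum through the activation function
    total = sum(w * x for w, x in zip(weights, inputs))
    return step(total + bias)

# hand-picked weights that make the unit compute logical AND
print(perceptron([1, 1], [0.5, 0.5], -0.7))  # -> 1
print(perceptron([1, 0], [0.5, 0.5], -0.7))  # -> 0
```

A feedforward network is just many of these units stacked in layers, with the weights and biases learned by backpropagation instead of picked by hand.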
Due to this design, RNNs are often used for sequential tasks, such as generating text letter by letter or predicting time-series data (such as stock prices). They can also handle inputs of arbitrary length.

5. Long Short-Term Memory Network (LSTM)

RNNs are problematic because in practice their usable range of contextual information is very limited. The effect (the backpropagated error) of a given input on the hidden layers, and therefore on the network output, either blows up or decays exponentially as it cycles through the recurrent connections. The solution to this vanishing gradient problem is the Long Short-Term Memory network, or LSTM. This RNN architecture is specifically designed to address vanishing gradients by building memory blocks into the structure. These blocks can be thought of as memory chips in a computer: each contains several recurrently connected memory cells and three gates (input, output, and forget, equivalent to write, read, and reset). The network can interact with a cell only through its gates, so the gates learn to open and close intelligently, preventing the gradient from exploding or vanishing while propagating useful information through a "constant error carousel" and discarding irrelevant memory content.

6. Echo State Network (ESN)

The echo state network is a variant of the recurrent neural network with a very sparsely connected hidden layer (typically around one percent connectivity). The connectivity and weights of the hidden "reservoir" neurons are randomly assigned and left fixed; only the weights of the output neurons are learned, so that the network can produce and reproduce specific temporal patterns. The rationale behind this design is that, although the network is nonlinear, the only weights modified during training are those of the output layer, so training reduces to a simple linear problem.

Convolutional Networks
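The gate mechanism described above can be sketched as a toy one-dimensional LSTM cell. This is an illustrative sketch only: the scalar weight names (`wf`, `uf`, `bf`, etc.) are assumptions, and real LSTMs use weight matrices over whole vectors.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, h_prev, c_prev, w):
    # w holds scalar weights and biases for each gate of a 1-D cell
    f = sigmoid(w['wf'] * x + w['uf'] * h_prev + w['bf'])   # forget gate ("reset")
    i = sigmoid(w['wi'] * x + w['ui'] * h_prev + w['bi'])   # input gate ("write")
    o = sigmoid(w['wo'] * x + w['uo'] * h_prev + w['bo'])   # output gate ("read")
    g = math.tanh(w['wg'] * x + w['ug'] * h_prev + w['bg']) # candidate content
    c = f * c_prev + i * g   # additive update: the "constant error carousel"
    h = o * math.tanh(c)     # hidden state exposed to the rest of the network
    return h, c

# With the forget gate saturated open and the input gate saturated shut,
# the stored cell state passes through the step unchanged.
w = {'wf': 0.0, 'uf': 0.0, 'bf': 100.0,   # forget gate ~1 (keep memory)
     'wi': 0.0, 'ui': 0.0, 'bi': -100.0,  # input gate ~0 (write nothing)
     'wo': 0.0, 'uo': 0.0, 'bo': 0.0,
     'wg': 0.0, 'ug': 0.0, 'bg': 0.0}
h, c = lstm_cell_step(0.3, 0.0, 2.0, w)
print(c)  # -> 2.0
```

Because the cell state is updated additively rather than by repeated multiplication, the gradient flowing through `c` does not shrink at every step, which is exactly what lets the gates protect long-range context.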
7. Convolutional Neural Network (CNN)

Images have high dimensionality, so training a standard feedforward network to recognize images would require thousands of input neurons, which, besides being blatantly computationally expensive, invites many problems associated with the curse of dimensionality. Convolutional neural networks (CNNs) provide a solution by using convolutional and pooling layers to reduce the dimensionality of images. Because a convolutional layer is trainable but has far fewer parameters than a standard hidden layer, it can highlight the important parts of an image and pass them forward. Traditionally, the last few layers of a CNN are hidden layers that process the "compressed image information". CNNs excel at image-based tasks, such as classifying an image as a dog or a cat.

8. Deconvolutional Neural Network (DNN)

As the name implies, a deconvolutional neural network does the opposite of a convolutional neural network. Instead of performing convolutions to reduce an image's dimensionality, a DNN uses deconvolutions to create an image, usually from noise. This is an inherently harder task: consider a CNN tasked with writing a three-sentence summary of Orwell's 1984, versus a DNN tasked with writing the entire book from a three-sentence summary.

9. Generative Adversarial Network (GAN)

Generative adversarial networks are a special type of network designed specifically for generating images, and they consist of two networks: a discriminator and a generator. The discriminator's task is to distinguish whether an image was drawn from the dataset or produced by the generator, while the generator's task is to generate images convincing enough that the discriminator cannot tell whether they are real.
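The core operation a convolutional layer performs can be sketched in plain Python: slide a small kernel over the image and take a weighted sum at each position. This is an illustrative sketch, assuming a "valid" convolution with no padding or stride (and, as in most deep learning libraries, it is technically cross-correlation, since the kernel is not flipped); the vertical-edge kernel is a hand-picked example, not a learned one.

```python
def conv2d(image, kernel):
    # "valid" 2-D convolution: the kernel stays entirely inside the image
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # weighted sum of the kh x kw window at position (i, j)
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(s)
        out.append(row)
    return out

# A 4x4 image with a dark/bright vertical split, and a vertical-edge kernel
img = [[0, 0, 9, 9],
       [0, 0, 9, 9],
       [0, 0, 9, 9],
       [0, 0, 9, 9]]
edge = [[-1, 0, 1],
        [-1, 0, 1],
        [-1, 0, 1]]
print(conv2d(img, edge))  # -> [[27, 27], [27, 27]]
```

Note that the 4x4 input became a 2x2 output: this shrinking, plus the fact that one small kernel is reused at every position, is how convolutional layers cut both dimensionality and parameter count while a uniform region (try an all-5s image) maps to zeros and the edge lights up.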
Over time, with careful training, the two opponents compete against each other, each pushing the other to improve. The end result is a well-trained generator that can produce realistic images. The discriminator is a convolutional neural network whose goal is to maximize its accuracy in identifying real and fake images, while the generator is a deconvolutional neural network whose goal is to minimize the discriminator's performance.

Autoencoders

10. Autoencoder (AE)

The basic idea of an autoencoder is to take high-dimensional data, "compress" it into a highly informative low-dimensional representation, and then decode that compressed form back into the original space. Autoencoders have many applications, including dimensionality reduction, image compression, data denoising, feature extraction, image generation, and recommender systems. They can be used in both unsupervised and supervised settings and can be very revealing about the nature of the data. The hidden units can be replaced with convolutional layers for processing images.

11. Variational Autoencoder (VAE)

While an autoencoder learns a compressed representation of its input (which may be an image or a text sequence) by first compressing and then decompressing it to match the original, a variational autoencoder (VAE) learns the parameters of a probability distribution that represents the data. Rather than just learning a function to represent the data, it gains a more detailed and nuanced view of it, and can sample from the distribution to generate new input samples. In this sense it is more of a purely "generative" model, like a GAN. A VAE uses probabilistic hidden units that apply a radial basis function to the difference between a test sample and the unit's mean.
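The compress-then-reconstruct idea behind autoencoders can be shown with a toy linear autoencoder trained by plain gradient descent. This is purely illustrative, and every detail here is an assumption for the sketch: real autoencoders use nonlinear layers and a framework, and the data below is deliberately laid on the line y = 2x so that a one-number code can reconstruct each two-number point.

```python
# Toy linear autoencoder: encode 2-D points to a 1-D code and decode back.
data = [(1.0, 2.0), (2.0, 4.0), (-1.0, -2.0), (0.5, 1.0)]

we = [0.1, 0.1]  # encoder weights (2 inputs -> 1 code)
wd = [0.1, 0.1]  # decoder weights (1 code -> 2 outputs)
lr = 0.01

for _ in range(2000):
    for x1, x2 in data:
        z = we[0] * x1 + we[1] * x2      # encode ("compress")
        r1, r2 = wd[0] * z, wd[1] * z    # decode ("reconstruct")
        e1, e2 = r1 - x1, r2 - x2        # reconstruction error
        # gradient descent on the squared reconstruction error
        g = e1 * wd[0] + e2 * wd[1]
        wd[0] -= lr * e1 * z
        wd[1] -= lr * e2 * z
        we[0] -= lr * g * x1
        we[1] -= lr * g * x2

# Encode and decode an unseen point on the same line
x1, x2 = 1.5, 3.0
z = we[0] * x1 + we[1] * x2
print(wd[0] * z, wd[1] * z)  # close to (1.5, 3.0)
```

The network is never told the rule y = 2x; it discovers it simply by being forced to squeeze each point through a one-dimensional bottleneck and reproduce it, which is the essence of what the higher-dimensional, nonlinear versions do.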