What is a neural network?
Artificial neural networks (ANNs), more commonly referred to simply as neural networks (NNs), are computing systems inspired by the biological neural networks that constitute human brains.
Neural networks: A brief history
Neural networks may seem new and exciting, but the field itself is not new at all. In 1958, the American psychologist Frank Rosenblatt conceptualized and attempted to build a machine that would respond like the human mind. He named his machine the “Perceptron.”
For all practical purposes, artificial neural networks learn by example, in a manner similar to their biological counterparts: external inputs are received, processed, and acted upon in a way that loosely mirrors the human brain.
The layered structure of neural networks
We know that different sections of the human brain are wired to process different kinds of information. These parts of the brain are arranged hierarchically, in levels. As information enters the brain, each layer, or level, of neurons does its particular job of processing the incoming information, deriving insights, and passing them on to the next, higher level. For example, when you walk past a bakery, your brain responds to the aroma of freshly baked bread in stages:
- Data input: The smell of freshly baked bread
- Thought: That reminds me of my childhood
- Decision making: I think I’ll buy some of that bread
- Memory: But I’ve already eaten lunch
- Reasoning: Maybe I could have a snack
- Action: Can I have a loaf of that bread, please?
This is how the brain works in stages. Artificial neural networks operate in a similar manner: they simulate this multi-layered approach to processing varied information inputs and basing decisions on them.
At the cellular, or individual-neuron, level, the functions are fine-tuned. Neurons are the nerve cells in the brain. They have fine extensions known as dendrites, which receive signals and transmit them to the cell body. The cell body processes the stimuli and decides whether to trigger signals to other neurons in the network. If it does, the extension on the cell body known as the axon conducts the signal to other cells via chemical transmission. The workings of artificial neural networks are inspired by the function of these neurons, though the technological mechanism of action differs from the biological one.
How neural networks function similarly to the human brain
An artificial neural network in its most basic form has three layers of neurons. Information flows from one to the next, just as it does in the human brain:
- The input layer: the data’s entry point into the system
- The hidden layer: where the information gets processed
- The output layer: where the system decides how to proceed based on the data
More complex artificial neural networks will have multiple layers, some hidden.
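As a minimal, hedged sketch of this three-layer flow (the sizes and values below are illustrative assumptions, not part of any particular system), a forward pass can be written in a few lines of Python:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # A common nonlinear activation: pass positive signals, zero out the rest
    return np.maximum(0.0, z)

x = rng.normal(size=3)                          # input layer: 3 incoming features
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)   # weights into the hidden layer
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)   # weights into the output layer

hidden = relu(x @ W1 + b1)   # hidden layer: where the information gets processed
output = hidden @ W2 + b2    # output layer: scores the system can act on
print(output)
```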
The neural network functions via a collection of connected units, or nodes, called artificial neurons. These nodes loosely model the neurons of the animal brain. Just like its biological counterpart, an artificial neuron receives a signal in the form of a stimulus, processes it, and signals other neurons connected to it.
But the similarities end there.
The neuronal workings of an artificial neural network
In an artificial neural network, the artificial neuron receives a stimulus in the form of a signal that is a real number. Then:
- The output of each neuron is computed by a nonlinear function of the sum of its inputs.
- The connections among the neurons are called edges.
- Both neurons and edges have a weight, a parameter that adjusts as learning proceeds.
- The weight increases or decreases the strength of the signal at a connection.
- Neurons may have a threshold. A signal is sent onward only if the aggregate signal crosses this threshold.
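Putting those pieces together, a single artificial neuron can be sketched as follows. This is a hedged illustration using a simple step activation; the function name and values are ours, not a standard API:

```python
import numpy as np

def artificial_neuron(inputs, weights, bias, threshold=0.0):
    # Weighted sum of the incoming signals along each edge
    aggregate = np.dot(weights, inputs) + bias
    # Fire (send a signal onward) only if the aggregate crosses the threshold
    return 1.0 if aggregate > threshold else 0.0

# Example: two inputs, one suppressed by a negative weight
print(artificial_neuron(np.array([0.8, 0.3]), np.array([0.9, -0.4]), bias=0.1))
```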
As mentioned earlier, neurons aggregate into layers, and different layers may perform different transformations on their inputs. Signals travel from the first layer (the input layer) to the last layer (the output layer) in the manner discussed above, sometimes after traversing intermediate layers multiple times.
Neural networks inherently contain some manner of a learning rule, which modifies the weights of the neural connections in accordance with the input patterns they are presented with, just as a growing child learns to recognize animals from examples of animals.
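One classic example of such a learning rule is the perceptron update, in the spirit of Rosenblatt’s machine mentioned earlier. The sketch below is illustrative (the function name and learning rate are our assumptions): when the neuron’s thresholded prediction is wrong, each weight is nudged in the direction that reduces the error.

```python
import numpy as np

def perceptron_update(w, b, x, target, lr=0.1):
    # Thresholded prediction from the current weights
    prediction = 1 if np.dot(w, x) + b > 0 else 0
    error = target - prediction          # +1, 0, or -1
    # Strengthen or weaken each connection in proportion to its input
    w = w + lr * error * x
    b = b + lr * error
    return w, b
```

Applied repeatedly over many labeled examples, this simple rule gradually shapes the weights to match the input patterns, much as the analogy of the growing child suggests.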
Neural networks and deep learning
It is impossible to talk about neural networks without mentioning deep learning. The terms “neural networks” and “deep learning” are often used interchangeably, although they are distinct. The two are closely connected, however, as one depends on the other to function. If neural networks did not exist, neither would deep learning:
- Deep learning forms the cutting edge of a field already at the forefront of technology: artificial intelligence (AI).
- Deep learning is a specialized branch of machine learning, the broader discipline of teaching computers to process and learn from data.
- With deep learning, the computer continually trains itself to process data, learn from it, and build more capabilities. The multiple layers of more complex artificial neural networks are what make this possible.
- Complex neural networks contain an input layer and an output layer just like simple-form neural networks, but they also pack in multiple hidden layers. They are therefore called deep neural networks, and they are what make deep learning possible (see the sketch after this list).
- A deep learning system teaches itself and becomes more “knowledgeable” as it goes along, filtering information through multiple hidden layers, in a manner similar to the human brain with all its complexities.
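As a hedged, concrete illustration of these stacked hidden layers, scikit-learn’s MLPClassifier can build a small deep network; the layer sizes and synthetic data below are arbitrary choices for demonstration:

```python
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Synthetic, nonlinearly separable data
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Three hidden layers between the input and output layers make this network "deep"
model = MLPClassifier(hidden_layer_sizes=(32, 16, 8), max_iter=2000, random_state=0)
model.fit(X, y)
print(model.score(X, y))   # training accuracy
```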
Why deep learning matters for organizations
Deep learning is like the new gold rush or the latest oil discovery in the tech world. The potential of deep learning has piqued the interest of big, established corporations as well as nascent startups and everything in between. Why?
It is part of the data-driven big picture, particularly given the rising importance of big data. If you think of internet-derived data as crude oil stored in databases, data warehouses, and data lakes, waiting to be drilled into with various data analytics tools, deep learning is the refinery that converts that crude data into finished products you can use.
Deep learning is the endgame in a market flooded with analytical tools sitting on a hotbed of data: without an efficient, state-of-the-art processing engine, extracting anything of value is simply not possible.
Deep learning has the potential to replace humans by automating repetitive tasks. However, deep learning cannot replace the thought processes of a human scientist or engineer creating and maintaining deep learning applications.
Making the distinction between machine learning and other kinds of learning
Machine learning
When it comes to the how of machine learning, it is all about training learning algorithms such as linear regression, K-means, decision trees, random forests, K-nearest neighbors (KNN), and support vector machines (SVMs).
These algorithms sift through datasets, learning as they go along to adapt to new situations and look for interesting and insightful data patterns. Data is the key substrate for these algorithms to function at their best.
Supervised learning
The datasets used to train machine learning models can be labeled: the dataset comes with an answer sheet informing the computer of the right answer. For example, a computer scanning an inbox for spam can refer to a labeled dataset to learn which emails are spam and which are legitimate. This is called supervised learning. Supervised regression or classification is accomplished with algorithms such as linear regression and K-nearest neighbors, as in the sketch below.
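A minimal sketch of supervised learning, assuming scikit-learn and its bundled iris dataset as the labeled “answer sheet”:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)      # features plus their correct labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)            # learn from the labeled examples
print(model.score(X_test, y_test))     # accuracy on unseen data
```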
Unsupervised learning
When datasets are not labeled, and algorithms like K-means are directed to find cluster patterns without the benefit of any reference sheet, it is called unsupervised learning.
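A comparable sketch of unsupervised learning, again assuming scikit-learn; the labels produced by make_blobs are deliberately discarded, so K-means must discover the clusters on its own:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # labels thrown away

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(X)                    # cluster patterns found without an answer sheet
print(kmeans.cluster_centers_)
```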
Neural networks and fuzzy logic
As an aside, it is also important to make the distinction between neural networks and fuzzy logic. Fuzzy logic allows a system to make concrete decisions based on imprecise or ambiguous data. Neural networks, on the other hand, attempt to incorporate human-like thinking processes to solve problems without first designing mathematical models.
How do neural networks differ from conventional computing?
To better understand how computing works with an artificial neural network, one must first understand how a conventional “serial” computer and its software process information.
A serial computer has a central processor that can address an array of memory locations where data and instructions are stored. The processor reads instructions and any data the instruction needs from within memory addresses. The instruction is then executed and the results saved in a specified memory location.
In a serial system or a standard parallel one, the computational steps are deterministic, sequential, and logical. Furthermore, the state of a given variable can be tracked from one operation to another.
The workings of neural networks
In contrast, artificial neural networks are neither sequential nor necessarily deterministic. They contain no complex central processor. Instead, they are made up of many simple processing units, each of which takes a weighted sum of its inputs from other units.
Neural networks do not execute programmed instructions. They respond in parallel (either simulated or actual) to the pattern of inputs presented to them.
Neural networks do not contain any separate memory addresses for data storage. Instead, information is contained in the overall activation state of the network. Knowledge is represented by the network itself, which is quite literally more than the sum of its individual components.
Advantages of neural networks over conventional techniques
Neural networks can be expected to self-train quite efficiently on problems where the relationships are dynamic or nonlinear. This ability is further enhanced if the internal data patterns are strong, and it also depends to some extent on the application itself.
Neural networks are an analytical alternative to standard techniques, which are often limited by strict assumptions of linearity, normality, and variable independence.
The ability of neural networks to examine a variety of relationships makes it easier for the user to quickly model phenomena that may have been quite difficult, or even impossible, to comprehend otherwise.
Limitations of neural networks
There are some specific issues potential users should be aware of, particularly in connection with backpropagation neural networks and certain other types of networks.
Process is not explainable
Backpropagation neural networks have been referred to as the ultimate black box. Apart from outlining the general architecture and possibly providing some random numbers for seeding, all the user needs to do is supply the input, keep an eye on the training, and then receive the output. Some software packages let users sample the network’s progress over time, but the learning itself progresses on its own.
The final output is a trained network that is autonomous in the sense that it does not provide equations or coefficients defining a relationship beyond its own, internal mathematics. The network itself is the final equation of the relationship.
Slower to train
In addition, backpropagation networks tend to be slower to train than other types of networks, sometimes requiring thousands of epochs. This is because the machine’s processor must compute the function of each node and connection separately, which can be highly cumbersome in very large networks trained on huge amounts of data. Contemporary machines are often fast enough to sidestep this issue, though.
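To make the epoch-by-epoch cost concrete, here is a minimal, hedged backpropagation loop in plain NumPy, training a tiny two-layer network on the classic XOR problem; the architecture, learning rate, and epoch count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR: a small but nonlinear problem
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for epoch in range(5000):              # thousands of epochs, as noted above
    # Forward pass through both layers
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the error from the output layer to the hidden layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Gradient-descent updates for every weight and bias, computed separately
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(out.round(2))                    # should approach [[0], [1], [1], [0]]
```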
Applications of neural networks
Neural networks are universal approximators, and they work best if the system being modeled has a high tolerance for error.
Neural networks are useful:
- For understanding associations or discovering regular elements within a set of patterns
- Where the data is enormous either in volume or in the diversity of parameters
- Where relationships between variables are only vaguely understood
- Where conventional approaches fall short in describing relationships
This beautiful, biology-inspired paradigm is one of the most elegant technological developments of our era.