Many of us have heard the terms AI, machine learning and deep learning, and some of us have also heard that they can have a big impact on cyber security. But what are AI, machine learning and deep learning, actually? And how can they improve cyber security? This article discusses both questions.
What is Artificial Intelligence?
Artificial Intelligence, or AI, is the science and engineering of making machines intelligent, so that they can perform tasks that would otherwise require human intelligence. It can give machines the ability to learn without being explicitly programmed. For example, a machine can take the facts about a specific situation and decide on an action that moves it towards a goal: it can look at the previous moves in a game of chess and decide what the best next move is. Likewise, given general facts about the world, facts about a particular situation and a statement of a goal, a machine can use AI to plan a strategy or sequence of actions that achieves that goal.
Artificial Intelligence is widely used in many areas, like:
- Playing games like chess
- Speech Recognition
- Understanding natural language
- Computer vision
- Building expert systems
What is Machine Learning?
Machine learning is a sub-field of AI that gives machines the ability to learn from data and make predictions based on what they have learned. For example, a machine can learn from a set of inputs and their corresponding outputs, and then predict the output for a new input. Applications of machine learning include spam filtering, optical character recognition, search engines, computer vision and cyber security.
There are three main types of machine learning algorithms:
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
What is Supervised Learning?
In this technique, the machine is provided with a set of inputs and their corresponding outputs, and it learns general rules that map the inputs to the outputs. Typically, the algorithm iteratively makes predictions on the training inputs and adjusts itself based on how far each prediction is from the known output. It stops when an acceptable level of performance is achieved. This is called supervised learning because the labeled training dataset supervises the learning process.
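The predict, compare and adjust loop described above can be sketched with a perceptron, one of the simplest supervised learners. This is only an illustration under stated assumptions: the toy dataset, the function names `predict` and `train`, and the learning rate are all hypothetical and do not come from any particular product or library.

```python
# Labeled training data: (inputs, expected output). Toy values, purely illustrative.
training_data = [
    ((0.0, 0.0), 0),
    ((0.0, 1.0), 0),
    ((1.0, 0.0), 0),
    ((1.0, 1.0), 1),
]

def predict(weights, bias, inputs):
    """Return 1 if the weighted sum of the inputs is above zero, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train(data, learning_rate=1.0, max_epochs=100):
    """Repeat predict -> compare with label -> adjust until no mistakes remain."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for inputs, expected in data:
            error = expected - predict(weights, bias, inputs)
            if error != 0:
                errors += 1
                # Feedback step: nudge the weights towards the correct answer.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if errors == 0:  # acceptable performance reached on the training data
            break
    return weights, bias

weights, bias = train(training_data)
predictions = [predict(weights, bias, inputs) for inputs, _ in training_data]
print(predictions)
```

Here "acceptable performance" simply means zero mistakes on the training set; real systems usually measure performance on separate held-out data instead.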
What is Unsupervised Learning?
In unsupervised learning, the machine is provided with only the input data, with no labels attached. The goal is to learn the underlying structure or distribution of the data and to draw conclusions about similar inputs based on that. For example, it can extract features from the input dataset and divide the data into groups of similar items, so that when a new data point arrives, it can be assigned to the group it most resembles. A common application is an ecommerce website, where machine learning can divide customers into segments, and inferences drawn from those segments can be used in a marketing campaign.
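As a rough sketch of the customer segmentation idea, the snippet below groups unlabeled customers with k-means, one common clustering algorithm (the article does not name a specific one, so this choice is an assumption). The customer tuples (orders per year, average order value), the starting centroids and all function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))

def mean(points):
    return tuple(sum(column) / len(points) for column in zip(*points))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and re-centering them."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest(point, centroids)].append(point)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean(cluster) if cluster else centroids[i]
                     for i, cluster in enumerate(clusters)]
    return centroids

# Unlabeled toy data: (orders per year, average order value).
customers = [(2, 20), (3, 25), (1, 15), (20, 200), (22, 180), (25, 220)]
centroids = kmeans(customers, centroids=[customers[0], customers[-1]])

# A new customer is placed in the segment whose centroid is closest.
new_customer_segment = nearest((4, 30), centroids)
print(centroids, new_customer_segment)
```

No labels are used anywhere: the two segments emerge only from how the points are distributed. In practice the features would normally be scaled first so that one of them does not dominate the distance.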
In many applications, semi-supervised learning is used, where the machine learns from a training dataset that combines a small amount of labeled data with a larger amount of unlabeled data, drawing on both supervised and unsupervised techniques.
What is Reinforcement Learning?
In reinforcement learning, the machine interacts with a dynamic environment to achieve a certain goal, and learns from the feedback (rewards or penalties) that its actions produce. A good example is playing a game of chess, where the machine can learn from the previous moves and decide on its next move. Then, based on the opponent’s next move, it decides on its next action again.
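Chess is far too large for a short example, so the sketch below uses a much smaller environment to show the same act, observe and adjust cycle: an agent in a five-cell corridor that is rewarded only for reaching the last cell. It uses tabular Q-learning, which is one reinforcement learning algorithm among many and is an assumption here, as are the environment, the parameter values and every name in the code.

```python
import random

N_STATES = 5        # positions 0..4 in a corridor; position 4 is the goal
ACTIONS = (-1, +1)  # action index 0 moves left, index 1 moves right

def step(state, action):
    """Environment: apply the move; reward 1.0 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: try a random action
            else:
                best = max(q[state])  # exploit: pick a best-known action
                a = rng.choice([i for i in (0, 1) if q[state][i] == best])
            next_state, reward = step(state, ACTIONS[a])
            # Move the estimate towards reward + discounted best future value.
            q[state][a] += alpha * (reward + gamma * max(q[next_state]) - q[state][a])
            state = next_state
    return q

q_table = train()
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

The agent is never told that moving right is correct. It discovers this from the reward signal alone, which is the key difference from supervised learning.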
What is Deep Learning?
There are several approaches to machine learning. One such approach is to use an artificial neural network: a machine learning model inspired by the structure and function of biological neural networks. The neurons in the network are connected to each other, and data propagates along those connections. In a simple case, there are two sets of neurons: ones that receive the input signals and ones that send the output signals. Deep learning uses several layers of neurons between the input layer and the output layer.
In deep learning, when an input is given to the input layer, that layer processes the input and passes a modified version of it on to the next layer, and so on through the network. Each neuron assigns a weighting to each of its inputs and combines the weighted inputs into its own output; the final output of the network is determined by how these weighted sums propagate through the layers.
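To make the layer-by-layer flow concrete, here is a minimal forward pass through a tiny network. It is a sketch only: the weights are hand-picked rather than learned, the sigmoid activation is an assumed choice, and the names `layer_forward` and `network_forward` are hypothetical.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """Each neuron weights its inputs, sums them, and applies the activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        weighted_sum = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(weighted_sum))
    return outputs

def network_forward(inputs, layers):
    """Pass the data through every layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical, hand-picked weights: 3 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1]),
    ([[1.2, -0.7]], [0.05]),
]
output = network_forward([1.0, 0.5, -1.0], layers)
print(output)
```

Training a real network means adjusting those weights automatically from data; a deep network simply has more entries in `layers`.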
A simple example of deep learning is recognizing a stop sign in an image. Attributes of the image, such as the octagonal shape, the red color, the letters used and the size of the sign, are examined by the neurons, and each neuron gives a weighting based on them. From those weightings, the deep learning algorithm produces a probability vector indicating how likely it is that the image shows a stop sign.
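One common way to turn a network's raw output scores into such a probability vector is the softmax function. The sketch below assumes three candidate labels and made-up scores; neither comes from a real traffic sign model.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical labels and raw scores from the last layer of a network.
labels = ["stop sign", "yield sign", "speed limit sign"]
scores = [4.0, 1.0, 0.5]

probabilities = softmax(scores)
best_label = labels[probabilities.index(max(probabilities))]
print(dict(zip(labels, probabilities)), best_label)
```

The label with the highest probability is taken as the prediction, while the remaining entries show how confident the model is in the alternatives.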
So, to summarize, machine learning is a sub-field of artificial intelligence, and deep learning is in turn a sub-field of machine learning. Falling hardware prices and the development of GPUs have contributed to the rise of deep learning.
AI, Machine Learning, Deep Learning and Cyber Security
Let’s try to understand how AI, machine learning and deep learning can be used to improve cyber security.
Traditional Malware Detection Techniques
Traditional anti-malware programs detect malware in several ways. Some of the most common are:
Signature Based Detection: In this technique, an unidentified piece of code is compared with a database of signatures of known malware. If a match is found, the code is identified as malware. The problem with this approach is that it cannot detect new malware whose signatures have not yet been added to the database. Moreover, it sometimes takes months to release signatures for newly found malware. As a result, this technique is largely ineffective against new malware, especially Zero Day threats and APTs.
Heuristic Techniques: In this technique, the unidentified piece of code is run and its behavioral characteristics are observed. Because malware behavior is typically observed only at runtime, once the code has started executing, prevention is delayed, which makes this technique ineffective at times.
Sandbox: In sandbox solutions, the unidentified code is executed in a virtual environment and its behavior is observed to determine whether it is malware. This process is time consuming and therefore poorly suited to real-time protection. Moreover, malware can stall its execution once it detects a virtual environment, which makes detection challenging at times.
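The weakness of signature based detection described above is easy to see in a small sketch. It assumes the simplest possible kind of signature, a SHA-256 hash of the whole file; real products also use byte patterns and other rules. The sample bytes and the names `signature_db` and `is_known_malware` are hypothetical.

```python
import hashlib

def signature(data: bytes) -> str:
    """Assumed signature scheme: the SHA-256 digest of the whole file."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical database built from one previously analysed sample.
known_sample = b"hypothetical-malicious-payload-v1"
signature_db = {signature(known_sample)}

def is_known_malware(data: bytes) -> bool:
    """Flag the file only if its signature is already in the database."""
    return signature(data) in signature_db

# A trivially modified variant: one extra byte appended.
variant = known_sample + b"\x00"

print(is_known_malware(known_sample))  # the known sample matches
print(is_known_malware(variant))       # the variant has a different hash
```

The original sample is caught, but the variant slips through until someone adds its signature to the database, which is exactly the gap that new and Zero Day malware exploits.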
Malware Detection using AI, Machine Learning and Deep Learning
Machine learning can be used for more effective malware detection. In this technique, a file’s behavior is observed to determine whether it contains malware. This is done by training a machine learning algorithm on manually selected features that help distinguish malicious files from legitimate ones.
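A minimal sketch of this feature-based approach might look like the following. Everything in it is an assumption made for illustration: the two hand-picked features (byte entropy and the share of printable bytes) are static rather than behavioral, the four training samples are toy data rather than real files, and the nearest-centroid classifier stands in for whatever algorithm a real product would use. High entropy alone does not make a file malicious, since compressed archives also look random.

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0 = uniform content, 8 = random-looking)."""
    counts = Counter(data)
    return -sum((n / len(data)) * math.log2(n / len(data)) for n in counts.values())

def printable_ratio(data: bytes) -> float:
    """Share of bytes that are printable ASCII characters."""
    return sum(1 for b in data if 32 <= b <= 126) / len(data)

def extract_features(data: bytes):
    """The manually selected features (assumes non-empty input)."""
    return (byte_entropy(data), printable_ratio(data))

def train(samples):
    """Learn one average feature vector (centroid) per label."""
    by_label = {}
    for data, label in samples:
        by_label.setdefault(label, []).append(extract_features(data))
    return {label: tuple(sum(col) / len(vectors) for col in zip(*vectors))
            for label, vectors in by_label.items()}

def classify(model, data: bytes) -> str:
    """Pick the label whose centroid is closest to the file's features."""
    features = extract_features(data)
    return min(model, key=lambda label: math.dist(features, model[label]))

# Toy labeled samples, purely illustrative.
training_samples = [
    (b"The quick brown fox jumps over the lazy dog.", "legitimate"),
    (b"name=example; mode=readonly; retries=3", "legitimate"),
    (bytes(range(256)), "malicious"),
    (bytes((i * 7 + 3) % 256 for i in range(256)), "malicious"),
]
model = train(training_samples)
print(classify(model, b"plain readable configuration text"))
```

The important point is that a person chose `byte_entropy` and `printable_ratio`; the algorithm only learned how to combine them.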
This is no doubt a better approach, but it has its own disadvantages. It requires human intervention to teach the machine the parameters, variables or features on which malware detection is based. To address that, a more advanced technique uses deep learning to detect malware.
In this technique, a dataset containing a huge number of malicious and legitimate files is fed into the machine. The machine uses deep learning to learn for itself the features necessary for malware detection. Once the learning completes, the machine can classify files of many types as malicious or legitimate. Threats can also be detected in real time and potential threats can be blocked. This technique can be quite effective at detecting even Zero Day threats and APTs.
AI, machine learning and deep learning technologies are evolving day by day. If used properly, they can improve cyber security to a great extent.
Nice article. Personally, I have been lurking around this topic for some time now, and I want to make it my MS thesis. I am pretty sure that deep learning can impact malicious code detection in ways we didn't imagine.