The following is an excerpt from our recently published whitepaper, “Self-Supervised Learning – AI for Complex Network Security.” The author, Dr. Peter Stephenson, is a cybersecurity and digital forensics expert who has practiced in the security, forensics and digital investigation fields for over 55 years.
Section 2 – Machine Learning, Deep Learning and Neural Networks, Oh My!
There are three more terms that get bandied about without much real definition in typical marketing materials. Like AI, they have become buzzwords, usually offered without a lay description of what they really mean (or don’t mean). Since Russell’s whitepaper gives us a great start, let’s dive a little deeper. This is important because, as we progress to the all-important “what’s in it for me?” question, we need a little technical background to help us get past the jargon.
If we think of machine learning (ML) as fact gathering and deep learning (DL) as interpretation, we’ll be pretty close to the meat of the terms. It’s really a bit more than that, of course.
Machine learning does, in fact, gather data, but it does not require explicit programming to do so. To see the difference, consider what came before it: first-generation anti-virus (AV) products worked on the basis of signatures. They looked for a signature – a specific bit pattern – to identify a virus.
That was pretty simple to fool. The moment a new .dat file emerged from a vendor, a set of changes to the viruses it could identify appeared in the criminal hacker underground. The virus writers made the changes and the AV product’s efficacy was diminished substantially until the next round, at which time, of course, the cycle repeated. These .dat files, and the adversary’s countermeasures against them, were not ML. They were, simply, pattern recognition.
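To make the weakness concrete, here is a minimal sketch of signature matching in Python. The signature names and byte patterns are invented for illustration and do not come from any real .dat file; the point is only that an exact-match scanner misses a sample as soon as one byte of the pattern changes.

```python
from typing import List

# Hypothetical signature database: name -> byte pattern (made up for this sketch).
SIGNATURES = {
    "Demo.A": bytes.fromhex("deadbeef4142"),
    "Demo.B": bytes.fromhex("cafebabe0102"),
}


def scan(data: bytes) -> List[str]:
    """Return the names of all signatures that appear verbatim in data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]


# A sample that contains the Demo.A pattern unchanged.
original = b"\x00\x01" + bytes.fromhex("deadbeef4142") + b"\x90\x90"

# The virus writer changes a single byte inside the pattern.
variant = original.replace(
    bytes.fromhex("deadbeef4142"), bytes.fromhex("deadbeef4143")
)

print(scan(original))  # the unchanged sample is detected
print(scan(variant))   # the one-byte variant is not
```

Nothing here learns anything. The scanner only knows what the vendor shipped, which is why each new .dat file started the cycle again.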
The next step the AV vendors took was the introduction of heuristics and behavior-based analysis. These at their inception were crude examples of ML. The AV was observing known signatures and looking for things that looked or behaved a bit like them. So the fact gathering was there but there was a bit more intelligence applied to it.
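One way to picture “looks a bit like a known signature” is a similarity score instead of an exact match. The sketch below uses Jaccard similarity over byte 3-grams, which is an assumption chosen for simplicity and not a description of how any particular AV product worked. The sample strings and the threshold are likewise invented.

```python
def ngrams(data: bytes, n: int = 3) -> set:
    """Return the set of all length-n byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def similarity(a: bytes, b: bytes, n: int = 3) -> float:
    """Jaccard similarity of the n-gram sets of a and b (0.0 to 1.0)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical known-bad pattern and threshold, for illustration only.
KNOWN_BAD = b"open_file;overwrite_boot_sector;spread_to_network"
THRESHOLD = 0.5


def looks_suspicious(sample: bytes) -> bool:
    """Flag samples that resemble the known-bad pattern closely enough."""
    return similarity(sample, KNOWN_BAD) >= THRESHOLD


# A variant with one inserted byte: an exact signature match would miss it.
variant = b"open_file;overwrite_boot_sectorX;spread_to_network"
benign = b"open_file;read_config;draw_window;exit"

print(KNOWN_BAD in variant)       # False: exact matching fails
print(looks_suspicious(variant))  # True: still close to the known pattern
print(looks_suspicious(benign))   # False: shares only a small prefix
```

The threshold is where the false positives come from: set it too low and harmless programs that share common fragments get flagged.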
Deep learning makes decisions based upon the data it sees and the data that it doesn’t see but infers from what it does see. This became useful in the AV industry when the adversary introduced polymorphic viruses. These are viruses that change their appearance on the fly and not always in the same way.
For example, a polymorph may be encrypted to hide its signature. At some point in its execution cycle it decrypts itself, does its damage and re-encrypts, this time using a different key. An ML system would have a hard time with this because of the seemingly random changes but a DL system might look at the code, decide that it’s encrypted and make some decisions as to what to do about it.
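The sketch below illustrates both halves of that example under some loud assumptions. The “polymorph” is a harmless string encrypted with a toy stream cipher built from SHA-256 (not a real malware technique and not secure cryptography), and the “decide that it’s encrypted” step is stood in for by a plain Shannon entropy measurement, which is a simple statistic and not deep learning. The payload, keys, signature and threshold are all invented.

```python
import hashlib
import math
from collections import Counter


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key plus a counter, repeated as needed."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_crypt(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


SIGNATURE = b"CALL infect"   # hypothetical signature
ENCRYPTED_THRESHOLD = 7.0    # assumed cutoff, bits per byte

payload = b"MOV AX, BX; CALL infect; JMP loop; " * 100

# Two "generations" of the same payload, encrypted under different keys.
generation_1 = xor_crypt(payload, b"key-one")
generation_2 = xor_crypt(payload, b"key-two")

print(SIGNATURE in payload)        # True: the plain body matches
print(SIGNATURE in generation_1)   # False: the signature is hidden
print(generation_1 == generation_2)  # False: each generation looks different
print(shannon_entropy(payload) < ENCRYPTED_THRESHOLD)       # True
print(shannon_entropy(generation_1) > ENCRYPTED_THRESHOLD)  # True
```

The two generations share no stable byte pattern, so there is nothing for a signature to latch onto. What they do share is that they look statistically random, and that is the kind of inferred property a smarter system can act on.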
Some early systems actually put the code in a sandbox and single-stepped through it until decryption occurred and then tried to identify the malware. This was inefficient, slowed down or crashed the host application and produced false positives, but the landscape was changing. Wave 3 approaches address the same problem far more efficiently and without false returns.
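A toy version of the single-stepping idea might look like the following. It is heavily simplified: a real sandbox emulates the malware’s own decryption routine, whereas this sketch is simply handed a one-byte XOR key, and the signature and sample bytes are invented. It does show why the approach was slow, since the whole memory image is rescanned after every step.

```python
from typing import Iterator, Optional

SIGNATURE = b"INFECT"  # hypothetical signature for this sketch


def single_step_decrypt(body: bytes, key: int) -> Iterator[bytes]:
    """Yield the memory image after each simulated decryption step."""
    memory = bytearray(body)
    for i in range(len(memory)):
        memory[i] ^= key      # one "instruction": decrypt one byte in place
        yield bytes(memory)   # snapshot for the scanner to inspect


def sandbox_scan(body: bytes, key: int, max_steps: int = 10_000) -> Optional[int]:
    """Return the step at which the signature appears, or None if it never does."""
    for step, image in enumerate(single_step_decrypt(body, key), start=1):
        if step > max_steps:
            break             # give up rather than stall the host
        if SIGNATURE in image:
            return step
    return None


plain = b"xxxxINFECTxxxx"
encrypted = bytes(b ^ 0x5A for b in plain)

print(SIGNATURE in encrypted)               # False: hidden at rest
print(sandbox_scan(encrypted, 0x5A))        # step at which it is exposed
print(sandbox_scan(b"harmless data", 0x5A)) # None: nothing to find
```

The `max_steps` cap is the trade-off in miniature: set it low and slow decryptors escape, set it high and the host application waits.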
Neural networks (NN) are simply an extension of machine learning in which the AI system takes the data collected by the ML and attempts to analyze it in much the same way a human would.
The NN, then, tries to extract knowledge from all of the data collected and inferred by the AI and make decisions based upon those data. What it does not do is take the context of the data collected and inferred in order to create novel decisions that, in themselves, may predict behavior.
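As a rough picture of a network turning collected data into a decision, here is a tiny two-layer forward pass in plain Python. Everything in it is an assumption made for illustration: the two input features (a similarity score and a normalized entropy value, both between 0 and 1), the layer sizes, and the hand-picked weights. A real network would learn its weights from training data.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs: List[float], weights: List[List[float]],
          biases: List[float]) -> List[float]:
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical, hand-picked weights (not trained).
HIDDEN_W = [[4.0, 0.0], [0.0, 4.0]]
HIDDEN_B = [-2.0, -2.0]
OUT_W = [[3.0, 3.0]]
OUT_B = [-3.0]


def malicious_score(similarity: float, entropy_norm: float) -> float:
    """Combine two collected features into a single score in (0, 1)."""
    hidden = dense([similarity, entropy_norm], HIDDEN_W, HIDDEN_B)
    return dense(hidden, OUT_W, OUT_B)[0]


print(malicious_score(0.9, 0.95) > 0.5)  # True: both features point to malware
print(malicious_score(0.1, 0.5) > 0.5)   # False: weak evidence
```

Note that the score depends only on the numbers it is given. Nothing in the calculation accounts for where the sample came from or what else is happening on the network, which is the missing context described above.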