An Introduction to Baselining Technology

This is the first installment of a three-part blog series on creating baselines of network behavior to improve your security posture. Here we walk you through the basics of baselining technology.

What Are Baselines and Why Do They Matter?

Spend any amount of time looking into the latest developments in cybersecurity, and you’ll probably hear about security systems that establish a “baseline” of your network’s activity to better monitor it. Some industry leaders even claim their products do it.

What is a real network baseline? How does it work? And why is it vital to stopping zero-day attacks? Answering these questions can help you choose the right security provider by properly vetting its system for true baseline development through unsupervised learning, not just empty words touting artificial intelligence-enabled monitoring.

When we say baseline, we mean the normal day-to-day behavior of your network, which is used to predict what your network should look like in five minutes, hours, or days so that any IPs that behave anomalously on it can be flagged.

Physical Characteristics

To better understand how a baseline is created for your network, let’s look at an example.

If you’re in a production environment, people come in at 9 a.m. on Monday, traffic goes up, and it stays relatively constant until lunch. Then it drops a bit, and after lunch it stays roughly constant on average until about 6 p.m., when it drops again and settles to a low level overnight.

Now, on top of that average between 9 a.m. and 6 p.m. you’re also going to have a lot of fluctuations. People send emails at arbitrary times, emails come in at arbitrary times, and people will be looking at various directories randomly. That is all unpredictable.

What is predictable is the 9 o’clock jump and the 6 o’clock drop-off. Therefore, everything that happens on the network has to be encoded in a representation that gives an easy view of that normal behavior.

That’s what the baseline is, and it covers both the bulk traffic and the individual IP.
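To make the idea concrete, here is a minimal sketch of a baseline keyed by hour of day. It is a simple statistical stand-in (mean and standard deviation per hour), not the unsupervised learning approach discussed in this series, and the function names, the threshold k, and the sample numbers are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

def build_baseline(samples):
    """samples: iterable of (hour_of_day, traffic_volume) pairs from normal days.
    Returns {hour: (mean, standard deviation)}."""
    by_hour = defaultdict(list)
    for hour, volume in samples:
        by_hour[hour].append(volume)
    return {h: (mean(v), pstdev(v)) for h, v in by_hour.items()}

def is_anomalous(baseline, hour, volume, k=3.0):
    """Flag a reading more than k standard deviations from that hour's mean."""
    if hour not in baseline:
        return True  # no history for this hour, so treat it as unusual
    mu, sigma = baseline[hour]
    # Floor sigma so a nearly constant history does not flag tiny changes.
    return abs(volume - mu) > k * max(sigma, 1.0)

# Made-up history: busy at 9 a.m., quiet at 3 a.m.
history = [(9, 100), (9, 110), (9, 90), (3, 5), (3, 7), (3, 6)]
baseline = build_baseline(history)

print(is_anomalous(baseline, 9, 105))  # False: normal for 9 a.m.
print(is_anomalous(baseline, 3, 105))  # True: the same volume is unusual at 3 a.m.
```

The same volume is ordinary at one hour and suspicious at another, which is why the baseline has to capture the predictable daily shape and not just a single overall average.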

This applies to almost every area of the network in a production environment. If your organization operates 24/7, the pattern will be different, but the idea is the same.

Possibilities with Baseline Layers

At Mixmode, we start with individual IPs and collect all available information from them. However, it’s often more useful to look at that information not at the individual IP level but on a larger scale.

The first layer we provide feedback on is inbound traffic, meaning traffic from IPs geolocated outside the network to IPs inside it. The second is local traffic, meaning traffic between IPs inside the network. The last is outbound traffic, which is sent out of the organization. As you might imagine, from a network security perspective outbound traffic is usually the biggest concern because it actually leaves the local network, where it could be read by someone on the outside.
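As a rough illustration of the three layers, the sketch below labels a flow by direction. It swaps geolocation for a simpler test against a list of internal address ranges; the ranges, the function names, and the extra "external" label for traffic that never touches the network are assumptions for the example.

```python
import ipaddress

# Assumption: these ranges stand in for the organization's internal address space.
INTERNAL_NETS = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "192.168.0.0/16")]

def is_internal(addr):
    """True if addr falls inside one of the assumed internal ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in INTERNAL_NETS)

def classify_flow(src, dst):
    """Label a flow as inbound, local, or outbound from its endpoints."""
    src_in, dst_in = is_internal(src), is_internal(dst)
    if src_in and dst_in:
        return "local"
    if src_in:
        return "outbound"
    if dst_in:
        return "inbound"
    return "external"  # neither endpoint is ours

print(classify_flow("8.8.8.8", "10.1.2.3"))       # inbound
print(classify_flow("10.1.2.3", "192.168.1.20"))  # local
print(classify_flow("10.1.2.3", "8.8.8.8"))       # outbound
```

Once every flow carries one of these labels, a separate baseline can be kept per layer.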

When we go down to the individual IP level, we examine the IPs that communicated during the time the anomaly was detected. Let’s say the anomaly was detected on the outbound side. We would drill into it to obtain the top IPs implicated in the anomalous incident. Then we could take a few IPs that sent out many files during that time, look at each of them individually, and ask, “Is the behavior of this individual IP unusual?”

For a conclusive answer to this question, we could retroactively create a baseline for the individual IP and better understand whether it acted out of character.
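The drill-down just described can be sketched in two steps: rank the source IPs by outbound volume during the anomalous window, then compare one IP's volume in that window against its own earlier windows. This is only an illustration under stated assumptions: the flow records are made-up (timestamp in hours, source IP, bytes out) tuples, the helper names are hypothetical, and the k-standard-deviations test is a simple stand-in for a real baseline model.

```python
from collections import Counter
from statistics import mean, pstdev

def top_outbound_ips(flows, start, end, n=3):
    """Top n source IPs by bytes sent during [start, end)."""
    totals = Counter()
    for ts, src, nbytes in flows:
        if start <= ts < end:
            totals[src] += nbytes
    return totals.most_common(n)

def window_totals(flows, ip, first, last, width):
    """Bytes sent by ip in each consecutive window of `width` inside [first, last)."""
    totals = [0] * ((last - first) // width)
    for ts, src, nbytes in flows:
        if src == ip and first <= ts < first + len(totals) * width:
            totals[(ts - first) // width] += nbytes
    return totals

def out_of_character(flows, ip, first, start, end, k=3.0):
    """Retroactive check: is ip's volume in [start, end) far from its earlier windows?"""
    width = end - start
    past = window_totals(flows, ip, first, start, width)
    current = window_totals(flows, ip, start, end, width)[0]
    if not past:
        return True  # no history to compare against
    mu, sigma = mean(past), pstdev(past)
    return abs(current - mu) > k * max(sigma, 1.0)

# Made-up outbound flows; the anomaly was detected in hour 4.
flows = [
    (0, "10.0.0.5", 100), (1, "10.0.0.5", 120), (2, "10.0.0.5", 90), (3, "10.0.0.5", 110),
    (0, "10.0.0.9", 500), (1, "10.0.0.9", 520), (2, "10.0.0.9", 480), (3, "10.0.0.9", 510),
    (4, "10.0.0.5", 5000), (4, "10.0.0.9", 505),
]

print(top_outbound_ips(flows, 4, 5, n=2))              # [('10.0.0.5', 5000), ('10.0.0.9', 505)]
print(out_of_character(flows, "10.0.0.5", 0, 4, 5))    # True
print(out_of_character(flows, "10.0.0.9", 0, 4, 5))    # False
```

Both IPs appear among the top senders during the window, but only the first one departs from its own history, which is the distinction the per-IP baseline is meant to draw.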

Next week we will dive into the details of how your baseline is created.