Introduction:
Sensors were bleeping, but high-value assets still got compromised. The emergency response team was called in and was asked the most important question: "We have detection technology from five vendors. How is that possible?"
A piece of code bypassed the world's most innovative detection technologies. It is not a simple situation, but interestingly it is a relatively common one in breaches. At the same time it raises some serious questions: are detection technologies really that weak, or is that piece of code something truly sophisticated?
As a matter of fact, the truth is that the majority of the time that piece of code is not truly sophisticated, although there have been cases where we saw some real terror (e.g. Stuxnet). But it is also not fair to say that detection technologies are weak. I have myself contributed to some of the best detection technologies that exist today, and I can say that they are not useless. So where is the problem?
In order to identify the problem, let's discuss the set of detection technologies. We can broadly put them into three segments:
- Endpoint Security – Antivirus etc.
- Network Security – IPS/IDS etc.
- Execute and Detect – Sandbox etc. (I want to keep it separate because at the technology level it sits somewhere between #1 and #2).
Endpoint Security:
The primary models in this segment are based on a simple concept: signature-based detection. But today's solutions are not just signature based (although signatures are still a leading detection driver); we also have emulators, hybrid approaches (like run-time analysis), process-level sandboxing, etc. The important point here is that the technologies in this segment are preventive: at all costs we have to detect the malicious code before it can infect or create problems for the endpoint. It is a very ambitious approach, considering the fact that it is very difficult to differentiate a malicious piece of code from a benign one based on structural/static observations alone.
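To make the signature idea concrete, here is a minimal sketch of hash- and pattern-based matching. The hash value and byte pattern are placeholders, not real signatures, and real engines use far richer signature formats than this.

```python
import hashlib

# Toy signature database: an exact file hash plus raw byte patterns.
# Both entries are placeholders for illustration, not real signatures.
KNOWN_BAD_SHA256 = {
    "0f1e2d3c" * 8,  # stand-in for the SHA-256 of a known-bad sample
}
KNOWN_BAD_PATTERNS = [
    # 'MZ' header: deliberately too generic -- it would match every
    # Windows executable, benign or not, which is exactly the problem
    # with overly broad static patterns.
    b"\x4d\x5a",
]

def scan_file(path: str) -> bool:
    """Return True if the file matches any known signature."""
    with open(path, "rb") as f:
        data = f.read()
    if hashlib.sha256(data).hexdigest() in KNOWN_BAD_SHA256:
        return True
    return any(pattern in data for pattern in KNOWN_BAD_PATTERNS)
```

Even this toy exposes the core tension: change a single byte and the hash signature stops matching, while a byte pattern broad enough to survive such changes starts flagging benign files.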
Let's take a look at the real-world data. If we study the detection statistics from VirusTotal, the situation looks something like this:
As we can see, it takes around 10 hours after the initial upload to VT for detection to reach 10-15 vendors. But imagine the sample was never uploaded to VT: how much time would it take to detect it? Practically speaking, I have no idea, because such a sample is not really public.
There are some fundamental problems with the technologies that run on endpoints, because there are things that simply can't be done there: they are too risky to implement on an endpoint, since they can trigger user-experience problems, unexpected crashes, etc.
Technology improvements are possible, like machine-learning-based detection and process-level sandboxing/isolation, but it is too early right now to judge the effectiveness of these approaches.
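For a rough idea of what "machine-learning-based detection" means in practice, here is a sketch of static feature extraction. The features are invented for illustration, and the classifier itself (typically a tree ensemble or neural network trained on labelled samples) is out of scope.

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte buffer; packed or encrypted payloads score high."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())

def static_features(data: bytes) -> list[float]:
    # Invented features for illustration; production models use hundreds
    # more (imports, section layout, strings, signer info, ...).
    return [
        float(len(data)),
        byte_entropy(data),
        float(data.count(b"http://") + data.count(b"https://")),
    ]

# These vectors would feed a trained classifier (e.g. a gradient-boosted
# tree ensemble); the model itself is out of scope for this sketch.
print(static_features(b"MZ...http://placeholder.invalid/payload"))
```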
So at this point we can say that it is relatively easy for a malicious program to bypass the technologies at the endpoint, not because they are weak, but because there are some fundamental problems and challenges. In order to fill the gap, we jump to network security.
Network Security:
IDS/IPS are the detection and prevention technologies on the network side. The problem is that they also use signature-based detection models, so they share the same problems as the endpoint technologies. The one advantage here is that the variation in a malware family's network traffic is likely much lower than the variation across its binary files. So a single signature on network traffic can practically detect more samples, giving it an edge over endpoint signatures.
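The following sketch shows that idea in its simplest form: patterns over reassembled payloads, loosely modelled on how IDS rules match content. The rule names and patterns here are made up; real rules also constrain ports, direction, and flow state.

```python
import re

# Invented network signatures: regexes over reassembled payloads.
NETWORK_SIGNATURES = {
    "fake-c2-beacon": re.compile(rb"POST /gate\.php HTTP/1\.[01]"),
    "fake-exfil-agent": re.compile(rb"User-Agent: EvilBot/\d+\.\d+"),
}

def inspect_payload(payload: bytes) -> list[str]:
    """Return the names of all signatures that match this payload."""
    return [name for name, rx in NETWORK_SIGNATURES.items() if rx.search(payload)]

print(inspect_payload(b"POST /gate.php HTTP/1.1\r\nHost: example.test\r\n"))
# ['fake-c2-beacon'] -- one rule covers every sample that speaks this protocol,
# no matter how the binary itself is packed or mutated
```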
But at the core there are fundamental problems: it is again a reactive technology, and if something is not already known, it is difficult to detect with this model. So in order to fill this gap, we jump to the sandbox, i.e. execute and detect.
Sandbox (execute and detect):
Sandbox-based detection is truly not a prevention technology; at its core it is a detection technology. The good thing about it is that we simulate the endpoint environment in a controlled box, execute the sample, and examine its behaviour. On the basis of that behaviour we declare the sample malicious or benign. This technology has significantly raised the bar for detection and has very nicely complemented the technologies on the endpoint and network side.
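Conceptually, the verdict step boils down to scoring observed behaviours. Here is a toy version with invented event names and weights; a real sandbox tracks hundreds of behaviours and uses far subtler scoring than a weighted sum.

```python
# Invented behaviour events and weights, purely for illustration.
BEHAVIOUR_WEIGHTS = {
    "writes_to_startup_folder": 3,
    "disables_security_service": 5,
    "injects_into_other_process": 4,
    "contacts_known_c2_domain": 5,
    "opens_document": 0,  # benign-looking events contribute nothing
}

def verdict(observed_events: list[str], threshold: int = 6) -> str:
    score = sum(BEHAVIOUR_WEIGHTS.get(event, 1) for event in observed_events)
    return "malicious" if score >= threshold else "benign"

print(verdict(["opens_document",
               "injects_into_other_process",
               "contacts_known_c2_domain"]))  # malicious
```

Notice the shift in mindset: nothing here depends on what the file looks like, only on what it does.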
But there are also some fundamental problems with this technology: it is very difficult to perform actions that require complex user interaction in an automated fashion, and it is impossible to simulate exactly the same environment as the endpoint machine.
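That second limitation is exactly what evasive samples exploit. The sketch below shows the kind of trivial environment checks malware performs before detonating; the specific heuristics are illustrative, not taken from any particular family.

```python
import os

def looks_like_sandbox() -> bool:
    """Toy checks of the kind evasive malware performs before detonating.
    Each heuristic probes a gap between an analysis VM and a real endpoint."""
    home = os.path.expanduser("~")
    checks = [
        (os.cpu_count() or 1) < 2,  # analysis VMs are often single-core
        len(os.listdir(home)) < 3,  # a real user's home dir is messy
    ]
    return any(checks)

# Another classic: simply sleep past the sandbox's analysis window
# (e.g. time.sleep(600)) and only then do anything malicious.
```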
Now, if we look back and recall our original question, "where is the problem?"
The problem:
These technologies were invented to fill the gaps and to complement each other, because every segment has some fundamental problems that are difficult to address for various reasons. The real problem is that these technologies are not talking to each other or sharing information (people will point to EDR, but that is a whole different thing); they run in isolation, each trying to solve the problem within its own segment only. There is no standard or protocol that these technologies can use to communicate and share information in order to detect a threat. So the biggest failure is not one of technology or innovation; it is the lack of standardisation in the cyber-security segment. These are standardisation and architectural problems, and they lead to some political arguments.
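To make the gap concrete, here is a hypothetical common event format and a trivial correlation rule. Nothing like this is standard across vendors today; every field name here is invented for illustration.

```python
from dataclasses import dataclass, asdict
import json

# A hypothetical common detection-event format: no such industry-wide
# protocol exists, which is exactly the gap described above.
@dataclass
class DetectionEvent:
    source: str        # "endpoint" | "network" | "sandbox"
    indicator: str     # hash, domain, rule name, behaviour id, ...
    verdict: str       # "benign" | "suspicious" | "malicious"
    confidence: float  # 0.0 - 1.0

def correlate(events: list[DetectionEvent]) -> bool:
    """Flag when independent layers agree on suspicion that none of
    them would have acted on alone."""
    layers = {e.source for e in events if e.verdict != "benign"}
    return len(layers) >= 2

events = [
    DetectionEvent("network", "beacon to rare domain", "suspicious", 0.4),
    DetectionEvent("sandbox", "macro spawns powershell", "suspicious", 0.5),
]
print(correlate(events))  # True: two weak signals combine into a strong one
print(json.dumps([asdict(e) for e in events], indent=2))
```

Each layer on its own would have stayed silent; shared over a common format, the two weak signals become an actionable detection.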
It is true that technology can't solve this problem 100 percent; at its core, cyber security is not just a technology problem. But if some simple Office macro malware is bypassing all of these heavy-duty technologies, it raises some serious concerns about the cyber-security detection segment.