AI and Machine Learning in Cyber Security

April 9, 2018 admin

Zen monks have been using a tool called a ‘koan’ for hundreds of years to assist them in reaching enlightenment. These koans are like riddles or stories that can only be solved by letting go of one’s narrowing beliefs and stories about how things should be. Zen students sit in silent meditation and observe how the koan is working on them, slowly transforming their way of looking at the world and revealing a tiny piece of the path to nirvana, that place of no suffering.

“Zen is like a man hanging by his teeth in a tree over a precipice. His hands grasp no branch, his feet rest on no limb, and under the tree another man asks him, ‘Why did Bodhidharma come to China from the West?’ If the man in the tree does not answer, he misses the question, and if he answers, he falls and loses his life. Now what shall he do?”
— Zen Koan — Case 5 of the Gateless Gate Collection.

Zen and Cyber Security

You might wonder what that has to do with cyber security. With the increased popularity of deep learning and the omnipresence of the term artificial intelligence (AI), a lot of security practitioners are tricked into believing that these approaches are the magic silver bullet we have been waiting for to solve all of our cyber security challenges. But just like a koan, deep learning (or any other machine learning approach) is just a tool. It’s a tool you have to know how to apply in order for it to reveal true insight. And it’s not the only tool we need to use. We need to mix in experience. We have to work with experts to capture their knowledge for the algorithms to reveal actual security insights or issues. Just like with koan study, you work with a teacher (the expert) who guides you on your journey.

AI in Cyber Security

Where do we stand today with artificial intelligence in cyber security? First of all, I will stop using the term artificial intelligence and go back to using the term machine learning. We don’t have AI (or, to be precise, AGI) yet, so let’s not distract ourselves with these false concepts.

Where are we with machine learning in security? To answer that question, we first need to look at what our goal is for applying machine learning to cyber security problems. To make a broad statement, we are trying to use machine learning to find anomalies. More precisely, we use it to identify malicious behavior or malicious entities; call them hackers, attackers, malware, unwanted behavior, etc. But beware! To find anomalies, one of the biggest challenges is to define what’s normal. For example, can you define what is normal behavior for your laptop, day in, day out? Don’t forget all the exceptional scenarios when you are traveling; or think of the time that you downloaded some ‘game’ from the Internet. How do you differentiate that from a download triggered by some malware? Put in abstract terms, not every statistical anomaly is an interesting security event. Only a subset of them are. An increase in network traffic might be statistically interesting, but from a security point of view, it rarely represents an attack.
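To make the gap between a statistical anomaly and a security event concrete, here is a minimal sketch. The traffic numbers, the 3-sigma threshold, and the function name are all invented for illustration; this is not a recommended detector.

```python
from statistics import mean, stdev

def zscore_outliers(values, threshold=3.0):
    """Return the indices of values that lie more than `threshold`
    sample standard deviations away from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Hypothetical daily outbound traffic in MB for a single laptop.
daily_mb = [120, 135, 110, 128, 140, 125, 118, 132, 122, 130,
            127, 115, 138, 124, 2400]

flagged = zscore_outliers(daily_mb)

# The last day is flagged as a statistical outlier. Nothing in the numbers
# says whether it was a malware download, an operating system upgrade,
# or a large video call while traveling.
print(flagged)
```

The test can only say that the last day is unusual. Deciding whether it is an attack takes information that is not in the traffic volume at all.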

Applying Machine Learning To Security

In a somewhat simplified world, we can partition security use-cases into two groups: the problems where machine learning has made a difference and the ones where machine learning has been tried, but will likely never yield usable results. In machine learning lingo, from a supervised perspective, the former category comprises all the problems where we have “good”, labeled data. The latter is where we don’t have that. The unsupervised side looks a bit different. There we have to distinguish among the different unsupervised approaches. For this conversation, let’s consider clustering, dimensionality reduction, and association rule learning as the main approaches within unsupervised learning. All of these approaches are useful to make large data sets easier to analyze or understand. They can be used to reduce the number of dimensions or fields of data to look at (dimensionality reduction) or to group records together (clustering and association rules). However, these algorithms are of limited use when it comes to identifying anomalies or ‘attacks’.

The following diagram summarizes this again:

Diagram 1 — Incomplete view of machine learning algorithms and applications in security.

Supervised Machine Learning

Let’s have a quick look at the different groups of machine learning algorithms, starting with the supervised case. This is where machine learning has made the biggest impact in cyber security. The two poster use-cases are malware classification, or the classification of files, and spam detection. The former is the problem of identifying whether a file is benign (we can execute it without having to worry about any ‘side effects’) or malware that will have a negative impact when we run it. Today’s approaches in this area have greatly benefited from deep learning, which has helped drop false positive rates to very manageable numbers while also reducing false negative rates. Malware identification works so well because of the availability of millions of labeled samples (from both malware and benign applications). These samples allow us to train deep belief networks extremely well. The problem of spam identification is very similar in the sense that we have a lot of training data to teach our algorithms right from wrong.
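To show how much the supervised case leans on labeled data, here is a minimal sketch of a spam classifier. It uses a plain naive Bayes model with Laplace smoothing rather than deep learning, purely to keep the example short. The six training messages and all function names are made up for illustration.

```python
import math
from collections import Counter

def train(samples):
    """samples: list of (text, label) pairs with label 'spam' or 'ham'.
    Returns per-label word counts and per-label message counts."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest log posterior under naive Bayes."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_docs = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for word in text.lower().split():
            # Laplace smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical labeled training data.
labeled = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("cheap pills free shipping", "spam"),
    ("meeting notes attached", "ham"),
    ("lunch tomorrow at noon", "ham"),
    ("please review the attached report", "ham"),
]

word_counts, label_counts = train(labeled)
print(classify("claim your free money", word_counts, label_counts))
print(classify("review the meeting notes", word_counts, label_counts))
```

Everything the classifier knows comes from the labels. Take them away and there is nothing left to learn from, which is exactly the situation in most other security use-cases.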

Where we don’t have great training data is in most other areas, for example in detecting attacks from network traffic. We have tried for almost two decades to come up with good training data sets for these problems, but we still do not have a suitable one. The last data set we thought was decent was the MIT LARIAT data set, which turned out to be significantly biased. It’s a really hard, if not impossible, problem to assemble a good training data set. And without one, we cannot train our algorithms. There are other problems, like the inability to deterministically label data, the challenges associated with cleaning data, or understanding the semantics of a data record. But those are out of scope for this article.

Unsupervised Machine Learning

On the unsupervised side, let’s start with dimensionality reduction. Applying it to security data works pretty well, but again, it doesn’t really bring us any closer to finding anomalies in our data set. The same is true for association rules. They help us group data records, such as network traffic, but how do we assess anomalies with this information? Clustering could be interesting to find anomalies. Maybe we can find ways to cluster ‘normal’ and ‘abnormal’ entities, such as users or devices? It turns out that the fundamental problems with clustering in security are distance functions and the ‘explainability’ of the clusters. If you are interested in the details of that, you can find more information about the challenge with distance functions and explainability in this blog post.
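As a small illustration of the distance function problem, consider what happens when a port number is treated as an ordinary numeric feature. The records, the service mapping, and the `service_aware` function below are hypothetical and only meant as a sketch.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two numeric records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical connection records: (destination port, kilobytes sent).
http = (80, 12)
http_alt = (8080, 12)
finger = (79, 12)

# Numerically, port 80 is right next to port 79 and far from port 8080,
# even though 80 and 8080 both commonly carry web traffic.
print(euclidean(http, finger))
print(euclidean(http, http_alt))

# One possible alternative: compare ports by the service they stand for.
# This mapping is an assumption that an expert would have to supply.
SERVICE = {80: "web", 8080: "web", 79: "finger"}

def service_aware(a, b):
    """Distance that treats the port as a category and the volume as a ratio."""
    port_term = 0.0 if SERVICE.get(a[0]) == SERVICE.get(b[0]) else 1.0
    volume_term = abs(a[1] - b[1]) / max(a[1], b[1], 1)
    return port_term + volume_term

print(service_aware(http, finger))
print(service_aware(http, http_alt))
```

A clustering algorithm fed the first distance would happily group web and finger traffic together. The second distance behaves more sensibly, but only because someone encoded domain knowledge into it.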

Context and Knowledge

The above algorithms are tools that could potentially be useful for detecting attacks if used in the right way. Aside from the challenges mentioned already, there are some other significant ingredients that we are missing. The first one is context. Context is anything that helps us better understand the role of the entities involved in the data, such as information about devices, applications, or users. Context for devices includes things like a device’s role, its location, its owner, etc. Rather than looking at network traffic logs in isolation, we need to add context to make sense of the data. Is a device supposed to respond to DNS queries? If you know that it is a DNS server, this is absolutely normal behavior, but if it isn’t a DNS server, that kind of behavior could be a sign of an attack.
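The DNS example can be sketched in a few lines. The asset inventory, the IP addresses, and the function name are assumptions made up for this illustration; in practice the roles would come from something like an asset database.

```python
# Hypothetical asset inventory mapping IP addresses to device roles.
ASSET_ROLES = {
    "10.0.0.53": "dns-server",
    "10.0.0.17": "workstation",
}

def assess_dns_response(src_ip, roles):
    """Judge an observed DNS response by the role of the device that sent it."""
    role = roles.get(src_ip, "unknown")
    if role == "dns-server":
        return "expected"
    # A workstation or unknown device answering DNS queries deserves a look.
    return "suspicious"

print(assess_dns_response("10.0.0.53", ASSET_ROLES))
print(assess_dns_response("10.0.0.17", ASSET_ROLES))
```

The log record is identical in both cases. Only the context lookup changes the verdict.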

In addition to context, we need to build systems with expert knowledge, ideally systems that help us capture that knowledge in simple ways. This is very different from throwing an algorithm at the wall and seeing if it yields anything potentially useful. One of the interesting approaches in the area of knowledge capture that I would love to see getting more attention is Bayesian belief networks. Has anyone done anything interesting with those in security?
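To give a flavor of how a belief network captures expert knowledge, here is a minimal sketch of a tiny network with one hidden cause (the host is compromised) and two observable effects. All probabilities are invented stand-ins for numbers an expert would provide, and the two observations are assumed to be independent given the compromise state.

```python
# Expert-supplied numbers (hypothetical).
PRIOR_COMPROMISED = 0.01

# P(observation is present | compromised?) for each observable.
CPT = {
    "beaconing": {True: 0.70, False: 0.05},
    "av_alert": {True: 0.60, False: 0.02},
}

def posterior_compromised(evidence):
    """evidence: dict mapping an observation name to True or False.
    Returns P(compromised | evidence) by enumerating both states."""
    def joint(compromised):
        p = PRIOR_COMPROMISED if compromised else 1 - PRIOR_COMPROMISED
        for name, observed in evidence.items():
            p_present = CPT[name][compromised]
            p *= p_present if observed else 1 - p_present
        return p
    return joint(True) / (joint(True) + joint(False))

one_signal = posterior_compromised({"beaconing": True})
two_signals = posterior_compromised({"beaconing": True, "av_alert": True})
print(round(one_signal, 3), round(two_signals, 3))
```

With these made-up numbers, a single signal leaves the probability of compromise at roughly 12 percent, while both signals together push it to roughly 81 percent. The appeal is that the expert only has to state simple conditional beliefs, and the arithmetic combines them.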

We should also consider building systems that do not necessarily solve all of our problems right away, but can help make security analysts more effective by assisting them in their daily routines and in their work. Data visualization is a great candidate in that area. Instead of having analysts look at thousands of rows of data, they can look at visual representations of the data that unlock a deeper understanding in a very short amount of time. It’s also a great tool to verify and understand the results of machine learning applications.

In Zen, a koan is just a tool or a building block to get to the end goal. Machine learning is the same: a tool that you have to know how to apply and use in order to come to a new understanding and find attackers in your systems.

