Threat Hunting in the cloud with Azure Notebooks: supercharge your hunting skills using Jupyter and KQL
Robert M. Lee has a great quote: “Threat hunting exists where automation ends”. Threat hunting is largely manual work, performed by SOC analysts trying to find a ‘needle in the haystack’. And in the case of cybersecurity, that haystack is a pile of ‘signals’.
These analysts often use separate tools for querying the data, manipulating the data set, reversing the potential malware, etcetera. What if we could provide an environment where you can perform all these tasks in context, and share the outcome with your team?
Azure Notebooks, with a little KQL magic sauce, is exactly that. Let’s supercharge your hunting skills with Azure, Jupyter, Python and KQL!
Kusto Query Language (KQL)
Kusto Query Language, or KQL for short, is the default way to work with data in Azure Data Explorer powered services such as Log Analytics, Azure Security Center, Azure Monitor and many more. It is a powerful yet easy-to-learn language.
Robert Cain, a Microsoft MVP, has written a 4-hour course on Pluralsight that you can take for free, to learn the language all the way up to the advanced queries. KQL skills are something you’ll need if you will be doing threat hunting in Azure, because most of the security data will be in Log Analytics workspaces.
Jupyter Notebook, formerly called IPython Notebook, is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text through markdown. It is already broadly used in data science, and has support for lots of programming languages such as R, Python, etc. The multi-user version of Jupyter is called JupyterHub.
The cool thing is that you can share your notebook with others, and that you can produce interactive output using HTML etc. and display that through a so-called “presentation mode”. This makes it great for threat hunting and sharing signals within the SOC team.
On GitHub you’ll find ready-to-run Docker images containing Jupyter.
Azure Notebooks is currently in public preview and is a free hosted service to develop and run Jupyter notebooks in the cloud with no installation. Although the service is free, each project is limited to 4GB memory and 1GB data to prevent abuse. Legitimate users who exceed these limits see a Captcha challenge to continue running notebooks.
However, if the Azure Active Directory account you sign in with is associated with an Azure subscription, you can connect to any Azure Data Science Virtual Machine (DSVM) instances within that subscription. DSVMs can be found in the Azure Marketplace. With these dedicated DSVMs you can add better processing power and remove any of those limits.
PRO TIP: You need to deploy the Ubuntu version of the DSVM. The Windows version of DSVM does not contain JupyterHub by default. The Ubuntu template of DSVM has an extra bonus: it will open up the right ports by default in your NSG!
In the case of Azure Notebooks, it allows you to share your notebooks using GitHub.
Pandas, KQLMagic and other libraries
One of the things you will find out early using Jupyter is that you will want to manipulate data. This is where a library called Pandas comes in. Pandas is an open source Python framework, maintained by the PyData community and mostly used for Data Analysis and Processing.
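To give an idea of what that manipulation looks like, here is a minimal sketch using Pandas. The records, the column names (Computer, RemoteIP, BytesSent) and the values are all made up for illustration and are not an Azure Security Center schema.

```python
import pandas as pd

# Hypothetical connection records; column names and values are illustrative only.
records = [
    {"Computer": "APPSERVER", "RemoteIP": "203.0.113.10", "BytesSent": 1200},
    {"Computer": "APPSERVER", "RemoteIP": "203.0.113.10", "BytesSent": 800},
    {"Computer": "APPSERVER", "RemoteIP": "198.51.100.7", "BytesSent": 150},
    {"Computer": "WEBSERVER", "RemoteIP": "198.51.100.7", "BytesSent": 90},
]
df = pd.DataFrame(records)

# Keep only the rows for the machine under investigation.
suspect = df[df["Computer"] == "APPSERVER"]

# Sum the bytes sent per remote IP and sort with the largest total first.
bytes_per_ip = (
    suspect.groupby("RemoteIP")["BytesSent"]
    .sum()
    .sort_values(ascending=False)
)
print(bytes_per_ip)
```

Filtering, grouping and sorting like this covers a large part of day-to-day hunting work once query results are in a DataFrame.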
The big picture
Putting all the pieces together you get something like this:
Real-world threat hunting
Let’s look at a real-world example. In this case we have a number of virtual machines running in Microsoft Azure, and Azure Security Center is turned on at the subscription level to capture relevant security events.
We’re suspicious of a machine called APPSERVER, based on an Alert we got from Azure Security Center, and want to do some investigation.
We go to Azure Notebooks and log in:
PRO TIP: While in the KQL query interface in Azure you’ll be using the double quote character for specifying input, you’ll be using the single quote in Jupyter. Make sure to change your queries so that they work properly in Jupyter.
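If you copy a lot of queries over from the Azure portal, a small helper can do the quote swap for you. The sketch below is an assumption-laden illustration: the function name to_single_quotes and the sample query are hypothetical and not part of Kqlmagic. It only rewrites simple double-quoted literals and deliberately leaves alone anything containing a quote or a backslash.

```python
import re

# Matches a double-quoted literal that contains no quotes or backslashes,
# so swapping the delimiters cannot change what the literal means.
_SIMPLE_DOUBLE_QUOTED = re.compile(r'"([^"\'\\]*)"')


def to_single_quotes(query: str) -> str:
    """Rewrite simple "..." literals as '...' (hypothetical helper)."""
    return _SIMPLE_DOUBLE_QUOTED.sub(r"'\1'", query)


# Illustrative query as it might be written in the portal.
portal_query = 'SecurityEvent | where Computer == "APPSERVER" | take 10'
print(to_single_quotes(portal_query))
```

This is a naive text substitution, so review the result by hand for queries that nest one kind of quote inside the other.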
If you want to go multi-line to make things more readable, you need to use a double % (%%kql rather than %kql). As our application server is in The Netherlands, I will apply a filter and only show the connections that are going to IP addresses that are outside of our country:
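The same country filter can also be applied on the Python side once the results are in the notebook. This is a minimal sketch in plain Python: the rows and the field names (Computer, RemoteIp, RemoteCountry) are hypothetical stand-ins for whatever your query returns.

```python
from collections import Counter

HOME_COUNTRY = "Netherlands"

# Hypothetical rows, shaped like the result of a connection query.
connections = [
    {"Computer": "APPSERVER", "RemoteIp": "192.0.2.15", "RemoteCountry": "Netherlands"},
    {"Computer": "APPSERVER", "RemoteIp": "203.0.113.10", "RemoteCountry": "United States"},
    {"Computer": "APPSERVER", "RemoteIp": "203.0.113.10", "RemoteCountry": "United States"},
    {"Computer": "APPSERVER", "RemoteIp": "198.51.100.7", "RemoteCountry": "Brazil"},
]

# Keep only connections that leave the home country.
foreign = [row for row in connections if row["RemoteCountry"] != HOME_COUNTRY]

# Count the foreign connections per destination country.
per_country = Counter(row["RemoteCountry"] for row in foreign)

for country, count in per_country.most_common():
    print(f"{country}: {count}")
```

Filtering in the query itself is usually preferable because less data has to travel to the notebook. Doing it in Python is handy when you want to try several filters on a result you already have.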
Sharing your findings
A unique feature of Jupyter is the Presentation mode. It allows you to easily share key items from your notebook with other people in a visually friendly way, without having to copy/paste data to another application.
You can use Markdown text to annotate your notebook. Enable the Slide picker by going to the View menu, Cell Toolbar, then Slide Show. Go to any row and on the right-hand side select to Skip it, be part of a Slide, etcetera.
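Behind the scenes, the Slide picker stores your choice in each cell's metadata under a slideshow key, which means you can also set it from code. The sketch below works on plain dictionaries shaped like notebook cells. The helper name set_slide_type and the two sample cells are hypothetical.

```python
import json

# Slide types offered by the Slide Show cell toolbar ("-" means unset).
VALID_SLIDE_TYPES = {"slide", "subslide", "fragment", "skip", "notes", "-"}


def set_slide_type(cell: dict, slide_type: str) -> dict:
    """Record the slide type in the cell's metadata (hypothetical helper)."""
    if slide_type not in VALID_SLIDE_TYPES:
        raise ValueError(f"unknown slide type: {slide_type}")
    cell.setdefault("metadata", {})["slideshow"] = {"slide_type": slide_type}
    return cell


# Two illustrative cells: a finding to present and scratch work to hide.
finding = {
    "cell_type": "markdown",
    "metadata": {},
    "source": "## Suspicious outbound connections",
}
scratch = {
    "cell_type": "code",
    "metadata": {},
    "source": "df.head()",
    "outputs": [],
    "execution_count": None,
}

set_slide_type(finding, "slide")
set_slide_type(scratch, "skip")
print(json.dumps(finding["metadata"]))
```

Tagging cells this way is useful when a long hunting notebook has dozens of scratch cells that should all be skipped in the presentation.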
Lastly, click on ‘Enter/Exit RISE Slideshow’ to share your findings:
John Lambert, distinguished engineer at Microsoft’s Threat Intelligence Center, has some other great examples on threat hunting with Jupyter which he has shared here:
There is also a sample notebook on MyBinder that shows you step-by-step which Kqlmagic commands are available, and how to use them.
Jupyter is a great platform for threat hunting. You can work with data in-context and natively connect to security backends in Microsoft Azure using Kqlmagic.
Best of all, using Azure Notebooks and Azure Security Center, we didn’t spend a dollar and got our threat hunting platform for free :-)
Start learning KQL, Python and Jupyter today and supercharge your hunting skills!
— Maarten Goet, MVP & RD