After setting up a Microsoft Sentinel environment, it’s tempting to push as much data into the new SIEM as possible. This is a common pitfall: Sentinel is a cloud SIEM, so ingestion and storage costs can increase rapidly if not managed properly. Before enabling a new data connector, you should consider its use cases and priority. This article outlines my thought process for choosing data connectors and keeping costs down.
Not Like On-Premises SIEM
A common mistake when migrating from an on-premises SIEM to Sentinel is to enable every data connector and ingest as much data into Sentinel as possible, including application logs, firewalls, and NetFlow logs from switches. While this approach might work well with other SIEMs, it doesn’t for Sentinel. Billing for Sentinel is consumption-based, so costs depend on how much data Sentinel receives and stores. Although it is technically feasible to ingest data from many sources into Sentinel, monthly bills will increase rapidly, and the cost of Sentinel will be horrendous.
Using a cloud mindset is essential when migrating to Sentinel. This does not mean you can’t add the data you need. Instead, think about which data is valuable, how you can improve efficiency, and how to keep costs down.
Reasons to Add Data to Sentinel
Before you add a data source to Sentinel, consider its use case and understand why it’s important to have the data in Sentinel. For me, four reasons exist to send data into Sentinel:
- Active alerting
- Investigation and enrichment
- Visualization and reporting
- Compliance
Active alerting is the most common reason to send data to Sentinel. After ingesting the data, we can build analytic rules that generate alerts and incidents for the security team to investigate. Common examples are sign-in logs and process events from endpoints.
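To make this concrete: an analytic rule is essentially a scheduled KQL query whose results become alerts. The sketch below runs against the Azure AD SigninLogs table; the one-hour window and the 20-failure threshold are arbitrary examples, not recommendations.

```kql
// Illustrative scheduled analytic rule query: accounts with a burst
// of failed sign-ins in the last hour. Thresholds are examples only.
SigninLogs
| where TimeGenerated > ago(1h)
| where ResultType != "0"          // non-zero ResultType = failed sign-in
| summarize FailedCount = count(), Apps = make_set(AppDisplayName) by UserPrincipalName
| where FailedCount > 20
```

Wired into a scheduled analytic rule, each row this query returns can become an incident for the SOC to triage.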
Environments often contain data sources that hold little value for generating alerts but are useful when investigating them. A great example is network logs from your proxy, a product such as zScaler. All my incidents are created from Microsoft Defender for Endpoint data, but that data sometimes lacks essential details. zScaler logs can be ingested to augment it during an investigation: with zScaler, you can retrieve the full URL of an HTTP request, including all parameters, compared to only the domain name from Microsoft Defender for Endpoint. These logs can be ingested as basic logs to save on ingestion and retention costs; basic logs are much cheaper to ingest, with the downside that they can’t be used for active alerting.
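As a sketch of such an enrichment query: pivoting from a suspicious domain seen in Defender for Endpoint to the full URLs recorded by the proxy. The table name `ZscalerWeb_CL` and its columns are placeholders here; use whatever schema your proxy connector actually creates.

```kql
// Investigation pivot: full proxy URLs for a domain that surfaced in
// a Defender for Endpoint incident. Table and column names are
// placeholders for your proxy connector's actual schema.
let suspect = "example-malicious.com";
ZscalerWeb_CL
| where TimeGenerated > ago(7d)
| where Url_s has suspect
| project TimeGenerated, SourceIp_s, Url_s, Action_s
```

Note that basic logs only support a limited subset of KQL at query time, so keep enrichment queries to simple filter-and-project patterns like this one.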
Sentinel includes a feature called ‘Workbooks’ that supports the creation of visualizations from data stored in Log Analytics. Through KQL queries, we can create interactive reports allowing you to present the data stored in the SIEM in a more user-friendly way. One example could be logs from a Web Application Firewall. This data is also used for active alerting, but it’s a great example of a data source that allows for nice visualizations. By ingesting the logs from your Web Application Firewall, you can create reports showcasing the activity on your web services and what geographic locations are the most active.
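A workbook visualization is driven by a KQL query like the one below. The table and category follow Azure Application Gateway WAF logs in AzureDiagnostics; adjust both to your own WAF product.

```kql
// Workbook query sketch: the busiest client IPs hitting the WAF over
// the last day. Table/column names follow Application Gateway WAF
// logs; adjust for your own WAF's schema.
AzureDiagnostics
| where Category == "ApplicationGatewayFirewallLog"
| where TimeGenerated > ago(24h)
| summarize Requests = count() by clientIp_s
| top 10 by Requests
```

In a workbook, a query like this can be rendered as a chart or a map visualization that resolves the client IPs to geographic locations.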
The last reason, compliance, might not be immediately apparent, but it covers the legal requirements you might have as an organization. Some organizations are required to retain data for a set number of months or years. By sending that data to Sentinel, you can meet those requirements on a unified platform. For these data types, you can use the archive tier, covered here.
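As a configuration sketch, the archive tier is set per table by splitting interactive retention from total retention. The resource group, workspace, and table names below are placeholders; check the current Azure CLI documentation for your version.

```shell
# Keep a table interactive (queryable) for 90 days, archived for
# 7 years (2556 days) total. Resource names are placeholders.
az monitor log-analytics workspace table update \
  --resource-group my-rg \
  --workspace-name my-sentinel-workspace \
  --name AuditLogs \
  --retention-time 90 \
  --total-retention-time 2556
```

Data older than the interactive retention moves to the cheaper archive tier and can still be reached through search jobs or restores when an auditor asks for it.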
When I onboard a new customer onto Sentinel, I always start small and enable a set of basic data connectors first. This method has a couple of advantages:
- It allows the organization to understand the product with a set of common data, and they can learn how to use Sentinel based on that information.
- It keeps the cost low.
- It helps the SOC identify detection gaps in their environment, which in turn helps you prioritize the next set of data connectors.
When choosing the first set, I always work my way down the following list:
- Microsoft Cloud Logs
- External Security Products
- Network Logs
- Application logs
Microsoft Cloud Logs
Most organizations I work with have standardized on the Microsoft 365 E5 Security stack, which provides a ton of visibility into on-premises and cloud resources. By starting with the Microsoft Cloud logs, we can gain a large amount of visibility for a limited price. This default set of connectors includes:
- Microsoft 365 Defender
- Azure Activity
- Azure Active Directory
- Office 365
- Microsoft Defender for Cloud
This Microsoft article confirms that most of the connectors listed above are free to ingest. This set of standard logs is a great starting point for discovering Sentinel while still having decent coverage of your environment.
When working with a Microsoft 365 E3 license, the default logs won’t suffice, as E3 lacks many of the security features that generate them. Those customers should look into bringing their mail gateway, EDR, and other security logs into Microsoft Sentinel, which is covered in the next section.
External Security Products
Of course, not everything is Microsoft, and most organizations will use other security products and tools to cover certain items that Microsoft solutions might not cover. Many built-in data connectors are available to connect external sources to Microsoft Sentinel as well.
External security products can monitor your environment and create alerts and incidents for investigation. By ingesting those logs, you ensure your security team has a single pane of glass in terms of security incidents.
Network Logs
While a lot is being moved to the cloud, almost all organizations still have on-premises firewalls, switches, and proxies. The data from these sources is useful for investigation, reporting, and enrichment, and sending those logs to your SIEM can increase the scope of your SOC.
Application Logs
The last item on the list is application logs. These are not sign-in logs. Instead, they’re activity logs generated by applications, ranging from ERP systems to HR tools. These logs are often organization-specific and require input from the business itself on what kind of activity is suspicious and should trigger an alert.
Filtering Is Key
While you could copy a product’s logs into Sentinel wholesale, chances are you don’t need everything in them. Before ingesting anything, it’s wise to filter the data down to the helpful information. Firewall logs are a great example: you might be interested in administrator activity (when somebody creates or deletes a rule) but not want all the raw NetFlow records. Some products support a granular approach to sending data, but most do not. This is where prefiltering comes in: before log data reaches Sentinel, you can filter it in one of two ways:
- If you ingest the data through the Azure Monitor Agent, use Data Collection Rules and write transformation queries to decide what data you want.
- For data sources that don’t support the Azure Monitor Agent, Sentinel provides integration with Logstash, an open-source tool that lets you define filters for the data you forward to Sentinel.
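For the Azure Monitor Agent route, the Data Collection Rule transformation is itself a KQL statement run against the incoming stream (exposed as `source`). A sketch, assuming a hypothetical `EventType` column and values for a firewall source:

```kql
// DCR ingestion-time transformation: keep only firewall admin
// activity and drop raw traffic records before they are billed.
// 'EventType' and its values are illustrative; match them to your
// source's actual schema.
source
| where EventType in ("rule-created", "rule-deleted", "admin-login")
```

Logstash achieves the same effect with a `filter` block that drops the event categories you don’t want before its Sentinel output plugin forwards the rest.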
Think Before Connecting
Although Sentinel makes it easy to onboard many data sources quickly, it is important to keep cost in perspective. Organizations should have a valid reason to send data (‘Just because we can’ isn’t one) and prioritize sources against each other. Sending all your data to Sentinel on day one does not make sense, as you will pay for data you won’t use. Prioritize the data connectors that provide the most useful data and work your way down.