How To Create a Logging Strategy


Logs contain critical data needed to detect threats, but not all logs are relevant when it comes to a strong detection and response strategy. A strong logging strategy can help you monitor your environment, identify threats faster, and help you determine where you can get the most value from your data. 

However, there is no single one-size-fits-all solution. A security operations center is often at the nexus of log collection since it needs to pull from so many different sources. It is easy to get buried in log data, so it is extremely important to take a concrete, requirements-based, value-driven approach to collection, use, and retention.

Strategizing Logs for Security Monitoring

There are a few reasons organizations need to develop a logging strategy. One driver may be a compliance mandate that requires a SIEM to store specific log types. If that mandate drives the logging strategy rather than what the organization actually needs for comprehensive security monitoring, the strategy may be implemented improperly or inefficiently. Another driver may be the need to tune a SIEM that ingests too many log sources and generates false positives, causing alert fatigue for the security team and driving up data storage costs unnecessarily. When a SIEM is launched without a logging strategy in place, these costs and security risks increase.

So where is the starting point when building a logging strategy? First, decide how to go about collecting logs.

Log collection methods are executed in one of two ways:

  • Volume Logging – Input focused: Log all the sources, and let the analysts sort it out.
  • Selective Logging – Output focused: Pick and choose what logs you want, and how you want them, and hope you didn’t forget anything.

We can create the optimal solution by taking a hybrid approach and incorporating both methods.

Start with volume logging and implement a human-led process of constant tuning and pruning. It will take more ongoing maintenance, but you will retain data you might not have known you needed, without paying for the storage and licensing of logging everything. With human processes and procedures, your logging strategy will be cheaper and more effective than relying on a tool or a blanket strategy every time.

When tuning and pruning, one of the best approaches is to start by removing the most frequent events, as they are the least likely to have significant security value. Focusing on outliers directs attention to the types of logs with the most security value, and it is much more cost effective in terms of logging and storage.
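As a rough illustration of the pruning step described above, the following minimal sketch ranks event types by volume so an analyst can review the noisiest ones first. The function name `pruning_candidates` and the sample event labels are hypothetical placeholders, not output from any particular SIEM, and a high count only flags a candidate for human review, not automatic removal.

```python
from collections import Counter

def pruning_candidates(event_ids, top_n=3):
    """Return the top_n most frequent event types as (id, count, share of volume)."""
    counts = Counter(event_ids)
    total = sum(counts.values())
    return [
        (event_id, count, count / total)
        for event_id, count in counts.most_common(top_n)
    ]

# Hypothetical sample: the labels are placeholders, not real event IDs.
sample = (
    ["heartbeat"] * 70
    + ["dns_query"] * 20
    + ["login_failure"] * 8
    + ["new_service"] * 2
)

for event_id, count, share in pruning_candidates(sample, top_n=2):
    print(f"{event_id}: {count} events ({share:.0%} of volume)")
```

In this made-up sample, the two noisiest event types account for 90 of 100 events, which is the kind of concentration that makes frequency a reasonable first filter.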

One other factor you should consider is the purpose behind the logging. This will vary greatly depending on who will be using the logs and for what tasks they will need them. Typically these will come down to three implementations.

Security Alerts Only

  • With this approach, you log only the events and data fields that are directly used for generating security alerts. Short, sweet, and to the point, this option gives you the smallest overall volume, which helps if your organization has concerns about storage or about licensing costs for a premium SIEM.

Security Relevant Logs

  • These are all of the data you need for alerts, plus any data that would be useful for a security investigation. Not all data can be used to generate alerts, but it can still provide immense value for threat hunting and incident response.

All Operational Logs

  • This is common when the security team and the network operations team are working off of the same data in the same tool. For simplicity’s sake, all logs relevant to both security and network operations are logged into the platform of choice, and the respective teams parse through what is relevant to them. This is handy for larger organizations with their own internal SOC and NOC teams, but it does ingest a much larger volume of data.

What Logs Should You Collect?

Relevant logs come from all types of sources, and it is worth using industry-standard frameworks like NIST SP 800-92[1] or MITRE ATT&CK to help build a logging strategy that works for your organization. Compliance is a great starting point for log collection, but it is not the goal. Prioritizing the logs you collect lets you get the most from your SIEM, giving you the visibility you need for security operations.

When looking at what logs to prioritize for detection, one of the more common methods is to see which technologies and data sources provide the best coverage of techniques found in the MITRE ATT&CK Framework. While you should always aim for more than simply checking a tactic or technique off a list, it is still a great starting place when developing your strategy. We can leverage this method by taking a CSV of the MITRE ATT&CK framework and performing some statistical analysis on two of the newer data points it has added: Data Sources and Data Objects. These data points show what we need to log in order to detect the documented techniques and sub-techniques within the framework.

Here are a few interesting statistics from my analysis of the MITRE ATT&CK Data Sources and Data Objects:

  • As of this writing, MITRE has 552 techniques and sub-techniques, 80 of which, in MITRE’s terms, “… will take place outside the visibility of the target organization, making detection of this behavior difficult.”[2]
  • The framework defines 32 different data sources and 99 data objects used for detection.
  • 125 techniques and sub-techniques apply only to Windows operating systems, 16 to macOS, and 9 to Linux. All others either overlap operating systems or apply to a specific application or tool.
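The kind of counting behind statistics like these can be sketched in a few lines. This is only an illustration and not the exact method used for the analysis: the column names (`ID`, `data sources`), the comma-separated layout of the data source field, and the sample rows with placeholder technique IDs are all assumptions.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for an ATT&CK export. The column names and rows
# are assumptions for illustration; the technique IDs are placeholders.
SAMPLE_CSV = """ID,data sources
T-A,"Command: Command Execution, Process: Process Creation"
T-B,"Process: Process Creation, File: File Modification"
T-C,"Command: Command Execution"
T-D,""
"""

def count_source_pairs(csv_text):
    """Count how many techniques mention each 'Data Source: Data Object' pair."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Use a set so a pair listed twice for one technique counts once.
        pairs = {p.strip() for p in row["data sources"].split(",") if p.strip()}
        counts.update(pairs)
    return counts

for pair, total in count_source_pairs(SAMPLE_CSV).most_common():
    print(f"{pair}: {total}")
```

Sorting the resulting counts in descending order gives a ranked list of which data source and object pairs are referenced by the most techniques.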

Mapping Data Sources to MITRE ATT&CK

Now for the fun part: If we were building our logging strategy from scratch and wanted to bring in the most bang-for-the-buck data sources first, which data source and object pairs cover the most techniques and sub-techniques?

Let’s take a look at some analysis on the MITRE ATT&CK Framework and see which data source/object pairs are mentioned the most as covering different techniques and sub-techniques.

MITRE ATT&CK Data Source: Object            Total # of Techniques and Sub-Techniques
Command: Command Execution                  243
Process: Process Creation                   197
File: File Modification                     95
Network Traffic: Network Traffic Content    89
Network Traffic: Network Traffic Flow       84

Almost half of the documented techniques (243 of 552, about 44%) can be detected with command line logging.

Interesting, right?

Therefore, if developing a strategy focused on technique coverage, this is where to start. Using this data, you can prioritize not just log sources to ingest, but also consider the security tools you need to purchase, deploy, or replace. The top three on that list are all data objects that can be collected by an EDR solution; therefore, EDR rises as the top priority to consider when looking to improve your visibility.
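To put the table's counts in proportion, the short sketch below converts each count into a share of the 552 techniques and sub-techniques cited earlier. The counts are copied from the table; the names `PAIR_COUNTS` and `coverage_share` are illustrative. Because a single technique can list several data source pairs, the shares overlap and should not be added together.

```python
TOTAL_TECHNIQUES = 552  # techniques and sub-techniques cited in this article

# Counts copied from the table above.
PAIR_COUNTS = {
    "Command: Command Execution": 243,
    "Process: Process Creation": 197,
    "File: File Modification": 95,
    "Network Traffic: Network Traffic Content": 89,
    "Network Traffic: Network Traffic Flow": 84,
}

def coverage_share(count, total=TOTAL_TECHNIQUES):
    """Fraction of all techniques that mention a given data source pair."""
    return count / total

for pair, count in sorted(PAIR_COUNTS.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pair}: {coverage_share(count):.1%}")
```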

Log prioritization can be complex, but when all else fails, Don Murdoch, the author of the Blue Team Handbook, said it best: “If you have to make a choice, use user attributable data over all others.”

For more details on the above analysis, use this PDF as a reference: mitre_data_source_analysis.pdf

Calculating Logging Infrastructure Needs

When building a logging strategy, it is important to plan out the hardware and software needed to support both the logging operation itself and the storage for the logs collected. The best method of gauging what is needed is a proof of concept (POC).

A POC takes a sampling of log data to easily calculate certain needed data points:

  • EPD: Events per day – Used for calculating overall storage capacity.
  • EPS: Events per second – Used for calculating the network and tool throughput required.
  • Peak EPS: Used for calculating maximum surge throughput needed at one time.
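As a minimal sketch of how these three figures might be derived from a POC sample, the function below takes event timestamps in epoch seconds and computes average EPS, peak EPS, and a projected EPD. The function name and the sample data are hypothetical. Projecting a full day from a short window is only a rough estimate, so a real sample should span enough time to capture normal daily and weekly cycles.

```python
from collections import Counter

SECONDS_PER_DAY = 86_400

def throughput_metrics(timestamps):
    """Compute average EPS, peak EPS, and projected EPD from epoch-second timestamps."""
    if not timestamps:
        raise ValueError("need at least one event")
    per_second = Counter(int(ts) for ts in timestamps)
    # Seconds covered by the sample, including the first and last second.
    window = max(per_second) - min(per_second) + 1
    avg_eps = len(timestamps) / window
    return {
        "avg_eps": avg_eps,
        "peak_eps": max(per_second.values()),
        "projected_epd": avg_eps * SECONDS_PER_DAY,
    }

# Hypothetical 10-second sample: a steady 2 events per second,
# plus a burst of 10 extra events during second 5.
sample = [t for t in range(10) for _ in range(2)] + [5] * 10
print(throughput_metrics(sample))
```

The gap between the average and the peak in this sample (3 versus 12 events per second) shows why surge throughput needs to be sized separately from storage.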

If a POC of regular traffic is not feasible, it is possible to deploy scripts that pull the appropriate logs to help gauge an effective EPS. Another method for sizing EPS is to use estimates from third-party research, such as the two references below. While easy, this is not advised, as estimates can vary wildly between sources. These options are convenient, but a POC is typically the only way to size event throughput and log storage needs with any accuracy.

  • SANS: Benchmarking SIEM [3]
  • Estimating Log Generation for Security Information Event and Log Management [4]

Types Of Log Storage

There are three types of log storage that we would typically see and deal with when managing logs in a SIEM:

  • Hot: These are your most recent and active logs to monitor. They are typically saved on SSDs for the fastest response, and retention on these disks is recommended for a minimum of 7 days, preferably 30 or more.
  • Warm: Once past the time frame of the most use, logs can be moved from SSDs to slower but larger media such as hard disk or tape. Some SIEMs have a function that transfers data from these disks back to SSD for faster searches on necessary data. These logs are typically stored for at least 90 days.
  • Cold: Beyond the first 90 days, the chances of needing a particular log file are slim, but not zero. Cold storage is a cheap long-term solution, but it will take a long time to spool back up for use if needed.
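The tiering above can be expressed as a simple age-based rule. This is a sketch under assumptions: the 30-day and 90-day cutoffs are taken from the ranges mentioned in the list, the function name `storage_tier` is illustrative, and the right thresholds for a given environment will depend on search patterns and budget.

```python
HOT_DAYS = 30   # assumption: the preferred hot retention mentioned above
WARM_DAYS = 90  # assumption: the typical warm retention mentioned above

def storage_tier(age_days, hot_days=HOT_DAYS, warm_days=WARM_DAYS):
    """Return the storage tier a log of the given age (in days) belongs in."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days <= hot_days:
        return "hot"
    if age_days <= warm_days:
        return "warm"
    return "cold"

for age in (3, 45, 200):
    print(f"{age} days old -> {storage_tier(age)}")
```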

A data retention policy is always an important item to plan out when developing a logging strategy. In security, it is important to think about the historical logs that would be needed to effectively investigate a security incident. Industry statistics vary on the number of days it takes to detect an intrusion, with one study noting 56 days [5] and another 212 days [6] of dwell time before detection. An organization should set its minimum retention period to cover at least that dwell time. Anything beyond that length of time is determined by your budget, program, and compliance requirements. An organization with compliance requirements for breach reporting or log collection should refer to its specific regulatory requirements and go from there.
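To see how the retention period drives storage, the sketch below multiplies events per day by an average event size and a retention window. The event volume and the 500-byte average event size are made-up inputs, and the result is raw size only: it ignores compression, indexing overhead, and replication, all of which vary by SIEM.

```python
BYTES_PER_GIB = 2**30

def storage_gib(events_per_day, avg_event_bytes, retention_days):
    """Estimate raw log storage in GiB for a given retention window."""
    return events_per_day * avg_event_bytes * retention_days / BYTES_PER_GIB

# Hypothetical inputs for illustration only.
events_per_day = 10_000_000
avg_event_bytes = 500

# Retention windows matching the two dwell-time figures cited above.
for days in (56, 212):
    size = storage_gib(events_per_day, avg_event_bytes, days)
    print(f"{days} days of retention: about {size:,.0f} GiB raw")
```

Because the estimate scales linearly, moving from a 56-day to a 212-day window increases raw storage by a factor of roughly 3.8, which is why the retention decision belongs in the sizing exercise and not after it.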

Log Collection Types

Determining how to collect logs depends greatly on the following log types:

1. Application/device logs: These logs will come from a single source. These can be easily captured by native logging utilities within the device/application, a logging agent installed on the device, or some of the many options for syslog or other agentless logging. The applications and technologies used in your environment may vary, but be sure to investigate all of the options available to determine the right methodology for you.

2. Service logs: Service logs are slightly different from the above logs. With service logs, you might not know exactly where the logs are coming from, or the utility to capture them may not be easy to use. Collecting these logs comes down to a few different methodologies:

  • Native logging of services built into the devices that use them.
    • This offers a high degree of fidelity, and you can tailor the quality of logging at the device itself.
    • The biggest drawback is the amount of configuration required to set up every device for logging. You must also be prepared to collect those logs from many different sources, which introduces networking issues and means dealing with multiple log formats.
  • Logging via network monitoring tools to capture all related traffic.
    • By using security tools like Zeek (https://zeek.org) or Corelight (https://corelight.com), you can generate logs for all of the services you need with one tool and one log collection location, drastically increasing simplicity without sacrificing quality.
    • This will generate logs for systems you are unaware of, as well as provide a consistent logging format.
    • The difficulty here is getting permissions and buy-in from both management and engineering as it usually requires a network tap, an additional server, and permissions to get the required level of visibility.
  • Other options
    • Network devices such as next-gen firewalls can often provide similar network traffic monitoring. These options will work but tend to have inferior logging quality. Certain protocols, such as HTTP, will be logged in sufficient detail. Data for other protocols, such as HTTPS and DNS, is typically insufficient.
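One practical benefit of a consistent logging format is that a single small parser can handle every service. The sketch below assumes Zeek's tab-separated log layout, in which a `#fields` header line names the columns and `-` marks an unset value. The sample lines are made up and heavily abbreviated, using documentation-range IP addresses, and the function name `parse_zeek_tsv` is illustrative.

```python
def parse_zeek_tsv(lines):
    """Yield one dict per record from Zeek-style tab-separated log lines."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # The header lists column names after the "#fields" token.
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            values = line.split("\t")
            # Assumption: "-" marks an unset field, so map it to None.
            yield {k: (None if v == "-" else v) for k, v in zip(fields, values)}

# Hypothetical, abbreviated sample; real logs carry more header lines and fields.
SAMPLE = [
    "#fields\tts\tid.orig_h\tid.resp_h\tid.resp_p\tproto\tservice",
    "1700000000.000000\t192.0.2.10\t198.51.100.5\t53\tudp\tdns",
    "1700000001.500000\t192.0.2.11\t198.51.100.7\t443\ttcp\t-",
]

for record in parse_zeek_tsv(SAMPLE):
    print(record["id.orig_h"], "->", record["id.resp_h"], record["service"])
```

With natively logged services, an equivalent parser would typically be needed for each device's own format.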

The Importance of SIEM Tuning

There is no ‘set it and forget it’ logging strategy. While log collection can provide the fuel needed to drive better security outcomes, the SIEM is like a bonsai tree, needing constant pruning (tuning) and care to thrive. The biggest key to deploying and maintaining a logging strategy with a SIEM is to provide sufficient time and security staff to tune the SIEM. Experienced security analysts and detection engineers can manage and tune the SIEM, and adapt the logging strategy over time to continually improve network security monitoring. A strong logging strategy optimizes the SIEM and ultimately reduces alert fatigue; these outcomes pay long-term dividends.


Sources

[1] https://csrc.nist.gov/publications/detail/sp/800-92/final
[2] https://attack.mitre.org/techniques/T1608/
[3] https://apps.es.vt.edu/confluence/download/attachments/460849213/sans%20siem%20benchmarking.pdf?api=v2
[4] https://content.solarwinds.com/creative/pdf/Whitepapers/estimating_log_generation_white_paper.pdf
[5] https://www.fireeye.com/blog/threat-research/2020/02/mtrends-2020-insights-from-the-front-lines.html
[6] https://www.ibm.com/downloads/cas/OJDVQGRY
