Comment: Enterprise Log Managers – An Unsexy, But Vital Tool

Ultimately, the goal of enterprise log management (ELM) is to escalate the most critical events to operations staff so they can respond appropriately. Without an ELM to correlate that information and point to what is most critical, staff in today’s enterprise would have to sift through millions of events.

You may be asking: ‘Isn’t this security information and event management (SIEM)?’ It’s not. Well, not entirely. ELM and SIEM are interrelated. SIEM is more concerned with the larger view of your overall security landscape, whereas ELM is focused on a specific element of security (what is happening, and where?). SIEM correlates data across varying data sources and environments – a more holistic view. Therefore, ELM is a subset and critical component of a SIEM program.

Not all companies require a SIEM program. However, most companies would benefit from an ELM solution. For the purposes of this article, we’ll stick to ELM. For more information on SIEM, we encourage you to download ISACA’s free SIEM white paper.

Corporate policies and their related controls exist to deter or prevent undesirable activities. The corporate policies, controls, and data feeds from systems and applications all need to be incorporated into the ELM. One measure of the quality of an ELM technology is how easily it interfaces with your critical systems. Ask questions such as ‘How many different components does it understand?’ and ‘How much technical expertise is required to make it deliver value?’

Use Cases and Setup

Privileged access monitoring is a classic example in which an ELM gathers logs from various systems and creates a direct workflow to the operations staff, enabling them to take action against items considered inappropriate. For example, a domain admin account that attempts to log in outside an allowed change window and fails to authenticate several times in a row may indicate a brute-force attack. The system must correlate those events and initiate the appropriate workflow, whatever that may be.
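To make the correlation concrete, here is a minimal sketch of the privileged access rule described above. It is not how any particular ELM product works; the `LoginEvent` fields, the change window of 22:00 to 23:59, and the threshold of three consecutive failures are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, time

@dataclass
class LoginEvent:
    timestamp: datetime
    account: str
    privileged: bool   # True for accounts such as domain admins
    success: bool

# Hypothetical policy values; a real deployment would take these from corporate policy.
CHANGE_WINDOW = (time(22, 0), time(23, 59, 59))
MAX_CONSECUTIVE_FAILURES = 3

def in_change_window(ts: datetime) -> bool:
    start, end = CHANGE_WINDOW
    return start <= ts.time() <= end

def suspicious_accounts(events):
    """Return privileged accounts with a run of failed logins outside the change window."""
    streak = {}    # account -> current count of consecutive failures
    flagged = []   # accounts in the order they crossed the threshold
    for ev in sorted(events, key=lambda e: e.timestamp):
        # Only privileged activity outside the allowed window is of interest here.
        if not ev.privileged or in_change_window(ev.timestamp):
            continue
        if ev.success:
            streak[ev.account] = 0   # a successful login breaks the run
            continue
        streak[ev.account] = streak.get(ev.account, 0) + 1
        if streak[ev.account] >= MAX_CONSECUTIVE_FAILURES and ev.account not in flagged:
            flagged.append(ev.account)
    return flagged
```

In practice the returned accounts would feed whatever workflow the operations staff has agreed on, such as opening a ticket or paging the on-call analyst.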

The processes established around the solution are just as important. The log management solution is only as good as the processes and teams that support it. Typically, this requires both an engineering and operations staff. The engineers build and configure the ELM so the right alerts are coming through. The operations staff is then able to take the alerts and, ideally, do the ‘right thing’.

Of course, the less mature your existing processes and workflows, the more iterations will be required. The events you consider ‘taggable’ – the events you are interested in – must tie back to corporate policy. The basic premise of ‘thou shalt not access that which you are not allowed to access’ will guide the rules you develop.

Activity will fall into one of three categories: transactions you don’t care about, transactions you want to know about, and transactions you want to take immediate action on. For example, you might have mis-keyed your password while attempting to log in. That type of transaction is not necessarily one to be concerned about. However, if a thousand more attempts follow in the next 60 seconds, something is clearly amiss. This pattern most likely indicates an attacker attempting to brute-force access to your valuable data. Flag it and determine what part of the organization should receive the system workflow.
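The three categories can be sketched as a sliding-window counter over failed logins. This is an illustration only: the class name, the category labels, and the notify threshold of five are assumptions, while the 60-second window and the thousand-attempt level come from the example above.

```python
from collections import deque
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=60)   # from the example in the text
NOTIFY_THRESHOLD = 5             # hypothetical: "want to know about"
ESCALATE_THRESHOLD = 1000        # from the example: "take immediate action"

class FailedLoginClassifier:
    """Classify each failed login by how many failures the account had in the last 60 seconds."""

    def __init__(self):
        self._recent = {}  # account -> deque of failure timestamps inside the window

    def classify(self, account: str, ts: datetime) -> str:
        window = self._recent.setdefault(account, deque())
        window.append(ts)
        # Drop failures that are older than the window relative to this event.
        while ts - window[0] > WINDOW:
            window.popleft()
        count = len(window)
        if count >= ESCALATE_THRESHOLD:
            return "act"
        if count >= NOTIFY_THRESHOLD:
            return "notify"
        return "ignore"
```

A single mis-keyed password lands in "ignore", a handful of failures in quick succession in "notify", and a sustained flood in "act". Events are assumed to arrive in timestamp order.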

ELM can provide value through non-security use cases as well. There could be transactional activity that indicates a problem, such as multiple acknowledgement requests being generated as a result of a system glitch. The sheer volume could saturate the network, with the same effect as a denial-of-service attack. The ELM could flag this type of activity when it occurs so that remediation can begin early, potentially averting an outage of a critical service.

A virus on the network gives a good ELM an opportunity to demonstrate intelligence. When the tool logs virus-induced events and correlates them as a single outbreak, operations can proactively target the affected population. Solving the underlying problem, rather than reactively addressing each incident, can save hundreds or thousands of hours. This is the value proposition the Information Technology Infrastructure Library (ITIL) has put forth for decades: multiple incidents occurring for similar reasons typically represent a problem needing a solution (i.e., ‘problem management’).
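As a rough sketch of that correlation step, individual detections can be grouped by signature so that operations sees one outbreak with a list of affected hosts instead of many separate incidents. The function name, the `(host, signature)` input shape, and the three-host cutoff are assumptions made for illustration.

```python
from collections import defaultdict

def group_outbreaks(detections, min_hosts=3):
    """Group (host, signature) detections into outbreaks.

    A signature counts as an outbreak once it appears on at least
    `min_hosts` distinct hosts (a hypothetical cutoff).
    """
    hosts_by_signature = defaultdict(set)
    for host, signature in detections:
        hosts_by_signature[signature].add(host)
    return {
        signature: sorted(hosts)
        for signature, hosts in hosts_by_signature.items()
        if len(hosts) >= min_hosts
    }
```

Each entry in the result is a candidate for a single problem record covering every affected host.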

Requisite Skills

The primary skill associated with successfully deploying an ELM is being able to translate business use cases into the ELM tool’s language. If your environment deals with personally identifiable information, for example, privacy concerns are going to be one of the highest priorities.

The team must understand the systems generating the data and how those data relate to the company’s use cases. For example, we don’t want people logging on as a local administrator in an Active Directory domain environment; therefore, the ELM would need to alert on the appropriate event ID.
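A rule of this kind might look like the sketch below. It assumes the log records have already been parsed into dictionaries, and it treats the field names (`EventID`, `TargetUserName`, `TargetDomainName`, `Computer`) and the list of administrator account names as assumptions about your environment. Event ID 4624 is the Windows Security log entry for a successful logon; comparing the account’s domain field with the computer name is a common heuristic for spotting a local account, not a guarantee.

```python
LOGON_SUCCESS_EVENT_ID = 4624  # Windows Security log: an account was successfully logged on

def is_local_admin_logon(record, local_admin_names=frozenset({"administrator"})):
    """Return True if a parsed record looks like a local administrator logon.

    Heuristic (an assumption, verify against your own logs): for a local
    account, the domain field carries the computer name, not the AD domain.
    """
    if record.get("EventID") != LOGON_SUCCESS_EVENT_ID:
        return False
    user = record.get("TargetUserName", "").lower()
    domain = record.get("TargetDomainName", "").lower()
    # The computer field may be fully qualified; compare on the short host name.
    computer = record.get("Computer", "").split(".")[0].lower()
    return bool(domain) and domain == computer and user in local_admin_names
```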

As IT professionals, we recognize there will always be technologies that are not commonly known and will require additional work to develop the proper interface. The resources you assign as your solution delivery project leaders or engineers for an ELM deployment must understand how to translate your business logic into the technical speak of your IT landscape.

Challenges

Scalability is the first challenge and biggest concern in architecting the solution. There will most likely be significant amounts of data logged. Data retention policies and growth must also be considered. Depending on your use cases, large portions of data may need to be held for very long periods of time. Therefore, consideration should be given to balancing your company’s tolerance for risk with its taste for capital investment.

ELM systems typically work in one of two ways: data intensive, which gathers all data to be analyzed later and thus needs to scale accordingly; and limited collection, which has agents gather only the information considered ‘interesting’. In the case of the former, storage will be a greater concern; for the latter, processing capabilities will need to be stronger to reduce the chances of introducing latency into transaction processing time.
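A back-of-the-envelope calculation helps frame the storage side of that trade-off. The sketch below is plain arithmetic; the function name and every number used with it (event rate, average event size, retention period, compression ratio) are hypothetical inputs you would replace with measurements from your own environment.

```python
SECONDS_PER_DAY = 86_400

def retained_storage_bytes(events_per_second, avg_event_bytes,
                           retention_days, compression_ratio=1.0):
    """Estimate steady-state storage for a 'collect everything' design.

    compression_ratio is raw size divided by stored size (1.0 = no compression).
    Growth in event volume is deliberately ignored in this sketch.
    """
    raw = events_per_second * avg_event_bytes * SECONDS_PER_DAY * retention_days
    return raw / compression_ratio
```

For instance, 1,000 events per second at 500 bytes each is 43.2 GB of raw data per day, or roughly 15.8 TB for one year of retention before compression. A limited-collection design shrinks the event rate but moves the filtering cost onto the agents.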

Many ELM solutions do not use a communications protocol that guarantees delivery and, instead, use protocols – such as User Datagram Protocol (UDP) – that can result in some of the data getting lost. Technical and procedural checks that verify the data actually arrived may therefore be additional requirements to factor into the design.
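One possible verification, sketched here as an assumption and not as a feature of UDP or of any specific ELM, is to have each source stamp its messages with an increasing sequence number so the collector can report gaps. The function name and the numbering scheme are hypothetical.

```python
def missing_sequences(received):
    """Return sequence numbers never seen between the lowest and highest received.

    `received` holds the sequence numbers from ONE source, in any order;
    duplicates and out-of-order arrival (both possible over UDP) are tolerated.
    """
    seen = set(received)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]
```

A non-empty result tells operations that log data was lost in transit, which can then trigger a re-send from the source or at least a note in the audit trail. Messages lost at the very start or end of a stream are not detectable this way.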

Of course, well-defined expectations will determine the perceived success of any ELM implementation. An ELM deployed at an enterprise that has few policies and procedures will have little success, because there will be few rules to correlate the activity against. Define your solution delivery success criteria early, and make sure what you choose is measurable. Consider using a governance and management framework, such as COBIT 5, to guide the initiative.

Some ELMs come with standard rule sets that can accelerate implementation. Refining those rule sets to reflect your organization’s corporate policies will drive the migration from focused manual intervention to true problem management. ELM implementers will then see a reduction in time spent resolving incidents, and their responsiveness will be seen as proactive rather than reactive. As a result, these shops should see a reduction in incident management costs. And, of course, when the ELM is implemented correctly, security issues should decline overall and compliance capabilities should improve.


Robert Zanella, CISA, is a member of the Metro NY ISACA Chapter and has been a Certified Information Systems Auditor since 1995. He is vice president of IT Audit at CA Technologies.

Bill Welch, CISM, is a member of the Metro NY ISACA Chapter. He is senior director of IT Security at CA Technologies.

Mike Mendelsohn is director of Application Security at CA Technologies.

Brian Korte is senior specialist of IT Security at CA Technologies.
