Introduction
Log4Shell (CVE-2021-44228) is a remote code execution (RCE) vulnerability in Log4j, the Apache Software Foundation's open-source logging library. It was published on December 9, 2021, and all hell broke loose afterward. Because Log4j is a common logging library for Java applications, it is extremely widespread.
At Guardicore (now part of Akamai), we aim to keep our customers as secure as possible, so we jumped on the Log4j detection bandwagon. We wanted to make sure we could help our customers map all their vulnerable servers and offer them a segmentation solution to limit the blast radius of any possible exploitation. As Akamai Guardicore is a network segmentation solution, we have strong visibility into the data center's network traffic. For host-based information, we have Guardicore Insight, an integration with OSQuery, an open-source program that allows SQLite-like querying of various OS data. Armed with this, we began our journey.
Note: Perfecting our detection tool is an ever-evolving process. We welcome all feedback from the community regarding better queries or improvements to our detection logic.
Initial Version: Detecting Listening Java Apps
We started learning about the vulnerability and all of its details the moment it was published, and we also wanted to share an immediate solution with our customers. We knew that the vulnerability affects Java applications, and that attackers were using it as an entry point into the organization's network. So, to provide immediate value, we started by detecting all listening Java applications. Mapping them and then applying network segmentation to restrict arbitrary internet access would provide sufficient protection while we worked out a better detection method.
select p.cmdline, p.cwd, l.port from processes AS p join listening_ports AS l on p.pid=l.pid where p.cmdline like '%java%' or p.cmdline like '%jar%'
Fig. 1: OSQuery for detection of Java applications that are listening for connections
Although this query isn't the most accurate, and can raise both false positives and false negatives, it was a good start. We could release it to our Solution Center and to Customer Success managers, and focus on honing our detection methods.
Not-So-Initial Version: Better Java Detection, Exploitation Attempt Lookups
Having sent the initial response, we now had more time to understand the vulnerability in depth and explore more detection methods. We knew that we needed to detect Java applications more reliably, and since not all Java applications use the Java executables for Windows (PE) or Linux (ELF), we had to come up with a better way of doing that. To solve this problem, we created a union of two queries. The first is an improved version of the Java application detection, with a more extensive list of strings to detect Java in the command line:
SELECT DISTINCT LOWER(path) || '%%' AS regex_path FROM processes WHERE (LOWER(cmdline) LIKE '%java%' OR LOWER(cmdline) LIKE '%jar%' OR LOWER(cmdline) LIKE '%jvm%' OR LOWER(cmdline) LIKE '%jdk%' OR LOWER(cmdline) LIKE '%jre%')
Fig. 2: A better query for Java application detection
However, this query was not enough, because some applications use Java by loading the Java virtual machine into their own memory rather than by running the Java executable directly (e.g., Tomcat, and some instances of Elasticsearch). Therefore, we added to the union a list of applications that don't run the Java executable directly, but are tied to Log4j:
SELECT DISTINCT REGEX_MATCH(LOWER(path), '.*?(logstash|jenkins|tomcat|vsphere|vcenter|apache|okta).*?(/|\\)', 0) || '%%' AS regex_path FROM processes WHERE regex_path IS NOT NULL
Fig. 3: Query for a partial list of Java applications that can run without the Java executable
To be thorough, we decided to also check the base directory of each process. If it contained any JAR files, we could probably safely consider it a Java-dependent process, and check it for Log4j dependencies as well.
SELECT file.directory || '%%' AS regex_path FROM processes INNER JOIN file ON file.path LIKE REPLACE(processes.path, processes.name, '%%') AND file.filename LIKE '%.jar'
Fig. 4: Query to find all Java-dependent processes by looking at their containing folders and searching for JAR files
Finally, to detect Log4j dependencies, we went over all the paths that the previous queries returned, and looked for any filename starting with log4j and ending with .jar.
Detecting Log4j vulnerabilities isn't enough: we can also check whether any exploit attempt was made on the system by analyzing the log files for the exploitation and JNDI lookup strings. Rather than inventing new methods, we relied on Florian Roth's YARA signatures and modified them to work with Insight.1
1 Because of some parsing issues, we had to convert the YARA strings to their hex byte counterparts
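As a rough illustration of the conversion described in the footnote, the following Python sketch turns a text string into a YARA-style hex string. The helper name to_yara_hex and the sample string are assumptions made for this example; they are not taken from the original signatures or from our tooling.

```python
def to_yara_hex(text, encoding="utf-8"):
    """Render a text string as a YARA hex string, e.g. "{ 24 7B }".

    Hypothetical helper: it only shows the byte-for-byte conversion idea.
    """
    hex_bytes = " ".join(f"{byte:02X}" for byte in text.encode(encoding))
    return "{ " + hex_bytes + " }"


# Illustrative sample only: the prefix of a JNDI lookup expression.
sample = "${jndi:"
print(to_yara_hex(sample))
```

A hex string matches the same bytes as the text string it was derived from, so the detection logic is unchanged. Modifiers that apply only to text strings (such as case-insensitive matching) would have to be handled separately.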
The full queries for both Log4j detection and log analysis can be found in our previous blog post: Mitigating Log4j Abuse Using Akamai Guardicore Segmentation.
Understanding Java
After releasing the previous set of queries to our field experts, we turned to Java itself, to make sure that our queries actually do what we want them to do: detect all running Java applications on a computer and search for Log4j dependencies. We had to research how Java is executed and how it is packaged.
Java Execution
First of all, all Java applications need to run on the Java runtime and Java Virtual Machine (JVM), which means that most Java applications will run from one of the following processes, which are the Java executables for Windows and Linux:
java
javaw
java.exe
javaw.exe
One exception to that, which we found while checking the above hypothesis and then confirmed with Java's documentation, is that programs can load the JVM library directly into their memory (we've seen that happen with both Tomcat and Elasticsearch, on both Windows and Linux). Since the Java executable does this as well (and thanks to Uptycs for the trick), we can simply look for the JVM dependency in the process memory! Just in case, we can also add a union that checks the process name, to make sure nothing slips through:
SELECT DISTINCT
proc.pid,
proc.path,
proc.cmdline,
proc.cwd,
listening.port,
listening.address,
listening.protocol
FROM process_memory_map AS mmap
LEFT JOIN processes AS proc USING(pid)
LEFT JOIN listening_ports AS listening USING(pid)
WHERE mmap.path LIKE '%jvm%'
UNION
SELECT DISTINCT
proc.pid,
proc.path,
proc.cmdline,
proc.cwd,
listening.port,
listening.address,
listening.protocol
FROM processes AS proc
LEFT JOIN listening_ports AS listening USING(pid)
WHERE proc.name IN ('java', 'javaw', 'java.exe', 'javaw.exe')
Fig. 5: Query to find all Java-dependent processes by looking at their memory map
Java Library Dependency
To load other module dependencies, Java programs need to specify them in a variable called CLASSPATH. The most common way of specifying dependencies in the classpath is by direct naming: just put the JAR (relative or absolute path) directly in the classpath. The other way is to include a folder in the classpath (typically as a wildcard entry such as lib/*), and all JARs under it will be loaded.
The main problem is identifying the classpath. While the classic option is to specify it in the command line, the classpath can also be specified in environment variables or directly inside the main JAR file (through the Class-Path attribute of its manifest).
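To make the first two classpath sources concrete, here is a minimal Python sketch that pulls the classpath out of a process command line, falls back to the CLASSPATH environment variable, and separates direct JAR entries from folder entries. The function names, the POSIX-style command-line parsing, and the ":" separator (Windows uses ";") are assumptions for illustration; the manifest case is not covered.

```python
import shlex

# Launcher flags that take the classpath as the next argument.
CLASSPATH_FLAGS = ("-cp", "-classpath", "--class-path")


def extract_classpath(cmdline, env=None, sep=":"):
    """Return the classpath entries of a Java process as a list.

    Hypothetical helper: looks at the command line first, then at the
    CLASSPATH environment variable. Assumes POSIX-style quoting.
    """
    tokens = shlex.split(cmdline)
    raw = ""
    for index, token in enumerate(tokens):
        if token in CLASSPATH_FLAGS and index + 1 < len(tokens):
            raw = tokens[index + 1]
            break
    else:
        # No flag on the command line: fall back to the environment.
        raw = (env or {}).get("CLASSPATH", "")
    return [entry for entry in raw.split(sep) if entry]


def split_entries(entries):
    """Separate direct JAR references from folder (or wildcard) entries."""
    jars, folders = [], []
    for entry in entries:
        if entry.lower().endswith(".jar"):
            jars.append(entry)
        else:
            folders.append(entry.rstrip("*"))  # "lib/*" becomes "lib/"
    return jars, folders


entries = extract_classpath(
    "java -cp /opt/app/app.jar:/opt/app/lib/* com.example.Main"
)
print(split_entries(entries))
```

Folder entries still need a second lookup to list the JAR files they contain, which is why the two kinds of entries are kept apart.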
Detection Script Development
Armed with our newfound knowledge about Java internals, we wanted to improve our previous queries. On busier servers, the queries could consume a lot of computation resources, which would be quite taxing. We can't afford to overload our customers' servers, as they are critical parts of the data center. To improve performance, we split the bigger query into multiple smaller queries that are guaranteed to run more efficiently. We can then run them one after the other in a Python script that integrates with a designated REST API to run Insight queries. This also paves the way for us to add more complex analysis to determine actual vulnerabilities (by checking the Log4j version), to check whether there are any mitigations in place, and to output all of this in an easy-to-process format (e.g., a CSV file).
The script logic is as follows:
Identify all Java applications using the query from Figure 5
Parse the classpath to find all JAR file dependencies that are referenced in it directly
Extract all folder dependencies in the classpath, and for each folder extract all JAR files under it
For each JAR file that we found, check its name to see whether it is a relevant Log4j JAR file (log4j-core), extract its version, and check whether that version is vulnerable to Log4Shell
Output all of our findings to a CSV file
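The last two steps can be sketched in Python as follows. This is an illustrative approximation, not our actual script: the function names, the assumed log4j-core-<version>.jar naming convention, and the version ranges are assumptions for the example, and the ranges should be verified against the Apache Log4j security advisory.

```python
import csv
import io
import re

# Assumed naming convention: log4j-core-<major>.<minor>[.<patch>][-suffix].jar
LOG4J_CORE_RE = re.compile(
    r"^log4j-core-(\d+)\.(\d+)(?:\.(\d+))?(?:[-.].*)?\.jar$", re.IGNORECASE
)


def parse_log4j_core_version(jar_path):
    """Return (major, minor, patch) for a log4j-core JAR path, else None."""
    filename = re.split(r"[\\/]", jar_path)[-1]
    match = LOG4J_CORE_RE.match(filename)
    if match is None:
        return None
    return tuple(int(part) if part else 0 for part in match.groups())


def is_vulnerable_to_log4shell(version):
    """Assumed ranges: 2.x below 2.15.0, minus backported fixes
    (2.12.2 and later 2.12.x, 2.3.1 and later 2.3.x)."""
    major, minor, patch = version
    if major != 2 or minor >= 15:
        return False
    if minor == 12 and patch >= 2:
        return False
    if minor == 3 and patch >= 1:
        return False
    return True


def findings_to_csv(jar_paths, out):
    """Write one CSV row per log4j-core JAR found in jar_paths."""
    writer = csv.writer(out)
    writer.writerow(["jar_path", "version", "vulnerable"])
    for path in jar_paths:
        version = parse_log4j_core_version(path)
        if version is None:
            continue  # not a log4j-core JAR
        version_text = ".".join(str(part) for part in version)
        writer.writerow([path, version_text, is_vulnerable_to_log4shell(version)])


buffer = io.StringIO()
findings_to_csv(
    ["/opt/app/lib/log4j-core-2.14.1.jar", "/opt/app/lib/log4j-api-2.14.1.jar"],
    buffer,
)
print(buffer.getvalue())
```

Relying on the filename alone misses renamed or repackaged JARs, which is one of the limitations that recursive JAR parsing is meant to address.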
The script worked well and produced more reliable results than our previous queries, which we could share with our customers. In some cases, it even found dependencies that weren't detected by the customers' other software management/visibility tools.
Future Plans
While our script produces satisfactory results, there's always room for improvement. Our plans for the moment include two additions:
Detect whether any mitigations are in place that prevent an exploit from working despite the presence of the vulnerability (e.g., a newer JRE version or a nondefault configuration)2
Pull all JAR files that we detected and recursively parse them to look for a Log4j dependency, so we don't rely only on the JAR existing directly in the filesystem
JAR files are essentially ZIP files, which can package other JAR dependencies inside them; by parsing the ZIP/JAR file tree, we can look for more Log4j dependencies
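Since this is a planned addition, the following Python sketch shows only one way the recursive parsing could look, using the standard zipfile module. The function name find_log4j_in_jar and the "!/" separator used to display nested paths are assumptions for the example.

```python
import io
import zipfile


def find_log4j_in_jar(jar_file, prefix=""):
    """Recursively list log4j-core JARs packaged inside a JAR.

    jar_file may be a path or a file-like object. Hypothetical helper:
    it matches on entry names only and reads nested JARs into memory.
    """
    hits = []
    with zipfile.ZipFile(jar_file) as archive:
        for name in archive.namelist():
            basename = name.rsplit("/", 1)[-1].lower()
            if not basename.endswith(".jar"):
                continue
            nested_path = prefix + name
            if basename.startswith("log4j-core"):
                hits.append(nested_path)
            # A nested JAR is itself a ZIP file, so descend into it.
            nested = io.BytesIO(archive.read(name))
            if zipfile.is_zipfile(nested):
                nested.seek(0)
                hits.extend(find_log4j_in_jar(nested, prefix=nested_path + "!/"))
    return hits


# Build a small nested archive in memory to exercise the function.
inner = io.BytesIO()
with zipfile.ZipFile(inner, "w") as archive:
    archive.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
outer = io.BytesIO()
with zipfile.ZipFile(outer, "w") as archive:
    archive.writestr("BOOT-INF/lib/log4j-core-2.14.1.jar", inner.getvalue())
outer.seek(0)
print(find_log4j_in_jar(outer))
```

Reading each nested JAR fully into memory keeps the sketch short; very large archives would call for size limits or extraction to temporary files.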