
The million dollar exploit

Or the tale of checking the source code

TLDR

There's nothing here. Please don't harass anyone over this. I just wanna show people they can go read the source code, if it's available. It's okay.

How I know

I checked the source code. I'll show you in a minute, but first, some background.

Background

progenitor.png (tweet)

This tweet, from a user who goes by a, claims there is a 7zip vulnerability. They dropped this pastebin, which has some code, some shellcode data, and headers for the 7zip file, but it is only about 93 lines of code and seems very incomplete.

proof-of-concept.png

There were a lot of comments on this code about how it didn't compile, or that if it did, it didn't work, etc.

a-clue.png (tweet)

Then another tweet, with more information, seemingly from an AI, as the author has mentioned before that they told an AI the general concept of it, and the AI gave them the source code:

Hi Idor! The issue lies in the RC_NORM macro in LzmaDec.c. This macro normalizes range and code values during decoding and increments the buf pointer (p->buf++) without verifying if it exceeds allocated memory or the bufLimit. The lack of bounds checking allows a custom forged LZMA stream to manipulate range and code which causes the buf pointer to overflow into adjacent memory. By designing the LZMA stream with very low frequency symbols, we can exploit this to overwrite critical memory regions like as return addresses or function pointers. To put it simply, this vulnerability arises from inadequate validation of the LZMA stream structure which enables malformed input to trigger the overflow and execute arbitrary code. Remember this is a PROOF OF CONCEPT

This mentions a file, and a macro, so we have some clues here, to figure out if there actually is a vulnerability. If there is, it would be in the LZMA decoder, right? If the original claim, in the original paste, has any truth to it:
This exploit targets a vulnerability in the LZMA decoder of the 7-Zip software.

So... The source is open. My body is ready. Let's do this...

bingusbuffering.gif

The code

Okay, so after looking in LzmaDec.c, the macro RC_NORM does not exist, but there IS a macro called NORMALIZE.

#define NORMALIZE if (range < kTopValue) { range <<= 8; code = (code << 8) | (*buf++); }

This macro does, in fact, not check the buffer length. Is this exploitable? Well, idk. Let's see where it's used.
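Before following the uses, here is a minimal sketch of what that macro does, rewritten as a function so it can be poked at in isolation. The struct RangeState and both function names are made up for illustration and are not in the 7zip source. The kTopValue definition is how I remember it from LzmaDec.c, so treat it as an assumption. The second function is a hypothetical bounds-checked variant, only there for contrast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors the constants in LzmaDec.c (1 << 24). */
#define kNumTopBits 24
#define kTopValue ((uint32_t)1 << kNumTopBits)

/* Hypothetical stand-in for the range-coder part of the decoder state. */
typedef struct {
    uint32_t range;
    uint32_t code;
    const uint8_t *buf; /* read cursor into the compressed input */
} RangeState;

/* Same shape as NORMALIZE: reads one byte and advances the cursor,
 * with no comparison against any limit. */
static void normalize_unchecked(RangeState *s)
{
    assert(s != NULL);
    if (s->range < kTopValue) {
        s->range <<= 8;
        s->code = (s->code << 8) | (*s->buf++);
    }
}

/* Hypothetical checked variant, NOT from the 7zip source.
 * Returns 0 when a byte is needed but the input is exhausted. */
static int normalize_checked(RangeState *s, const uint8_t *limit)
{
    assert(s != NULL);
    if (s->range < kTopValue) {
        if (s->buf >= limit)
            return 0;
        s->range <<= 8;
        s->code = (s->code << 8) | (*s->buf++);
    }
    return 1;
}
```

Note that the only thing either version does with the pointer is read one byte and step forward. Nothing is stored through it.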

It gets used here, in a do-while loop that repeatedly uses the macro without checking the buffer pointer, and here, once, at the end of the same function. If we look at this part of the source code, we can see that maybe, if we try really hard, we could get this buf var to go past the bufLimit, which is mentioned in that earlier paste. If we look at the code in this LZMA_DECODE_REAL function, we can find that all of the values we need come from the CLzmaDec, which is a decoder structure, and some macros also contribute to the values. We can even determine that the line that repeatedly uses the macro, the one that could maybe increment the buffer pointer past its length, would have to run on the last iteration of a while loop, since once you hit the while loop condition here, we fail the check and return from the function.
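To make that loop shape concrete, here is a toy model of it. This is not the real decoder: Rc, toy_normalize, toy_symbol and toy_decode are all hypothetical names, and the inner loop only keeps the parts that touch the pointer. The point it tries to show is that the limit is only compared between symbols, so any overshoot is capped at whatever one last symbol (plus the final normalize) can consume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define kTop ((uint32_t)1 << 24) /* assumption: same threshold as kTopValue */

/* Hypothetical, stripped-down range-coder state. */
typedef struct {
    uint32_t range;
    uint32_t code;
    const uint8_t *buf;
} Rc;

/* Unchecked read, same shape as the NORMALIZE macro. */
static void toy_normalize(Rc *rc)
{
    if (rc->range < kTop) {
        rc->range <<= 8;
        rc->code = (rc->code << 8) | (*rc->buf++);
    }
}

/* One "symbol": a do-while that normalizes once per direct bit.
 * The real loop also updates code and builds a distance value;
 * that is left out because it does not move the pointer. */
static void toy_symbol(Rc *rc, unsigned numDirectBits)
{
    assert(numDirectBits > 0);
    do {
        toy_normalize(rc);
        rc->range >>= 1;
    } while (--numDirectBits);
}

/* Outer loop: the pointer is compared to bufLimit only after a whole
 * symbol has been decoded, then one more normalize runs at the end.
 * Returns the number of symbols decoded. */
static size_t toy_decode(Rc *rc, const uint8_t *bufLimit, unsigned bitsPerSymbol)
{
    size_t symbols = 0;
    do {
        toy_symbol(rc, bitsPerSymbol);
        symbols++;
    } while (rc->buf < bufLimit);
    toy_normalize(rc);
    return symbols;
}
```

In this model the pointer can end up past bufLimit by at most bitsPerSymbol + 1 bytes, since each normalize reads at most one byte. It is a small, bounded over-read, not an arbitrary one.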

So, to start making a payload, we would have to figure out how to craft a malicious archive that causes these conditions, AND we would need some kind of shellcode to execute. That shellcode needs to be written to the address pointed to by the now over-incremented buffer pointer, and we would have to execute it somehow. So we need to either overwrite a return pointer (unlikely, since stack canaries are probably around, though I didn't look at a 7z binary to confirm), or maybe overwrite a vtable of some object that has a cleanup function that gets called before the function returns, or maybe a vtable on the heap whose cleanup function gets called later. I didn't see anything like that in this codebase, though.

The archive is only one part of a vuln like this, and we still have requirements after overflowing this buffer, so I made a decision. First I would see if we even have code that writes to this pointer, after this function. Without being able to write to it, we can't actually continue with the rest of the attack. So, I traced all of the calls in the source to LZMA_DECODE_REAL:
https://github.com/ip7z/7zip/blob/main/C/LzmaDec.c#L695

That's it. Just this call, inside LzmaDec_DecodeReal2. Nothing here. Okay, so what about before this, since our CLzmaDec structure lives upstream of this function?

Calls to LzmaDec_DecodeReal2:
https://github.com/ip7z/7zip/blob/main/C/LzmaDec.c#L1087
https://github.com/ip7z/7zip/blob/main/C/LzmaDec.c#L1161
These calls are both inside the same function, LzmaDec_DecodeToDic. It seems that the second time, the buffer limit passed in is actually p->buf itself, but again, the pointer is never written through. It is only read from. That would make sense, I suppose, unless LZMA decodes in place.
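The "only read from" part can be sketched like this. ToyDec and toy_read_one are hypothetical names. As far as I can tell from LzmaDec.h, the buf field in CLzmaDec is declared as a pointer to const bytes, but I'm treating that as an assumption you should check against the header yourself. If it holds, a store through that pointer would not even compile without a cast.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for CLzmaDec. */
typedef struct {
    const uint8_t *buf; /* assumption: const-qualified like the real field */
    uint32_t code;
} ToyDec;

/* Reads one input byte into code and advances the cursor. */
static void toy_read_one(ToyDec *p)
{
    assert(p->buf != 0);
    p->code = (p->code << 8) | (*p->buf++);
    /* A store such as
     *     *p->buf = 0x41;
     * is rejected by the compiler, because the pointee is const.
     * Moving the pointer is allowed; writing through it is not. */
}
```

So over-incrementing changes where the decoder reads from next, not what is in memory.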

Calls to LzmaDec_DecodeToDic:
https://github.com/ip7z/7zip/tree/main/C/7zDec.c#L189
https://github.com/ip7z/7zip/tree/main/C/7zDec.c#L269
https://github.com/ip7z/7zip/tree/main/C/7zDec.c#L1224
https://github.com/ip7z/7zip/tree/main/C/7zDec.c#L1357
https://github.com/ip7z/7zip/tree/main/CPP/7zip/Compress/LzmaDecoder.cpp#L160

One of these leads into the C++ code for 7zip, and the C code leads back to a main file. I still can't find anything that writes to that buffer in the CLzmaDec. It's possible it happens in the C++ code, since it's in an object, but looking around the CDecoder class, I can't find anything that takes the _state and uses that buf pointer for any writing. Searching for &_state in the C++ files doesn't show me much, and it all goes back to that C code anyway. There just isn't a way to utilize the buffer pointer being over-incremented. There are also a few checks in calls, like this one, that are calculated with that buffer pointer's value: SizeT processed = (SizeT)(p->buf - p->tempBuf);. I am certain one of these checks will fail: if you have over-incremented the buffer pointer, you will be over this constant, LZMA_REQUIRED_INPUT_MAX, and this particular check is going to fail. There are a few others I saw while scouring the code, and you probably saw them too, if you looked at the links.
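A rough sketch of the kind of check I mean, under some assumptions: TOY_REQUIRED_INPUT_MAX stands in for LZMA_REQUIRED_INPUT_MAX, and I'm using 20 because that is the value I remember from LzmaDec.h, so verify it. ToyResult and toy_check_consumed are hypothetical names, and the real code has more branches around this than I'm showing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for LZMA_REQUIRED_INPUT_MAX (20 as I recall). */
#define TOY_REQUIRED_INPUT_MAX 20

/* Hypothetical result codes, not the SDK's own. */
typedef enum { TOY_OK = 0, TOY_ERROR_FAIL = 1 } ToyResult;

/* After a decode step over the temp buffer, measure how far the read
 * cursor moved and reject anything beyond what one symbol may need. */
static ToyResult toy_check_consumed(const uint8_t *tempBuf,
                                    const uint8_t *bufAfter)
{
    assert(bufAfter >= tempBuf);
    size_t processed = (size_t)(bufAfter - tempBuf);
    if (processed > TOY_REQUIRED_INPUT_MAX)
        return TOY_ERROR_FAIL;
    return TOY_OK;
}
```

With a check like this in the path, an over-incremented cursor turns into a decode failure instead of something you can build on.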

I don't see it. Even if you could, theoretically, use NORMALIZE to over increment that buffer pointer, it seems kinda useless. I saw a few people checked a few things out, but I figured if there is something wrong, we can probably find it in the source code. Especially given the clues the poster gave.

It's extremely hard to prove something doesn't exist, but I think we can put this one to rest. I didn't spend a whole lot of time looking, but what I could find from the clues we were given was pretty succinct. You can also find the main function, which I assume is the main function for the 7zip executable that ships, and trace things down, but I don't see any real opportunity to do much more than maybe get 7zip to crash or fail to decompress the data properly. You could maybe read some data outside the buffer, but leaking it seems unlikely, as the macro does some math on the data being read from the pointer that would be rough to reverse, if not impossible. So the read would have to happen somewhere else too, and that read data would somehow need to be captured. So using something like this to leak pointers or something is also probably not happening.

Anyways, thank you for reading my writeup on a "not 0 day exploit".

Last modified: 02 January 2025