2020-07-08

(Ab)using Microsoft's symbol servers, for fun and profit

Since I find myself doing this on regular basis (Hail Ghidra!), and can never quite remember the commands.

Say you have a little Microsoft executable, such as the latest ARM64 version of usbxhci.sys, that you want to investigate using Ghidra.

Of course, one thing that can make the whole difference between hours of "Where the heck is the function call for the code I am after?" and a few of seconds of "Judging by its name, this call is the most likely candidate" is the availability of the .pdb debug symbols for the Windows executable you are analysing.

You may also know that, because of the huge corporate ecosystem they have where such information might be critical (as well as some government pressure to make it public), it so happens that Microsoft does make available a lot of the debug information that was generated during the compilation of Windows components. Now, since it can amount to a large volume of data (one can usually expect a .pdb to be 3 to 5 times larger than the resulting code) this debug information is not usually provided with Windows, unless you are running a Debug/Checked build.

But it can "easily" be retrieved from Microsoft's servers. Here's how.

First of all, you need to ensure that you have the Windows SDK or Windows Driver Kit installed. If you have Visual Studio 2019 (remember, the Community Edition of VS2019 is free) with the C++ development environment, these should already have been installed for you. But really it's up to you to sort that out and alter the paths below as needed.

With this prerequisite taken care of, you should find a commandline executable called symchk.exe somewhere in C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\. This is the utility that can connect to the Microsoft's servers to fetch the symbol files, i.e. the compilation .pdb's that Microsoft has made public.

So, let's say we have copied our ARM64 xHCI driver (USBXHCI.SYS - Why Microsoft suddenly decided to YELL ITS NAME is unknown) to some directory. All you need to do to retrieve its associated .pdb then is issue the command:

"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\symchk.exe" /s srv*https://msdl.microsoft.com/download/symbols /ocx .\ USBXHCI.SYS

The /s flag indicates where the symbols should be retrieved from (here the Microsoft's remote server) and the /ocx flag, followed by a folder, indicates where the .pdb should be copied (here, the same directory as the one where we have our driver).

If everything goes well, the output of the command should be:

SYMCHK: FAILED files = 0
SYMCHK: PASSED + IGNORED files = 1

with the important part being that the number of PASSED files is not zero, and you should find a newly created usbxhci.pdb in your directory. Neat!

"Hello, my name is Mr Snrub"


So, what do you do with that?

Well, I did mention Ghidra, and as a comprehensive disassembly/decompiler utility, Ghidra does of course have the ability to work with debug symbols if they happen to be available (sadly, it doesn't seem to have the ability to look them up automatically like IDA, or if it does, I haven't found where this can be configured), which helps turn an obtuse FUN_1c003ac90() function name, into a much more indicative XilRegister_ReadUlong64()...

For instance, let's say you happen to have been made aware that the reason why you currently can't use the rear USB-A ports for Windows 10 on the Raspberry Pi 4 is because Broadcom/VIA (most likely Broadcom, because they've already done everyone a number with implementing a DMA controller that chokes past 3 GB on the Bcm2711) have screwed up 64-bit PCIe accesses, and they end up returning garbage in the high 32-bit DWORD unless you only ever attempt to read 64-bit QWORDs as two sequential DWORDs instead of a single QWORD.

As a result of this, you may be exceedingly interested to find out if there exists something in the function calls used by Microsoft's usbxhci.sys driver, that can set 64-bit xHCI register accesses to be enacted as two 32-bit ones.

Obviously then, if, after using the .pdb we've just retrieved above, Ghidra helpfully tells you that there does exist a function call at address 1c003ac90 called XilRegister_ReadUlong64, you are going to be exceedingly interested in having a look at that call:

undefined8 XilRegister_ReadUlong64(longlong param_1,undefined8 *param_2)
{
  undefined8 local_30 [6];
  
  local_30[0] = 0;
  if (*(char *)(*(longlong *)(param_1 + 8) + 0x219) == '\0') {
    DataSynchronizationBarrier(3,3);
    if ((*(ulonglong *)(*(longlong *)(param_1 + 8) + 0x150) & 1) == 0) {
      // 64-bit qword access
      local_30[0] = *param_2;
    } else {
      DataSynchronizationBarrier(3,3);
      // 2x32-bit dword access
      local_30[0] = CONCAT44(*(undefined4 *)((longlong)param_2 + 4),*(undefined4 *)param_2);
    }
  } else {
    Register_ReadSecureMmio(param_1,param_2,3,1,local_30);
  }
  return local_30[0];
}

NB: The comments were not added by Ghidra. Ghidra may be good at what it does, but it's not that good...

Guess what? It so happens that there exists an attribute somewhere, that Microsoft uses the bit 0 of, to decide whether 64-bit xHCI registers should be read using two 32-bit access. Awesome, this looks exactly like what we're after.

The corresponding disassembly also tells us that this if condition is ultimately encoded as a tbnz ARM64 instruction. So if we revert that logic, by using a tbz instead of tbnz, we should be able to force the failing 64-bit reads to be enacted as 2x32-bit, which may fix our xHCI driver woes...

Let's do just that then, by editing USBXHCI.SYS and changing the EA 00 00 37 sequence at address 0x03a0d0 to EA 00 00 36 (tbnztbz) and, for good measure, do the same for XilRegister_WriteUlong64 at address 0x005b34, by also changing 0A 01 00 37 into 0A 01 00 36 to reverse the logic. "Yes that'll do".

"I like the way Snrub thinks!"


Well, we may have patched our driver and tried to fool the system by reversing some stuff, but, as the Simpsons have long attested, it's not going to do much unless you have a trusted sidekick to help you out.

Obviously, since we broke the signature of that driver the minute we changed even a single bit, we're going to have to tell Windows to disable signature enforcement for the boot on our target ARM64 platform, which can be done by setting nointegritychecks on in the BCD. And while we're at it we may want to enable test signing as well. Now, most of the commands you'll see are for the local BCD, but that's not what we are after here, since we want to modify a USB installed version of Windows, where, in our case, the BCD is located at S:\EFI\Microsoft\Boot\BCD. So the trick to achieving that (from a command prompt running elevated) is:

bcdedit /store S:\EFI\Microsoft\Boot\BCD /set {default} testsigning on
bcdedit /store S:\EFI\Microsoft\Boot\BCD /set {default} nointegritychecks on

However, if you only do that and (after taking ownership and granting yourself full permissions so that you can replace the existing driver) copy the altered USBXHCI.SYS to Windows\System32\drivers\ you will still be greeted by an obnoxious

Recovery

Your PC/Device needs to be repaired

The operating system couldn't be loaded because a critical system driver is missing or contains errors.

File: \Windows\System32\drivers\USBXHCI.SYS
Error code: 0xc0000221

Oh noes!


The problem, which is what generates the 0xc0000221 (STATUS_IMAGE_CHECKSUM_MISMATCH) error code, is that the optional PE checksum field, used by Windows to validate critical boot executables, has not been updated after we altered USBXHCI.SYS. Therefore checksum validation fails, and this is precisely what the Windows boot process is complaining about.

Fixing this is very simple: Just download PEChecksum64.exe (e.g. from here) and issue the command:

D:\Dis\>PEChecksum64.exe USBXHCI.SYS
USBXHCI.SYS: Checksum updated from 0x0008D39B to 0x0008D19B

For good measure, you will also need to self-sign that driver, so that you can avoid Windows booting into recovery mode with an obnoxious 0xc000000f from winload.exe (though you can still proceed to full boot from there).

Now we finally have all the pieces we need.

For instance, we can replace USBXHCI.SYS on a fast USB 3.0 flash drive containing an Raspberry Pi Windows 10 ARM64 installation created using WOR (and if you happen to have the latest EEPROM flashed as well as a version of the Raspberry Pi 4 UEFI firmware that includes this patch, you can actually boot the whole thing straight from USB), and, while we are at it, remove the 1 GB RAM limit that the Pi 4 had to have when booting from USB-C port (since we're not going to use that USB controller), by issuing, from an elevated prompt:

bcdedit /store Y:\EFI\Microsoft\Boot\BCD /deletevalue {default} truncatememory

Do all of the above and, who knows, you might actually end up with a usable Windows 10 ARM64 OS, running from one of the rear panel's fast USB 3.0 ports with a whooping 3 GB of RAM, on your Raspberry Pi 4.

Now, isn't that something?

But this is just a post about using Microsoft's symbol servers.
It's not a post about running full blown Windows 10 on the Raspberry Pi 4, right?

Addendum: In case you don't want to have to go through the taking of ownership, patching, updating of the PE checksum and digitally re-signing of the file yourself, may I also interest you in winpatch?

1 comment:

  1. Nice post!
    Is there way to debug Windows ARM64 kernel on Pi4? Local kernel debugging is working, but you can't set breakpoints, change memory, etc.

    ReplyDelete