So yesterday was my birthday, and after enjoying some good company and watching a movie, I thought about doing something special. Guess what? .. fixing a weird bug I’d had for a long time. In my PE scanning engine there was a weird heap corruption bug. After spending a while looking for the cause, I found that some files have very long API names .. much longer than I expected.
I did a quick search online to see what the maximum number of characters in an API name could be. I didn’t find a technically valid answer; there is no documentation for that. So I asked Peter Ferrie, and he told me that it is just a couple of bytes under 64KiB! .. I really didn’t expect such a big number. He said that the limit is set by the RtlInitString API. That was extremely helpful!
So I started from there, and checked RtlInitString in ntdll.dll. It takes a string, computes its length and fills a STRING structure.
Length is the number of characters in the API name. MaximumLength always equals Length+1, which is the size of the buffer in bytes, since the string is terminated with a zero byte. Buffer is the address of the string.
This is RtlInitString:
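In C terms, the behaviour described above looks roughly like the sketch below. This is my own approximation, not the actual ntdll.dll code: the names SKETCH_STRING and sketch_init_string are made up for illustration, and the clamping at 0xFFFE is an assumption chosen to match the limit Peter mentioned.

```c
#include <stddef.h>
#include <string.h>

/* Layout of the counted-string structure described above. */
typedef struct {
    unsigned short Length;        /* characters in the name, without the NUL */
    unsigned short MaximumLength; /* bytes in the buffer, including the NUL  */
    const char    *Buffer;        /* address of the string                   */
} SKETCH_STRING;

/* Hypothetical stand-in for RtlInitString, not the real ntdll code.
   Assumption: anything longer than 0xFFFE is clamped so that
   MaximumLength (Length + 1) still fits in 16 bits. */
static void sketch_init_string(SKETCH_STRING *dst, const char *src)
{
    size_t len = 0;

    if (src != NULL) {
        len = strlen(src);
        if (len > 0xFFFE)
            len = 0xFFFE;
    }
    dst->Length        = (unsigned short)len;
    dst->MaximumLength = (unsigned short)(src != NULL ? len + 1 : 0);
    dst->Buffer        = src;
}
```

Since both counters are 16-bit and MaximumLength has to hold Length+1, the largest Length that still fits is 0xFFFF - 1 = 0xFFFE.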
So the number of bytes an API name can have is exactly 0xFFFE, or 65,534 bytes. The other question on my mind was: are there any characters that are NOT allowed?
I checked the pefile source code and found that it only accepts characters in the range a-z, A-Z and _?@$() as you can see here, simply because compilers use them when they “mangle” or decorate the API names. But malware doesn’t adhere to any documentation.
But wait, why not check ntdll.dll itself and see how Windows does it! .. So I followed GetProcAddress, function by function, until I reached LdrGetProcedureAddressEx in ntdll.dll, which looks the name up in the export table of a PE file. What I found was interesting .. it doesn’t check or exclude any characters.
So I tried to play with the names. I wrote a small DLL, compiled it, and then manually changed the API name in the export table to include some weird characters, even non-printable ones. I wrote another small test executable that imports my DLL and manually changed the imported API name to match the one in the DLL .. the surprise is .. it works!
When I uploaded the file to VirusTotal, which by the way uses pefile in the backend to do the file structure analysis, it didn’t show my weird API name.
You can see the file scanning report here. This file uses the function bar() as 5F 62 61 72 2F 02 .. which is “_bar/” plus hex 0x02. It doesn’t show on VirusTotal, yet it is valid on Windows.
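To make the mismatch concrete, here is a small C sketch of an allowlist check using exactly the character set I quoted above (a-z, A-Z and _?@$()). It is a hypothetical helper of my own, not pefile’s code, and the real pefile filter may differ in detail. The name from my test DLL fails it because of the “/” and the 0x02 byte.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if every byte of the name is in the allowlist quoted above
   (a-z, A-Z and _?@$()), 0 otherwise. Hypothetical helper, not pefile code. */
static int name_passes_allowlist(const unsigned char *name, size_t len)
{
    static const char extra[] = "_?@$()";
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = name[i];
        int is_letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');

        /* strchr() would also match the terminator, so reject 0 explicitly. */
        if (!is_letter && (c == 0 || strchr(extra, c) == NULL))
            return 0;
    }
    return 1;
}

/* The exported name from my test DLL: "_bar/" followed by 0x02. */
static const unsigned char weird_name[] = { 0x5F, 0x62, 0x61, 0x72, 0x2F, 0x02 };
```

Calling name_passes_allowlist(weird_name, sizeof weird_name) returns 0, while a plain “_bar” passes, which is consistent with the name being hidden in the report even though Windows resolves it.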
And finally I fixed it in my code, the bug is gone, and my scanner lived happily ever after 🙂
Here is a snippet of my code:
So in conclusion:
- An API name can have up to 65,534 characters.
- An API name can contain any characters, terminated by NULL.
- VirusTotal will not see your weird imported API until Ero Carrera fixes his pefile.
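The first two points suggest how a scanner can read names defensively. The following is a minimal C sketch under those assumptions, not the code from my scanner: read_api_name_len and MAX_API_NAME_LEN are hypothetical names. It measures a name inside a file image, refuses to run past the end of the file or past the 0xFFFE limit, and rejects no byte value other than the terminator.

```c
#include <stddef.h>
#include <string.h>

#define MAX_API_NAME_LEN 0xFFFEu  /* limit discussed above */

/* Hypothetical helper: measures the name starting at `offset` inside a
   file image of `size` bytes. Returns 1 and stores the length if a NUL
   terminator is found within both the file and the 0xFFFE limit;
   returns 0 otherwise. No byte value other than NUL is rejected. */
static int read_api_name_len(const unsigned char *image, size_t size,
                             size_t offset, size_t *out_len)
{
    size_t avail, limit;
    const unsigned char *end;

    if (image == NULL || out_len == NULL || offset >= size)
        return 0;

    avail = size - offset;
    /* Look one byte past the maximum length so a terminator that
       follows a full-length name is still found. */
    limit = avail < MAX_API_NAME_LEN + 1u ? avail : MAX_API_NAME_LEN + 1u;

    end = memchr(image + offset, 0, limit);
    if (end == NULL)
        return 0;   /* unterminated or too long */

    *out_len = (size_t)(end - (image + offset));
    return 1;
}
```

With the length known up front, the caller can allocate length+1 bytes for the copy instead of trusting a fixed-size buffer, which is the kind of assumption that leads to heap corruption when a name turns out to be longer than expected.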