If you’ve never programmed in C/C++ or any of those languages that require manual memory management or pointers chances are you won’t know the absolute horror that is finding these sorts of bugs. They are evil incarnate. One seemingly harmless pointer is left dangling after the object it pointed to has been deleted. But the pointer isn’t null. Oh no, it’s still pointing out into the void, ready to write bogus data into whatever’s there. Your application will crash, intermittently and sometimes only in production code. Bonus points if it doesn’t happen if the debugger is attached.
These sorts of bugs are extremely hard to find, especially if the codebase is large/old. Recently I came across one of these, which had been causing havoc for quite some time, and set upon the task of fixing it.
There’s a few ways to go about this, you could swap out the memory manager for something of your own design and hope you can find it. You can scale down the code while having the bug still reproduceable in the hope of finding the bug by noticing when it went away and deducing that it must have been in that code you just deleted.
Or try to understand it it more detail. A duo of tools have always done it for me: The memory view in Visual studio and the application verifier.
The app verifier is a magic piece of software that somehow attaches into the executable when running and will cause breakpoints if you mangle memory. I don’t know how it does it, but it does. Download it here. It’s easy enough to use, you add the executable in application verifier, check that it should verify application memory handling and then run it, either normally or through a debugger. Application verifier should stop the execution dead in it tracks much closer to the actual cause than what you get if you just run it.
As an example, in my latest case without the verifier the app crashed deep within Win32 when opening a window. With the verifier, it crashed in our own code – and much more reliably. Once you get it to this state, open the disassembly. Yes, the disassembly. You’ll have to go back one or two steps to get into your own code instead of the verifier inserted code. There, try to see what memory access has offended the verifier. You probably have something like a DWORD ptr that is being dereferenced. Check that memory address in the memory view.
The memory view, besides its best feature of looking like the matrix if you use green on black fonts is quite useful. You’re to identify if you are: writing to uninitialized memory, double-freeing a pointer or overrunning a buffer. Sometimes there are sentinel bytes, especially in the memory managers debug mode, which famously in win32 maps both 0xDEADBEEF and 0xBAADF00D to various states of the debug heap. Sometimes you can see strings which gives you an indication as to where in the code you are.
Either way, you should be much closer to the problem now. In my case, it was an unhandled realloc return which in previous memory management implementations never moved the buffer, but now it did.
Application verifier probably won’t give you the exact line of offending code, but it’ll get you a whole lot closer!