Aug 27 2014
 
What is a Symbol?

When you compile your executable, the compiler generates debugging symbol information for every file it compiles, and the linker then assembles all of this information into one file called the PDB (Program DataBase) file. Every variable and function in your application code can be called a symbol, which means there will be both private and public symbols.

The full local path of the generated .pdb file is embedded into the executable by the linker. This comes in handy while you are debugging the application on your development machine, because the debugger can pick up the PDB file from that path. The PDB then helps the debugger figure out line numbers, file names, call stacks, and local and global variables.
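If you are curious what that embedded reference looks like, here is a rough Python sketch. It assumes the executable carries a CodeView "RSDS" record (signature, 16-byte GUID, 4-byte age, then the null-terminated PDB path), which is what current Microsoft toolchains write. Scanning the raw bytes for the signature is a shortcut I am taking for brevity; a robust tool would walk the PE debug directory instead. The function name and the sample path are made up for illustration.

```python
import struct
import uuid


def find_pdb_reference(image_bytes):
    """Return (guid, age, pdb_path) from the first RSDS record found, or None."""
    start = image_bytes.find(b"RSDS")
    if start == -1:
        return None
    header_end = start + 24  # 4-byte signature + 16-byte GUID + 4-byte age
    if header_end > len(image_bytes):
        return None
    guid = uuid.UUID(bytes_le=image_bytes[start + 4:start + 20])
    (age,) = struct.unpack_from("<I", image_bytes, start + 20)
    # The PDB path follows the header and is terminated by a zero byte.
    path_end = image_bytes.find(b"\x00", header_end)
    if path_end == -1:
        path_end = len(image_bytes)
    pdb_path = image_bytes[header_end:path_end].decode("utf-8", errors="replace")
    return guid, age, pdb_path


# Synthetic record for demonstration only (hypothetical GUID and path).
sample_guid = uuid.UUID("12345678-1234-5678-1234-567812345678")
sample = (b"\x00" * 8 + b"RSDS" + sample_guid.bytes_le
          + struct.pack("<I", 3) + b"C:\\build\\test.pdb\x00" + b"\x00" * 4)
print(find_pdb_reference(sample))
```

To try it on a real binary, read the file with open(path, "rb").read() and pass the bytes in. The GUID and age it returns are what the debugger compares against the PDB before loading it.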

Why do we need Symbols?

A symbol file (.pdb file) contains information that is not needed to run your application but comes in handy when you are hunting for bugs. Without symbol files, figuring out bugs or exact call stacks is a pain on Windows. You might then ask why this information is not embedded into the executable. The answer is that symbols are not always needed, so they are put into a separate .pdb file: you can debug when needed, and you can also choose who sees which symbols, which makes it harder for people to reverse engineer your code.

What does a PDB file contain?

PDB files can contain a variety of information, for example:

  • Source code information: Line numbers, file names
  • Variables: Global and Local variables mapped to their addresses
  • Function names mapped to their addresses
  • FPO (frame pointer omission) information, needed to reconstruct a correct call stack
  • etc

The Windows debugger installation includes utilities for working with PDB files, namely symchk, agestore, symstore, pdbcopy, etc.

Public and Private Symbols

When the linker generates a PDB file, it contains both private and public debugging symbol information. Of course, you can configure what it generates in the linker property pages.

Private symbol data contains the following (mostly):

  • Global and Local Variables.
  • Functions
  • All user defined types.
  • Line number and source file information.

The public symbol table contains the following:

  • Functions (just the address)
  • Global variables that are visible across obj files.

As you might have inferred, private symbol files are bigger than public symbol files. Also, since a private symbol file contains the public symbol information as well, we can generate a separate public symbol file from it. The tool for this is pdbcopy.exe, which comes with the Windows debugger installation.
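As a small sketch of that step: pdbcopy takes the input PDB, the output PDB, and the -p option to remove private symbol data. The helper below only builds the argument list so you can inspect it before running anything. The function name and both file paths are hypothetical.

```python
def pdbcopy_strip_command(private_pdb, public_pdb, pdbcopy_exe="pdbcopy.exe"):
    """Build the argument list that asks pdbcopy to drop private symbol data."""
    return [pdbcopy_exe, private_pdb, public_pdb, "-p"]


# Hypothetical input and output paths.
cmd = pdbcopy_strip_command(r"C:\build\test.pdb", r"C:\public\test.pdb")
print(" ".join(cmd))
```

On a machine with the debugging tools installed, the list can be handed to subprocess.run(cmd, check=True).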

Symbol Path

So how do we tell the debugger where to look for symbols? One of my favorites is the environment variable _NT_SYMBOL_PATH. This variable gives us the flexibility to specify cache directories for downloaded symbols; we can even specify a separate cache directory per symbol server.

The following value for _NT_SYMBOL_PATH downloads symbols from the listed servers and puts them into the C:\Symbols folder.

cache*c:\Symbols;SRV*http://symbolserver;srv*http://anotherserver;srv*http://onemoreserver

The following value for _NT_SYMBOL_PATH downloads symbols from http://symbolserver into the C:\Symbols folder, and symbols from http://anotherserver into c:\anotherserver_cache_folder.

cache*c:\Symbols;SRV*http://symbolserver;srv*c:\anotherserver_cache_folder*http://anotherserver
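To make the structure of these values easier to see, here is a minimal Python sketch that splits a symbol path into its elements. It only understands the forms used above (cache*dir, srv*server, srv*cache*server, and plain directories). The real syntax has more variants, so treat this as a reading aid rather than a faithful reimplementation of the debugger's parser. The function and dictionary keys are my own naming.

```python
def parse_symbol_path(value):
    """Split a symbol path string into a list of element dictionaries."""
    elements = []
    for raw in value.split(";"):
        raw = raw.strip()
        if not raw:
            continue
        parts = raw.split("*")
        keyword = parts[0].lower()
        if keyword == "cache":
            # cache*<dir>: downloads from elements to the right land in <dir>
            elements.append({"kind": "cache", "dir": parts[1] if len(parts) > 1 else ""})
        elif keyword == "srv":
            # srv*<server> or srv*<local cache>*<server>
            stores = parts[1:]
            elements.append({
                "kind": "srv",
                "cache": stores[0] if len(stores) > 1 else None,
                "server": stores[-1] if stores else "",
            })
        else:
            # Anything else is treated as a plain symbol directory.
            elements.append({"kind": "dir", "dir": raw})
    return elements


path = r"cache*c:\Symbols;SRV*http://symbolserver;srv*c:\anotherserver_cache_folder*http://anotherserver"
for element in parse_symbol_path(path):
    print(element)
```

Reading the output, the second server carries its own cache folder while the first one falls back to the folder named by the cache element.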

The Windows debugger provides commands to control the symbol path: .sympath and .symfix. I use .symfix to quickly set up a default symbol path; symbols will then be downloaded to a sym folder under the debugger folder. .sympath is handy too: if you want to quickly add a symbol path to the debugger, just do the following…

.sympath+ C:\AnotherSymbolFolder
.reload

Controlling Symbol Loading in Windows Debugger

The debugger provides a command called .symopt. If we run the command without any arguments, it shows our current symbol loading settings, for example:

Output from .symopt

So we see that in this case line number loading is enabled, and since SYMOPT_PUBLICS_ONLY is not set, private symbols are loaded as well. SYMOPT_AUTO_PUBLICS tells the debugger to look in the public symbol table only as a last resort.

More information on symbols loading options can be found here: http://msdn.microsoft.com/en-us/library/windows/hardware/ff558827(v=vs.85).aspx

In addition, to see a list of modules for which symbol loading failed, use the command ‘lme’. To get verbose output of the symbol loading process in the debugger, use “!sym noisy”; to turn it off, use “!sym quiet”.

Conclusion

Always keep your symbols handy. You never know when you might need them.

Apr 27 2013
 

File name and line number information is stored in private symbols (the .pdb file). So if private symbols are available, the debugger will try to work out the line number information. Note: public symbols don’t have line number information.

A question I’ve heard from people new to windbg is how to turn off the line number display. The easiest way, and what I normally do, is the ‘.lines’ command. This is a toggle command: the next time you execute .lines, it turns line number information back ‘on’.

Another option is to use .symopt command:
http://msdn.microsoft.com/en-in/library/windows/hardware/ff558827(v=vs.85).aspx

The symbol option of interest to us is SYMOPT_LOAD_LINES. The following is the MSDN description of this option.

This symbol option allows line number information to be read from source files. This option must be on for source debugging to work correctly.

In KD and CDB, this option is off by default; in WinDbg, this option is on by default. In CDB and KD, the -lines command-line option will turn this option on. Once the debugger is running, it can be turned on or off by using .symopt+0x10 or .symopt-0x10, respectively. It can also be toggled on and off by using the .lines (Toggle Source Line Support) command.

This option is on by default in DBH. Once DBH is running, it can be turned on or off by using symopt +10 or symopt -10, respectively.
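Since SYMOPT_LOAD_LINES is just bit 0x10 in the symbol options value, the toggle behaviour of .lines can be pictured as an XOR on that bit. The snippet below is only a sketch of the arithmetic, not debugger code, and the starting mask is a hypothetical value with line loading switched on.

```python
SYMOPT_LOAD_LINES = 0x10


def toggle_lines(options):
    """Flip the line-number bit, which is the effect .lines has."""
    return options ^ SYMOPT_LOAD_LINES


def lines_enabled(options):
    """Report whether line number loading is on in this mask."""
    return bool(options & SYMOPT_LOAD_LINES)


options = 0x30337  # hypothetical starting mask with line loading on
options = toggle_lines(options)
print(hex(options), lines_enabled(options))
options = toggle_lines(options)
print(hex(options), lines_enabled(options))
```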

Mar 06 2012
 

Why should we force symbol loading in WinDbg?

Sometimes we have a dump for which the debugger refuses to load the .pdb files even though they are present in the dump folder. The load failure is not always caused by a code change; a plain rebuild of the same source code is enough to cause a mismatch. In such cases, if you force load the .pdb file you should get a call stack that makes sense, but you need to know the APIs and libraries involved well enough to judge whether the stack really does make sense. So until you get the proper .pdb file, you can force load a .pdb file and work on the dump.
——————————————————-
Remember: Always insist on correct pdb file.
——————————————————-
The command that controls this feature is ‘.symopt’. Run without arguments, it lists the current symbol loading options. On my machine this is what I get…

0:000> .symopt
Symbol options are 0x30337:
0x00000001 – SYMOPT_CASE_INSENSITIVE
0x00000002 – SYMOPT_UNDNAME
0x00000004 – SYMOPT_DEFERRED_LOADS
0x00000010 – SYMOPT_LOAD_LINES
0x00000020 – SYMOPT_OMAP_FIND_NEAREST
0x00000100 – SYMOPT_NO_UNQUALIFIED_LOADS
0x00000200 – SYMOPT_FAIL_CRITICAL_ERRORS
0x00010000 – SYMOPT_AUTO_PUBLICS
0x00020000 – SYMOPT_NO_IMAGE_SEARCH

These flags determine how and what symbols will be loaded. These options also determine whether line number information should be loaded or not.
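The options value is simply the sum of the individual flag bits, so it can be decoded by hand or with a few lines of code. Here is a small Python sketch that does so. The table contains only the flags mentioned in this post, not the full set the debugger knows about, and the function name is my own.

```python
SYMOPT_FLAGS = {
    0x00000001: "SYMOPT_CASE_INSENSITIVE",
    0x00000002: "SYMOPT_UNDNAME",
    0x00000004: "SYMOPT_DEFERRED_LOADS",
    0x00000010: "SYMOPT_LOAD_LINES",
    0x00000020: "SYMOPT_OMAP_FIND_NEAREST",
    0x00000040: "SYMOPT_LOAD_ANYTHING",
    0x00000100: "SYMOPT_NO_UNQUALIFIED_LOADS",
    0x00000200: "SYMOPT_FAIL_CRITICAL_ERRORS",
    0x00010000: "SYMOPT_AUTO_PUBLICS",
    0x00020000: "SYMOPT_NO_IMAGE_SEARCH",
}


def decode_symopt(mask):
    """Return (names of known flags set in mask, leftover bits not in the table)."""
    names = []
    leftover = mask
    for bit in sorted(SYMOPT_FLAGS):
        if mask & bit:
            names.append(SYMOPT_FLAGS[bit])
            leftover &= ~bit
    return names, leftover


for value in (0x30337, 0x30377):
    names, leftover = decode_symopt(value)
    print(hex(value), names, hex(leftover))
```

Decoding 0x30337 gives the nine flags in the listing above, while 0x30377 additionally contains SYMOPT_LOAD_ANYTHING (0x40).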

How to force load debugging symbols

So in our debugging scenario, if we want to load symbols loosely, i.e. without a strict match between the .pdb and the .exe, we have to enable the following option…

0x00000040 – SYMOPT_LOAD_ANYTHING

In windbg we do this via…

0:000> .symopt+ 0x40
Symbol options are 0x30377:
0x00000001 – SYMOPT_CASE_INSENSITIVE
0x00000002 – SYMOPT_UNDNAME
0x00000004 – SYMOPT_DEFERRED_LOADS
0x00000010 – SYMOPT_LOAD_LINES
0x00000020 – SYMOPT_OMAP_FIND_NEAREST
0x00000040 – SYMOPT_LOAD_ANYTHING <-- Prevents validation of .pdb file
0x00000100 – SYMOPT_NO_UNQUALIFIED_LOADS
0x00000200 – SYMOPT_FAIL_CRITICAL_ERRORS
0x00010000 – SYMOPT_AUTO_PUBLICS
0x00020000 – SYMOPT_NO_IMAGE_SEARCH

To re-enable strict mapping between .exe and .pdb use

0:000> .symopt- 0x40
Symbol options are 0x30337:
0x00000001 – SYMOPT_CASE_INSENSITIVE
0x00000002 – SYMOPT_UNDNAME
0x00000004 – SYMOPT_DEFERRED_LOADS
0x00000010 – SYMOPT_LOAD_LINES
0x00000020 – SYMOPT_OMAP_FIND_NEAREST
0x00000100 – SYMOPT_NO_UNQUALIFIED_LOADS
0x00000200 – SYMOPT_FAIL_CRITICAL_ERRORS
0x00010000 – SYMOPT_AUTO_PUBLICS
0x00020000 – SYMOPT_NO_IMAGE_SEARCH

Note the +/- in the above commands: ‘+’ enables, ‘-’ disables.
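In terms of plain bit arithmetic, ‘+’ is an OR with the flag and ‘-’ is an AND with its complement. The Python sketch below shows this for SYMOPT_LOAD_ANYTHING. The starting value 0x30337 is the sum of the nine flags listed without 0x40, and the helper names are mine, not debugger APIs.

```python
SYMOPT_LOAD_ANYTHING = 0x40


def symopt_plus(options, flags):
    """Model of .symopt+ : set the given flag bits."""
    return options | flags


def symopt_minus(options, flags):
    """Model of .symopt- : clear the given flag bits."""
    return options & ~flags


strict = 0x30337  # the nine listed flags, without SYMOPT_LOAD_ANYTHING
loose = symopt_plus(strict, SYMOPT_LOAD_ANYTHING)
print(hex(loose))
print(hex(symopt_minus(loose, SYMOPT_LOAD_ANYTHING)))
```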

Alternative way

Another, and maybe better, way is to do this only as and when necessary, via the .reload command. If you see a PDB file failing to load because of a mismatch, you can use .reload and ask the debugger to load the symbols even though they mismatch. This is how we do it.

The following example shows how to load a mismatched PDB/symbol file for test.exe:

.reload /f /i test.exe

The /i in the above command tells the debugger to ignore any symbol mismatch and just load the PDB/symbol file.
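To picture what is being skipped here, the following Python sketch models the match check in a deliberately simplified way: the executable and the PDB each carry a GUID and an age, and the PDB is normally loaded only when both agree. This is my own toy model, not how DbgHelp is actually implemented, and the identifiers and GUID values are made up. It does show why a plain rebuild, which normally stamps a fresh GUID, is enough to break loading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebugId:
    guid: str
    age: int


def should_load(image_id, pdb_id, ignore_mismatch=False):
    """Simplified rule: load when the ids agree, or when told to ignore mismatches."""
    return ignore_mismatch or image_id == pdb_id


# Hypothetical ids: the PDB comes from a rebuild of the same source.
exe_id = DebugId("11111111-2222-3333-4444-555555555555", 1)
rebuilt_pdb_id = DebugId("aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", 1)

print(should_load(exe_id, rebuilt_pdb_id))                        # strict check
print(should_load(exe_id, rebuilt_pdb_id, ignore_mismatch=True))  # like /i or 0x40
```

The ignore_mismatch switch stands in for both .reload /i and SYMOPT_LOAD_ANYTHING. In either case the debugger trusts you that the PDB is close enough, which is why the resulting stack still needs a sanity check.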