Aug 27, 2014
 
What is a Symbol?

When you build an executable, the compiler generates debugging symbol information for every source file it compiles, and the linker then combines all of that information into a single file called the PDB (Program Database) file. Every variable and function in your application code is a symbol, and symbols are either private or public.

The linker embeds the full local path of the generated .pdb file in the executable. This comes in handy when you debug the application on your development machine, because the debugger can pick up the PDB file from that embedded path. With the PDB loaded, the debugger can work out line numbers, file names, call stacks, and local and global variables.
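To make the embedded path concrete, here is a minimal Python sketch that looks for the CodeView "RSDS" record in an executable's bytes and pulls out the PDB path stored there. The record also carries a GUID and an age value, which the debugger uses to check that a PDB matches the executable. The function name and the sample bytes are made up for illustration, and the plain byte search is a shortcut; a robust tool would walk the PE debug directory instead.

```python
import struct
import uuid


def find_pdb_reference(image_bytes):
    """Return (guid, age, pdb_path) from the first RSDS record found, or None.

    Heuristic sketch: scans for the 4-byte signature rather than parsing
    the PE headers and debug directory.
    """
    pos = image_bytes.find(b"RSDS")
    if pos < 0:
        return None
    # Layout after the signature: 16-byte GUID, 4-byte little-endian age,
    # then a NUL-terminated PDB path.
    guid_start = pos + 4
    age_start = guid_start + 16
    path_start = age_start + 4
    if path_start > len(image_bytes):
        return None
    guid = uuid.UUID(bytes_le=image_bytes[guid_start:age_start])
    (age,) = struct.unpack_from("<I", image_bytes, age_start)
    path_end = image_bytes.find(b"\x00", path_start)
    if path_end < 0:
        return None
    pdb_path = image_bytes[path_start:path_end].decode("utf-8", errors="replace")
    return guid, age, pdb_path


# Synthetic bytes standing in for part of an executable (hypothetical path).
sample_guid = uuid.UUID("12345678-1234-5678-1234-567812345678")
sample = (
    b"\x00" * 8
    + b"RSDS"
    + sample_guid.bytes_le
    + struct.pack("<I", 3)
    + b"C:\\build\\app.pdb\x00"
)
print(find_pdb_reference(sample))
```

For a real file you would read the bytes with open(path, "rb").read() and pass them to the same function.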

Why do we need Symbols?

A symbol file (.pdb file) contains information that is not needed to run your application but comes in handy when you are hunting bugs. Without symbol files, tracking down bugs or getting exact call stacks is a pain on Windows. You might then ask why this information is not embedded in the executable itself. The answer is that symbols are not always needed, so they are kept in a separate .pdb file: you can debug when you need to, and you can choose who sees which symbols, which in turn makes it harder for people to reverse engineer your code.

What does a PDB file contain?

A PDB file can contain a variety of information, for example:

  • Source code information: Line numbers, file names
  • Variables: Global and Local variables mapped to their addresses
  • Function names mapped to their addresses
  • FPO (frame pointer omission) information, needed to get a correct call stack
  • etc

The Windows debugger installation includes utilities for working with PDB files, namely symchk, agestore, symstore, pdbcopy, etc.

Public and Private Symbols

When the linker generates a PDB file, the file contains both private and public debugging symbol information. Of course, you can configure what it generates in the linker property pages.

Private symbol data contains the following (mostly):

  • Global and Local Variables.
  • Functions
  • All user defined types.
  • Line number and source file information.

The public symbol table contains the following:

  • Functions (just the address)
  • Global variables that are visible across obj files.

As you might have inferred, private symbol files are bigger than public symbol files. Also, since a private symbol file contains the public symbol information as well, we can generate a separate public symbol file from it. We use a tool called pdbcopy.exe for this purpose; it comes with the Windows debugger installation.
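As a rough illustration of how this could be scripted, the Python sketch below builds the pdbcopy command line that writes a copy of a PDB with the private symbol data left out (the -p option). The helper names and the file names app.pdb and app_public.pdb are hypothetical, and the sketch assumes pdbcopy.exe is on PATH; if it is not, the run step is skipped.

```python
import shutil
import subprocess


def pdbcopy_strip_command(private_pdb, public_pdb, tool="pdbcopy.exe"):
    """Build the pdbcopy command line that writes a stripped copy of a PDB."""
    # -p asks pdbcopy to leave private symbol data out of the output file.
    return [tool, private_pdb, public_pdb, "-p"]


def strip_private_symbols(private_pdb, public_pdb, tool="pdbcopy.exe"):
    """Run pdbcopy if it can be found on PATH; return True on success."""
    if shutil.which(tool) is None:
        return False  # tool not installed or not on PATH
    cmd = pdbcopy_strip_command(private_pdb, public_pdb, tool)
    return subprocess.run(cmd).returncode == 0


# Hypothetical file names, shown only to illustrate the argument order.
print(pdbcopy_strip_command("app.pdb", "app_public.pdb"))
```

The original PDB is left untouched; only the output file is the stripped, public version.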

Symbol Path

So how do we tell the debugger where to look for symbols? One of my favorite ways is the environment variable _NT_SYMBOL_PATH. This variable gives us the flexibility to specify cache directories for downloaded symbols; we can even specify a separate cache directory per symbol server.

The following value for _NT_SYMBOL_PATH downloads symbols from the listed servers and puts them into the C:\Symbols folder.

cache*c:\Symbols;SRV*http://symbolserver;srv*http://anotherserver;srv*http://onemoreserver

The following value for _NT_SYMBOL_PATH downloads symbols from http://symbolserver into the C:\Symbols folder, and symbols from http://anotherserver into c:\anotherserver_cache_folder.

cache*c:\Symbols;SRV*http://symbolserver;srv*c:\anotherserver_cache_folder*http://anotherserver
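To see how such a value breaks down, here is a small Python sketch that splits a symbol path into its elements and reports which local folder and server each one names. It only handles the cache* and srv* forms used in the examples above; anything else is passed through as "other". The function name and the dictionary layout are my own choices for illustration, not something the debugger exposes.

```python
def parse_symbol_path(value):
    """Split a symbol path string into a list of dicts, one per element."""
    entries = []
    for element in value.split(";"):
        element = element.strip()
        if not element:
            continue
        parts = element.split("*")
        keyword = parts[0].lower()  # keywords are matched case-insensitively here
        if keyword == "cache" and len(parts) == 2:
            # cache*<folder>: cache folder for the elements that follow it
            entries.append({"kind": "cache", "local": parts[1], "server": None})
        elif keyword == "srv" and len(parts) == 2:
            # srv*<server>: server with no cache folder of its own
            entries.append({"kind": "srv", "local": None, "server": parts[1]})
        elif keyword == "srv" and len(parts) == 3:
            # srv*<folder>*<server>: server with its own cache folder
            entries.append({"kind": "srv", "local": parts[1], "server": parts[2]})
        else:
            # plain directories and forms this sketch does not interpret
            entries.append({"kind": "other", "local": element, "server": None})
    return entries


example = (
    r"cache*c:\Symbols;SRV*http://symbolserver;"
    r"srv*c:\anotherserver_cache_folder*http://anotherserver"
)
for entry in parse_symbol_path(example):
    print(entry)
```

Run on the second example value, this yields one cache entry for c:\Symbols, one server entry without its own folder, and one server entry tied to c:\anotherserver_cache_folder.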

The Windows debugger provides commands to control the symbol path: .sympath and .symfix. I use .symfix to quickly set up a default symbol path; symbols will then be downloaded to a sym folder under the debugger folder. .sympath is a cool command too. If you want to quickly add a symbol path in the debugger, just do the following…

.sympath+ C:\AnotherSymbolFolder
.reload

Controlling Symbol Loading in Windows Debugger

The debugger provides a command called .symopt. If we run the command without any arguments, it shows our current symbol loading settings, for example:

[Screenshot: output from .symopt]

So we see that in this case the debugger is configured to load line number information, and since SYMOPT_PUBLICS_ONLY is not set, private symbols are loaded. SYMOPT_AUTO_PUBLICS tells the debugger to look for public symbols only as a last resort.
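The symbol options are a bitmask, so the reasoning above can be sketched in a few lines of Python. The flag values below are the ones defined in dbghelp.h, but the table is deliberately a subset, and the helper names and the example mask are made up for illustration rather than taken from a real session.

```python
# Subset of the symbol option bits (values as defined in dbghelp.h).
SYMOPT_FLAGS = {
    0x00000001: "SYMOPT_CASE_INSENSITIVE",
    0x00000002: "SYMOPT_UNDNAME",
    0x00000004: "SYMOPT_DEFERRED_LOADS",
    0x00000010: "SYMOPT_LOAD_LINES",
    0x00004000: "SYMOPT_PUBLICS_ONLY",
    0x00008000: "SYMOPT_NO_PUBLICS",
    0x00010000: "SYMOPT_AUTO_PUBLICS",
}
SYMOPT_PUBLICS_ONLY = 0x00004000


def decode_symopt(mask):
    """Return (names of known flags set, bits not covered by this table)."""
    names = []
    remaining = mask
    for bit in sorted(SYMOPT_FLAGS):
        if mask & bit:
            names.append(SYMOPT_FLAGS[bit])
            remaining &= ~bit
    return names, remaining


def loads_private_symbols(mask):
    """Private symbols are considered unless SYMOPT_PUBLICS_ONLY is set."""
    return not (mask & SYMOPT_PUBLICS_ONLY)


# Hypothetical mask: line numbers on, public table only as a last resort.
example_mask = 0x00000010 | 0x00010000
print(decode_symopt(example_mask), loads_private_symbols(example_mask))
```

Any bits the table does not know about come back in the second element of the result, so they are not silently dropped.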

More information on symbol loading options can be found here: http://msdn.microsoft.com/en-us/library/windows/hardware/ff558827(v=vs.85).aspx

In addition, to see a list of modules for which symbol loading failed, use the command ‘lme’. To get verbose output of the symbol loading process in the debugger, use “!sym noisy”; to turn it off, use “!sym quiet”.

Conclusion

Always keep your symbols handy. You never know when you might need them.