[Debugging] How to find the length of a CString in application memory or in a dump

Recently a colleague of mine asked where the length of a CString is stored in memory. Hmm, so let’s dig around. Please note I’ve declared the following CString object in my code…

CString TestCString = _T("Nibu is testing CString");

If we dump the TestCString variable in the debugger, we see the following…

0:000> dt TestCString
Local var @ 0xb4fcd4 Type ATL::CStringT<wchar_t,StrTraitMFC_DLL<wchar_t,ATL::ChTraitsCRT<wchar_t> > >
   +0x000 m_pszData        : 0x00dfa2f8  "Nibu is testing CString"

From the above dump we see that the CString class defines just one member variable: m_pszData. There is no length variable here, so where is the length of the string stored?

The length of a CString is stored at a negative offset from m_pszData. The data structure that resides at that negative offset is ATL::CStringData.

0:000> dt mfc100ud!ATL::CStringData
   +0x000 pStringMgr       : Ptr32 ATL::IAtlStringMgr
   +0x004 nDataLength      : Int4B
   +0x008 nAllocLength     : Int4B
   +0x00c nRefs            : Int4B

CStringData is retrieved via a call to the function GetData():

CStringData* GetData() const throw()
{
    return( reinterpret_cast< CStringData* >( m_pszData )-1 );
}

The above code is a bit of pointer arithmetic: first m_pszData is cast to a pointer to CStringData, and then 1 is subtracted from the cast pointer, which moves it back by sizeof(CStringData) bytes. So let’s see whether we can get to the CStringData located at that negative offset while debugging. First let’s get the size of ATL::CStringData in memory.

0:045> ?? sizeof(ATL::CStringData)
unsigned int 0x10
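
The pointer arithmetic in GetData() can be tried outside of ATL with a small stand-in. The sketch below is not the ATL implementation: MockStringData, MakeMockString, GetMockData and FreeMockString are hypothetical names made up for illustration, and the mock only copies the field order of CStringData. It allocates the header and the characters in one block, hands out a pointer to the characters, and then walks back one header to recover the length.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <cwchar>
#include <new>

// Hypothetical stand-in for ATL::CStringData: same field order, no real string manager.
struct MockStringData {
    void* pStringMgr;    // unused in this sketch
    int   nDataLength;   // characters in use, excluding the terminating null
    int   nAllocLength;  // characters allocated, excluding the terminating null
    long  nRefs;         // reference count
};

// Allocates header + characters in one block and returns a pointer to the
// characters, which plays the role of m_pszData.
wchar_t* MakeMockString(const wchar_t* text) {
    const int length = static_cast<int>(std::wcslen(text));
    void* block = std::malloc(sizeof(MockStringData) + (length + 1) * sizeof(wchar_t));
    if (block == nullptr) {
        return nullptr;
    }
    MockStringData* header = new (block) MockStringData{nullptr, length, length, 1};
    wchar_t* data = reinterpret_cast<wchar_t*>(header + 1);  // characters start right after the header
    std::memcpy(data, text, (length + 1) * sizeof(wchar_t));
    return data;
}

// Same idea as GetData(): cast the character pointer, then step back one header.
MockStringData* GetMockData(wchar_t* pszData) {
    assert(pszData != nullptr);
    return reinterpret_cast<MockStringData*>(pszData) - 1;
}

// Frees the whole block; the allocation starts at the header, not at the characters.
void FreeMockString(wchar_t* pszData) {
    std::free(GetMockData(pszData));
}
```

Calling GetMockData(MakeMockString(L"Nibu is testing CString"))->nDataLength reads the length back from the header in front of the characters. Note that sizeof(MockStringData) depends on the pointer size of the build, so it will only match the 0x10 shown above in a 32-bit build.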

The size of ATL::CStringData comes to 0x10 bytes in this 32-bit process. So in my test application let’s find out what is located 0x10 bytes before m_pszData. My current frame has the following locals; my CString object is called TestCString.

0:000> dv
           this = 0x00ef6ba8
        cmdInfo = class CCommandLineInfo
       ttParams = class CMFCToolTipInfo
      InitCtrls = struct tagINITCOMMONCONTROLSEX
   pDocTemplate = 0xcccccccc
    TestCString = class ATL::CStringT<wchar_t,StrTraitMFC_DLL<wchar_t,ATL::ChTraitsCRT<wchar_t> > > 
     pMainFrame = 0xcccccccc

Subtracting 0x10 bytes from the address held in m_pszData (0x00dfa2f8) gives us the address 00dfa2e8:

0:000> ? 0x00dfa2f8-0x10
Evaluate expression: 14656232 = 00dfa2e8

Let’s try dumping the CStringData located at address 00dfa2e8. See below:

0:000> dt 00dfa2e8 TestStack!ATL::CStringData
   +0x000 pStringMgr       : 0x786cb8e4 ATL::IAtlStringMgr
   +0x004 nDataLength      : 0n23
   +0x008 nAllocLength     : 0n23
   +0x00c nRefs            : 0n1

The dump says the length of the string is 0n23 (the 0n prefix is the debugger’s notation for a decimal number), which is correct. The length of the string “Nibu is testing CString” is indeed 23.
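
The same lookup can be done by hand on raw bytes copied out of a dump. The following is a minimal sketch under a few assumptions: the target is a 32-bit process with the 0x10-byte layout shown by dt above, the host reading the bytes is little-endian like x86, and StringHeader32 and ReadStringHeader32 are hypothetical names that are not part of ATL or the debugger.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Layout of ATL::CStringData as reported by dt for a 32-bit process.
// Fixed-width types keep the layout the same even on a 64-bit host.
struct StringHeader32 {
    std::uint32_t pStringMgr;    // +0x000, 32-bit pointer value
    std::int32_t  nDataLength;   // +0x004
    std::int32_t  nAllocLength;  // +0x008
    std::int32_t  nRefs;         // +0x00c
};
static_assert(sizeof(StringHeader32) == 0x10, "header must match the 32-bit layout");

// Hypothetical helper: 'region' holds bytes copied from the dump, and its first
// byte corresponds to target address 'regionBase'. 'pszData' is the value of
// m_pszData. Returns false if the header would fall outside the region.
bool ReadStringHeader32(const std::vector<std::uint8_t>& region,
                        std::uint32_t regionBase,
                        std::uint32_t pszData,
                        StringHeader32& out) {
    if (pszData < regionBase) {
        return false;
    }
    const std::size_t dataOffset = static_cast<std::size_t>(pszData - regionBase);
    if (dataOffset < sizeof(StringHeader32) || dataOffset > region.size()) {
        return false;
    }
    // The header sits immediately before the characters: m_pszData - 0x10.
    std::memcpy(&out, region.data() + (dataOffset - sizeof(StringHeader32)),
                sizeof(StringHeader32));
    return true;
}
```

With a region that starts at 0x00dfa2e0 and pszData set to 0x00dfa2f8, the helper copies the 16 bytes at 0x00dfa2e8, which is the address computed in the debugger session above.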

The code documentation of CStringData says this about its member variables…

struct CStringData
{
    IAtlStringMgr* pStringMgr;  // String manager for this CStringData
    int nDataLength;  // Length of currently used data in XCHARs (not including terminating null)
    int nAllocLength;  // Length of allocated data in XCHARs (not including terminating null)
    long nRefs;     // Reference count: negative == locked
    // XCHAR data[nAllocLength+1]  // A CStringData is always followed in memory by the actual array of character data

    // ... (member functions omitted)
};

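The difference between nDataLength and nAllocLength follows from these comments: nDataLength is the number of characters currently in use, while nAllocLength is the number of characters the buffer has room for, both excluding the terminating null. In the dump above both are 23 because the buffer holds exactly the string. Hope this helps.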

Appreciate your comments...