Dec 09 2013
 

Recently a colleague of mine asked where the length of a CString string is stored in memory. So let’s dig around. Please note that I’ve declared the following CString object in my code…

CString TestCString = _T("Nibu is testing CString");

If we dump the CString type in the debugger, we see the following…

0:000> dt TestCString
Local var @ 0xb4fcd4 Type ATL::CStringT<wchar_t,StrTraitMFC_DLL<wchar_t,ATL::ChTraitsCRT<wchar_t> > >
   +0x000 m_pszData        : 0x00dfa2f8  "Nibu is testing CString"

From the above dump of type CString we see that the CString class defines just one member variable: m_pszData. I don’t see a length variable here, so where is the length of a CString string stored?

The length of a CString string is stored at a negative offset from m_pszData. The data structure that resides at that negative offset is ATL::CStringData.

0:000> dt mfc100ud!ATL::CStringData
   +0x000 pStringMgr       : Ptr32 ATL::IAtlStringMgr
   +0x004 nDataLength      : Int4B
   +0x008 nAllocLength     : Int4B
   +0x00c nRefs            : Int4B

CStringData is retrieved via a call to function: GetData()

CStringData* GetData() const throw()
{
    return( reinterpret_cast< CStringData* >( m_pszData )-1 );
}

The above code is a bit of pointer arithmetic: first m_pszData is cast to a pointer to CStringData, and then 1 is subtracted from that pointer, which moves it back by sizeof(CStringData) bytes. So let’s see while debugging whether we can get to the CStringData located at that negative offset. First, let’s get the size of ATL::CStringData in memory.

0:045> ?? sizeof(ATL::CStringData)
unsigned int 0x10

The size of ATL::CStringData comes to 0x10 bytes. So in my test application let’s find out what is located 0x10 bytes before the address held in m_pszData. In my current frame I have the following locals; my CString object is the one called TestCString in the listing below.

0:000> dv
           this = 0x00ef6ba8
        cmdInfo = class CCommandLineInfo
       ttParams = class CMFCToolTipInfo
      InitCtrls = struct tagINITCOMMONCONTROLSEX
   pDocTemplate = 0xcccccccc
    TestCString = class ATL::CStringT<wchar_t,StrTraitMFC_DLL<wchar_t,ATL::ChTraitsCRT<wchar_t> > > 
     pMainFrame = 0xcccccccc

Subtracting 0x10 bytes from the address held in m_pszData (0x00dfa2f8) gives us the address 00dfa2e8.

0:000> ? 0x00dfa2f8-0x10
Evaluate expression: 14656232 = 00dfa2e8

Let’s try dumping the CStringData located at address 00dfa2e8. See below.

0:000> dt 00dfa2e8 TestStack!ATL::CStringData
   +0x000 pStringMgr       : 0x786cb8e4 ATL::IAtlStringMgr
   +0x004 nDataLength      : 0n23
   +0x008 nAllocLength     : 0n23
   +0x00c nRefs            : 0n1

The dump says the length of the string is 0n23 (decimal 23), which is correct: the string “Nibu is testing CString” is indeed 23 characters long.
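To make the negative-offset idea concrete outside the debugger, here is a minimal portable sketch. It is not the ATL implementation: DemoHeader, DemoAlloc, DemoGetData and DemoFree are hypothetical names made up for this illustration, the string manager pointer is left out, and plain char is used instead of XCHAR. It only shows a header stored immediately before the character data and recovered with the same "cast, then subtract 1" trick.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for CStringData (the pStringMgr member is omitted).
struct DemoHeader
{
    int  nDataLength;   // characters in use, not counting the terminating null
    int  nAllocLength;  // characters allocated, not counting the terminating null
    long nRefs;         // reference count
};

// Allocates header + characters in one block and returns a pointer to the
// characters, the way m_pszData points just past its header.
char* DemoAlloc( const char* text )
{
    assert( text != nullptr );
    const std::size_t len = std::strlen( text );
    void* block = std::malloc( sizeof( DemoHeader ) + len + 1 );
    if( block == nullptr )
    {
        return nullptr;
    }

    DemoHeader* header = new ( block ) DemoHeader{ static_cast<int>( len ), static_cast<int>( len ), 1 };
    char* data = reinterpret_cast<char*>( header + 1 ); // first byte after the header
    std::memcpy( data, text, len + 1 );                 // copy including the null
    return data;
}

// Same arithmetic as GetData(): cast to the header type, then step back by one header.
DemoHeader* DemoGetData( char* data )
{
    return reinterpret_cast<DemoHeader*>( data ) - 1;
}

// Frees the whole block, which starts at the header, not at the characters.
void DemoFree( char* data )
{
    std::free( DemoGetData( data ) );
}
```

Calling DemoGetData on the pointer returned by DemoAlloc gives back the header, so the length can be read without scanning the characters for the terminating null.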

Code documentation of CStringData says this about its member variables…

struct CStringData
{
    IAtlStringMgr* pStringMgr;  // String manager for this CStringData
    int nDataLength;  // Length of currently used data in XCHARs (not including terminating null)
    int nAllocLength;  // Length of allocated data in XCHARs (not including terminating null)
    long nRefs;     // Reference count: negative == locked
    // XCHAR data[nAllocLength+1]  // A CStringData is always followed in memory by the actual array of character data
    // (member functions omitted)
};

The difference between nDataLength and nAllocLength is quite evident from the above documentation: the first is the length currently in use, the second is the length allocated. Hope this helps.

Sep 05 2008
 

Converting from CString to char* is conditional, because CString is a TCHAR based implementation. TCHAR is defined as char if _UNICODE is not defined, so in that case we can convert to char* directly, since TCHAR* and char* are the same type. Otherwise TCHAR is wchar_t and we have to use a conversion function such as WideCharToMultiByte, or an ATL helper such as W2A or CW2AEX.

For the rest of this post I will be using the TCHAR version of char. The explanation above should help in understanding what exactly a TCHAR is.
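For the wide-character case, the conversion step looks roughly like the sketch below. It deliberately swaps in the standard library function std::wcstombs as a portable stand-in for the Windows helpers, so it converts according to the current C locale rather than a chosen code page. NarrowFromWide is a hypothetical helper name used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: converts a wide string to a narrow one using the
// current C locale. Returns an empty string if a character cannot be converted.
std::string NarrowFromWide( const std::wstring& wide )
{
    // Worst case: every wide character needs MB_CUR_MAX bytes, plus the null.
    std::vector<char> buffer( wide.size() * MB_CUR_MAX + 1 );

    const std::size_t written = std::wcstombs( buffer.data(), wide.c_str(), buffer.size() );
    if( written == static_cast<std::size_t>( -1 ) )
    {
        return std::string(); // conversion failed
    }

    return std::string( buffer.data(), written );
}
```

In a real Windows program you would more likely reach for WideCharToMultiByte or the ATL conversion helpers, which let you pick the target code page explicitly.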

Converting CString to LPCTSTR –

CString has a built-in conversion operator which returns a constant pointer to its internal data. It’s called operator LPCTSTR().

So if you write code like this…

CString Str;
LPCTSTR lpctszStr = Str;

The compiler replaces the second line with a call to operator LPCTSTR(), so it will look somewhat like this…

LPCTSTR lpctszStr = Str.operator LPCTSTR();

This is the reason why we can directly pass CString objects to SDK functions which take LPCTSTR arguments, e.g. ::SetWindowText.
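The mechanism is ordinary C++ and can be tried without MFC. Here is a minimal sketch, assuming a made-up class DemoString and a made-up function DemoLength that stands in for an SDK function taking a constant string pointer. Neither name comes from MFC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical string class with a conversion operator, similar in spirit
// to CString's operator LPCTSTR().
class DemoString
{
public:
    explicit DemoString( const char* text ) : m_text( text ) {}

    // Implicit conversion to a constant pointer to the internal data.
    operator const char*() const { return m_text.c_str(); }

private:
    std::string m_text;
};

// Hypothetical stand-in for an SDK function that takes a constant string pointer.
std::size_t DemoLength( const char* psz )
{
    assert( psz != nullptr );
    return std::strlen( psz );
}
```

With this in place, DemoLength( Str ) compiles for a DemoString Str because the compiler inserts the call to the conversion operator for us. The returned pointer is only valid while the string object is alive and unmodified.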

Converting CString to LPTSTR –

Converting from CString to LPTSTR is slightly more work. We have to call the function GetBuffer to get the internal data pointer. Don’t forget to call ReleaseBuffer once you are done with the buffer.

LPTSTR lptszStr = Str.GetBuffer(0);
// ... use lptszStr here, before releasing the buffer ...
Str.ReleaseBuffer();

A different flavor of this function exists, called GetBufferSetLength. What’s the use of this function? You can explicitly ask CString to give you a larger buffer than the current one. For example, if you want to call the SDK version of GetWindowText using a CString object, without a temporary raw TCHAR buffer, you can call GetBufferSetLength with the required buffer length as the argument. And of course the ReleaseBuffer call should be made too!

Here is a classic example from CWnd::GetWindowText on how to use GetBufferSetLength.

CString rString;
int nLen = ::GetWindowTextLength(m_hWnd);
::GetWindowText(m_hWnd, rString.GetBufferSetLength(nLen),  nLen+1);
rString.ReleaseBuffer();
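The same "grow the buffer, let a C-style function fill it, then fix up the length" pattern can be sketched in portable C++ with std::string. This is only an analogy and not how CString is implemented: DemoGetText is a hypothetical function standing in for ::GetWindowText, resize plays the role of GetBufferSetLength, and the final resize plays the role of ReleaseBuffer.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for ::GetWindowText: copies at most size-1 characters
// plus a terminating null into out and returns the number of characters copied.
int DemoGetText( char* out, int size )
{
    assert( out != nullptr && size > 0 );
    const char* source = "Window title";
    int copied = 0;
    while( copied < size - 1 && source[copied] != '\0' )
    {
        out[copied] = source[copied];
        ++copied;
    }
    out[copied] = '\0';
    return copied;
}

std::string DemoReadText()
{
    std::string result;
    result.resize( 32 ); // ask for a writable buffer, like GetBufferSetLength

    // Let the C-style function write directly into the string's storage.
    const int copied = DemoGetText( &result[0], static_cast<int>( result.size() ) );

    result.resize( copied ); // make the length match what was written, like ReleaseBuffer
    return result;
}
```

The point in both versions is that the string object does not know how many characters the callee wrote until you tell it, which is why the closing call must not be skipped.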

Converting to and from a unicode string –

http://nibuthomas.wordpress.com/2008/07/02/how-to-convert-a-ansi-string-to-unicode-string-and-vice-versa/

Sep 05 2008
 

I will describe in this post three ways to trim a string of given characters…

  1. Using custom function for std::string
  2. Using CString
  3. Using StrTrim shell API function.

Using custom function for std::string

It’s a bit strange that std::string doesn’t provide a Trim function 😕, but hey, since we’ve got heads upon our shoulders we’re gonna write one. 😛 Here is a simple function which removes spaces before and after a string. We’ll call it Trim!

#include <iostream>
#include <string>

void Trim( const std::string& StrToTrim, std::string& TrimmedString )
{
  // Find first non-space char in StrToTrim
  const std::string::size_type First = StrToTrim.find_first_not_of( ' ' );
  // No non-space char at all (empty or all spaces): the result is an empty string
  if( First == std::string::npos )
  {
    TrimmedString.clear();
    return;
  }

  // Find last non-space char in StrToTrim; it exists because First was found
  const std::string::size_type Last = StrToTrim.find_last_not_of( ' ' );

  // Copy the characters from First through Last to TrimmedString
  TrimmedString = StrToTrim.substr( First, ( Last + 1 ) - First );
}

int main()
{
  std::string StrToTrim = " 32 Nibu babu thomas 2342 2 23 3 ";
  std::string TrimmedString = "On return will hold trimmed string";
  Trim( StrToTrim, TrimmedString );
  std::cout << TrimmedString << '\n';

  return 0;
}
//Output is: 32 Nibu babu thomas 2342 2 23 3
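Since the goal of this post is trimming a string of given characters, here is a small variation that takes the set of characters to strip as a parameter. TrimChars is a hypothetical name, and the default set of space, tab, carriage return and newline is just an assumption about what you would usually want removed.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns a copy of text with any leading and trailing
// characters found in chars removed.
std::string TrimChars( const std::string& text, const std::string& chars = " \t\r\n" )
{
    const std::string::size_type first = text.find_first_not_of( chars );
    if( first == std::string::npos )
    {
        return std::string(); // nothing but trim characters, or empty input
    }

    // A last character to keep must exist because a first one was found.
    const std::string::size_type last = text.find_last_not_of( chars );
    return text.substr( first, last - first + 1 );
}
```

Returning the result by value keeps the call sites short, for example TrimChars( "_!ABCDEFG#", "#A!_" ) to strip a custom set of characters from both ends.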

Using CString

CString also has a Trim function; if you have the liberty to use CString, then that’s another option.

Using StrTrim shell API function

Another option is to use the StrTrim shell API function. Here is a demo from MSDN.

#include <windows.h>
#include <iostream>
#include "Shlwapi.h"   // link with Shlwapi.lib

using namespace std;

// Note: cout prints the text only in an ANSI (non-UNICODE) build, where TCHAR is char
int main(void)
{
    //String one
    TCHAR buffer[ ] = TEXT("_!ABCDEFG#");
    TCHAR trim[ ] = TEXT("#A!_");

    cout << "The string before calling StrTrim: ";
    cout << buffer;
    cout << "\n";

    StrTrim(buffer, trim);

    cout << "The string after calling StrTrim: ";
    cout << buffer;
    cout << "\n";

    return 0;
}

OUTPUT:
- - - - - -
The string before calling StrTrim: _!ABCDEFG#
The string after calling StrTrim: BCDEFG

Jul 02 2008
 

std::stringstream can be used as a replacement for CString::Format if you are using CString just for the sake of Format. One caveat is that it could be slow: I’ve heard a user mention this, but I’m not sure, as I haven’t tested it. Still, it should definitely be better than using CString::Format and then assigning the result to a std::string.

std::stringstream strm;
strm << "Age: " << nAge << ", DOB: " << szDate << ", Salary: " << nSalary;
std::cout << strm.str();
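Wrapped in a function, the same idea gives you a Format-like helper that returns a std::string directly. FormatPerson is a hypothetical name, and the parameter types (int for age and salary, std::string for the date) are assumptions, since the snippet above doesn’t show how those variables are declared.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: builds the formatted text with a stream instead of
// CString::Format and returns it as a std::string.
std::string FormatPerson( int nAge, const std::string& szDate, int nSalary )
{
    std::ostringstream strm;
    strm << "Age: " << nAge << ", DOB: " << szDate << ", Salary: " << nSalary;
    return strm.str();
}
```

std::ostringstream is used here because the stream is only written to; std::stringstream would work just as well.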

Jun 25 2008
 

Quite simple…

// A TCHAR based std::string
typedef std::basic_string<TCHAR> tstring;
// A TCHAR based std::ifstream;
typedef std::basic_ifstream<TCHAR, std::char_traits<TCHAR> > tstream;
// A TCHAR based std::stringstream
typedef std::basic_stringstream<TCHAR, std::char_traits<TCHAR>, std::allocator<TCHAR> > tstringstream;

So now there is no need to worry about UNICODE and ANSI; these types should work like CString, since TCHAR becomes char or wchar_t based on whether the _UNICODE macro is defined.
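Here is a short usage sketch of such typedefs. The fallback branch is an assumption added only so the snippet also builds outside Windows, where tchar.h does not exist; it pretends TCHAR is char and _T does nothing. DemoLabel is a hypothetical function name.

```cpp
#include <cassert>
#include <sstream>
#include <string>

#if defined(_WIN32)
#include <tchar.h>      // provides TCHAR and _T on Windows
#else
typedef char TCHAR;     // assumption: ANSI-style stand-in for other platforms
#define _T(x) x
#endif

// TCHAR based string and string stream
typedef std::basic_string<TCHAR> tstring;
typedef std::basic_stringstream<TCHAR> tstringstream;

// Hypothetical function: builds "name: value" without caring whether TCHAR
// is char or wchar_t in this build.
tstring DemoLabel( const tstring& name, int value )
{
    tstringstream strm;
    strm << name << _T( ": " ) << value;
    return strm.str();
}
```

As long as every literal is wrapped in _T, the same source should compile in both ANSI and UNICODE builds.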

Also note that the STL provides wide-character versions of these classes, e.g. wstring, wstringstream and wifstream, but since Windows has a type that switches automagically between char and wchar_t, we are making use of it.

So the idea behind this is that STL classes are mostly template based, which means you can add your own version of an STL class for a custom type, just like I’ve done. In conclusion, we can say that std::string is like a vector<char>, but with dedicated string operations.