Saturday, May 10, 2008

char and its relatives

In C, I was used to saying

char *psz = "This is some string";

Here's what changes in VC++:

VC++ can build a project with either multibyte characters or Unicode characters. This is set in the IDE's project options.

Between the multibyte character set (MBCS) and Unicode, Unicode has greater acceptance, so we usually program in Unicode. To do that, every string first has to be defined like this:

wchar_t *buff = L"This is some string";

Here, the prefix L tells the compiler that the string is made up of wide (Unicode) characters. Since char represents an 8-bit character, we need another data type to represent a Unicode character, so we have wchar_t (16 bits in VC++).
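To make the difference concrete, here is a small sketch that puts the narrow and the wide version of the same literal side by side. The function names (narrow_length, wide_length, wide_bytes, check_lengths) are my own and only for illustration. I use const pointers because newer compilers reject a non-const pointer to a string literal. Note that sizeof(wchar_t) is 2 in VC++ but is often 4 with other compilers, so the sketch never hardcodes it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Narrow string: every element is a char (always 1 byte).
const char *narrow = "This is some string";

// Wide string: the L prefix makes every element a wchar_t.
const wchar_t *wide = L"This is some string";

// Character counts, not including the terminating zero.
std::size_t narrow_length() { return std::strlen(narrow); }
std::size_t wide_length() { return std::wcslen(wide); }

// Bytes occupied by the wide characters (terminator excluded).
// The result depends on the compiler's sizeof(wchar_t).
std::size_t wide_bytes() { return std::wcslen(wide) * sizeof(wchar_t); }

// Both strings hold the same number of characters; only the
// size of each element differs.
void check_lengths() { assert(narrow_length() == wide_length()); }
```

The thing to notice is that the narrow string needs strlen while the wide one needs wcslen. The two families of functions are not interchangeable.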

However, you might need to switch between ANSI strings and Unicode strings. To support that, VC++ gives us the macro TCHAR. It expands to wchar_t if UNICODE is defined, and to char otherwise.

Similarly, instead of hardcoding the "L" prefix, we again have an option in MFC to use the macro "_T".

In SDK code we can either include tchar.h or write the macro ourselves as follows. (tchar.h itself checks _UNICODE rather than UNICODE, so the two are normally defined together.)

#ifdef UNICODE
#define _T(x) L##x
#else
#define _T(x) x
#endif

So now we can write:

TCHAR *psz = _T("This is some string");
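Here is a rough, portable sketch of how the whole switch hangs together. So that it compiles without windows.h or tchar.h and does not clash with them, I use made-up names: MY_UNICODE, MY_TCHAR, MY_T and my_tcslen stand in for UNICODE, TCHAR, _T and the real length routine. Treat it as an illustration of the idea, not as what the real headers contain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for UNICODE / TCHAR / _T.
// Define MY_UNICODE (for example with -DMY_UNICODE) to get the wide build.
#ifdef MY_UNICODE
typedef wchar_t MY_TCHAR;
#define MY_T(x) L##x   // paste the L prefix onto the literal
inline std::size_t my_tcslen(const MY_TCHAR *s) { return std::wcslen(s); }
#else
typedef char MY_TCHAR;
#define MY_T(x) x      // leave the literal unchanged
inline std::size_t my_tcslen(const MY_TCHAR *s) { return std::strlen(s); }
#endif

// The same source line works in both builds.
const MY_TCHAR *psz = MY_T("This is some string");

// Length in characters, whichever character type is active.
std::size_t psz_length() { return my_tcslen(psz); }

// The element type of the string always matches MY_TCHAR.
void check_element_size() { assert(sizeof(psz[0]) == sizeof(MY_TCHAR)); }
```

The point is that the string functions have to switch together with the character type. That is why the sketch wraps strlen and wcslen behind one name instead of calling either directly.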
