characters using MBCS

Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=3928
Printed Date: 28 Apr 24 at 5:11AM


Topic: characters using MBCS
Posted By: Tullhead
Subject: characters using MBCS
Date Posted: 24 Jun 21 at 12:56AM
I'm using MBCS in MFC C++. When I am ready to pass the string to Debenu, I use:

LPCWSTR CString2Wide(CString s)
{
_bstr_t bs = s;
return (wchar_t*)bs;
}

So, a call looks something like:

pPDF->DrawText(160, 100, CString2Wide(MyTitle)); 

All this works fine for plain English strings. Now I want to support German, French, and Spanish.
So, just umlauts and accents. In theory this can all be done in MBCS, with no need to go to UNICODE.
I want to avoid UNICODE for now.

Now when I pass it a German string like:  ist für alle

It shows up in the PDF like this: ist fÃ¼r alle

How can I fix this?  Thanks if you can help.  Don't tell me to use UNICODE.




Replies:
Posted By: tfrost
Date Posted: 24 Jun 21 at 10:30AM
I can't avoid "telling you to use Unicode", because you are already using it: your MBCS strings are Unicode, encoded as UTF-8.

You may think that "ist für alle" is in the Windows (USA) default encoding (code page 1252). If it were in the default encoding, it would contain the bytes:

   69 73 74 20 66 FC 72 20 61 6C 6C 65
   i  s  t     f  ü  r     a  l  l  e

But what you are passing to your CString2Wide function is in a Multibyte Character Set (MBCS) encoding of Unicode, UTF8:

   69 73 74 20 66 C3 BC 72 20 61 6C 6C 65
   i  s  t     f  ü     r     a  l  l  e
   
When you simply 'cast' this to a Wide String, the conversion assumes that it is in the default encoding, so it interprets each byte as a separate character:

   69 73 74 20 66 C3 BC 72 20 61 6C 6C 65
   i  s  t     f  Ã  ¼  r     a  l  l  e

And this becomes, as a Wide String:

   0069 0073 0074 0020 0066 00C3 00BC 0072 0020 0061 006C 006C 0065
   i    s    t         f    Ã    ¼    r         a    l    l    e
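
The misreading shown in these tables can be reproduced without any Windows calls. The sketch below is only an illustration: WidenBytes is a hypothetical name, and it treats every byte as Latin-1, which agrees with code page 1252 for the two bytes involved here (C3 and BC) but not for bytes in the range 80 to 9F.

```cpp
#include <string>

// Widens each byte to one wide character, as if the input were in a
// single-byte encoding. WidenBytes is a hypothetical name used only for
// illustration; it mimics the faulty conversion, it is not the fix.
std::wstring WidenBytes(const std::string& bytes)
{
    std::wstring wide;
    wide.reserve(bytes.size());
    for (char c : bytes) {
        // Go through unsigned char so bytes >= 0x80 do not sign-extend.
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    }
    return wide;
}
```

Fed the 13 UTF-8 bytes of "ist für alle", this yields 13 wide characters, with 00C3 and 00BC where a single 00FC was wanted.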
  
To fix this in your program and continue to use MBCS elsewhere, I suggest converting your MBCS (UTF-8) string to WideChars (UTF-16) using MultiByteToWideChar, with CP_UTF8 as the first parameter.  The method you have used works only for ASCII, as you have found.  You will find plenty of examples for MultiByteToWideChar; since it takes an input and an output buffer as parameters, it is easiest to use inline, not as a function.
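
A minimal sketch of that conversion, assuming the string really holds UTF-8 bytes as described above. Utf8ToWide is a hypothetical helper name, not part of MFC or the Debenu library. It wraps the two buffers in a std::wstring returned by value, which is one way around the buffer handling; calling MultiByteToWideChar inline works just as well. The non-Windows branch is only a stand-in so the sketch builds elsewhere, and it assumes well-formed input.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Converts UTF-8 bytes to a wide string (UTF-16 on Windows).
// Hypothetical helper for illustration; returns an empty string on failure.
std::wstring Utf8ToWide(const std::string& utf8)
{
    if (utf8.empty()) return std::wstring();
#ifdef _WIN32
    const int inLen = static_cast<int>(utf8.size());
    // First call: ask how many wchar_t units the result needs.
    const int outLen = ::MultiByteToWideChar(CP_UTF8, 0, utf8.data(), inLen,
                                             nullptr, 0);
    if (outLen <= 0) return std::wstring();
    std::wstring wide(static_cast<std::size_t>(outLen), L'\0');
    // Second call: perform the conversion into the sized buffer.
    ::MultiByteToWideChar(CP_UTF8, 0, utf8.data(), inLen, &wide[0], outLen);
    return wide;
#else
    // Stand-in for other platforms, where wchar_t is typically 32 bits wide.
    // Minimal decoder: no validation, assumes well-formed UTF-8.
    std::wstring wide;
    for (std::size_t i = 0; i < utf8.size(); ) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i++]);
        unsigned long cp;
        int extra;
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        // Each continuation byte contributes its low six bits.
        for (; extra > 0 && i < utf8.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i]) & 0x3F);
        wide.push_back(static_cast<wchar_t>(cp));
    }
    return wide;
#endif
}
```

A call might then look like pPDF->DrawText(160, 100, Utf8ToWide((LPCSTR)MyTitle).c_str()); assuming DrawText accepts a wide-string pointer as in the original call. The temporary std::wstring lives until the end of the full expression, so the pointer stays valid for the duration of the call.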


