How to fix the "mbcs environments not supported" error (2024)

Besides single-byte character set (SBCS) translation, TMA SNA supports multi-byte character set (MBCS) translation to meet Asian customers' requirements. This translation does not depend on the translation tables under the directory $TUXDIR/udataobj/codepage; instead, it is based on the ICU libraries, which support more than 200 different SBCS and MBCS character sets.

To use this translation, CODEPAGE in the DMCONFIG DM_REMOTE_DOMAINS section should be specified in the following format:

CODEPAGE="ASCII_CHAR_SET:EBCDIC_CHAR_SET"

ASCII_CHAR_SET is the character set of the local domain and EBCDIC_CHAR_SET is the character set of the remote domain; the two are separated by a colon.

Both ASCII_CHAR_SET and EBCDIC_CHAR_SET must be character sets known to ICU. The uconv utility can be used to retrieve information about these character sets; for example, running uconv -l lists the converter names ICU knows about.
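The same list can also be pulled programmatically from ICU's converter API. Below is a minimal sketch (assuming the ICU4C headers are installed and the program is linked against the common library, e.g. -licuuc); it simply prints every converter name this ICU build knows about:

#include <unicode/ucnv.h>
#include <cstdio>

int main() {
    // Enumerate every converter (character set) available in this ICU build.
    int32_t count = ucnv_countAvailable();
    for (int32_t i = 0; i < count; ++i) {
        std::printf("%s\n", ucnv_getAvailableName(i));
    }
    return 0;
}

Any name (or alias) printed by this program should be usable on either side of the colon in CODEPAGE.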

For example, the local domain uses the gb2312-1980 Chinese code set and the remote domain uses the ibm-1388 Chinese code set:

# DMCONFIG
*DM_REMOTE_DOMAINS
CICA TYPE=SNAX CODEPAGE="gb2312-1980:ibm-1388"
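To get a feel for what such a mapping does, the conversion can be approximated with ICU's ucnv_convert() function. This is only an illustrative sketch of the underlying ICU call, not the gateway's actual code path; the sample bytes and buffer size are arbitrary:

#include <unicode/ucnv.h>
#include <cstdio>

int main() {
    // A short GB2312-encoded string from the local (ASCII-side) domain.
    const char gbText[] = "\xc4\xe3\xba\xc3";   // illustrative GB2312 bytes
    char ebcdicOut[64];                         // buffer for the ibm-1388 (EBCDIC) result
    UErrorCode status = U_ZERO_ERROR;

    // Convert from the local code set to the remote host code set.
    int32_t len = ucnv_convert("ibm-1388",      // to: remote EBCDIC code set
                               "gb2312-1980",   // from: local code set
                               ebcdicOut, (int32_t)sizeof(ebcdicOut),
                               gbText, -1,      // -1: source is NUL-terminated
                               &status);

    if (U_FAILURE(status)) {
        std::printf("conversion failed: %s\n", u_errorName(status));
        return 1;
    }
    std::printf("converted to %d EBCDIC bytes\n", (int)len);
    return 0;
}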

Sorry, my loose terminology again maybe, but selecting the MBCS option sets TCHAR to char, that's what I mean by it using single-byte chars.

Regarding TCHAR: TCHAR evaluates to char in an ANSI build and to wchar_t in a Unicode build.

Right, and to char in MBCS builds.
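To make the mapping concrete, this is roughly what <tchar.h> boils down to (a simplified sketch, not the real header, which also defines _TCHAR and remaps a long list of other string functions):

// Simplified sketch of the TCHAR machinery in <tchar.h>
#if defined(_UNICODE)
    typedef wchar_t TCHAR;          // Unicode build: wide characters
    #define _T(x)       L##x
    #define _tcscpy     wcscpy
#else
    typedef char TCHAR;             // ANSI and MBCS builds: single-byte char type
    #define _T(x)       x
    #if defined(_MBCS)
        #define _tcscpy _mbscpy     // MBCS build: lead-byte-aware CRT routine
    #else
        #define _tcscpy strcpy      // plain ANSI build
    #endif
#endif

So MBCS and ANSI builds both store text in char; the difference is essentially which CRT string routines the _t*() macros expand to.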

I have obviously used MBCS for some historic reason and maybe I could have just set that option to "Not set", which I believe implies ANSI. They've been pretty much equivalent for my purposes thus far (coding for English and ignoring multi-byte codings). Either way, I think ANSI and MBCS are equally deprecated.

I remember that using char instead of TCHAR and the _t*() functions was considered bad practice in MFC even back in the late 1990s when I was using it, at the latest around the time MSVC 6 was released (1998), when mainstream desktop Windows did not even support Unicode (XP, released in 2001, was the first mainstream desktop Windows to do so).

I don't think we were ever taught about anything other than char when I was at uni in the early 90s, and the places I worked never cared about it. I know it's been around for ages though.

The other motivation was probably that it's just so damn ugly! Readable: char and strcpy(). Gibberish: _TCHAR and _tcscpy(). At least to me. It seems every time they add something to C++ they try to make it uglier. I've been working mostly in C# and Unity in the last few years and wow, it's so much nicer. But I want to maintain my personal programs that I started 20 years ago in C++ and still work on and sell today. But I've dug myself into many holes and need to dig back out before I can release a new version.

I think this is a typical example of code rot, where you did not update your code in 20+ years and it caught up with you...

Actually I updated it heaps over 20 years, and it's already been a lot of work to keep up. But still char -> TCHAR and 32 -> 64 bit are huge changes that I haven't managed yet. I'm not sure if MFC -> wxWidgets is manageable at all, but I'd like to do that too if I can. Also looking to use it for a new project, but will be re-using some previous code, so still have to modernise that.

RobertWebb wrote: Sun Jun 21, 2020 5:16 am Yes, choosing the Unicode option will add _UNICODE to the FINAL preprocessor definitions, but NOT to "Properties->C/C++->Preprocessor->Preprocessor Definitions". There is, of course, no need to add it there manually.

This is not true: both symbols are added to the final preprocessor definitions; you can see that in the "Command line" tab. You may be confused because they are not added to the Preprocessor Definitions field, but they are inherited from the project settings. You can see that in the Preprocessor Definitions dialog, which shows both manually added and inherited preprocessor definitions. See here for an explanation of why both symbols must be defined: https://devblogs.microsoft.com/oldnewthing/?p=40643

We seem to be getting confused here. I made no mention of the UNICODE symbol. Yes, that is added too, along with _UNICODE.

Here's the issue, and it's just about the _UNICODE symbol. In the Visual Studio options, setting "Character Set" to "Use Unicode Character Set" means _UNICODE is defined. This is good; it is what we want in that case. However, the _UNICODE symbol is ALSO being explicitly set, a second time, elsewhere in the project settings, in "C/C++->Preprocessor->Preprocessor Definitions". This is not necessary, because it will be defined anyway. Setting "Character Set" to "...Unicode..." did not put it there; it has been manually added separately, so it ends up being defined twice.

So if you change "Character Set" to "Multi-byte Character Set", it will define _MBCS instead of _UNICODE, but _UNICODE is STILL set in the C++ section, so things break. The _UNICODE symbol should not be set explicitly in any of the C++ settings. It will already be implicitly set by the "Character Set" setting.
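One cheap way to catch that conflict at build time is a compile-time check in a common header (stdafx.h or wherever suits the project); this is just a sketch, and the error messages are placeholders:

// Fail the build immediately if the character-set settings contradict each other.
#if defined(_UNICODE) && defined(_MBCS)
    #error "_UNICODE and _MBCS are both defined: remove the stray _UNICODE from C/C++ -> Preprocessor -> Preprocessor Definitions"
#endif

// UNICODE drives the Windows headers, _UNICODE drives the CRT/<tchar.h>;
// the two are expected to be defined (or left undefined) together.
#if defined(UNICODE) != defined(_UNICODE)
    #error "UNICODE and _UNICODE must be defined together"
#endif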

I'm confused about a lot of things, but I don't think this is one of them.


To conclude, I do not mean to preach, but I think you should read up on character encodings to really understand what they mean and how they work. See, e.g., the list in my previous post for starters.

Thanks, you've actually been very patient! It just seems like I have to spend all my spare time keeping up with where I already was for no functional gain, so it gets frustrating. Maybe this discussion will inspire me to make some progress!