[C] wide char in C, how to use it

 

In C, a plain char is one byte (8 bits on virtually every platform), which is too small to hold characters outside ASCII, such as the em dash (—) or emoji, in a single unit. To handle them, you use either wide characters (wchar_t) or multibyte strings (typically UTF-8).

Here is how you implement them.


1. The wchar_t Approach (Fixed Width)

If you want to treat the em dash as a single "unit" in C, you use the wchar_t type and the L prefix for literals.

C
 
#include <stdio.h>
#include <wchar.h>
#include <locale.h>

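int main(void) {
    // 1. Critical: set the locale from the environment (usually a UTF-8 locale)
    setlocale(LC_ALL, "");

    // 2. Define a wide character using the L prefix
    //    (\u2014 is the em dash; the escape does not depend on the source file's encoding)
    wchar_t em_dash = L'\u2014';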

    // 3. Print using wprintf and the %lc (wide char) specifier
    wprintf(L"The wide dash looks like this: %lc\n", em_dash);

    return 0;
}

2. The UTF-8 Approach (Modern Standard)

In modern development, we usually keep strings as standard char arrays but encode them as UTF-8. In this case, the em dash (U+2014) takes up 3 bytes: 0xE2, 0x80, 0x94.

C
 
#include <stdio.h>

int main(void) {
    // A standard string can hold UTF-8, but it treats the dash as 3 "chars"
    // (this assumes the source file itself is saved as UTF-8)
    const char *text = "Interrupting—like this.";

    printf("%s\n", text);
    return 0;
}

Key Differences to Remember

| Feature | char (UTF-8) | wchar_t |
|---|---|---|
| Size | 1 byte | 2 or 4 bytes (platform dependent) |
| Storage | Uses 3 bytes for an em dash | Uses 1 "slot" for an em dash |
| Best for | Saving to files, web/network APIs | Internal text processing/indexing |
| Print function | printf("%s") | wprintf(L"%ls") |

Why setlocale is Mandatory

By default, a C program starts in the "C" locale, which only guarantees basic ASCII. If you don't call setlocale(LC_ALL, "");, wprintf cannot convert the wide characters in memory into the byte encoding your terminal expects, which usually results in a ? or nothing at all appearing on screen. Plain printf("%s") on a UTF-8 char string does not need the call, because the bytes are passed through unchanged.



 

 

posted @ 2026-01-29 03:05  profesor