
towupper() in C explained: converting a lowercase letter to uppercase (wide characters)

Date: 12-25

wint_t towupper (wint_t wc);

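The towupper() function converts a lowercase letter to its uppercase form (for wide characters).

The conversion takes place only when the argument wc is a lowercase letter and a corresponding uppercase letter exists.

towupper() is the wide-character version of toupper().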

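Parameters

wc
The wide character to convert. It can be a valid wide character (converted to type wint_t) or WEOF (a value that does not correspond to any wide character).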

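Return value

If the conversion succeeds, the uppercase letter corresponding to wc is returned; if it fails, wc is returned unchanged.

Note that the return value has type wint_t, so you may need to convert it to wchar_t, implicitly or explicitly.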

Example

Convert the lowercase letters in a wide string to uppercase.

#include <stdio.h>
#include <wchar.h>
#include <wctype.h>
int main ()
{
    int i=0;
    wchar_t str[] = L"c c++ java python golang\n";
    wchar_t c;
    while (str[i])
    {
        c = str[i];
        putwchar (towupper(c));
        i++;
    }
    return 0;
}

Output:

C C++ JAVA PYTHON GOLANG

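About uppercase and lowercase letters

People often assume that only "ABC...XYZ" are uppercase letters and only "abc...xyz" are lowercase letters, but that is not accurate. The set of uppercase and lowercase letters is not fixed: different languages and cultures have different ones. In a Simplified Chinese environment, for example, the Cyrillic letters бгё and the Greek letters σωδψφ (Greek letters are common in mathematical and physical formulas) are also lowercase letters, and their uppercase counterparts are БГЁ and ΣΩΔΨΦ.

We can change the program's locale with the setlocale() function so that the program uses a different character set and thereby supports a different language and culture.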

In the default locale ("C"), C typically uses ASCII encoding, which handles English well. The lowercase letters are then:

a b c d e f g h i j k l m n o p q r s t u v w x y z

The uppercase letters are:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

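Other locales may use more complex encodings such as GBK (Simplified Chinese), BIG5 (Traditional Chinese), Shift-JIS (Japanese), or Unicode, which contain more uppercase and lowercase letters.

In other words, whether a wide character is an uppercase or a lowercase letter depends on the program's locale; different locales contain different sets of uppercase and lowercase letters.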

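Must every letter have a case?

We usually assume that a letter is either uppercase or lowercase, and that every uppercase letter has exactly one lowercase counterpart and vice versa. That holds in the default "C" locale, but not necessarily in other locales.

Take the Simplified Chinese environment on Windows as an example: the pinyin letters āōūǖ are lowercase letters, yet they have no corresponding uppercase letters. The Simplified Chinese environment on Windows uses GBK encoding, which does not contain the uppercase forms ĀŌŪǕ.

The Roman numerals ⅲⅵⅶⅸ and ⅢⅥⅦⅨ are also treated as letters, and visually they look like lowercase and uppercase pairs. They are not: on Windows, ⅲⅵⅶⅸ and ⅢⅥⅦⅨ are letters without any case. They are neither uppercase nor lowercase, only alphabetic.

The above applies to Windows only. With Simplified Chinese on Linux and Mac OS the situation differs:

āōūǖ do have the corresponding uppercase letters ĀŌŪǕ, because Simplified Chinese on Linux and Mac OS uses the Unicode character set (strictly speaking, the UTF-8 encoding), which covers the characters of virtually all writing systems in use.
On Mac OS, ⅲⅵⅶⅸ and ⅢⅥⅦⅨ are not treated as letters at all; on Linux, ⅲⅵⅶⅸ are treated as lowercase letters and ⅢⅥⅦⅨ as uppercase letters.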

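A more precise look

According to the C standard, in the default "C" locale only characters for which iswlower() or iswupper() returns true are treated as letters; in other words, every letter is either lowercase or uppercase.

For other locales the standard makes no such requirement. A letter can be a character for which iswlower() or iswupper() returns true, or a character that the current locale explicitly designates as alphabetic, such as the Roman numerals ⅲⅵⅶⅸ and ⅢⅥⅦⅨ. There is one constraint: a character designated as alphabetic must not be one for which iswcntrl(), iswdigit(), iswpunct(), or iswspace() returns true.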

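towupper() converts only when wc is a lowercase letter that has at least one corresponding uppercase letter in the current locale. If several corresponding uppercase letters exist, it returns one of them, and always the same one for a given locale. If none exists, the conversion fails and wc is returned unchanged. The mapping is always from one wide character to one wide character.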

[Example] Detect uppercase and lowercase letters in a Simplified Chinese environment and convert them.

#include <wctype.h>
#include <wchar.h>
#include <locale.h>
int main ()
{
    int i = 0;
    wchar_t str[] = L"σωδБГЁāōūⅢⅥⅨⅲⅵⅸ";
    wchar_t c;
   
    setlocale(LC_ALL, "zh_CN.UTF-8");  // Simplified Chinese with UTF-8 encoding
    // On Windows you can write setlocale(LC_ALL, ""); or setlocale(LC_ALL, "chs");
    // On Linux you can write setlocale(LC_ALL, ""); or setlocale(LC_ALL, "zh_CN.UTF-8");
    // On Mac OS you can write setlocale(LC_ALL, "zh_CN"); or setlocale(LC_ALL, "zh_CN.UTF-8");
   
    while (str[i])
    {
        c = str[i];
        if (iswupper(c)) wprintf(L"%lc is upper, the lower is %lc\n", c, towlower(c));
        else if(iswlower(c)) wprintf(L"%lc is lower, the upper is %lc\n", c, towupper(c));
        else if(iswalpha(c)) wprintf(L"%lc is alphabetic\n", c);
        else wprintf(L"%lc is a character\n", c);
        i++;
    }
    return 0;
}

Output on Windows:

σ is lower, the upper is Σ
ω is lower, the upper is Ω
δ is lower, the upper is Δ
Б is upper, the lower is б
Г is upper, the lower is г
Ё is upper, the lower is ё
ā is lower, the upper is ā
ō is lower, the upper is ō
ū is lower, the upper is ū
Ⅲ is alphabetic
Ⅵ is alphabetic
Ⅸ is alphabetic
ⅲ is alphabetic
ⅵ is alphabetic
ⅸ is alphabetic

Output on Linux:

σ is lower, the upper is Σ
ω is lower, the upper is Ω
δ is lower, the upper is Δ
Б is upper, the lower is б
Г is upper, the lower is г
Ё is upper, the lower is ё
ā is lower, the upper is Ā
ō is lower, the upper is Ō
ū is lower, the upper is Ū
Ⅲ is upper, the lower is ⅲ
Ⅵ is upper, the lower is ⅵ
Ⅸ is upper, the lower is ⅸ
ⅲ is lower, the upper is Ⅲ
ⅵ is lower, the upper is Ⅵ
ⅸ is lower, the upper is Ⅸ

Output on Mac OS:

σ is lower, the upper is Σ
ω is lower, the upper is Ω
δ is lower, the upper is Δ
Б is upper, the lower is б
Г is upper, the lower is г
Ё is upper, the lower is ё
ā is lower, the upper is Ā
ō is lower, the upper is Ō
ū is lower, the upper is Ū
Ⅲ is a character
Ⅵ is a character
Ⅸ is a character
ⅲ is a character
ⅵ is a character
ⅸ is a character
