C character classification is an operation provided by a group of functions in the ANSI C Standard Library for the C programming language. These functions are used to test characters for membership in a particular class of characters, such as alphabetic characters, control characters, etc. Both single-byte and wide characters are supported.


History

Early C-language programmers working on the Unix operating system developed programming idioms for classifying characters into different types. For example, for the ASCII character set, the following expression identifies a letter when its value is ''true'':

    ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z')

As this may be expressed in multiple formulations, it became desirable to introduce short, standardized forms of such tests, which were placed in the system-wide header file ''ctype.h''.


Implementation

Unlike the above example, the character classification routines are not written as comparison tests. In most C libraries, they are implemented as static table lookups rather than as comparison-based macros or functions. For example, an array of 256 eight-bit integers is created in which each bit of an entry corresponds to a particular property of the character, e.g., isdigit, isalpha. If the lowest-order bit of each integer corresponds to the isdigit property, the code could be written as

    #define isdigit(x) (TABLE[x] & 1)

Early versions of Linux used a potentially faulty method similar to the first code sample:

    #define isdigit(x) ((x) >= '0' && (x) <= '9')

This can cause problems if the expression substituted for ''x'' has a side effect, for example in a call such as ''isdigit(x++)'' or ''isdigit(run_some_program())'', because it is not immediately evident that the argument to ''isdigit'' is evaluated twice when the macro expands. For this reason, the table-based approach is generally used.
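
The contrast between the two approaches described above can be illustrated with a minimal sketch. It is not taken from any real C library: the table name ''my_ctype_table'', the bit names ''MY_DIGIT'' and ''MY_SPACE'', and all helper functions are hypothetical, and only two properties are filled in. The sketch also assumes that the argument is a plain character value, so it does not handle EOF the way the standard routines do.

```c
#include <limits.h>

/* Hypothetical property bits; real libraries choose their own layout. */
enum {
    MY_DIGIT = 1 << 0, /* lowest-order bit: decimal digit */
    MY_SPACE = 1 << 1  /* next bit: white-space character */
};

/* One entry per unsigned char value; entries not listed default to 0. */
static const unsigned char my_ctype_table[UCHAR_MAX + 1] = {
    ['0'] = MY_DIGIT, ['1'] = MY_DIGIT, ['2'] = MY_DIGIT, ['3'] = MY_DIGIT,
    ['4'] = MY_DIGIT, ['5'] = MY_DIGIT, ['6'] = MY_DIGIT, ['7'] = MY_DIGIT,
    ['8'] = MY_DIGIT, ['9'] = MY_DIGIT,
    [' '] = MY_SPACE, ['\t'] = MY_SPACE, ['\n'] = MY_SPACE,
    ['\v'] = MY_SPACE, ['\f'] = MY_SPACE, ['\r'] = MY_SPACE
};

/* Table-based tests: the argument appears once, so it is evaluated once. */
#define my_isdigit(c) (my_ctype_table[(unsigned char)(c)] & MY_DIGIT)
#define my_isspace(c) (my_ctype_table[(unsigned char)(c)] & MY_SPACE)

/* Comparison-based test: the argument appears twice in the expansion. */
#define cmp_isdigit(x) ((x) >= '0' && (x) <= '9')

/* Helper with a visible side effect: counts how often it is called. */
static int eval_count = 0;
static int next_char(void)
{
    eval_count++;
    return '7';
}

/* Number of times the argument is evaluated by the comparison macro. */
int evals_with_comparison(void)
{
    eval_count = 0;
    (void)cmp_isdigit(next_char());
    return eval_count;
}

/* Number of times the argument is evaluated by the table macro. */
int evals_with_table(void)
{
    eval_count = 0;
    (void)my_isdigit(next_char());
    return eval_count;
}
```

In this sketch, ''evals_with_comparison()'' reports two evaluations for the digit '7' because both comparisons run, whereas ''evals_with_table()'' reports one. The designated initializers require C99 or later.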


Overview of functions

The functions that operate on single-byte characters are defined in the ''ctype.h'' header file (''cctype'' in C++). The functions that operate on wide characters are defined in the ''wctype.h'' header file (''cwctype'' in C++). The classification is evaluated according to the locale in effect at the time of the call.
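
As a brief illustration of the two headers described above, the following sketch counts alphabetic characters in a narrow string and in a wide string. The function names ''count_alpha'' and ''count_walpha'' are hypothetical; ''isalpha'' and ''iswalpha'' are the standard routines. The cast to ''unsigned char'' is included because passing a negative ''char'' value other than EOF to the ''ctype.h'' functions is undefined behavior.

```c
#include <ctype.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Count alphabetic characters in a single-byte string (ctype.h). */
size_t count_alpha(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        /* Cast so that negative char values are not passed to isalpha. */
        if (isalpha((unsigned char)*s)) {
            n++;
        }
    }
    return n;
}

/* Count alphabetic characters in a wide string (wctype.h). */
size_t count_walpha(const wchar_t *s)
{
    size_t n = 0;
    for (; *s != L'\0'; s++) {
        if (iswalpha((wint_t)*s)) {
            n++;
        }
    }
    return n;
}
```

A program starts in the "C" locale, so results for characters outside the basic character set may change after a call such as ''setlocale(LC_CTYPE, "")'' from ''locale.h''.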


