Character Variables

A char variable occupies 1 byte, which is 8 bits on virtually all current platforms. It represents a single character.

char letter;

Whether a plain char is signed or unsigned is implementation-defined. C++ also provides the distinct types signed char and unsigned char for cases where the signedness matters, though they are less commonly used. Since a signed char is effectively an 8-bit integer, it can be used in arithmetic expressions, and some programmers who must write for devices with limited memory use it to save space. When a char of any kind takes part in arithmetic, it is first promoted to int.

Another use of char is to act as a byte type, since C and older C++ do not have one. Newer C++ standards (from C++17) and compilers that support them offer the type std::byte, but it is a scoped enumeration (enum class) whose underlying type is unsigned char, not a primitive type, and it supports only bitwise operations. Byte types or their equivalents offer direct access to memory, which is organized by bytes.

C-Style Strings

C-style strings are actually arrays of individual characters. They must be declared with a fixed size, or allocated dynamically.

   char cstr[8]="Hello";

The compiler automatically terminates each quoted string with an invisible “null” character ('\0'). If the initial value does not fill the array, the remaining elements are set to the null character, not to blanks. The value may not exceed the size of the array, and the size must account for the null terminating character, so

   char greeting[5]="Hello";

will result in an error, but

   char greeting[6]="Hello";

works.

A C-style string may be set from a quoted string only when it is declared. Arrays cannot be assigned to afterward, so

   char greeting[6];
   greeting="Hello";

is invalid.

C-Style Character Operations

C++ supports C-style string functions. Include the <cstring> header.

Function | Operation | Usage
-------- | --------- | -----
strcpy | copy str2 to str1 | strcpy(str1,str2)
strcat | concatenate str2 to str1 | strcat(str1,str2)
strcmp | compare two strings | strcmp(str1,str2)
strlen | length of string (excludes null) | strlen(str)

Individual characters may be addressed using bracket notation. Each character is one element; indices begin at zero and run to strlen(str)-1, with the null terminator at index strlen(str).

   char greeting[8]="Hello";
   std::cout<<greeting[0]<<"\n";
   std::cout<<greeting[2]<<"\n";

Example:

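#include <iostream>
#include <cstring>

int main() {
    char greeting[6]="Hello";
    char musical_instr[6]="Cello";
    char two_strings[13]="";

    // Positive, zero, or negative depending on how the strings compare
    std::cout<<strcmp(greeting,musical_instr)<<"\n";
    // two_strings has room for both words plus the null terminator
    std::cout<<strcat(two_strings,greeting)<<"\n";
    std::cout<<strcat(two_strings,musical_instr)<<"\n";
    std::cout<<strlen(greeting)<<"\n";
    // Deliberate error: greeting holds only 6 bytes, so this writes past its end
    std::cout<<strcat(greeting,musical_instr)<<"\n";
    std::cout<<greeting<<"\n";
    std::cout<<strlen(greeting)<<"\n";

    // Deliberate error: str is too small for the 11 bytes now in greeting
    char str[6];
    strcpy(str,greeting);
    std::cout<<str<<"\n";

    return 0;
}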

In the above code, pay attention to the lines

    std::cout<<strlen(greeting)<<"\n";
    std::cout<<strcat(greeting,musical_instr)<<"\n";
    std::cout<<greeting<<"\n";
    std::cout<<strlen(greeting)<<"\n";

    char str[6];
    strcpy(str,greeting);
    std::cout<<str<<"\n";

What result did this code yield? On a Linux system with g++ the output was

5
HelloCello
HelloCello
10
HelloCello

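The string stored in greeting doubled in length (not counting the null terminator) even though the array was declared with size 6: strcat wrote past the end of the array. The compiler did not check for this. The strcpy function then copied all 11 bytes into another array of size 6.
The result is a buffer overflow. Formally this is undefined behavior, so the output may differ with other compilers or settings.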

To see what can happen, compile and run the following code

#include <iostream>
#include <cstring>

int main() {
    char user[6];
    char password[9];   // 8 characters plus the null terminator

    std::cout<<"Enter your user id: ";
    // Deliberately unsafe: nothing limits the input to the size of user
    std::cin>>user;
    std::cout<<"Enter your password: ";
    std::cin>>password;

    if (std::strcmp(password,"Eleventy")==0) {
        std::cout<<"You have logged in\n";
    }
    else {
        std::cout<<"Incorrect password\n";
    }
}

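Type in a short username (any string), then type Eleventy as your password. It should work as expected. Now try typing a username that is longer than 10 characters and see what happens. The outcome depends on how the compiler lays out the variables in memory. Compile with a standard older than C++20 (for example, -std=c++17 with g++); from C++20 on, the >> operator limits the input to the size of the array.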

If using C-style strings and functions, guard against this by using

Function | Operation | Usage
-------- | --------- | -----
strncpy | copy str2 to str1, max n bytes of str2 | strncpy(str1,str2,n)
strncat | concatenate str2 to str1, max n bytes of str2 | strncat(str1,str2,n)

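One way to ensure that n is correct is to use sizeof, which returns a value in bytes. This works only when str1 is a true array, not a pointer.

strncpy(str1,str2,sizeof(str1)-1);
str1[sizeof(str1)-1]='\0';

We must explicitly add the null character to the end of the target, because strncpy does not write one when str2 has n or more characters. The result would be an unterminated string, and later reads of it would run past the end of the buffer.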

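The strncat function is more difficult to use correctly, since it appends up to n characters from str2, plus a null terminator, regardless of the size of str1. The value of n must therefore be the space remaining in str1, that is, sizeof(str1)-strlen(str1)-1.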

In general, it is best to avoid fixed-size char arrays as much as possible, because C++ (and C) does not check C-style array bounds. Similar problems can occur with numerical arrays, but in those cases the result is typically a segmentation fault. Buffer overflows in character arrays can result in insecure programs.

Since we are programming in C++, not C, for most purposes it is better to use C++ strings (std::string), which do not have these disadvantages.
