# Character Variables

A char variable is 1 byte (8 bits). It represents a single character.

    char letter;

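Whether a plain char is signed or unsigned is implementation-defined; g++ on x86 Linux, for example, treats it as signed. C++ also provides the distinct types signed char and unsigned char for cases where the signedness matters, although they are not as commonly used. Since a char is effectively a small (typically 8-bit) integer, it can be used in arithmetic expressions, and some programmers who must write for devices with limited memory use it to save space. In such expressions a char is promoted to int, and a signed char is promoted the same way on every platform.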
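
Because a char holds a small integer, arithmetic works directly on the character codes. The following is only a sketch: the helper names `next_letter`, `to_code`, and `plain_char_is_signed` are made up for this illustration, and an ASCII-compatible character set is assumed.

```c++
#include <limits>

// Hypothetical helper: returns the letter that follows c.
// Assumes an ASCII-compatible character set and that c is not 'z' or 'Z'.
char next_letter(char c) {
    // c + 1 is computed as an int (integral promotion), so convert back.
    return static_cast<char>(c + 1);
}

// Hypothetical helper: returns the integer code stored in c.
// The same cast is needed when printing, since std::cout << c prints a glyph.
int to_code(char c) {
    return static_cast<int>(c);
}

// Reports whether plain char is a signed type on this platform.
bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}
```

On a system that uses ASCII, `next_letter('a')` returns `'b'` and `to_code('A')` returns 65. The result of `plain_char_is_signed()` depends on the compiler and target.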

Another use of char is to act as a byte type, since C and older C++ do not provide one. Newer C++ standards (from C++17) and compilers that support them offer the type std::byte in the `<cstddef>` header, but it is defined as a scoped enumeration whose underlying type is unsigned char, not as a primitive type. Byte types or their equivalents offer direct access to memory, which is organized by bytes.
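
To illustrate what direct access to memory means, the sketch below looks at the bytes of an int through an unsigned char pointer, which the language permits for any object. The function name `bytes_as_hex` is hypothetical, 8-bit bytes are assumed, and the order of the bytes in the result depends on the endianness of the machine.

```c++
#include <cstddef>
#include <string>

// Hypothetical helper: returns the bytes that make up value, in memory
// order, as a hexadecimal string with two digits per byte.
std::string bytes_as_hex(int value) {
    const char digits[] = "0123456789abcdef";
    // Any object may be inspected through an unsigned char pointer.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    std::string out;
    for (std::size_t i = 0; i < sizeof(value); ++i) {
        out += digits[p[i] >> 4];    // high four bits of the byte
        out += digits[p[i] & 0x0F];  // low four bits of the byte
    }
    return out;
}
```

On a little-endian machine with a 4-byte int, `bytes_as_hex(1)` returns `"01000000"`: the least significant byte is stored first.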

## C-Style Strings

C-style strings are actually arrays of individual characters. They must be declared with a fixed size, or allocated.

    char cstr[8]="Hello";

The compiler automatically terminates each quoted string with an invisible "null" character ('\0'). If the initial value does not fill the array, the remaining elements are set to null characters, not blanks. The value may not exceed the size of the array. The size must account for the terminating null character, so

    char greeting[5]="Hello";

will result in a compile-time error in C++, but

    char greeting[6]="Hello";

works.
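
The relationship between the declared size, the visible characters, and the terminator can be checked directly. In this sketch the function names are hypothetical; each one declares the same 8-element array and reports one property of it.

```c++
#include <cstddef>
#include <cstring>

// Hypothetical helper: number of bytes reserved for the array.
std::size_t storage_bytes() {
    char greeting[8] = "Hello";
    return sizeof(greeting);        // the declared size, in bytes
}

// Hypothetical helper: number of characters before the first null.
std::size_t visible_chars() {
    char greeting[8] = "Hello";
    return std::strlen(greeting);   // does not count the terminator
}

// Hypothetical helper: number of elements that hold the null character.
std::size_t null_elements() {
    char greeting[8] = "Hello";
    std::size_t nulls = 0;
    for (std::size_t i = 0; i < sizeof(greeting); ++i) {
        if (greeting[i] == '\0') ++nulls;
    }
    return nulls;                   // terminator plus the unused elements
}
```

Here `storage_bytes()` returns 8, `visible_chars()` returns 5, and `null_elements()` returns 3: the five letters are followed by the terminator and two further null characters.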

A C-style string may only be initialized to a quoted string when it is declared. An assignment after the declaration, such as

```c++
char greeting[6];
greeting="Hello";
```

is invalid.

### C-Style Character Operations

C++ supports the C-style string functions. To use them, include the `<cstring>` header.

| Function | Operation | Usage |
|----------|-----------|-------|
| strcpy | copy str2 to str1 | strcpy(str1,str2) |
| strcat | append str2 to the end of str1 | strcat(str1,str2) |
| strcmp | compare two strings; returns 0 if they are equal | strcmp(str1,str2) |
| strlen | length of string (excludes null) | strlen(str) |
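
The return value of strcmp is an integer rather than a boolean: it is zero when the strings are equal, negative when str1 sorts before str2, and positive otherwise. A minimal sketch, using the hypothetical helper names `same_text` and `sorts_before`:

```c++
#include <cstring>

// Hypothetical helper: true when the two strings hold the same characters.
bool same_text(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;  // 0 means "equal", which is easy to misread
}

// Hypothetical helper: true when a comes before b in character-code order.
bool sorts_before(const char* a, const char* b) {
    return std::strcmp(a, b) < 0;
}
```

Only the sign of a nonzero result is specified; its magnitude may differ between implementations, so code should compare the result with zero rather than with a particular value.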

Individual characters may be addressed using bracket notation. Each character is one element; indices begin at zero and run to strlen(str)-1.

```c++
char greeting[8]="Hello";
std::cout<<greeting[0]<<"\n";   // prints H
std::cout<<greeting[2]<<"\n";   // prints l
```

Example:

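    #include <iostream>
    #include <cstring>

    int main() {
        char greeting[6]="Hello";
        char musical_instr[6]="Cello";
        char two_strings[13]="";

        // Positive here because 'H' sorts after 'C'; only the sign is specified.
        std::cout<<strcmp(greeting,musical_instr)<<"\n";
        // two_strings has room for both words plus the terminator.
        std::cout<<strcat(two_strings,greeting)<<"\n";
        std::cout<<strcat(two_strings,musical_instr)<<"\n";
        std::cout<<strlen(greeting)<<"\n";
        // Deliberate bug: greeting has room for only 6 bytes, so this
        // writes past the end of the array (undefined behavior).
        std::cout<<strcat(greeting,musical_instr)<<"\n";
        std::cout<<greeting<<"\n";
        std::cout<<strlen(greeting)<<"\n";

        // Deliberate bug: str is too small for the 11 bytes copied into it.
        char str[6];
        strcpy(str,greeting);
        std::cout<<str<<"\n";

    }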

In the above code, pay attention to the lines

    std::cout<<strlen(greeting)<<"\n";
    std::cout<<strcat(greeting,musical_instr)<<"\n";
    std::cout<<greeting<<"\n";
    std::cout<<strlen(greeting)<<"\n";

    char str[6];
    strcpy(str,greeting);
    std::cout<<str<<"\n";

What result did this code yield? On a Linux system with g++ the output was

    5
    HelloCello
    HelloCello
    10
    HelloCello

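The string stored in greeting grew from 5 to 10 characters (not counting the null terminator) even though greeting was declared with size 6. Neither the compiler nor strcat checked for this. The strcpy function then copied the result into str, which also has only 6 bytes. Both operations are buffer overflows, and their effect is undefined, so the program only appears to work. To see what might happen, run the following code: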

    #include <iostream>
    #include <cstring>

    int main() {
        char greeting[6]="Hello";
        char musical_instr[6]="Cello";
        char str[5];
        int  year=2021;

        std::cout<<"Initial value of year: "<<year<<"\n";
        // Both calls write past the end of their destination arrays.
        strcat(greeting,musical_instr);
        strcpy(str,greeting);
        std::cout<<"What happened to year? "<<year<<"\n";

    }

On the same Linux system the result was

    Initial value of year: 2021
    What happened to year? 1869376613

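This occurred because the overflowing writes ran into the memory that holds year. The array str was allocated only 5 bytes and greeting only 6, yet 11 bytes were written to each. The layout of local variables is up to the compiler, but they are commonly placed next to one another on the stack, so storing an excessively long string into a neighboring array overflowed it and wiped out the value of year. Because this is undefined behavior, other compilers, optimization levels, or systems may give a different result, including a crash.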
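
The value that replaced 2021 is not arbitrary. As a quick check, the sketch below splits an integer into its four bytes, lowest byte first, and interprets each one as a character. The function name `low_bytes_as_text` is hypothetical, and a 32-bit value with 8-bit bytes is assumed.

```c++
#include <cstdint>
#include <string>

// Hypothetical helper: interprets the four bytes of value, least
// significant byte first, as characters.
std::string low_bytes_as_text(std::uint32_t value) {
    std::string text;
    for (int i = 0; i < 4; ++i) {
        // Shift the i-th byte into the low position and mask off the rest.
        text += static_cast<char>((value >> (8 * i)) & 0xFFu);
    }
    return text;
}
```

Calling `low_bytes_as_text(1869376613)` returns `"ello"`, since 1869376613 is 0x6F6C6C65 and those bytes are the ASCII codes for the tail of "Cello". On a little-endian machine that is the order in which the bytes sit in memory, which shows that string data was written over year.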

If using C-style strings and functions, guard against this by using

| Function | Operation | Usage |
|----------|-----------|-------|
| strncpy | copy str2 to str1, max n bytes of str2 | strncpy(str1,str2,n) |
| strncat | concatenate str2 to str1, max n bytes of str2 | strncat(str1,str2,n) |

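One way to ensure that n is correct is to use sizeof, which returns the size of an array in bytes. This works only when str1 is a true array; applied to a pointer, sizeof returns the size of the pointer.

    strncpy(str1,str2,sizeof(str1)-1);
    str1[sizeof(str1)-1]='\0';

We must explicitly add the null character to the end of the target because strncpy does not write one when str2 holds n or more characters. Without a terminator, any later operation that reads str1 will run past the end of the buffer.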

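With the bounded functions, the earlier example no longer writes outside its arrays and leaves year unchanged:

    #include <iostream>
    #include <cstring>

    int main() {
        char greeting[6]="Hello";
        char musical_instr[6]="Cello";
        char str[5];
        int  year=2021;

        std::cout<<"Initial value of year: "<<year<<"\n";
        // greeting is already full, so the bound is 0 and nothing is appended.
        strncat(greeting,musical_instr,sizeof(greeting)-strlen(greeting)-1);
        // Copy at most 4 characters, then terminate explicitly.
        strncpy(str,greeting,sizeof(str)-1);
        str[sizeof(str)-1]='\0';
        std::cout<<"What happened to year? "<<year<<"\n";

    }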

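The strncat function is more difficult to use correctly, since n limits only the number of characters taken from str2 and takes no account of the size of str1 or of the characters already stored in it. It also always writes a terminating null after the appended characters, so n must be at most sizeof(str1)-strlen(str1)-1.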
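
One way to keep the strncat bound in a single place is a small wrapper. The function `append_bounded` below is a hypothetical helper, not part of `<cstring>`; it assumes that dest already holds a null-terminated string and that dest_size is the full size of the destination array.

```c++
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends src to the string already in dest without
// writing past dest_size bytes. Characters that do not fit are dropped.
void append_bounded(char* dest, std::size_t dest_size, const char* src) {
    std::size_t used = std::strlen(dest);
    if (used + 1 >= dest_size) {
        return;  // no room left for even one more character
    }
    // Space for new characters, keeping one byte for the terminator
    // that strncat always writes.
    std::size_t room = dest_size - used - 1;
    std::strncat(dest, src, room);
}
```

With `char buf[9]="Hello";`, the call `append_bounded(buf,sizeof(buf),"Cello")` leaves `"HelloCel"` in buf: the result is truncated rather than overflowing. Silent truncation is a design choice of this sketch; a real program might prefer to report the error.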

Since we are programming in C++, not C, for most purposes it is better to use C++ strings (std::string), which do not have these disadvantages.
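
For comparison, here is the concatenation from the examples above written with std::string. The function name `join` is hypothetical and only wraps the built-in `+` operator.

```c++
#include <string>

// Hypothetical helper: joins two strings. std::string allocates whatever
// storage the result needs, so there is no fixed-size buffer to overflow.
std::string join(const std::string& first, const std::string& second) {
    return first + second;
}
```

Calling `join("Hello","Cello")` returns a string holding `"HelloCello"` whose `size()` is 10, with no array size to declare or check.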
