# Character Variables
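A `char` variable is 1 byte, which is 8 bits on virtually all current systems. It represents a single character.
```c++
char letter;
```
Whether a plain `char` is signed or unsigned is implementation-defined. C++ also provides the explicit `signed char` and `unsigned char` types, but they are not as commonly used. Since a `signed char` is effectively an 8-bit integer, it can be used in arithmetic expressions, and some programmers who must write for devices with limited memory use it to save space. A `signed char` is also promoted to `int` consistently, whereas the result of promoting a plain `char` depends on its signedness.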
Another use of `char` is to act as a byte type, since C and older C++ have no dedicated byte type. Newer C++ standards (from C++17), and compilers that support them, offer the type `std::byte` in the `<cstddef>` header. It is defined in terms of `unsigned char` and is a scoped enumeration (`enum class`), not an arithmetic type: it supports bitwise operations but not arithmetic.

Byte types or their equivalents offer direct access to memory, which is organized by bytes.
## C-Style Strings
C-style strings are arrays of individual characters. They must be declared with a fixed size, or allocated dynamically.
```c++
char cstr[8]="Hello";
```
The compiler automatically terminates each quoted string with an invisible “null” character (`'\0'`). If the initial value does not fill the array, the remaining elements are set to null characters, not blanks. The value, including its terminator, may not exceed the size of the array. The size must account for the null terminating character, so
```c++
char greeting[5]="Hello";
```
will result in an error, but
```c++
char greeting[6]="Hello";
```
works.
A C-style string can be set from a quoted string only in its declaration. Assigning to the array afterwards, as in
```c++
char greeting[6];
greeting="Hello";
```
is invalid.
### C Style Character Operations
C++ supports C-style string functions. Include the `<cstring>` header.
| Function | Operation | Usage |
|--------------|-------------------|-------------|
| strcpy | copy str2 to str1 | strcpy(str1,str2) |
| strcat | concatenate str2 to str1| strcat(str1,str2) |
| strcmp | compare two strings | strcmp(str1,str2) |
| strlen | length of string (excludes null) | strlen(str) |
Individual characters may be addressed using bracket notation. Each character is one item, and the count begins from zero and goes to `strlen(str)-1`.
```c++
char greeting[8]="Hello";
std::cout<<greeting[0]<<"\n";   // first character
std::cout<<greeting[2]<<"\n";   // third character
```
Example:
```c++
#include <iostream>
#include <cstring>

int main() {
    char greeting[6]="Hello";
    char musical_instr[6]="Cello";
    char two_strings[13]="";        // empty string, room for 12 characters
    // Compare, then build up two_strings from the other two.
    std::cout<<strcmp(greeting,musical_instr)<<"\n";
    std::cout<<strcat(two_strings,greeting)<<"\n";
    std::cout<<strcat(two_strings,musical_instr)<<"\n";
    std::cout<<strlen(greeting)<<"\n";
    std::cout<<strcat(greeting,musical_instr)<<"\n";
    std::cout<<greeting<<"\n";
    std::cout<<strlen(greeting)<<"\n";
    char str[6];
    strcpy(str,greeting);
    std::cout<<str<<"\n";
}
```
In the above code, pay attention to the lines
```c++
std::cout<<strlen(greeting)<<"\n";
std::cout<<strcat(greeting,musical_instr)<<"\n";
std::cout<<greeting<<"\n";
std::cout<<strlen(greeting)<<"\n";
char str[6];
strcpy(str,greeting);
std::cout<<str<<"\n";
```
What result did this code yield? On a Linux system with g++ the output was
```
5
HelloCello
HelloCello
10
HelloCello
```
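The length of `greeting` was doubled (not counting the null terminator) even though the array was declared with size 6. The compiler did not check for this. The `strcpy` function then copied all 11 bytes (10 characters plus the terminator) into another array of size 6.
The result is a buffer overflow, which is undefined behavior: the program may appear to work, as it did here, or it may corrupt other data or crash. To see what might happen, run the following code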
```c++
#include <iostream>
#include <cstring>

int main() {
    char greeting[6]="Hello";
    char musical_instr[6]="Cello";
    char str[5];
    int year=2021;
    std::cout<<"Initial value of year: "<<year<<"\n";
    // Overflows greeting: 10 characters plus the null do not fit in 6 bytes.
    strcat(greeting,musical_instr);
    // Overflows str as well: it has room for only 5 bytes.
    strcpy(str,greeting);
    std::cout<<"What happened to year? "<<year<<"\n";
}
```
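On the same Linux system the result was
```
Initial value of year: 2021
What happened to year? 1869376613
```
This occurred because `year` was declared right after `str`. `str` was only allocated 5 bytes of memory. The C++ standard does not specify how local variables are laid out, but many compilers place variables that are declared together next to each other in memory, and in this example the neighbor was `year`. Storing an excessively long string into `str` caused it to overflow in memory and overwrite the value of `year`. The new value is not random: 1869376613 is 0x6F6C6C65, the character codes of "ello" stored in little-endian byte order.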
If using C-style strings and functions, guard against this by using

| Function | Operation | Usage |
|---|---|---|
| strncpy | copy at most n bytes of str2 to str1 | strncpy(str1,str2,n) |
| strncat | append at most n bytes of str2 to str1 | strncat(str1,str2,n) |

One way to ensure that `n` is correct is to use `sizeof`, which returns a value in bytes. This works only when `str1` is an actual array in scope, not a pointer.
```c++
strncpy(str1,str2,sizeof(str1)-1);
str1[sizeof(str1)-1]='\0';
```
We must explicitly add the null character at the end of the target of the copy. When `str2` has `n` or more characters, `strncpy` does not write a terminator, and later reads of `str1` would run past the end of the buffer.
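```c++
#include <iostream>
#include <cstring>

int main() {
    char greeting[6]="Hello";
    char musical_instr[6]="Cello";
    char str[5];
    int year=2021;
    std::cout<<"Initial value of year: "<<year<<"\n";
    // Append only as many characters as greeting still has room for (here, none).
    strncat(greeting,musical_instr,sizeof(greeting)-strlen(greeting)-1);
    // Copy at most sizeof(str)-1 characters, then terminate explicitly.
    strncpy(str,greeting,sizeof(str)-1);
    str[sizeof(str)-1]='\0';
    std::cout<<"What happened to year? "<<year<<"\n";
}
```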
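The `strncat` function is more difficult to use correctly, since it appends up to `n` bytes from `str2`, plus a terminating null, regardless of the size of `str1`. The right value for `n` is therefore the space still free in `str1`, not its total size.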
Since we are programming in C++, not C, for most purposes it is better to use C++ strings (see here), which do not have these disadvantages.