All C-type strings are arrays of characters. C-type strings are not to be confused with the string datatype (std::string), which is completely different.
Now, because a c-type string (from here on in to be known as cstring) is an array, it must follow rules for arrays.
Declaring cstrings is no different than declaring arrays:
char ch[80];              // most common
char *ch=new char[80];    // reinventing the wheel
char *ch;                 // unsafe, see note 1
One may wonder what the number 80 represents. It is the size of our cstring, but why 80? What if we only want to store a short word? By all means choose a different number. It can be 10, 100, 3, or 1000; just make sure you have enough memory. The only restriction is that it has to be 2 or more if the cstring is to hold any text at all (more on this later). 80 is used because it is a kind of unwritten standard. With today's large memories, allocating some unneeded space isn't usually a biggie.
Now that we've actually declared our cstring, we want to put stuff into it. This is where some common mistakes often appear. Many assume that cstrings are just like integers and can be assigned with the = operator. While this is true for the string datatype, which overloads =, it is untrue for cstrings. Remember, cstrings are only arrays. Here is a common error:
char ch[80];
ch="Paul's a great guy!";   // this is incorrect
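Now is a good time to point out that ch acts as a pointer. Strictly speaking ch is an array, but whenever its name is used by itself it is converted to a pointer to the first character, which is why nothing can be assigned to it. This may seem unsafe and unnecessary at first, but it soon becomes evident that this allows great flexibility when specifying parts of a cstring (more on this later).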
So if ch is a pointer, how do we assign data to the cstring? Well, we have basically two options: put the data in during declaration; or use a function.
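The three ways discussed next can be sketched as follows. This is only an illustration: the function name three_ways_agree and the short text "Paul" are made up for the example, and the '\0' in the first declaration is the end-of-cstring marker explained further down.

```cpp
#include <cstring>   // strcpy, strcmp

// Hypothetical demo function: returns true when all three cstrings hold "Paul".
bool three_ways_agree() {
    // First way: list every character, just like filling an integer array.
    // The '\0' at the end marks where the text stops.
    char a[80] = {'P', 'a', 'u', 'l', '\0'};

    // Second way: a string literal. Only allowed in the declaration itself.
    char b[80] = "Paul";

    // Third way: copy the text in with strcpy() at any time after declaration.
    char c[80];
    std::strcpy(c, "Paul");

    // strcmp() returns 0 when two cstrings hold the same text.
    return std::strcmp(a, b) == 0 && std::strcmp(b, c) == 0;
}
```

Calling three_ways_agree() should return true, since all three arrays end up with the same four characters followed by the terminator.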
As you may notice, the first way is the way you would populate an integer array, only we're using characters instead. Typing out all those ' , } is rather tedious work, so instead we can use the second way. Note that you can only use this method when declaring a cstring, not after declaration.
Now the third way is what you'll probably be using the most. strcpy(char *dest, const char *src) is a function declared in the header string.h, or cstring if you prefer the C++ standard library headers. It copies src, including its terminator, into dest, so dest must be large enough to hold it. Be sure to include this header whenever you use the cstring functions.
Let's say we have a cstring that has a maximum length of 15. Now let's say you place the letters "1234567890" into our cstring. The data we placed inside is 10 characters long. That means that there are 5 empty characters left in the cstring. That is, to us they're empty; as far as the computer is concerned they may just as well be filled with garbage. The computer, however, cannot distinguish between empty cells and this garbage. That means that, with nothing marking the end, the computer would have to treat our cstring as 15 characters long. That isn't exactly right.
What we need is a way to tell the computer that it has reached the end of the cstring. Just like \n represents a line break and \t represents a tab, \0 represents the null character (a backslash followed by the numeral zero).
This character (even though it's two characters long when you type it out, it's still a single character as far as the computer is concerned) is placed at the end of every cstring to signal to the computer that the cstring ends here. The contents of our cstring with the added null character would be: 1 2 3 4 5 6 7 8 9 0 \0. As mentioned above, this is the reason that a cstring holding any text must be an array of at least 2 elements. One element is needed to hold the null character, and the second to contain a single character that makes up the cstring.
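To see the null character at work, here is a minimal sketch. The function names are invented for the illustration; strlen() itself is the standard library function covered later in this tutorial.

```cpp
#include <cstring>   // strlen

// Hypothetical demo: the length of the text stored in a 15-element array.
std::size_t text_length_demo() {
    char ch[15] = "1234567890";   // 10 characters, then '\0'
    return std::strlen(ch);       // counts up to, but not including, '\0'
}

// Hypothetical demo: writing '\0' earlier cuts the cstring short.
std::size_t truncated_length_demo() {
    char ch[15] = "1234567890";
    ch[4] = '\0';                 // ch now reads "1234"
    return std::strlen(ch);
}
```

The first function should report 10 even though the array has room for 15 characters, and the second should report 4, because everything after the first '\0' is ignored.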
Displaying a cstring needs no fancy code. By including <iostream>, you can display cstrings just like integers or doubles.
#include <iostream>
using namespace std;
char ch[80]="Paul's a great guy!";
cout << ch;   // "Paul's a great guy!" is output
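Having the user input cstrings is a little different. Using cin >> ch does not satisfy most needs, because it stops reading at the first space. The function gets(char *s) in the <stdio.h> library was traditionally used to solve this problem, but it has no way of knowing how big the cstring is, so a long input overruns the array; for that reason it was removed from the C11 and C++14 standards. Use cin.getline(ch, 80) instead, which populates the cstring with the whole line the user enters and never writes more than the 80 characters it is told about (terminator included).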
#include <iostream>
using namespace std;
char ch[80]="This text does not matter.";
cout << ch;            // "This text does not matter." is output
cin.getline(ch,80);    // used in place of the unsafe gets(ch); user inputs "This text DOES matter!"
cout << ch;            // "This text DOES matter!" is output
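The difference between reading with >> and reading a whole line can be checked without a keyboard. In this sketch a std::istringstream stands in for cin so that the input is always the same; the function name is made up, and the same two calls work on cin.

```cpp
#include <cstring>   // strcmp
#include <sstream>   // istringstream

// Hypothetical demo: feed identical input to operator>> and to getline().
bool getline_keeps_spaces() {
    char word[80];
    char line[80];

    std::istringstream in1("This text DOES matter!");
    in1 >> word;                      // stops at the first space

    std::istringstream in2("This text DOES matter!");
    in2.getline(line, sizeof line);   // reads the whole line, at most 79 characters

    return std::strcmp(word, "This") == 0 &&
           std::strcmp(line, "This text DOES matter!") == 0;
}
```

The second argument to getline() is the size of the array, which is what keeps a long input from running past the end of the cstring.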
The first trick I need to show you is very important, and you will find yourself using it often. Remember how we said that cstrings are arrays? When you use cstrings, you typically pass the very head of the cstring, that is, the main pointer. But if the head just has to be a pointer, an address, who says that you can't pass in the address of the 5th character, or the 7th, etc? Here are some common examples.
#include <iostream>
using namespace std;
char ch[80]="Paul's a great guy!";
cout << ch;        // "Paul's a great guy!" is output
cout << &ch[0];    // "Paul's a great guy!" is output
cout << &ch[3];    // "l's a great guy!" is output
cout << &ch[10];   // "reat guy!" is output
It is important to clear up common problems with this syntax:
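One mix-up that is easy to make with this syntax is forgetting the &: ch[3] is a single character, while &ch[3] is a cstring that starts at that character. The sketch below illustrates this with a made-up function name, and also shows that ch + 3 is simply another spelling of &ch[3].

```cpp
#include <cstring>   // strcmp

// Hypothetical checks on the "address of the i-th character" syntax.
bool offset_syntax_checks() {
    char ch[80] = "Paul's a great guy!";

    char single = ch[3];          // one character: 'l'
    const char* tail = &ch[3];    // a cstring: "l's a great guy!"
    const char* same = ch + 3;    // pointer arithmetic, same address as &ch[3]

    return single == 'l' &&
           tail == same &&
           std::strcmp(tail, "l's a great guy!") == 0;
}
```

Note that tail does not hold a copy. It points into ch, so changing ch also changes what tail shows.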
The most common bit of information that programmers need about a cstring is its length. This can be useful if you need to count through a loop, for instance. It is very simple to write your own function that handles this process, but strlen(const char *s) is provided to us in the string.h library. It expects a null-terminated cstring, and returns the number of characters before the null character (as a size_t, an unsigned integer type). The cstring needs to be null-terminated so that the function is able to find the end.
#include <string.h>
char ch[80]="1234567890123456";
int len;
len=strlen(ch);       // len = 16
len=strlen(&ch[5]);   // len = 11
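For the curious, writing your own length function might look like the sketch below. my_strlen is a made-up name, and the real strlen() may well be implemented more cleverly; this only shows the idea of walking to the null character.

```cpp
#include <cstddef>   // size_t

// my_strlen is a hypothetical hand-rolled equivalent of strlen().
std::size_t my_strlen(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {   // walk forward until the terminator is found
        ++n;
    }
    return n;
}
```

If the cstring is not null-terminated, the loop has no reason to stop at the end of the array, which is exactly why the terminator matters.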
Many times you would like to add a cstring to another cstring. Fortunately, the function strcat(char *dest, const char *src) is very handy for this process. Again, you must supply it with two null-terminated cstrings. dest is the cstring to which src will be appended, so dest must have enough room for both texts plus the null character. src remains unchanged.
Another function that you should become familiar with is strncat(char *dest, const char *src, size_t n). It is the same as strcat, except that a maximum of n characters will be appended from src, instead of the entire cstring. A null character is always added after them.
#include <string.h>
char s1[80]="Paul's ";
char s2[80]="a gre";
char s3[80]="at guy!";
strcat(s1,s2);      // s1 is now "Paul's a gre"
strcat(s1,s3);      // s1 is now "Paul's a great guy!"
// starting again from the original values, the above can also be done like this:
strcat(s2,s3);      // s2 is now "a great guy!"
strcat(s1,s2);      // s1 is now "Paul's a great guy!"
// strncat is a little different (again starting from the original values):
strncat(s2,s3,4);   // s2 is now "a great g"
strncat(s1,s2,5);   // s1 is now "Paul's a gre"
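Neither strcat() nor strncat() checks whether dest is big enough for the result. One way to guard against writing past the end of the array is sketched here; append_if_fits is a hypothetical helper, not a library function, and its capacity parameter is assumed to be the full size of the dest array.

```cpp
#include <cstring>   // strlen, strcat

// append_if_fits is a hypothetical helper: it appends src to dest only when
// the combined text plus the terminating '\0' fits in `capacity` bytes.
bool append_if_fits(char* dest, std::size_t capacity, const char* src) {
    std::size_t used = std::strlen(dest);
    std::size_t extra = std::strlen(src);
    if (used + extra + 1 > capacity) {   // +1 for the '\0'
        return false;                    // dest is left untouched
    }
    std::strcat(dest, src);
    return true;
}
```

With char s1[80]="Paul's "; a call such as append_if_fits(s1, sizeof s1, "a great guy!") would succeed, while the same call on a 12-element array would be refused.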
Comparing two cstrings sounds simple: if s1 is equal to s2, then the cstrings are equal! Right? No. The idea seems simple until you realize what s1 and s2 are: pointers. How can one pointer be equal to another? Only if both point to the same location in memory. If that happens, then the cstrings will be identical, because they exist in the same location in memory, but two separate cstrings that hold the same text will still compare as unequal.
In order to compare cstrings, you must examine them character by character. Again, a function is provided: int strcmp(const char *s1, const char *s2). You pass in the two cstrings, and the function returns an integer. If s1 < s2, then the integer is negative. If s1 == s2 (== is used here only as notation for the comparison), then the integer will be 0. Lastly, if s1 > s2, the function will return a positive integer. Cstrings are larger or smaller based on the ASCII values of their characters. For example, "arizona" is greater than "Arizona", because lower case "a" is greater than upper case "A" in the ASCII table.
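If you don't want upper or lower case to affect your comparison, use int strcasecmp(const char *s1, const char *s2). It will compare the two strings as described above, but case-insensitively. Note that strcasecmp is a POSIX function declared in <strings.h>, not part of standard C or C++, so some compilers do not provide it.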
As with strncat, length-limited versions exist: int strncmp(const char *s1, const char *s2, size_t n) and int strncasecmp(const char *s1, const char *s2, size_t n). They compare a maximum of n characters from both s1 and s2.
#include <string.h>
#include <strings.h>        // strcasecmp (POSIX)
char s1[80]="United States of America";
char s2[80]="United states of America";
char s3[80]="uNITED STATES OF AMERICA";
char s4[80]="United States of America";
int result;
result=strcmp(s1,s2);       // result < 0 ('S' comes before 's' in ASCII)
result=strcmp(s2,s1);       // result > 0
result=strcmp(s1,s3);       // result < 0 (because 'U' comes before 'u')
result=strcmp(s1,s4);       // result = 0
// let's try strncmp()
result=strncmp(s1,s2,10);   // result < 0 (this includes the lower case 's' in "states")
result=strncmp(s1,s2,7);    // result = 0 (this does not)
// what about strcasecmp()
result=strcasecmp(s1,s3);   // result = 0
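The difference between comparing pointers and comparing text can be made concrete with a small sketch. The function name is invented for the illustration; only strcmp() comes from the library.

```cpp
#include <cstring>   // strcmp

// Hypothetical demo: identical text stored in two different arrays.
bool pointers_differ_but_text_matches() {
    char s1[80] = "United States of America";
    char s4[80] = "United States of America";

    const char* p1 = s1;
    const char* p4 = s4;

    bool same_address = (p1 == p4);               // false: two separate arrays
    bool same_text = (std::strcmp(p1, p4) == 0);  // true: every character matches

    return !same_address && same_text;
}
```

This is why a test such as if (s1 == s4) quietly gives the wrong answer for cstrings, and if (strcmp(s1, s4) == 0) is what you want instead.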
Very often you will want to convert a cstring into a number. Writing a function that does this would be simple, but time consuming. As before, a function has been written that you can take advantage of. Actually, several functions have been written: int atoi(const char *s) (for integers), double atof(const char *s) (for floating-point numbers; note that it returns a double), and finally long atol(const char *s) (for longs). All of these are declared in the <cstdlib> header. None of them can report an error, so the number in the cstring must fit in the type being returned.
#include <cstdlib>
char s1[80]="89";
char s2[80]="1337.604753";
char s3[80]="1234567890";   // must fit in a long; atol cannot signal overflow
int num1;
double num2;                // atof returns a double
long num3;
num1=atoi(s1);   // num1 = 89
num2=atof(s2);   // num2 = 1337.604753
num3=atol(s3);   // num3 = 1234567890
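When the input might not be a valid number, or might be too large, the standard function strtol() (also in <cstdlib>) can be used instead of atol(), because it reports where parsing stopped and sets errno on overflow. The wrapper below is only a sketch: parse_long and its parameters are made-up names.

```cpp
#include <cerrno>    // errno, ERANGE
#include <cstdlib>   // strtol

// parse_long is a hypothetical wrapper around the standard strtol().
// It returns true and stores the value in *out only if the whole cstring
// is a base-10 number that fits in a long.
bool parse_long(const char* s, long* out) {
    char* end = nullptr;
    errno = 0;
    long value = std::strtol(s, &end, 10);

    if (end == s || *end != '\0') {
        return false;            // no digits at all, or junk after the number
    }
    if (errno == ERANGE) {
        return false;            // the number does not fit in a long
    }
    *out = value;
    return true;
}
```

With this helper, "89" would be accepted, while "12abc" and a 34-digit number such as "8762340923734679347982346987234097" would both be rejected instead of producing a meaningless value.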
Converting a number into a cstring is a little bit more difficult than the reverse. A function in the library <stdio.h> called sprintf(char *s, const char *text, ...) can be used to solve this problem. As you can see, it is a variable parameter function, meaning it can accept a different number of arguments each time, and they can be of different types. The first parameter is the cstring that you would like the number written to; it must be large enough for the result. The second is a string of text that is written in a specific format. The other parameters are then inserted into this string of text. Here are some examples:
#include <stdio.h>
char ch[80];
char time[80]="hours";
int num1=80;
float num2=13.37;
sprintf(ch,"%i",num1);   // ch is now "80"
sprintf(ch,"%g",num2);   // ch is now "13.37" (%e would write it in scientific notation)
sprintf(ch,"Efficiency: %.2f%% over %i %s",num2,num1,time);   // ch is now "Efficiency: 13.37% over 80 hours"
Here is a list of common codes that you can use inside of sprintf(char *s, const char *text, ...):
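A handful of standard format codes can be tried out with the sketch below. The selection of codes and the names wrote and format_code_sampler are illustrative assumptions; the codes themselves are part of the standard printf family.

```cpp
#include <cstdio>    // sprintf
#include <cstring>   // strcmp

// Hypothetical helper: true when the cstring holds exactly the expected text.
static bool wrote(const char* actual, const char* expected) {
    return std::strcmp(actual, expected) == 0;
}

// Hypothetical sampler: returns true if every code behaves as commented.
bool format_code_sampler() {
    char ch[80];
    bool ok = true;

    std::sprintf(ch, "%d", 42);          // %d or %i: signed decimal integer
    ok = ok && wrote(ch, "42");

    std::sprintf(ch, "%5d", 42);         // a width pads on the left with spaces
    ok = ok && wrote(ch, "   42");

    std::sprintf(ch, "%x", 255);         // %x: hexadecimal
    ok = ok && wrote(ch, "ff");

    std::sprintf(ch, "%c", 'A');         // %c: a single character
    ok = ok && wrote(ch, "A");

    std::sprintf(ch, "%s", "hours");     // %s: another cstring
    ok = ok && wrote(ch, "hours");

    std::sprintf(ch, "%.3f", 3.14159);   // %.3f: three digits after the point
    ok = ok && wrote(ch, "3.142");

    std::sprintf(ch, "%d%%", 80);        // %%: a literal percent sign
    ok = ok && wrote(ch, "80%");

    return ok;
}
```

Each code must match the type of the argument it consumes; passing an int where %s is expected, for instance, is an error that the compiler may not catch.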
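1 - This pointer is uninitialized: it points to an arbitrary location in memory. Writing a cstring through it before making it point at valid storage (an existing array, or memory obtained with new) is undefined behavior and will typically crash the program.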
Written for CS35IB-B, Instructed by Gerry Donaldson