A new government report indicates that enrollment in computer-related majors at engineering colleges throughout the country is skyrocketing. Analysts and demographers all point to the same cause: enthusiasm among high-school graduates about entering a field where problem-solving skills are paramount.
Let's mentally play the role of the computer executing the following program stopped at the spot in the code indicated by the red arrow:
What has already happened? Well, we know that the program started executing at the top of the `main` function. Then, it executed the first statement, an invocation of `function`. We recall from earlier editions of the C++ Times that invoking a function means that the program's execution is immediately transferred to the first statement of that invoked function. Therefore, the program's next step was to execute the declaration/definition of the `c1` and `c2` variables and assign them the initial values of `'Q'` and `'R'`, respectively! When the program's execution pauses at the red arrow, here's what the computer's memory may look like (conceptually):
Here memory is visualized as a two-dimensional "spreadsheet" of cells, each of which holds a single byte. The byte at the top left of the diagram has address 0 and the byte at the bottom right has address 24 + 7 = 31 (add the value at the beginning of the row to the value at the beginning of the column).
Note: See the C++ Times special long-form reporting on variables in memory in previous editions.
Name | Scope | Type | Value | Address |
---|---|---|---|---|
`c1` | `function` | `char` | `'Q'` | 5 (0x5) |
`c2` | `function` | `char` | `'R'` | 11 (0xb) |
An expression is ... anything that has a value. Remember?
The C++ expression that gets the value of the variable `c1` is just
c1
What's the expression to get the value of `c2`?
c2
Recall that every variable in C++ has some associated space in memory where its value is stored and we can write expressions (like the ones above) to get the variable's value. But can we write an expression to retrieve the address of a variable? Yes, we can! The expressions `&c1` and `&c2` evaluate to 5 (again, 0x5) and 11 (0xb), respectively, the addresses of those variables in the (conceptual) computer's memory. Because using `&` in front of a variable in an expression gets the address of the variable, the `&` is called the address-of operator. The address-of operator is a unary operator (an operator that takes a single operand; cf. binary operators like `+` and `-`). The operand to the address-of operator must be an expression whose value identifies an object (for our purposes, this is usually a variable).
Note: Yes, the use of `&` here may at first seem confusing -- we learned earlier about using the `&` sigil to declare/define a reference variable and the following code snippet shows both uses of `&`:
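#include <iostream>
#include <string>

// Here & declares str as a reference to a std::string.
void foer(std::string &str) {
    ...
}

int main() {
    std::string a_string;
    // Here & is the address-of operator.
    auto address_of_a_string = &a_string;
    foer(a_string);
}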
Although the confusion is palpable now, as you learn more about pointers, you will see how reference variables and pointers are closely related and, therefore, understand why the language designers have chosen to reuse the `&` for reference variables.
If we have expressions that evaluate to addresses of variables, it would be nice to be able to store them somewhere. But so far we do not have a type that will hold such a value. We have types that hold integers, floating-point values and strings, but nothing that will hold an address of a variable.
Don't worry! There is a type in C++ that we can use to declare a variable that holds the address of another variable. We call a variable that holds the address of another variable a pointer to X, where X is the type of the variable being pointed to. Why? Because we can imagine a variable with a pointer-to-X type as one that "points" to the variable storing a value whose type is `X`.
Note: Because "points to" and "points at" end in a preposition and it's not nice to end sentences in prepositions, some people use the synonym "targets".
Just to be extra clear, there is absolutely nothing odd or unusual about a variable that points to another variable -- it has a scope, value, type and the pointer itself even has a place in memory!
void function() {
    char c1{'Q'};
    char c2{'R'};
    char *ptr{&c1};
    return;
}

int main() {
    function();
    return 0;
}
In the snippet above, `ptr` has type pointer to `char` and it "points at" variable `c1`. It is really important to be aware that there is no such type in C++ as "pointer" -- the type is an entire phrase, pointer to X, where X is some type. That means that the type of a pointer variable specifies
- that the variable is a pointer, and
- the type of the variable at its target.
That's important because a variable that is a "pointer to a `char`" can only point at `char`-typed variables; a "pointer to an `int`" can only point at `int`-typed variables; and so on.
In the example above, the scope of `ptr` is the body of `function`.
It is a coincidence that what `ptr` points at has the same scope as `ptr`. That does not have to be the case.
So far, we see two major similarities between pointer variables and the other kinds of variables:
- Both have types; and
- Both have scope.
Do they share anything else in common? Why, yes, they do! Both take up space and have a value. Let's start first with the values of pointers. A type defines the range of valid values for a variable of that type and pointers are no exception. Officially, there are a few "sets" of valid values for a pointer. Think conceptually about the requirements for a pointer: Most importantly, a pointer must contain enough information to uniquely identify its target. Well, what information is available that would uniquely identify a target in the least amount of memory?
The variable's name? Nope! Not all variables have names and names are not unique outside of their scope. Well, why not identify a variable by its name and its scope (in combination)? That seems rather, er, laborious and, because we can nest scopes infinitely, would require lots of space. What else could we use? I know, let's rely on the fact that every variable takes up space in memory and every piece of space in memory (as long as the space is at least as big as a byte!) has a unique address! That means we could uniquely describe the target of a pointer using the address of the variable that it targets. Perfect!
We know that all variables occupy space in memory. Just how much space in memory do pointer-to-`X`-type variables use? Answering this question requires some advanced calculations, but I think that we can handle it.
Because the storage for variables can occur anywhere in memory, a pointer-to-`X`-type variable needs to be able to store the address of every single byte of memory in the system. Therefore, the storage allocated for a pointer-to-`X`-type variable must contain enough space to represent each of the possible addresses that it could hold.
Almost all of today's computers have more than
The computer can only handle 1s and 0s. So, how can you use those two values to represent decimal numbers? Let's assign
Count up the lines above and note that there are 16 unique combinations of four 1s/0s that can be used to represent the numbers 0 through 15.
X
-type variable occupies
Remember the program from above:
void function() {
    char c1{'Q'};
    char c2{'R'};
    char *ptr{&c1};
    return;
}

int main() {
    function();
    return 0;
}
Here's what the computer's memory may look like (conceptually for a 64-bit computer):
Here is the relevant information about each of the variables in the `function` function:
Name | Scope | Type | Value | Address |
---|---|---|---|---|
`c1` | `function` | `char` | `'Q'` | 5 (0x5) |
`c2` | `function` | `char` | `'R'` | 11 (0xb) |
`ptr` | `function` | pointer to `char` | 5 (0x5) | 24 (0x18) |
I hope that the visualization and the table make it obvious that a pointer variable has all the same attributes as every other variable:
- Type,
- Scope,
- Place in memory (an address), and
- Value.
What happens if the programmer inserts
ptr = &c2;
just before the `return` statement in `function`? What do the contents of the memory look like after that?
Name | Scope | Type | Value | Address |
---|---|---|---|---|
`c1` | `function` | `char` | `'Q'` | 5 (0x5) |
`c2` | `function` | `char` | `'R'` | 11 (0xb) |
`ptr` | `function` | pointer to `char` | 11 (0xb) | 24 (0x18) |
This table demonstrates the meaning of the assignment statement for pointers. Assigning a new address to a pointer changes the variable it targets. Remember that pointers store addresses, so we must use the address-of operator (`&`) on the target variable in the assignment! Forgetting it is a type error: in the program above, `ptr = c2;` tries to store a `char` where an address is expected, and a standard-conforming C++ compiler rejects it. (C compilers and permissive compiler modes may only warn, in which case the mistake surfaces at runtime instead.)
When I was growing up, there was a game called Chutes and Ladders.
If you landed on a ladder, your character was able to climb to its top and advance through the board rapidly. If you landed on a chute (a slide), your character tumbled to its bottom and lost a tremendous number of places. Let's imagine that a pointer-to-`X` is like one of the spaces on the game board where there is an attached ladder or chute -- the top of the ladder or the bottom of the chute is the pointer-to-`X`'s target! Landing on a space with an attached chute or ladder is like using a pointer-to-`X` variable in an expression (e.g., `ptr` in the program above), but climbing up the ladder or sliding down the chute requires an additional "push". In C++ we call such a push a dereference operation. An expression dereferencing a pointer-to-`X` evaluates to the value of the variable at the pointer-to-`X`'s target! Like any operation in C++, the dereference operation is denoted by an operator (the `*`) that we call either the dereference or contents-of operator. It is a unary operator and takes a single operand: a pointer-to-`X`-type expression.
Just before `function` completes its execution in the original program above (without the `ptr = &c2;` change), the following expressions have these values and types:
Row | Expression | Value | Type |
---|---|---|---|
1. | `c1` | `'Q'` | `char` |
2. | `c2` | `'R'` | `char` |
3. | `ptr` | 5 | pointer to `char` |
4. | `&c1` | 5 | pointer to `char` |
5. | `&c2` | 11 | pointer to `char` |
6. | `&ptr` | 24 | pointer to pointer to `char` |
7. | `*ptr` | `'Q'` | `char` |
Look carefully at rows 3, 6 and 7 and make sure you understand why those expressions have the values they do.
Besides being useful to get the value at the target of the pointer, the dereference operator can be used in an assignment statement to set the value of the variable at the target. Let's say that the programmer adds
*ptr = 'S';
to the program above, which makes the entire program look like:
void function() {
    char c1{'Q'};
    char c2{'R'};
    char *ptr{&c1};
    *ptr = 'S';  // stores 'S' in c1, the variable ptr targets
    return;
}

int main() {
    function();
    return 0;
}
Just before `function` completes its execution in the program above, the expressions have the following values:
Row | Expression | Value |
---|---|---|
1. | `c1` | `'S'` |
2. | `c2` | `'R'` |
3. | `ptr` | 5 |
4. | `&c1` | 5 |
5. | `&c2` | 11 |
6. | `&ptr` | 24 |
Using a dereferenced pointer-to-`X`-type variable as the place to store the value of an expression has the effect of updating the value of the variable the pointer-to-`X` targets! How cool!
What, precisely, is the syntax for declaring a pointer variable?
<type> *<name>;
That syntax declares a variable named `<name>` with the type pointer to `<type>`. For example,
std::string *str_ptr;
declares a variable named `str_ptr` that is a pointer to a `std::string`. Or,
double *dbl_ptr;
declares a variable named `dbl_ptr` that is a pointer to a `double`.
Pointers hold some hidden "gotchas" that fastidious programmers must always keep in mind.
A common error programmers make when using pointers is trying to make a pointer target a variable of the wrong type. Consider
int five{5};
int *ptr_to_int = &five;
The type of `ptr_to_int` is "pointer to `int`". Remember, the whole phrase is the type. And, because C++ is strongly (and statically) typed, the compiler will complain if we try the following:
int five{5};
double fived{5.0};
int *ptr_to_int = &five;
ptr_to_int = &fived;
Can you see why?
As you continue to learn more about pointers, you will see additional common mistakes! For now, revel in your newfound power!