Pointers

Pointer Basics

On computers where integers and addresses have the same range, it's possible to interchange them. However, in general, you can't do this, and may likely get a warning from the compiler. We can declare pointer variables easily:
void foo(void)
{
  int i;  /* i can only store integer values            */
          /* The value of i is undefined at this point  */

  int *p; /* p can only store the address of an integer */
          /* The value of p is undefined at this point  */

  p = &i; /* The value of p is now the address of i     */
          /* The value of p is now defined              */
          
  i = 10; /* The value of i is now 10                   */
          /* The value of i is now defined              */
}
This is the notation that will be used when talking about identifiers in memory:

Visualizing the code above:

After declarations for i and pAfter assignment to pAfter assignment to i



Note: Be sure not to confuse the asterisk ( * ) used for multiplication with the asterisk used to declare pointers. They are the same token, but have very different meanings.

Other examples:
int    *pi; /* pi can only "point" to integers */
float  *pf; /* pf can only "point" to floats   */
double *pd; /* pd can only "point" to doubles  */
short  *ps; /* ps can only "point" to shorts   */
long   *pL; /* pL can only "point" to longs    */

unsigned short *pus; /* pus can only "point" to unsigned shorts */
unsigned int   *pui; /* pui can only "point" to unsigned ints   */

/* etc.... you get the idea   */
Unfortunately, we're not done abusing the asterisk ( * ). It has a third meaning as the indirection operator also known as the dereference operator. Operator precedence chart
void foo(void)
{
  int i = 10;
  int *p = &i; /* operator * is used as a declaration here */

  printf("The value of i is %i\n", i);
  printf("The address of i is %p\n", (void *)&i);

  printf("The value of p is %p\n", (void *)p);
  printf("The address of p is %p\n", (void *)&p);
  printf("The value of what p is pointing at is %i\n", *p); /* operator * is dereferencing the pointer p */
}

Output:
The value of i is 10
The address of i is 0x22cca4
The value of p is 0x22cca4
The address of p is 0x22cca0
The value of what p is pointing at is 10
Once we have a pointer to some data, we can read/write the data through the pointer:
void f5(void)
{
  int i = 10;  /* i has the value 10     */
  int *p = &i; /* p has the address of i */
  int j;       /* j has undefined value  */ 

  printf("The value of i is %i\n", i);
  printf("The value of *p is %i\n", *p);

  j = *p;  /* j has the value of i   */
  *p = 20; /* i has the value 20     */

  printf("The value of j is %i\n", j);
  printf("The value of i is %i\n", i);
}

Output:
The value of i is 10
The value of *p is 10
The value of j is 10
The value of i is 20	
Visually:
After declarations for i, p, jAfter j = *pAfter *p = 20



Watch out for these illegal expressions:
void foo(void)
{
  int i = 10;  /* i has the value 10     */
  int *p = &i; /* p has the address of i */
  int j;       /* j has undefined value  */ 

  j = p;  /* ILLEGAL: different types (compiler won't implicitly convert) */
  p = 20; /* ILLEGAL: different type (compiler won't implicitly convert)  */
}
Also, watch out for this as well. The compiler may not even warn you about it.
void f7(void)
{
  int *p;  /* p has undefined value (some random memory location)      */
  *p = 10; /* BAD: writing the value 10 to some random memory location */
}
You can have multiple pointers to the same address. The code below gives us three different ways to modify the contents of i:
void foo(void)
{
  int i = 10; /* i gets value 10 */
  int *p;     /* undefined       */
  int *q;     /* undefined       */

  p = &i;   /* p gets address of i              */
  q = p;    /* q gets value of p (address of i) */

  *p = 20;  /* i gets value 20 (*p and *q evaluate to 20) */
  *q = 30;  /* i gets value 30 (*p and *q evaluate to 30) */
}
Visually:
After declarations for p, q, iAfter p = &iAfter q = p



After *p = 20After *q = 30


Self-check: Given the code below, which of the assignment statements are illegal and why? What does the compiler say about the illegal statements?

int i;   
int *pi; 

pi = i;   /* 1. ??? */
pi = 5;   /* 2. ??? */
pi = &i;  /* 3. ??? */
*pi = i;  /* 4. ??? */
*pi = 5;  /* 5. ??? */
*pi = &i; /* 6. ??? */
Given these declarations:
  /* Assume addresses of variables are */
  /* a:100, b:104, c:108, d:112, e:116 */
  /* Assume a 32-bit computer          */
int a = 10;
int b = 108;
float c = 3.14F;
int *d = &a;
float *e = &c;
Diagram: (looks like an array, but it's not)

Showing the pointer arrows:

We can evaluate the expressions as such:
Expression    Type         Value
---------------------------------
     a         int           10
     b         int          108
     c        float        3.14
     d        int *         100
     e       float *        108
    *d         int           10
    *e        float        3.14
    &a        int *         100
    &d        int * *       112
    *a       illegal       -----
    *b       illegal       -----
    *c       illegal       -----

Passing Pointers to Functions

A major benefit of using pointers is realized when we need another function to modify a value. The simple example is scanf:
int a, b, c;
scanf("%d%d%d", &a, &b, &c);
We want scanf to modify our local variables, but scanf is unable to access them (local scope). So, we need to pass them to scanf. We need to pass the addresses of the local variables. If we just passed the values to scanf, any changes made to them would be lost after returning from the function. See this example.

Suppose we want to write a function that exchanges the values of two integers. Here's our first (failed) attempt:

int main(void)
{
  int x = 10;
  int y = 20;

  printf("Before: x = %i, y = %i\n", x, y);
  swap(x, y);
  printf(" After: x = %i, y = %i\n", x, y);

  return 0;
}
/* Exchanges the values of the parameters */
void swap(int a, int b)
{
  int temp = a; /* Save a for later      */
  a = b;        /* a gets value of b     */
  b = temp;     /* b gets old value of a */
}


Output:
Before swap: x = 10, y = 20
 After swap: x = 10, y = 20

Note that to swap we can't just do this:

a = b; /* value of a is lost */
b = a;
So, what went wrong?

Visually:

Initialize values, pass to swapMake copy of a in tempExchange a,b



Try it again, the correct way:

/* Exchanges the values of the parameters */
void swap(int *a, int *b)
{
  int temp = *a; /* Save a for later      */
  *a = *b;       /* a gets value of b     */
  *b = temp;     /* b gets old value of a */
}
int main(void)
{
  int x = 10;
  int y = 20;

  printf("Before swap: x = %i, y = %i\n", x, y);
  swap(&x, &y);
  printf(" After swap: x = %i, y = %i\n", x, y);

  return 0;
}


Output:
Before swap: x = 10, y = 20
 After swap: x = 20, y = 10

Visually:

In main
before swap
Call swap, pass addressesMake copy of a for later



Copy *b to *a (y to x in main)Copy temp to *b (temp to y in main)


After swap
in main
swap function is gone


Notes to remember:

Passing Arrays to Functions (review)

In C, we can't pass an array to a function. We can only pass the address of the first element (pointer to the first element). However, when we pass a pointer to some data, it is possible for the function receiving the pointer to modify the data directly, as opposed to modifying a copy of the data.

Let's see how this might be a problem.

This example works as expected (It finds the largest element in an array of integers). A pointer to the first element is passed to the function find_largest, along with the size (number of elements) in the array. Since the function has a pointer to the first element, it can access every element of the array using the subscript operator:

int main(void)
{
  int a[] = {4, 5, 3, 9, 5, 2, 7, 6};
  int largest;

  printf("Array before:\n");
  print_array(a, 8);

  largest = find_largest(a, 8);
  printf("Largest value: %i\n", largest);

  printf("Array after:\n");
  print_array(a, 8);

  return 0;
}

Output:
Array before:
    4    5    3    9    5    2    7    6
Largest value: 9
Array after:
    4    5    3    9    5    2    7    6
/* Assumes there is at least */
/* one element in the array  */
int find_largest(int a[], int size)
{
  int i;
  int max = a[0]; /* assume 1st is largest */

  for (i = 1; i < size; i++)
    if (a[i] > max) 
      max = a[i];  /* found a larger one */

  return max;
}
Let's modify the function to do something unexpected.

Now, the function is modifying the original data (not a copy). It seems that passing a pointer to the data, rather than a copy, could be dangerous if the function does something we don't expect.

Passing a pointer to a function allows the function to modify our data even if we don't want it to:

/* Modifies the array the was passed in!! */
int find_largest_BAD(int a[], int size)
{
  int i;
  int max = a[0]; /* assume 1st is largest */
  a[0] = 0;       /* change first element! */
  for (i = 1; i < size; i++)
  {
    if (a[i] > max) 
      max = a[i];  /* found a larger one */
    a[i] = 0;      /* set element to 0!! */
  }
  return max;
}
Output:
Array before:
    4    5    3    9    5    2    7    6
Largest value: 9
Array after:
    0    0    0    0    0    0    0    0
There is a solution to this problem. If we need to pass an array to a function, we must pass a pointer, which gives the function the ability to mess with the data. If we want to prevent the function from modifying the data in the array, we use the const keyword to prevent the function from modify the elements of the array. This makes the elements constant, meaning it is illegal to modify them:
/* Unable to modify the array since it's const */
int find_largest_BAD(const int a[], int size)
{
  int i;
  int max = a[0]; /* assume 1st is largest       */
  a[0] = 0;       /* ILLEGAL: elements are const */
  for (i = 1; i < size; i++)
  {
    if (a[i] > max) 
      max = a[i]; /* found a larger one          */
    a[i] = 0;     /* ILLEGAL: elements are const */
  }
  return max;
}
Compiler errors:
main.c: In function `find_largest_BAD':
main.c:161: error: assignment of read-only location
main.c:166: error: assignment of read-only location
Going back to the original non-const version of this function:
int find_largest(int a[], int size);
If we really don't want our array to change, we declare it with the const keyword:
int main(void)
{
  const int a[] = {4, 5, 3, 9, 5, 2, 7, 6}; /* Elements are constant (can't be changed)  */
  int largest = find_largest(a, 8);         /* ILLEGAL: Function expects non-const array */
  
  return 0;
}
Compiler error:
main.c:185: warning: passing arg 1 of `find_largest' discards qualifiers from pointer target type

Important: When you create a function that will accept arrays as parameters, be sure to mark them as const if you do not intend to modify them. If you don't make them const, a lot of code will not be able to use your function.

The const keyword can also be used to protect the pointer as well as what's being pointed at (the pointee?).


Now that we know that when passing arrays to functions, we are really just passing a pointer, these function declarations are 100% identical. Period. Absolutely NO difference. At. All. It is purely cosmetic, the compiler treats them exactly the same, always, every time, without exception. Any questions?

int find_largest(int a[], int size); /* Use array notation   */
int find_largest(int *a, int size);  /* Use pointer notation */
That's also why you can't put a number in the square brackets because it's a pointer to a single integer, not an array. (Even though it looks like it with the empty []).

Finally, never do this! Never return a pointer to a local variable. Ever. In other words, don't do this:

/* Function returns a pointer to an int */
int *foo(void)
{
  int i = 10; /* i is local (on the stack)      */
  return &i;  /* This will soon be undefined!!  */
}
Remember this? That's why it's undefined.

Fortunately, most compilers today will point out the sheer insanity of your code:

main.c: In function `foo':
main.c:174: warning: function returns address of local variable
Some compilers may even treat this as an error due to the abomination that it is (because it is always a bug in the code).

More on const

Sometimes you want to make sure that you don't accidentally change values in your programs. The safest way to do this is to mark the identifiers with the const keyword.

With pointers, you have more flexibility with const. You can make the pointer itself constant, which means once it points at something, it can never be changed to point at something else. Or, you can make the data pointed at constant, which means that although you can change the pointer to point at something else, you can't change what's being pointed at.

Here are the four cases:

  1. Neither the pointer nor the data being pointed at (the pointee) is const. Both can be changed:
    int *pi = &i; /* pi is a (non-const) pointer to a (non-const) int */
    
  2. The pointer is not const, but the data pointed at (the pointee) is const. The data is protected. Only the pointer can change:
    const int *pci = &i; /* pci is a pointer to a const int */
    
  3. The pointer is const but the data being pointed at is non-const. The pointer is protected. The data can be changed through the pointer.
    int * const cpi = &i; /* cpi is a const pointer to an int */
    
  4. Both the pointer and the data being pointed at are const. Both are protected. Neither can be changed:
    const int * const cpci = &ci;  /* cpci is a const pointer to a const int */
    
Points to remember:

Self-check: Here is an example that shows the const keyword in action in various ways. Make sure you understand every statement.

void foo(void)
{
  int i = 5;          /* 1. i is a non-constant int */
  int j = 6;          /* 2. j is a non-constant int */
  const int ci = 10;  /* 3. ci is a constant int    */
  const int cj = 11;  /* 4. cj is a constant int    */

  int *pi;                      /* 5. pi is a pointer to an int              */
  const int *pci;               /* 6. pci is a pointer to a const int        */
  int * const cpi = &i;         /* 7. cpi is a const pointer to an int       */
  const int * const cpci = &ci; /* 8. cpci is a const pointer to a const int */

  i = 6;      /*  9. Ok, i is not const    */
  j = 7;      /* 10. Ok, j is not const    */
  ci = 8;     /* 11. ERROR: ci is const    */
  cj = 9;     /* 12. ERROR: cj is const    */

  pi = &i;    /* 13. Ok, pi is not const   */
  *pi = 8;    /* 14. Ok, *pi is not const  */
  pi = &j;    /* 15. Ok, pi is not const   */
  *pi = 9;    /* 16. Ok, *pi is not const  */

  pci = &ci;  /* 17. Ok, pci is not const  */
  *pci = 8;   /* 18. ERROR: *pci is const  */
  pci = &cj;  /* 19. Ok, pci is not const  */
  *pci = 9;   /* 20. ERROR: *pci is const  */

  cpi = &j;   /* 21. ERROR: cpi is const   */
  *cpi = 10;  /* 22. Ok, *cpi is not const */
  *cpi = 11;  /* 23. Ok, *cpi is not const */

  cpci = &j;  /* 24. ERROR: cpci is const  */
  *cpci = 10; /* 25. ERROR: *cpci is const */
  
  pi = &ci;   /* 26. DANGER: constant ci can be changed through pi */
}

Detailed messages

This is why #26 is marked DANGER:

const int value1 = 10; /* const, once initialized, cannot change */

void bar(void)
{
  const int value2 = 5; /* const, once initialized, cannot change          */
  int *pi1 = &value1;   /* DANGER, var can be changed through the pointer! */
  int *pi2 = &value2;   /* DANGER, var can be changed through the pointer! */

  value1 = 10; /* ERROR, constant */
  value2 = 10; /* ERROR, constant */

  *pi1 = 10; /* Woops! Warning, bad things can happen! */
  *pi2 = 10; /* Woops! Warning, bad things can happen! */
}
Fortunately, this has been fixed in C++ where it is a straight-up error.

Short Array Review

Suppose we had an array of 8 chars (8-bits per char) and assigned a value to each char:
unsigned char bytes[8]; /* All values undefined */

bytes[0] = 'J';  /* 0x4A */
bytes[1] = 'o';  /* 0x6F */
bytes[2] = 'h';  /* 0x68 */
bytes[3] = 'n';  /* 0x6E */
bytes[4] = 205;  /* 0xCD */
bytes[5] = 204;  /* 0xCC */
bytes[6] = 76;   /* 0x4C */
bytes[7] = 62;   /* 0x3E */
Remember that we can initialize the array as such:
/* All 8 values are well-defined */  
unsigned char bytes[] = {'J', 'o', 'h', 'n', 205, 204, 76, 62};
If we assume that the address of the bytes variable is 1000, the raw bits of the array would look like this in memory:

Each of the 8 chars is uniquely identified by it's address:
 Address (l-value)        Contents (r-value)
(shown in decimal)        (shown in binary)
----------------------------------------------
&bytes[0] ==> 1000      bytes[0] ==> 01001010 
&bytes[1] ==> 1001      bytes[1] ==> 01101111
&bytes[2] ==> 1002      bytes[2] ==> 01101000
&bytes[3] ==> 1003      bytes[3] ==> 01101110
&bytes[4] ==> 1004      bytes[4] ==> 11001101
&bytes[5] ==> 1005      bytes[5] ==> 11001100
&bytes[6] ==> 1006      bytes[6] ==> 01001100
&bytes[7] ==> 1007      bytes[7] ==> 00111110
We can draw the diagram any way that we like, the computer doesn't care.

As single printable characters (ASCII table):


As integer equivalents of the characters (ASCII values):


The point is, everything in memory is binary. It's up to you to determine how to interpret the ones and zeros. Humans generally choose something other than binary.

Pointer Arithmetic

You can perform limited arithmetic with pointers:
  1. pointer + integer (result is a pointer)
  2. pointer - integer (result is a pointer)
  3. pointer - pointer (result is a signed integer, ptrdiff_t)
When adding/subtracting integers:

The integers that are added to pointers are scaled by the size of the type of pointer. The sizes of *p assume a 32-bit machine:

Expression      Type of p        Size of *p            Value added 
--------------------------------------------------------------------
  p + 1           char *     sizeof(char)   == 1            1
  p + 1          short *     sizeof(short)  == 2            2
  p + 1            int *     sizeof(int)    == 4            4
  p + 1          float *     sizeof(float)  == 4            4
  p + 1         double *     sizeof(double) == 8            8
  p + 1            Foo *     sizeof(Foo)    == X            X

  p + 2           char *     sizeof(char)   == 1            2
  p + 2          short *     sizeof(short)  == 2            4
  p + 2            int *     sizeof(int)    == 4            8
  p + 2          float *     sizeof(float)  == 4            8
  p + 2         double *     sizeof(double) == 8           16
  p + 2            Foo *     sizeof(Foo)    == X           2 * X
  
  p + 3           char *     sizeof(char)   == 1            3
  p + 3          short *     sizeof(short)  == 2            6
  p + 3            int *     sizeof(int)    == 4           12
  p + 3          float *     sizeof(float)  == 4           12
  p + 3         double *     sizeof(double) == 8           24
  p + 3            Foo *     sizeof(Foo)    == X           3 * X
In the table above, Foo is some user-defined object that requires X bytes in memory. The compiler will know about Foo and will scale the value by the appropriate amount. For example, if sizeof(Foo) is 20, then 2 * X would be 40 and 3 * X would be 60.

An array of char:

An array of short:
An array of int:
An array of double:
 

Note: Pointer arithmetic is only valid if the pointers are pointing into an array. If the pointers are not pointing inside an array, the behavior is undefined. When subtracting two pointers, both pointers must be pointing into the same array, otherwise the result is undefined.

Using Pointers with Arrays

We can use pointers with arrays instead of subscripts. Given this array and integer pointer:
int a[5];
int *pi;
We have this layout:
Initializing each element of an integer array to 0:

After the first iteration
for (pi = &a[0]; pi < &a[5]; pi++)
  *pi = 0;

We can combine the increment and dereference operators. All of these loops do the same thing. You may want to refer to your precedence chart.

for (pi = &a[0]; pi < &a[5];)
  *pi++ = 0;
or:
pi = &a[0];
while (pi < &a[5])
  *pi++ = 0;
or
for (pi = &a[5]; pi > a;)
  *--pi = 0;
Subtracting pointers Visually:
A more technically accurate version using ptrdiff_t instead of int:
#include<stddef.h> /* ptrdiff_t */

int a[5];

int *pi1 = &a[1];
int *pi2 = &a[4];

ptrdiff_t diff = pi2 - pi1;  /* diff is 3, (pi2 - pi1) / sizeof(int)  */
diff = pi1 - pi2;            /* diff is -3, (pi1 - pi2) / sizeof(int) */
You will need to include stddef.h for the definition (typedef) of ptrdiff_t.

Comparing Pointers

Notes on using pointers with arrays:

Pointers vs. Arrays

The short Q and A:
Q: Are pointers the same as arrays?
A: No. Never. Ever. Any questions? (Read all about the historical accident or mistake that confuses beginning C programmers.)

Moving on...

Simple points: Remember that the term pointer to is interchangeable with address of.

Suppose we pass this array to some function:

Call the functionThe function definition
foo(a, 5);
void foo(int x[], int elements)
{
  /* do whatever... */
}
We know we can't pass an array by value, we can only pass an address. (The address 100 in this example.) But which "100"? The array or the int? This is incorrect:
foo(&a, 5);
because the function is expecting a pointer to an int and we are passing a pointer to an array of 5 ints which is completely different.

Important note: When passing an array to a function, the compiler passes a pointer to the first element (i.e. the address of the first element) to the function. Understanding that statement is critical.

Read it. Learn it. Live it.

This means that these two function prototypes have the exact same meaning:

Using array notationUsing pointer notation
void foo(int x[], int elements);
void foo(int *x, int elements);
Two functions that print an array of integers: (all functions do the same exact thing)
Using array notationUsing pointer notation
void print_array1(int x[], int elements)
{
  int i;
  for (i = 0; i < elements; i++)
    printf("%i  ", x[i]);
}
void print_array2(int x[], int elements)
{
  int i;
  for (i = 0; i < elements; i++)
    printf("%i  ", *x++);
}
void print_array1(int *x, int elements)
{
  int i;
  for (i = 0; i < elements; i++)
    printf("%i  ", x[i]);
}
void print_array2(int *x, int elements)
{
  int i;
  for (i = 0; i < elements; i++)
    printf("%i  ", *x++);
}
Of course, we'd really want to make sure not to modify the array elements:
Using array notationUsing pointer notation
void print_array1(const int x[], int elements)
{
  int i;
  for (i = 0; i < elements; i++)
    printf("%i  ", x[i]);
}
void print_array2(const int *x, int elements)
{
  int i;
  for (i = 0; i < elements; i++)
    printf("%i  ", *x++);
}
This means that, when dealing with arrays, we can use subscript notation or pointer notation.

The Basic Rule:

array[index]  
is the same as
*(array + index)
where: This means: According to the Basic Rule:
array[index] ==> *(array + index)
Don't forget that due to C's built-in pointer arithmetic, the addition (e.g. array + 3) is scaled:
array[3] ==> *(array + 3 * sizeof(int)) ==> *(array + 12 bytes)
The above means: the fourth element is 12 bytes from the address of array. You can see this from looking at a similar diagram:
As a final note, since addition is commutative, these are equivalent:
array[3]
3[array]
because they lead to this:
array[3] ==> *(array + 3)
3[array] ==> *(3 + array)
The compiler is doing all of the pointer arithmetic behind the scenes. It goes without saying, but I feel the need to say it anyway: "Never, ever use the second form above!".

Relationship Between Subscripts and Pointers

Using the rule to convert a subscript to a pointer/offset, we get:
 a[i] ==> *(a + i)
&a[i] ==> &(*(a + i))
&a[i] ==> &*(a + i)
&a[i] ==> a + i
This shows that the address of any element is just the base address of the array plus the index (scaled).
char a[] = "abcdef";
char *p = a;

printf("%p, %p, %p, %p, %p\n", (void *)a, 
                               (void *)(a + 2), 
                               (void *)&*(a + 2), 
                               (void *)(p + 2), 
                               (void *)&*(p + 2));
Output:
0012FED4, 0012FED6, 0012FED6, 0012FED6, 0012FED6
Other equivalences:
 a[i] ==> *(a + i)
 a[0] ==> *(a + 0)
 a[0] ==> *a
&a[0] ==> &*a
&a[0] ==> a
These calls are equivalent:
f1(*a);    /* pass first element of array */
f1(a[0]);  /* pass first element of array */
and so are these:
f2(&a[0]); /* pass address of first element */
f2(a);     /* pass address of first element */

Pointer Expressions and Arrays

Given this code:
int a[10] = {5, 8, 3, 2, 1, 9, 0, 4, 7, 6};
int *p = a + 2;
Abstract diagram:

or shown with concrete values (addresses are arbitrary, as usual):

Give the equivalent expression using a. (Hint: Determine the type of each expression first)

  1. p
  2. p[0]
  3. *p
  4. p + 3
  5. *p + 5
  6. *(p + 6)
  7. p[6]
  8. &p
  9. p[-1]
  10. p[9]

You will most certainly see this kind of question on quizzes and exams as it checks to see if you really understand C/C++ programming. Make no mistake, this is one of the most important concepts you need to understand if you're going to be a C/C++ programmer (i.e. a game developer).