Structures

A structure is similar to an array in that it is an aggregate data type.

The general syntax of a struct:

struct tag { members } variable-list;
Or formatted appropriately:
struct tag
{
  member1
  member2
   ...
  memberN
} variable-list;
Notes:

Create a structure type named TIME (no space is allocated at this point):

struct TIME
{
  int hours;
  int minutes;
  int seconds;
};

This example creates two variables of type struct TIME (space is allocated). Compare to an array:
struct TIME t1, t2; /* You must include the struct keyword */
int t3[3];          /* An array of 3 integers              */
Visually: (the structures and array are uninitialized)

What do you think sizeof(struct TIME) is?

Assigning values to the fields: (Operator precedence chart)
  /* Set the fields of t1 */
t1.hours = 8;
t1.minutes = 15;
t1.seconds = 0;



  /* Set the fields of t2 */
t2.hours = 11;
t2.minutes = 59;
t2.seconds = 59;



  /* Set the elements of t3 */
t3[0] = 8;
t3[1] = 15;
t3[2] = 0;



Initializing Structures

Structures are initialized much like arrays:
Structure definition, then initializing TIME variables:
struct TIME
{
  int hours;
  int minutes;
  int seconds;
};
struct TIME t1 = {10, 15, 0}; /* 10:15:00 */
struct TIME t2 = {10, 15};    /* 10:15:00 */
struct TIME t3 = {10};        /* 10:00:00 */
struct TIME t4 = {0};         /* 00:00:00 */

struct TIME t5 = {};          /* Illegal before C23 */
struct TIME t6 = { , , 5};    /* Illegal            */
Another example:
Structure definition, then initializing STUDENT variables:
struct STUDENT
{
  char first_name[20];
  char last_name[20];
  int age;
  float GPA;
};
  /* Initialization statement */
struct STUDENT s1 = {"Johnny", "Appleseed", 20, 3.75F};

  /* Equivalent assignment statements */
strcpy(s1.first_name, "Johnny");
strcpy(s1.last_name, "Appleseed");
s1.age = 20;
s1.GPA = 3.75F;

  /* Don't try to do this (you can't assign to an array) */
s1.first_name = "Johnny";   /* Illegal */
s1.last_name = "Appleseed"; /* Illegal */
Graphically:

What is sizeof(struct STUDENT)?

Review of array initialization vs. assignment:

char string[20]; /* Array of 20 characters, uninitialized */

string = "Johnny";        /* Illegal, an array can't be assigned to */
strcpy(string, "Johnny"); /* Proper assignment                      */
More examples:
Structure definition, then initializing STUDENT variables:
struct STUDENT
{
  char first_name[20];
  char last_name[20];
  int age;
  float GPA;
};
  /* Initialize all fields */
struct STUDENT s2 = {"Tom", "Sawyer", 15, 1.30F};

  /* Set age and GPA to 0  */
struct STUDENT s3 = {"Huckleberry", "Finn"};

struct STUDENT s4 = {""};  /* Initialize everything to 0 */
struct STUDENT s5 = {{0}}; /* Initialize everything to 0 */

  /* Initializing arrays */
char first_name[20] = 0;  /* Illegal, need curly braces                  */
char first_name[20] = {}; /* Illegal before C23, need at least one value */
char last_name[20] = "";  /* Ok, all elements are 0                      */
char last_name[20] = {0}; /* Ok, all elements are 0                      */
Slightly different structure:
Structure definition, then initializing STUDENT2 variables:
struct STUDENT2
{
  char *first_name;
  char *last_name;
  int age;
  float GPA;
};
  /* Initialization statement */
struct STUDENT2 s1 = {"Johnny", "Appleseed", 20, 3.75F};

  /* Equivalent assignment statements */
s1.first_name = "Johnny";
s1.last_name = "Appleseed";
s1.age = 20;
s1.GPA = 3.75F;

strcpy(s1.first_name, "Johnny");   /* BAD IDEA */
strcpy(s1.last_name, "Appleseed"); /* BAD IDEA */
Remember the string pool?

What is sizeof(struct STUDENT2)?

More review of pointers vs. arrays and initialization vs. assignment:

char s1[20] = "CS120"; /* sizeof(s1)?, strlen(s1)? */
char s2[20];           /* sizeof(s2)?, strlen(s2)? */

char *p1 = "CS120";    /* sizeof(p1)?, strlen(p1)? */
char *p2;              /* sizeof(p2)?, strlen(p2)? */

s2 = "CS120";          /* Illegal                  */
strcpy(s2, "CS120");   /* OK                       */

p2 = "CS120";          /* OK                       */
strcpy(p2, "CS120");   /* Legal, but very bad      */

There is a caveat when initializing a structure with fewer initializers than there are fields in the struct. The GNU compiler will give you warnings with the -Wextra switch:

 1.  struct TIME
 2.  {
 3.    int hours;
 4.    int minutes;
 5.    int seconds;
 6.  };
 7.
 8.  struct TIME t1 = {10, 15, 0}; /* 10:15:00, no warning */
 9.  struct TIME t2 = {10, 15};    /* 10:15:00, warning    */
10.  struct TIME t3 = {10};        /* 10:00:00, warning    */
11.  struct TIME t4 = {0};         /* 00:00:00, no warning */
Warnings:
main2.c:9: warning: missing initializer
main2.c:9: warning: (near initialization for `t2.seconds')
main2.c:10: warning: missing initializer
main2.c:10: warning: (near initialization for `t3.minutes')
Starting with GCC version 4.0.4, this warning can be suppressed with -Wno-missing-field-initializers. However, do not use this command-line switch for assignments or labs unless instructed to do so. Some older compilers issue the warning but provide no way to suppress it.

Here's what we're dealing with in CS180: task_struct.

Structures as Parameters

Structures can be passed to functions just like any other value. However, unlike arrays, they are passed by value, meaning that the entire structure is copied (typically onto the stack). This is a significant difference, and for a large structure the copy can be an expensive operation.

Given this structure:

struct TIME
{
  int hours;   /* 4 bytes */
  int minutes; /* 4 bytes */
  int seconds; /* 4 bytes */
};
Passing a TIME structure to a function:

Function to print a TIME, then calling the function:
void print_time(struct TIME t)
{
  printf("The time is %02i:%02i:%02i\n",
         t.hours, t.minutes, t.seconds);
}
void foo(void)
{
    /* Create time of 10:30:15 */
  struct TIME t = {10, 30, 15};

    /* Pass by value and print */
  print_time(t);
}

Output:
The time is 10:30:15
A more expensive example:

struct STUDENT
{
  char first_name[20]; /* 20 bytes */
  char last_name[20];  /* 20 bytes */
  int age;             /*  4 bytes */
  float GPA;           /*  4 bytes */
};

Function to print a STUDENT, then calling the function:
void print_student(struct STUDENT s)
{
  printf("Name: %s %s\n", s.first_name, s.last_name);
  printf(" Age: %i\n", s.age);
  printf(" GPA: %.2f\n", s.GPA);
}
void foo(void)
{
  struct STUDENT s1 = {"Johnny",
                        "Appleseed",
                        20,
                        3.75F
                      };

  print_student(s1);
}

Output:
Name: Johnny Appleseed
 Age: 20
 GPA: 3.75
We'll see soon that passing a pointer to a structure is the proper way to do it.

Here's something that confuses beginners:

  /* Initialization statements */
struct STUDENT s1 = {"Johnny", "Appleseed", 20, 3.57F};
struct STUDENT s2 = {"Huckleberry", "Finn", 18, 3.75F};

s1 = s2; /* assign s2 to s1 (assigns arrays!) */
Yes, when arrays are part of a structure, assigning the structure assigns the arrays as well. The compiler copies every member, including each element of the arrays. It can do this because both structures have the same type (they must), so it knows the arrays are identical in layout and it knows their size.

So, if you really, really want to pass an array to a function by value (as opposed to passing a pointer to the first element), just put the array in a structure and pass the structure. The entire array within the structure is copied on the stack.

Also, this only works when you assign the entire structure. If you just try to do this:

s1.first_name = s2.first_name;
You'll get an error message like:
error: assignment to expression with array type
   s1.first_name = s2.first_name;
                 ^

Structures as Members

Structures can contain almost any data type, including other structures. Sometimes these are called nested structures. These types of structures are very common and used in almost all non-trivial programs.
#define MAX_PATH 12
struct DATE
{
  int month;
  int day;
  int year;
};
struct TIME
{
  int hours;
  int minutes;
  int seconds;
};
struct DATETIME
{
  struct DATE date;
  struct TIME time;
};
struct FILEINFO
{
  int length;
  char name[MAX_PATH];
  struct DATETIME dt;
};

Notice the DATE and TIME structures above. They are structurally the same, but are considered two different (and incompatible) types. The C programming language (and C++ as well) does not consider them the same because it uses name equivalence instead of structural equivalence.


Given this code:
struct FILEINFO fi;
We can visualize the struct in memory like this:

Now highlighting the two fields of the DATETIME struct:

Now highlighting the fields of the DATE and TIME structs:

Example:
struct DATE
{
  int month;
  int day;
  int year;
};
struct TIME
{
  int hours;
  int minutes;
  int seconds;
};
struct DATETIME
{
  struct DATE date;
  struct TIME time;
};
struct FILEINFO
{
  int length;
  char name[MAX_PATH];
  struct DATETIME dt;
};
void f1(void)
{
  struct FILEINFO fi;   /* Create FILEINFO struct on stack */

    /* Set date to 7/4/2021 */
  fi.dt.date.day = 4;
  fi.dt.date.month = 7;
  fi.dt.date.year = 2021;

    /* Set time to 9:30 am */
  fi.dt.time.hours = 9;
  fi.dt.time.minutes = 30;
  fi.dt.time.seconds = 0;

  fi.length = 1024;           /* Set length */
  strcpy(fi.name, "foo.txt"); /* Set name   */
}
You've seen this kind of syntax before. It's all over the Internet:
www.digipen.edu
faculty.digipen.edu
www.yahoo.com
uk.yahoo.com
au.yahoo.com
ada.cs.pdx.edu
The "nested fields" of a host name are separated by dots and are resolved from right to left instead of left to right, for example:

ada.cs.pdx.edu

An example using initialization:
                  /* length    name               DATETIME           */
                  /* length    name          DATE         TIME       */
                  /* length    name       m  d    y     h  m   s     */
struct FILEINFO fi = {1024, "foo.txt", { {7, 4, 2021}, {9, 30, 0} } };
A very fast way to set everything to 0:
struct FILEINFO fi = {0};
Same operations using a pointer to the structure:
void f2(void)
{
  struct FILEINFO fi;         /* Create FILEINFO struct on stack */
  struct FILEINFO *pfi = &fi; /* Pointer to a FILEINFO struct    */

    /* Set date to 7/4/2021 */
  (*pfi).dt.date.day = 4;
  (*pfi).dt.date.month = 7;
  (*pfi).dt.date.year = 2021;

    /* Set time to 9:30 am */
  (*pfi).dt.time.hours = 9;
  (*pfi).dt.time.minutes = 30;
  (*pfi).dt.time.seconds = 0;

  (*pfi).length = 1024;           /* Set length */
  strcpy((*pfi).name, "foo.txt"); /* Set name   */
}
Because of operator precedence, the parentheses above are required. Without them, the compiler sees this:
*(pfi.dt.date.day) = 4;
And you can't dereference an integer (day).

Accessing a member of a pointer:

  /*  error: request for member 'length' in something not a structure or union */
pfi.length = 1024;
Closer look:
Expression     Description
-------------------------------------------------------------------------------
pfi            A pointer to a FILEINFO struct
*pfi           A FILEINFO struct
(*pfi).        Accessing a member of a pointer to a FILEINFO struct
pfi->          Accessing a member of a pointer to a FILEINFO struct (shorthand)
The structure pointer operator (or informally the arrow operator) is another programmer convenience in the same vein as the subscript operator, and it is high on the precedence chart. It performs the indirection "behind the scenes", so:
(*pfi).  is the same as   pfi->
That's why using the structure pointer operator on a structure is illegal; we're trying to dereference something (a structure) that isn't a pointer. And that's a no-no. Same example using the structure pointer operator:
  /* Set date to 7/4/2021 */
pfi->dt.date.month = 7;
pfi->dt.date.day = 4;
pfi->dt.date.year = 2021;

  /* Set time to 9:30 am */
pfi->dt.time.hours = 9;
pfi->dt.time.minutes = 30;
pfi->dt.time.seconds = 0;

pfi->length = 1024;           /* Set length */
strcpy(pfi->name, "foo.txt"); /* Set name   */

Arrays vs. Structures vs. Built-in Types

Unlike arrays, which prohibit most aggregate operations, it is possible in some cases to manipulate structures as a whole.

Operation                   Arrays               Structures           Built-in types (e.g. int)
-----------------------------------------------------------------------------------------------
Arithmetic                  No                   No                   Yes
Assignment                  No                   Yes                  Yes
Comparison                  No                   No                   Yes
Input/Output (e.g. printf)  No (except strings)  No                   Yes
Parameter passing           By address only      By address or value  By address or value
Return from function        No                   Yes                  Yes
Also, unlike array elements, structure members are not guaranteed to be contiguous in memory (the compiler may insert padding between them), so using pointer arithmetic to step from one member to the next is not safe.

According to the chart above, it is possible to return a structure from a function. What this allows us to do is to return multiple values from a function. We will still only be returning a single object, but that single object (a struct) may contain any number of members and each member may be a different type with a different value. This can be a very convenient way of returning multiple values/types from a function instead of returning them through pointers.

Structures as Parameters (revisited)

Since the default method for passing structures to functions is pass-by-value, we should modify the function so that it's more efficient. We do that by simply passing the structure by address instead.

Function to print a STUDENT (more efficient), then calling the function:
void print_student2(struct STUDENT *s)
{
  printf("Name: %s %s\n", s->first_name, s->last_name);
  printf(" Age: %i\n", s->age);
  printf(" GPA: %.2f\n", s->GPA);
}
void f12(void)
{
  struct STUDENT s1 = {"Johnny",
                        "Appleseed",
                        20,
                        3.75F
                      };
  print_student2(&s1);
}

Output:
Name: Johnny Appleseed
 Age: 20
 GPA: 3.75
Note:

Summary of struct Syntax

The general form:
struct tag { members } variable-list;
with proper formatting:
struct tag
{
  member1
  member2
   ...
  memberN
} variable-list;
Create a structure type named TIME (no space is allocated):
struct TIME
{
  int hours;
  int minutes;
  int seconds;
};
Create two variables of type struct TIME (space is allocated):
struct TIME t1, t2; /* The struct keyword is required */
Without the struct keyword, gcc will complain:
error: unknown type name 'TIME'
 TIME t1, t2;
 ^
The Clang C compiler gives a better message:
error: must use 'struct' tag to refer to type 'TIME'
TIME t1, t2;
^
struct 
1 error generated.

You must include the struct keyword when creating struct variables in C. That's because the type is "struct TIME", not simply "TIME". In C++, the use of the struct keyword is optional when creating variables.


We can do both (define the structure and variables) in one step:

struct TIME
{
  int hours;
  int minutes;
  int seconds;
}t1, t2;            /* This allocates space            */

struct TIME t3, t4; /* Create more TIME variables here */

Leaving off the tag creates an anonymous structure:

struct     /* No name given to this struct */
{
  int hours;
  int minutes;
  int seconds;
}t1, t2;  /* We won't be able to create others later */
This has little use in most code.

Using a typedef to create an alias: (Refer to the notes on typedef.)

struct TIME
{
  int hours;
  int minutes;
  int seconds;
};
typedef struct TIME Time; /* Now, 'Time' is another name for 'struct TIME' */
With typedef we don't need the struct keyword:
Time t1, t2;        /* Create two 'Time' variables        */
struct TIME t3, t4; /* Create two 'struct TIME' variables */
All of the variables are the same (compatible) type, so this works:
t1 = t3; /* OK */
t2 = t4; /* OK */
We can create the typedef when we define the struct:
typedef struct TIME /* tag name     */
{
  int hours;
  int minutes;
  int seconds;
}Time;             /* typedef name */

Time t1, t2;         /* use typedef name                   */
struct TIME t3, t4;  /* use tag name, needs struct keyword */

The tag name and typedef name can be the same:

typedef struct TIME /* tag name            */
{
  int hours;
  int minutes;
  int seconds;
}TIME;              /* typedef same as tag */

TIME t1, t2;        /* use typedef name                     */
struct TIME t3, t4; /* use tag name, needs struct keyword   */

TIME times[10];     /* an array of 10 TIME structs          */
TIME *pt;           /* a pointer to a TIME struct           */
TIME foo(TIME *p);  /* function takes a TIME struct pointer */
                    /*   and returns a TIME struct          */

Usually, when a typedef is created, the struct keyword is not used anymore, simply because it requires extra work (typing) and is unnecessary.

When using typedef, watch out for this:
typedef struct /* No tag! */
{
  int hours;
  int minutes;
  int seconds;
}TIME;          /* typedef (no tag) */

TIME t1;          /* use typedef name           */
struct TIME t2;  /* This doesn't work anymore. */
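Depending on where you declare a struct TIME variable, you'll get errors about an "incomplete type", because no structure with the tag TIME was ever defined. For example, with a variable declared as struct TIME t1: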

gcc:

error: storage size of 't1' isn't known
 struct TIME t1;
             ^          
error: 't1' has an incomplete type
  t1 = t2; 
  ^
Clang:
error: variable has incomplete type 'struct TIME'
struct TIME t1;
            ^
error: incomplete type 'struct TIME' is not assignable
        t1 = t2;
        ~~ ^
error: tentative definition has type 'struct TIME' that is never completed
struct TIME t1;
            ^
note: forward declaration of 'struct TIME'
struct TIME t1;
       ^

Self-referencing and mutually-dependent structures

Before any data type can be used to create a variable, the size of the type must be known to the compiler:
struct NODE
{
  int value;
  struct NODE next;  /* illegal */
};
Since the compiler hasn't fully "seen" the NODE struct, it can't be used anywhere, even inside itself. However, this works:
struct NODE
{
  int value;
  struct NODE *next; /* OK */
};
Since all pointers to structures have the same size and representation, the compiler will accept this. The compiler doesn't fully know what's in a NODE struct (and doesn't need to know yet), but it knows the size of a pointer to it.

Using a forward declaration and typedef:

typedef struct tagNODE NODE; /* Forward declaration */

struct tagNODE
{
  int value;
  NODE *next; /* Don't need struct keyword because of forward declaration above */
};
or (tag and typedef are the same)
typedef struct NODE NODE; /* Forward declaration */

struct NODE
{
  int value;
  NODE *next; /* Don't need struct keyword because of forward declaration above */
};


Two structures that are mutually dependent on each other won't pose any problems. In the source file, one of the definitions must come after the other. The use of the struct keyword gives the compiler enough information:

struct A
{
  int value;
  struct B *b; /* B is a struct, defined later */
};

struct B
{
  int value;
  struct A *a; /* A is a struct */
};
More examples with typedef. First, without typedef:
struct A
{
  struct A *pa; /* Need struct keyword, type is 'struct A' */
  struct B *pb; /* Need struct keyword, type is 'struct B' */
};

struct B
{
  struct A *pa; /* Need struct keyword, type is 'struct A' */
  struct B *pb; /* Need struct keyword, type is 'struct B' */
};

struct A a; /* Need struct keyword */
struct B b; /* Need struct keyword */

with typedef:
typedef struct A /* Tag is required so that 'struct A' below refers to this struct */
{
  struct A *pa; /* Need struct keyword, haven't seen typedef for A yet */
  struct B *pb; /* Need struct keyword, haven't seen typedef for B yet */
}A;

typedef struct B /* Tag is required so that 'struct B' refers to this struct */
{
  A *pa;        /* Don't need struct keyword because of typedef above  */
  struct B *pb; /* Need struct keyword, haven't seen typedef for B yet */
}B;

A a; /* OK because of typedef */
B b; /* OK because of typedef */
with typedef and forward declarations:
typedef struct A A; /* Forward declaration, compiler knows what the symbol A is */
typedef struct B B; /* Forward declaration, compiler knows what the symbol B is */

struct A
{
  A *pa; /* Don't need struct keyword because of forward declaration above */
  B *pb; /* Don't need struct keyword because of forward declaration above */
};

struct B
{
  A *pa; /* Don't need struct keyword because of forward declaration above */
  B *pb; /* Don't need struct keyword because of forward declaration above */
};

A a; /* OK, typedef */
B b; /* OK, typedef */
Notes: Some developers prefer to have distinct names between the tags and the typedefs:
typedef struct tagA A; /* Forward declaration */
typedef struct tagB B; /* Forward declaration */

struct tagA
{
  A *pa; /* Don't need struct keyword because of forward declaration above */
  B *pb; /* Don't need struct keyword because of forward declaration above */
};

struct tagB
{
  A *pa; /* Don't need struct keyword because of forward declaration above */
  B *pb; /* Don't need struct keyword because of forward declaration above */
};

A a; /* OK, typedef */
B b; /* OK, typedef */
This allows one to distinguish between the tag and the typedef:
A a1;           /* OK, typedef         */
struct tagA a2; /* Need struct keyword */
The choice is a matter of preference. Since most programmers never use the struct keyword after they create the typedef, it really doesn't make any difference if you have a distinct name for the tag.