Record-based File I/O

(a.k.a. record-oriented, structure-oriented)

Overview

There are two basic types of files: text and binary. Some operating systems don't distinguish between the two, leaving it up to people (or applications). Essentially, text-based files are meant to be read/written by humans. Binary files are meant for the computer.

Text files are generally unstructured and are used for things like:

Binary files are generally (rigorously) structured and used for:

There are reasons why you would choose one format over the other:

As an example, we'll create a system that contains information about students. To keep it simple, we're just going to track 5 pieces of information:
  1. A unique identifier (C-string)
  2. A student's first name (C-string)
  3. A student's last name (C-string)
  4. A student's age (integer)
  5. A student's GPA (double)
To further restrict the data, the ID is stored in 8 bytes and each name in 20 bytes. Because these are C-strings, that includes the terminating NUL, so an ID can be at most 7 characters and a name at most 19 characters.

Our C structure to hold each student record looks like this:

#define MAX_ID_LEN    8
#define MAX_NAME_LEN 20

struct STUDENT
{
  char ID[MAX_ID_LEN];           /* e.g. 101001 */
  char last_name[MAX_NAME_LEN];  /* e.g. Smith  */
  char first_name[MAX_NAME_LEN]; /* e.g. John   */
  int age;                       /* e.g. 22     */
  double GPA;                    /* e.g. 3.14   */
};

Storing the Data as Text

Suppose we store the data in a text file. We'd still have to give it some kind of structure so that we could tell one record from the next. Here's a sample student:
        ID: 101001
 Last name: Faith
First name: Ian
       Age: 18
       GPA: 3.140000
Suppose we store each student as one line of text, with the fields separated by commas and ending with a newline (OS-dependent):
101001,Faith,Ian,18,3.140000<NL>
Here's what a set of records would look like:
101001,Faith,Ian,18,3.140000<NL>
102001,Tufnel,Nigel,19,3.250000<NL>
103001,Savage,Viv,22,3.870000<NL>
104001,Shrimpton,Mick,25,2.610000<NL>
105001,Besser,Joe,19,2.180000<NL>
106001,Smalls,Derek,19,2.640000<NL>
107001,St.Hubbins,David,20,2.900000<NL>
108001,Fleckman,Bobbi,20,3.190000<NL>
109001,Eton-Hogg,Denis,21,3.830000<NL>
110001,Upham,Denny,18,3.310000<NL>
111001,McLochness,Ross,19,1.980000<NL>
112001,Pudding,Ronnie,20,2.890000<NL>
113001,Schindler,Danny,20,3.410000<NL>
114001,Pettibone,Jeanine,28,3.330000<NL>
115001,Fame,Duke,18,2.990000<NL>
116001,Fufkin,Artie,19,2.900000<NL>
117001,DiBergi,Marty,19,3.750000<NL>
118001,Floyd,Pink,20,3.840000<NL>
119001,Zeppelin,Led,19,3.810000<NL>
120001,Mason,Nick,18,2.710000<NL>
121001,Wright,Richard,19,2.940000<NL>
122001,Waters,Roger,19,3.090000<NL>
123001,Gilmore,David,20,3.500000<NL>
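As a rough sketch of what reading and writing this one-line-per-student format might involve, here is one way to format a record as text and parse it back. The function names student_to_text and student_from_text are made up for this illustration, and the parser assumes that no field contains a comma.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ID_LEN    8
#define MAX_NAME_LEN 20

struct STUDENT
{
  char ID[MAX_ID_LEN];
  char last_name[MAX_NAME_LEN];
  char first_name[MAX_NAME_LEN];
  int age;
  double GPA;
};

/* Format one record as "ID,last,first,age,GPA\n" (hypothetical helper).
   Returns the number of characters written (not counting the NUL),
   or -1 if the buffer was too small or formatting failed. */
int student_to_text(const struct STUDENT *s, char *buf, size_t size)
{
  int n;

  assert(s != NULL && buf != NULL);

  n = snprintf(buf, size, "%s,%s,%s,%d,%f\n",
               s->ID, s->last_name, s->first_name, s->age, s->GPA);
  return (n < 0 || (size_t) n >= size) ? -1 : n;
}

/* Parse one line back into a record (hypothetical helper).
   Returns 1 on success, 0 on failure. The field widths (7, 19, 19)
   leave room for the terminating NUL in each array. */
int student_from_text(const char *line, struct STUDENT *s)
{
  assert(line != NULL && s != NULL);

  memset(s, 0, sizeof *s);
  return sscanf(line, "%7[^,],%19[^,],%19[^,],%d,%lf",
                s->ID, s->last_name, s->first_name,
                &s->age, &s->GPA) == 5;
}
```

Notice that every field has to be converted to and from text, and that the parser has to scan each line character by character to find the commas. That conversion work is part of the cost of the text format.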
Let's look at the pros and cons of using text format:

Pros:

Cons:

It's already starting to look like text is going to be too limiting for a real world system, which it is. So, binary it is!

Keep in mind that, if the data is very limited (i.e. few records with few fields), storing the data as text is perfectly acceptable (and I prefer it, personally). However, we'd like to develop a system that can handle very large numbers of records with many fields in each record and we'd like to do this very efficiently. At some point, the textual representation will become a real pain to use. See uuencoding to understand the complexity.

Storing the Record in Binary

As a reminder, here's what our data looks like:
#define MAX_ID_LEN    8
#define MAX_NAME_LEN 20

struct STUDENT
{
  char ID[MAX_ID_LEN];           /* e.g. 101001 */
  char last_name[MAX_NAME_LEN];  /* e.g. Smith  */
  char first_name[MAX_NAME_LEN]; /* e.g. John   */
  int age;                       /* e.g. 22     */
  double GPA;                    /* e.g. 3.14   */
};
Here's some code to show how we might (inefficiently and incorrectly) write out the data in binary:
  /* Initialize a sample record */
struct STUDENT s = {"101001", "Faith", "Ian", 18, 3.14};

  /* Open file for binary/write */
FILE *outfile = fopen("student-record", "wb");

  /* Write all 5 fields of the record to the file */
fwrite(&s.ID, sizeof(char), MAX_ID_LEN, outfile);
fwrite(&s.last_name, sizeof(char), MAX_NAME_LEN, outfile);
fwrite(&s.first_name, sizeof(char), MAX_NAME_LEN, outfile);
fwrite(&s.age, sizeof(int), 1, outfile);
fwrite(&s.GPA, sizeof(double), 1, outfile);

  /* Close the file, flushing all buffers */
fclose(outfile);

In order to keep the sample code simple, very little error handling has been coded. In a Real World application, you would check that all of the I/O functions (fopen, fwrite, etc.) were successful. It is quite possible that they could fail (e.g. disk full, invalid filenames, etc.)
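To give an idea of what that checking might look like, here is a sketch of the same five writes with every call tested. The function name write_fields_checked and its 0/-1 return convention are assumptions made for this example, not part of the original code.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_ID_LEN    8
#define MAX_NAME_LEN 20

struct STUDENT
{
  char ID[MAX_ID_LEN];
  char last_name[MAX_NAME_LEN];
  char first_name[MAX_NAME_LEN];
  int age;
  double GPA;
};

/* Write the five fields one at a time, checking every call
   (hypothetical helper). Returns 0 on success, -1 on any failure. */
int write_fields_checked(const char *filename, const struct STUDENT *s)
{
  FILE *outfile;
  int ok;

  assert(filename != NULL && s != NULL);

  outfile = fopen(filename, "wb");
  if (outfile == NULL)
    return -1; /* e.g. invalid filename or no permission */

    /* fwrite returns the number of items written; fewer means an error */
  ok = fwrite(s->ID, sizeof(char), MAX_ID_LEN, outfile) == MAX_ID_LEN
    && fwrite(s->last_name, sizeof(char), MAX_NAME_LEN, outfile) == MAX_NAME_LEN
    && fwrite(s->first_name, sizeof(char), MAX_NAME_LEN, outfile) == MAX_NAME_LEN
    && fwrite(&s->age, sizeof(int), 1, outfile) == 1
    && fwrite(&s->GPA, sizeof(double), 1, outfile) == 1;

    /* fclose flushes buffered data, so it can fail too (e.g. disk full) */
  if (fclose(outfile) != 0)
    ok = 0;

  return ok ? 0 : -1;
}
```

A caller would test the result, for example: if (write_fields_checked("student-record", &s) != 0) and then report the problem.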

Now that our data is in a binary file, we can no longer simply view it with a text editor. We'll be looking at the files using a hex dump tool called dumpit (Windows, Mac, Linux). Example usage:
dumpit student-record
Output:
student-record:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 31 30 31 30 30 31 00 00  46 61 69 74 68 00 00 00   101001..Faith...
000010 00 00 00 00 00 00 00 00  00 00 00 00 49 61 6E 00   ............Ian.
000020 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000030 12 00 00 00 1F 85 EB 51  B8 1E 09 40               .......Q...@
To help you pick out the fields: the ID occupies offsets 0x00-0x07, the age 0x30-0x33, and the GPA 0x34-0x3B.

The first thing we see is that the size of the file is 60 bytes. That's because of the sizes of the fields:

        ID:  8 bytes
 last_name: 20 bytes
first_name: 20 bytes
       age:  4 bytes
+      GPA:  8 bytes
--------------------
            60 bytes
Add them all up and you get 60 bytes. Compare that with the 28 bytes of text (29 or 30 with the newline, depending on the OS) required to hold the data as text:
101001,Faith,Ian,18,3.140000<NL>
It seems that we are using more space than necessary and we are. However, this is only one of the cons of using binary data. And, as stated in the pros and cons above, this isn't always the case. It just happens to be the case for this small example. In the long run, the benefits of using binary data will outweigh the extra space required.

OK, that was... interesting. But remember I said this technique was inefficient AND incorrect? Let's make it more efficient (which will also make it correct at the same time.) This is where C structures really shine.

Instead of writing one field-at-a-time to the file, we can write the entire structure (record) at once. For a small structure like this, the benefits are not quite as significant. However, you can imagine a real world situation where you have hundreds of fields, with many of the fields being structures themselves. Reading/writing individual fields is not only tedious and inefficient, but very error prone.

Take a look at this structure and realize it would be nigh impossible to write out each field individually. You would have to know the exact layout of every field in all of the many nested structures. That's why we don't want to write individual fields!

Not only would it be very difficult and inefficient, but suppose later on you decide to change the type of one of the fields in the structure from, say, a 2-byte short integer to a 4-byte integer. You would have to find every single line in your program where you were reading or writing that field and change it. Good luck with that.

So, we can replace the 5 calls to fwrite above with a single call:
fwrite(&s, sizeof(struct STUDENT), 1, outfile);
And this is the dump of the file:
student-record2:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 31 30 31 30 30 31 00 00  46 61 69 74 68 00 00 00   101001..Faith...
000010 00 00 00 00 00 00 00 00  00 00 00 00 49 61 6E 00   ............Ian.
000020 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000030 12 00 00 00 00 00 00 00  1F 85 EB 51 B8 1E 09 40   ...........Q...@
The first thing you will notice is that the file is 4 bytes larger: it was 60 bytes and now it's 64 bytes. The 4 zero bytes at offsets 0x34-0x37, between the age and the GPA, are the reason. What gives?

Long story short: The extra space (padding) is for alignment. This is kind of an involved topic that you can read more about here: Structure Alignment. Briefly, for reasons of efficiency, members (fields) of a structure should be aligned on address boundaries that are multiples of the size of the data. This means that short integers should be on addresses that are evenly divisible by 2, integers and floats should be on addresses that are evenly divisible by 4, long integers (LP64 model), doubles and pointers (64-bit) should be on addresses that are evenly divisible by 8, etc. In order for the double in the structure above to be on the correct address, 4 extra bytes of "padding" are added after the integer so that the double "moves over" to the correct address.

That's why reading/writing individual fields of a structure is more difficult. The proper way is to always read the entire structure, which preserves this extra padding between fields. I showed you the "incorrect" way so that you would understand and appreciate why it was wrong and do it the correct way.
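If you want to see where your own compiler puts the padding, a small sketch like this one can report it using the standard offsetof macro. The helper names padding_before_GPA and print_layout are invented for this example. On a typical 64-bit system I would expect it to report 4 bytes of padding and a 64-byte structure, but the actual numbers depend on the compiler and platform.

```c
#include <assert.h>
#include <stddef.h> /* offsetof, size_t */
#include <stdio.h>  /* printf           */

#define MAX_ID_LEN    8
#define MAX_NAME_LEN 20

struct STUDENT
{
  char ID[MAX_ID_LEN];
  char last_name[MAX_NAME_LEN];
  char first_name[MAX_NAME_LEN];
  int age;
  double GPA;
};

/* Number of padding bytes the compiler placed between age and GPA */
size_t padding_before_GPA(void)
{
  size_t end_of_age = offsetof(struct STUDENT, age) + sizeof(int);

  assert(offsetof(struct STUDENT, GPA) >= end_of_age);
  return offsetof(struct STUDENT, GPA) - end_of_age;
}

/* Print where each field actually lives inside the structure */
void print_layout(void)
{
  struct STUDENT s;

  printf("        ID: offset %2lu, size %2lu\n",
         (unsigned long) offsetof(struct STUDENT, ID),
         (unsigned long) sizeof(s.ID));
  printf(" last_name: offset %2lu, size %2lu\n",
         (unsigned long) offsetof(struct STUDENT, last_name),
         (unsigned long) sizeof(s.last_name));
  printf("first_name: offset %2lu, size %2lu\n",
         (unsigned long) offsetof(struct STUDENT, first_name),
         (unsigned long) sizeof(s.first_name));
  printf("       age: offset %2lu, size %2lu\n",
         (unsigned long) offsetof(struct STUDENT, age),
         (unsigned long) sizeof(s.age));
  printf("       GPA: offset %2lu, size %2lu\n",
         (unsigned long) offsetof(struct STUDENT, GPA),
         (unsigned long) sizeof(s.GPA));
  printf("padding before GPA: %lu bytes\n",
         (unsigned long) padding_before_GPA());
  printf("sizeof(struct STUDENT): %lu bytes\n",
         (unsigned long) sizeof(struct STUDENT));
}
```

Calling print_layout() from main is all that's needed to see the layout on your system.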

Storing Multiple Records

OK, so we now know how to store a structure in the file, but we want to store many such structures (records). Here's more sample data (23 records) that we are going to store in the file:
#define MAX_ID_LEN 8
#define MAX_NAME_LEN 20

struct STUDENT
{
  char ID[MAX_ID_LEN];           /* e.g. 101001 */
  char last_name[MAX_NAME_LEN];  /* e.g. Smith  */
  char first_name[MAX_NAME_LEN]; /* e.g. John   */
  int age;                       /* e.g. 22     */
  double GPA;                    /* e.g. 3.14   */
};

struct STUDENT Students[] = {
  {"101001", "Faith",      "Ian",     18, 3.14},
  {"102001", "Tufnel",     "Nigel",   19, 3.25},
  {"103001", "Savage",     "Viv",     22, 3.87},
  {"104001", "Shrimpton",  "Mick",    25, 2.61},
  {"105001", "Besser",     "Joe",     19, 2.18},
  {"106001", "Smalls",     "Derek",   19, 2.64},
  {"107001", "St.Hubbins", "David",   20, 2.90},
  {"108001", "Fleckman",   "Bobbi",   20, 3.19},
  {"109001", "Eton-Hogg",  "Denis",   21, 3.83},
  {"110001", "Upham",      "Denny",   18, 3.31},
  {"111001", "McLochness", "Ross",    19, 1.98},
  {"112001", "Pudding",    "Ronnie",  20, 2.89},
  {"113001", "Schindler",  "Danny",   20, 3.41},
  {"114001", "Pettibone",  "Jeanine", 28, 3.33},
  {"115001", "Fame",       "Duke",    18, 2.99},
  {"116001", "Fufkin",     "Artie",   19, 2.90},
  {"117001", "DiBergi",    "Marty",   19, 3.75},
  {"118001", "Floyd",      "Pink",    20, 3.84},
  {"119001", "Zeppelin",   "Led",     19, 3.81},
  {"120001", "Mason",      "Nick",    18, 2.71},
  {"121001", "Wright",     "Richard", 19, 2.94},
  {"122001", "Waters",     "Roger",   19, 3.09},
  {"123001", "Gilmore",    "David",   20, 3.50}
};

int Count = sizeof(Students) / sizeof(*Students);
Here is the hex dump of the binary file. The size of the file is 1,472 bytes: there are 23 records and each record is 64 bytes, and 23 * 64 = 1,472.

We're going to make a slight addition to our file to help when reading back the information. As it stands now, the only way to know how many records are in the file is to read them all one at a time. To make it more efficient, we're going to store that count as the first integer in the file.

To create the file, just use a loop to write each structure to the file. This is what the first few records in the file look like with the count stored:

student-records:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 17 00 00 00 31 30 31 30  30 31 00 00 46 61 69 74   ....101001..Fait
000010 68 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   h...............
000020 49 61 6E 00 00 00 00 00  00 00 00 00 00 00 00 00   Ian.............
000030 00 00 00 00 12 00 00 00  00 00 00 00 1F 85 EB 51   ...............Q
000040 B8 1E 09 40 31 30 32 30  30 31 00 00 54 75 66 6E   ...@102001..Tufn
000050 65 6C 00 00 00 00 00 00  00 00 00 00 00 00 00 00   el..............
000060 4E 69 67 65 6C 00 00 00  00 00 00 00 00 00 00 00   Nigel...........
000070 00 00 00 00 13 00 00 00  00 00 00 00 00 00 00 00   ................
000080 00 00 0A 40 31 30 33 30  30 31 00 00 53 61 76 61   ...@103001..Sava
000090 67 65 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ge..............
0000A0 56 69 76 00 00 00 00 00  00 00 00 00 00 00 00 00   Viv.............
0000B0 00 00 00 00 16 00 00 00  00 00 00 00 F6 28 5C 8F   .............(\.
0000C0 C2 F5 0E 40 31 30 34 30  30 31 00 00 53 68 72 69   ...@104001..Shri
0000D0 6D 70 74 6F 6E 00 00 00  00 00 00 00 00 00 00 00   mpton...........
The first integer (4 bytes, at offset 0) is the count. The bytes 17 00 00 00 are the little-endian encoding of hexadecimal 17, which is 23 in decimal, the exact number of records in the file. Knowing the count will make it easy to allocate an array large enough to hold all of the records when we read all of them in later.

This is the hex dump of the binary file with the count field. The following code shows how to create the file:

void write_students(void)
{
  int i;

    /* Open file to write all records */
  FILE *outfile = fopen("student-records", "wb");

    /* Write the count first */
  fwrite(&Count, sizeof(int), 1, outfile);

    /* Write each record to the file */
  for (i = 0; i < Count; i++)
    fwrite(&Students[i], sizeof(struct STUDENT), 1, outfile);

  fclose(outfile);
}
However, even at this early stage in our development, we can do better. Instead of writing each structure one-at-a-time, we can write the entire array of structures at once.
void write_students(void)
{
    /* Open file to write all records */
  FILE *outfile = fopen("student-records", "wb");

    /* Write the count first */
  fwrite(&Count, sizeof(int), 1, outfile);

    /* Write entire array of structures at once */
  fwrite(Students, sizeof(struct STUDENT), Count, outfile);

  fclose(outfile);
}
I'm sure you're beginning to see the power and elegance of dealing with entire structures (or arrays of structures) using binary files. Literally one line of code to write thousands of structures (which could be thousands of bytes in size, with hundreds of fields) to the file.

Here's the actual binary file for you to experiment with: student-records. You won't be able to view it properly in a browser because it is just binary data. You'll have to download it and use the code above to view it, or view it with some kind of hex editor/viewer like dumpit.

Reading Records from the File

Now that we have all of the data stored in the file, it won't be long until we need to read it and/or modify the data. Reading is just as simple as writing. We can either read one record at a time or read all of the records into an array. Sample code for both:

Reading individual records:

void read_records(void)
{
  int count;
  int i;

    /* Open the binary file for reading */
  FILE *infile = fopen("student-records", "rb");

    /* Get count of records in the file */
  fread(&count, sizeof(int), 1, infile);

    /* Read each record and do something with it */
  for (i = 0; i < count; i++)
  {
    struct STUDENT s;
    fread(&s, sizeof(struct STUDENT), 1, infile);
    /* Do something with the record... */
  }

  fclose(infile);
}
Reading entire file into an array:
void read_records(void)
{
  int count;
  int i;
  struct STUDENT *students;

    /* Open the binary file for reading */
  FILE *infile = fopen("student-records", "rb");

    /* Get count of records in the file */
  fread(&count, sizeof(int), 1, infile);

    /* Allocate room for all of the records */
  students = (struct STUDENT *) malloc(count * sizeof(struct STUDENT));

    /* Read all records at once */
  fread(students, sizeof(struct STUDENT), count, infile);

  /* Do something with the records... */

    /* Print out each student record */
  for (i = 0; i < count; i++)
    print_student(&students[i]);

  free(students);
  fclose(infile);
}

Reminder: There is no error checking being done in this code. In a real application you would need to check that all of the library functions succeeded. (e.g. fopen, malloc, etc.)
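For comparison, here is a sketch of what the read-everything version might look like with the checks added. The name read_all_records and the convention of returning NULL on failure are assumptions for this example. It also rejects a file that is shorter than its count claims.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_ID_LEN    8
#define MAX_NAME_LEN 20

struct STUDENT
{
  char ID[MAX_ID_LEN];
  char last_name[MAX_NAME_LEN];
  char first_name[MAX_NAME_LEN];
  int age;
  double GPA;
};

/* Read the count and then all of the records (hypothetical helper).
   On success, returns a malloc'd array (the caller frees it) and stores
   the number of records in *count_out. Returns NULL on any failure. */
struct STUDENT *read_all_records(const char *filename, int *count_out)
{
  struct STUDENT *students = NULL;
  int count = 0;
  FILE *infile;

  assert(filename != NULL && count_out != NULL);

  infile = fopen(filename, "rb");
  if (infile == NULL)
    return NULL;

    /* A short read or a non-positive count means the file is unusable */
  if (fread(&count, sizeof(int), 1, infile) == 1 && count > 0)
  {
    students = (struct STUDENT *) malloc(count * sizeof(struct STUDENT));

      /* fread returns how many whole records it actually read */
    if (students != NULL &&
        fread(students, sizeof(struct STUDENT), count, infile) != (size_t) count)
    {
      free(students); /* File was shorter than the count claimed */
      students = NULL;
    }
  }

  fclose(infile);

  if (students != NULL)
    *count_out = count;

  return students;
}
```

The caller checks for NULL, uses the array, and calls free on it when finished.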

Let's do something that will need to be done on a regular basis: Update a student's GPA. These are the steps involved:
  1. Open the file for update (read/write binary).
  2. Locate the student's record by ID. (We call this value the key.)
  3. Read in the entire record.
  4. Modify the GPA.
  5. Write the entire record back out to the file.
  6. Close the file.
This is pretty straightforward and how we would modify any field within a student's record. However, there's a subtle point that needs to be made. According to the algorithm above, we are reading and writing the same open file at the same time. There are at least a couple of ways we can deal with this. Briefly:
  1. Open the file for read/binary
  2. Read in the record
  3. Close the file
  4. Modify the record (in memory)
  5. Open the file for write/binary
  6. Write the modified record
  7. Close the file
This method will work, with one caveat: opening an existing file for write/binary ("wb") truncates it, so the program would have to write every record back out, not just the modified one. You already have all of the information to do that. But C has a better way to deal with this: open the file for update (i.e. reading and writing). This is what the first algorithm described above does.

For this example, let's change Artie Fufkin's GPA from 2.90 to 3.25. Artie Fufkin's ID is 116001. This function takes an ID and GPA and updates the record in the file. A call to the function would look like this:

update_GPA("116001", 3.25);
This is the function:
/* Find student record with ID and modify GPA */
void update_GPA(const char *ID, double newGPA)
{
  int count;

    /* Open the file for update (read/write) binary */
  FILE *inoutfile = fopen("student-records", "rb+");

    /* Get count of records in the file */
  fread(&count, sizeof(int), 1, inoutfile);

    /* Search for the specified record by ID */
  while (count--)
  {
    long position;    /* The current position in the file */
    struct STUDENT s; /* The record read/modified         */

      /* Get current position in the file so we can return to it */
    position = ftell(inoutfile);

      /* Get the next record */
    fread(&s, sizeof(struct STUDENT), 1, inoutfile);

      /* If the student's record was found, update it */
    if (!strcmp(ID, s.ID))
    {
        /* Update GPA */
      s.GPA = newGPA;
      
        /* Move back to correct position in the file */
      fseek(inoutfile, position, SEEK_SET);

        /* Write out the updated record */
      fwrite(&s, sizeof(struct STUDENT), 1, inoutfile);

        /* Done */
      fclose(inoutfile);

      return;
    }
  }

    /* Record wasn't found */
  printf("Student ID: %s not found.\n", ID);

    /* Clean up */
  fclose(inoutfile);
}
This is Artie Fufkin's original record. The current GPA (2.90) is the 8 bytes starting at offset 0x3FC (33 33 33 33 33 33 07 40):
0003C0 85 EB 07 40 31 31 36 30  30 31 00 00 46 75 66 6B   ...@116001..Fufk
0003D0 69 6E 00 00 00 00 00 00  00 00 00 00 00 00 00 00   in..............
0003E0 41 72 74 69 65 00 00 00  00 00 00 00 00 00 00 00   Artie...........
0003F0 00 00 00 00 13 00 00 00  00 00 00 00 33 33 33 33   ............3333
000400 33 33 07 40 31 31 37 30  30 31 00 00 44 69 42 65   33.@117001..DiBe
This is Artie Fufkin's updated record. The new GPA (3.25) is the 8 bytes starting at offset 0x3FC (00 00 00 00 00 00 0A 40):
0003C0 85 EB 07 40 31 31 36 30  30 31 00 00 46 75 66 6B   ...@116001..Fufk
0003D0 69 6E 00 00 00 00 00 00  00 00 00 00 00 00 00 00   in..............
0003E0 41 72 74 69 65 00 00 00  00 00 00 00 00 00 00 00   Artie...........
0003F0 00 00 00 00 13 00 00 00  00 00 00 00 00 00 00 00   ................
000400 00 00 0A 40 31 31 37 30  30 31 00 00 44 69 42 65   ...@117001..DiBe
With hexadecimal numbers, the values are not obvious. Not only are the 64-bit doubles displayed in hexadecimal, but they are in little-endian order. Here are some conversions with the binary using IEEE-754 notation:

Original GPA:

Decimal: 2.90
 Binary: 0100000000000111001100110011001100110011001100110011001100110011
    Hex: 40 07 33 33 33 33 33 33
New GPA:
Decimal: 3.25
 Binary: 0100000000001010000000000000000000000000000000000000000000000000
    Hex: 40 0A 00 00 00 00 00 00
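If you'd like to check conversions like these yourself, here is a small sketch that prints the bit pattern of a double as 16 hex digits, most significant byte first (the same order as the Hex lines above). The function name double_to_hex is made up for this example, and the code assumes a 64-bit IEEE-754 double and a 64-bit unsigned long long.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Store the 64-bit pattern of d in out as 16 uppercase hex digits plus a
   terminating NUL, most significant byte first (hypothetical helper). */
void double_to_hex(double d, char out[17])
{
  unsigned long long bits;

    /* This sketch only works if both types are the same size (8 bytes) */
  assert(sizeof(bits) == sizeof(d));

    /* Copy the raw bytes; this does not convert the value */
  memcpy(&bits, &d, sizeof(bits));
  sprintf(out, "%016llX", bits);
}
```

For example, double_to_hex(3.25, buffer) leaves the string 400A000000000000 in buffer. Reverse the order of the bytes and you get what the hex dump of a little-endian file shows.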
In a nutshell, that's how record-based I/O works. I leave it as an exercise for the reader to add more functionality like modifying other fields, adding records, deleting records, etc. This brief tutorial has given you all you need to get started.
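As a starting point for that exercise, here is one possible sketch of adding a record to the count-then-records file. The function name append_record is hypothetical, and the code assumes the file ends exactly where the last record ends. It uses the same update mode ("rb+") as update_GPA: write the new record at the end, then go back and rewrite the count.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_ID_LEN    8
#define MAX_NAME_LEN 20

struct STUDENT
{
  char ID[MAX_ID_LEN];
  char last_name[MAX_NAME_LEN];
  char first_name[MAX_NAME_LEN];
  int age;
  double GPA;
};

/* Append one record and update the count at the front of the file
   (hypothetical helper). Returns 0 on success, -1 on any failure. */
int append_record(const char *filename, const struct STUDENT *s)
{
  FILE *f;
  int count;
  int ok = 0;

  assert(filename != NULL && s != NULL);

    /* Open the existing file for update (read/write) binary */
  f = fopen(filename, "rb+");
  if (f == NULL)
    return -1;

    /* Read the old count, move to the end, write the new record */
  if (fread(&count, sizeof(int), 1, f) == 1 &&
      fseek(f, 0L, SEEK_END) == 0 &&
      fwrite(s, sizeof(struct STUDENT), 1, f) == 1)
  {
      /* Move back to the front and overwrite the count */
    count++;
    ok = fseek(f, 0L, SEEK_SET) == 0 &&
         fwrite(&count, sizeof(int), 1, f) == 1;
  }

  if (fclose(f) != 0)
    ok = 0;

  return ok ? 0 : -1;
}
```

Note that if the program fails between the two writes, the count and the actual number of records will disagree. A real application would have to decide how to detect and repair that.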

Notes:

More Efficient File I/O

The previous examples worked just fine, but as the file grows with more data, it soon becomes inefficient to have to read every record each time we update a single record. Like many things, there are multiple ways to solve this "problem". The way we're going to address it is by using an index to the data.

By placing an index at the front of the file, we can more quickly locate where the record is further in the file. This is what the new format of the file looks like:

student-records-indexed:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 17 00 00 00 31 30 31 30  30 31 00 00 31 30 32 30   ....101001..1020
000010 30 31 00 00 31 30 33 30  30 31 00 00 31 30 34 30   01..103001..1040
000020 30 31 00 00 31 30 35 30  30 31 00 00 31 30 36 30   01..105001..1060
000030 30 31 00 00 31 30 37 30  30 31 00 00 31 30 38 30   01..107001..1080
000040 30 31 00 00 31 30 39 30  30 31 00 00 31 31 30 30   01..109001..1100
000050 30 31 00 00 31 31 31 30  30 31 00 00 31 31 32 30   01..111001..1120
000060 30 31 00 00 31 31 33 30  30 31 00 00 31 31 34 30   01..113001..1140
000070 30 31 00 00 31 31 35 30  30 31 00 00 31 31 36 30   01..115001..1160
000080 30 31 00 00 31 31 37 30  30 31 00 00 31 31 38 30   01..117001..1180
000090 30 31 00 00 31 31 39 30  30 31 00 00 31 32 30 30   01..119001..1200
0000A0 30 31 00 00 31 32 31 30  30 31 00 00 31 32 32 30   01..121001..1220
0000B0 30 31 00 00 31 32 33 30  30 31 00 00 31 30 31 30   01..123001..1010
0000C0 30 31 00 00 46 61 69 74  68 00 00 00 00 00 00 00   01..Faith.......
0000D0 00 00 00 00 00 00 00 00  49 61 6E 00 00 00 00 00   ........Ian.....
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 12 00 00 00   ................
0000F0 00 00 00 00 1F 85 EB 51  B8 1E 09 40 31 30 32 30   .......Q...@1020
000100 30 31 00 00 54 75 66 6E  65 6C 00 00 00 00 00 00   01..Tufnel......
000110 00 00 00 00 00 00 00 00  4E 69 67 65 6C 00 00 00   ........Nigel...

[rest of the file...]
Here is sample code that created the indexed file:
void write_students_indexed(void)
{
  int i;

    /* Open file to write all records */
  FILE *outfile = fopen("student-records-indexed", "wb");

    /* Write the count first */
  fwrite(&Count, sizeof(int), 1, outfile);

    /* Then write each ID to the file */
  for (i = 0; i < Count; i++)
    fwrite(&Students[i].ID, MAX_ID_LEN, 1, outfile);

    /* Finally, write all of the records to the file */
  fwrite(Students, sizeof(struct STUDENT), Count, outfile);

  fclose(outfile);
}
Now, when we look up a record, we just have to scan the index and then use that to locate the actual record. We still have to read in the entire index, but that is likely to be significantly less (by orders of magnitude) data than reading in the entire file.

Let's write a function that, given an ID, displays the student record. We'll write an entire program that will accept an ID on the command line and display that student record. Here it is in its entirety: (lookup-student.c)

#include <stdio.h>   /* FILE *, printf, fread, fopen, fclose */
#include <stdlib.h>  /* malloc                               */
#include <string.h>  /* strcmp                               */
#include "student.h" /* Student struct                       */

void print_student(const struct STUDENT *student)
{
  printf("%8s: %s, %s (Age: %i, GPA: %3.2f)\n", 
         student->ID,
         student->last_name,
         student->first_name,
         student->age,
         student->GPA);
}

void display_record(const char *ID)
{
  int i;       /* Loop counter                      */
  int count;   /* The number of records in the file */
  char *index; /* All of the student IDs            */

    /* Open the binary file for reading (hard-coded filename!) */
  FILE *infile = fopen("student-records-indexed", "rb");

    /* Get count of records in the file */
  fread(&count, sizeof(int), 1, infile);

    /* Allocate room for the index */
  index = (char *) malloc(count * sizeof(char) * MAX_ID_LEN);

    /* Read in the entire index */
  fread(index, MAX_ID_LEN, count, infile);

  for (i = 0; i < count; i++)
  {
      /* Does this ID match? */
    if (!strcmp(ID, index + i * MAX_ID_LEN))
    {
      struct STUDENT s;

        /* Calculate offset and move file pointer to that point */
      long position = sizeof(int) + (count * MAX_ID_LEN) + (i * sizeof(struct STUDENT));
      fseek(infile, position, SEEK_SET);

        /* Read record at current position */
      fread(&s, sizeof(struct STUDENT), 1, infile);

        /* Display record */
      print_student(&s);

        /* Clean up */
      fclose(infile);
      free(index);

        /* Done */
      return;
    }
  }

    /* Record wasn't found */
  printf("Student ID: %s not found.\n", ID);

    /* Clean up */
  free(index);
  fclose(infile);
}

int main(int argc, char **argv)
{
  const char *ID = "108001";
  if (argc > 1)
    ID = argv[1];

  display_record(ID);
  return 0;
}
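The two key steps in display_record (searching the index and calculating the offset) can be pulled out into small functions that are easy to test without a file. This is only a sketch and the names find_in_index and indexed_record_offset are invented here. As a worked example, with a 4-byte int and 64-byte records (as in the dumps above) and 23 records, the first record starts at 4 + 23 * 8 = 188 (0xBC), and the record for 108001 (index 7) starts at 188 + 7 * 64 = 636 (0x27C).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ID_LEN    8
#define MAX_NAME_LEN 20

struct STUDENT
{
  char ID[MAX_ID_LEN];
  char last_name[MAX_NAME_LEN];
  char first_name[MAX_NAME_LEN];
  int age;
  double GPA;
};

/* Return the position of ID within an in-memory index of count entries
   (each entry is MAX_ID_LEN bytes), or -1 if it isn't there.
   (Hypothetical helper.) */
int find_in_index(const char *index, int count, const char *ID)
{
  int i;

  assert(index != NULL && ID != NULL);

  for (i = 0; i < count; i++)
    if (strcmp(ID, index + i * MAX_ID_LEN) == 0)
      return i;

  return -1;
}

/* Byte offset of record i in the indexed file: skip the count,
   then the whole index, then i records. (Hypothetical helper.) */
long indexed_record_offset(int count, int i)
{
  return (long) (sizeof(int) +
                 (size_t) count * MAX_ID_LEN +
                 (size_t) i * sizeof(struct STUDENT));
}
```

Given these, the body of the loop in display_record reduces to: find the position with find_in_index, pass the result of indexed_record_offset to fseek, and read one record.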
Some points to make:

Additional issues and possible changes:

Summary and Other Notes

As long as we use the same compiled program (executable) to read and write the files, everything will work out fine. However, if we compile the code using a different compiler, we could have problems. Specifically, the alignment of the fields in the structures is dependent on the compiler.

This is not an insurmountable problem and is well-known to software developers that deal with binary files. The technique used to make sure that all compilers employ the same alignment is called structure packing. You can read my introduction to it here: Structure Alignment.

Also, check out this excellent article: The Lost Art of Structure Packing.

One final note: Different CPUs order multibyte data types (e.g. integers, doubles) differently (little-endian vs. big-endian). This would also need to be taken into account if you were to use the program/code with different computer architectures.
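One common way to take byte order into account is to convert each multibyte value to a fixed byte order yourself before writing it, instead of writing the in-memory representation. Here is a minimal sketch for a 32-bit count stored in little-endian order, which matches the 17 00 00 00 seen in the dumps above. The function names encode_u32_le and decode_u32_le are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Store the low 32 bits of value in out[0..3], least significant byte
   first, regardless of the CPU's own byte order (hypothetical helper). */
void encode_u32_le(unsigned long value, unsigned char out[4])
{
  assert(out != NULL);

  out[0] = (unsigned char) ( value        & 0xFF);
  out[1] = (unsigned char) ((value >>  8) & 0xFF);
  out[2] = (unsigned char) ((value >> 16) & 0xFF);
  out[3] = (unsigned char) ((value >> 24) & 0xFF);
}

/* Rebuild the value from 4 bytes stored least significant byte first
   (hypothetical helper). */
unsigned long decode_u32_le(const unsigned char in[4])
{
  assert(in != NULL);

  return  (unsigned long) in[0]
       | ((unsigned long) in[1] <<  8)
       | ((unsigned long) in[2] << 16)
       | ((unsigned long) in[3] << 24);
}
```

The 4 bytes produced by encode_u32_le would be written with fwrite in place of the raw int, and read back with fread followed by decode_u32_le. The price is that we're back to handling individual fields, which is the trade-off for a file that can move between architectures.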

Here are more links to the file I/O functions used: