Mead's Guide to getopt

Quick links:
  1. Introduction
  2. Using getopt
  3. Option arguments
  4. Unknown options and missing option arguments
  5. Non-option arguments
  6. Optional option arguments
  7. Long options
  8. Long options and flags
  9. Summary
  10. Other uses for command line arguments

Introduction

Most command line programs can have their behavior altered by supplying options to the command when it is invoked. Consider a typical command line to compile a C program:
gcc -Wall -Wextra main.c foo.c bar.c -O -o program -ansi -pedantic -Werror 
Before gcc can even begin compiling the source files, it first must parse the entire command line so as to understand exactly how the programmer expects the compiler to behave. Imagine having to deal with as many options as gcc accepts: the summary of gcc options alone lists hundreds of them. And don't believe for a minute that options are only for command-line apps. All "real" games have hundreds of them, too.

It isn't hard to imagine that a program that accepts a lot of arguments, options, and option arguments will need to include a lot of code just to parse the command line. Writing the code yourself is certainly possible, but doing that for every program is tedious, repetitive, and inefficient.

Fortunately, there are many libraries and APIs (Application Programming Interfaces) that are designed specifically for that task. One of those APIs is a function called getopt.

Before looking at getopt, a brief review of command line arguments is in order.


Usually, we see main prototyped as:

int main(void);
However, main is sort of overloaded to take parameters as well:
/* These prototypes are the same */
int main(int argc, char *argv[]);
int main(int argc, char **argv);
As we've seen with arrays as parameters, the declarations above are equivalent. This trivial program simply prints out each argument:
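#include <stdio.h> /* printf */

int main(int argc, char *argv[])
{
  int i;

  /* argv[0] is the program name; argv[1] is the first real argument */
  for (i = 0; i < argc; i++)
    printf("arg%i = %s\n", i, argv[i]);

  return 0;
}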
If our program were named foo and we were to invoke the program like this:
foo one two three 911
we would see something like this printed out:
arg0 = foo
arg1 = one
arg2 = two
arg3 = three
arg4 = 911
Another example:
foo one "two three" four 911
arg0 = foo
arg1 = one
arg2 = two three
arg3 = four
arg4 = 911
Another way of printing the arguments using pointers instead of subscripts:
#include <stdio.h> /* printf */

int main(int argc, char **argv)
{
  (void)argc; /* not needed: argv[argc] is always a NULL pointer */

  while (*argv)
    printf("%s\n", *argv++);

  return 0;
}

Diagram of the arguments when invoked as:
foo one "two three" four 911

The character between the strings two and three in the diagram above is the space character, ASCII 32. Also, notice that the double-quote characters are not passed to the program.

Note: Because argv is an array of pointers to characters (strings), you can only pass strings to a program. If you want to use the parameter as a number, you will have to convert it from a string to a number yourself. See the Data Conversion section in the C Runtime Library, specifically the atoi function.

Using getopt

The getopt function is prototyped like this (in getopt.h):
int getopt(int argc, char *const argv[], const char *optstring);
The first two parameters are exactly like the parameters to main and are usually just passed from main to getopt as-is.

According to the documentation for getopt, you are supposed to include unistd.h. However, I've recently discovered that this doesn't work on all systems. Using getopt.h instead of unistd.h appears to fix the problem.

There are also a few global variables defined within the API:

extern char *optarg;
extern int optind, opterr, optopt;
The interesting part of the function is the last parameter:
const char *optstring
This string is an encoding (of sorts) that contains all of the single-letter options that a program wants to accept. For example, if the program wants to accept these options:
-a   -b   -X
then optstring would simply contain the string: "abX" (The order doesn't matter, although the characters are case-sensitive.) The program would contain code similar to this:
#include <stdio.h>  /* printf     */
#include <getopt.h> /* getopt API */

int main(int argc, char *argv[])
{
  int opt;

  while ((opt = getopt(argc, argv, "abX")) != -1) 
  {
     switch (opt) 
     {
      case 'a':
        printf("Option a was provided\n");
        break;
      case 'b':
        printf("Option b was provided\n");
        break;
      case 'X':
        printf("Option X was provided\n");
        break;
     }
  }
  
  return 0;
}
Sample runs (assume the program has been compiled to a.out):
Command line options    Output
./a.out -b              Option b was provided
./a.out -b -X -a        Option b was provided
                        Option X was provided
                        Option a was provided
./a.out -bXa            Option b was provided
                        Option X was provided
                        Option a was provided
./a.out -a b X          Option a was provided
./a.out -t              ./a.out: invalid option -- 't'
./a.out a b c           (no output)

Of course, in a real program, the programmer would actually do something with the options rather than just print out the information. However, this demonstrates how the getopt function behaves.

Option Arguments

Below is a simple gcc compile command:
gcc foo.c -o bar.o -c
This command will compile only (-c) the file foo.c into bar.o (-o bar.o). The -o option is different from the -c option in that it requires an argument itself. If you want your options to accept an argument, you must provide getopt with that information.

Let's assume we want the a and b options to accept an argument. This is how optstring would look now:

"a:b:X"
The colon after the letter tells getopt to expect an argument after the option. Now the code looks like this (partial):
while ((opt = getopt(argc, argv, "a:b:X")) != -1) 
{
   switch (opt) 
   {
    case 'a':
      printf("Option a has arg: %s\n", optarg);
      break;
    case 'b':
      printf("Option b has arg: %s\n", optarg);
      break;
    case 'X':
      printf("Option X was provided\n");
      break;
   }
}
If the option requires an argument, the external variable optarg is a pointer to the argument. Recall these global variables defined in getopt.h:
extern char *optarg;
extern int optind, opterr, optopt;

Sample runs (assume the program has been compiled to a.out):

Command line options        Output
./a.out -a one -b two -X    Option a has arg: one
                            Option b has arg: two
                            Option X was provided
./a.out -aone -btwo         Option a has arg: one
                            Option b has arg: two
./a.out -X -a               Option X was provided
                            ./a.out: option requires an argument -- 'a'
./a.out -a -X               Option a has arg: -X

When using single-letter options that require an argument, as in -a and -b above, white space between the option and the argument is optional. The first string following the option will be used as the argument (regardless of whether or not it starts with a minus sign).

Unknown Options and Missing Option Arguments

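By default, getopt prints a message for each error that it encounters. This is sometimes helpful, but often we'd rather deal with the errors ourselves. To disable the automatic error printing, simply put a colon as the first character in optstring (with a leading colon, getopt also returns : instead of ? when a required argument is missing):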
":a:b:X"
Now, we need to handle two error conditions:
  1. The user has provided an unknown option, e.g.: ./a.out -t
  2. The user has failed to provide a required argument for an option, e.g.: ./a.out -a
The code (partial) looks like this:
while ((opt = getopt(argc, argv, ":a:b:X")) != -1) 
{
   switch (opt) 
   {
    case 'a':
      printf("Option a has arg: %s\n", optarg);
      break;
    case 'b':
      printf("Option b has arg: %s\n", optarg);
      break;
    case 'X':
      printf("Option X was provided\n");
      break;
    case '?':
      printf("Unknown option: %c\n", optopt);
      break;
    case ':':
      printf("Missing arg for %c\n", optopt);
      break;
   }
}
Recall these global variables from getopt.h:
extern char *optarg;
extern int optind, opterr, optopt;
Sample runs (assume the program has been compiled to a.out):
Command line options          Output
./a.out -a                    Missing arg for a
./a.out -t                    Unknown option: t
./a.out -a one -t -X -b       Option a has arg: one
                              Unknown option: t
                              Option X was provided
                              Missing arg for b
./a.out -a one,two,three      Option a has arg: one,two,three
./a.out -a "one two three"    Option a has arg: one two three
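A note about the last two examples above: getopt does not interpret the argument in any way. In both cases the entire string, commas or spaces included, is handed to the program in optarg, and it is up to the program to take it apart if it wants a list of values.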

Non-Option Arguments

Again, a sample gcc compile command:
gcc -Wall -Wextra main.c foo.c bar.c -O -o program -ansi -pedantic -Werror
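This command line has:
  1. One command: gcc
  2. Seven options: -Wall, -Wextra, -O, -o, -ansi, -pedantic, -Werror
  3. One option argument: program (the argument to -o)
  4. Three non-option arguments: main.c, foo.c, bar.c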

Before going any further, make sure that you really understand what every string in the example above means. In other words, which strings are commands, options, arguments, or option arguments and why? This will tell you if you understand how the command line works.

Up until now, we've only focused on the options. What about the real arguments to gcc? getopt behaves in two different ways, as this code demonstrates:
#include <stdio.h>  /* printf */
#include <getopt.h> /* getopt */

int main(int argc, char *argv[])
{
  int opt;

  while ((opt = getopt(argc, argv, ":a:b:X")) != -1) 
  {
     switch (opt) 
     {
      case 'a':
        printf("Option a has arg: %s\n", optarg);
        break;
      case 'b':
        printf("Option b has arg: %s\n", optarg);
        break;
      case 'X':
        printf("Option X was provided\n");
        break;
      case '?':
        printf("Unknown option: %c\n", optopt);
        break;
      case ':':
        printf("Missing arg for %c\n", optopt);
        break;
     }
  }

  /* Get all of the non-option arguments */
  if (optind < argc) 
  {
    printf("Non-option args: ");
    while (optind < argc)
      printf("%s ", argv[optind++]);
    printf("\n");
  }
  
  return 0;
}
Again, the global variables from getopt.h:
extern char *optarg;
extern int optind, opterr, optopt;
Sample runs (assume the program has been compiled to a.out):
Command line options          Output
./a.out x -a one y -X z       Option a has arg: one
                              Option X was provided
                              Non-option args: x y z
./a.out x y z -a one -b two   Option a has arg: one
                              Option b has arg: two
                              Non-option args: x y z
As you can see, the default behavior for GNU getopt is to move all of the non-option arguments to the end of the array. (This rearranging is a GNU extension; a strictly POSIX getopt stops parsing at the first non-option argument instead.) When getopt has no more options to parse, it returns -1 and the while loop ends. The external variable optind is used as an index into argv so we can retrieve the remaining arguments.

If you want to have getopt parse and return the non-option arguments in the while loop (in the order specified), you must direct it to do so by putting a minus (-) in front of the optstring:

"-:a:b:X"

Note: When supplying both - and : at the front of the string, the minus must come first.

Sometimes, having getopt rearrange the non-option arguments is problematic, especially when some of the options apply only to specific non-option arguments.

Sample code:

#include <stdio.h>  /* printf */
#include <getopt.h> /* getopt */

int main(int argc, char *argv[])
{
  int opt;

  while ((opt = getopt(argc, argv, "-:a:b:X")) != -1) 
  {
     switch (opt) 
     {
      case 'a':
        printf("Option a has arg: %s\n", optarg);
        break;
      case 'b':
        printf("Option b has arg: %s\n", optarg);
        break;
      case 'X':
        printf("Option X was provided\n");
        break;
      case '?':
        printf("Unknown option: %c\n", optopt);
        break;
      case ':':
        printf("Missing arg for %c\n", optopt);
        break;
      case 1:
        printf("Non-option arg: %s\n", optarg);
        break;
     }
  }
  
  return 0;
}
Sample runs:
Command line options                Output
./a.out x y z -a foo                Non-option arg: x
                                    Non-option arg: y
                                    Non-option arg: z
                                    Option a has arg: foo
./a.out x -a foo y -b bar z -X w    Non-option arg: x
                                    Option a has arg: foo
                                    Non-option arg: y
                                    Option b has arg: bar
                                    Non-option arg: z
                                    Option X was provided
                                    Non-option arg: w
Putting it all together using this call:
getopt(argc, argv, "-:a:b:X")
and running the program:
./a.out -t x -a foo -M y -b bar z -X w -b
Output:
Unknown option: t
Non-option arg: x
Option a has arg: foo
Unknown option: M
Non-option arg: y
Option b has arg: bar
Non-option arg: z
Option X was provided
Non-option arg: w
Missing arg for b

Optional Option Arguments

Sometimes an option may not require an argument, but it allows an optional argument. There is a syntax for that as well (it is a GNU extension): two colons (::) must follow the letter in optstring:
":a::b:X"
In the string above, option a will accept an optional argument. (Option b requires an argument.) Usage:
Option syntax   Meaning
-a              OK, no argument provided (optional).
-afoo           OK, argument is foo.
-a foo          Wrong, no space allowed with optional arguments.
                foo is considered a non-option argument.
-bfoo           OK, argument is foo (required).
-b foo          OK, argument is foo (required).
-b              Wrong, option b requires an argument.
Since the argument is optional, you will have to check the value of optarg to see if it is a valid pointer (otherwise, it's NULL).

Code sample:

while ((opt = getopt(argc, argv, "-:a::b:X")) != -1) 
{
   switch (opt) 
   {
    case 'a':
      printf("Option a has arg: %s\n", optarg ? optarg : "(none)");
      break;
    case 'b':
      printf("Option b has arg: %s\n", optarg);
      break;
    case 'X':
      printf("Option X was provided\n");
      break;
    case '?':
      printf("Unknown option: %c\n", optopt);
      break;
    case ':':
      printf("Missing arg for %c\n", optopt);
      break;
    case 1:
      printf("Non-option arg: %s\n", optarg);
      break;
   }
}
Sample output:
Command line                Output
./a.out -a -b bar -X        Option a has arg: (none)
                            Option b has arg: bar
                            Option X was provided
./a.out -afoo -b bar -X     Option a has arg: foo
                            Option b has arg: bar
                            Option X was provided
./a.out -a foo -b bar -X    Option a has arg: (none)
                            Non-option arg: foo
                            Option b has arg: bar
                            Option X was provided

Long Options

There are some problems with short, single-character options: there are only so many letters to go around, and a single letter says very little about what the option does. Just look at the number of options for rsync and wget. They have tons of options. Having the ability to use more than just a single character gives us virtually unlimited options. An example of a long option in diff:
diff file1.txt file2.txt --strip-trailing-cr
There are many options for diff. So, not only do longer names give us a practically unlimited supply of option names, but the names can be self-explanatory. Consider this:
diff file1.txt file2.txt -Z
Do you know what -Z does? Probably not, unless you are an expert with diff. How about this:
diff file1.txt file2.txt --ignore-trailing-space
I'm pretty sure that most programmers know what this does, or have a good idea of what it does. (-Z is short for --ignore-trailing-space. Who knew?)

This is the prototype for the long option function:

int getopt_long(int argc, char * const argv[], const char *optstring, 
                const struct option *longopts, int *longindex);
The first three parameters are the same as in the short version. The last two parameters are new. This is what an option struct looks like:
struct option 
{
  const char *name;    /* name without -- in front                                  */
  int         has_arg; /* one of: no_argument, required_argument, optional_argument */
  int        *flag;    /* how the results are returned                              */
  int         val;     /* the value to return or to put in flag                     */
};
An example (from the man page):
#include <getopt.h> /* getopt */
#include <stdlib.h> /* exit   */
#include <stdio.h>  /* printf */

int main(int argc, char **argv)
{
  int c;

  while (1) 
  {
      int option_index = 0;
      static struct option long_options[] = 
      {
          {"add",     required_argument, NULL,  0 },
          {"append",  no_argument,       NULL,  0 },
          {"delete",  required_argument, NULL,  0 },
          {"verbose", no_argument,       NULL,  0 },
          {"create",  required_argument, NULL,  0 },
          {"file",    optional_argument, NULL,  0 },
          {NULL,      0,                 NULL,  0 }
      };

      c = getopt_long(argc, argv, "-:abc:d::", long_options, &option_index);
      if (c == -1)
           break;

      switch (c) 
      {
        case 0:
          printf("long option %s", long_options[option_index].name);
          if (optarg)
             printf(" with arg %s", optarg);
          printf("\n");
          break;

        case 1:
          printf("regular argument '%s'\n", optarg); /* non-option arg */
          break;

        case 'a':
          printf("option a\n");
          break;

        case 'b':
          printf("option b\n");
          break;

        case 'c':
          printf("option c with value '%s'\n", optarg);
          break;

        case 'd':
          printf("option d with value '%s'\n", optarg ? optarg : "NULL");
          break;

        case '?':
          printf("Unknown option %c\n", optopt);
          break;

        case ':':
          printf("Missing option for %c\n", optopt);
          break;

        default:
          printf("?? getopt returned character code 0%o ??\n", c);
      }
  }

  return 0;
}
Sample output:
Command line                                  Output
./a.out --delete=foo -c5 --add=yes --append   long option delete with arg foo
                                              option c with value '5'
                                              long option add with arg yes
                                              long option append
./a.out --d=foo --ad=yes --ap --a             long option delete with arg foo
                                              long option add with arg yes
                                              long option append
                                              Unknown option
./a.out --create=5 --create 6 --c=7 --c 8     long option create with arg 5
                                              long option create with arg 6
                                              long option create with arg 7
                                              long option create with arg 8
./a.out --file=5 --file 6 --file7             long option file with arg 5
                                              long option file
                                              regular argument '6'
                                              Unknown option

With the long option, you don't have to provide the entire string. If getopt_long can deduce what the option is with only a few characters, it will match that option. In the example above, --d matches --delete because it is the only option that begins with --d. In the case of --add and --append, two characters are necessary to disambiguate between them because they both begin with --a. That is why the bare --a in the second run is reported as an unknown option: it is ambiguous.

Another example (partial). This example shows how to associate a short option with the long option. Note the fourth field of the struct is not 0, but is the associated short option:
while (1) 
{
    int option_index = 0;
    static struct option long_options[] = 
    {
        {"add",     required_argument, NULL,  'a'},
        {"append",  no_argument,       NULL,  'p'},
        {"delete",  required_argument, NULL,  'd'},
        {"verbose", no_argument,       NULL,  'v'},
        {"create",  required_argument, NULL,  'c'},
        {"file",    optional_argument, NULL,  'f'},
        {NULL,      0,                 NULL,    0}
    };

    c = getopt_long(argc, argv, "-:a:pd:vc:f::", long_options, &option_index);
    if (c == -1)
         break;

    switch (c) 
    {
      case 0:
        printf("long option %s", long_options[option_index].name);
        if (optarg)
           printf(" with arg %s", optarg);
        printf("\n");
        break;

      case 1:
        printf("regular argument '%s'\n", optarg);
        break;

      case 'a':
        printf("option a with value '%s'\n", optarg);
        break;

      case 'p':
        printf("option p\n");
        break;

      case 'd':
        printf("option d with value '%s'\n", optarg);
        break;

      case 'v':
        printf("option v\n");
        break;

      case 'c':
        printf("option c with value '%s'\n", optarg);
        break;

      case 'f':
        printf("option f with value '%s'\n", optarg ? optarg : "NULL");
        break;

      case '?':
        printf("Unknown option %c\n", optopt);
        break;

      case ':':
        printf("Missing option for %c\n", optopt);
        break;

      default:
        printf("?? getopt returned character code 0%o ??\n", c);
    }
}
Sample output. Notice that there are no longer any long options returned from getopt as they are all short options, even if the user provided a long option:
Command line                                  Output
./a.out --delete=foo -c5 --add=yes --append   option d with value 'foo'
                                              option c with value '5'
                                              option a with value 'yes'
                                              option p
./a.out --d=foo --ad=yes --ap                 option d with value 'foo'
                                              option a with value 'yes'
                                              option p
./a.out --create=5 --create 6 --c=7 --c 8     option c with value '5'
                                              option c with value '6'
                                              option c with value '7'
                                              option c with value '8'
./a.out --file=5 --file 6 --file7             option f with value '5'
                                              option f with value 'NULL'
                                              regular argument '6'
                                              Unknown option
Suppose we remove the v from the string, changing this:
"-:a:pd:vc:f::"
to this:
"-:a:pd:c:f::"
We still have this in our long options structure:
{"verbose", no_argument,       NULL,  'v'},
Running this:
./a.out -v
produces this:
Unknown option v
Running this:
./a.out --verbose
produces this:
?? getopt returned character code 0166 ??
The character code is the ASCII code (shown here in octal) for the letter v.

Long Options and Flags

Sometimes, we just want a true or false value. We want to enable or disable something.

The getopt_long function provides a shorthand notation for setting flags, as this example shows:

#include <getopt.h> /* getopt */
#include <stdio.h>  /* printf */

/* File scope flags, all default to 0 */
static int f_add;
static int f_append;
static int f_create;
static int f_delete;
static int f_verbose;

int main(int argc, char **argv)
{
  int c;

  while (1) 
  {
    int option_index = 0;
    static struct option long_options[] = 
    {
      {"add",     no_argument, &f_add,     1},
      {"append",  no_argument, &f_append,  1},
      {"create",  no_argument, &f_create,  1},
      {"delete",  no_argument, &f_delete,  1},
      {"verbose", no_argument, &f_verbose, 1},
      {NULL,      0,           NULL,       0}
    };

    c = getopt_long(argc, argv, "-:", long_options, &option_index);
    if (c == -1)
      break;

    switch (c) 
    {
      case 1:
        printf("regular argument '%s'\n", optarg);
        break;

      case '?':
        printf("Unknown option %c\n", optopt);
        break;
    }
  }

  printf("    f_add: %i\n", f_add);
  printf(" f_append: %i\n", f_append);
  printf(" f_delete: %i\n", f_delete);
  printf(" f_create: %i\n", f_create);
  printf("f_verbose: %i\n", f_verbose);

  return 0;
}
Sample output:
Command line                                         Output
./a.out --verbose --create                               f_add: 0
                                                      f_append: 0
                                                      f_delete: 0
                                                      f_create: 1
                                                     f_verbose: 1
./a.out --verbose --append --create --add --delete       f_add: 1
                                                      f_append: 1
                                                      f_delete: 1
                                                      f_create: 1
                                                     f_verbose: 1
./a.out --v --c --ap --ad --d                            f_add: 1
                                                      f_append: 1
                                                      f_delete: 1
                                                      f_create: 1
                                                     f_verbose: 1
./a.out -v -c -d -a                                  Unknown option v
                                                     Unknown option c
                                                     Unknown option d
                                                     Unknown option a
                                                         f_add: 0
                                                      f_append: 0
                                                      f_delete: 0
                                                      f_create: 0
                                                     f_verbose: 0
Recall the option structure:
struct option 
{
  const char *name;    /* name without -- in front                                  */
  int         has_arg; /* one of: no_argument, required_argument, optional_argument */
  int        *flag;    /* how the results are returned                              */
  int         val;     /* the value to return or to put in flag                     */
};
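Notes:
  1. If flag is NULL, getopt_long returns val. This is how the earlier examples worked.
  2. If flag is not NULL, getopt_long stores val in the int that flag points to and returns 0.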

Summary

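Behavior:
  1. getopt returns one option per call and returns -1 when there are no more options to parse.
  2. By default, GNU getopt moves the non-option arguments to the end of argv; after the loop, optind is the index of the first one.
  3. A minus (-) at the front of optstring makes getopt return each non-option argument in order, with a return value of 1 and optarg pointing to the argument.
  4. A colon (:) at the front of optstring turns off the automatic error messages; getopt then returns ? for an unknown option and : for a missing argument, with the offending option in optopt.

Summary for short options:
  1. A letter by itself in optstring is an option that takes no argument.
  2. A letter followed by one colon requires an argument; white space between the option and its argument is optional.
  3. A letter followed by two colons takes an optional argument, which must be attached to the option (-afoo); optarg is NULL when it is omitted.
  4. Options that take no argument can be combined after a single minus sign (-bXa).

Summary for long options:
  1. Long options are described by an array of struct option that ends with an all-zero entry.
  2. An argument is given as --name=value; a required argument may also be given as --name value.
  3. A long option can be abbreviated as long as the abbreviation is unambiguous.
  4. If flag is NULL, getopt_long returns val; otherwise it stores val in *flag and returns 0.

Other points:
  1. Option arguments are always strings; the program must convert them to numbers itself.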

Other Uses For Command Line Arguments

Even if you think that you aren't going to use the command line or that no one else is going to run your program from the command line, there are other reasons why supporting command line arguments will make your program much nicer to use.

Uses in GUI Environments

Have you ever dragged and dropped a file in Windows, Mac, Linux, or any other graphical environment? (Of course you have.) Well, a lot of the functionality of drag and drop is implemented via command line arguments. That's right, command line arguments.

If you drag and drop a file onto an executable (or shortcut) in a graphical environment, you will notice that the executable is launched (run) and it opens the file that was dropped. So, if you drop a file onto Notepad.exe in Windows, Notepad will run and open the file that was dropped. This doesn't happen automatically. It works because the people that programmed Notepad provided a command line interface, which can also be used by GUI environments. (It's actually just another way to communicate with the program.)

If you dropped a file named foo.txt onto Notepad.exe, the GUI environment is really doing something like this:

Notepad.exe foo.txt
which is very similar to how you would run Notepad.exe from the command line and pass the name of the file, foo.txt, to the program.

Launching a Program from a Running Program

Most students have learned the simple trick of executing a program from within their C/C++ code by using the system function. If your program was running and you wanted to execute Notepad.exe (Windows, of course) from your program and have it edit foo.txt, you would do this:

system("Notepad foo.txt");
This will load Notepad.exe and pass the filename, foo.txt, to Notepad. Notepad.exe will parse the command line and realize that it is being asked to load the file foo.txt.

It is often a good idea to have your programs support command line arguments, even if you don't think that you or anyone else will ever actually execute the program from a command line. Command line arguments are a very simple way for one process to communicate with another and are supported on all operating systems in pretty much the same way.

