Notes for Advanced Linux Programming - 2. Writing Good GNU/Linux Software

2. Writing Good GNU/Linux Software

2.1. Interaction With the Execution Environment

2.1.1. Command Line

  • When a program is invoked from the shell, the argument list contains both the name of the program and any command-line arguments provided.

% ls -s /

The argument list has three elements:

The first is the name of the program ls.

The second and third are the two command-line arguments, -s and /.

  • The main function can access the argument list via the argc and argv parameters.

  • argc is the number of items in the argument list.

  • argv is an array of character pointers.

#include <stdio.h>

int main (int argc, char* argv[])
{
    printf ("The name of this program is '%s'.\n", argv[0]);
    printf ("This program was invoked with %d arguments.\n", argc - 1);
    /* Were any command-line arguments specified? */
    if (argc > 1) {
        /* Yes, print them. */
        int i;
        printf ("The arguments are:\n");
        for (i = 1; i < argc; ++i)
            printf ("%s\n", argv[i]);
    }
    return 0;
}

  • Command-line arguments fall into two categories: options (or flags) and other arguments.

  • Options come in two forms:

  • Short options consist of a single hyphen and a single character. Short options are quicker to type.

  • Long options consist of two hyphens, followed by a name made of lowercase and uppercase letters and hyphens. Long options are easier to remember and easier to read.

Short Form     Long Form            Purpose
-h             --help               Display usage summary and exit
-o filename    --output filename    Specify output filename
-v             --verbose            Print verbose messages

  • The function getopt_long understands both short and long options.

  • Include <getopt.h>.

  • Provide two data structures.

  • The first is a character string containing the valid short options, each a single letter. An option that requires an argument is followed by a colon.

char* const short_options = "ho:v";

  • Construct an array of struct option elements. Each element corresponds to one long option and has four fields.

  • The first field is the name of the long option, as a character string without the two leading hyphens.

  • The second is 1 if the option takes an argument, or 0 otherwise.

  • The third is NULL.

  • The fourth is a character constant specifying the short option that is a synonym for the long option.

  • The last element of the array should be all zeros.

const struct option long_options[] = {
    { "help", 0, NULL, 'h' },
    { "output", 1, NULL, 'o' },
    { "verbose", 0, NULL, 'v' },
    { NULL, 0, NULL, 0 }
};

  • To use the getopt_long function, pass it argc, argv, the short option string, and the long option array.

  • Each time you call getopt_long, it parses a single option, returning the short option letter, or -1 if no more options are found.

  • Call getopt_long in a loop, and handle the specific options in a switch statement.

  • If getopt_long encounters an invalid option, it prints an error message and returns the character '?'.

  • When handling an option that takes an argument, the global variable optarg points to the text of that argument.

  • After getopt_long has finished parsing all the options, the global variable optind contains the index (into argv) of the first nonoption argument.

#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>

/* The name of this program. */
const char* program_name;

/* Prints usage information for this program to STREAM (typically stdout
   or stderr), and exits the program with EXIT_CODE. Does not return. */
void print_usage (FILE* stream, int exit_code)
{
    fprintf (stream, "Usage: %s options [ inputfile ... ]\n", program_name);
    fprintf (stream,
        " -h --help Display this usage information.\n"
        " -o --output filename Write output to file.\n"
        " -v --verbose Print verbose messages.\n");
    exit (exit_code);
}

/* Main program entry point. ARGC contains number of argument list
   elements; ARGV is an array of pointers to them. */
int main (int argc, char* argv[])
{
    int next_option;
    /* A string listing valid short options letters. */
    const char* const short_options = "ho:v";
    /* An array describing valid long options. */
    const struct option long_options[] = {
        { "help", 0, NULL, 'h' },
        { "output", 1, NULL, 'o' },
        { "verbose", 0, NULL, 'v' },
        { NULL, 0, NULL, 0 } /* Required at end of array. */
    };
    /* The name of the file to receive program output, or NULL for
       standard output. */
    const char* output_filename = NULL;
    /* Whether to display verbose messages. */
    int verbose = 0;
    /* Remember the name of the program, to incorporate in messages.
       The name is stored in argv[0]. */
    program_name = argv[0];

    do {
        next_option = getopt_long (argc, argv, short_options, long_options, NULL);
        switch (next_option)
        {
        case 'h': /* -h or --help */
            /* User has requested usage information. Print it to standard
               output, and exit with exit code zero (normal termination). */
            print_usage (stdout, 0);
        case 'o': /* -o or --output */
            /* This option takes an argument, the name of the output file. */
            output_filename = optarg;
            break;
        case 'v': /* -v or --verbose */
            verbose = 1;
            break;
        case '?': /* The user specified an invalid option. */
            /* Print usage information to standard error, and exit with exit
               code one (indicating abnormal termination). */
            print_usage (stderr, 1);
        case -1: /* Done with options. */
            break;
        default: /* Something else: unexpected. */
            abort ();
        }
    } while (next_option != -1);

    /* Done with options. OPTIND points to first nonoption argument. For
       demonstration purposes, print them if the verbose option was
       specified. */
    if (verbose) {
        int i;
        for (i = optind; i < argc; ++i)
            printf ("Argument: %s\n", argv[i]);
    }

    /* The main program goes here. */
    return 0;
}
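The program above exits from inside its option loop, which makes the parsing hard to reuse or test. As a rough sketch only, the same getopt_long loop could instead fill in a structure and leave the decisions to the caller. The names struct app_options and parse_options are hypothetical and are not part of the original listing; the option table is the same as above, written with the no_argument and required_argument macros from <getopt.h> in place of 0 and 1.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical container for the parsed options (not in the original notes). */
struct app_options {
    const char* output_filename;  /* NULL means standard output. */
    int verbose;                  /* Nonzero if -v or --verbose was given. */
    int show_help;                /* Nonzero if -h or --help was given. */
    int first_argument;           /* Index in argv of the first nonoption argument. */
};

/* Parses ARGV into RESULT. Returns 0 on success, or -1 if getopt_long
   reported an invalid option (it has already printed a message). */
int parse_options (int argc, char* argv[], struct app_options* result)
{
    static const struct option long_options[] = {
        { "help",    no_argument,       NULL, 'h' },
        { "output",  required_argument, NULL, 'o' },
        { "verbose", no_argument,       NULL, 'v' },
        { NULL,      0,                 NULL, 0   }  /* Required at end of array. */
    };
    int next_option;

    result->output_filename = NULL;
    result->verbose = 0;
    result->show_help = 0;
    result->first_argument = 1;

    /* Start scanning at the first argument after the program name. */
    optind = 1;

    while ((next_option = getopt_long (argc, argv, "ho:v",
                                       long_options, NULL)) != -1) {
        switch (next_option) {
        case 'h':
            result->show_help = 1;
            break;
        case 'o':
            /* optarg points to the text of the option's argument. */
            result->output_filename = optarg;
            break;
        case 'v':
            result->verbose = 1;
            break;
        default:
            /* '?' (invalid option) or anything unexpected. */
            return -1;
        }
    }

    /* optind now indexes the first nonoption argument. */
    result->first_argument = optind;
    return 0;
}
```

A main function could call parse_options, print the usage text if it returns -1 or if show_help is set, and then process argv from first_argument onward.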

2.1.2. Standard I/O

  • The standard C library provides three streams:

  • Standard input stream: stdin

  • Standard output stream: stdout

  • Standard error stream: stderr

  • These three streams are also accessible with the underlying UNIX I/O calls (read, write, and so on) via file descriptors.

  • These are file descriptors 0 for stdin, 1 for stdout, and 2 for stderr.

  • You can redirect both standard output and standard error to a file or pipe:

% program > output_file.txt 2>&1

% program 2>&1 | filter

The 2>&1 syntax indicates that file descriptor 2 (stderr) should be merged into file descriptor 1 (stdout).

To redirect stdout and stderr to a file, 2>&1 must follow a file redirection.

To redirect stdout and stderr to a pipe, 2>&1 must precede a pipe redirection.

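  • stdout is buffered: data written to it is not sent to the console (or other device) until the buffer fills, the program exits normally, or stdout is closed. You can flush the buffer explicitly by calling fflush (stdout).

  • stderr is not buffered; data written to stderr goes directly to the console.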
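To illustrate the difference buffering makes, here is a minimal sketch of a progress indicator that flushes after every character, so each dot becomes visible as soon as it is written even on a buffered stream. The helper name print_progress is made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: writes STEPS progress dots to STREAM, flushing
   after each one so that the dot is delivered immediately even if STREAM
   is buffered. Returns the number of dots written, or -1 on error. */
int print_progress (FILE* stream, int steps)
{
    int i;

    for (i = 0; i < steps; ++i) {
        if (fputc ('.', stream) == EOF)
            return -1;
        /* Without this call, the dots could sit in the buffer until it
           fills or the stream is closed. */
        if (fflush (stream) != 0)
            return -1;
    }
    return steps;
}
```

Calling print_progress (stdout, 10) inside a slow loop shows the dots one at a time; writing the same dots to stderr would not need the fflush call.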

2.1.3. Program Exit Codes

  • When a program ends, it indicates its status with an exit code.

  • The exit code is a small integer.

  • By convention, an exit code of zero denotes success, while nonzero exit codes indicate an error.

  • In most shells, you can obtain the exit code of the most recently executed program using the special $? variable:

% ls /

bin coda etc lib misc nfs proc sbin usr

boot dev home lost+found mnt opt root tmp var

% echo $?

0

% ls bogusfile

ls: bogusfile: No such file or directory

% echo $?

1
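On the C side, a program sets its exit code by returning a value from main or by calling exit. The sketch below uses the standard EXIT_SUCCESS and EXIT_FAILURE macros from <stdlib.h>; the helper check_readable is a hypothetical example and does not come from the original notes.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns EXIT_SUCCESS (zero) if PATH is readable by
   the calling process, or EXIT_FAILURE (nonzero) otherwise. */
int check_readable (const char* path)
{
    if (access (path, R_OK) == 0)
        return EXIT_SUCCESS;
    return EXIT_FAILURE;
}
```

A main function that ends with return check_readable (argv[1]); would make the result available to the shell through $?.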

2.1.4. The Environment

  • The environment is a collection of variable/value pairs.

  • Common environment variables

  • USER contains your username.

  • HOME contains the path to your home directory.

  • PATH contains a colon-separated list of directories through which Linux searches for commands you invoke.

  • The printenv program prints the current environment.

  • You can read an environment variable with the getenv function, which returns NULL if the variable is not set.

  • Use the setenv and unsetenv functions to set or clear environment variables.

  • To enumerate all the variables in the environment, you must access a special global variable named environ.

#include <stdio.h>

/* The ENVIRON variable contains the environment. */
extern char** environ;

int main ()
{
    char** var;
    for (var = environ; *var != NULL; ++var)
        printf ("%s\n", *var);
    return 0;
}

  • When a new program is started, it inherits the environment of the program that invoked it.

  • Environment variables are commonly used to communicate configuration information to programs.
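As a small sketch of using the environment for configuration, the helper below reads a variable with getenv and falls back to a default when it is not set. The name env_or_default is hypothetical, as is the SERVER_NAME variable mentioned afterwards.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the value of environment variable NAME,
   or FALLBACK if the variable is not set. The returned string must not
   be modified or freed by the caller. */
const char* env_or_default (const char* name, const char* fallback)
{
    const char* value = getenv (name);

    /* getenv returns NULL when the variable is not in the environment. */
    if (value == NULL)
        return fallback;
    return value;
}
```

A program might call env_or_default ("SERVER_NAME", "server.example.com") so that users can override the default by setting SERVER_NAME before running it.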

2.1.5. Using Temporary Files

  • Temporary files are stored in the /tmp directory.

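  • More than one instance of a program may run at the same time, so different instances should use different temporary filenames.

  • File permissions should be set so that unauthorized users cannot alter the program’s execution by modifying or replacing the temporary file.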

  • Temporary filenames should be generated in a way that cannot be predicted externally.

  • The mkstemp function:

  • creates a unique temporary filename from a filename template,

  • creates the file with permissions so that only the current user can access it,

  • opens the file for read/write.

  • Temporary files created with mkstemp are not deleted automatically.

  • If the temporary file is for internal use only and won’t be handed to another program, it’s a good idea to call unlink on the temporary file immediately.

#include <stdlib.h>

#include <unistd.h>

/* A handle for a temporary file created with write_temp_file. In this implementation, it’s just a file descriptor. */

typedef int temp_file_handle;

/* Writes LENGTH bytes from BUFFER into a temporary file. The temporary file is immediately unlinked. Returns a handle to the temporary file. */

temp_file_handle write_temp_file (char* buffer, size_t length)

{

    /* Create the filename and file. The XXXXXX will be replaced with characters that make the filename unique. */

    char temp_filename[] = "/tmp/temp_file.XXXXXX";

    int fd = mkstemp (temp_filename);

    /* Unlink the file immediately, so that it will be removed when the file descriptor is closed. */

    unlink (temp_filename);

    /* Write the number of bytes to the file first. */

    write (fd, &length, sizeof (length));

    /* Now write the data itself. */

    write (fd, buffer, length);

    /* Use the file descriptor as the handle for the temporary file. */

    return fd;

}

/* Reads the contents of a temporary file TEMP_FILE created with write_temp_file. The return value is a newly allocated buffer of those contents, which the caller must deallocate with free. LENGTH is set to the size of the contents, in bytes. The temporary file is removed. */

char* read_temp_file (temp_file_handle temp_file, size_t* length)

{

    char* buffer;

    /* The TEMP_FILE handle is a file descriptor to the temporary file. */

    int fd = temp_file;

    /* Rewind to the beginning of the file. */

    lseek (fd, 0, SEEK_SET);

    /* Read the size of the data in the temporary file. */

    read (fd, length, sizeof (*length));

    /* Allocate a buffer and read the data. */

    buffer = (char*) malloc (*length);

    read (fd, buffer, *length);

    /* Close the file descriptor, which will cause the temporary file to go away. */

    close (fd);

    return buffer;

}
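The listing above omits error checking for brevity. A more defensive variant of the creation step might look like the following sketch, which checks the results of mkstemp and unlink before handing back the descriptor. The function name create_private_temp_file is made up for this example.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: creates a temporary file under /tmp, unlinks it
   immediately, and returns a read/write file descriptor for it.
   Returns -1 if the file cannot be created or unlinked. */
int create_private_temp_file (void)
{
    /* mkstemp replaces the XXXXXX with characters that make the name
       unique, so the template must be a writable array. */
    char temp_filename[] = "/tmp/temp_file.XXXXXX";
    int fd = mkstemp (temp_filename);

    if (fd == -1)
        return -1;

    /* Remove the name right away; the file itself remains until the
       descriptor is closed. */
    if (unlink (temp_filename) == -1) {
        close (fd);
        return -1;
    }
    return fd;
}
```

The caller uses the descriptor with read, write, and lseek as in the listing above, and closes it when the data is no longer needed.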

  • If you don’t need to pass the temporary file to another program, you can use the tmpfile function, which

  • creates and opens a temporary file, and

  • returns a file pointer to it.

  • The temporary file is already unlinked, so it is deleted automatically when the file pointer is closed (with fclose) or when the program terminates.
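A minimal sketch of tmpfile in use: the helper below writes a string to an anonymous temporary file, rewinds, and reads it back. The name roundtrip_tmpfile is hypothetical and exists only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes TEXT to an anonymous temporary file, rewinds,
   and reads it back into OUT (at most OUT_SIZE - 1 bytes, NUL-terminated).
   Returns the number of bytes read back, or -1 on failure. */
long roundtrip_tmpfile (const char* text, char* out, size_t out_size)
{
    size_t length = strlen (text);
    size_t count;
    FILE* fp;

    if (out_size == 0)
        return -1;

    /* tmpfile creates and opens the file; there is no filename to manage. */
    fp = tmpfile ();
    if (fp == NULL)
        return -1;

    if (fwrite (text, 1, length, fp) != length) {
        fclose (fp);
        return -1;
    }

    /* Go back to the start of the file before reading. */
    rewind (fp);
    count = fread (out, 1, out_size - 1, fp);
    out[count] = '\0';

    /* Closing the stream deletes the temporary file. */
    fclose (fp);
    return (long) count;
}
```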

2.2. Writing and Using Libraries

2.2.1. Archives

  • An archive is simply a collection of object files stored as a single file.

  • When you provide an archive to the linker, the linker searches the archive for the object files it needs, extracts them, and links them into your program.

  • You can create an archive using the ar command.

% ar cr libtest.a test1.o test2.o

  • The sequence of archives specified on the command line is very important.

  • When the linker encounters an archive on the command line, it searches the archive for definitions of all symbols that are referenced from the object files it has already processed but that are not yet defined.

  • The object files that define those symbols are extracted from the archive and included in the final executable.

2.2.2. Shared Libraries

  • When a shared library is linked into a program, the final executable does not actually contain the code that is present in the shared library. It just contains a reference to the shared library.

  • The library can be “shared” among all the programs that link with it.

  • A shared library is not merely a collection of object files; the object files that compose the shared library are combined into a single object file.

  • As a consequence, a program that links against a shared library always includes all of the code in the library, rather than just those portions that are needed.

  • To create a shared library, you must compile the objects with the -fPIC option.

% gcc -c -fPIC test1.c

  • You then combine the object files into a shared library:

% gcc -shared -fPIC -o libtest.so test1.o test2.o

  • Linking with a shared library is just like linking with a static archive.

% gcc -o app app.o -L. -ltest

  • The linker prefers to choose the shared library version, unless you explicitly choose the static one.

% gcc -static -o app app.o -L. -ltest

  • The ldd command displays the shared libraries that are linked into an executable.

  • When you link a program with a shared library, the linker does not put the full path to the shared library in the resulting executable.

  • It places only the name of the shared library.

  • When the program is actually run, the system searches for the shared library and loads it.

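  • The system searches only /lib and /usr/lib, by default.

  • If you link the program with the -Wl,-rpath option, for example -Wl,-rpath,/usr/local/lib, the system also searches /usr/local/lib for required shared libraries when app is run.

  • Alternatively, set the LD_LIBRARY_PATH environment variable when running the program. It is a colon-separated list of directories that the system searches for shared libraries before the standard directories. If LD_LIBRARY_PATH is set when an executable is being built, the linker also searches the directories given there, in addition to those given with the -L option.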

  • Create test1.c

#include <stdio.h>

void f1()
{
    printf("Test 1 function 1\n");
}

gcc -c -fPIC test1.c

-rw-rw-r-- 1 liuchao liuchao 1124 Jul 31 19:51 test1.o

  • Create test2.c

#include <stdio.h>

void f2()
{
    printf("Test 2 function 2\n");
}

gcc -c -fPIC test2.c

-rw-rw-r-- 1 liuchao liuchao 1124 Jul 31 19:52 test2.o

  • Create libstatictest.a

ar cr libstatictest.a test1.o test2.o

-rw-rw-r-- 1 liuchao liuchao 2508 Jul 31 19:53 libstatictest.a

  • Create libdynamictest.so

gcc -shared -fPIC -o libdynamictest.so test1.o test2.o

-rwxrwxr-x 1 liuchao liuchao 4573 Jul 31 19:54 libdynamictest.so

  • Create app.c

/* f2 is defined in test2.c. */
void f2();

int main(int argc, char* argv[])
{
    f2();
    return 0;
}

gcc -c app.c

-rw-rw-r-- 1 liuchao liuchao 772 Jul 31 19:55 app.o

  • Create staticapp

gcc -o staticapp app.o -L. -lstatictest

-rwxrwxr-x 1 liuchao liuchao 4861 Jul 31 19:56 staticapp

  • Create dynamicapp

gcc -o dynamicapp app.o -L. -ldynamictest

  • Run staticapp

./staticapp

Test 2 function 2

  • Run dynamicapp

./dynamicapp

./dynamicapp: error while loading shared libraries: libdynamictest.so: cannot open shared object file: No such file or directory

  • Run ldd on staticapp

ldd staticapp

linux-gate.so.1 => (0x0060d000)

libc.so.6 => /lib/libc.so.6 (0x007a3000)

/lib/ld-linux.so.2 (0x00786000)

  • Run ldd on dynamicapp

ldd dynamicapp

linux-gate.so.1 => (0x00861000)

libdynamictest.so => not found

libc.so.6 => /lib/libc.so.6 (0x00111000)

/lib/ld-linux.so.2 (0x00786000)

  • Set LD_LIBRARY_PATH to the current directory and run dynamicapp again

export LD_LIBRARY_PATH=.

echo $LD_LIBRARY_PATH

.

./dynamicapp

Test 2 function 2

2.2.3. Library Dependencies

  • One library will often depend on another library.

For example, libtiff, which contains functions for reading and writing image files in the TIFF format, uses libjpeg for JPEG support and libz for compression.

  • Create a program that uses libtiff.

#include <stdio.h>

#include <tiffio.h>

int main (int argc, char** argv)

{

TIFF* tiff;

tiff = TIFFOpen (argv[1], "r");

TIFFClose (tiff);

return 0;

}

  • Compile this program and link it with libtiff:

% gcc -o tifftest tifftest.c -ltiff

  • This will pick up the shared-library version of libtiff. Because libtiff uses libjpeg and libz, the shared-library versions of these two are also drawn in.

% ldd tifftest

libtiff.so.3 => /usr/lib/libtiff.so.3 (0x4001d000)

libc.so.6 => /lib/libc.so.6 (0x40060000)

libjpeg.so.62 => /usr/lib/libjpeg.so.62 (0x40155000)

libz.so.1 => /usr/lib/libz.so.1 (0x40174000)

/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

  • Static libraries cannot point to other libraries, so linking statically against libtiff alone leaves its dependencies unresolved:

% gcc -static -o tifftest tifftest.c -ltiff

/usr/bin/../lib/libtiff.a(tif_jpeg.o): In function ‘TIFFjpeg_error_exit’:

tif_jpeg.o(.text+0x2a): undefined reference to ‘jpeg_abort’

/usr/bin/../lib/libtiff.a(tif_jpeg.o): In function ‘TIFFjpeg_create_compress’:

tif_jpeg.o(.text+0x8d): undefined reference to ‘jpeg_std_error’

tif_jpeg.o(.text+0xcf): undefined reference to ‘jpeg_CreateCompress’

...

  • To link this program statically, you must specify the other two libraries yourself:

% gcc -static -o tifftest tifftest.c -ltiff -ljpeg -lz

  • If two static libraries are mutually dependent, you may need to name one of them more than once on the command line:

% gcc -o app app.o -lfoo -lbar -lfoo

2.2.4. Dynamic Loading and Unloading

  • You can open a shared library named libtest.so by calling dlopen

dlopen ("libtest.so", RTLD_LAZY)

  • To use dynamic loading functions, include the <dlfcn.h> header file and link with the -ldl option to pick up the libdl library.

  • The return value of dlopen is a void * that is used as a handle for the shared library.

  • You can pass the handle to the dlsym function to obtain the address of a function that has been loaded with the shared library.

void* handle = dlopen ("libtest.so", RTLD_LAZY);
void (*test)() = dlsym (handle, "my_function");
(*test)();
dlclose (handle);

  • Both dlopen and dlsym return NULL if they do not succeed. In that event, you can call dlerror (with no parameters) to obtain a human-readable error message describing the problem.

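  • The dlclose function unloads the shared library. Strictly speaking, dlopen loads the library only if it is not already loaded, and dlclose unloads it only when no other handle to it remains open.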
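The short snippet above does not check for failures. As a sketch under stated assumptions, the helper below adds the NULL checks and dlerror reporting described in the bullets. The function name call_library_function is hypothetical, and it assumes the looked-up symbol is a function that takes no arguments and returns nothing. On systems where the dl functions live in a separate libdl, link with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: loads LIBRARY, looks up SYMBOL, calls it as a
   function of type void (void), and closes the library again.
   Returns 0 on success, or -1 if loading or lookup fails. */
int call_library_function (const char* library, const char* symbol)
{
    void (*function) (void) = NULL;
    void* handle;
    const char* error;

    handle = dlopen (library, RTLD_LAZY);
    if (handle == NULL) {
        fprintf (stderr, "dlopen failed: %s\n", dlerror ());
        return -1;
    }

    /* Clear any earlier error, then look up the symbol. The pointer
       assignment goes through void** to avoid converting a void*
       directly to a function pointer. */
    dlerror ();
    *(void**) (&function) = dlsym (handle, symbol);
    error = dlerror ();
    if (error != NULL || function == NULL) {
        fprintf (stderr, "dlsym failed: %s\n",
                 error != NULL ? error : "symbol has a NULL address");
        dlclose (handle);
        return -1;
    }

    (*function) ();
    dlclose (handle);
    return 0;
}
```

With the libtest.so example from above, the call would be call_library_function ("libtest.so", "my_function").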

posted @ 2010-02-11 11:52  刘超觉先