Assignment: Pointers

Overview

This warmup assignment gives you practice using pointers and manual memory allocation in C.

This assignment prepares you for next week’s Commands assignment, where you will implement a shell command parser as a small library in C. The shell is the program that reads and interprets the commands you type at a command line terminal. To build a command parser, we first need to get comfortable with C programming using pointers and strings.

Goals

  • To use (likely your first!) low-level pointers to program with memory at a detailed granularity.
  • To become familiar with the byte-addressable memory model through pointers and arrays at the C programming language level of abstraction.
  • To understand the link between pointer arithmetic and array indexing and representation.

Advice

Do true pair programming.

We recommend working with a partner on this assignment. Choose your own partner or find a partner by submitting this form by Friday, February 20th. Assuming you work with a partner, do your programming together, sitting in front of one screen. Take turns driving. See the Team Workflow for git setup.

You must work together: it is a violation of academic integrity to split up the work.

Programming with C pointers has a tendency to go wrong in entertaining or painful ways unless you practice careful discipline in programming.

For each function in this assignment:

  1. Plan carefully before typing any code. Consider drawing high-level memory diagrams.
  2. Implement one function at a time.
  3. Test each function extensively, including with tools for detecting pointer errors, before implementing the next.
  4. Commit each tested function before implementing more.

This careful process will save time by catching bugs early and making sure you can go back to previously working versions if things start going wrong later.

Time Reports

This warmup assignment is new this semester, so we have no self-reported times from previous semesters.

Our goal is for the median time to be around 8 hours.

Setup

Get your repository with cs240 start pointers, whether working alone or with a teammate.

Your repository initially contains the following files:

  • For the command library implementation:
    • pointers.c: a file where you will implement various functions using pointers
    • pointers.h: a C header file describing the function signatures defined in pointers.c
  • For testing, demonstration, and automation:
    • test.c: a file where you will add test cases for each function
    • Makefile: recipes to compile the various parts

Compile all code with make.

Test code by either running make test or by running make followed by ./test.bin.

Running the ./test.bin executable directly lets you choose which subset of tests to run.

For example, if you’d like to run just the first two tests, run make then ./test.bin 1 2.

You must use a CS 240 GNU/Linux computing environment for CS 240 code assignments.

Tasks

You have two goals for this assignment:

  1. Add function implementations in pointers.c.
  2. Test those functions in test.c using a combination of assert statements, inspecting printed output, and using the memory management tool valgrind.

Function implementations in pointers.c

pointers.c contains 12 functions you need to implement: 10 standalone functions followed by 2 command utility functions you will use in the next assignment, Commands.

Each function has a header comment that explains what you should implement.

Test cases in test.c

test.c contains a main function that drives program execution. The main function by default calls 12 test functions, one for each pointers.c implementation.

We have provided the body of the test function for the first case, test_bump_by, and the start of the next case, test_bump_by_inplace. The body uses the C library function assert to check whether the expected properties hold after the function-under-test is called. See the section below on assertions.

You must complete the test functions for the remaining functions: 2-12.

You can run a subset of test functions by passing a space-separated list, e.g., ./test.bin 1 2 3.

When a test fails (for example, before you implement anything!) you will see the following:

    [avh@cs pointers] make test
    ./test.bin
    Runnning all tests 1-10
    test.bin: test.c:14: test_bump_by: Assertion `bump_by(3, 4) == 7' failed.
    make: *** [Makefile:27: test] Aborted (core dumped)

Note that this tells you which function and line within test.c had a failure.

command_show and command_free

The last two functions you must implement are shell command utility functions we will use in the next assignment, Commands.

Both of these functions have a void return type: instead of returning a value, they print and free the command array passed in, respectively. Rather than testing these with assert, manually inspect the output of command_show and use valgrind to confirm there are no memory errors.

We suggest working on the functions in order.

First, read the header comment on the function to make sure you understand what it should do. Write the simplest test case you can think of in test.c and make sure the code still compiles when you run make.

Then, implement the functionality in pointers.c, making sure to run make frequently. Once you have an implementation, run ./test.bin 2 (for example, to test function 2) and check that none of your assertions fail. If any assertions do fail, debug your implementation. If all pass, then add additional test cases and repeat this process.

Once you have a rigorous set of test cases, rerun the test under valgrind to check for memory safety violations: valgrind ./test.bin 2.

Once all your code and tests are implemented, run valgrind ./test.bin. You should see something like the following (the exact allocation counts will differ based on your test cases, but you should see that no leaks are possible and 0 errors):

    ==1535469==
    ==1535469== HEAP SUMMARY:
    ==1535469==     in use at exit: 0 bytes in 0 blocks
    ==1535469==   total heap usage: 10 allocs, 10 frees, 1,125 bytes allocated
    ==1535469==
    ==1535469== All heap blocks were freed -- no leaks are possible
    ==1535469==
    ==1535469== For lists of detected and suppressed errors, rerun with: -s
    ==1535469== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Assertions

Assertions are “executable documentation” or “executable specifications”: they document rules about how code should be used or how data should be structured, but they also make it easier to detect violations of these rules (a.k.a. bugs!). Use the assert(...) statement in C by including assert.h and asserting expected properties.

The provided starter code already includes code that asserts some basic functionality.

An assert statement passes when its argument is truthy, in which case execution continues as normal.

When an assert fails (when its argument is falsey), an error message will be printed and execution will halt immediately. Detecting errors early like this saves a lot of time. Add assertions to make the “rules” of your code clear wherever you make assumptions.

Debugging

valgrind and gdb are the tools of choice. We will also go over these in the Pointers and Arrays in C lab.

valgrind

Valgrind is an extended memory error checker. It helps catch pointer errors, misuse of malloc and free, and more. Run valgrind on your compiled program like this: valgrind ./test.bin or valgrind ./test.bin 2. The valgrind tool will run your program and observe its execution to detect errors at run time. See the tools page for additional reference links.

Be sure to periodically run your code under valgrind during development and testing to help catch memory errors early.

gdb

Use GDB to help debug your programs when you need more information than valgrind offers. When debugging programs with pointers, pay special attention to the pointer values your program generates. Inspect them like other variables or use the address-of (&) and dereference (*) operators at the gdb prompt to help explore program state.


Does GDB tell you that it cannot display something due to compiler optimizations? If so, turn off compiler optimizations by adding -O0 (that’s “space dash Capital-Oh zero”) at the end of the CFLAGS = ... line in your Makefile. The CFLAGS variable here determines what flags are passed to the compiler. -O0 enables level zero of optimization (i.e., none). Run make clean and then make to compile your code again without optimizations.

Submission

Before submitting, disable any diagnostic printing in pointers.c. Only command_show should print, as specified.

Be sure to run make test one last time and manually check the results for correctness. You should see:

    ./test.bin
    Runnning test 1
    No failed assertions in test 1

for all 12 tests, plus printed output for command_show. valgrind should report no memory errors or leaks.

Submit: The course staff will collect your work directly from your hosted repository. To submit your work:

  1. Test your source code files one last time. Make sure that, at a minimum, submitted source code is free of syntax errors and any other static errors (such as static type errors or name/scope errors). In other words: the code does not need to complete the correct computation when invoked, but it must be a valid program. We will not grade files that do not pass this bar.

  2. Make sure you have committed your latest changes. (Replace FILES with the files you changed and MESSAGE with your commit message.)

    $ git add FILES
    $ git commit -m "MESSAGE"
    
  3. Run the command cs240 sign to sign your work and respond to any assignment survey questions.

    $ cs240 sign
    

    (If this encounters an error, instead execute cs240.s26 sign.)

  4. Push your signature and your latest local commits to the hosted repository.

    $ git push
    

Confirm: All local changes have been submitted if the output of git status shows both:

  • Your branch is up to date with 'origin/main', meaning all local commits have been pushed
  • nothing to commit, meaning all local changes have been committed

Resubmit: If you realize you need to change something later, just repeat this process.

Grading

Your grade will be out of 75 points, derived as follows:

  • Functional Correctness (40 points):
    • Your code passes all of our (private) test cases.
  • Memory Safety and Efficiency (20 points):
    • Your code is free of memory safety violations.
    • Your code allocates no more memory than strictly necessary.
  • Design, style, documentation, and test cases (15 points):
    • Your test.c file contains sufficient test cases for normal and boundary cases for valid and invalid commands.

      You should extend test.c with your own suite of test inputs and assertions, and run it under valgrind. Careful and extensive testing will help you check that your code meets the specification and runs free of memory safety violations and memory leaks. Take this seriously.

    • Your code in pointers.c is well-organized, easy to read, and uses clear and uncomplicated control flow. Helper functions are used to simplify code where appropriate.
    • Your code is well documented, using appropriate comments to highlight aspects that may not be obvious from the code itself.

We compute the correctness, memory safety, and efficiency components of your grade by running your code on a private suite of test inputs under valgrind to detect memory safety violations, measure memory allocation, and detect memory leaks.

More Tips

printf

The standard way of printing output in C is the printf function from the stdio library. The printf function accepts a format string (a template) and additional arguments to format according to each of the format specifiers (holes in the template with directions about how to fill them given the right kind of value). Use this linked documentation/introduction to get started with printf. Here’s an example:

#include <stdio.h>
#include <string.h>

void display_line_info(char* command_line) {
  // Note: strlen computes the length of a string,
  // but you may not use it in this assignment.
  // strlen returns a size_t, which printf formats with %zu.
  printf("Command line is \"%s\" (%zu characters).\n",
         command_line, strlen(command_line));
}

Calling display_line_info("Hello world!") would print this output:

Command line is "Hello world!" (12 characters).

Some important things to know when using printf:

  • printf prints exactly what you tell it to, not more: you must use a newline character ('\n') if you want a new line to be printed.
  • Instead of figuring out how to build a larger string just to print it, either use the printf format string to do the building or use multiple printf calls with smaller individual strings.
  • Even when printing a preexisting string value, use an explicit format string. Otherwise, any % characters in the string are interpreted as format specifiers:

    char* my_string = computed_somehow();
    // Use this:
    printf("%s", my_string);
    // Not this:
    printf(my_string);
    
  • Since displaying individual characters to the screen is expensive, standard output (stdout), the channel printf writes to for display in the terminal, is buffered: individual pieces of output are gathered up into larger chunks and flushed to the display all at once. This improves efficiency. For most printing tasks, this implementation detail is imperceptible and you do not need to understand it. However, if you are debugging code that interleaves small printfs with other operations that could crash or cause valgrind errors (e.g., memory operations), output may appear later than you expect, or not at all, which is confusing if you are not aware of buffering.
    • The newline character ('\n') typically flushes all buffered text. In other words, printing a newline causes any text that has already been printed, but has not yet appeared in the output, to appear in the output immediately, in the order it was printed.
    • Strings without a newline may not appear immediately, perhaps only when the next newline is printed.
    • To flush the buffer explicitly at any time, you can use fflush(stdout);. This is not needed for a working pointers.c implementation, but if you run into seemingly truncated output while debugging, well-placed fflush calls might help clarify things. In general, this is a good reason to use a debugger instead of print-based debugging.

C Function Headers and Declaration Order

In C, a function is allowed to be used only after (i.e., later in the file than) its declaration. This differs from Java, which allows you to refer to later methods. When declaring helper functions, you can do one of a few things to deal with this:

  1. Just define your helper function before the functions that use it.
  2. Write a function header earlier in the file and the actual definition later in the file. The function header just describes the name and type of the function, much like an interface method in Java. For example:

     // A function header declares that such a function exists,
     // and will be implemented elsewhere.
     int helper(int x, int y);
    
     // Parameter names are optional in headers.
     int helper2(char*);
    
     void needsHelp() {
         // OK, because header precedes this point in file.
         helper(7, 8);
         helper2("hello");
     }
    
     int helper(int x, int y) {
         return x + y;
     }
    
     int helper2(char* str) {
         return 7;
     }
    
  3. If the functions would likely get used elsewhere, then put the header in a header file, a file ending in .h that contains only function headers (for related functions) and data type declarations. For example, if you added another general function (not just a helper function) for manipulating pointers, it would be best to place a function header for it with the other function headers in pointers.h so that users of your command library can call it.

    Header files are included (essentially programmatically copy-pasted) by the #include directive you often see at the tops of C source files. Then these functions can be used and their implementations will later be found elsewhere if compiled correctly.

License

Pointers by Alexa VanHattum at Wellesley College is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Source code and evaluation infrastructure for Pointers are available upon request for reuse by instructors in other courses or institutions.