Contents

Preliminaries, Processes, Programming

The Plan

Preparation (before class)

Read:

  1. About CS 341 (syllabus)
  2. OSTEP: Introduction

Post questions in #readings or discuss and answer others’ questions there.

Note: See my message in the #readings stream about footnote 7 in the OSTEP Introduction.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic plan.

Live Note Board

  • Welcome, course logistics overview, questions about course?
  • Breakout: with groups, create and claim a frame in the live note board and discuss:
    • Course meta ideas:
      • What are the main types of work required by this course?
      • Why do you want to take a class about OS? What are some topics you hope to see? Is there something you have encountered in past computing that you hope this class explains? What are some skills/experiences you hope to get out of the class?
      • What are some norms you would like to uphold in discussion, collaboration, and other peer interactions in this course?
    • OS ideas:
      • What is an operating system (OS)? What are its purposes?
      • What are some aspects of the popular/non-technical understanding of “what is an OS” that do not seem feature here or are peripheral to the central jobs of the OS?
      • What are the “Three Easy Pieces” of OSTEP, the themes the book uses to explore OS? Give examples of each.
      • What’s the difference between policy and mechanism? Why is it important? (We will keep asking this one.)
      • Are there any other ideas from the reading that you found striking or surprising? Any lingering questions?
  • Merge, share, review, open questions.
  • Next Steps:
    • Poll about times for drop-in hours.
    • Prep for Tuesday.

Processes, Kernel/User Separation, Syscalls

Preparation (before class)

Read:

  1. How to Read
  2. OSTEP: Processes
  3. OSTEP: Direct Execution

Post questions in #readings or discuss and answer others’ questions there.

Note: While you are reading, consider: What is a process? Is the kernel a process? Do user programs call kernel functions? What’s the difference between a system call and a function call? These questions will factor in our in-class exercises.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic syscalls.

  1. Logistics / coming soon:
    • Drop-in hours: please fill poll by end of class.
    • Coming soon: Project Tool setup steps for Wednesday.
    • Wednesday will be a “build day” to get started on the first project.
    • Thursday will mix more Process topics and our first “paper day.”
  2. Breakout discussion (15min): Consider the reading questions above, plus:
    • Note: Drawing pictures can help with some of these.
    • What general steps are required to switch from running one process to running a different process? What key hardware storage needs to be updated? What context data needs to be saved?
    • Why is a non-cooperative approach to switching between processes preferable?
    • Any other questions from you or #readings.
  3. Review and questions: Processes, kernel, kernel/user separation, syscall example, context switch example.
  4. Breakout (long): Measuring System Calls
  5. Takeaways and wrap up.

Process API, Interprocess Communication

Preparation (before class)

Read:

  1. OSTEP: Review 4.4 (Process States)
  2. OSTEP: Review 6.3 (Switching Between Processes)
  3. OSTEP: Process API
  4. Paper: The Unix Time-Sharing System

Post questions in #readings or discuss and answer others’ questions there.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic processes.

  1. Quick thoughts on Syscalls.
  2. Paper Discussion
  3. Questions about Process API?
  4. Quick thoughts about file descriptors and pipes.
  5. Exercise: Practice with fork

Sharing the CPU and Memory

CPU Scheduling

Preparation (before class)

Read:

  1. OSTEP: CPU Scheduling
  2. OSTEP: Multi-level Feedback (skim)
  3. OSTEP: Lottery Scheduling (skim)

Post questions in #readings or discuss and answer others’ questions there.

Note: Consider these questions as you read. What are some important constraints that an OS scheduler needs to satisfy? What are the trade-offs between non-preemptive and preemptive schedules? (Think back to non-cooperative and cooperative process management from past readings as well.) What are some problems with simple schemes like SJF, FIFO, etc.? Which of these problems does round robin solve? Does it introduce other limitations?

For more depth, what are the key problems that multi-level feedback or lottery scheduling aim to solve? What are the big ideas behind their solutions? (Distinguish policy vs. mechanism.) How do these two scheduling techniques stack up against a simpler mechanism like round robin?

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic sched.

  • Announcements
    • Reminder: sign up for paper discussion leads. In your week, check in ahead with your team to plan the intro and discussion questions. I am quite happy to “join your team” as an advisory member if you want to check anything ahead of time or ask advice.
    • Project due tonight by end of Monday EST.
      • As mentioned Thursday, Part 3 is removed. Parts 1-2 kept folks busy enough. :) I mentioned possibly posting Part 3 for “extra fun,” but I have decided to hold onto it for now. We may use it as a structured warm-up exercise for another future project.
    • Next project out tomorrow: build a shell with I/O redirection, pipes, etc.
      • Find / report partners in #projects > teams.
    • Tomorrow (Tuesday) is Election Day. High order bits for CS 341:
      • Voting is more important than any class activity/deadline/etc., no questions asked.
      • Optional class meeting Tuesday for those who with to occupy their minds with systems of computers (rather than systems of people) for the evening. Please read, but no other participation is required. Thursday or later will catch up as needed.
      • Our norms for working together in class (welcoming, respectful, inclusive, elevating each others’ contributions as we explore systems together) continue no matter what or when the outcome of this election. (If we are not reaching that goal, please be in touch about how we could improve.)
      • I will be around Wednesday for an open “build day” on the shell. There are no structured activities scheduled.
      • I respect that this week (or beyond) could tax your focus in 341. Please be in touch if things are impacting your ability to engage.
      • For a few reasons, I do not currently anticipate holding full-class discussions of the election.
        • Confident and effective moderation of a large group conversation on a taxing subject that means many (and potentially intensely emotional) things to many people is not in my skill set.
        • If you wish or need to talk individually or in smaller groups, please seek my out—I would be glad to engage in smaller conversations.
  • Goals, now moved to its own page. Please aim to complete by Tuesday evening.
  • Quick thoughts from Ben and open questions about scheduling.
  • Scheduling exercises

Virtual Memory: Address Translation

Preparation (before class)

Read:

  1. OSTEP: Address Spaces
  2. OSTEP: Address Translation
  3. OSTEP: Segmentation (skim)
  4. OSTEP: Introduction to Paging

Post questions in #readings or discuss and answer others’ questions there.

Note: What is the difference between virtual memory/address/space and physical memory/address/space? What is the key abstraction that virtual memory provides for processes and why is that abstraction important/useful? How does address translation uphold protection and isolation of the address spaces of different processes?

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic vm-pages.

  • Drop-in hours updated with new weekly standard schedule. Also post to Zulip. I can sometimes to quick pop-ups at other times if I see your message. Or get in touch to schedule an appointment in advance.
  • Shell project is up. Get started early! Report your team on #projects > teams so I can create a GitHub repo for you.
  • Reminder: Goals, now moved to its own page. Please do this soon.
  • Short breakout: Discuss questions from reading notes above, plus:
    • Why do base/bounds approaches have fragmentation problems?
    • What if you want more regions than code, globals, heap, stack?
    • Why might paging alleviate the fragmentation issues of base/bounds approaches?
    • What are the potential benefits/costs of a relatively small page size? A relatively large page size?
  • All class open Q&A, plus general thoughts on virtual memory. (These CS 240 slides on virtual memory may also be useful visual reference.)
  • Breakout:
    • Work with the “Homework (Simulation)” exercises from OSTEP: Paging to practice address translation in a paging system, using the vm-paging simulator.
    • There’s no need to submit, but feel free to create a shared scratchpad.
  • Start shell if you have time.

Virtual Memory: TLBs, Page Tables, Swap

Preparation (before class)

Read:

  1. OSTEP: Translation Lookaside Buffers
  2. OSTEP: Advanced Page Tables
  3. OSTEP: Swapping Mechanisms
  4. OSTEP: Swapping Policies
  5. Paper: Hints for Computer System Design

Post questions in #readings or discuss and answer others’ questions there.

Note: I also recommend Prof. Alexandra (Sasha) Fedorova’s blog post, Why mmap is faster than system calls.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic vm-opt-swap.

  • Thursday:
    • Paper discussion: Hints for Computer System Design
    • Shell
      • If you did not get a GitHub notification email for your new repo, log into GitHub, then visit the CS 341 organization, where you should see your team repo listed.
        • Useful GitHub feature: Issues is a simple tool for organized discussion of new features to add, to-do items, long-form questions/discussions about design/implementation, nontrivial bugs, debugging processes, etc. To notify me directly about an issue that needs my eyes, just use @bpw in the text of that issue.
  • Monday:
    • Virtual memory paging mechanism highlights (live notes), questions:
    • Virtual memory performance:
      • TLB miss rate
      • replacement policy
      • working set size, thrashing, OOM
      • How can fork-exec be fast if it has to copy the private virtual address space / contents? COW!
    • Virtual memory manipulation API/syscalls (POSIX, including Linux)
      • Manipulate memory mappings: most munificent marvelous mmap
      • Manipulate memory protections: mprotect

Concurrency and Threads

Threads and Locks

Preparation (before class)

Read:

  1. OSTEP: Concurrency and Threads
  2. Sophomoric Parallelism and Concurrency (SPAC), Section 6: Basic Shared-Memory Concurrency
  3. OSTEP: Locks

Post questions in #readings or discuss and answer others’ questions there.

Note: These readings will have you jump between C (OSTEP) and Java (SPAC) views of concurrency. Questions to consider while reading: What is they key difference between a process and a thread? What are some examples of kernel data structures we have considered that must require some synchronization (such as locking) to control concurrent access safely? Does your answer differ depending on whether there is one CPU vs multiple CPUs?

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic threads.

  • Threads
    • Quick breakouts to cover discussion questions.
    • Open questions. (Notes)
    • More on Thursday and next week, including practice programming.
  • Shell check-in.
  • End early! Convert to drop-ins for shell hacking.
  • Tomorrow: launch virtual memory project, some structured activity.

Virtual Memory Workshop

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic vm-workshop.

  • mmap and demand paging intro
  • mmap exercise
  • Report teams for next project in #projects > teams.

Multicore Scheduling

Preparation (before class)

Read:

  1. OSTEP: Multi-CPU Scheduling (skim)
  2. Paper: Managing Contention for Shared Resources on Multicore Processors

Post questions in #readings or discuss and answer others’ questions there.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic multi-sched.

More Concurrency

More Locks

Preparation (before class)

Read:

  1. OSTEP: Locks (review)
  2. OSTEP: Locked Data Structures
  3. SPAC, Sections 7 and 8: Race Conditions, Concurrency Programming Guidelines

Post questions in #readings or discuss and answer others’ questions there.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic locks.

  • Breakout. Discuss any/all:
    • What’s the difference between concurrency and parallelism? (4-minute (or 2min @ 2x!) answer from 251)
      • When are threads concurrency or parallelism useful? When is there a benefit to building a multithreaded concurrent or parallel program instead of a sequential program?
      • How can threads implement parallelism?
      • How can threads implement concurrency?
    • Can you think of an example where using the same number of threads as the number of available CPU cores is a better choice than using more threads than cores? Vice versa? Neither? What does “better” mean?
    • Lock implementation:
      • Why/when could a spinlock be inefficient (or worse)? Which of correctness, fairness, performance might be impacted?
      • Under what sorts of program executions / behavior would a spinlock work out OK? Think about level of contention.
      • Curious about other lock implementation strategies? Check out “biased locking.”
  • Notes for today: please index by breakout number.

Condition Variables, Semaphores, Barriers

Preparation (before class)

Read:

  1. OSTEP: Condition Variables
  2. OSTEP: Semaphores
  3. SPAC, Sections 9 and 10: Deadlock and Additional Synchronization Primitives

Post questions in #readings or discuss and answer others’ questions there.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic coord.

  • Notes
  • Condition Variables, Semaphores.
  • Bounded Buffer / Producer-Consumer.
  • Demo with race conditions / locks.
  • Exercise 1: Thread programming warmup.
    • Everybody (or at least 1 member per team): clone this repository and get started.
    • Later, 1 breakout team member: create a private repo of the form counter-threads-TEAM, replacing TEAM by a name for your team. Keep the repo under our course organization (wellesleycs341f20t2). Share it with your breakout teammates by GitHub username. (Settings > Manage Access > Invite teams or people.)
  • Exercise 2: Semaphore practice.
    • Pull up a copy of The Little Book of Semaphores, by Allen Downey.
    • Skim over Chapter 2 to find the book’s notation conventions for the semaphore primitives you already know. This should be really quick.
      • Feel free to pick whichever of the various semaphore notation/naming schemes makes most sense to your group. The important part is to be sure you are all on the same page about what your notation/names mean.
    • Devise pseudocode solutions to the simple problems in Chapter 3. The book follows with solutions so you can check your ideas.
    • Try out these problems from Chapter 4:
      • Producer-consumer
      • Readers-writers
      • Dining philosophers
  • Soon: programming project with threads.

Concurrency Bugs

Preparation (before class)

Read:

  1. SPAC, Sections 7 and 9: Race Conditions, Deadlock (review)
  2. OSTEP: Concurrency Bugs
  3. Paper: Eraser: A Dynamic Data Race Detector for Multithreaded Programs

Post questions in #readings or discuss and answer others’ questions there.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic concurrency-bugs.

IO, Storage

IO and Storage

Preparation (before class)

Read:

  1. OSTEP: I/O Devices
  2. OSTEP: Hard Disk Drives (skim)
  3. OSTEP: RAID (skim)

Post questions in #readings or discuss and answer others’ questions there.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic io.

Today we will focus on some design questions about I/O, storage, and OS support for them. (Here’s a Google Doc version for your group to take notes.) These questions are less about understanding low-level mechanical details and more about considering how constraints of the I/O model and device technology may motivate different designs in the big picture. Prioritize #1.

  1. Should device driver code run:

    • In kernel space, controlled from user application processes via system calls?
    • In user space, with direct unmediated control by each user application process that wants to use the device?
    • In user space in a dedicated “server” process for each device, with control from “client” user application processes by inter-process communication (IPC, e.g., pipes, sockets)?

    For these options, consider:

    • Which options are in use by mainstream operating systems?
    • What are the pros and cons of each option? Consider the goals of virtualization, protection, resource management, access control/security, performance, …
    • Do the tradeoffs differ based on the type of device or I/O? Consider: storage devices (hard disks, solid-state disks), network interfaces, audio/video devices for live performance, embedded systems such as robotics, medical devices, or transportation systems
    • Do the tradeoffs differ based on the applications running on the system? Consider: a shared server like tempest, your personal laptop, a performance-critical dedicated database server (with little else running on the system), embedded systems such as robotics, medical devices, or transportation systems

    Overall, if tasked with designing a new OS, what policy would you recommend?

  2. Compare and contrast CPU scheduling and disk scheduling.
    • What are some similarities and differences between:
      • The fundamental constraints of the scheduling problem
      • Scheduling algorithms that are used in practice
      • Time scales
    • Are there any ways in which CPU scheduling and disk scheduling on the same system could coordinate that would improve performance or utilization?
  3. RAID (focus mainly on RAID 0, 1, and 5, summarized more briefly here)
    • In what ways can RAID improve performance, reliability?
      • Which RAID configurations benefit which of these properties? How?
      • Which RAID configurations could actually harm which of these properties? How?
    • Hard disks are increasingly rare, replaced by solid-state drives based on flash and other technologies (next week).
    • How does RAID simplify or complicate disk scheduling?

File Systems

File Systems

Preparation (before class)

Read:

  1. OSTEP: Files and Directories
  2. OSTEP: File System Implementation
  3. OSTEP: FFS (skim)

Post questions in #readings or discuss and answer others’ questions there.

Note: What if storage devices were less like hard disks and more like persistent random access memories? Would you design anything about the file system differently?

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic fs.

  • Pressing questions about file system design?
  • Design Discussion:

    What if storage devices were less like hard disks and more like persistent random access memories? Would you design anything about the file system differently?

    This is not an unrealistic question to ask. Many file system designs date from disk days. New storage devices (Flash SSDs, NVMe, PCM, Memristors, …) are organized and can perform closer and closer to persistent RAM.

    To be clear, this design exercise will take us outside how commodity hardware and OSes usually work today, but we might stumble upon some ideas for how they could be improved or reorganized in the near future. Research has already been exploring this direction and some products may be starting to shift gradually.

    Consider:

    • Think of specific examples where current file system APIs are arise from or are tied to assumptions about how persistent storage works. What file system API functions fit this pattern? What file system API functions seem more independent of storage model?
    • If we keep the file system as a separate abstraction, as it is currently, would it be beneficial to change the API used to operate on individual files? (open, read, write, etc.)
      • Do files need to be treated like byte/block sequences as they are now? (Think seeking, cursors, etc.)
      • Are there better ways of working with storage for certain kinds of data?
    • Is there a need for caching or buffering file access? Does this simplify crash consistency (Come back to this crash question on Thursday!)
    • What if we erase the distinction between memory and persistent storage? What if everything can be accessed like memory–in the address space–but it persists even after a process terminates and even after the machine is power cycled?
      • How do memory allocation/management and disk space management relate?
      • If a process is an address space + logical control flow, but address spaces name persistent storage, is it useful to have an idea of a “persistent process” that never ends (but is not always running)? Could we have processes that live beyond reboot? What’s the difference between a “persistent process” and a program? What sorts of processes might fit this model? What sorts might not?
      • How do we deal with memory corruption and program crashes? (What do you do or rely on in the current model to deal with these?)
      • Is there still a use for layering a file system on top of the persistent memory model or does “storage as memory” suffice?
      • Is it useful to have non-persistent memory (like we think of memory today: contents disappear upon power cycle) or could we go to a single kind of all-encompassing persistent memory-storage?
    • Think about performance, virtualization, protection, security, concurrency, fault tolerance, …
  • Remainder: project check-in time.

File System Performance

Preparation (before class)

Read:

  1. Paper: The Design and Implementation of a Log-Structured File System
  2. OSTEP: Flash-based SSDs (skim)

Post questions in #readings or discuss and answer others’ questions there.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic fs-perf.

  • Paper Discussion: The Design and Implmentation of a Log-Structured File System
  • Remaining class meetings this week:
    • Extra time today: project drop-in hours
    • Wednesday: project drop-in hours
    • Thursday: optional reading/discussion on file system safety and consistency (or work on project instead if you prefer).
  • Class meetings next week:
    • Monday: Paper discussion about virtual machines (default topic). If there’s a strong consensus for another area of interest (by Thursday this week), we can adjust.
    • Tuesday: project drop-in hours
    • Wednesday-Thursday: project presentations (sign up for a timeslot)

Project Drop-in Hours

File System Safety and More

Preparation (before class)

Read:

  1. OSTEP: Crash Consistency
  2. OSTEP: Data Integrity and Protection (skim)
  3. The Zettabyte File System (skim)

Post questions in #readings or discuss and answer others’ questions there.

Note: Today’s reading/participation are optional. Read/join if you are interested. Work on the project if you prefer.

Beyond

Virtual Machines

Preparation (before class)

Read:

  1. OSTEP: Virtual Machine Monitors (skim)
  2. Paper: Xen and the Art of Virtualization

Post questions in #readings or discuss and answer others’ questions there.

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic vmm.

Project Drop-in Hours

Project Presentations

Agenda and Activities (in class)

Feel free to post questions that come up during class on Zulip in the #live stream using topic pres.

Open Final Project Presentations (schedule)