- A file is a stream of bytes, and everything is a file.
- A process is an active program with all its attendant resources.
Files
By now, everyone is familiar with files and hierarchical directories, but this was not standard when Unix first came on the scene. It was a powerful organizational idea.
The uniform treatment of files as streams of bytes was also a powerful idea. Previously, other operating systems supported a wide assortment of mechanisms for storing data in files, and the programmer was burdened with keeping track of all manner of details, including block sizes and how records were formatted and packed into blocks. Unix pushed those concerns down into the operating system and its device drivers, and required that storage devices present the programmer with the simple, uniform stream-of-bytes model.
Representing various devices as files was also a powerful idea. The fact that the programmer could treat terminal I/O and file I/O exactly the same (unless she was writing a terminal graphics package or disk management system) was a great simplification. It also provided a relatively standard model for adding new devices to the system.
The idea of reifying information and system state as virtual files was carried further in Plan 9 from Bell Labs. For example, Plan 9 made it possible to query running process information via virtual files, a design present in Linux's /proc directory. Semantic File Systems embedded queryable, distributed information sources into the file system.
Finally, the uniform treatment of files led to the notions of pipes and I/O redirection, which simplified program development and prototyping, and put prototyping tools in the hands of ordinary users.
Processes
Processes are what make things happen on a computer. The first order of business is to distinguish a program from a process. A program is just a bunch of bytes in a file. It is a static pile of data lacking any agency. A process is a running program together with the necessary resources to support it (descriptors for its open files, storage for its data, information about its user and group IDs for permission checking, its priority for scheduling). A Unix process has the illusion that it has a machine to itself, i.e., Unix provides each process with a virtual address space and a virtual CPU that are separate from those of all other processes.
A process may have more than one thread of control. Each process thread has its own virtual CPU, but the threads share the same virtual memory. Thus, all threads run the same program code and can access the same data, but each has its own registers (including the program counter) and stack.
Linux is unusual among Unix systems in that it does not really distinguish threads from processes. When a new process is created (in other operating systems, we would say a process is spawned), it can be created to share virtual memory (and other resources) or not. Linux therefore ultimately uses the same operating system facility for creating a new process or a new thread, and the operating system scheduler treats them the same.
Modified: 28 January 2008