... e input to another program.

Tools and applications

There are hundreds of tools available to UNIX users, although some have been written by third-party vendors for specific applications. Typically, tools are grouped into categories for certain functions, such as word processing, business applications, or programming.

The UNIX Kernel

Technically speaking, the UNIX kernel 'is' the operating system. It provides the basic full-time software connection to the hardware.

By full time, I mean that the kernel is always running while the computer is turned on. When the system boots up, the kernel is loaded. Likewise, the kernel is only exited when the computer is turned off. The UNIX kernel is built specifically for a machine when it is installed. It has a record of all the pieces of hardware it needs to talk to and knows what languages they speak (how to turn switches on and off to get a desired result). Thus, a kernel is not easily ported to another computer.

Each individual computer will have its own tailor-made kernel. And if the computer's hardware configuration changes during its life, the kernel must be 'rebuilt' (told about the new pieces of hardware). However, though the connection between the kernel and the hardware is 'hard coded' to a specific machine, the connection between the user and the kernel is generic. Regardless of how the kernel interacts with the hardware, no matter which UNIX computer you use, you will have the same kernel interface to work with. The kernel also handles memory management, input and output requests, and process scheduling for time-shared operations (we'll talk more about what this means later). To help it with its work, the kernel also executes daemon programs which stay alive as long as the machine is turned on and help perform tasks such as printing or serving web documents.

The UNIX operating system has built-in 'shells' which wrap around the kernel and provide a much more user-friendly interface. Let's take a look at shells. There are five commonly used shells:

/bin/sh - the Bourne Shell. This is the oldest, least-common-denominator shell with the fewest commands. Programs created with commands from the Bourne shell (called scripts) will run on some of the other shells (particularly BASH and KSH below).

/bin/bash - the Bourne Again Shell, or BASH. This is the shell developed and maintained by GNU, AKA the Free Software Foundation. BASH is the default shell in Linux, and is an extension of the Bourne shell. This means that scripts developed in the Bourne shell will run in BASH, but BASH scripts will not necessarily run in the Bourne shell. All of the commands available in the Bourne shell are also available in BASH, as well as many more.

/bin/ksh - the Korn Shell, or KSH. This shell, another extension of the Bourne shell, was developed by David Korn. It contains all the commands from the Bourne shell plus lots of extras. Again, Bourne shell scripts will always run in KSH, but the reverse is not necessarily true. Linux distributions and Cygwin are sometimes shipped without KSH.

/bin/csh - the C Shell, or CSH. Using the C shell is NOT recommended. Programming in the C shell is considered harmful. You have been warned.

/bin/tcsh - the TC Shell, or TCSH. A useful extension of the C shell that is OK to use.

Structure of the file system

UNIX file systems look a bit like an inverted tree. At the top of the tree is the single directory /, the root directory.

This corresponds roughly to Windows' C: directory. Do not confuse the root directory with the root user! In some UNIX systems, the root user's home directory is /, but in others (notably Linux), the root user has a special home called /root. Inside the root directory are the main directories where the UNIX files live. Nearly all of these also have directories inside them, and these subdirectories often have many levels of subdirectories of their own. Hence, UNIX has a hierarchical file system (directories inside directories) as opposed to a flat file system (as in the ancient Macintoshes) where only one level of directory is allowed.

Files in the computer have to be in some directory somewhere, even if it is just the root directory. A few of the subdirectories of the root directory are pretty common to all UNIX systems:

/bin - system-wide executables go here
/dev - all the hardware device files go here
/etc - all system-wide configuration files go here
/home - users' home directories live here
/lib - for executable library files only
/tmp - for temporary files - this one can be cleaned out since no valuable stuff should be put here
/usr - this is where programs (non-system executables) live - usually the biggest directory because all programs used by everybody go here
/var - a special directory that contains print queues and other weird beasts

The UNIX file system is organised as a hierarchy of directories starting from a single directory called root, which is represented by a / (slash). Imagine it as being similar to the root system of a plant or as an inverted tree structure. Immediately below the root directory are several system directories that contain information required by the operating system.

The file holding the UNIX kernel is also here.

The PDP-7 Unix file system

Structurally, the file system of PDP-7 Unix was nearly identical to today's. It had

1) An i-list: a linear array of i-nodes each describing a file. An i-node contained less than it does now, but the essential information was the same: the protection mode of the file, its type and size, and the list of physical blocks holding the contents.
2) Directories: a special kind of file containing a sequence of names and the associated i-number.
3) Special files describing devices. The device specification was not contained explicitly in the i-node, but was instead encoded in the number: specific i-numbers corresponded to specific files.

Process control

By 'process control,' I mean the mechanisms by which processes are created and used; today the system calls fork, exec, wait, and exit implement these mechanisms. Unlike the file system, which existed in nearly its present form from the earliest days, the process control scheme underwent considerable mutation after PDP-7 Unix was already in use. Today, the way in which commands are executed by the shell can be summarized as follows:

1) The shell reads a command line from the terminal.
2) It creates a child process by fork.
3) The child process uses exec to call in the command from a file.
4) Meanwhile, the parent shell uses wait to wait for the child (command) process to terminate by calling exit.
5) The parent shell goes back to step 1).

Processes (independently executing entities) existed very early in PDP-7 Unix. There were in fact precisely two of them, one for each of the two terminals attached to the machine. There was no fork, wait, or exec. There was an exit, but its meaning was rather different, as will be seen.
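Before turning to the PDP-7 shell, the modern five-step sequence summarized above can be sketched in a few lines. What follows is only a minimal illustration in Python, assuming a POSIX system; the names run_command and shell_loop are invented for this sketch, and a real shell also deals with quoting, built-in commands, redirection, and much else.

```python
import os
import sys


def run_command(argv):
    """Run one command as in steps 2-4: fork, exec in the child, wait in the parent."""
    pid = os.fork()                       # step 2: create a child process
    if pid == 0:
        try:
            os.execvp(argv[0], argv)      # step 3: does not return on success
        except OSError as err:
            print(f"{argv[0]}: {err.strerror}", file=sys.stderr, flush=True)
        finally:
            os._exit(127)                 # reached only if exec failed
    _, status = os.waitpid(pid, 0)        # step 4: wait for the child to exit
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)      # shell-style code for death by signal


def shell_loop():
    """Steps 1 and 5: read a command line, run it, and go back for the next one."""
    while True:
        try:
            line = input("$ ")            # step 1: read a command line
        except EOFError:
            break
        argv = line.split()               # naive word splitting, no quoting
        if argv:
            run_command(argv)
```

Calling shell_loop() starts the read-fork-exec-wait cycle; run_command(["ls", "-l"]) runs a single command and returns its exit status.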

The main loop of the shell went as follows.

1) The shell closed all its open files, then opened the terminal special file for standard input and output (file descriptors 0 and 1).
2) It read a command line from the terminal.
3) It linked to the file specifying the command, opened the file, and removed the link. Then it copied a small bootstrap program to the top of memory and jumped to it; this bootstrap program read in the file over the shell code, then jumped to the first location of the command (in effect an exec).
4) The command did its work, then terminated by calling exit. The exit call caused the system to read in a fresh copy of the shell over the terminated command, then to jump to its start (and thus in effect to go to step 1).

The initial implementation of fork required only

1) Expansion of the process table
2) Addition of a fork call that copied the current process to the disk swap area, using the already existing swap IO primitives, and made some adjustments to the process table.

In fact, the PDP-7's fork call required precisely 27 lines of assembly code. Of course, other changes in the operating system and user programs were required, and some of them were rather interesting and unexpected. But a combined fork-exec would have been considerably more complicated, if only because exec as such did not exist; its function was already performed, using explicit IO, by the shell. The exit system call, which previously read in a new copy of the shell (actually a sort of automatic exec but without arguments), simplified considerably; in the new version a process only had to clean out its process table entry, and give up control.

IO Redirection

The very convenient notation for IO redirection, using the '>' and '<' characters, [...] Where under Unix one might say ls >xx to get a listing of the names of one's files in xx, on Multics the notation was iocall attach user_output file [...] attach user_output syn user_i/o. Even though this very clumsy sequence was used often during the Multics days, and would have been utterly straightforward to integrate into the Multics shell, the idea did not occur to us or anyone else at the time.

I speculate that the reason it did not was the sheer size of the Multics project: the implementors of the IO system were at Bell Labs in Murray Hill, while the shell was done at MIT. We didn't consider making changes to the shell (it was their program); correspondingly, the keepers of the shell may not even have known of the usefulness, albeit clumsiness, of iocall. (The 1969 Multics manual [4] lists iocall as an 'author-maintained,' that is non-standard, command.) Because both the Unix IO system and its shell were under the exclusive control of Thompson, when the right idea finally surfaced, it was a matter of an hour or so to implement it.

Pipes

One of the most widely admired contributions of Unix to the culture of operating systems and command languages is the pipe, as used in a pipeline of commands. Of course, the fundamental idea was by no means new; the pipeline is merely a specific form of coroutine.

Even the implementation was not unprecedented, although we didn't know it at the time; the 'communication files' of the Dartmouth Time-Sharing System [10] did very nearly what Unix pipes do, though they seem not to have been exploited so fully. Pipes appeared in Unix in 1972, well after the PDP-11 version of the system was in operation, at the suggestion (or perhaps insistence) of M. D. McIlroy, a long-time advocate of the non-hierarchical control flow that characterizes coroutines. Some years before pipes were implemented, he suggested that commands should be thought of as binary operators, whose left and right operands specified the input and output files. Thus a 'copy' utility would be commanded by

    inputfile copy outputfile

To make a pipeline, command operators could be stacked up.

Thus, to sort input, paginate it neatly, and print the result off-line, one would write

    input sort paginate offprint

In today's system, this would correspond to

    sort input | pr | opr

The idea, explained one afternoon on a blackboard, intrigued us but failed to ignite any immediate action. There were several objections to the idea as put: the infix notation seemed too radical (we were too accustomed to typing 'cp x y' to copy x to y); and we were unable to see how to distinguish command parameters from the input or output files. Also, the one-input one-output model of command execution seemed too confining. What a failure of imagination!

Some time later, thanks to McIlroy's persistence, pipes were finally installed in the operating system (a relatively simple job), and a new notation was introduced. It used the same characters as for I/O redirection. For example, the pipeline above might have been written

    sort input >pr>opr>

The idea is that what follows a '>' may be either a file, to specify redirection of output to that file, or a command into which the output of the preceding command is directed as input. The trailing '>' was needed in the example to specify that the (nonexistent) output of opr should be directed to the console; otherwise the command opr would not have been executed at all; instead a file opr would have been created.

The new facility was enthusiastically received, and the term 'filter' was soon coined. Many commands were changed to make them usable in pipelines. For example, no one had imagined that anyone would want the sort or pr utility to sort or print its standard input if given no explicit arguments.

Soon some problems with the notation became evident. Most annoying was a silly lexical problem: the string after '>' was delimited by blanks, so, to give a parameter to pr in the example, one had to quote:

    sort input >'pr -2'>opr>

Second, in an attempt to give generality, the pipe notation accepted '<'; this meant that the notation was not unique. One could also write, for example, or.
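In today's notation, a pipeline such as sort input | pr, optionally ending in a '>' redirection to a file, is put together by the shell from pipe, fork, and exec. The following is a rough sketch of that arrangement in Python, assuming a POSIX system; run_pipeline and its out_path parameter are names invented for this illustration, not part of any historical or current shell.

```python
import os
import sys


def _exit_code(status):
    """Translate a waitpid status into a shell-style exit code."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)


def run_pipeline(commands, out_path=None):
    """Run `c0 | c1 | ... [>out_path]`; commands is a list of argv lists."""
    pids = []
    prev_read = None  # read end of the pipe that feeds the next command
    for i, argv in enumerate(commands):
        is_last = i == len(commands) - 1
        read_fd, write_fd = (None, None) if is_last else os.pipe()
        pid = os.fork()
        if pid == 0:
            try:
                if prev_read is not None:
                    os.dup2(prev_read, 0)  # stdin: output of the previous command
                if write_fd is not None:
                    os.dup2(write_fd, 1)   # stdout: input of the next command
                elif out_path is not None:
                    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
                    os.dup2(os.open(out_path, flags, 0o644), 1)  # stdout: the file
                # Python marks the original descriptors close-on-exec, so only
                # the duplicated 0 and 1 survive into the command.
                os.execvp(argv[0], argv)
            except OSError as err:
                print(f"{argv[0]}: {err.strerror}", file=sys.stderr, flush=True)
            finally:
                os._exit(127)              # reached only if setup or exec failed
        # Parent: close the ends it no longer needs so readers can see end-of-file.
        if prev_read is not None:
            os.close(prev_read)
        if write_fd is not None:
            os.close(write_fd)
        prev_read = read_fd
        pids.append(pid)
    return [_exit_code(os.waitpid(pid, 0)[1]) for pid in pids]
```

Under these assumptions, run_pipeline([["sort", "input"], ["pr"]], out_path="xx") plays the role of sort input | pr >xx, where input and xx are placeholder file names. Each command sees only its standard input and output, which is why a program written as a filter needs no knowledge of its neighbours in the pipeline.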