$A just=1;invert=0;undsh='~';capsh='~';line=72;dcpi=12;und='~';cap='~'
   $A left=2;pageno=1
   $L15UM












The Structure of the EMAS 2900 Kernel


   $L9M

D. J. Rees


Department of Computer Science
Edinburgh University



   $P
   The role of the kernel of the Operating System EMAS 2900 and the
implementation of its functions are described in some detail. The
significance of local scheduling policies and their implications for the
design of the kernel are discussed with particular reference to paging
management and scheduling control. It is shown that the concept of local
and global control of resources can lead to a considerable
simplification in the structure of an operating system kernel. The
resulting EMAS 2900 provides time-sharing services very effectively and
efficiently to a large computing community.
   $N
   $L4U


Introduction

   $P
   EMAS 2900 is a multi-access time-sharing system for the ICL 2900
series of computers which was developed by a small group of staff from
the Department of Computer Science and the Edinburgh Regional Computing
Centre in Edinburgh University.  The development was in essence a
reimplementation of the EMAS system, which ran on the ICL System 4-75
computer. For the most part it used the same underlying philosophy, but
it took into account the experience of several years' use and
improvement of EMAS and the new insights which had resulted.  As with
any large
system, the EMAS implementation had become somewhat untidy and more
difficult to maintain as time went by and as the original team left the
scene. A significant goal was therefore to achieve a simplification and
"cleaning-up" of the system in order to facilitate future in-service
developments and improvements and to postpone the inevitable point in
time when complexity escalation and the consequent poorly-understood
interactions make further changes difficult to contemplate. There was
also a strong desire to investigate and exploit the architecture of the
2900 series in relation to multi-access systems. An overall view of this
project has been described by Stephens et al. [ST80]. The purpose of the
present paper is to describe the kernel of the system in some detail, in
particular the organisation of the virtual memory control and
scheduling.
   $P
   The objectives and structure of EMAS were described by Whitfield et
al. [WH73] and much of that has been carried forward to EMAS 2900. The
objectives remain very similar, namely, to provide a large-scale
interactive system which both gives good facilities and response to
users and makes efficient use of the hardware resources. Technological
developments since the days of System-4 have changed some aspects. The
trend, for instance, towards machines with very much larger main stores
and away from drum storage as a paging medium has affected the approach
to scheduling. From the standpoint of the kernel, the most significant
features are the provision of a virtual machine for each of a large
number of users, the controlled sharing of information between those
virtual machines and the requirement for response and efficiency. A
virtual machine in EMAS 2900 consists of a virtual address space and a
virtual processing unit which provides access to the non-privileged
instruction set and to a wide variety of system services including
input-output services and a comprehensive file storage system.  The term
"process" is used here to signify the operation of such a virtual
machine. What the user sees at his console is at a much higher level
than this since he is insulated from the raw virtual machine by a
"sub-system" which provides a command interpreter and a convenient
interface to the system services.
   $P
   One of the most important design concepts in both EMAS systems has
been that of process-local page replacement policies. Overall
performance has vindicated this choice in comparison with systems using
global policies and theoretical studies also bear out the wisdom of this
choice [DE78]. The full recognition of the significance of this policy
motivated what is perhaps the main difference in structure between the
two EMAS systems. Whereas in EMAS the implementation of the local
policies was intermingled with the rest of the resident kernel, in the
EMAS 2900 system a very clear separation has been made between local
policy controllers and controllers of global functions. In principle,
each process contains an incarnation of a local controller whose
function is to control those and only those resources which have been
allocated to that process by the global scheduling controller. This
notion of completely separated local controllers has also been utilised
with very great benefit in the design of the communications sub-system,
described by Laing and Shelness in [LA81].
   $P
   The implementation of this policy of separation was greatly
facilitated by the organisation of 2900 virtual address spaces.  Each
virtual address space of 2**32 bytes is divided into two halves, a
"local" half unique to each process and a "public" half which is shared
by all processes.  The local half contains all the programs and data in
use by the user of that particular process together with an incarnation
of the local controller, the director and the sub-system. The director
is the innermost layer of software of a process and incorporates many of
the local system services such as the file system services. Its
functions in EMAS 2900 remain the same as in EMAS, as described in
[RE75]. The sub-system implements the next layer of the hierarchy which
includes the basic command interpreter, editors, compilers and loaders
etc. As much as possible of this material is shared between processes
using the standard sharing mechanisms of the system.  This includes the
director and sub-system code together with all the compilers and editors
etc. that the user may happen to be using. The local controller code is
also shared but it was found convenient to compile this as a module of
the kernel, as will be clarified later. The public half of the virtual
address space contains the kernel of the system, i.e. the global
controller, the message passing dispatcher, device handlers etc. The
arrangement of processes and controllers is shown in figure 1.
   $V23
   $L23


<----------------local--------------------><-----------public----------->
 _______________________________________________________________________
|            |          |        |        ||                            |
|   local    | director |  sub-  |  user  ||           kernel           |
| controller |          | system |        ||  (global controller etc.)  |
|____________|__________|________|________||____________________________|
                                          |
 _________________________________________|     _
|            |          |        |        |      |
|            |          |        |        |      |
|____________|__________|________|________|      |
                                          |      |
                                          .      |  other
                                          .      | processes
 _________________________________________       |
|            |          |        |        |      |
|            |          |        |        |      |
|____________|__________|________|________|     _|

                              figure 1

   $P
   The kernel thus appears in every process address space and indeed
always runs in virtual mode, unlike on most earlier hardware designs
such as the 4-75, where it ran in real address mode.  Switching between
the
current local space and the kernel therefore does not involve switching
virtual machines. Peripheral interrupts, for example, can be directed to
an address in the kernel in the same virtual space whilst interrupts
such as page faults and local process time-outs can be taken directly by
the local controller. There they can immediately be dealt with according
to the resources that have been allocated to that process.  The
resources in question are primarily pages of physical storage and
CPU-time but there are also various other internally defined resources
such as "active memory" sections (described below) which also have to be
controlled.
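   $P
   The division of each virtual address space into a local and a public
half can be illustrated with a short sketch. It is written in Python
purely for illustration and is not taken from the EMAS 2900 sources.
The placing of the public half in the upper half of the address space,
and the names PUBLIC_BASE and half_of, are assumptions made for the
example.

```python
ADDRESS_BITS = 32
SPACE_SIZE = 2 ** ADDRESS_BITS   # 2**32 bytes, as stated in the text
HALF_SIZE = SPACE_SIZE // 2      # each half therefore covers 2**31 bytes

# ASSUMPTION: the public half is taken to be the upper half of the
# address space (top address bit set). The text does not say which
# half is which.
PUBLIC_BASE = HALF_SIZE


def half_of(address):
    """Classify a virtual byte address as 'local' or 'public'."""
    if not 0 <= address < SPACE_SIZE:
        raise ValueError("address outside the 2**32 byte virtual space")
    return "public" if address >= PUBLIC_BASE else "local"
```

   $P
   Under these assumptions an address in the kernel always classifies
as public whichever process is running, which is why entering the
kernel does not require a change of virtual machine.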
   $P
   The basic page size of the 2900 series architecture is 1K bytes. In
the light of the implementors' experience on EMAS we decided that this
would probably be too small for best efficiency. To overcome this
problem EMAS 2900 groups these basic pages together to form larger
units. We had originally hoped to be able to experiment with different
unit sizes so as to choose the most efficient but this proved to be
infeasible mainly due to the difficulties of varying physical block
formats on disc and magnetic tape. The unit size we fixed on was the
same 4K bytes that we had used on EMAS and this gave a useful continuity
in addition to being what we felt was a sensible choice. "Pages"
hereinafter therefore refer to these 4K byte units, each made up of
four basic pages.
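   $P
   The relationship between basic pages and the larger pages can be
made concrete with a small Python sketch. The two sizes are those
given above. The function names, and the assumption that each 4K page
is aligned on a 4K boundary and made up of four consecutive basic
pages, are ours and are for illustration only.

```python
BASIC_PAGE_BYTES = 1024     # basic page size of the 2900 architecture
PAGE_BYTES = 4096           # page (unit) size chosen for EMAS 2900
BASIC_PER_PAGE = PAGE_BYTES // BASIC_PAGE_BYTES


def page_of(address):
    """Return (page number, byte offset within that page)."""
    return divmod(address, PAGE_BYTES)


def basic_pages_in(page):
    """Return the basic page numbers grouped into one 4K page.

    ASSUMPTION: pages are 4K aligned, so a page consists of
    BASIC_PER_PAGE consecutive basic pages.
    """
    first = page * BASIC_PER_PAGE
    return list(range(first, first + BASIC_PER_PAGE))
```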
   $P
   The whole of the operating system is written in IMP, the language
used in Edinburgh University for most systems implementation work. This
was described by Stephens in [ST74]. The architecture of the 2900 series
was designed very much with the use of this kind of high level language
in mind [BU78]. In particular, the hardware defines a stack segment
which can be used as the stack for IMP storage allocation and procedure
calling protocols. A potential problem for an operating system kernel is
the size of its internal arrays when the number of users and processes
is likely to vary considerably either over time or over the various
machines in a range such as the 2900 series. This has been
overcome in the
EMAS 2900 kernel by making use of the fact that it runs in virtual space
itself although permanently resident. A dynamic scheme is used. Each
array that may need to be extended is mapped into a separate segment.
Initially, a suitable minimum size is chosen and thereafter physical
store pages claimed from the free page-frame list in the ordinary way
can be added onto the end of the segment and locked down as and when
required. An extra entry is then appended to the appropriate page table.
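   $P
   The dynamic scheme can be sketched as follows. This is a
hypothetical Python model and not the IMP code of the kernel: the
class ExtendableSegment, its methods and the representation of a page
table entry are all assumptions introduced for the example, as is the
use of 4K pages as the unit of extension.

```python
PAGE_BYTES = 4096   # ASSUMPTION: segments grow in units of one 4K page


class ExtendableSegment:
    """Hypothetical model of a kernel array mapped into its own segment."""

    def __init__(self, free_frames, initial_pages):
        self.free_frames = free_frames   # the shared free page-frame list
        self.page_table = []             # one entry per resident page
        for _ in range(initial_pages):   # the "suitable minimum size"
            self.extend()

    def extend(self):
        """Claim one free page-frame, lock it down and map it at the end."""
        if not self.free_frames:
            raise MemoryError("no free page-frames")
        frame = self.free_frames.pop()
        # The extra entry appended to the page table for this segment.
        self.page_table.append({"frame": frame, "locked": True})

    def capacity(self, entry_bytes):
        """Number of array entries that currently fit in the segment."""
        return len(self.page_table) * PAGE_BYTES // entry_bytes

    def ensure(self, entries, entry_bytes):
        """Extend the segment until it can hold the requested entries."""
        while self.capacity(entry_bytes) < entries:
            self.extend()
```

   $P
   In this model an array of 64 byte entries that starts with one page
holds 64 entries, and a call of ensure(100, 64) adds a second page.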
   $P
   In order to coordinate the sharing of information, physical movement
of pages is initiated and controlled globally but this is in response to
requests from local controllers. Each local controller is unaware of any
sharing of pages which is taking place between processes. It makes
requests purely for certain pages to be made available to it in main
store. The global controller then fulfils the request as best it can.
For instance, if the page is already in store being used by another
process or is still in store from a previous usage but that physical
page-frame has not yet been re-allocated, the global controller can
simply tell the requesting local controller where the required page is
and allow it to continue without having to initiate a page transfer from
backing store.  This is all transparent to the local controller.
Similarly, when a local controller decides that the process it controls
no longer needs a particular page, it just tells the global controller
and leaves the global controller to get on with removing it while it
itself continues. If the global controller knows that another process is
still using the page it will not need to page it out. This method of
operation, in which the local controller has to be concerned only with
its own individual process, makes implementation of the local controller
very much easier and the result much more reliable. The details of its
data structures and implementation are described later.
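   $P
   The division of labour between local and global controllers over
page movement can be sketched in Python. This is a minimal model under
stated assumptions and not the actual kernel data structures: the
class GlobalController, its dictionaries, the transfer counter and the
first-in first-out re-use of free page-frames are all hypothetical. A
local controller is represented only by its calls of request and
release.

```python
class GlobalController:
    """Hypothetical model of globally controlled page movement."""

    def __init__(self, frames):
        self.free = list(frames)   # page-frames open to re-allocation
        self.in_store = {}         # page -> frame whose contents it still is
        self.users = {}            # page -> number of processes using it
        self.transfers = 0         # transfers made from backing store

    def request(self, page):
        """A local controller asks for a page to be made available."""
        frame = self.in_store.get(page)
        if frame is None:
            # Not in store: a transfer from backing store is needed.
            frame = self._claim_frame()
            self.in_store[page] = frame
            self.users[page] = 0
            self.transfers += 1
        elif self.users[page] == 0:
            # Still in store from a previous usage and the frame has
            # not yet been re-allocated: reclaim it without a transfer.
            self.free.remove(frame)
        # Otherwise another process is using it and it is simply shared.
        self.users[page] += 1
        return frame

    def _claim_frame(self):
        if not self.free:
            raise MemoryError("no free page-frame")
        frame = self.free.pop(0)
        # Re-allocation: any page previously held in the frame is lost.
        for old_page, old_frame in list(self.in_store.items()):
            if old_frame == frame:
                del self.in_store[old_page]
                del self.users[old_page]
        return frame

    def release(self, page):
        """A local controller says its process no longer needs a page.

        Returns True if removal from store was initiated, False if
        another process is still using the page.
        """
        self.users[page] -= 1
        if self.users[page] > 0:
            return False
        self.free.append(self.in_store[page])
        return True
```

   $P
   In this model two processes requesting the same page cause a single
transfer, and a page released by its last user can be requested again
without a transfer so long as its page-frame has not been re-allocated.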
