Construction and architecture of a dedicated compile server


Gaius Mulley and Keith Verheyden
School of Computing
University of Glamorgan

ABSTRACT

A compilation server has been built upon a microkernel, employing a lightweight dedicated remote procedure call protocol, a reduced but fast local file system and input/output libraries which exploit these characteristics. The performance of the compilation server remote procedure call protocol has been measured and it can outperform NFS by a factor of five. The architecture of the server and clients also contributes to increased performance and ease of maintenance.

Introduction

Traditionally a department network is constructed around a large filestore which serves a number of clients. The filestore is normally directly manipulated by a dedicated fileserver host which obeys remote procedure calls issued by the client. The performance of such a configuration is limited by the network throughput, the fileserver and client response time. In a teaching department where laboratory work is being undertaken on many clients, great demands are briefly placed on a fileserver and clients for the duration of the timetabled laboratory.

There is a continual push to maximize laboratory resources which is countered by technology obsolescence. Typically a workstation has a realistic life span of three years. It is often the case that after this period the older machines are perceived as being of less use. Although the new machines might have improved resources, the older machines, while unable to deliver the performance required by new applications, might be capable of functioning as dedicated servers.

As Pike ¹⁰ reported, the workstation reaches redundancy quickly as it is too slow for fast compilation and too expensive just to be used as an X terminal. The approach taken by Plan 9 ¹¹ and Chorus ¹ is to rely more on specialist system and application servers. Users interact with inexpensive graphic terminals in the case of Plan 9 and X terminals with Chorus. This method is pragmatic and utilizes appropriate hardware for a particular task.

The approach presented here is to examine a specialist application server on a network supporting two operating systems. This paper will report on the benefits of a compilation server, the design decisions taken and some preliminary performance results of the server.

Previous work

Onodera ⁹ describes a compilation server for a C based object oriented language (COB) developed at IBM’s Tokyo Research Laboratory. This server is a process which handles successive compilation requests from the client. The server retains the internal data structures generated while serving a request and reuses them in successive compilations. Thus header files need to be initialized only once. Onodera shows that compilation time can be substantially reduced using pre-parsed header files on successive compilations.

Koehler and Horspool ⁶ report on a compilation server built from lcc ⁴. This approach builds up a partial symbol table from the macros defined in header files and ensures that the cache is consistent from one compilation to another. It is shown that on average successive compilation times reduce to 38% of the original processing time.

However, user security (ownership of header files) is not addressed by either of these approaches. In a university teaching environment this is an important consideration.

A common method for installing a compiler is to have one copy locally held on each client workstation. The compiler and associated utilities and library files could be maintained by rdist ³. This tool has been around since the release of 4.3 BSD UNIX and is used to automatically distribute software across a number of networked workstations. Using rdist at Glamorgan would not be practical as not all workstations may be running UNIX (our laboratory machines are configured to run UNIX and Windows NT) and it is believed the performance problems described later would still remain.

The efficiency of the workstation in performing traditional compilation is questionable at Glamorgan. Over a period of several months of typical laboratory use, all compilations and their elapsed and required execution times have been logged. These results are shown in graph 1.

Graph 1: Cumulative frequency graph for compilation speed vs % of compilations

The results show that over half of all compilations, when run on client workstations, take at least twice as long to execute as is necessary. Note that it is extremely unlikely that more than one compilation was being executed at a time on any individual workstation since the students were developing code in the Ceilidh environment ⁷,² which only activates one compilation per workstation per user. However, the slow compilation time can be explained if the workstation is blocked waiting for network based input/output or, alternatively, if the workstation is carrying out other activities. The alternative to local compilation in a workstation environment, proposed in this paper, is to integrate a dedicated compilation server seamlessly onto the network.
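The measure behind graph 1 can be read as the ratio of elapsed time to required execution time for each logged compilation. The following is a minimal sketch of how such a figure might be derived from a log, not the authors' logging tool. The record layout and the names compile_log and fraction_slower_than are assumptions made for illustration.

```c
#include <stddef.h>

/* One logged compilation (hypothetical record layout, not from the paper). */
struct compile_log {
    double elapsed_s;   /* wall-clock time the compilation took */
    double required_s;  /* execution time the compilation actually needed */
};

/* Fraction (0.0 to 1.0) of logged compilations whose elapsed time is at
   least `factor` times the required time. Records with a non-positive
   required time are skipped. Returns 0.0 when no record is counted. */
double fraction_slower_than(const struct compile_log *log, size_t n,
                            double factor)
{
    size_t counted = 0, slow = 0;

    for (size_t i = 0; i < n; i++) {
        if (log[i].required_s <= 0.0)
            continue;
        counted++;
        if (log[i].elapsed_s >= factor * log[i].required_s)
            slow++;
    }
    return counted ? (double)slow / (double)counted : 0.0;
}
```

Calling fraction_slower_than with a factor of 2.0 gives the proportion of compilations that took twice as long or longer than necessary, which is the quantity reported above as exceeding one half.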

Servant architecture

The approach taken in this paper is to produce a compilation server built on a microkernel, specialist remote procedure calls and file system components. User modules, library modules and temporary assembly files are all cached contiguously. The compiler and assembler are modified to exploit contiguous file provision. The goals of this research are threefold: improved compiler performance, isolation from operating system upgrades, and ease of maintenance.

The proposed architecture is shown in figure 1. A network host is configured as the compilation server (cs). This host is a dedicated machine (for example a 486/Pentium class PC) running a microkernel, specialist RPCs and a simple memory based filesystem. The other components, namely the user agent (ua) and compilation server daemon (csd), run on UNIX workstations. At Glamorgan the ua typically runs on a student workstation and the csd runs on the fileserver.

The ua constructs a compilation server command (see table 1) and passes a list of arguments and environment variables to the csd running on the fileserver (figure 1: {1}). The csd interprets argv[0] and decides whether the command should be executed locally or whether the command must be passed to the compilation server (figure 1: {3}). Should a compilation be requested then the compilation server checks its cache and the fileserver for the most current source of the module. The module is compiled and an assembler file produced and held locally on the compilation server. GNU as permanently resides on the compilation server and has been linked with a simplified libc which exploits the contiguous storage of all local files. Once the object file has been created it is transferred to the fileserver (figure 1: {4}). Finally the exit status is sent to the appropriate ua (figure 1: {2}).
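The routing decision made by the csd on argv[0] might be sketched as a table lookup. This is an illustration under stated assumptions, not the csd source: the names csd_route, command_entry and command_table are hypothetical, and the local/server placement of each command below is a guess (only the forwarding of compilations to the server follows from the text; the actual placement is given in table 1).

```c
#include <stddef.h>
#include <string.h>

enum route { ROUTE_LOCAL, ROUTE_SERVER, ROUTE_UNKNOWN };

struct command_entry {
    const char *name;   /* command name as it arrives in argv[0] */
    enum route  where;  /* where the csd has the command executed */
};

/* ASSUMPTION: this placement is illustrative only; see table 1 for the
   real location of each command. */
static const struct command_entry command_table[] = {
    { "m2f",  ROUTE_SERVER },
    { "csls", ROUTE_SERVER },
    { "csdf", ROUTE_SERVER },
    { "cq",   ROUTE_LOCAL  },
    { "rq",   ROUTE_LOCAL  },
};

/* Decide where a request should run, based on the last path component
   of argv[0]. Unrecognised commands yield ROUTE_UNKNOWN. */
enum route csd_route(const char *argv0)
{
    const char *slash = strrchr(argv0, '/');
    const char *name = slash ? slash + 1 : argv0;
    size_t n = sizeof command_table / sizeof command_table[0];

    for (size_t i = 0; i < n; i++)
        if (strcmp(name, command_table[i].name) == 0)
            return command_table[i].where;
    return ROUTE_UNKNOWN;
}
```

Keeping the placement in a single table means that moving a command between the fileserver and the compilation server would not require any change on the clients.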

Figure 1: Compilation server architecture

Table 1 lists the commands together with a description and their location. From the user's perspective these commands appear local; in fact m2f, cq, rq, csls and csdf are symbolic links to ua, which is held on each client.
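Because every command is a symbolic link to ua, the ua can recover the requested command from the name under which it was invoked and then forward the arguments and environment. The sketch below shows one way this might look. The flat request layout (command name, arguments, an empty string, environment entries, an empty string) and the names invoked_name, pack_vector and ua_pack_request are assumptions; the paper does not describe the wire format.

```c
#include <stddef.h>
#include <string.h>

/* Return the last path component of argv[0], i.e. the name of the
   symbolic link through which ua was invoked. */
const char *invoked_name(const char *argv0)
{
    const char *slash = strrchr(argv0, '/');
    return slash ? slash + 1 : argv0;
}

/* Append a NULL-terminated string vector to buf as consecutive
   NUL-terminated strings followed by one empty string. Returns the new
   offset, or 0 if the buffer is too small. Requires off <= cap. */
static size_t pack_vector(char *buf, size_t off, size_t cap,
                          char *const vec[])
{
    for (size_t i = 0; vec[i] != NULL; i++) {
        size_t len = strlen(vec[i]) + 1;
        if (len > cap - off)
            return 0;
        memcpy(buf + off, vec[i], len);
        off += len;
    }
    if (off >= cap)
        return 0;
    buf[off++] = '\0';
    return off;
}

/* Build a request in buf (hypothetical layout, see above). Returns the
   number of bytes used, or 0 if the request does not fit. */
size_t ua_pack_request(char *buf, size_t cap,
                       char *const argv[], char *const envp[])
{
    const char *name = invoked_name(argv[0]);
    size_t off = strlen(name) + 1;

    if (off > cap)
        return 0;
    memcpy(buf, name, off);
    off = pack_vector(buf, off, cap, argv + 1);
    if (off == 0)
        return 0;
    return pack_vector(buf, off, cap, envp);
}
```

With this layout an invocation through a link named m2f produces a request that begins with the string "m2f", whatever directory the link lives in.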

A degree of isolation from impromptu operating system upgrades can be obtained through this proposed architecture. In the past the authors have been at the receiving end of an upgrade of client based compilers, the newer version having a different executable file format to the older one. A software development class, using a Modula-2 compiler and gcc related sub tools, was severely disrupted since these tools were dependent upon the older executable file format. By having all compiler utilities held on the compilation server and locating only the ua on the clients, this problem will not occur.

Table 1: Compilation server associated commands

Ultimately, using the architecture above, it should be possible to generate a set of software development tools which are independent of the operating system. Only the ua need be ported to the different operating systems.

Compilation server file system: Local File System (LFS)

The file operations required by the server are extremely limited. Only open, close, sequential read and write operations are permitted. Only one intermediate file and one object file are created at a time, and all files are held contiguously in main memory. File ownership and permissions are maintained within this file system. The LFS was in part inspired by the Bullet file system ¹² but with a few simplifications due to the restricted nature of the compilation server requirements. The desired high performance can, in part, be achieved by exploiting these simple requirements. Most internal copying of file fragments (typically regarded as efficient buffering on traditional workstations using filesystems held on magnetic media) can be eliminated by giving applications direct access to the raw file held in the LFS. On the compilation server access to these files is achieved through minor modifications to stdio ⁵ in libc. In the microkernel the following function is introduced:

(*
lfsmemmap - attempt to map a file's contents into memory.
If lfsmemmap is successful then the variable a
is assigned the address of the memory file image
and the file length is returned.
If the file cannot be located in memory then
a is set to NIL and -1 is returned.
*)

PROCEDURE lfsmemmap (fd: INTEGER; VAR a: ADDRESS) : INTEGER ;

which allows stdio to obtain the address and length of a file held in the LFS. The LFS is split into four regions: temp, user, libs and object.

	libs		Contains all definition and implementation library files.
	user		Contains user modules which are to be compiled.
	temp		Only ever contains one file, the intermediate assembler file created by the compiler. This is then read by the assembler and the corresponding object file is created.
	object		The object file created by the assembler. Once the assembler has closed this file it is transferred to the fileserver.
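The way a modified stdio might use lfsmemmap to read a file without intermediate copying can be sketched in C. This is a minimal sketch under stated assumptions and not the authors' libc changes: the C binding of lfsmemmap, the file table lfs_table, and the names lfs_stream, lfs_open_stream and lfs_getc are all hypothetical.

```c
#include <stddef.h>
#include <stdio.h>   /* for EOF */

enum lfs_region { LFS_LIBS, LFS_USER, LFS_TEMP, LFS_OBJECT };

/* Hypothetical file table entry: each file is one contiguous run of
   bytes in main memory. */
struct lfs_file {
    enum lfs_region region;
    const char     *image;   /* start of contents, NULL if not in memory */
    int             length;  /* length in bytes */
};

#define LFS_MAX_FILES 16
static struct lfs_file lfs_table[LFS_MAX_FILES];

/* Assumed C rendering of lfsmemmap: on success store the address of the
   file image in *a and return the file length; otherwise set *a to NULL
   and return -1. */
int lfsmemmap(int fd, const void **a)
{
    if (fd < 0 || fd >= LFS_MAX_FILES || lfs_table[fd].image == NULL) {
        *a = NULL;
        return -1;
    }
    *a = lfs_table[fd].image;
    return lfs_table[fd].length;
}

/* A stdio-like stream that reads straight from the file image, so no
   data is copied into a separate buffer. */
struct lfs_stream {
    const unsigned char *base;
    int length;
    int pos;
};

/* Attach a stream to an LFS file. Returns 0 on success, -1 on failure. */
int lfs_open_stream(struct lfs_stream *s, int fd)
{
    const void *a;
    int len = lfsmemmap(fd, &a);

    if (len < 0)
        return -1;
    s->base = a;
    s->length = len;
    s->pos = 0;
    return 0;
}

/* Sequential read of one character, EOF at the end of the image. */
int lfs_getc(struct lfs_stream *s)
{
    return s->pos < s->length ? s->base[s->pos++] : EOF;
}
```

The stream keeps only a pointer, a length and a position, which is possible because files are contiguous and reads are sequential.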

Compilation server performance measurements

The performance of the compilation server remote procedure call mechanism is shown in graph 2 and is compared to the equivalent NFS data transfer rate. These results were obtained from a 66 MHz 80486 with 16 MB RAM and an ne2000 Ethernet card. The experiments used an infinite source and an infinite sink to stress the compilation server RPC protocol. The graph shows the performance of this configuration over a varied data block size. The same experiment was conducted using NFS, with dd employed to transfer different block sizes across the network.

It can be seen that Linux NFS does not achieve more than 118 KBytes/s for large block sizes (64k), whereas the same hardware is capable of a fivefold performance increase when using the specialist software with a block size of 64k.
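Taking the figures above at face value, five times 118 KBytes/s is roughly 590 KBytes/s for the specialist protocol. The helper functions below are a small sketch of the arithmetic involved in turning a block count and an elapsed time into a transfer rate. The function names are hypothetical and 1 KByte is assumed to be 1024 bytes, which the paper does not state.

```c
/* Throughput in KBytes/s for `blocks` transfers of `block_size` bytes
   that took `seconds` in total. ASSUMPTION: 1 KByte = 1024 bytes. */
double kbytes_per_second(unsigned long blocks, unsigned long block_size,
                         double seconds)
{
    if (seconds <= 0.0)
        return 0.0;
    return ((double)blocks * (double)block_size / 1024.0) / seconds;
}

/* Seconds needed to move `kbytes` at a sustained `rate` in KBytes/s. */
double transfer_seconds(double kbytes, double rate)
{
    return rate > 0.0 ? kbytes / rate : 0.0;
}
```

For example, moving 590 KBytes at the NFS rate of 118 KBytes/s takes five seconds, whereas at about 590 KBytes/s it would take about one second.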

Graph 2: Compilation server protocol performance vs NFS

Elves and Shoemaker approach to a compilation server

The authors are currently implementing a history mechanism in the compilation server daemon. This keeps a list of compilation requests together with each user's environment. Later, when the compilation server is idle and an environment variable is set, the file is recompiled with full optimization and the object file is placed into the user's local directory. This is an attempt to harness idle resources and counter the long compilation time that high levels of optimization require ¹³,⁴.
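A history mechanism of this kind might be sketched as a bounded list of requests that is scanned whenever the server becomes idle. The following is an illustration only, since the mechanism was still being implemented at the time of writing. The structure layout, the function names and the example variable name CS_OPTIMIZE used below are all hypothetical; the paper does not name the environment variable.

```c
#include <stddef.h>
#include <string.h>

#define HISTORY_MAX 32

struct history_entry {
    const char  *module;   /* source module that was compiled */
    char *const *envp;     /* user's environment at request time */
    int          pending;  /* 1 until the optimized recompile is issued */
};

struct history {
    struct history_entry entry[HISTORY_MAX];
    size_t count;
};

/* Record a request. Pointers are stored, not copied; a real daemon
   would need its own copies. Returns 0, or -1 if the history is full. */
int history_add(struct history *h, const char *module, char *const envp[])
{
    if (h->count >= HISTORY_MAX)
        return -1;
    h->entry[h->count].module = module;
    h->entry[h->count].envp = envp;
    h->entry[h->count].pending = 1;
    h->count++;
    return 0;
}

/* Return 1 if the environment contains an entry of the form name=value. */
static int env_is_set(char *const envp[], const char *name)
{
    size_t n = strlen(name);

    for (size_t i = 0; envp != NULL && envp[i] != NULL; i++)
        if (strncmp(envp[i], name, n) == 0 && envp[i][n] == '=')
            return 1;
    return 0;
}

/* When the server is idle, return the oldest pending request whose
   environment sets `optvar` and mark it as issued; otherwise NULL. */
struct history_entry *history_next_idle(struct history *h, int server_idle,
                                        const char *optvar)
{
    if (!server_idle)
        return NULL;
    for (size_t i = 0; i < h->count; i++) {
        struct history_entry *e = &h->entry[i];
        if (e->pending && env_is_set(e->envp, optvar)) {
            e->pending = 0;
            return e;
        }
    }
    return NULL;
}
```

A caller would invoke history_next_idle(&h, idle, "CS_OPTIMIZE") from the daemon's idle loop and recompile the returned module with full optimization, so that users who have not set the variable are never affected.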

Conclusion and Future work

The compilation server has been built at Glamorgan. Preliminary performance results are promising and confirm that the server will outperform the traditional compiler configuration.

The ua is tiny compared to the compiler, assembler and microkernel. In the future the ua will be ported to different operating system platforms. One of the obvious platforms is the JVM ⁸. Another important advantage of this approach is that, after minor modifications to the ua, users can undertake software development remotely via a suitable network connection to the Glamorgan csd. Previously users wishing to develop software remotely were required either to log in remotely to a Glamorgan workstation and transfer source files by hand, or to install a complete compiler, libraries and development tools. The authors believe the compilation server solution to be more attractive.

References

1.		F. Armand, M. Gien, F. Herrmann, and M. Rozier, “Revolution 89 or "Distributing UNIX Brings it Back to its Original Virtues",” Proceedings of Workshop on Experiences with Building Distributed (and Multiprocessor) Systems, Ft. Lauderdale, Fl USA (5-6 October, 1989).
2.		S. Benford, E. Burke, E. Foxley, N. Gutteridge, and A.M. Zin, The Ceilidh System: A General Overview, Learning Technology Research, Computer Science Department, Nottingham University (1994).
3.		M.A. Cooper, Overhauling Rdist for the ’90s, Long Beach, CA (October 19-23, 1992).
4.		C. Fraser and D. Hanson, A Retargetable C Compiler: Design and Implementation, Benjamin/Cummings (1995).
5.		B.W. Kernighan and D.M. Ritchie, The C Programming Language 2nd Edition (ANSI C), Prentice-Hall (1988).
6.		B. Koehler and R. Horspool, “CCC: A Caching Compiler for C,” Software Practice and Experience 27(2), pp. 155-165 (February 1997).
7.		S.F. Lewis, “Developing a Modula 2 course for Ceilidh,” CTI Computing 5th Annual Conference on Teaching of Computing, pp. 126-128, Dublin (1997).
8.		T. Lindholm and F. Yellin, The Java Virtual Machine Specification, Addison Wesley Publishing Company (1997).
9.		Tamiya Onodera, “Reducing Compilation Time by a Compilation Server,” Software Practice and Experience 23(5), pp. 477-485 (May 1993).
10.		R. Pike, D. Presotto, K. Thompson, and H. Trickey, “Plan 9 from Bell Labs,” Proceedings of the Summer 1990 UKUUG Conference, pp. 1-9, London (July 1990).
11.		D. Presotto, R. Pike, K. Thompson, and H. Trickey, “Plan 9, A Distributed System,” Proceedings of the Spring 1991 EurOpen Conference, pp. 1-9, Tromsø (July 1990).
12.		A.S. Tanenbaum, Modern Operating Systems, Prentice-Hall (1992).
13.		M. Wolfe, “Fast code vs. fast compile,” Public communication in comp.compilers moderated newsgroup (22 January 1997). Oregon Graduate Institute (OGI), Portland, Oregon.
