My Operating System Design.

I’ve been getting back into developing my operating system lately, spurred on by the tutorials I’ve been failing to write (if anybody does want them, email or leave a comment, and I’ll try and get around to them faster). I’ve got to the stage (or at least, I think I have) where I need to make a few of the important design decisions. So in this post, I’m going to discuss them.

The operating system release I made back in January doesn’t really represent the current state of what I’ve done. Firstly, I’ve set up Bochs on my PC (and QEMU on my iBook at school), so that I can test my system without needing extra hardware on hand. I’m currently in the middle of writing drivers for video and keyboard, so that you’ll be able to use the system without a serial cable and an extra PC. But that isn’t really the design of my operating system, more just what I’ve done so far.

Over the time I’ve been writing this OS, I’ve been pulled in two major directions. The first, and more standard, direction is to create a modular microkernel, stick a whole heap of drivers on top of it, add a GUI, and call it Windows (or not). The second direction, and the one I think I am finally leaning towards, is to create a distributed operating system. There have been a few of these in the past, the most famous being Plan 9 (and yet 99.9% of computer nerds have still never heard of it).

A distributed operating system is basically one in which each computer acts as part of a larger system and shares resources with the other computers. Resources could be processing time (one computer’s threads might run on another computer), hard disk space, network time servers, or anything, really.

What’s great about this is that, assuming the user’s home computer is connected to the network, a user can sit down at any computer on the network, log in, and have exactly the same interface and files as if they were at home. They could even start a process running, log off, let the other computers on the network process it, and log back in again to get the results. Of course, this becomes a problem if nobody leaves their computer idling to run other people’s tasks.

Not all computers are created equal. Some might be 32-bit, some might be 64-bit. Some might be a Celeron 366 running in my bedroom, some might be multiprocessor servers in a data centre. Assuming they can all run the same native software is probably not a great idea. For that reason, I’m going to implement a scripting language, and every process will be interpreted by whichever peer in the system runs it. No native code, except for the kernel, will be running.
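
To make the "interpreted everywhere" idea a bit more concrete, here is a rough sketch in Python (not the eventual scripting language, which doesn't exist yet). The `evaluate` function, the `BUILTINS` table, and the `let` form are all made up for illustration. The point is only that a program expressed as plain nested lists gives the same result on any peer that has the interpreter, whatever CPU sits underneath.

```python
import operator

# Hypothetical built-in operations; the real language's set is undecided.
BUILTINS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(expr, env=None):
    """Evaluate a tiny Lisp-like expression given as nested Python lists."""
    env = env if env is not None else {}
    if isinstance(expr, (int, float)):
        return expr                      # numbers evaluate to themselves
    if isinstance(expr, str):
        return env[expr]                 # symbols are looked up in the environment
    head, *args = expr
    if head == "let":
        # (let name value body): bind name to value while evaluating body
        name, value, body = args
        new_env = dict(env)
        new_env[name] = evaluate(value, env)
        return evaluate(body, new_env)
    func = BUILTINS[head]                # otherwise, apply a built-in operation
    values = [evaluate(arg, env) for arg in args]
    return func(*values)

# The same program could be handed to a 32-bit or a 64-bit peer unchanged.
program = ["let", "x", 6, ["*", "x", ["+", "x", 1]]]
result = evaluate(program)
```

Since the program is just data until a peer evaluates it, nothing in it depends on the word size or instruction set of the machine that ends up running it.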

This scripting language is going to be something along the lines of Lisp. This is one of the reasons I’ve been trying (and failing) to write a Lisp interpreter. I’m choosing Lisp because if I can implement both code and data using the same object model, it will be simpler to transfer code and data between peers: I’ll only have to write one transfer mechanism. I also happen to like the idea of Lisp a lot, despite not having created anything major in it.
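
Here is a minimal sketch of what "one transfer mechanism" could look like, again in Python and with hypothetical names (`serialise`, `deserialise`, and the sample `code` and `data` values are all invented for this post). It assumes both code and data are nested lists of symbols and integers, and it writes them out as s-expression text. It ignores strings with spaces, quoting, and everything else a real wire format would need.

```python
def serialise(obj):
    """Turn a nested list or atom into s-expression text."""
    if isinstance(obj, list):
        return "(" + " ".join(serialise(item) for item in obj) + ")"
    return str(obj)

def tokenise(text):
    """Split s-expression text into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Consume tokens from the front of the list and build one object."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)                    # discard the closing ")"
        return items
    try:
        return int(token)                # integers come back as integers
    except ValueError:
        return token                     # anything else is treated as a symbol

def deserialise(text):
    return parse(tokenise(text))

# A piece of "code" and a piece of "data" take exactly the same path.
code = ["define", "square", ["lambda", ["x"], ["*", "x", "x"]]]
data = ["file", "notes.txt", 1024]

wire_code = serialise(code)
wire_data = serialise(data)
```

Both values round-trip through the same two functions, which is the property I’m after: the receiving peer doesn’t need to know in advance whether it is getting a program or a file listing.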

I’m not going to be implementing support for a lot of old hardware. I’m not going to bother writing floppy drivers, serial port drivers, or other things like this. The console, the hard drive, and the network are the most important peripheral devices, and the ones I will be concentrating on most.

There are quite a few problems that will need to be ironed out. What happens to secure data? Where does the data go when a node goes offline? How can we check the security of a node? I think that by building a few checks into the client software, it’s possible to solve most of these problems.

While I realise my dream is a long way off, I hope I can make a move towards such a system being a reality. While I’m away at Kakadu I hope to have a bit of time to think more closely about some of the protocols involved. Now, back to work for me!