Upstart in Universe
upstart is a replacement for the init daemon, the process spawned by the kernel that is responsible for starting, supervising and stopping all other processes on the system.
The existing daemon is based on the one found in UNIX System V, and is thus known as sysvinit. It separates jobs into different “run levels” and can either run a job when particular run levels are entered (e.g. /etc/init.d/rc 2) or continually during a particular run level (e.g. /sbin/getty).
The /etc/init.d/rc script is also based on the System V one (and is in the sysv-rc package); it simply executes the stop and then start scripts found in /etc/rcN.d (where N is the run level) in numerical order.
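As a sketch of that ordering, here is roughly what /etc/init.d/rc does for a run level. The directory is a throwaway mock rather than the real /etc/rc2.d, and the script names are invented for illustration:

```shell
# Mimic /etc/init.d/rc: run the K* (stop) scripts, then the S* (start)
# scripts, in numerical order.  Build a mock run-level directory first.
dir=$(mktemp -d)
for name in S10sysklogd S20ssh K20apache; do
    # Each mock script just reports its name and the action it was given.
    printf '#!/bin/sh\necho %s $1\n' "$name" > "$dir/$name"
    chmod +x "$dir/$name"
done

# Shell glob expansion sorts the names, which for two-digit prefixes
# gives the numerical order sysv-rc relies on.
stop_out=$(for script in "$dir"/K*; do "$script" stop; done)
start_out=$(for script in "$dir"/S*; do "$script" start; done)
echo "$stop_out"
echo "$start_out"
rm -rf "$dir"
```

The fixed ordering is entirely encoded in those two-digit prefixes, which is exactly the rigidity the rest of this article argues against.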
Why change it?
Running a fixed set of scripts, one after the other, in a particular order has served us reasonably well until now. However, as Linux has become better and better at dealing with modern computing (arguably Linux’s removable device support is now better than Windows’) this approach has begun to show problems.
The old approach works as long as you can guarantee when in the boot sequence things are available, so you can place your init script after that point and know that it will work. Typical ordering requirements are:
Hard drive devices must have been discovered, initialised and partitions detected before we try and mount from /etc/fstab.
Network devices must have been discovered and initialised before we try and start networking.
This worked ten years ago, so why doesn’t it work now? The simple answer is that our computers have become far more flexible:
Drives can be plugged in and removed at any point, e.g. USB drives.
Storage buses allow more than a fixed number of drives, so they must be scanned for; this operation frequently does not block.
To reduce power consumption, the drive may not actually be spun up until the bus scan, so it will take even longer to appear.
Network devices can be plugged in and removed at any point.
Firmware may need to be loaded after the device has been detected, but before it is usable by the system.
Mounting a partition in /etc/fstab may require tools in /usr, which may itself be a network filesystem that cannot be mounted until after networking has been brought up.
We’ve been able to hack the existing system to make much of this possible, however the result is chock-full of race conditions and bugs. It was time to design a new system that can cope with all of these things without any problems.
What we needed was an init system that could dynamically order the start up sequence based on the configuration and hardware found as it went along.
Design of upstart
upstart is an event-based init daemon; events generated by the system cause jobs to be started and running jobs to be stopped. Events can include things such as:
the system has started,
the root filesystem is now writable,
a block device has been added to the system,
a filesystem has been mounted,
at a certain time or repeated time period,
another job has begun running or has finished,
a file on the disk has been modified,
there are files in a queue directory,
a network device has been detected,
the default route has been added or removed.
In fact, any process on the system may send events to the init daemon over its control socket (subject to security restrictions, of course) so there is no limit.
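As a sketch of how a job might hook into these events, a definition in /etc/event.d could look something like the following. The stanza syntax, job name and event name here are illustrative guesses at the format being described, not documented final syntax:

```
# /etc/event.d/media-check -- hypothetical job definition
# start this job whenever a new block device appears
on block-device-added

exec /sbin/media-check
```

The point is that the trigger is an event delivered to init, not a position in a fixed script ordering.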
Each job has a life-cycle which is shown in the graph below:
The two states shown in red (“waiting” and “running”) are rest states; normally we expect the job to remain in these states until an event comes in, at which point we take action to move the job into the next state.
The other states are temporary states; these allow a job to run a shell script to prepare for the job itself to be run (“starting”) and to clean up afterwards (“stopping”). Services that should be respawned if they terminate before an event that stops them is received may run a shell script before the process is started again (“respawning”).
Jobs leave a state when the process associated with them terminates (or is killed) and move to the next appropriate state, following the green arrow if the job is to be started or the red arrow if it is to be stopped. When a script returns a non-zero exit status, or is killed, the job will always be stopped. When the main process terminates and the job should not be respawned, the job will also always be stopped.
As already covered, events generated by the init daemon or received from other processes cause jobs to be started or stopped; manual requests to start or stop a job may also be received.
The communication between the init daemon and other processes is bi-directional, so the status of jobs may be queried, and even notifications of state changes for all jobs received.
How does it differ from launchd?
launchd is the replacement init system used in Mac OS X, developed as an “Open Source” project by Apple. For much of its life so far the licence has actually been entirely non-free, so it has only recently become interesting, with the licence change.
Much of the goal of both systems initially appears to be the same; they both start jobs based on system events. However, launchd severely limits the events to only the following:
file modified or placed in queue directory,
particular time (cron replacement),
connection on a particular port (inetd replacement).
Therefore it does not actually allow us to directly solve the problems we currently have; we couldn’t mount filesystems once the “filesystem checked” event has been received, we couldn’t check filesystems when the block device is added, and we certainly couldn’t start daemons once the complete filesystem (as described by /etc/fstab) is available and writable.
The launchd model expects the job to “sit and wait” if it is unable to start, rather than provide a mechanism for the job to only be started when it doesn’t need to wait. Jobs that need /usr to be mounted would need to spin in a loop waiting for /usr to be available before continuing (or use a file in a tmpfs to indicate it’s available, and use that modification as the event).
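The busy-wait workaround described above can be sketched in shell. The function name, daemon path and polling interval here are invented for illustration:

```shell
# Poll until the given directory is a mountpoint, giving up after
# roughly $2 seconds.  This is the "sit and wait" pattern a launchd
# job needing /usr would be forced into.
wait_for_mount() {
    dir=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        if mountpoint -q "$dir"; then
            return 0
        fi
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Usage sketch: spin until /usr appears, then start the real daemon.
# wait_for_mount /usr 60 && exec /usr/sbin/mydaemon
```

An event-based init avoids this polling entirely: the job simply isn’t started until the event saying the mount exists has been received.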
This is not especially surprising given that Apple have a high degree of control over both their hardware and the actual underlying operating system; they don’t need to deal with the wide array of different configurations that we have in the Linux world.
Had the licence been sufficiently free at the point we began development of our own system, we would probably have extended launchd rather than implement our own. At the point Apple changed the licence, our own system was already more suitable for our purposes.
How does it differ from initng?
Initng by Jimmy Wennlund is another replacement init daemon intended to replace the sysvinit system used by Linux. It is a dependency-based system, whereas upstart is an event-based system.
The notion of a dependency-based system is interesting to talk about at this point. Jobs declare dependencies on other jobs that need to happen before the job itself can be started. Starting the job causes its dependencies to be started first, and their dependencies, and so on. When jobs are stopped, if running jobs have no dependencies, they themselves can be stopped.
It’s a neat solution to the problem of ordering a fixed boot sequence and the problem of keeping the number of running processes to the minimum needed.
However this means that you need to have goals in mind when you boot the system: you need to have decided that you want gdm to be started in order for it, and its dependencies, to be started. Initng uses run levels to ensure this happens, where a run level is a list of goal jobs that should be running in that run level.
It’s also not clear how the dependencies interact with the different types of job: a dependency on Apache would need the daemon to be running, whereas a dependency on “checkroot” would need the script to have finished running. Upstart handles this by using different events (“apache running” vs. “checkroot stopping”).
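To illustrate the distinction, the two kinds of dependency could each be expressed as a different event. The job names, paths and stanza syntax here are hypothetical sketches, not the final format:

```
# /etc/event.d/apache-stats -- hypothetical: needs the daemon running
on apache running
exec /usr/sbin/apache-stats

# /etc/event.d/mountall -- hypothetical: needs the script finished
on checkroot stopping
exec mount -a
```

Because the event names themselves say which state of the other job matters, no extra per-job-type dependency rules are needed.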
Again, while interesting, Initng does not solve the problems we wanted to solve. It can reorder a fixed set of jobs, but cannot dynamically determine the set of jobs needed for a particular boot.
A typical example: if the only dependency on the job that configures networking is the “mount network filesystems” job, then should that job fail or not be a goal (e.g. because there are no network filesystems to be mounted), the network devices themselves will not be configured. You could make everything a goal and just use the dependencies to determine the order, however this is less efficient than just ordering the existing sysv-rc scripts (which can be done at install time).
Another example is that you often simply don’t know whether something is a dependency or not without reading other configuration; the “mount network filesystems” job may be a dependency of everything under /usr, or may just be a dependency of anything allowing the user to log in, if it only mounts /home.
The difference in model can be summed up as “initng starts with a list of goals and works out how to get there, upstart starts with nothing and finds out where it gets to.”
How does it differ from Solaris SMF?
SMF is another approach to replacing init developed by Sun for the Solaris operating system. Like initng it’s a dependency-based system, so see above for the differences between those systems and upstart.
SMF’s main focus is service management: making sure that once services are running they stay running, and allowing the system administrator to query and modify the states of jobs on the system.
Upstart provides the same functionality in this regard: services are respawned when they fail, and system administrators can at any time query the state of running services and adjust it to their liking.
Will it replace cron, inetd, etc?
The goal of upstart is to replace those daemons, so that there is only one place (/etc/event.d) where system administrators need to configure when and how jobs should be run.
In fact, the goal is that upstart should also replace the “run event scripts” functionality of any daemon on the system. Daemons such as acpid, apmd and Network Manager would send events to init instead of running scripts themselves with their own peculiar configuration and semantics.
A system administrator who only wanted a particular daemon to be run while the computer was on AC power would simply need to edit /etc/event.d/daemon and change “on startup” to “on ac power”.
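As a sketch, such a job file might look like the following. The job name, daemon path and surrounding stanzas are hypothetical; only the “on startup” to “on ac power” change comes from the description above:

```
# /etc/event.d/daemon -- hypothetical job definition
# was:  on startup
on ac power

respawn
exec /usr/sbin/daemon
```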
What about compatibility?
There are a lot of system administrators out there who have already learned how Linux works and will not want to learn it again immediately; there are also a large number of books that cover the existing software and won’t cover upstart for at least a couple of years.
For this reason, compatibility is very important. upstart will continue to run the existing scripts for the foreseeable future so that packages will not need to be updated until the author wants.
Compatibility command-line tools that behave like their existing equivalents will also be implemented; a system administrator would never need to know that crontab -e is actually changing upstart jobs.
Does it use D-BUS?
“To D-BUS people, every problem seems like a D-BUS problem.”
The UNIX philosophy is that something should do just one job, and do it very well. upstart’s one job is starting, supervising and stopping other jobs; D-BUS’s one job is passing messages between other jobs.
D-BUS does provide a mechanism for services to be activated when the first message is sent to them, thereby starting other jobs. Some people have taken this idea and extended it to suggest that all a replacement init system need do is register jobs with D-BUS and turn booting into a simple matter of message passing.
This seems wrong to me; D-BUS would need to be extended to supervise these services, provide means for them to be restarted and stopped, as well as deal with being process #1, which means cleaning up after children whose parents have died, etc. It seems far simpler to arrange for D-BUS to send an event to init when it needs a service to be started, and to focus on being a very good message-passing system.
The IPC mechanism used by upstart is not currently D-BUS because of various problems, however it’s always been expected that even if init itself doesn’t communicate with D-BUS directly, there would be a D-BUS proxy that would ensure messages about all init jobs and events would be given to D-BUS and D-BUS clients could send messages to init to query and change the state of jobs.
What is the implementation plan?
Because this is process #1 we are changing, we want to make sure that we get it right. Therefore instead of releasing a fully-featured daemon and configuration to the world, we’re developing it in the following stages:
Principal development; at the end of this stage the daemon has been implemented and can manage jobs as described.
Replacement of /sbin/init while running the existing sysv-rc scripts. This is the shake-down test of the daemon, can it perform the same job as the existing sysvinit daemon without any regressions?
/etc/rcS.d scripts replaced by upstart jobs. These constitute the majority of tasks for booting the system into at least single-user mode, and contain many of the current ordering problems and race conditions. If the daemon solves the problems here, it will be a success.
Other daemons’ scripts replaced by upstart jobs on a package-by-package basis; this will be an ongoing effort during which upstart will continue running the existing sysv-rc scripts as well as its own jobs. During this time the event system may be tweaked to ensure it truly solves the problems we need.
Replacement of cron, atd, anacron and inetd. This will happen alongside the above and result in a single place to configure system jobs.
Modification of other daemons and processes to send events to init instead of trying to run things themselves.
The current plan is that we will be at least part of the way into stage #3 by the time edgy is released, with that release shipping with upstart as the init daemon and the most critical rcS scripts being run by it to correct the major problems.
For edgy+1 we hope to have completed stage #5 and be at least part of the way into the implementation of stage #6. From the start of development of edgy+2, no new packages will be accepted unless they provide upstart jobs instead of init scripts and init scripts will be considered deprecated.
What state is it in now?
The init daemon has been written and is able to manage jobs as described above, receiving events on the control socket to start and stop them. This has now been uploaded to the Ubuntu universe component in the upstart package for testing before it becomes the init daemon.
We welcome any experienced users who want to help test this; install the package and follow the instructions in /usr/share/doc/upstart/README.Debian to add a boot option that will use upstart instead of init. If your system boots and shuts down normally (other than a slightly more verbose boot without usplash running) then it is working correctly.
Other types of events will be added as required during development and testing. Currently only a basic client tool (initctl) has been written, compatibility tools such as shutdown will be written over the next week or two before it replaces our sysvinit package.