This article describes the internals of launching an instance in OpenStack Nova.
Overview
Launching a new instance involves multiple components inside OpenStack Nova:
- API server: handles requests from the user and relays them to the cloud controller.
- Cloud controller: handles the communication between the compute nodes, the networking controllers, the API server and the scheduler.
- Scheduler: selects a host to run a command.
- Compute worker: manages computing instances: launch/terminate instance, attach/detach volumes…
- Network controller: manages networking resources: allocate fixed IP addresses, configure VLANs…
Note: Nova has more components, such as the authentication manager, the object store and the volume controller, but we are not going to study them here because this article focuses on launching an instance.
The flow of launching an instance goes like this:
- The API server receives a run_instances command from the user.
- The API server relays the message to the cloud controller (1).
- Authentication is performed to make sure this user has the required permissions.
- The cloud controller sends the message to the scheduler (2).
- The scheduler casts the message to a random host and asks it to start a new instance (3).
- The compute worker on the host grabs the message (4).
- The compute worker needs a fixed IP to launch a new instance, so it sends a message to the network controller (5, 6, 7, 8).
- The compute worker continues with spawning a new instance.
We are going to look at each of these steps in detail next.
API
You can use the OpenStack API or EC2 API to launch a new instance. We are going to use the EC2 API. We add a new key pair and we use it to launch an instance of type m1.tiny.
    euca-add-keypair test > test.pem
    euca-run-instances -k test -t m1.tiny ami-tiny
run_instances() in api/ec2/cloud.py is called, which in turn calls the compute API create() in compute/api.py.
    def run_instances(self, context, **kwargs):
        ...
        instances = self.compute_api.create(context,
            instance_type=instance_types.get_by_type(
                kwargs.get('instance_type', None)),
            image_id=kwargs['image_id'],
            ...
Compute API create() does the following:
- Check if the maximum number of instances of this type has been reached.
- Create a security group if it doesn’t exist.
- Generate MAC addresses and hostnames for the new instances.
- Send a message to the scheduler to run the instances.
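The four steps above can be sketched in a few lines of Python. This is only an illustration, not Nova's code: the function names, the per-type limit, the MAC prefix and the hostname format are all assumptions, and the "message to the scheduler" is just a dictionary appended to a list.

```python
import random

# Hypothetical limit; the real quota logic in Nova is more involved.
MAX_INSTANCES_PER_TYPE = 2


def generate_mac():
    """Return a random MAC address under the 02:16:3e prefix (arbitrary choice)."""
    tail = [random.randint(0x00, 0xff) for _ in range(3)]
    return "02:16:3e:" + ":".join("%02x" % b for b in tail)


def create(instance_type, count, running, security_groups, sent_messages):
    """Sketch of the four steps: quota check, security group, naming, cast."""
    # 1. Refuse the request if it would exceed the per-type limit.
    if running.get(instance_type, 0) + count > MAX_INSTANCES_PER_TYPE:
        raise RuntimeError("quota exceeded for %s" % instance_type)

    # 2. Make sure the default security group exists.
    security_groups.add("default")

    instances = []
    for _ in range(count):
        instance_id = len(sent_messages) + 1
        # 3. Give each instance a MAC address and a hostname.
        instance = {"id": instance_id,
                    "mac_address": generate_mac(),
                    "hostname": "i-%08x" % instance_id}
        instances.append(instance)
        # 4. Hand the instance over to the scheduler (here: append to a list).
        sent_messages.append({"method": "run_instance",
                              "args": {"instance_id": instance_id}})
    running[instance_type] = running.get(instance_type, 0) + count
    return instances
```

Calling create("m1.tiny", 2, {}, set(), []) returns two instance dictionaries and leaves two run_instance messages in the list, one per instance.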
Cast
Let’s pause for a minute and look at how the message is sent to the scheduler. This type of message delivery in OpenStack is called RPC casting. RabbitMQ is used here for delivery. The publisher (API) sends the message to a topic exchange (scheduler topic). A consumer (scheduler worker) retrieves the message from the queue. No response is expected because it is a cast and not a call. We will look at calls later.
Here is the code casting that message:
    LOG.debug(_("Casting to scheduler for %(pid)s/%(uid)s's"
                " instance %(instance_id)s") % locals())
    ...
        {"method": "run_instance",
         "args": {"topic": FLAGS.compute_topic,
                  "instance_id": instance_id,
                  "availability_zone": availability_zone}})
You can see that the scheduler topic is used and that the message arguments indicate what we want the scheduler to use for its delivery. In this case, we want the scheduler to send the message using the compute topic.
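To make the fire-and-forget nature of a cast concrete, here is a minimal in-memory sketch. It does not use RabbitMQ, and the Broker class with its cast() and consume() methods is a hypothetical stand-in invented for this example; only the message layout follows the code above.

```python
from collections import defaultdict, deque


class Broker:
    """Toy message broker: one FIFO queue per topic (not RabbitMQ)."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def cast(self, topic, message):
        """Publish and return immediately; a cast has no reply."""
        self.queues[topic].append(message)

    def consume(self, topic):
        """Pop the oldest message for a topic, or None if the queue is empty."""
        pending = self.queues[topic]
        return pending.popleft() if pending else None


broker = Broker()

# The API publishes on the scheduler topic. The nested "topic" argument tells
# the scheduler which topic to use when it forwards the message.
result = broker.cast("scheduler", {"method": "run_instance",
                                   "args": {"topic": "compute",
                                            "instance_id": 42}})

# A scheduler worker picks the message up later.
message = broker.consume("scheduler")
forward_topic = message["args"]["topic"]
```

The publisher gets nothing back from cast(): whatever the scheduler does with the message, the API has already moved on.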
Scheduler
The scheduler receives the message and sends the run_instance message to a random host. The chance scheduler is used here. Other scheduler types exist, such as the zone scheduler (picks a random host which is up in a specific availability zone) or the simple scheduler (picks the least loaded host). Now that a host has been selected, the following code is executed to send the message to a compute worker on the host.
    ...
        db.queue_get_for(context, topic, host),
    ...
    LOG.debug(_("Casting to %(topic)s %(host)s for %(method)s") % locals())
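The three scheduling strategies mentioned above differ only in how they pick a host. The sketch below illustrates that difference under simplifying assumptions: the host records, their field names and the function names are invented for this example, and queue_for() simply assumes that the per-host queue is named after the topic.host pattern.

```python
import random

# Hypothetical host records; the field names are assumptions for this sketch.
HOSTS = [
    {"name": "host1", "up": True, "zone": "az1", "running": 5},
    {"name": "host2", "up": True, "zone": "az2", "running": 1},
    {"name": "host3", "up": False, "zone": "az2", "running": 0},
]


def chance_schedule(hosts):
    """Pick any host that is up, at random."""
    return random.choice([h for h in hosts if h["up"]])["name"]


def zone_schedule(hosts, zone):
    """Pick a random host that is up in the requested availability zone."""
    candidates = [h for h in hosts if h["up"] and h["zone"] == zone]
    return random.choice(candidates)["name"]


def simple_schedule(hosts):
    """Pick the host that is up and runs the fewest instances."""
    candidates = (h for h in hosts if h["up"])
    return min(candidates, key=lambda h: h["running"])["name"]


def queue_for(topic, host):
    """Build the per-host queue name, assuming a topic.host pattern."""
    return "%s.%s" % (topic, host)
```

Once a host is chosen, the scheduler casts to the queue of that host only, for example queue_for("compute", simple_schedule(HOSTS)).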
Compute
The Compute worker receives the message and the following method in compute/manager.py is called:
    def run_instance(self, context, instance_id, **_kwargs):
        """Launch a new instance with specified options."""
run_instance() does the following:
- Check if the instance is already running.
- Allocate a fixed IP address.
- Set up a VLAN and a bridge if they are not already set up.
- Spawn the instance using the virtualization driver.
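The steps above can be sketched as follows. This is a simplified illustration, not the code in compute/manager.py: FakeDriver, the dictionary-based instance and network records, and the bridge_ready flag are all assumptions made for the example.

```python
class FakeDriver:
    """Hypothetical virtualization driver that only records what it spawned."""

    def __init__(self):
        self.spawned = []

    def spawn(self, instance):
        self.spawned.append(instance["id"])


def run_instance(instance, network, driver):
    """Sketch of the four steps performed by the compute worker."""
    # 1. Refuse to start an instance that is already running.
    if instance.get("state") == "running":
        raise RuntimeError("instance %s is already running" % instance["id"])

    # 2. Take the next free fixed IP from the pool.
    instance["fixed_ip"] = network["free_ips"].pop(0)

    # 3. Set up the VLAN and bridge only if that has not been done yet.
    if not network.get("bridge_ready"):
        network["bridge_ready"] = True

    # 4. Ask the driver to boot the instance.
    driver.spawn(instance)
    instance["state"] = "running"
    return instance
```

Running it twice on the same instance raises an error on the second attempt, which mirrors the "already running" check in the first step.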
Call to network controller
An RPC call is used to allocate a fixed IP. An RPC call is different from an RPC cast because it uses a topic.host exchange, meaning that a specific host is targeted. A response is also expected.
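The difference with a cast can be shown with a small sketch built on Python's standard queue and threading modules instead of RabbitMQ. The call() helper, the network_worker() function, the allocate_fixed_ip method name and the IP pool are assumptions for this example; the point is only that the caller targets a topic.host queue and blocks until a reply comes back.

```python
import queue
import threading


def network_worker(requests):
    """Hypothetical network controller: answer one request, then exit."""
    free_ips = ["10.0.0.3", "10.0.0.4"]
    message, reply_queue = requests.get()
    if message["method"] == "allocate_fixed_ip":
        reply_queue.put(free_ips.pop(0))


def call(queues, topic, host, message, timeout=5):
    """Send to the topic.host queue and block until the reply arrives."""
    reply_queue = queue.Queue()
    queues["%s.%s" % (topic, host)].put((message, reply_queue))
    return reply_queue.get(timeout=timeout)


# One queue per topic.host pair, so a specific host is targeted.
queues = {"network.host1": queue.Queue()}
worker = threading.Thread(target=network_worker,
                          args=(queues["network.host1"],))
worker.start()

fixed_ip = call(queues, "network", "host1",
                {"method": "allocate_fixed_ip", "args": {"instance_id": 42}})
worker.join()
```

Unlike a cast, call() returns a value: the compute worker cannot continue spawning the instance until it has the fixed IP in hand.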