IBM OpenStack

Latest recommended article published 2024-09-23 18:35:18

zeng_84_long

Article tags: ibm service token properties list instantiation

https://www.ibm.com/developerworks/mydeveloperworks/blogs/e93514d3-c4f0-4aa0-8844-497f370090f5/entry/quantum_folsom12?lang=en

OpenStack Keystone Workflow & Token Scoping

Boden 100000JYGN | Tuesday 9:05 PM | Tags: keystone openstack | Comments (0) | Visits (289)

While recently browsing the OpenStack documentation updates for the Folsom release, I came across a new (new to me anyway) Keystone diagram which provides a well-deserved depiction of a typical end-user workflow using Keystone as an identity service provider. This diagram not only provides greater insight into this typical workflow, but it also illustrates the notion of scoped vs unscoped tokens. I've pasted the diagram below for convenience, but the original document can be found on the OpenStack documentation site.

To obtain a scoped token, use the POST /tokens Keystone API as in step 1. There are two forms of this API:

Use the same request body as in step 1 passing in your user id and credentials, but this time specify a tenantName to scope the token.
Use a request body which contains both your unscoped token and the tenant name to scope the token. This latter form allows you to obtain a scoped token without POSTing your credentials again.

Both forms of this API are shown below.

Scope a token by resending credentials

POST http://localhost:5000/v2.0/tokens

{
  "auth":{
    "tenantName":"DefaultTenant",
    "passwordCredentials":{
      "username":"boden",
      "password":"my9password"
    }
  }
}

Scope a token using your existing unscoped token

POST http://localhost:5000/v2.0/tokens

{
  "auth":{
    "tenantName":"DefaultTenant",
    "token":{
      "id":"MIICDgYJKoZIhvcNAQcCoIr8JKYG0ywOaMc2lYqhIQhLApqJpOns="
    }
  }
}
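As a rough illustration of the two request forms above, the following Python sketch builds both request bodies. The helper names are my own (hypothetical), and the token value in the usage line is a placeholder rather than a real token; the sketch only constructs the JSON and does not contact Keystone.

```python
import json

# Endpoint used in the examples in this post.
KEYSTONE_TOKENS_URL = "http://localhost:5000/v2.0/tokens"


def scope_with_credentials(username, password, tenant_name):
    """Form 1: resend the credentials together with a tenantName."""
    return {
        "auth": {
            "tenantName": tenant_name,
            "passwordCredentials": {"username": username, "password": password},
        }
    }


def scope_with_token(unscoped_token_id, tenant_name):
    """Form 2: reuse an existing unscoped token instead of credentials."""
    return {"auth": {"tenantName": tenant_name, "token": {"id": unscoped_token_id}}}


# Placeholder token id, for illustration only.
body = scope_with_token("my-unscoped-token-id", "DefaultTenant")
print("POST", KEYSTONE_TOKENS_URL)
print(json.dumps(body, indent=2))
```

Either body would be sent as the JSON payload of the POST shown above; only the second form avoids transmitting the password again.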

The response will include a new scoped token and associated metadata as shown in the example response below:

{
  "access":{
    "token":{
      "expires":"2012-10-06T18:41:34Z",
      "id":"+dRsTPGAvw4yPrk-F7-eqcycJg",
      "tenant":{
        "description":"Default Tenant",
        "enabled":true,
        "id":"6f8945f2d47f4abea149b7a0176b12a8",
        "name":"DefaultTenant"
      }
    },
    "serviceCatalog":[
      {
        "endpoints_links":[],
        "endpoints":[
          {
            "adminURL":"http://localhost:9292/v1",
            "region":"RegionOne",
            "publicURL":"http://localhost:9292/v1",
            "internalURL":"http://localhost:9292/v1",
            "id":"ef3aa115fa104a33914d1ba05a8a1195"
          }
        ],
        "type":"image",
        "name":"glance"
      },
      {
        "endpoints_links":[],
        "endpoints":[
          {
            "adminURL":"http://localhost:1337",
            "region":"RegionOne",
            "publicURL":"http://localhost:1337",
            "internalURL":"http://localhost:1337",
            "id":"ee855645535e422fb316ed3eb652d94c"
          }
        ],
        "type":"compute",
        "name":"local-node-js"
      },
      {
        "endpoints_links":[],
        "endpoints":[
          {
            "adminURL":"http://localhost:35357/v2.0",
            "region":"RegionOne",
            "publicURL":"http://localhost:5000/v2.0",
            "internalURL":"http://localhost:5000/v2.0",
            "id":"882ad7040d8e46ef9cd036b8f685bb33"
          }
        ],
        "type":"identity",
        "name":"keystone"
      }
    ],
    "user":{
      "username":"boden",
      "roles_links":[],
      "id":"0e982248b8a14e4bb838f956b9b79e7a",
      "roles":[
        {
          "name":"member"
        }
      ],
      "name":"boden"
    },
    "metadata":{
      "is_admin":0,
      "roles":[
        "9eaeea6cc78345738d3d7e12f6a9012e"
      ]
    }
  }
}

Notice the token metadata in the response is now scoped to the DefaultTenant tenant which was specified in the API request. Moreover, notice the response includes an array of service endpoints. These endpoints identify the services your token has access to based on the service/endpoint catalog the Keystone service manages. As a consumer of the API, you now need to look through these service endpoints to determine which service you wish to use.

Let's quickly recap the main points of the endpoint/service catalog maintained by Keystone.

With that in mind, we can see the response above contains service/endpoint entries for image, compute and identity. This indicates that my scoped token can access the APIs of the respective services using the base endpoint URLs given in the response. For more details on setting up services and endpoints in Keystone, see the OpenStack documentation.
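To make the catalog lookup concrete, here is a minimal Python sketch that walks a serviceCatalog array and picks the public URL for a given service type. The function name find_public_url is my own invention, and the sample catalog is trimmed down from the response in this post; a real client would also have to handle multiple matching endpoints.

```python
def find_public_url(access, service_type, region=None):
    """Return the publicURL of the first endpoint matching type (and region)."""
    for service in access.get("serviceCatalog", []):
        if service.get("type") != service_type:
            continue
        for endpoint in service.get("endpoints", []):
            if region is None or endpoint.get("region") == region:
                return endpoint["publicURL"]
    return None  # the token gives no access to such a service


# Trimmed copy of the "access" object from the response in this post.
access = {
    "serviceCatalog": [
        {"type": "image", "name": "glance",
         "endpoints": [{"region": "RegionOne", "publicURL": "http://localhost:9292/v1"}]},
        {"type": "compute", "name": "local-node-js",
         "endpoints": [{"region": "RegionOne", "publicURL": "http://localhost:1337"}]},
    ]
}

print(find_public_url(access, "compute", region="RegionOne"))
```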

IMPORTANT: Remember that you now need to use the scoped token ID in subsequent API calls, passing it in via the X-Auth-Token header.
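A small sketch of what passing the token looks like from Python's standard library. The /servers path is only an assumed example path on the compute endpoint, and the request is built but deliberately not sent.

```python
import urllib.request


def build_request(url, scoped_token_id):
    """Build (but do not send) a request carrying the scoped token."""
    return urllib.request.Request(url, headers={"X-Auth-Token": scoped_token_id})


# Hypothetical path on the compute endpoint from the catalog above.
req = build_request("http://localhost:1337/servers", "+dRsTPGAvw4yPrk-F7-eqcycJg")
print(req.get_method(), req.full_url)
print(req.header_items())
```

Note that urllib normalizes the header name's capitalization when storing it; HTTP header names are case-insensitive, so the service still sees it as X-Auth-Token.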

Step 4: Invoke the target endpoint service API

You now have a scoped token and know the URL of the endpoint API you wish to invoke. The next step is to actually invoke the service endpoint itself for the API(s) you wish to use. The diagram shows the endpoint service in step 4 using Keystone to validate your token. While this is true if you are using UUID based tokens, it is not true if you are using PKI. Let's outline the main differences.

UUID

With UUID based tokens, your token ID will be a UUID -- a unique string used to identify the token you hold. In this scheme the Keystone service you obtained the token from maintains an index of token UUIDs to their respective metadata and validity. As this token ID does not contain embedded metadata, the endpoint service must invoke Keystone (passing along your UUID based token ID) to validate that the given token is authenticated and valid. In response Keystone will return the metadata associated with the token (much like the metadata returned in the response from step 3) including your roles, tenancy, etc. which can then be used internally by the service while processing your API request. The important thing to note here is that the endpoint service is calling into Keystone for each API request it handles to validate your UUID based token.

PKI

The PKI token scheme was introduced in OpenStack Folsom RC1 and removes the need for the endpoint service to call into Keystone for each request. The bullets below outline how PKI works in a nutshell.

PKI (Public Key Infrastructure) is based on a public/private key pair, with the public key distributed in an X.509 certificate.
Keystone holds both the public signing certificate and the matching private key.
Anybody can get the public certificate from Keystone via a REST API call, but the private key is not exposed outside of the Keystone service itself.
When using PKI, each of the endpoint services will ask Keystone (via REST API) for the public certificate the first time the service is invoked. The endpoint service will then cache the certificate for later use.
When Keystone builds your token in PKI mode, it creates the token JSON object containing your token metadata and signs it with the private key using CMS (Cryptographic Message Syntax). The payload is signed rather than encrypted, so the metadata can be read by anyone holding the token. The encoded signed document becomes your token ID, which is returned upon successful authentication (step 3 response) and is then used in your X-Auth-Token header; an MD5 hash of it is used as a compact identifier where the full token would be unwieldy. Take note that in this case your token ID includes the metadata for your token and not just a UUID as in the case of the UUID token scheme.
When an endpoint service is invoked in PKI mode, it will verify the token signature using the certificate it has cached. Note that this is enough for the service to confirm validity since only Keystone has the private key and hence only Keystone could have produced the signature.
Since the signed token includes all the token metadata, the service no longer needs to invoke Keystone to get the token metadata and can validate the token internally.

The important takeaway is that PKI is much more performant than the UUID scheme since the endpoint service does not need to make a REST call into Keystone to validate the token on a per-request basis. As such it can increase service performance and reduce network "chatter".
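The difference in "chatter" can be shown with a toy model that simply counts how often a service calls Keystone. Everything here is hypothetical (ToyKeystone, ToyService and their methods are not Keystone code), and, importantly, the signature is an HMAC with a shared key used purely as a stand-in because the Python standard library has no X.509/CMS support. In real PKI the service holds only the public certificate and could not forge tokens.

```python
import hashlib
import hmac
import json
import uuid


class ToyKeystone:
    """Toy identity service that counts the calls services make to it."""

    def __init__(self):
        self._key = b"toy-signing-key"  # stand-in for Keystone's private key
        self._uuid_tokens = {}          # UUID token id -> metadata
        self.calls = 0

    def issue_uuid_token(self, metadata):
        token_id = uuid.uuid4().hex
        self._uuid_tokens[token_id] = metadata
        return token_id

    def issue_signed_token(self, metadata):
        payload = json.dumps(metadata, sort_keys=True)
        sig = hmac.new(self._key, payload.encode(), hashlib.sha256).hexdigest()
        return payload + "." + sig      # metadata travels inside the token

    def validate_uuid_token(self, token_id):
        self.calls += 1                 # one round trip per API request
        return self._uuid_tokens.get(token_id)

    def fetch_verification_key(self):
        self.calls += 1                 # done once, then cached by the service
        return self._key


class ToyService:
    def __init__(self, keystone):
        self.keystone = keystone
        self._cached_key = None

    def handle_uuid(self, token_id):
        return self.keystone.validate_uuid_token(token_id)

    def handle_signed(self, token):
        if self._cached_key is None:
            self._cached_key = self.keystone.fetch_verification_key()
        payload, _, sig = token.rpartition(".")
        expected = hmac.new(self._cached_key, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            return None                 # tampered or foreign token
        return json.loads(payload)


keystone = ToyKeystone()
service = ToyService(keystone)
meta = {"user": "boden", "tenant": "DefaultTenant", "roles": ["member"]}

uuid_token = keystone.issue_uuid_token(meta)
for _ in range(3):
    service.handle_uuid(uuid_token)
uuid_calls = keystone.calls

keystone.calls = 0
signed_token = keystone.issue_signed_token(meta)
for _ in range(3):
    service.handle_signed(signed_token)
signed_calls = keystone.calls

print("UUID mode calls:", uuid_calls, "signed mode calls:", signed_calls)
```

In this toy run the UUID path costs one Keystone call per request, while the signed path costs a single call to fetch and cache the verification material.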

You can set up the token scheme in your keystone.conf by specifying the token_format as PKI and including the certificate paths Keystone requires to carry out PKI. The snippet below shows an example development configuration which uses the test certificate keys provided by the Keystone tests (used for ad-hoc testing). In a production environment your own certificate keys would be used.

[signing]

#token_format = UUID

certfile = /home/boden/workspaces/openstack/keystone/tests/signing/signing_cert.pem

keyfile = /home/boden/workspaces/openstack/keystone/tests/signing/private_key.pem

ca_certs = /home/boden/workspaces/openstack/keystone/tests/signing/cacert.pem

#key_size = 1024

#valid_days = 3650

#ca_password = None

token_format = PKI

Step 5: Validate role metadata

Step 5 is really about the endpoint service using your token's metadata to verify you can access the requested service/operation. This is often referred to as Role-Based Access Control (RBAC) and involves the service using a 'rule engine' to determine whether your token contains the proper role access based on the service's respective policy.json file. Although RBAC and concepts related to the policy.json file are beyond the scope of this post, it's worth noting that each endpoint service contains its own policy which is enforced on a per-service basis -- that is, the internals of each service validate RBAC. For more details on how RBAC works, there is a good blog post on knowledgestack. You can also check the OpenStack docs which document the contents of the policy.json file for each service type.

If you do not have the proper access to the operation you have requested, the service will deny your request and return an error.
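To give a feel for the role check in this step, here is a deliberately simplified sketch. The TOY_POLICY mapping and the is_allowed helper are hypothetical and do not follow the real policy.json syntax or any service's rule engine; they only show the idea of matching the roles carried in the token metadata against per-action requirements.

```python
# Hypothetical, simplified policy: action name -> roles allowed to perform it.
TOY_POLICY = {
    "compute:create": ["member", "admin"],
    "compute:delete": ["admin"],
}


def is_allowed(action, token_roles, policy=TOY_POLICY):
    """Return True if any role from the token is accepted for the action."""
    allowed_roles = policy.get(action)
    if allowed_roles is None:
        return False  # assumption: unknown actions are denied
    return any(role in allowed_roles for role in token_roles)


# The token in this post carries the "member" role.
print(is_allowed("compute:create", ["member"]))
print(is_allowed("compute:delete", ["member"]))
```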

Step 6: Service API request

In step 6, the endpoint service actually performs the operation you have requested as per the API URI. This might be a 'run instance', 'create volume', etc. depending on the API URI and service you are invoking. In the scheme of this post, there is nothing all that interesting to describe for this step.

Step 7: Return response

As you have probably guessed, step 7 returns an API response to the user. Again nothing interesting to dive into for this step in the context of this post.

FAQ

(Q) Can I work with (use) an unscoped token outside of Keystone when I'm invoking APIs on service endpoints?

(A) It's not recommended -- almost all service implementations require a scoped token and therefore you will likely not get far using unscoped tokens.

(Q) How do non-Keystone endpoint services know the token I'm using is PKI based vs UUID based?

(A) PKI based tokens are substantially longer than UUID based tokens as they contain the signed token metadata, whereas UUID based tokens are a fixed-length string. Therefore the service checks the length of your token to determine which scheme is in use.
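A length check of this kind could look like the sketch below. The threshold of 32 characters (the length of a hex UUID without dashes) and the function name are my assumptions for illustration, not a quote of the middleware code.

```python
import uuid

# Assumed threshold: a UUID rendered as hex without dashes is 32 characters.
UUID_TOKEN_LENGTH = 32


def looks_like_pki(token_id):
    """Guess the token scheme purely from the length of the token id."""
    return len(token_id) > UUID_TOKEN_LENGTH


print(looks_like_pki(uuid.uuid4().hex))  # a UUID-style token id
print(looks_like_pki("M" * 500))         # stand-in for a long PKI token id
```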

(Q) Can a single Keystone service manage multiple OpenStack Cloud deployments (regions)?

(A) Yes and in fact this is somewhat common in order to provide a single identity access point for consumers. However keep in mind that multiple Keystone services can share the same data store which allows you to provide multiple Keystone APIs with a single backing store (for example to support geo based access).


Quantum Folsom

YongShengGong 270005A0T0 | Sep 17 | Tags: quantum openstack | Comments (0) | Visits (939)

OpenStack and Quantum

Quantum consists of many components in the Folsom release:

Quantum-server provides a REST API through which the outside world accesses and manages the virtual network models. Inside quantum-server there is a plugin (also known as the backend) that operates on the models in the database. Quantum-server should be deployed on the controller node, and an HA front end can be placed in front of multiple quantum-server processes to handle a large cloud system.
Currently, l3-agent communicates with quantum-server through the REST API via the quantum client library. L3-agent is used to manage floating IPs and routers. One L3 router can manage only one external network.
The plugin agent runs alongside nova-compute and maps the virtual networks onto the actual network environment to ensure the connectivity of VMs.
The DHCP agent runs per subnet.

Categories of networks

Quantum divides networks into physical networks and virtual networks. The network object in the Quantum model represents a virtual network. Each physical network is identified by a name in the plugin's configuration file. A physical network is a network that actually exists in the infrastructure; it may be a VLAN with a given VLAN ID, or a flat network. Virtual networks are realized on a given kind of physical network via a network binding object. Virtual networks are further divided into tenant networks and provider networks, although most of the time we cannot see much difference between them. If we provide the physical network name, type and segment ID, we are creating a provider network; otherwise we are creating a tenant network. A tenant network's network binding object is selected from a pool defined in the configuration file.

VM gets the IP address with the dhcp client.

OpenStack's Floating IPs

YongShengGong 270005A0T0 | July 7 | Tags: openstack floating | Comments (0) | Visits (1,453)

To see how floating IPs are implemented, we first associate a floating IP to our instance's fixed IP. We assume the fixed IP of our instance is 10.10.10.2.

To create a range of floating ips in default pool:

#nova-manage floating create --ip_range=192.168.1.232/30

To allocate a floating ip from this pool:

# nova floating-ip-create

We got an IP 192.168.1.233. And then assign it to our instance, which has id 8f773639-c04f-4885-9349-ac7d6a799843:

#nova add-floating-ip 8f773639-c04f-4885-9349-ac7d6a799843 192.168.1.233

Floating ip bound to public interface

FLAGS.public_interface is used to bind floating IPs. After we run “nova add-floating-ip” command, we can see we got this floating IP under the public_interface:

# ip addr list dev wlan0

3: wlan0:…

link/ether 08:11:96:75:91:54 brd ff:ff:ff:ff:ff:ff

inet 192.168.1.90/16 brd 192.168.255.255 scope global wlan0

inet 192.168.1.233/32 scope global wlan0

inet6 fe80::a11:96ff:fe75:9154/64 scope link

valid_lft forever preferred_lft forever

Rules in nat table for floating IP

After the instance gets a floating IP, the following rules exist on the nova-network host:

-A nova-network-OUTPUT -d 192.168.1.233/32 -j DNAT --to-destination 10.10.10.2

-A nova-network-PREROUTING -d 192.168.1.233/32 -j DNAT --to-destination 10.10.10.2

-A nova-network-float-snat -s 10.10.10.2/32 -j SNAT --to-source 192.168.1.233

We can see that a DNAT rule is used to translate the floating IP into the instance's fixed IP. So if a packet arrives at the nova-network host with the floating IP as its target IP, the target IP will be translated. And then we have an SNAT rule, which translates traffic from the instance's fixed IP into the floating IP. Since all the traffic from VMs to outside of the fixed network is sent to the gateway set by nova-network's dnsmasq process, with this SNAT rule the traffic going out from VMs is successfully masked as coming from the floating IP. In addition, we have a DNAT rule in the wrapped OUTPUT chain, which allows local processes on the nova-network host to access VMs by floating IP.
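The three rules follow a fixed pattern for any floating/fixed IP pair. The small Python sketch below just renders that pattern as text; the function name is hypothetical, and it formats the rules shown above rather than applying them with iptables.

```python
def floating_ip_nat_rules(floating_ip, fixed_ip):
    """Render the nat-table rules for one floating IP / fixed IP pair."""
    return [
        # Local processes on the nova-network host reaching the VM.
        "-A nova-network-OUTPUT -d %s/32 -j DNAT --to-destination %s"
        % (floating_ip, fixed_ip),
        # Incoming packets addressed to the floating IP.
        "-A nova-network-PREROUTING -d %s/32 -j DNAT --to-destination %s"
        % (floating_ip, fixed_ip),
        # Outgoing traffic from the VM is rewritten to the floating IP.
        "-A nova-network-float-snat -s %s/32 -j SNAT --to-source %s"
        % (fixed_ip, floating_ip),
    ]


for rule in floating_ip_nat_rules("192.168.1.233", "10.10.10.2"):
    print(rule)
```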

Ping VMs with floating IP

To allow pinging VMs by floating IP, we need some more rules. Remember that on the nova-compute host we have a specific chain for each instance, and the rules in it only allow traffic from IPs in the fixed subnet. If we ping the floating IP, the traffic will be dropped by those rules since the source IP of the ping packet is not within the fixed subnet. Obviously, we need to add a rule to allow ICMP traffic.

To add rule allowing ping, we use OpenStack's security group rule concept:

# nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0

After that, we can see we have one more rule created under that instance's specific chain:

-A nova-compute-inst-1 -p icmp -j ACCEPT

We can use the same way to enable ssh to the VMs with floating IP.

Table in database


OpenStack VNC console

YongShengGong 270005A0T0 | May 30 | Tags: vnc console openstack | Comments (0) | Visits (1,799)

In computing, Virtual Network Computing (VNC) is a graphical desktop sharing system that uses the RFB protocol to remotely control another computer. It transmits the keyboard and mouse events from one computer to another, relaying the graphical screen updates back in the other direction, over a network. In an IaaS system like OpenStack, VNC is a very convenient tool for end users to access VMs through a GUI.

Nova provides two kinds of VNC proxies: noVNC and nova-xvpvncproxy.

Components

To enable VNC console to access VMs started by OpenStack, we need many components to collaborate:

Components on controller node
- nova-api, which is API server of OpenStack
- nova-consoleauth, which is used to authorize VNC client
- noVNC, which is a VNC proxy for browser. An alternative proxy is nova-xvpvncproxy, which can be accessed by a Java client at https://github.com/cloudbuilders/nova-xvpvncviewer.
Components on compute node
- nova-compute, which is a nova binary to instantiate VMs
- libvirt driver, which is used by nova-compute to interact with libvirt server
- VNC server, which is part of hypervisor?

Steps for user to access VNC console

2. nova-compute running on that compute host accepts that message, generates a token and calls the libvirt driver's get_vnc_console method.

3. Libvirt driver will connect with libvirt server to get VM's vnc port, and look up the vnc host from FLAGS.vncserver_proxyclient_address.

4. nova-api will send out "authorize_console" message to nova-consoleauth, which then caches the connection information returned by compute-node with token as key

5. The user browses to the URL returned by the previous command, like below:

[root@robinlinux utils]# nova get-vnc-console myserver20 novnc
+-------+----------------------------------------------------------------------------------+
| Type | Url |
+-------+----------------------------------------------------------------------------------+
| novnc | http://controlnode:6080/vnc_auto.html?token=34ce7a44-6186-43d8-be14-2ea9e028b8fa |
+-------+----------------------------------------------------------------------------------+

6. noVNC, which is listening on 6080 HTTP port, accepts the URL request, sends out "check_token" message to nova-consoleauth

7. nova-consoleauth checks the cached connections and returns one according to the requested token key

8. noVNC begins the proxy work, connecting the VNC server
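The token handling in the steps above boils down to a small cache keyed by token. Here is a toy Python sketch of that idea: ToyConsoleAuth is hypothetical and only borrows the message names authorize_console and check_token from the steps, and the connection details (host and port) are made-up sample values.

```python
import uuid


class ToyConsoleAuth:
    """Toy stand-in for nova-consoleauth: caches connection info by token."""

    def __init__(self):
        self._connections = {}

    def authorize_console(self, token, connect_info):
        self._connections[token] = connect_info

    def check_token(self, token):
        # Returns None for unknown tokens, so the proxy can refuse them.
        return self._connections.get(token)


def console_url(base_url, token):
    """Compose the URL handed back to the user."""
    return "%s?token=%s" % (base_url, token)


consoleauth = ToyConsoleAuth()
token = str(uuid.uuid4())
# Sample connection info; the host and port values are invented.
consoleauth.authorize_console(token, {"host": "compute_node", "port": 5900})

print(console_url("http://controlnode:6080/vnc_auto.html", token))
print(consoleauth.check_token(token))
```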

About configuration items in nova.conf

#read by nova-compute to compose vnc-console URL
# for novnc
novncproxy_base_url=http://controlnode:6080/vnc_auto.html
# for xvpvnc
xvpvncproxy_base_url=http://controlnode:6081/console
#read by nova-compute to instantiate VMs
vncserver_listen=0.0.0.0

#read by libvirt driver to compute vnc-console URL
vncserver_proxyclient_address=compute_node


OpenStack nova-scheduler and its algorithm

YongShengGong 270005A0T0 | Apr 28 | Tags: scheduler openstack | Comments (2) | Visits (3,832)

Abstract

Among the current core projects of OpenStack, the Nova project is the core of the cores. As described on the OpenStack website, Nova is a cloud computing fabric controller, the main part of an IaaS system. There are more than 20 binaries in the OpenStack nova project. Among them, nova-scheduler is responsible for deciding, among other things, which compute node host should launch an image instance (a server in OpenStack terms). This article describes how this component does its job together with other components and how it makes decisions when faced with more than one compute node host and one instantiation request.

Overview

Just as said on the OpenStack website, OpenStack's mission is to produce the ubiquitous Open Source Cloud Computing platform that will meet the needs of public and private clouds regardless of size, by being simple to implement and massively scalable. In such cloud environments, there is often more than one compute node to instantiate image instances on. How to manage and measure these compute nodes is a very prominent problem. Based on some data, how to react to one user's request is a hot spot as well.

Let's first look at where nova-scheduler fits in the big picture.

Note: This figure is from http://ken.pepple.info/openstack/2012/02/21/revisit-openstack-architecture-diablo/.

Just as shown by above figure, nova-scheduler interacts with other components through queue and central database repo. For scheduling, queue is the essential communications hub.

All compute nodes (also known as hosts in terms of OpenStack) periodically publish their status, resources available and hardware capabilities to nova-scheduler through the queue. nova-scheduler then collects this data and uses it to make decisions when a request comes in.

Note: this picture comes from nova project.

The picture above shows the general idea of how the scheduler does its main job. The whole process is divided into two phases. The filtering phase generates a list of suitable hosts by applying filters. The weighting phase sorts the hosts according to their weighted cost scores, which are obtained by applying cost functions. The sorted list of hosts contains the candidates to fulfill the user's request. How many hosts in this list will be used depends on the number of instances requested in one request.

Following this overview, the rest of this article will describe:

1.What an instantiation request looks like and how it goes to nova-scheduler;

2.What are the main components in nova-scheduler;

3.How the nova-scheduler components collaborate to finish the scheduling work.

Scheduler Invocation

To depict the work of nova-scheduler, we have to talk a little about nova-api first. Just as with other nova binaries, nova-api is a WSGI server. Python Routes is adopted to map RESTful URL into internal Controller's method.

The figure above shows the main methods nova-api uses to respond to an image instantiation request. First the HTTP request is mapped to the Controller's create() method. This method processes the request body and then invokes compute_api's create() method, which in turn calls _create_instance(). Finally the _schedule_run_instance() method calls rpc_method() to send a message onto the message queue for nova-scheduler.

The rpc_method is called as follows:

rpc_method(context,

FLAGS.scheduler_topic,

{"method": "run_instance",

"args": {"topic": FLAGS.compute_topic,

"request_spec": request_spec,

"admin_password": admin_password,

"injected_files": injected_files,

"requested_networks": requested_networks,

"is_first_time": True,

"filter_properties": filter_properties}})

In the RPC message, the field injected_files represents files that will be injected into the VM disk image, while requested_networks carries network information, such as which network(s) will be used by the instance(s).

Two other parts of this message, request_spec and filter_properties, need further explanation. We will use the following command to generate sample values in the following sections:

nova boot --image a3fb743d-42df-49ba-b9c4-8042ebbd344e --flavor 1 myserver --hint test=testvalue --availability_zone=myzone::testhost

filter_properties

The filter_properties part of RPC message is to help nova-scheduler. Normally, it will contain scheduler_hints information from user request. We have sample content like this from previous nova boot command:

filter_properties: {
    'scheduler_hints': {
        'force_hosts': [u'testhost'],
        u'test': u'testvalue'
    }
}

If our availability_zone complies with the pattern zone:xx:host, force_hosts field will be in scheduler_hints. The value of force_hosts can target the request to a given host directly before going through scheduler's filters. Also there can be ignore_hosts in scheduler_hints, which means the specified hosts will be skipped during scheduling. Both force_hosts and ignore_hosts are applied before going through filters. Please see following section Inside of FilterScheduler.

request_spec

The request_spec part of the RPC message is an encapsulation, or normalization, of the HTTP request.

The sample content of request_spec in the RPC message produced by the previous nova boot command is shown below.

{
    'num_instances': 1,
    'block_device_mapping': [],
    'image': {
        'status': 'active',
        'name': 'cirros_blank',
        'deleted': False,
        'container_format': 'ami',
        'created_at': '2012-04-05 14:26:24',
        'disk_format': 'ami',
        'updated_at': '2012-04-05 14:26:25',
        'properties': {
            'kernel_id': '46bf134e-2e6e-472a-a159-f4cd51f36d84',
            'ramdisk_id': '106dc550-783e-4de7-951d-f4f3d5427698'
        },
        'min_ram': '0',
        'checksum': '2f81976cae15c16ef0010c51e3a6c163',
        'min_disk': '0',
        'is_public': True,
        'deleted_at': None,
        'id': 'a3fb743d-42df-49ba-b9c4-8042ebbd344e',
        'size': 25165824
    },
    'instance_type': {
        'root_gb': 0L,
        'name': u'm1.tiny',
        'deleted': False,
        'created_at': None,
        'ephemeral_gb': 0L,
        'updated_at': None,
        'memory_mb': 512L,
        'vcpus': 1L,
        'flavorid': u'1',
        'swap': 0L,
        'rxtx_factor': 1.0,
        'extra_specs': {},
        'deleted_at': None,
        'vcpu_weight': None,
        'id': 2L
    },
    'instance_properties': {  # used to consume virtual resources
        'vm_state': 'building',
        'ephemeral_gb': 0L,
        'access_ip_v6': None,
        'access_ip_v4': None,
        'kernel_id': '46bf134e-2e6e-472a-a159-f4cd51f36d84',
        'key_name': None,
        'ramdisk_id': '106dc550-783e-4de7-951d-f4f3d5427698',
        'instance_type_id': 2L,
        'user_data': '',
        'vm_mode': None,
        'display_name': u'myserver',
        'config_drive_id': '',
        'reservation_id': 'r-bdbnl7aa',
        'key_data': None,
        'root_gb': 0L,
        'user_id': u'81ced34d11954800906096555539c885',
        'uuid': u'4ccc7c93-cbde-4233-a7cb-5db81f82489b',
        'root_device_name': None,
        'availability_zone': u'myzone',  # default to FLAGS.default_schedule_zone
        'launch_time': '2012-04-11T15:08:55Z',
        'metadata': {},
        'display_description': u'myserver',
        'memory_mb': 512L,
        'launch_index': 0,
        'vcpus': 1L,
        'locked': False,
        'image_ref': u'a3fb743d-42df-49ba-b9c4-8042ebbd344e',
        'architecture': None,
        'power_state': 0,
        'auto_disk_config': None,
        'progress': 0,
        'os_type': None,
        'project_id': u'9d049e4b60b64716978ab415e6fbd5c0',
        'config_drive': ''
    },
    'security_group': ['default']
}

Some values in instance_properties are copied from instance_type. Both copies of these values play a role in the scheduler's work. The fields that were highlighted in red in the original formatting are important to nova-scheduler; most of them can be used in filters and cost functions.

Nova-scheduler class diagram

Before diving into how nova-scheduler deals with the instantiation request message, we should have a look at the data structures it uses.

Just as shown by above figure, many classes or modules work together in nova-scheduler:

1. SchedulerManager sits between the queue and the other nova-scheduler components. It receives requests from the queue and delegates jobs to its driver. The driver is defined by configuration option FLAGS.scheduler_driver with "nova.scheduler.multi.MultiScheduler" as default value.

2. Scheduler, the parent class for all other schedulers, has compute_api and host_manager attributes. The value of compute_api is nova.compute.api.API. The value of host_manager is defined by FLAGS.scheduler_host_manager with "nova.scheduler.host_manager.HostManager" as default value.

3. MultiScheduler is a subclass of Scheduler designed to delegate to configurable drivers per resource type (compute and volume today). The value of compute_driver is defined by configuration option FLAGS.compute_scheduler_driver with default value "nova.scheduler.filter_scheduler.FilterScheduler". The value of volume_driver is not within this article's scope.

4. FilterScheduler is responsible for selecting hosts and provisioning resources. It chooses the host by applying filters and calculating weighted costs. The host which passes the filters and has the least cost wins.

5. ChanceScheduler chooses the host randomly from the running hosts.

6. SimpleScheduler chooses the host based on running cores. The host with the fewest running cores wins out.

7. API is the compute API, used to call the API service of OpenStack compute.

8. HostManager is for collecting and saving host data.

9. Module least_cost contains the cost function and the WeightedHost class.

10. WeightedHost is a value object, with weight and hoststate as its two fields.

11. HostState records a host's capabilities and the virtual consumption of resources during the processing of one request.

Inside of nova-scheduler

When the request is sent out in the form of an RPC message, it is received by nova-scheduler, which calls SchedulerManager's run_instance() method. Following the call chain, control eventually arrives at FilterScheduler's schedule_run_instance() method.

Inside of FilterScheduler

By default, compute related scheduling will come to this class. Its schedule_run_instance() method will take the control to fulfill user request.

The diagram above shows the main tasks done by the schedule_run_instance() method:
1. _schedule()

This method selects some hosts on which to instantiate the image. By design, one instantiation request can ask for more than one instance, so the method returns a list of WeightedHosts sorted by weight, with the least weight first. This function also populates filter_properties with more data, such as request_spec, config_options and instance_type, before calling the filters and cost functions.

1.1 get_cost_function()

# (FloatOpt) How much weight to give the fill-first cost function. A negative value will reverse behavior: e.g. spread-first
compute_fill_first_cost_fn_weight=-1.0

# (ListOpt) Which cost functions the LeastCostScheduler should use
least_cost_functions="nova.scheduler.least_cost.compute_fill_first_cost_fn"

# (FloatOpt) How much weight to give the noop cost function

FilterScheduler's cost functions are organized in a list of tuples of weight and cost function. The functions are defined by FLAGS.least_cost_functions, and the corresponding weights are defined in separate options. For example, in the configuration fragment above, compute_fill_first_cost_fn_weight defines the weight for the default function nova.scheduler.least_cost.compute_fill_first_cost_fn. As of writing, this function just returns the free RAM of a HostState:

def compute_fill_first_cost_fn(host_state, weighing_properties):
    """More free ram = higher weight. So servers with less free
    ram will be preferred."""
    return host_state.free_ram_mb
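To see how the sign of the weight flips the behavior, consider this small sketch. The pick_host helper and the host names are hypothetical; it applies a single cost function (free RAM) and picks the host with the lowest weighted cost, which is the selection rule described in this article.

```python
def pick_host(free_ram_by_host, weight):
    """Pick the host with the lowest weighted cost (cost = free RAM in MB)."""
    return min(free_ram_by_host, key=lambda host: weight * free_ram_by_host[host])


# Hypothetical hosts and their free RAM in MB.
free_ram_by_host = {"host1": 2048, "host2": 8192}

# Positive weight: the host with less free RAM is cheapest (fill-first).
print(pick_host(free_ram_by_host, 1.0))
# Negative weight: the host with more free RAM is cheapest (spread-first).
print(pick_host(free_ram_by_host, -1.0))
```

With the weight of -1.0 from the configuration fragment above, the emptier host is preferred, so instances get spread across hosts.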

1.2 get_all_host_states()

It returns a dict of all the hosts the HostManager knows about. Also, each of the consumable resources in HostState is pre-populated and adjusted based on capabilities data of HostManager. A sample of the returned dict looks like {"host1":hoststate for host1, "host2":hoststate for host2,...}. Please see later sections for hoststate.

1.3 filter_host()

This function takes a list of HostStates and filter_properties as parameters and returns those which can pass the filters.
The allowed filters are defined by FLAGS.scheduler_available_filters with "nova.scheduler.filters.standard_filters" as default value. In fact, it traverses the filter path and returns a list of filter classes. At the time of writing, the list is:

'nova.scheduler.filters.isolated_hosts_filter.IsolatedHostsFilter'

'nova.scheduler.filters.compute_filter.ComputeFilter'

'nova.scheduler.filters.availability_zone_filter.AvailabilityZoneFilter'

'nova.scheduler.filters.ram_filter.RamFilter'

'nova.scheduler.filters.json_filter.JsonFilter'

'nova.scheduler.filters.all_hosts_filter.AllHostsFilter'

'nova.scheduler.filters.core_filter.CoreFilter'

'nova.scheduler.filters.affinity_filter.AffinityFilter'

'nova.scheduler.filters.affinity_filter.DifferentHostFilter'

'nova.scheduler.filters.affinity_filter.SameHostFilter'

'nova.scheduler.filters.affinity_filter.SimpleCIDRAffinityFilter'

For how each filter filters hosts, please see the FilterScheduler development reference.

Filters used are defined by FLAGS.scheduler_default_filters with "AvailabilityZoneFilter,RamFilter,ComputeFilter" as default value.

Each Filter has defined a host_passes() function which receives HostState and filter_properties as parameters and returns bool to indicate if the host specified in HostState is a good candidate for this filter.
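A filter in this style can be sketched in a few lines. ToyHostState and ToyRamFilter below are hypothetical, stripped-down stand-ins: the toy filter only compares free RAM against the requested memory_mb taken from instance_type in filter_properties, whereas the real filters may apply additional factors.

```python
class ToyHostState:
    """Minimal stand-in for HostState with only the fields used here."""

    def __init__(self, host, free_ram_mb):
        self.host = host
        self.free_ram_mb = free_ram_mb


class ToyRamFilter:
    """Passes hosts that have enough free RAM for the requested flavor."""

    def host_passes(self, host_state, filter_properties):
        instance_type = filter_properties.get("instance_type", {})
        requested_ram = instance_type.get("memory_mb", 0)
        return host_state.free_ram_mb >= requested_ram


# memory_mb of 512 matches the m1.tiny flavor in the sample request_spec.
props = {"instance_type": {"memory_mb": 512}}
ram_filter = ToyRamFilter()
print(ram_filter.host_passes(ToyHostState("host1", 256), props))
print(ram_filter.host_passes(ToyHostState("host2", 4096), props))
```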

1.4 passes_filters()

For each HostState object, the filter_host() method calls its passes_filters() method to check whether the host passes all of the defined filters. Before running the filters, this method checks the host against the rules given by the force_hosts and ignore_hosts fields of scheduler_hints. If ignore_hosts exists and the host represented by the HostState is in that list, the host fails. If force_hosts exists, the host passes only if it is in force_hosts. A host that is not filtered out by these rules then goes through the filters one by one until one of them fails. If every filter passes, the host moves on to the next phase, cost weighting.
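The order of these checks can be sketched as follows. This is a minimal illustration, not nova's code: the function name passes_filters_sketch is hypothetical, hosts are plain strings, and it assumes that when force_hosts is present it decides the outcome by itself, so the remaining filters are skipped.

```python
def passes_filters_sketch(host, filter_fns, filter_properties):
    """Return True if `host` survives the ignore/force rules and all filters."""
    ignore_hosts = filter_properties.get('ignore_hosts', [])
    force_hosts = filter_properties.get('force_hosts', [])

    # Rule 1: an ignored host always fails.
    if host in ignore_hosts:
        return False
    # Rule 2 (assumption): when force_hosts is given, membership alone decides.
    if force_hosts:
        return host in force_hosts
    # Otherwise run the filters and stop at the first one that fails.
    for filter_fn in filter_fns:
        if not filter_fn(host, filter_properties):
            return False
    return True


def never_passes(host, filter_properties):
    """Example filter that rejects every host."""
    return False


def always_passes(host, filter_properties):
    """Example filter that accepts every host."""
    return True
```

For example, passes_filters_sketch('host1', [never_passes], {'force_hosts': ['host1']}) returns True under the stated assumption, because the forced host bypasses the filter list.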

1.5 weighted_sum()

Its parameters are the cost functions returned by get_cost_function(), the HostStates returned by filter_host(), and filter_properties. The function first scores each host by running every cost function, producing a grid like the table below:

Host	fn#1	fn#2	fn#n
Host1	Score#1_1	Score#1_2	Score#1_n
Host2	Score#2_1	Score#2_2	Score#2_n
Hostn	Score#n_1	Score#n_2	Score#n_n

It then calculates a weighted score for each host by multiplying each cost function's score by that function's weight and summing the results, which yields a list of final weighted scores. The formula is:

Final score of a host = ∑(weight of cost function * score returned by that function for the host)

Next it pairs the scores with the HostStates to produce a list of (score, HostState) tuples.

Finally it sorts the tuples and returns a WeightedHost built from the first tuple, so the least-cost host wins.
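The scoring formula can be sketched in a few lines. The helper name weighted_sum_sketch, the use of plain dicts as hosts, and the (weight, function) pair layout are assumptions for illustration; the real function returns a WeightedHost rather than a tuple.

```python
def weighted_sum_sketch(weighted_fns, hosts, weighing_properties):
    """Return the (score, host) tuple with the lowest weighted score.

    weighted_fns is a list of (weight, cost_fn) pairs (assumed layout).
    Raises ValueError if `hosts` is empty.
    """
    scored = []
    for host in hosts:
        # Final score = sum over cost functions of weight * score.
        total = sum(weight * cost_fn(host, weighing_properties)
                    for weight, cost_fn in weighted_fns)
        scored.append((total, host))
    # The least-cost host wins.
    return min(scored, key=lambda pair: pair[0])


def free_ram_cost(host, weighing_properties):
    """Cost function in the spirit of compute_fill_first_cost_fn."""
    return host['free_ram_mb']


SAMPLE_HOSTS = [
    {'host': 'host1', 'free_ram_mb': 4096},
    {'host': 'host2', 'free_ram_mb': 1024},
]
```

With a weight of 1.0 the host with the least free RAM (host2) has the lowest cost, which gives fill-first behavior. With a weight of -1.0 the ordering flips and host1 wins, which spreads instances across hosts.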

1.6 getHostState()

It returns the HostState object from the selected WeightedHost.

1.7 consume_from_instance()

It takes instance_properties, which comes from request_spec, as its parameter. This function adjusts the HostState object's data to virtually consume the resources, so that the HostState object can take part in the next round of host selection for the next instance of the same request.
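The idea of virtually consuming resources can be sketched as below. The class name SketchConsumableHost and the field names (memory_mb, root_gb, ephemeral_gb, vcpus) are assumptions for this example; the point is only that the in-memory bookkeeping changes while nothing is allocated on the real host yet.

```python
class SketchConsumableHost(object):
    """Hypothetical HostState-like object that tracks free resources."""

    def __init__(self, host, free_ram_mb, free_disk_mb, vcpus_total):
        self.host = host
        self.free_ram_mb = free_ram_mb
        self.free_disk_mb = free_disk_mb
        self.vcpus_total = vcpus_total
        self.vcpus_used = 0

    def consume_from_instance(self, instance):
        """Subtract the instance's requirements from the tracked resources."""
        # Assumed keys: disk sizes are given in GB, memory in MB.
        disk_mb = (instance['root_gb'] + instance['ephemeral_gb']) * 1024
        self.free_ram_mb -= instance['memory_mb']
        self.free_disk_mb -= disk_mb
        self.vcpus_used += instance['vcpus']


def demo_consume():
    """Consume one instance and return the updated host state."""
    host = SketchConsumableHost('host1', 4096, 20480, 4)
    host.consume_from_instance(
        {'memory_mb': 2048, 'root_gb': 10, 'ephemeral_gb': 0, 'vcpus': 1})
    return host
```

After demo_consume(), the host reports 2048 MB of free RAM, so a second 2048 MB instance in the same request would still fit, but a third would not.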

2 _provision_resource()

This function creates the requested resource, such as an instance built from an image.

2.1 cast_to_compute_host()

This function casts a message onto the queue for the target host so that the host creates the instance described by the request_spec.

Scheduler’s intelligent data - Host Capabilities

Host capability data is another important input for the scheduler. Every OpenStack service, such as nova-compute or nova-network, can publish its capabilities. The figure above depicts how the compute host updates the capabilities of its compute service and publishes them to the nova scheduler, and how the nova scheduler saves this data for later use. Roughly, the whole process splits into three parts:

1. To collect capabilities

ComputeManager's _report_driver_status() method is a periodic task that calls update_service_capabilities() to update the capabilities. LibvirtConnection does the actual work: its get_host_stats() method collects the host capabilities. Other connection types exist; which one is used depends on the configuration in nova.conf and is usually hypervisor dependent.

One sample of capabilities data looks like:

{
    u'disk_available': 226,
    u'cpu_info': {
        u'vendor': u'Intel',
        u'model': u'Westmere',
        u'arch': u'x86_64',
        u'features': [u'rdtscp', u'x2apic', u'xtpr', u'tm2', u'est', u'vmx', u'ds_cpl', u'monitor', u'pbe', u'tm', u'ht', u'ss', u'acpi', u'ds', u'vme'],
        u'topology': {
            u'cores': u'2',
            u'threads': u'2',
            u'sockets': u'1'
        }
    },
    u'hypervisor_type': u'QEMU',
    u'vcpus_used': 0,
    u'vcpus': 4,
    u'host_memory_free': 1718,
    u'disk_total': 375,
    u'host_memory_total': 3845,
    u'hypervisor_version': 15000,
    u'disk_used': 149
}

2. To publish capabilities

The publish_service_capabilities() method is another periodic task of ComputeManager. It delegates to the scheduler API to send the capabilities out on the message queue. Besides the capabilities, the message carries the topic 'scheduler', the service name 'compute', and the hostname.

3. To receive capabilities

When the message is on the queue, nova-scheduler picks it up and calls the SchedulerManager, which calls the Scheduler's update_service_capabilities() method, which in turn invokes the HostManager's update_service_capabilities() method. The HostManager then keeps the capabilities data for that service on that host until the next update.
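The last step, where the capabilities are kept per host and per service until the next update, might look roughly like the sketch below. The class name SketchCapabilityStore, the nested-dict layout, and the added 'timestamp' key are assumptions for illustration, not a description of how HostManager stores its data.

```python
import time


class SketchCapabilityStore(object):
    """Hypothetical store that keeps the latest capabilities per host/service."""

    def __init__(self):
        # Assumed layout: {host: {service_name: capabilities_dict}}
        self.service_states = {}

    def update_service_capabilities(self, service_name, host, capabilities):
        """Replace whatever was previously saved for this service on this host."""
        entry = dict(capabilities)  # copy so the caller's dict is not mutated
        entry['timestamp'] = time.time()  # assumed bookkeeping field
        self.service_states.setdefault(host, {})[service_name] = entry


def demo_store():
    """Publish two updates for the same host and return the store."""
    store = SketchCapabilityStore()
    store.update_service_capabilities('compute', 'host1',
                                      {'host_memory_free': 1718})
    store.update_service_capabilities('compute', 'host1',
                                      {'host_memory_free': 1200})
    return store
```

Each new message simply overwrites the previous entry, so the scheduler always works from the most recent report it has received for a given host and service.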

Summary

To wrap up, the scheduler starts to matter as soon as a cloud grows to two hosts, and the more hosts there are, the more important it becomes. Among all the inputs to nova-scheduler, three are the most important: the configuration in nova.conf, the service capabilities of each host, and the request spec. The configuration in nova.conf determines the static and run-time class structure, the service capabilities serve as the base intelligent data, and the request spec is the service target. Nova-scheduler can schedule to certain hosts and skip others according to the request spec. In addition to the hosts specified in the request spec, the zone concept helps the scheduler distribute requests to the hosts that are members of a zone. Once you understand these internals, the default behavior of nova-scheduler can be easily modified in nova.conf.