I’ve been dabbling a bit more with OpenStack as of late. If you know me, you can likely guess my goal is how to ingest logs, monitor resources, etc.
I’ve been trying to see how well Ceilometer, one of the core components of OpenStack that actually provides some of this stuff, would work. Initially, I was a bit bummed, but after fumbling around for a while, I am starting to see the light.
You see, the reason I almost abandoned the idea of using Ceilometer was that some of the “meters” it provides are, well, a bit nonsensical (IMHO).
For example, there’s network.outgoing.bytes, which is what you would expect… sort of. Turns out, this is a “cumulative” meter. In other words, this meter tells me the total number of bytes sent out a given Instance’s virtual NIC. Ceilometer has the following meter types:
- Cumulative: Increasing over time (e.g.: instance hours)
- Gauge: Discrete items (e.g.: floating IPs, image uploads) and fluctuating values (e.g.: disk I/O)
- Delta: Changing over time (bandwidth)
I’ll take an aside here and defend the fine folks working on Ceilometer on this one. Ceilometer was built initially to generate non-repudiable billing info. Technically, AFAIK, that is the project’s only goal – though it has morphed to gain things like an alarm framework.
So, now you can see why network.outgoing.bytes would be cumulative: so you can bill a customer for serving up torrents, etc.
Anyway, I can’t imagine that I’m the only person looking to get Delta metrics out of a Cumulative one, so I thought I’d document my way of getting there. Ultimately there might be a better way, YMMV, caveat emptor, covering my backside, yadda yadda.
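Before we get into the Ceilometer-specific way of doing this, it helps to see the arithmetic we’re after. Here’s a rough, stdlib-only Python sketch (the function and names are mine, not Ceilometer’s) of turning cumulative readings into per-interval deltas:

```python
def cumulative_to_deltas(samples):
    """Given (timestamp, cumulative_value) pairs from a cumulative
    meter, return (timestamp, delta) pairs -- the per-interval change,
    which is what a delta-style meter would report."""
    deltas = []
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        deltas.append((ts, cur - prev))
    return deltas
```

So cumulative network.outgoing.bytes readings of [(0, 0), (10, 1000), (20, 4000)] become [(10, 1000), (20, 3000)] – 1000 bytes in the first interval, 3000 in the second. That’s the transformation we want Ceilometer to do for us.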
Transformers to the Rescue!
… no, not that kind of Transformer. Lemme ‘splain. No, there is too much. Let me sum up.
Actually, before we start diving in, let’s take a quick tour of Ceilometer’s workflow.
The general steps to the Ceilometer workflow are:
Collect -> (Optionally) Transform -> Publish -> Store -> Read
Collect
There are two methods of collection:
- Services (Nova, Neutron, Cinder, Glance, Swift) push data into AMQP and Ceilometer slurps them down
- Agents/Notification Handlers poll APIs of the services
This is where our meter data comes from.
Transform/Publish
This is the focus of this post. Transforms are done via the “Pipeline.”
The flow for the Pipeline is:
Meter -> Transformer -> Publisher -> Receiver
- Meter: Data/Event being collected
- Transformer: (Optional) Take meters and output new data based on those meters
- Publisher: How you want to push the data out of Ceilometer
- To my knowledge, there are only two options:
- RPC
- UDP (using msgpack)
- Receiver: This is the system outside Ceilometer that will receive what the Publisher sends (Logstash for me, at least at the present – will likely move to StatsD + Graphite later on)
Store
While tangential to this post, I won’t leave you wondering about the “Store” part of the pipeline. Here are the storage options:
- Default: Embedded MongoDB
- Optional:
  - SQL
  - HBase
  - DB2
Honorable Mention: Ceilometer APIs
Like pretty much everything else in OpenStack, Ceilometer has a suite of open APIs that can also be used to fetch metering data. I initially considered this route, but in the interest of efficiency (read: laziness), I opted to use the Publisher vs rolling my own code to call the APIs.
Working the Pipeline
There are two Transformers (at least that I see in the source):
- Scaling
- Rate of Change
In our case, we are interested in the latter, as it will give us the delta between two samples.
To change/use a given Transformer, we need to create a new pipeline via /etc/ceilometer/pipeline.yaml.
Here is the default pipeline.yaml:
---
-
    name: meter_pipeline
    interval: 600
    meters:
        - "*"
    transformers:
    publishers:
        - rpc://
-
    name: cpu_pipeline
    interval: 600
    meters:
        - "cpu"
    transformers:
        - name: "rate_of_change"
          parameters:
              target:
                  name: "cpu_util"
                  unit: "%"
                  type: "gauge"
                  scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"
    publishers:
        - rpc://
The “cpu_pipeline” pipeline gives us a good example of what we will need:
- A name for the pipeline
- The interval (in seconds) that we want the pipeline triggered
- Which meters we are interested in (“*” is a wildcard for everything, but you can also have an explicit list for when you want the same transformer to act on multiple meters)
- The name of the transformation we want to use (scaling|rate_of_change)
- Some parameters to do our transformation:
- Name: Optionally used if you want to override the metric’s original name
- Unit: Like Name, can be used to override the original unit (useful for things like converting network.*.bytes from B(ytes) to MB or GB)
- Type: If you want to override the default type (remember they are: cumulative|gauge|delta)
- Scale: A snippet of Python for when you want to scale the result in some way (would typically be used along with Unit)
- Side note: This one seems to be required, as when I omitted it, I got the value of the cumulative metric. Please feel free to comment if I goobered something up there.
Looking at all of this, we can see that the cpu_pipeline, er, pipeline:
- Multiplies the number of vCPUs in the instance (resource_metadata.cpu_number) by 1 billion (10^9, or 10**9 in Python)
  - Note the “or 1”, which is a catch for when resource_metadata.cpu_number doesn’t exist
- Divides 100 by the result
The end result is a value that tells us how taxed the Instance is from a CPU standpoint, expressed as a percentage.
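To sanity-check that reading of the scale expression, here’s the arithmetic the Rate of Change transformer is effectively doing, sketched as plain Python (my own rough reconstruction, not Ceilometer’s actual code):

```python
def cpu_util(t1, cpu_ns1, t2, cpu_ns2, cpu_number=1):
    """Approximate the cpu_util gauge: the rate of change of the
    cumulative cpu meter (nanoseconds of CPU time consumed), scaled
    by 100.0 / (10**9 * cpu_number) to yield a percentage."""
    # Rate of change: ns of CPU time burned per second of wall time
    rate = (cpu_ns2 - cpu_ns1) / (t2 - t1)
    # Apply the pipeline's scale expression
    return rate * 100.0 / (10**9 * (cpu_number or 1))
```

For example, a 1-vCPU instance that burned 5 billion ns of CPU time over 10 wall-clock seconds comes out to 50% utilization; double the vCPUs and the same burn rate halves the percentage.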
Bringing it all Home
Armed with this knowledge, here is what I came up with to get a delta metric out of the network.*.bytes metrics:
-
    name: network_stats
    interval: 10
    meters:
        - "network.incoming.bytes"
        - "network.outgoing.bytes"
    transformers:
        - name: "rate_of_change"
          parameters:
              target:
                  type: "gauge"
                  scale: "1"
    publishers:
        - udp://192.168.255.149:31337
In this case, I’m taking the network.incoming.bytes and network.outgoing.bytes metrics and passing them through the “Rate of Change” transformer to spit a gauge out of what was previously a cumulative metric.
I could have taken it a step further (and likely will) and used the scale parameter to change it from bytes to KB. For now, I am playing with OpenStack in a VM on my laptop, so the amount of traffic is small. After all, the difference between 1.1 and 1.4 in a histogram panel in Kibana isn’t very interesting looking :)
Oh, I forgot… the Publisher. Remember how I said the UDP Publisher uses msgpack to pack its data? It just so happens that Logstash has both a UDP input and a msgpack codec. As a result, my Receiver is Logstash – at least for now. Again, it would make a lot more sense to ship this through StatsD and use Graphite to visualize the data. But, even then, I can still use Logstash’s StatsD output for that. Decisions, decisions :)
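Logstash isn’t mandatory on the receiving end, of course – the UDP Publisher is just firing codec-encoded sample dicts at a port. Here’s a rough, stdlib-only Python sketch of a receiver (my own toy code, not part of Ceilometer; json.loads stands in for msgpack.loads here so the sketch has no third-party dependencies – for real Ceilometer traffic you’d install msgpack and swap it in as the decode function):

```python
import json
import socket

def receive_samples(sock, decode, handle, count):
    """Pull `count` datagrams off a bound UDP socket, decode each
    one into a sample dict, and pass it to `handle`."""
    for _ in range(count):
        data, _addr = sock.recvfrom(65535)
        handle(decode(data))
```

Point the Publisher at the host/port your socket is bound to, and `handle` sees the same dicts Logstash would (name, volume, type, resource_metadata, and so on).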
Since the data is in Logstash, that means I can use Kibana to make pretty charts with the data.
Here are the bits I added to my Logstash config to make this happen:
input {
  udp {
    port => 31337
    codec => msgpack
    type => ceilometer
  }
}
At that point, I get lovely input in ElasticSearch like the following:
{
  "_index": "logstash-2014.01.16",
  "_type": "logs",
  "_id": "CDPI8-ADSDCoPiqY9YqlEw",
  "_score": null,
  "_source": {
    "user_id": "21c98bfa03f14d56bb7a3a44285acf12",
    "name": "network.incoming.bytes",
    "resource_id": "instance-00000009-41a3ff24-f47e-4e29-86ce-4f50a61f78bd-tap30829cd9-5e",
    "timestamp": "2014-01-16T21:54:56Z",
    "resource_metadata": {
      "name": "tap30829cd9-5e",
      "parameters": {},
      "fref": null,
      "instance_id": "41a3ff24-f47e-4e29-86ce-4f50a61f78bd",
      "instance_type": "bfeabe24-08dc-4ea9-9321-1f7cf74b858b",
      "mac": "fa:16:3e:95:84:b8"
    },
    "volume": 1281.7777777777778,
    "source": "openstack",
    "project_id": "81ad9bf97d5f47da9d85569a50bdf4c2",
    "type": "gauge",
    "id": "d66f268c-7ef8-11e3-98cb-000c29785579",
    "unit": "B",
    "@timestamp": "2014-01-16T21:54:56.573Z",
    "@version": "1",
    "tags": [],
    "host": "192.168.255.147"
  },
  "sort": [
    1389909296573
  ]
}
Finally, I can point a Histogram panel in Kibana at the “volume” field for these documents and graph the result, like so:
OK, so maybe not as pretty as I sold it to be, but that’s the data’s fault – not the toolchain :) It will look much more interesting once I mirror this on a live system.
Hopefully someone out there on the Intertubes will find this useful and let me know if there’s a better way to get at this!
Original post:
https://cjchand.wordpress.com/2014/01/16/transforming-cumulative-ceilometer-stats-to-gauges/