Auto-Scaling your Apps with TOSCA & Cloudify - An Orchestration Tool Comparison Pt II of II
Posted By: Eli Polonsky on June 2, 2015
This is the second part of our Orchestration tool comparison. You can find part I here.
PLEASE NOTE: This blog post does not offer a fully working example of auto-scaling with TOSCA & Cloudify, only a theoretical example of how it would work. We hope to have a working example in the near future.
This section assumes basic knowledge of Cloudify and TOSCA. If you don’t happen to have this knowledge, you can have a look at the following links to help you understand what they’re all about:
TOSCA is an evolving open standard for a templating language to orchestrate applications on the cloud. It is not bound to any specific cloud provider, and is in itself nothing but a templating language, with no engine to back it up. This is inherently different from Heat, which provides both a templating language and the orchestration engine behind it.
For context, Cloudify is an orchestration engine that understands TOSCA templates and can interact with multiple cloud providers. It is the engine we will be demonstrating in this section.
The most important concept to understand about TOSCA templates is that every single entity in your application topology is a node_template, which is of a certain node_type, and has a lifecycle interface that exactly depicts how to manipulate, create, and delete the entity.
A node_template can be thought of as the equivalent of a Heat resource, and a node_type as the equivalent of a Heat resource type. The lifecycle interface, however, has no corresponding concept in Heat. This fact will prove very important to us later on.
So again, let’s dive right into an example. We will try to adapt the same use case from before, auto-scaling a Wordpress server that connects to a static and shared MariaDB instance, to be written in TOSCA and consumed by Cloudify.
This is a definition of a node_template of the host that we want to install the MariaDB instance on. In TOSCA, each logical entity in the application should be modeled separately, with relationships tying these entities to one another. In order to install the MariaDB instance on this host, we define a new node_template.
With Cloudify you don’t have to use user_data in order to install the software. Instead, you break the installation process into separate parts that hook into the lifecycle interface, with all the logic itself residing inside scripts. Notice the relationships section, which tells the db to be installed specifically on the db_host. Let’s move on to the scaling-related part.
As in the previous post, any auto-scaling implementation should answer three basic questions:
Which resource to scale?
What does the scaling process do?
When should the scaling process be triggered?
Q1: The Which
Here we defined a pretty cool node_template. It is of type cloudify.nodes.openstack.Server, and it has some additional interfaces to give it monitoring capabilities. We can see that the cloudify.interfaces.monitoring_agent takes care of installing a monitoring agent called Diamond, and the cloudify.interfaces.monitoring configures the agent with various collectors that gather data from the host. Remember that with OpenStack Heat, all of this configuration is hidden from the user, but it still exists. Now let’s define the actual Wordpress application.
Notice that we are defining cloudify.interfaces.monitoring on this node_template as well, which tells the monitoring agent on the host to add an HttpdCollector, one of the many built-in collectors in Diamond. This is awesome, because we can later use these metrics to do some intelligent application-specific scaling.
Q2: The What
Now we want to understand exactly what the scaling process will do. In Cloudify, every process is thought of as a workflow (to learn more about this you can read the post “What is a Workflow?”), which is essentially an automation algorithm. Each workflow is composed of invocations of the interface operations associated with a specific node_type.
The workflow definition itself is part of the template:
And the workflow code is a Python method written with the Cloudify Workflow API.
Let's see a small snippet:
Remember that every node_template implements the lifecycle operations differently, and these are what embody the difference between scaling a MariaDB instance and scaling a Wordpress instance. But the process remains the same. Also, because this is all part of the template itself, and not part of the workflow engine, users can customize this process however they see fit.
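To make the idea concrete, here is a minimal sketch in plain Python of a scale-out process that stays the same while each node supplies its own lifecycle operations. It does not use the Cloudify Workflow API. The NodeTemplate class, the scale_out function, the operation names in LIFECYCLE, and the messages are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout: this models the idea only,
# it is NOT the Cloudify Workflow API.
LIFECYCLE = ["create", "configure", "start"]


@dataclass
class NodeTemplate:
    name: str
    # Maps a lifecycle operation name to the function that implements it.
    operations: Dict[str, Callable[[str], str]] = field(default_factory=dict)


def scale_out(node: NodeTemplate, current: int, delta: int) -> List[str]:
    """Add `delta` instances by running the lifecycle operations in order."""
    log = []
    for index in range(current, current + delta):
        instance_id = f"{node.name}_{index}"
        for op in LIFECYCLE:
            impl = node.operations.get(op)
            if impl is not None:  # operations a node does not implement are skipped
                log.append(impl(instance_id))
    return log


# Same process, different implementations per node (illustrative messages).
wordpress = NodeTemplate("wordpress", {
    "create": lambda i: f"{i}: install wordpress",
    "configure": lambda i: f"{i}: point wordpress at the shared db",
    "start": lambda i: f"{i}: start httpd",
})
db = NodeTemplate("db", {
    "create": lambda i: f"{i}: install mariadb",
    "start": lambda i: f"{i}: start mariadb",
})

for line in scale_out(wordpress, current=1, delta=1):
    print(line)
```

The scale_out function never mentions Wordpress or MariaDB. Swapping the operations on a node changes what scaling means for that node without touching the process itself.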
Q3: The When
This part is what TOSCA defines as Groups and Policies. However, at the time of writing, policies have not yet been fully designed in the spec. Cloudify has its own version of this, which might find its way into the official spec eventually, but for now, this is how it looks:
Let’s break down the elements to understand what’s going on.
For the most part, we can see that the terms used here are relatively familiar to us. We have a group called autoscaling_group that defines in its members which nodes will be considered for examination. Notice that we are specifying the wordpress node_template, and not the wordpress_host, since we are interested in metrics that are specific to the application, not the host.
We also define the scaleup_one_instance policy, which instructs the Cloudify engine to trigger an action once the ReqPerSec metric value stays above the 1000 threshold for a measurement period of at least 60 seconds. The action itself is spelled out in the triggers section of the scaleup_one_instance policy, which declares that the scale workflow should be executed with the given parameters. In our case, these parameters tell the engine to add one wordpress_host instance.
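The rule "above 1000 for at least 60 seconds" can be sketched in a few lines of plain Python. This is only an illustration of the logic, not how the Cloudify policy engine is implemented. The function name threshold_breached and the (timestamp, value) sample format are assumptions made for this example.

```python
from typing import Iterable, Optional, Tuple


def threshold_breached(samples: Iterable[Tuple[float, float]],
                       threshold: float = 1000.0,
                       period: float = 60.0) -> bool:
    """Return True once the metric stays above `threshold` for `period` seconds.

    `samples` holds (timestamp_in_seconds, value) pairs in time order.
    This sample format is an assumption made for the sketch.
    """
    breach_start: Optional[float] = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # first sample of a new breach
            if timestamp - breach_start >= period:
                return True
        else:
            breach_start = None  # any dip below the threshold resets the clock
    return False


# ReqPerSec sampled every 10 seconds, constantly above the threshold.
steady = [(float(t), 1200.0) for t in range(0, 70, 10)]
print(threshold_breached(steady))
```

A short spike does not trigger scaling here, because a single sample at or below the threshold resets the measurement period.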
On the backend side of Cloudify, all of the metrics are stored in a time series database called InfluxDB, which can be easily queried. Cloudify also provides a Python REST client. With this client you can trigger any kind of workflow you like:
This is of course also available via the REST API. This means that if scaling should depend on a certain calculation that is not exposed in the policies, it should be fairly easy to implement that calculation separately.
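As a rough sketch of such an external calculation, the code below derives how many instances to add from recent request rates and then hands the decision to a callback. The callback start_workflow is a hypothetical stand-in for whatever actually starts the workflow (the REST client or a raw REST call). The parameter names node_id and delta, and the per-instance target of 1000 requests per second, are assumptions for illustration, not a confirmed signature.

```python
import math
from typing import Callable, Sequence


def desired_delta(total_req_per_sec: Sequence[float], instances: int,
                  target_per_instance: float = 1000.0) -> int:
    """Instances to add so the average total load fits the per-instance target."""
    if not total_req_per_sec:
        return 0
    average = sum(total_req_per_sec) / len(total_req_per_sec)
    needed = math.ceil(average / target_per_instance)
    return max(needed - instances, 0)


def maybe_scale(total_req_per_sec: Sequence[float], instances: int,
                start_workflow: Callable[[str, dict], None]) -> int:
    """Run the calculation and trigger the scale workflow when needed."""
    delta = desired_delta(total_req_per_sec, instances)
    if delta > 0:
        # "node_id" and "delta" are assumed parameter names for this sketch.
        start_workflow("scale", {"node_id": "wordpress_host", "delta": delta})
    return delta


def print_only(workflow_id: str, parameters: dict) -> None:
    """Stand-in trigger that prints instead of calling a real manager."""
    print(f"would start workflow {workflow_id!r} with {parameters}")


maybe_scale([2400.0, 2600.0], instances=1, start_workflow=print_only)
```

Passing the trigger in as a function keeps the calculation testable without a running manager. In a real setup the callback would wrap the client call.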
So, what did we learn from all of this? The purpose of this post series was to understand how one can perform automatic scaling on OpenStack, as well as understand the current gaps in the implementations, and see what can be improved.
We saw two different methods, and I think it is a fair bet that you won’t find any good tools out there that do this in a completely different manner. So it’s safe to say that what we just saw covers pretty much all there currently is.
The first method was done using the native OpenStack orchestration tool, aka Heat. I think that Heat does a very good job with regards to what it was initially built for, which is orchestrating OpenStack infrastructure resources. It still feels like Heat doesn’t really live and breathe application orchestration, which makes it difficult to manage and scale applications, not just infrastructure. The fact that the scaling process is hard-coded inside the engine might prove to be a serious limitation for complex use cases, but is actually an advantage for the most common ones.
The second way was by using an open source orchestration engine called Cloudify, which adheres to the TOSCA open standard. Using this tool gives you more native integration with applications of various sorts, but less integration with OpenStack resources. You might find that some resource types are not yet supported, but overall it has very good coverage of the OpenStack API. Using Cloudify, you have full control over the scaling process. You can extend and modify the built-in workflow to suit your specific needs, which is a great ability to have.
Granted, this requires some Python coding at the moment, but plans are being made to make this workflow composition completely declarative. Another thing worth mentioning is that Cloudify is completely cloud agnostic: it does not rely on the built-in type definitions or monitoring capabilities of a specific cloud provider. This is a direct result of using a standard like TOSCA, which makes all of it possible through the concept of a lifecycle interface. This means that you can take your Cloudify installation and templates, and migrate or extend your workload to a different cloud provider with no changes to the scaling-related aspects of the template.
I hope this gives you fairly extensive coverage of the options out there. Eventually, you will have to think carefully about your use case and choose the approach that best fits what you’re looking to achieve. That said, regardless of what that is, you will most likely find a solution, which is very good news.