Workflow Automation PT II - Docker Orchestration and Management

Posted By: Luther Trammell on April 17, 2015




When we design the blueprint, it’s important to think about when certain data is available. Inputs to the blueprint can be used at any time, but some data is only available at runtime.

In my previous post, What is a Workflow?, I described the basic premise behind workflows, how these flows work, and what they comprise. I used a simple example to show the order of operations required for its components to install correctly.



In this post, I’d like to dive into a slightly more complex use of the install workflow: running a Docker application across multiple nodes.

Let’s say we have an application like Nodecellar, which consists of a NodeJS app and a MongoDB backend. (And not just because this is the example from our documentation; it really does the job of demonstrating this well. Really.)

The requirements here are:

  • Two virtual hosts.

  • Each virtual host must have a Docker server installed and running.

  • One of our virtual hosts must have a MongoDB container running.

  • The other must have the NodeJS/Nodecellar app container running.

  • The Nodecellar/NodeJS container must be created after the MongoDB container. (Otherwise, when the NodeJS app starts it will not be able to connect to the database.)

In this scenario there are four nodes and three relationships.

The Nodes:

  1. Virtual Host 1

  2. Virtual Host 2

  3. NodeJS/Nodecellar app container

  4. MongoDB container

The Relationships:

  1. MongoDB container contained_in Virtual Host 2

  2. NodeJS/Nodecellar container contained_in Virtual Host 1

  3. Nodecellar container connected_to MongoDB container

It doesn’t really matter when the virtual hosts are created, as long as they’re created before the containers.

The MongoDB container must be created before the NodeJS/Nodecellar container, and it must be started before the NodeJS/Nodecellar container is started.
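The ordering constraints above amount to a small dependency graph. As a minimal sketch (this is not Cloudify's workflow engine, only an illustration of the idea), Python's standard graphlib module can derive one valid creation order from the four nodes and three relationships. The names virtual_host_1 and virtual_host_2 are labels made up here for the two hosts.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each node maps to the set of nodes that must exist before it.
# virtual_host_1 / virtual_host_2 are illustrative labels, not blueprint names.
depends_on = {
    "virtual_host_1": set(),
    "virtual_host_2": set(),
    # MongoDB container contained_in Virtual Host 2
    "mongod_container": {"virtual_host_2"},
    # Nodecellar container contained_in Virtual Host 1 and
    # connected_to the MongoDB container
    "nodecellar_container": {"virtual_host_1", "mongod_container"},
}

# static_order() yields every node after all of its prerequisites.
install_order = list(TopologicalSorter(depends_on).static_order())
print(install_order)
```

The two hosts may appear in either order, which matches the point that it doesn't matter when they are created, as long as each comes before the container it hosts.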

The complicated part is the installation and starting of the Docker server.

Docker needs to run as root. The Cloudify host agent does not run as root, yet it needs to install the Cloudify Docker plugin and run the related plugin operations.

This poses a problem, because cloud images with Docker installed by default are uncommon in public clouds. Furthermore, the agent user on your Cloudify agent VM must be added to the docker group on that system.

Having Cloudify install Docker for you is problematic for two reasons:

  • The Docker installation is a bit time-consuming, especially for a production environment.

  • Once the Cloudify Agent is installed on the target host, its permissions cannot be reloaded.

(There is currently no way to log the Cloudify host agent out, or reload its permissions through plugin operations.)

Therefore, in general the user needs to make sure to bring an image with Docker pre-installed and the correct group membership already established.
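To see why group membership has to be in place before the agent starts, it helps to separate two questions: what the system's group database says, and what groups an already running process holds. On Linux, a process keeps the group set it started with, so adding a user to a group later does not change processes that are already running. The sketch below uses only Python's standard grp, pwd and os modules (Unix only); the function names are made up for illustration and are not part of Cloudify or Docker.

```python
import grp
import os
import pwd


def group_in_database(user: str, group: str) -> bool:
    """True if the system group database lists `user` as a member of `group`."""
    try:
        entry = grp.getgrnam(group)
    except KeyError:
        return False  # the group does not exist on this host
    if user in entry.gr_mem:
        return True  # listed as a supplementary member
    try:
        # The user's primary group is not listed in gr_mem, so check it too.
        return pwd.getpwnam(user).pw_gid == entry.gr_gid
    except KeyError:
        return False  # the user does not exist on this host


def group_in_running_process(group: str) -> bool:
    """True if the current process already holds `group`."""
    try:
        gid = grp.getgrnam(group).gr_gid
    except KeyError:
        return False
    return gid in os.getgroups() or gid == os.getegid()
```

If group_in_database(user, "docker") is true for the agent's user but group_in_running_process("docker") is false inside the agent's process, the membership was added after the process started, which is the situation described above.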

Another approach, shown in the Cloudify Nodecellar Docker Example repo, uses userdata.

Let’s consider this excerpt from the Nodecellar container definition in the Docker example blueprint:

nodecellar_container:

See the Cloudify Nodecellar Docker Example Simple Blueprint.

Notice that the relationship type that defines the relationship between the nodecellar_container and the mongod_container is depends_on.

This satisfies the requirement that the MongoDB container is running before the NodeJS container is started.

During the workflow, runtime properties of the nodes will be assigned.

In order to access such properties in a blueprint, you need to use get_attribute. This is an intrinsic function of the Cloudify TOSCA DSL.

Hint: Remember that you can only use this function in an input to an operation, or in a blueprint output.

When a blueprint runs Docker containers on separate hosts, we need runtime properties of some of the nodes: specifically, the IP of the MongoDB host, so that we can tell the Nodecellar container where to look for MongoDB.

See: Cloudify Nodecellar Docker Example Openstack Blueprint.

Here we depend not only on the container running, but also on the IP runtime property of its virtual host having been set.
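The difference between blueprint inputs and runtime properties can be modeled with a toy resolver. This is only a sketch of the idea, not how Cloudify implements get_attribute. The node name mongod_host, the IP address, and the resolve function are all assumptions made up for the example.

```python
from typing import Any

# Runtime properties, filled in node by node as the install workflow runs.
runtime_properties: dict[str, dict[str, Any]] = {}


def resolve(value: Any) -> Any:
    """Resolve a {'get_attribute': [node, attribute]} reference, if present.

    Plain values are returned unchanged. A reference to an attribute that
    has not been set yet raises LookupError.
    """
    if isinstance(value, dict) and set(value) == {"get_attribute"}:
        node, attribute = value["get_attribute"]
        try:
            return runtime_properties[node][attribute]
        except KeyError:
            raise LookupError(f"{node}.{attribute} is not set yet") from None
    return value


# Hypothetical operation input for the Nodecellar container.
mongo_ip_input = {"get_attribute": ["mongod_host", "ip"]}

# Once the MongoDB host has started, its IP becomes available.
runtime_properties["mongod_host"] = {"ip": "10.0.0.5"}  # made-up address
print(resolve(mongo_ip_input))
```

Calling resolve(mongo_ip_input) before the host's ip attribute is set raises LookupError, which is why the ordering of nodes in the workflow matters as much as the reference itself.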


When creating blueprints and working with workflows, it’s important to consider the implications of using particular elements in certain places.

Our documentation describes the differences between the node types and relationship types, and where and when to use each.

The different operations that are available in the install workflow ensure that you are able to gracefully and flexibly design your orchestration blueprint.

In the final post of this series I will demonstrate how to take these flexible workflows and scale them across large-scale deployments.
