Skopos™ Deployment Plan Reference

The Skopos Plan file is a YAML file that defines the steps that the Skopos deployment engine needs to execute to perform a deployment (e.g., initial deployment or an upgrade).

Note that starting with version 0.8, Skopos supports automatic generation of deployment plans and there is no need to create plans manually. This document is retained primarily as a reference for reviewing and understanding generated plans - and for those who want to understand better how Skopos works.

The plan syntax provides sequencing and conditional flow with the following key capabilities:

  • invoke a plugin to perform a target environment operation (e.g., create a new instance or wait until a component is ready to service traffic)
  • invoke a script to perform an application-specific operation (e.g., quality gate)
  • request user confirmation to proceed or fail (manual quality gate)
  • delay subsequent steps for a fixed interval of time
  • iterate over a set of instances for a given component
  • direct the sequence to cleanup steps in case of failure
  • indicate the success or failure of any step or sequence of steps (incl. the plan as a whole)
  • interpolate variables (substituting symbolic names with values from model or target environment descriptors)

Note: In the beta version, the plan needs to be defined manually for each deployment type; the engine uses the plan as provided, coupling it with a model and, optionally, with a target execution environment. In an upcoming version of Skopos, the plan is going to be generated automatically. For automatically generated plans, this reference provides the information required to read the YAML file and make any desired changes prior to deployment.

Note: this is the beta definition of the plan; future versions may use different syntax and semantics for the plan. Where practical, we will try to retain backward compatibility and/or provide conversion tools.

Here is what a Skopos deployment plan looks like visually (when loaded in the Skopos engine):

[image: plan-top]

[image: plan-detail]

Here is what a Skopos deployment plan looks like in YAML (fragment):

doctype: 'com.datagridsys.doctype/skopos/deploy-plan'
version: 'alpha/1'

steps:
  -
    id: preflight
    class: group
    steps:
      -
        id: pre-10
        label: "Check new front image exists"
        class: action
        plugin: ec2-docker
        action: image_check
        arguments: { arg: "yourrepo/front:2.0" }
        on_fail: fail
      -
        id: pre-20
        label: "Check new back image exists"
        class: action
        plugin: ec2-docker
        action: image_check
        arguments: { arg: "yourrepo/back:2.0" }
        on_fail: fail
    on_fail: fail

  -
    id: back
    class: component
    old_vers: [ "1.0" ]
    new_ver: "2.0"
    steps:
      -
        id: back-10
        label: "Create new backs"
        class: instance-loop
        target_num: 2
        selector: target
        steps:
          -
            id: back-10-10
            label: "Create new back"
            class: action
            plugin: ec2-docker
            action: inst_create
            arguments: { selector : "new" }
            on_fail: fail

...

The minimum required elements of a deployment plan are:

  • a header, identifying the document format
  • a top-level steps section with zero or more steps and flow control
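
As a minimal sketch of these two elements, the plan below consists of the header and a single group step that contains one delay step. The step IDs and the label are made up for illustration; the classes and attributes are the ones described later in this reference.

```yaml
# Header: identifies the file as a Skopos deployment plan
doctype: 'com.datagridsys.doctype/skopos/deploy-plan'
version: 'alpha/1'

# Top-level steps chain: one group step containing one low-level step
steps:
  -
    id: settle                # hypothetical step ID
    label: "Wait briefly"     # hypothetical label
    class: group
    steps:
      -
        id: settle-10         # hypothetical step ID
        class: delay
        duration: 2           # seconds
    on_fail: fail
```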

While the engine is capable of executing almost arbitrary plans, the visual interface in the present version of Skopos is more rigid and will correctly display only plans that fit a fixed pattern. See Limitations.

Header section

The header section of the plan should contain the following two elements:

doctype: 'com.datagridsys.doctype/skopos/deploy-plan'
version: 'alpha/1'

This header identifies the YAML file as a Skopos deployment plan and provides version identification for forward compatibility with later versions of Skopos.

Top-level steps chain

The steps at the top level of the plan are all composite steps that are containers for other steps.

A sequence of steps in the plan is called a chain. Chains consist of one or more steps. Each step has:

  • a unique identifier (id attribute, user-assigned)
  • an optional label (label attribute, used to display the step in the UI)
  • a class name (class attribute, one of several supported step classes)
  • flow control attributes that define what to do after the step completes (on_pass and on_fail attributes)
  • other attributes depending on the class

Example:

steps:
  -
    id: preflight
    label: Check Pre-requisites
    class: group
    steps:
        ...
    on_fail: fail

  -
    id: front
    class: component
    old_vers: [ "1.0" ]
    new_ver: "2.0"
    steps:
        ...
    on_fail: fail

id

The id attribute of a step is a unique identifier of the step. The id is case sensitive and alphanumeric, with hyphen and underscore allowed. While most step IDs can be freely assigned, it makes sense to observe some convention that ensures the uniqueness of the IDs; see the sample plan(s) for examples.

The step ID is used in two important ways:

  • as a target for flow control statements (on_pass and on_fail)
  • to identify the step in the deployment log

Note that the id of a component step must match the component name, unless the step's component attribute is specified.

label

Step labels are free-form text that the Skopos user interface displays when showing the plan. If the label is omitted, Skopos will display the step ID as a label.

class

The step class defines what the Skopos engine will do at a given step.

Top-level steps support only the following two classes:

  • class: group - a group of deployment steps that are not related to a specific component and its instances. Group steps typically contain a number of pre-requisites (when used before the component steps) or post-deployment operations (when used after the component steps). They can also be used between components, e.g., to prepare for data migration or perform any custom actions. A group step contains its own step chain with low-level steps to be performed for the group.

  • class: component - a group of deployment steps related to deploying a specific component from the model. The component step contains its own step chain with low-level steps to be performed to deploy/upgrade the component. In addition, component class steps have the following attributes:

    • component - name of the component in the model. If omitted, the id attribute of this step is used
    • new_ver - target version of the component to be deployed. Used only for display
    • old_vers - list of zero or more prior versions found to be replaced. Used only for display

on_pass, on_fail

The flow control attributes determine what Skopos should do after a step completes. In the case of composite steps (such as the group or component steps), the result of the composite step is the result of the step chain inside the composite step.

on_pass determines what next step Skopos should continue with in case the current step completed successfully ("passed").

on_fail determines what next step Skopos should continue with in case the current step failed.

In case a flow control attribute is missing, it is assumed to be next - the flow will continue with the next step in the chain; if it is the last step of a chain, it will complete the chain with success ("pass"). Note that this rule is also applied to failure results, so if on_fail is omitted, the failure is ignored and the next step is taken.

For top-level steps, the following special values and sub-attributes are supported:

  • next special label - means the next step in the same chain; if there is no next step, this is equivalent to pass (see below). Also, if on_pass/on_fail are not explicitly provided for a step, next is assumed.

  • pass special label - means that the chain has to be completed with "pass" and no more steps from this chain will be executed. The next upper level chain will continue; if this is a top-level step (group or component), then the plan will be completed. Note that on_pass: pass is required at the last step if there are cleanup steps to follow (so that the last forward step does not fall into the cleanup step sequence).

  • fail special label - means that the chain has to be completed with "fail" status and no more steps from it will be executed. If this is a top-level step, then the whole deployment plan fails.

  • ,pause sub-attribute - can be specified after any label (ID or special label) to cause Skopos to pause when taking this exit from the step. For example, on_fail: worker-900,pause means that on failure, the next step will be worker-900 and Skopos will automatically pause the plan before continuing with that step.

  • ,warn sub-attribute - like ,pause but instead of pausing, causes a warning message to be added to the log. This sub-attribute can be combined with ,pause (e.g., on_fail: worker-900, pause, warn).

  • ,info sub-attribute - like ,warn but instead of a warning, an info message is logged. This sub-attribute can be used by itself or in combination with ,pause, for example: "on_fail: next, pause, info" or "on_pass: next, info".
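
The special labels and sub-attributes above might be combined as in the following sketch. The step IDs, the component name front, the image name and the gate message are illustrative assumptions; the classes, plugin and action names are taken from examples elsewhere in this reference.

```yaml
steps:
  -
    id: preflight
    class: group
    steps:
      -
        id: pre-10
        class: action
        plugin: ec2-docker
        action: image_check
        arguments: { arg: "yourrepo/front:2.0" }
        on_fail: fail          # complete the group's chain with "fail"
    on_pass: next,info         # continue with the next top-level step and log an info message
    on_fail: fail,warn         # fail the whole plan and log a warning

  -
    id: front                  # hypothetical component name
    class: component
    steps:
      -
        id: front-10
        class: manual-gate
        message: "Front end is ready for upgrade"
        on_fail: fail
    # on_pass omitted: defaults to next; on the last step this is equivalent to pass
    on_fail: fail,pause        # pause the plan when taking the failure exit
```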

Low-level steps chains

id

The id attribute of a step is a unique identifier of the step. The id is case sensitive and alphanumeric, with hyphen and underscore allowed. While most step IDs can be freely assigned, it makes sense to observe some convention that ensures the uniqueness of the IDs; see the sample plan(s) for examples.

The step ID is used in two important ways:

  • as a target for flow control statements (on_pass and on_fail)
  • to identify the step in the deployment log

Example: id: worker-10

label

Step labels are free-form text that the Skopos user interface displays when showing the plan. If the label is omitted, Skopos will display the step ID as a label.

Example: label: "Create new worker instance"

cleanup

A boolean attribute that defines whether the step is part of a cleanup sequence (rolling back a failed sequence). The attribute is used in three important ways:

  • as a hint in the visual display (cleanup steps are organized right-to-left and shown in a different color)
  • as a delimiter between steps in a chain (transition from cleanup-to-non-cleanup or vice-versa is treated as end of chain)
  • as a factor to decide whether the last step in a chain causes chain pass or fail (see on_pass/on_fail label next)

The default value is false and the attribute is typically omitted when false.

Example: cleanup: true
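
The following sketch shows how forward steps and a cleanup step might sit in the same component chain. It assumes a component named worker and a script named rollback_worker, both hypothetical; check_running and the docker inst_create action appear in examples elsewhere in this reference. The last forward step uses on_pass: pass so that a successful run does not fall through into the cleanup step.

```yaml
-
  id: worker                     # hypothetical component name
  class: component
  steps:
    -
      id: worker-10
      label: "Create new worker instance"
      class: action
      plugin: docker
      action: inst_create
      arguments: { selector: new }
      on_fail: worker-900        # jump to the cleanup step on failure
    -
      id: worker-20
      label: "Check new worker"
      class: action
      exec: "check_running {{.new}}"
      on_pass: pass              # last forward step: complete the chain with "pass"
      on_fail: worker-900
    -
      id: worker-900
      label: "Roll back new worker"
      class: action
      cleanup: true              # marks the step as part of the cleanup sequence
      exec: "rollback_worker {{.new}}"   # hypothetical script
      on_pass: fail              # the rollback ran, so the chain still ends in "fail"
      on_fail: fail
  on_fail: fail
```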

class

The step class defines what the Skopos engine will do at a given step.

Non-top level step chains support the following classes (see respective sections below for details on each class and its permitted attributes):

  • action - execute a plugin action or script
  • delay - delay plan execution for a fixed amount of time
  • manual-gate - stop plan execution until manual (or API-driven) decision is made (manual "pass" or "fail")
  • instance-loop - loop a set of steps over all instances (replicas) of a component; allowed only inside component steps

Example: class: action

on_pass, on_fail

  • <step-id> label - the step ID of the step to continue with after the success or failure, respectively, of the current step. For example, if worker-900 is the cleanup step that needs to be executed upon failure of the current step, use on_fail: worker-900.

  • next special label - continue with the next step in the chain, immediately following the current one. This is the default if the flow control attribute is omitted. Special rules:

    • if this is the last step in a chain, this is equivalent to pass if the step is not a cleanup step or fail if the step is a cleanup step (has attribute cleanup set to true)
    • otherwise, if this step's cleanup attribute value differs from the next step's cleanup attribute value, this is treated as end of chain (see above for pass/fail outcome)
    • otherwise, continues with the next step in the chain (which has the same value of its cleanup attribute)
  • pass special label - complete the chain successfully; no further steps from this chain will be executed. This will complete successfully the parent of the current step - a group step, a component step or an iteration of an instance loop step.

  • fail special label - complete the chain with a failure; no further steps from this chain will be executed. This will fail the parent of the current step - a group step, a component step or an instance loop step. Note that if an iteration of an instance loop fails, the loop exits and fails as well.

  • ,pause sub-attribute - can be specified after any label (ID or special label) to cause Skopos to pause when taking this exit from the step. For example, on_fail: worker-900,pause means that on failure, the next step will be worker-900 and Skopos will automatically pause the plan before continuing with that step.

  • ,warn sub-attribute - like ,pause but instead of pausing, causes a warning message to be added to the log. This sub-attribute can be combined with ,pause (e.g., on_fail: worker-900, pause, warn).

  • ,info sub-attribute - like ,warn but instead of a warning, an info message is logged. This sub-attribute can be used by itself or in combination with ,pause, for example: "on_fail: next, pause, info" or "on_pass: next, info".

Examples:

  • on_pass: next
  • on_fail: worker-990
  • on_fail: worker-990,pause
  • on_fail: fail

action steps

Action steps invoke external actions that control the execution environment/infrastructure. Typically, these invoke plugins (e.g., to create a container instance or add an instance to a load balancer) or application-specific scripts (e.g., a quality gate that verifies some aspect of the operation, or an application-specific action that needs to be executed during deployment, such as draining a queue of elements before destroying it).

Action steps have two forms: plugin invocation and script invocation. In both cases, Skopos runs an external executable. Plugins are Skopos-specific and receive a large set of environment and other variables; they can also create and destroy component instances. Scripts are simple executables (usually shell scripts, but any executable works) that take command-line arguments, return an exit code, and print any errors to stderr.

Plugin actions

  • plugin - name of the plugin (e.g., docker, elb, ec2-docker, consul)
  • action - name of the action within the plugin (e.g., inst_create, lb_attach)
  • arguments - list of named arguments to be passed to the action. For example, arguments: { name: lb, wait: "{{.timeout}}" }

The arguments are passed to the plugin's action as is (after variable interpolation; see the Variable Interpolation section in this document). There is one special argument, selector, which is not passed to the action but instead selects which instance ID is given to the action. Possible values are selector: new, to refer to the new instance (whether or not it has been created yet), and selector: old, to refer to the old instance.

See the Plugins Reference for more details on plugins, actions and their arguments.

Action steps can refer to any of the standard plugins that are included with Skopos or may refer to user-provided plugins. See Installing Skopos for details on how to configure Skopos to work with user-provided plugins.

Example:

class: action
plugin: docker
action: inst_create
arguments: { selector: new }

Script exec

  • exec - command to execute, either as a single string or as a list of arguments.

See the Plugins Reference for more details on executing scripts.

Script action steps typically refer to user-provided, application-specific scripts. See Installing Skopos for details on how to configure Skopos to work with user-provided scripts.

Note that plugins and scripts share namespace, so plugin names and script names cannot overlap.

User-defined plugins and scripts take precedence over any standard plugins and scripts included with Skopos.

Example:

class: action
exec: "check_perf {{.new}}"     # or ["check_perf", "{{.new}}"]

delay steps

A delay step introduces a short time delay before proceeding. The delay is configurable. The step always completes successfully when the delay period completes.

  • duration - how long to delay, in seconds (e.g., duration: 2). Default is 1 second.

Note that pausing the plan execution will not pause the delay; however, if the deployment is paused, the steps after the delay step will not be executed until the plan is resumed.

In general, we don't recommend using delays when waiting for tasks to complete; it is better to use gates (e.g., an action with a custom script that waits until the desired result is produced). For example, the ec2-docker plugin has an lb-wait action, which waits until a component becomes in-service for a given load balancer.

Example:

class: delay
duration: 30    # seconds
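
The recommendation to prefer gates over delays can be sketched by contrasting two alternative steps. The script wait_until_ready is hypothetical and stands for any executable that blocks until the instance is ready and reports the outcome through its exit code; the step IDs are also made up.

```yaml
# Alternative 1: fixed delay. The step always passes after 30 seconds,
# whether or not the new instance is actually ready.
-
  id: worker-20-delay
  class: delay
  duration: 30

# Alternative 2: script gate. The step passes or fails depending on the
# result reported by the script.
-
  id: worker-20-gate
  class: action
  exec: "wait_until_ready {{.new}}"   # hypothetical script
  on_fail: fail
```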

manual-gate steps

A manual gate stops the execution of the deployment until a manual response is provided for the step, indicating success or failure. The manual gate is intended to be used when there is no automatic check (yet) for a certain condition and a human is required to check it and indicate the result in the user interface.

The manual gate can also be used as a way to pause (e.g., for approval).

Note that the manual response can be provided using the API or the skopos control utility (gate-pass and gate-fail actions using the skopos post command).

  • message - A short text message to be shown in the user interface next to the Pass/Fail prompt

Example:

class: manual-gate
message: "End-to-end performance is acceptable"    # Pass/Fail

instance-loop steps

Instance loops provide the ability to perform a sequence of steps (a chain) multiple times, once for each instance (replica) of a component. It is similar to a "for" loop in programming languages, except that it is limited to iterating over instances of components (whether the instances are existing or need to be created).

  • target_num - sets the number of iterations. Setting target_num: 4 will perform 4 iterations regardless of the selector and number of new/old instances.
  • start_index - defines a starting instance index in the loop (0 if not specified). Setting start_index to 2 and target_num to 3 will loop over instances 2, 3 and 4. start_index can be specified even if target_num is not - in that case, the loop will iterate starting from start_index and ending with the last index available (new or old depending on the selector)
  • selector - if target_num is not specified, the selector can be used to automatically determine the number of iterations. The possible values determine whether the loop iterates over the old instances, the new instances, or the target number of instances. selector cannot be specified if target_num is specified (only one or the other is valid). Setting selector: old iterates over the existing (old version) instances; it is used most frequently to disconnect or destroy the old instances after the new ones have been set up successfully, as well as to reconnect or restore the old instances to operation if the new instances failed. Setting selector: new iterates over any new instances that already exist (typically to destroy them during rollback). Setting selector: target iterates over the target number of instances (usually instances to be created, whether new or replacing old ones). The number of iterations matches the number of existing instances for old and new, and the requested number of target instances for target. If start_index and selector are specified and target_num is not, the iterations start from start_index and end with the last selected instance.
  • steps - chain of steps to be executed on each iteration; when the chain completes with pass the next iteration will be started (unless the loop ends, in which case the loop will complete with pass). When the chain completes with fail, the loop immediately terminates and the loop step completes with fail.
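
The two ways of sizing a loop might look as follows inside a component step's chain. This is a sketch: the step IDs and the drain_queue script are hypothetical, while the ec2-docker inst_create action is taken from the sample plan at the top of this reference.

```yaml
# Loop sized explicitly: start_index 2 and target_num 3 iterate over
# instance indexes 2, 3 and 4.
-
  id: back-10
  label: "Create new backs"
  class: instance-loop
  start_index: 2
  target_num: 3
  steps:
    -
      id: back-10-10
      label: "Create new back"
      class: action
      plugin: ec2-docker
      action: inst_create
      arguments: { selector: "new" }
      on_fail: fail            # fails the iteration, which terminates and fails the loop
  on_fail: fail

# Loop sized by selector: one iteration per existing old instance.
-
  id: back-20
  label: "Drain old backs"
  class: instance-loop
  selector: old
  steps:
    -
      id: back-20-10
      class: action
      exec: "drain_queue {{.old}}"   # hypothetical script
      on_fail: fail
  on_fail: fail
```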

Variable Interpolation

In some cases, it is useful to be able to use variables in the plan which can be set in the model and/or the target environment descriptors. In addition to such user-defined variables, Skopos defines several standard variables that can also be used.

In the current version of Skopos, variable interpolation is performed only on the arguments of plugins and scripts in action steps. The syntax of values follows the Go language template library ({{.field}} is replaced with the value of the field variable).

  • .project - project name, as provided to Skopos when loading model/plan (e.g., myproj in skopos load --project myproj model.yaml plan.yaml)
  • .component - the name of the current component (matches the id of the component step); valid only within component steps
  • .image - the target image of the current component (from the image attribute in the model); valid only within component steps
  • .index - current instance index in an instance loop (essentially, the index of .old/.new rather than the ID); can be used as instance number (slot); valid only in arguments of actions inside instance loops
  • .id - the instance ID of the selected instance of a component (valid only inside component steps with a selector attribute); used for script arguments (plugin arguments use selector)
  • .ipaddr - the IP address of the selected instance of a component (valid only inside component steps with a selector attribute); used for script arguments (plugin arguments use selector)
  • .new (deprecated) - the instance ID of the new instance of a component (valid only inside component steps); used for script arguments (plugin arguments use selector). Deprecated, use selector attribute and the .id variable
  • .new_ip (deprecated) - the IP address of the new instance of a component (valid only inside component steps); used for script arguments (plugin arguments use selector). Deprecated, use selector attribute and the .ipaddr variable
  • .old (deprecated) - the instance ID of the old instance of a component (valid only inside component steps); used for script arguments (plugin arguments use selector). Deprecated, use selector attribute and the .id variable
  • .old_ip (deprecated) - the IP address of the old instance of a component (valid only inside component steps); used for script arguments (plugin arguments use selector). Deprecated, use selector attribute and the .ipaddr variable
  • .vars.<varname> - user-defined variable from the vars section in the model or target environment descriptor
  • .Model.Components.<component>.Image - target image to deploy for a <component>. It provides the value of the image attribute of a component from the model. It can be used, for example, with the docker plugin action image_check or image_pull to ensure that the target image is available prior to starting the destructive steps of the plan. Unlike .image, this variable can be used outside of component steps, e.g., in a pre-flight group step. The same syntax can be used to access any attribute of the model

Examples:

exec: "check_running {{.new}}"
exec: "curl -sS http://{{.new}}:8080/healthz"

arguments: { wait: "{{.mytimeout}}" }

plugin: docker
action: image_pull
arguments: { arg: "{{.Model.Components.redis.Image}}"}

Limitations

  • top-level steps can be only of class group or component

  • no nested loops

  • no goto explicit step id within loop step chains (only fail to terminate)

  • only non-top-level steps can have cleanup flag set to true

  • for single instance components:

    • forward steps must be in sequence (typical on_pass: next or omitted)
    • cleanup steps must be in sequence (typical on_pass: next or omitted)
    • cleanup steps must be clearly marked with a cleanup: true attribute
  • for multiple instance components, in addition, the only pattern supported is:

    • zero or more pre-loop steps (which cannot be loops)
    • one or more steps inside the loop, using only on_fail: fail (or pass) for failure handling (no cleanup inside the loop)
    • zero or more post-loop steps, which can have loops but they should be single-step loops and will not be shown visually as loops
    • a single cleanup chain which can have loops but they should be single-step loops and will not be shown visually as loops

Plan Errors

  • The following requirements exist and are enforced by the engine at plan load time:
    • only known step classes are allowed
    • step IDs must be unique within the plan (not only within their scopes)
    • on_pass/on_fail "goto" explicit step ID is allowed only within the same scope
    • on_pass/on_fail "goto" explicit step ID must be to existing step

Note that extra attributes on steps are ignored by the plan parser; this means that a mistyped attribute may be skipped and the default used instead.
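
As an illustration of this pitfall, consider the step below, in which on_fail is misspelled. Under the rules described in this reference, the parser would skip the unknown attribute and on_fail would default to next, so a failed image check would not stop the chain. The step ID and image name are made up.

```yaml
-
  id: pre-10
  label: "Check new back image exists"
  class: action
  plugin: ec2-docker
  action: image_check
  arguments: { arg: "yourrepo/back:2.0" }
  on_fial: fail    # mistyped attribute: ignored, so on_fail falls back to the default (next)
```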

  • The following requirements exist but are not yet verified prior to deployment start:
    • all plugins must exist, all actions must exist and take the arguments listed
    • all required plugin configurations must be defined in the target environment descriptor
    • all scripts must exist

If the above requirements for plugins and scripts are not met for an action step in the plan, the action will fail and follow the failure path defined in the plan. If the error is in the cleanup path (e.g., a mistyped plugin name), this may prevent the recovery from completing and may leave the application in a non-operational state. The recommended recovery path is to correct the plan and re-run it.

Practical Considerations

The following practical rules can help diagnose and fix problems with a plan's behavior.

If you are seeing unexpected visual flow and/or unexpected behavior (e.g., step failing but the plan continuing), please verify if all of the following are observed at the failure point:

  • (for visual problems) the Limitations are met, especially on the pattern in multi-instance components
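  • on_fail is specified at the outer levels (instance loop, component/group); if it is omitted, the failure is ignored and the plan continues with the next step
  • cleanup for a loop assumes that at least one iteration has completed, and therefore cleans up everything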