consul plugin

This plugin allows adding 'soft enable' control to deployment/upgrade plans, to change the visibility of instances to service discovery without stopping the instances or requesting them to register/de-register from the cluster.

IMPORTANT:

  • The consul plugin requires adding service_discovery_auto: true at the top level of the target environment configuration. This tells the plan generator that new instances come up in the 'enabled' state, so it should not add a step to call sd_attach after a new instance is created. Likewise, manually created plans should not have sd_attach immediately following inst_create (however, sd_wait_join is OK and will wait for the instance to register). In the current implementation, the consul plugin uses its own instance identification, which is independent of the 'core' plugin (used to create new instances), and it does not know that a new instance exists until the instance connects to the consul cluster. Therefore, sd_attach on a just-created instance will simply fail.
  • The sd_wait_join operation provided by the consul plugin waits for the instance to appear in the nodes list and to register a service. It performs no other checks before returning. In particular, it does not check whether the instance is actually discoverable via DNS or a Consul API call (it might not be, if the instance joins the cluster in the 'unhealthy' state and changes to 'healthy' later on). If the application expects a new instance to be ready for use when sd_wait_join completes, the instance should register the service with Consul only when it is actually ready to work.

Requirements for services

The plugin works by modifying a service tag (enabled). For the Consul plugin to be able to attach instances to, or detach them from, Consul, the configuration of each service must set the "enableTagOverride": true option when the consul agent is started.

This option is required only for the attach/detach capability; sd_wait_join can be used regardless.
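As a rough illustration of what attach and detach amount to at the tag level, the sketch below adds or removes the 'enabled' tag from a service's tag list while leaving other tags alone. The function name with_enabled is hypothetical, and how the plugin actually writes the updated tags back to Consul is not described here.

```python
ENABLED_TAG = "enabled"  # tag name used throughout this document


def with_enabled(tags, enabled):
    """Return a new tag list with the 'enabled' tag added or removed.

    Hypothetical helper: other tags are preserved in their original order,
    and the input list is not modified.
    """
    others = [t for t in tags if t != ENABLED_TAG]
    return others + [ENABLED_TAG] if enabled else others


print(with_enabled(["v2"], True))
print(with_enabled(["enabled", "v2"], False))
```

The operation is idempotent, so attaching an already attached instance leaves its tag list unchanged.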

Also, please ensure that the node_id setting on the Consul plugin is set to the same node naming mechanism as used by the services. See the next section for details.

For the service attach/detach operations to be effective, the application should use a modified service discovery, with a special query that selects only service instances that have the enabled tag. NOTE: changes in the application may be required for this to be done. There are two possible methods to select only instances that have the enabled tag:

  • Use a DNS query in the form 'enabled.myservice.service.consul' (where 'myservice' is the name of the registered service to be found by the query).
  • Set up a "prepared query" in the Consul cluster, then change the DNS suffix used to make the queries from .service.consul to .query.consul. Here is the JSON text for the prepared query needed to select only 'enabled' services:
{
    "Name": "",
    "Template": {
        "Type": "name_prefix_match"
    },
    "Service": {
        "Service": "${name.full}",
        "Tags": [ "enabled" ]
    }
}
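The two lookup methods differ only in the DNS name the application resolves. A minimal sketch, assuming the default .consul DNS domain and the catch-all prepared query shown above (the function name discovery_name is hypothetical):

```python
def discovery_name(service, use_prepared_query=False):
    """Build the DNS name that resolves only 'enabled' instances of a service.

    Hypothetical helper; assumes the default '.consul' DNS domain.
    """
    if use_prepared_query:
        # The prepared query template adds the 'enabled' tag filter itself.
        return f"{service}.query.consul"
    # Tag-filtered lookup: <tag>.<service>.service.consul
    return f"enabled.{service}.service.consul"


print(discovery_name("myservice"))
print(discovery_name("myservice", use_prepared_query=True))
```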

IMPORTANT: if it is desirable to have new instances added to a working application after the initial deploy (and without using a Skopos plan) and the application was configured as above to select only services with the 'enabled' tag, one must ensure that newly launched instances will register the service with the 'enabled' tag by default. For example, one might use this as a service config file for consul (note the enableTagOverride flag, which allows changing the tag without accessing the instance's own consul agent):

{
  "service": {
    "name": "worker",
    "Tags":["enabled"],
    "enableTagOverride": true
  }
}
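A service config file can be checked against these two requirements before it is shipped with an instance. The sketch below is only an illustration: check_service_definition is a hypothetical helper, and it matches keys case-insensitively simply so that it accepts the mixed casing used in the example above.

```python
import json


def check_service_definition(text):
    """Return a list of problems found in a Consul service config file.

    Hypothetical check: looks only for the 'enabled' tag and the
    enableTagOverride flag discussed in this section.
    """
    service = json.loads(text).get("service", {})
    # Normalize key casing so "Tags" and "tags" are treated alike.
    fields = {key.lower(): value for key, value in service.items()}
    problems = []
    if "enabled" not in fields.get("tags", []):
        problems.append("missing 'enabled' tag")
    if fields.get("enabletagoverride") is not True:
        problems.append("enableTagOverride is not true")
    return problems


example = '{"service": {"name": "worker", "Tags": ["enabled"], "enableTagOverride": true}}'
print(check_service_definition(example))
```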

Use of this plugin is limited to applications in which each component instance registers exactly one consul service. Components that register multiple services are not supported and will cause the plugin to fail.

Configuration

The Consul plugin supports the following configuration parameters:

base_url

If necessary, one can specify where the plugin should look for Consul, with the following setting (under the "consul" key in the "plugin_config" section - see the Target Environment Configuration Section in this document):

{
   "base_url" : "http://host_or_ip_address:8500/"
}

Note that this setting defaults to localhost:8500, so the above parameter is not needed if there is a consul agent accessible on localhost. This would normally be the case if the Skopos engine is run on a host that is a member of the cluster and the engine runs on the same network as Consul itself (e.g., if Skopos runs as a container with its network in 'host' mode).
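To show where these settings sit relative to each other, the sketch below assembles only the keys named in this document (service_discovery_auto at the top level, base_url under "consul" in "plugin_config") and prints them as JSON. Everything else a real target environment configuration contains is omitted, and rendering it as JSON is an assumption made for illustration.

```python
import json

# Only the keys mentioned in this document; a real target environment
# configuration will contain other settings that are omitted here.
environment = {
    "service_discovery_auto": True,
    "plugin_config": {
        "consul": {
            "base_url": "http://host_or_ip_address:8500/",
        }
    },
}

print(json.dumps(environment, indent=2))
```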

node_id

By default, the Consul plugin expects that the components register as Consul nodes using the exact same instance identifier as the one used by the 'core' plugin that created those instances.

The instance ID from the core plugin may or may not be the name of the corresponding consul node. For example, with the 'ec2-asg' plugin, the instance ID is the EC2 instance ID (a string in the form i-A_LONG_HEX_NUMBER). NOTE: the default behavior of the Consul agent is to use the hostname as the node name. To make sure the nodes are registered with their EC2 ID, either set the hostname to the EC2 ID or use the -node option on the consul command line, e.g.:

consul agent -node `curl -s http://169.254.169.254/latest/meta-data/instance-id`

If the instance ID from the core plugin cannot be used as the node ID, the Consul plugin can derive the node ID from the IP address provided by the core plugin (if supported; currently, all of the 'core' plugins do provide the instances' IP addresses). To enable this, add the following setting to the plugin configuration (under the "consul" key in the "plugin_config" section):

"node_id" : "ipsearch"

This will cause the 'consul' plugin to request the list of nodes from Consul and search it for nodes with matching IP addresses. This option should work regardless of the name that each node uses to register itself with Consul, but it adds an extra API call to consul each time the plugin is used.
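The search itself can be pictured as follows. This is a sketch, not the plugin's code: it assumes the node list is shaped like the entries returned by Consul's catalog nodes endpoint (each with "Node" and "Address" fields), and find_node_by_ip is a hypothetical name.

```python
def find_node_by_ip(nodes, ip):
    """Return the name of the first node whose address equals ip, or None.

    Hypothetical helper; 'nodes' is assumed to be a list of dicts with
    "Node" (name) and "Address" (IP) keys.
    """
    for node in nodes:
        if node.get("Address") == ip:
            return node.get("Node")
    return None


sample_nodes = [
    {"Node": "worker-a", "Address": "10.0.1.25"},
    {"Node": "worker-b", "Address": "10.0.1.26"},
]
print(find_node_by_ip(sample_nodes, "10.0.1.26"))
```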

The plugin can also be configured to expect one of the following alternate node IDs, also derived from the instance's IP address:

  • a string in the form 'ip-N-N-N-N'
  • a string in the form 'ip-N-N-N-N.ec2.internal'

where N-N-N-N are the 4 bytes of the instance's IP address. To select one of these alternate IDs, add the corresponding setting to the Consul plugin configuration:

"node_id" : "ec2_short_hostname"

or

"node_id" : "ec2_full_hostname"
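The mapping from an IP address to the two alternate node IDs is mechanical. A minimal sketch, assuming IPv4 addresses only (derive_node_id is a hypothetical name, not part of the plugin):

```python
import ipaddress


def derive_node_id(ip, mode):
    """Derive the expected Consul node name from an instance's IPv4 address.

    Hypothetical helper; 'mode' is one of the node_id values described above.
    """
    # Validate and normalize; raises ValueError if ip is not a valid IPv4 address.
    normalized = str(ipaddress.IPv4Address(ip))
    short = "ip-" + normalized.replace(".", "-")
    if mode == "ec2_short_hostname":
        return short
    if mode == "ec2_full_hostname":
        return short + ".ec2.internal"
    raise ValueError(f"unsupported node_id mode: {mode}")


print(derive_node_id("10.0.1.25", "ec2_short_hostname"))
print(derive_node_id("10.0.1.25", "ec2_full_hostname"))
```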

join_timeout

join_timeout defines the amount of time, in seconds, that sd_wait_join should wait for an instance to be registered as a Consul node. The default value is 360.

service_timeout

service_timeout defines the amount of time, in seconds, that sd_wait_join should wait for a node to register a service. This value is independent of join_timeout and applies after the wait governed by join_timeout. The default is 60. If the application joins Consul and registers a service as one operation (e.g., by starting the Consul agent with a configuration file that defines a service), service_timeout can be set to 0. A short timeout is applied automatically even in this case, because internally Consul does not register a node and a service in the same API call, so the node can appear in the list before the service is registered. To disable the wait for the service completely, set service_timeout to a negative value. A negative setting should be used only if the entire application consists of components that do not register services at all, or if all components that use a service from another component tolerate being started before that service is registered.
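How the two timeouts interact can be sketched as a two-phase wait. The code below is an illustration under stated assumptions, not the plugin's implementation: it assumes the service wait starts once the node has been seen, the 5-second minimum stands in for the unspecified "short timeout", and wait_join, node_seen and service_seen are hypothetical names.

```python
import time

MIN_SERVICE_WAIT = 5.0  # hypothetical; the document only says "a short timeout"


def _wait_for(check, timeout, poll, clock, sleep):
    """Poll check() until it returns True or timeout seconds have passed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(poll)


def wait_join(node_seen, service_seen, join_timeout=360, service_timeout=60,
              poll=1.0, clock=time.monotonic, sleep=time.sleep):
    """Return True once the node (and, unless disabled, its service) appears."""
    if not _wait_for(node_seen, join_timeout, poll, clock, sleep):
        return False
    if service_timeout < 0:
        return True  # negative value: do not wait for a service at all
    # Assumed: the service budget is separate and starts after the node joins.
    budget = max(service_timeout, MIN_SERVICE_WAIT)
    return _wait_for(service_seen, budget, poll, clock, sleep)
```

The clock and sleep parameters are injectable only so that the logic can be exercised without real delays.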