Using Consul for Service Discovery in Multiple Data Centers – Part 2


Consul configuration in a multi Data Center environment

This is part two on how to configure Consul in a multi Data Center environment. Click here for part one.

Note: An updated post using the most recent version of Consul (version 1.4.2) is available here. The configuration below works with Consul version 0.9.2 and below.

Consul Multi Data Center Diagram

Verify your cluster configuration

In the first part we created two separate, standalone Consul environments, one per data center.

Below, we first verify that the Consul environment is working properly. We then configure a Consul GEO prepared query for multi Data Center use, and finish the Consul setup with a few API examples using curl or Python. As a bonus, I added a Consul Availability Dashboard.

To verify your cluster, run the DNS checks below.
Tip: Consul should return 3 addresses.

dig @10.10.1.11 -p 8600 consul.service.consul    

; <<>> DiG 9.10.4-P8 <<>> @10.10.1.11 -p 8600 consul.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23854
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;consul.service.consul.         IN      A

;; ANSWER SECTION:
consul.service.consul.  0       IN      A       10.10.1.11
consul.service.consul.  0       IN      A       10.10.1.13
consul.service.consul.  0       IN      A       10.10.1.12

;; Query time: 0 msec
;; SERVER: 10.10.1.11#8600(10.10.1.11)
;; WHEN: Thu Aug 31 15:53:51 EDT 2017
;; MSG SIZE  rcvd: 98
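
The same check can also be made over Consul's HTTP API. Below is a minimal Python 3 sketch, assuming the agent's HTTP API listens on port 8500 (the port used later in this post) and the servers use the default RPC port 8300. The helper names are mine, and the sample peer list is hand-written to show what a missing server would look like.

```python
import json
from urllib.request import urlopen


def fetch_peers(base_url):
    """Return the Raft peer list from /v1/status/peers (needs a reachable agent)."""
    with urlopen(base_url + "/v1/status/peers", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def missing_servers(peers, expected_ips):
    """Return the expected server IPs that are absent from the peer list."""
    # Each peer is reported as "ip:port"; keep only the IP part.
    seen = {peer.rsplit(":", 1)[0] for peer in peers}
    return sorted(set(expected_ips) - seen)


# Hand-written sample shaped like the API response, using the DC1 addresses above.
sample_peers = ["10.10.1.11:8300", "10.10.1.12:8300"]
print(missing_servers(sample_peers, ["10.10.1.11", "10.10.1.12", "10.10.1.13"]))
```

Against a live cluster you would pass fetch_peers("http://10.10.1.11:8500") instead of the sample list; an empty result means all expected servers are present.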

To return Data Center specific information, add the Data Center name to the lookup.
For DC1, run the below.

dig @10.10.1.11 -p 8600 consul.service.dc1.consul

; <<>> DiG 9.10.4-P8 <<>> @10.10.1.11 -p 8600 consul.service.dc1.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29978
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;consul.service.dc1.consul.     IN      A

;; ANSWER SECTION:
consul.service.dc1.consul. 0    IN      A       10.10.1.13
consul.service.dc1.consul. 0    IN      A       10.10.1.12
consul.service.dc1.consul. 0    IN      A       10.10.1.11

;; Query time: 0 msec
;; SERVER: 10.10.1.11#8600(10.10.1.11)
;; WHEN: Thu Aug 31 15:55:05 EDT 2017
;; MSG SIZE  rcvd: 102

For DC2, run the below.

dig @10.10.1.11 -p 8600 consul.service.dc2.consul

; <<>> DiG 9.10.4-P8 <<>> @10.10.1.11 -p 8600 consul.service.dc2.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9414
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;consul.service.dc2.consul.     IN      A

;; ANSWER SECTION:
consul.service.dc2.consul. 0    IN      A       10.10.1.111
consul.service.dc2.consul. 0    IN      A       10.10.1.112
consul.service.dc2.consul. 0    IN      A       10.10.1.113

;; Query time: 1 msec
;; SERVER: 10.10.1.11#8600(10.10.1.11)
;; WHEN: Thu Aug 31 15:55:40 EDT 2017
;; MSG SIZE  rcvd: 102

To look up a specific service, run the below.
Tip: If you omit the DC name, the lookup always returns results from the local client's DC.

dig @10.10.1.11 -p 8600 db1.service.consul

; <<>> DiG 9.10.4-P8 <<>> @10.10.1.11 -p 8600 db1.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62200
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;db1.service.consul.            IN      A

;; ANSWER SECTION:
db1.service.consul.     0       IN      A       192.168.1.50

;; Query time: 0 msec
;; SERVER: 10.10.1.11#8600(10.10.1.11)
;; WHEN: Thu Aug 31 16:07:59 EDT 2017
;; MSG SIZE  rcvd: 63
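
The lookups above all follow the same naming pattern. As a rough illustration, the helper below (a hypothetical name, not part of any Consul library) assembles such names from their parts, assuming the default "consul" domain.

```python
def consul_dns_name(name, kind="service", datacenter=None, domain="consul"):
    """Build a Consul DNS name such as db1.service.dc1.consul."""
    if kind not in ("service", "query", "node"):
        raise ValueError("unsupported lookup kind: %s" % kind)
    parts = [name, kind]
    if datacenter:
        # The data center label is optional; without it the local DC is used.
        parts.append(datacenter)
    parts.append(domain)
    return ".".join(parts)


print(consul_dns_name("db1"))                       # db1.service.consul
print(consul_dns_name("consul", datacenter="dc2"))  # consul.service.dc2.consul
print(consul_dns_name("db1", kind="query"))         # db1.query.consul
```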

So far we have configured two standalone, working Consul clusters. However, a lookup only returns local results, meaning that if your client has joined dc1, you will get back dc1 results.

To return results from any available DC, we have to configure Consul GEO auto failover.
In the next section I will show you how to configure Consul GEO failover.

Consul Multi Data Center Configuration

To use Consul GEO failover, you have to create what Consul calls a prepared query.

A prepared query lets you do an API or DNS lookup using the query keyword. Prepared queries also let you cascade across data centers, as explained below.

There are many options available in a prepared query. For example, a template can match the looked-up name by prefix or by regular expression, which controls the DNS/API data returned.

In our example below, we create a prepared query that matches lookups by service name. A lookup for service db1 is answered from dc1 first; if no instance is available there, it is answered from dc2. An example lookup is shown later in this post.

To create the prepared query, run the below.

curl --request POST --data \
'{
  "Name": "",
  "Template": {
    "Type": "name_prefix_match"
  },
  "Service": {
    "Service": "${name.full}",
    "Failover": {
      "NearestN": 2
    }
  }
}' http://10.10.1.11:8500/v1/query

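Tip: Because Name is empty, the template matches any lookup, and ${name.full} is replaced with the full name that was looked up, so the query works for any service name.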
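
If you prefer Python over curl, the same request can be sketched with the standard library only (Python 3). The function names are my own, and create_prepared_query needs a reachable agent, so the snippet below only builds and prints the request body.

```python
import json
from urllib.request import Request, urlopen


def build_failover_query(nearest_n=2, datacenters=None):
    """Build the catch-all prepared query template used in this post."""
    failover = {"NearestN": nearest_n}
    if datacenters:
        failover["Datacenters"] = list(datacenters)
    return {
        "Name": "",
        "Template": {"Type": "name_prefix_match"},
        "Service": {"Service": "${name.full}", "Failover": failover},
    }


def create_prepared_query(base_url, query):
    """POST the query to /v1/query and return the decoded reply (needs a reachable agent)."""
    req = Request(
        base_url + "/v1/query",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Print the body that would be sent; no network access happens here.
print(json.dumps(build_failover_query(), indent=2, sort_keys=True))
```

To actually create the query you would call create_prepared_query("http://10.10.1.11:8500", build_failover_query()).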

To verify the prepared query was added to the system, run the below. The results should look similar to this output.

curl -q http://10.10.1.11:8500/v1/query |python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   356  100   356    0     0   168k      0 --:--:-- --:--:-- --:--:--  347k
[
    {
        "CreateIndex": 132754,
        "DNS": {
            "TTL": ""
        },
        "ID": "9873c8a0-5699-2da8-0c2e-44f6bab1a91e",
        "ModifyIndex": 132754,
        "Name": "",
        "Service": {
            "Failover": {
                "Datacenters": null,
                "NearestN": 2
            },
            "Near": "",
            "NodeMeta": null,
            "OnlyPassing": false,
            "Service": "${name.full}",
            "Tags": null
        },
        "Session": "",
        "Template": {
            "Regexp": "",
            "RemoveEmptyTags": false,
            "Type": "name_prefix_match"
        },
        "Token": ""
    }
]
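
When several prepared queries exist, the JSON listing gets long. The sketch below (Python 3, helper name is hypothetical) condenses each definition into one line; the sample is the listing shown above, trimmed to the fields that are used.

```python
def summarize_queries(queries):
    """Return one readable line per prepared query definition."""
    lines = []
    for q in queries:
        failover = q["Service"].get("Failover") or {}
        lines.append(
            "{id} type={type} service={service} nearest_n={n} datacenters={dcs}".format(
                id=q["ID"],
                type=(q.get("Template") or {}).get("Type", ""),
                service=q["Service"]["Service"],
                n=failover.get("NearestN", 0),
                dcs=failover.get("Datacenters") or [],
            )
        )
    return lines


# The listing from this post, reduced to the fields read above.
sample = [
    {
        "ID": "9873c8a0-5699-2da8-0c2e-44f6bab1a91e",
        "Name": "",
        "Service": {
            "Failover": {"Datacenters": None, "NearestN": 2},
            "Service": "${name.full}",
        },
        "Template": {"Regexp": "", "Type": "name_prefix_match"},
    }
]
for line in summarize_queries(sample):
    print(line)
```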

Other prepared query options set the Data Center lookup order (Datacenters) or the maximum number of remote Data Centers to try (NearestN). An example is below.

curl --request POST --data \
'{
  "Name": "",
  "Template": {
    "Type": "name_prefix_match"
  },
  "Service": {
    "Service": "${name.full}",
    "Failover": {
      "NearestN": 3,
      "Datacenters": ["dc1", "dc2"]
    }
  }
}' http://10.10.1.11:8500/v1/query

For a full list of options please check the Consul API query docs.
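
To make the interaction between NearestN and Datacenters more concrete, here is a small Python 3 model of the order in which Data Centers are tried, as I understand it from the Consul documentation: the local DC first, then up to NearestN remote DCs by round-trip time, then the Datacenters list, with each DC tried only once. This is a simplified simulation and not Consul's code; the nearest-first list is supplied by hand as a stand-in for Consul's own RTT estimates.

```python
def failover_order(local_dc, dcs_by_rtt, nearest_n=0, datacenters=()):
    """Return the order in which data centers would be tried for a query.

    dcs_by_rtt: remote data centers already sorted nearest-first
    (a stand-in for Consul's round-trip-time estimates).
    """
    order = [local_dc]
    candidates = list(dcs_by_rtt[:nearest_n]) + list(datacenters)
    for dc in candidates:
        if dc not in order:  # each data center is tried at most once
            order.append(dc)
    return order


def first_healthy(order, healthy_dcs):
    """Return the first data center in order that has a healthy instance, or None."""
    for dc in order:
        if dc in healthy_dcs:
            return dc
    return None


order = failover_order("dc1", ["dc2"], nearest_n=3, datacenters=["dc1", "dc2"])
print(order)                          # ['dc1', 'dc2']
print(first_healthy(order, {"dc2"}))  # dc2
```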

Finally, we are ready for the fun part: testing HA across Data Centers.

The assumption is that both DCs you configured run a DB like MySQL or a similar service. In our example it's MySQL on port 3306; feel free to replace it with your choice of DB/application.

Below is an example service lookup across data centers using the query keyword instead of the service keyword. The first lookup returns DC1, and the second lookup returns DC2, since DC1 is down by then.

dig @10.10.1.11 -p 8600 db1.query.consul srv

; <<>> DiG 9.10.4-P8 <<>> @10.10.1.11 -p 8600 db1.query.consul srv
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13712
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;db1.query.consul.              IN      SRV

;; ANSWER SECTION:
db1.query.consul.       0       IN      SRV     1 1 3306 c0a80132.addr.dc1.consul.

;; ADDITIONAL SECTION:
c0a80132.addr.dc1.consul. 0     IN      A       192.168.1.50

;; Query time: 0 msec
;; SERVER: 10.10.1.11#8600(10.10.1.11)
;; WHEN: Thu Aug 31 15:46:53 EDT 2017
;; MSG SIZE  rcvd: 99

As you can see, it returns the local IP address.
Tip: If you would like a different IP address to be returned, you can set it by adding the Address keyword.

Now stop the local DB or Consul agent and run the exact same lookup again. It should return the remote DB address, like the example below.

dig @10.10.1.11 -p 8600 db1.query.consul srv

; <<>> DiG 9.10.4-P8 <<>> @10.10.1.11 -p 8600 db1.query.consul srv
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59521
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;db1.query.consul.              IN      SRV

;; ANSWER SECTION:
db1.query.consul.       0       IN      SRV     1 1 3306 ac100132.addr.dc2.consul.

;; ADDITIONAL SECTION:
ac100132.addr.dc2.consul. 0     IN      A       172.16.1.50

;; Query time: 1 msec
;; SERVER: 10.10.1.11#8600(10.10.1.11)
;; WHEN: Thu Aug 31 15:48:33 EDT 2017
;; MSG SIZE  rcvd: 99
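
Note the SRV targets in the two answers: c0a80132.addr.dc1.consul and ac100132.addr.dc2.consul. The first label is the IPv4 address written in hexadecimal, followed by the Data Center that answered. A small Python 3 sketch (the helper name is mine) decodes both values, which is handy when checking which DC served a lookup.

```python
import ipaddress


def decode_addr_label(dns_name):
    """Decode a name like c0a80132.addr.dc1.consul into (ip, datacenter)."""
    labels = dns_name.rstrip(".").split(".")
    if len(labels) < 3 or labels[1] != "addr":
        raise ValueError("not an addr name: %s" % dns_name)
    # The first label is the IPv4 address written as 8 hex digits.
    ip = ipaddress.IPv4Address(int(labels[0], 16))
    return str(ip), labels[2]


print(decode_addr_label("c0a80132.addr.dc1.consul."))  # ('192.168.1.50', 'dc1')
print(decode_addr_label("ac100132.addr.dc2.consul."))  # ('172.16.1.50', 'dc2')
```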

Using Python to access Consul Data

Until now we were only using regular DNS tools like dig or nslookup. In the next section I will show you how to use Python to return the same results.

To simply return DNS results, we can use one of the Python DNS modules. A simple example is below.

#!/bin/python

from dns import resolver

consul_resolver = resolver.Resolver()
consul_resolver.port = 8600
consul_resolver.nameservers = ["10.10.1.11"]

# SRV record: the port is in the answer, the target's IP is in the additional section
answer = consul_resolver.query("db1.query.consul", 'SRV')
print("{} {}".format(answer.response.additional[0].items[0].address, answer[0].port))

#just ip
answer = consul_resolver.query("db1.query.consul", 'A')
for answer_ip in answer:
    print(answer_ip)

Running the above Python script returns two lines: the first shows the IP address and the port, and the second shows just the IP address.
In our case it should look like the below.

192.168.1.50 3306
192.168.1.50
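
The prepared query can also be executed over the HTTP API instead of DNS. Below is a minimal Python 3 sketch using only the standard library, assuming the /v1/query/<name>/execute endpoint on port 8500. The function names are mine, and the sample response is hand-written in the general shape of an execute reply (the node name "db-node" is hypothetical), not captured output.

```python
import json
from urllib.request import urlopen


def execute_query(base_url, name):
    """Run a prepared query by name via /v1/query/<name>/execute (needs a reachable agent)."""
    with urlopen("{}/v1/query/{}/execute".format(base_url, name), timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def endpoints(result):
    """Return (datacenter, [(address, port), ...]) from an execute response."""
    found = []
    for entry in result.get("Nodes") or []:
        service = entry["Service"]
        # A service without its own address is reachable at the node address.
        address = service.get("Address") or entry["Node"]["Address"]
        found.append((address, service["Port"]))
    return result.get("Datacenter"), found


# Hand-written sample in the shape of an execute response (not captured output).
sample = {
    "Service": "db1",
    "Nodes": [
        {
            "Node": {"Node": "db-node", "Address": "172.16.1.50"},
            "Service": {"Service": "db1", "Address": "", "Port": 3306},
        }
    ],
    "Datacenter": "dc2",
    "Failovers": 1,
}
print(endpoints(sample))  # ('dc2', [('172.16.1.50', 3306)])
```

With a live agent, endpoints(execute_query("http://10.10.1.11:8500", "db1")) would return the Data Center that answered together with the address and port pairs.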

The example above used a general-purpose Python DNS module (dnspython, which provides the dns package). Next I will show you how to use the more specific Python Consul module.

To use the Python Consul module, you must first install it by running the below.

pip install python-consul

There are so many ways to manipulate Consul data with the Python module that it would be hard to cover all of them, so I include just a few examples below.

Returning Consul node results.

#!/bin/python

import os

import consul

# Uncomment if you have proxy issues
#del os.environ['http_proxy']
#del os.environ['HTTP_PROXY']
#del os.environ['https_proxy']
#del os.environ['HTTPS_PROXY']

def main():
    c = consul.Consul(host='10.10.1.11')

    # catalog.nodes() returns a tuple of (index, list of nodes)
    _, nodes = c.catalog.nodes()
    for node in nodes:
        # Print the full node record
        print(node)
        # Print the node name and IP address
        print("{} {}".format(node['Node'], node['Address']))

if __name__ == "__main__":
    main()

Disabling a Consul node (maintenance mode).

#!/bin/python

import consul, os, sys, json, re, pprint
def main():
    c = consul.Consul(host='10.10.1.11')

    # Put a node in maintenance mode.
    c.agent.maintenance('true', 'this is a test')
    # Remove the node from maintenance mode
    c.agent.maintenance('false')

if __name__ == "__main__":
    main()
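
For reference, the maintenance call maps to a single HTTP request against the agent. The Python 3 sketch below builds that request with the standard library, assuming the /v1/agent/maintenance endpoint with its enable and reason parameters. The function names are mine, and set_maintenance needs a reachable agent, so only the URL is printed here.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def maintenance_url(base_url, enable, reason=None):
    """Build the URL for PUT /v1/agent/maintenance."""
    params = [("enable", "true" if enable else "false")]
    if reason:
        params.append(("reason", reason))
    return "{}/v1/agent/maintenance?{}".format(base_url, urlencode(params))


def set_maintenance(base_url, enable, reason=None):
    """Send the PUT request (needs a reachable agent); returns the HTTP status code."""
    req = Request(maintenance_url(base_url, enable, reason), method="PUT")
    with urlopen(req, timeout=5) as resp:
        return resp.status


print(maintenance_url("http://10.10.1.11:8500", True, "this is a test"))
```

Keep in mind that the request applies to the agent it is sent to, so it should target the agent running on the node you want to take out of service.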

Below is a partial list of options; just uncomment what you are trying to use.

#!/bin/python

import consul, os, sys, json, re, pprint

# Remove proxy settings if present (pop avoids a KeyError when a variable is unset)
for var in ('http_proxy', 'HTTP_PROXY', 'https_proxy', 'HTTPS_PROXY'):
    os.environ.pop(var, None)
 
def main():
    c = consul.Consul(host='10.10.1.11')    
    #c = consul.Consul()
 
    # Register Service
    #c.agent.service.register('my_service',
                             #service_id='my_service_1',
                             #port=3306,
                             #tags=['mytag1', 'mytag2'])
 
    #print(c.agent.services())
 
    # From agent view list all registered Services, checks, members
    #for x in c.agent.services():
        #print(x)
    #for x in c.agent.checks():
        #print(x)
    #for x in c.agent.members():
        #print(x)

    # Node in maintenance
    #c.agent.maintenance('true', 'this is a test')
    #c.agent.maintenance('false')

    # Service in maintenance
    #c.agent.service.maintenance('http', 'true', 'this is a test')
    #c.agent.service.maintenance('http', 'false')

    # Return all dc's
    #for x in c.catalog.datacenters():
        #print (x)

    # Returns nodes
    _, nodes = c.catalog.nodes()
    #print(nodes)
    for node in nodes:
        print(node)
        print("{} {}".format(node['Node'], node['Address']))
        #print(node['Datacenter'])  # only if your Consul version returns this field
    
    #for x in c.catalog.nodes():
        #pprint.pprint(x)
        #print(Node.x)

    #for x in c.health.checks():
        #print (x)

    #for x in c.agent.self():
        #print(x)
 
    # To remove the service entry
    #c.agent.service.deregister(service_id='my_service_1')
 
if __name__ == "__main__":
    main()

For the full list of the python module options please click here.

Using Consul data for a data center Dashboard - server availability

One of the benefits of Consul is its health checks.
I am using Consul health checks as a method to provide server availability per Data Center.

The Availability Dashboard below uses Consul to dynamically update server availability, and was built using the Gentelella template, available here.
Availability Dashboard screen capture is below
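
To give an idea of how health checks can be turned into an availability figure, here is a simplified Python 3 sketch. It is not the dashboard's actual code. It assumes a list of checks in the shape returned by the /v1/health/state/any endpoint (each with a Node and a Status), and it treats a node as available only when all of its checks are passing. The sample data and node names are hypothetical.

```python
from collections import defaultdict


def node_availability(checks):
    """Map each node to True when all of its checks are passing."""
    status = defaultdict(lambda: True)
    for check in checks:
        node = check["Node"]
        status[node] = status[node] and check["Status"] == "passing"
    return dict(status)


def availability_percent(checks):
    """Percentage of nodes whose checks are all passing (0.0 when there are no nodes)."""
    nodes = node_availability(checks)
    if not nodes:
        return 0.0
    return 100.0 * sum(nodes.values()) / len(nodes)


# Hand-written sample; node names are hypothetical.
sample_checks = [
    {"Node": "web1", "CheckID": "serfHealth", "Status": "passing"},
    {"Node": "web2", "CheckID": "serfHealth", "Status": "passing"},
    {"Node": "web2", "CheckID": "service:http", "Status": "critical"},
    {"Node": "db1", "CheckID": "serfHealth", "Status": "passing"},
    {"Node": "db2", "CheckID": "serfHealth", "Status": "passing"},
]
print(node_availability(sample_checks))
print(availability_percent(sample_checks))  # 75.0
```

Counting a warning state as unavailable is a deliberate, conservative choice in this sketch; a real dashboard might show warnings separately.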

I hope you enjoyed reading about the Consul Multi Data Center setup and configuration. Give it a thumbs up by rating the article, or just provide feedback.


Pooja
Guest
Pooja

Hi Eli, Thank you for writing such a good consul notes. I want to configure geo failover for three dcs for postgres with patroni service. My question is where we write prepared queries ? Do we need to run curl command on consul agent itself? I I