Advanced MySQL Load Balancing with NGINX Plus

In this article, we explain how to use some of the advanced features in NGINX Plus and the open source NGINX software to load balance MySQL™ database servers in a Galera cluster. We explore splitting reads and writes across the load-balanced database servers, advanced health checks and monitoring, and some best practices for maximizing availability.

(Both NGINX and NGINX Plus support the features we’re discussing, but for ease of reading we refer to NGINX Plus throughout.)

Setting Up a Galera Cluster of MySQL Database Servers

This post builds on our previous blog about MySQL load balancing with NGINX Plus and assumes you have a working knowledge of MySQL, Galera Cluster, and TCP load balancing with NGINX Plus.

We followed the DigitalOcean guide to set up three MariaDB database servers as a Galera cluster, but you can set up an equivalent cluster however you choose, and substitute MySQL for MariaDB. (MariaDB is an open source database which the vendor states is “a binary drop-in replacement for MySQL”. For our purposes they are completely equivalent, but if you’re interested you can read about the incompatibilities between them.)
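The guide walks through the details; for orientation, the Galera-related settings on each node look something like the following sketch. The provider path, cluster name, and host names are placeholders to adapt to your environment:

# /etc/mysql/conf.d/galera.cnf (sketch – values are placeholders)
[mysqld]
# Galera requires row-based replication and InnoDB
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2

wsrep_on              = ON
wsrep_provider        = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name    = "galera_cluster"
wsrep_cluster_address = "gcomm://db-node01.example.com,db-node02.example.com,db-node03.example.com"
wsrep_sst_method      = rsync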

Our examples use a three-server cluster with one node designated as the master, but the basic technique can be extended to much larger clusters. You also need a running NGINX Plus instance to proxy incoming requests from the application to the database cluster.

Configuring Simple Load Balancing

For a simple kind of load balancing, we have NGINX Plus forward all queries to the designated master in the cluster. We’re doing TCP load balancing, so we put the directives in the stream context:

stream {
    upstream galera {
        # Default to the master; fall back to a secondary when it's unavailable
        server db-node01.example.com:3306;
        server db-node02.example.com:3306 backup;
        server db-node03.example.com:3306 down;
    }

    server {
        listen 3306;
        proxy_pass galera;
        proxy_connect_timeout 1s;
    }
}

This configuration confers a number of advantages when working with the cluster.

First, we can conceive of the cluster as a single database for the purpose of development, while still taking advantage of the replication benefits of having a cluster.

Second, we no longer have to worry about parallel writes and collisions happening between the multiple masters in the cluster. In the previous blog, all servers acted as master and we had an issue with parallel updates to different servers leading to potentially erroneous data. Even though the cluster was behaving as directed, when a client tried to insert multiple values at once on multiple servers the results were counterintuitive and not what a developer would expect.

Another benefit of this configuration is that it eliminates replication lag, where a high volume of traffic relative to the network speed between servers causes a noticeable delay between writing data on one server and being able to read it on a replicating server. In other words, if you read data from one server immediately after it’s written to the database by another server, you might not see the update. We avoid this in our setup because the load balancer directs all traffic to the same master MySQL server.

Splitting Reads and Writes

While writes are best directed at the master server in a cluster, it’s often desirable to direct read queries to the secondary members of the cluster, which aren’t doing much of the heavy lifting. Setting this up is a bit more involved – the NGINX Plus load balancer doesn’t parse the incoming SQL queries, so the reads and writes have to be distinguished in the application tier instead.

The easiest way to split reads and writes is to set up separate load balancers for them, each listening on a different port. The application directs all transactions containing just SELECT queries to the read load balancer and all the transactions with INSERT, UPDATE, or DELETE queries to the write load balancer. Here’s a sample config file for this setup:

stream {
    # Separate TCP load balancers for reads and writes
    upstream galera_write {
        # The zone directive is required for the health checks configured later
        zone backend 64k;
        server db-node01.example.com:3306;
        server db-node02.example.com:3306 backup;
        server db-node03.example.com:3306 down;
    }

    upstream galera_read {
        zone backend;
        server db-node02.example.com:3306;
        server db-node03.example.com:3306;
    }

    server {
        listen 3308;
        status_zone tcp_server;
        proxy_pass galera_write;
        proxy_connect_timeout 1s;
    }

    server {
        listen 3309;
        status_zone tcp_server;
        proxy_pass galera_read;
        proxy_connect_timeout 1s;
    }
}

We then have to rework our application code to direct requests to the write port (here, 3308) or the read port (3309) as appropriate. This can be a fair amount of work, but many common database libraries (such as Django’s ORM and Hibernate) provide facilities for making your application aware of multiple databases.
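With Django, for instance, one approach is a database router. The sketch below is our own, not an official recipe: it assumes a DATABASES setting with aliases default (pointing at the write port) and read (pointing at the read port), with lb.example.com standing in for the load balancer’s hostname. A real router would also need to keep reads that occur inside write transactions on the write connection:

# settings.py (sketch) – both aliases point at the NGINX Plus load balancer
DATABASES = {
    "default": {                                 # writes -> port 3308
        "ENGINE": "django.db.backends.mysql",
        "NAME": "appdb",
        "USER": "app",
        "PASSWORD": "changeme",
        "HOST": "lb.example.com",
        "PORT": "3308",
    },
    "read": {                                    # reads -> port 3309
        "ENGINE": "django.db.backends.mysql",
        "NAME": "appdb",
        "USER": "app",
        "PASSWORD": "changeme",
        "HOST": "lb.example.com",
        "PORT": "3309",
    },
}
DATABASE_ROUTERS = ["myapp.routers.ReadWriteRouter"]

# myapp/routers.py (sketch)
class ReadWriteRouter:
    """Send reads to the read load balancer and writes to the write one."""
    def db_for_read(self, model, **hints):
        return "read"
    def db_for_write(self, model, **hints):
        return "default"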

It’s important to note that while the intent of the application architecture is for reads to go through the read load balancer and writes through the write load balancer, nothing prevents reading on the write port and vice versa. To enforce the split, we could run the load balancers on different hosts and grant each host different database permissions, so that the database restricts connections from the read load balancer to reads only; we’re not showing that configuration here.

Configuring Health Checks

When it comes to health checks for database load balancing, you have a lot of options. The simplest is to verify that when NGINX Plus forwards a connection to an upstream database server, the server accepts the connection. This is the default behavior in both NGINX Plus and NGINX. While the proxy can’t communicate natively with the database, a quick response to the connection attempt at least tells you the servers are still there.

This simple kind of health check is sufficient in some cases, but it has an awkward drawback – the load balancer continues to send requests to an upstream database server that is accepting new connections even if it can’t actually service requests (for example, when a replication issue between the database servers prevents the server from performing writes).

In production, you need a way to parse the database response into something meaningful, to make fuller use of the health check capabilities in NGINX Plus. This is usually done by running a process on the database server, on an open port, that checks the database’s health and responds to HTTP requests from the load balancer with the database’s status. You can either write your own script that runs a simple SELECT query (or any database health check you choose) whenever a health request ping comes in, or use an existing health check script for MySQL load balancing.
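If you’d rather write your own, here’s a minimal sketch of such a responder in Python. It assumes the pymysql package, a dedicated low-privilege health user, and a SELECT 1 probe – all our own choices, not part of any standard script:

# health_responder.py (sketch) – answer HTTP health check pings from the
# load balancer based on a cheap probe query against the local database.
import http.server
import pymysql

def db_is_healthy():
    try:
        # Credentials are placeholders; use a dedicated low-privilege user
        conn = pymysql.connect(host="127.0.0.1", user="health",
                               password="changeme", connect_timeout=2)
        with conn.cursor() as cur:
            cur.execute("SELECT 1")  # any cheap probe query works here
            cur.fetchone()
        conn.close()
        return True
    except Exception:
        return False

class HealthHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # 200 tells the load balancer the node is usable; 503 removes it
        self.send_response(200 if db_is_healthy() else 503)
        self.end_headers()

if __name__ == "__main__":
    http.server.HTTPServer(("", 9200), HealthHandler).serve_forever()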

For the test cluster, we installed the linked health check script from GitHub and set it up as a service running on port 9200 (its default port). Since the script responds to health check pings over HTTP, we confirm that it’s working correctly by checking its HTTP response when the node is healthy and again when it is down (we’re showing just the first line of the curl output, with the status code):

user@dbnode-03:# curl -i http://localhost:9200
200 OK

user@dbnode-03:# sudo service mysql stop
* Stopping MariaDB database server mysqld [ OK ]

user@dbnode-03:# curl -i http://localhost:9200
503 Service Unavailable

The final step is to set up NGINX Plus to use port 9200 for the advanced health check. The ability to direct health checks to a specific port is a new feature in NGINX Plus Release 8. It’s as simple as adding the port parameter to the health_check directive in the server block:

stream {
    upstream galera {
        ...
    }

    server {
        listen 3306;
        proxy_pass galera;
        proxy_connect_timeout 1s;
        health_check port=9200;
    }
}

If you’re using NGINX rather than NGINX Plus, you can use a clever workaround to close the MySQL port to outside traffic whenever the server is unavailable. This allows you to get the same in-depth health check as with NGINX Plus, but in this case the health check is sent to the same port as regular requests to the service.

Whichever method you use, when the health check returns an error response, the load balancer automatically removes the database server from the pool until the server recovers. A common cause of database downtime is unwittingly giving the server too much work, for example an unexpectedly large join or a query that suddenly becomes slow. In this case, the server usually recovers on its own, gradually passing a greater percentage of health checks, but recovery is faster if the server isn’t immediately hit with its full share of traffic again. For this scenario, NGINX Plus offers a way to slowly ramp up traffic to a server over a period of time after it recovers from failed health checks.
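This is the slow_start parameter on the server directive. Here’s a sketch assuming a 60-second ramp-up period, which you’d tune to your workload:

upstream galera {
    # Ramp traffic back up over 60 seconds after the server recovers
    server db-node01.example.com:3306 slow_start=60s;
    server db-node02.example.com:3306 backup;
    server db-node03.example.com:3306 down;
}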

Conclusions

We've reviewed and expanded on some concepts in MySQL load balancing using a simple Galera cluster setup. Using NGINX Plus as the load balancer, we distributed load differently for reads and writes to make the most of our resources. We also looked at a few ways to do in-depth health checks for the cluster so NGINX Plus can better manage the upstream resources it’s load balancing. The combination of NGINX Plus and Galera allows you to maximize your hardware resources while maintaining reliability.

To try out NGINX Plus for MySQL load balancing in your own environment, start your free 30-day trial today or contact us for a live demo.
