Can you gain performance with Pgpool-II as a load balancer?

The world’s most valuable resource is no longer oil or gold, but data. And at the heart of data lies the database, which is expected to store, process and retrieve the desired data as quickly as possible. But a single database server often cannot serve that purpose: it carries a huge risk of data loss and downtime.

So most data platforms use multiple replicated database servers to ensure high availability and fault tolerance. In such an environment, how can we maximize performance for the resources being used?

One of the questions I get asked very often by Pgpool-II users is: what is the performance benefit of Pgpool-II’s load balancing? Once in a while, we get complaints from users that they get better performance when they connect directly to PostgreSQL than through Pgpool-II, even when load balancing is working fine. How true is this complaint?

In this blog, I’ll benchmark the performance of Pgpool-II’s load balancing feature against a direct connection to PostgreSQL.

Before we start measuring the actual performance, let’s start with what the Pgpool-II load balancer is and why we should use it.

Replicated PostgreSQL servers

In almost every data setup we need more than one copy of the database server, to ensure minimum or zero downtime and to safeguard against data loss. For that we use replicated database servers.

There are many ways to create a replicated PostgreSQL cluster, which is a topic for some other blog post, but most of the replication solutions that exist for PostgreSQL support one-way replication. That means one master PostgreSQL server feeds data to one or more standby servers, and in this setup only the master server can handle write queries, while the standby servers sit idle, waiting for the moment the master goes down and one of them gets promoted to become the new master.

This setup is good enough to handle server failures and to provide high availability, but it is not so good when it comes to using resources efficiently, since the standby servers are just sitting there doing nothing.

This is where Pgpool-II comes in. One of its most used features, alongside many other exciting ones, is its intelligent load balancer, which routes read queries to the standby servers and sends write queries to the master, distributing the read load across as many as 128 read replicas.

So what should we expect from the load balancing?

Okay, now to the actual question: how much of a performance gain should I expect after adding Pgpool-II as a load balancer in front of my PostgreSQL cluster?

The best way to find out is to try it.

First, let’s create a simple two-node PostgreSQL setup with native streaming replication, and measure the performance using pgbench, with and without Pgpool-II.

For performance benchmarking, we are going to use small virtual instances; the modest hardware will help amplify the performance gap between the two strategies.

For this setup I am using the standard configurations. On the Pgpool-II side I have just configured the backend_*0 and backend_*1 settings to point to our master and standby servers. Other than that, I have set the following configuration parameters:

num_init_children = 50
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream'

The rest of the configuration parameters are kept at their default values.
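For reference, the backend settings mentioned above look roughly like this in pgpool.conf. The master address matches the one used in the benchmarks below; the standby address and the data directories are placeholders for this sketch, so adjust them to your own environment:

```
# Backend 0: the master (write) server
backend_hostname0 = '192.168.2.186'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/pgsql/11/data'

# Backend 1: the standby (read) server -- hostname here is a placeholder
backend_hostname1 = '192.168.2.187'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/pgsql/11/data'
```

With equal backend_weight values, Pgpool-II distributes read queries evenly across the two nodes.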

For PostgreSQL there is no change in the configuration, except for setting listen_addresses = '*' and editing pg_hba.conf to allow TCP/IP connections from the Pgpool-II and client IP addresses.
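The pg_hba.conf entries for this are along these lines. The subnet and the trust authentication method are assumptions for this isolated test network; in production you would restrict the addresses and use a proper authentication method such as md5 or scram-sha-256:

```
# Allow Pgpool-II and benchmark clients to connect over TCP/IP (test network only)
host    all             all     192.168.2.0/24     trust
# Allow the standby to stream WAL from the master
host    replication     all     192.168.2.0/24     trust
```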

The hardware specs used for the PostgreSQL and Pgpool-II servers are very basic: I am using Linux CentOS 7 instances hosted on OpenStack, each with 2 GB RAM, 1 vCPU and 20 GB of disk.

So let’s run a very basic query with and without Pgpool-II and compare the timings. I am expecting double the performance with Pgpool-II, since I have two backend nodes 😛

Let’s fire our first query and note the timings.

Let’s select the version through Pgpool-II:

$ bin/psql -p 9999 postgres -c "\timing" -c "select version()";
Timing is on.
                                                 version                                                 
---------------------------------------------------------------------------------------------------------
 PostgreSQL 11.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)

Time: 1.115 ms

Let’s try the same using a direct PostgreSQL connection:

$ bin/psql -p 5432 postgres -c "\timing" -c "select version()";
Timing is on.
                                                 version                                                 
---------------------------------------------------------------------------------------------------------
 PostgreSQL 11.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)

Time: 0.523 ms

Wait, what?

There must be something wrong. Why is the direct connection more than twice as fast as Pgpool-II? 😕

Let’s try pgbench to compare the TPS with and without Pgpool-II.

Initialize the pgbench data using a scale factor of 100:

$ pgbench -i -s 100 -F 100 -n -h 192.168.2.186 pgbenchdb

Run the benchmark for 30 seconds with 5 clients through Pgpool-II:

$ pgbench -S -T 30 -n -h 192.168.2.214 -p 9999 -c 5 pgbenchdb 
transaction type: <builtin: select only>
scaling factor: 100
query mode: simple
number of clients: 5
number of threads: 1
duration: 30 s
number of transactions actually processed: 118660
latency average = 1.267 ms
tps = 3946.490694 (including connections establishing)
tps = 3947.312911 (excluding connections establishing)

Now the same benchmark with a direct connection:

$ /usr/pgsql-10/bin/pgbench -S -T 30 -n -h 192.168.2.186 -p 5432 -c 5 pgbenchdb 
transaction type: <builtin: select only>
scaling factor: 100
query mode: simple
number of clients: 5
number of threads: 1
duration: 30 s
number of transactions actually processed: 154373
latency average = 0.972 ms
tps = 5143.877489 (including connections establishing)
tps = 5144.725530 (excluding connections establishing)

Again, the direct connection to PostgreSQL is still performing better. 🤒

Let’s increase the load: benchmark with 35 clients, keeping the scale factor the same.

30 seconds, 35 clients, with Pgpool-II:

$ /usr/pgsql-10/bin/pgbench -S -T 30 -n -h 192.168.2.214 -p 9999 -c 35 pgbenchdb 
transaction type: <builtin: select only>
scaling factor: 100
query mode: simple
number of clients: 35
number of threads: 1
duration: 30 s
number of transactions actually processed: 176664
latency average = 5.958 ms
tps = 5874.876717 (including connections establishing)
tps = 5875.934778 (excluding connections establishing)

Now the same with a direct connection:

$ /usr/pgsql-10/bin/pgbench -S -T 30 -n -h 192.168.2.186 -p 5432 -c 35 pgbenchdb 
transaction type: <builtin: select only>
scaling factor: 100
query mode: simple
number of clients: 35
number of threads: 1
duration: 30 s
number of transactions actually processed: 190913
latency average = 5.560 ms
tps = 6295.480640 (including connections establishing)
tps = 6296.532446 (excluding connections establishing)

The difference between the direct connection and the one through Pgpool-II seems to be shrinking as we increase the number of clients in pgbench.
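We can quantify how fast the gap is closing with a quick back-of-the-envelope calculation on the TPS figures above (using the "including connections establishing" numbers):

```python
# Relative Pgpool-II overhead vs. a direct connection,
# computed from the pgbench runs above.
runs = {
    5:  {"direct": 5143.877489, "pgpool": 3946.490694},
    35: {"direct": 6295.480640, "pgpool": 5874.876717},
}

for clients, tps in runs.items():
    # Percentage by which Pgpool-II trails the direct connection
    gap = (tps["direct"] - tps["pgpool"]) / tps["direct"] * 100
    print(f"{clients} clients: Pgpool-II trails by {gap:.1f}%")
# 5 clients: Pgpool-II trails by 23.3%
# 35 clients: Pgpool-II trails by 6.7%
```

So the overhead drops from about 23% at 5 clients to under 7% at 35 clients.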

But Why?

Let’s analyze the results. Why is this happening? And if this is the case, why should I use a load balancer at all, if a direct connection always performs better than a load-balanced one?

If we look at the architecture of Pgpool-II, it is a middleware with a smart “Layer 7” load balancer. That means Pgpool-II terminates the client connection and reads the query within it, then parses and analyzes that query to make a load-balancing decision based on the type of the query and other settings (like the white/black function lists, a query prefix if any, the current transaction state, etc.). It then forwards the query to one of the PostgreSQL backend servers and relays the results back to the client. So it not only adds an extra hop between the client and the PostgreSQL server, but also has some processing overhead of its own.
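Incidentally, you can confirm that this routing is really distributing the reads by asking Pgpool-II itself. Its SHOW pool_nodes command reports, among other things, a per-backend select_cnt counter, which should grow on both nodes while the read-only benchmark is running:

```
$ bin/psql -p 9999 postgres -c "SHOW pool_nodes;"
```

If select_cnt only increases on node 0, load balancing is not actually taking effect and the overhead is being paid for nothing.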

Putting it all into perspective

Looking at the above test results and the architecture of Pgpool-II, it appears that despite the load balancer having its own overheads, it starts to catch up with the direct connection’s performance as we increase the load.

Let’s do another test before reaching a conclusion. This time we further increase the load and run the benchmark with 45 clients.

Starting with Pgpool-II:

$ /usr/pgsql-10/bin/pgbench -S -T 30 -n -h 192.168.2.214 -p 9999 -c 45 pgbenchdb 
transaction type: <builtin: select only>
scaling factor: 100
query mode: simple
number of clients: 45
number of threads: 1
duration: 30 s
number of transactions actually processed: 180896
latency average = 7.611 ms
tps = 5912.255620 (including connections establishing)
tps = 5913.375497 (excluding connections establishing)

Now the direct connection:

$ /usr/pgsql-10/bin/pgbench -S -T 30 -n -h 192.168.2.186 -p 5432 -c 45 pgbenchdb 
transaction type: <builtin: select only>
scaling factor: 100
query mode: simple
number of clients: 45
number of threads: 1
duration: 30 s
number of transactions actually processed: 140167
latency average = 9.744 ms
tps = 4618.172310 (including connections establishing)
tps = 4618.860616 (excluding connections establishing)

Yay! Now we are getting better performance with Pgpool-II.

Conclusion

Pgpool-II is middleware software that sits between the client and the PostgreSQL server, and it has its own processing and network overheads. So for smaller databases and workloads it does not increase performance; in fact, it degrades overall throughput, because of the extra hop between the client and the PostgreSQL server and the processing involved on the Pgpool-II side (to make the load balancer smart enough to route read and write queries to the appropriate destination).

The performance increase from load balancing only starts to materialize when the load on a single PostgreSQL server becomes significant and its performance starts to degrade because of a large database size or heavy processing load.

Test results with a 100 scale factor

(chart: pgbench for 30 seconds with a 100 scale factor)

Test results with a 500 scale factor

(chart: pgbench for 30 seconds with a 500 scale factor)

From our benchmarks, we can clearly see that Pgpool-II load balancing has its benefits, though it may be counterproductive in some cases. This blog has attempted to give you a better perspective on how to set up Pgpool-II for your high-availability environment. As with any other benchmark recommendation, it is very important that you tune your setup for your workload to gain maximum performance.

I have used very small virtual instances for these tests; on systems with more RAM and processing power, the crossover point where Pgpool-II starts to outperform a direct connection will come at much larger loads.