When you develop an application that uses a database, one of the topics you need to consider is how to connect to the database. Some application use a one-to-one connection, meaning the user connects to the application and it connects to the database for this user. Using this method it’s easier to implement some separation and isolation between the users (for example, each user can have their own database user with different permissions). But the more common way, for reasons we’ll discuss here, is using a connection pool. With a connection pool, the application opens a few connections to the database in advance, then each time the application needs to run something against the database, it uses one of the available open connections to do so.
Benefits of Connection Pools
When we use one-to-one connection to the database, the main problem is scale. A database server can only hold so many connections: for example, with Oracle, the default maximum connection is in the hundreds and depends on the number of CPUs, PosgreSQL has the default set to 100. Both can be changed, but this is not really a database limitation. Many database systems (Oracle by default, PostgreSQL, and MySQL among others) use a thread or process for each session connected to the database. This means that if you open many connections, even if they are idle, they consume resources (some memory and CPU resources in order to maintain and manage them).
With a connection pool, the application opens a few connections to the database in advance, then just reuse these connections. The overhead of creating new processes/threads and manage them, as well as the connection operation (like opening the socket or perform authentication) are basically eliminated.
When using a connection pool, the application handles the connections in an efficient way, it uses all available connections to perform the operations, removes the overhead, and can increase and decrease the amount of connections as needed.
The idea is that you have “connection” objects (called the pool) in your application. Each object holds a real connection to the database. When the application gets to a part in the code that needs access to the database, it takes one of the objects from the pool and uses it, marking it as “in use” so other application threads or processes won’t use the same one. When the application finishes whatever it needed to do, it releases the object back to the pool, marking it “available” so it can be used by other parts of the application.
Configuration
Configuring the connection pooling depends on your connection pool provider, and there are many. So I won’t really get into the technical details here, but I do want to discuss the concepts.
The main configuration options for connection pools are (I’ll use the connection pool perspective as you can configure a few pools for different parts of the application):
- Initial number of connections – this is a number of database connections that the pool initiate once it starts.
- Maximum number of connections – this is the maximum number of database connections the pool can have.
- Connection increment – when the application needs a connection from the pool, but no connections are available, the pool will either wait until a connection is available or open a new connection for the request. This setting configures how many new connections to open once there are no free ones (it might be better to open a few so they will be ready for use the next time the application needs one). This can happen multiple times until we reach the maximum number of connections the pool can have.
- Idle time before shrinking – If you want, you can configure the option to shrink the pool if it has grown. This means that if you have a minimum of 10 connection and maximum of 20, if the pool grew to anything beyond 10, we can configure it to gradually shrink if the extra connections are no longer needed. This is usually configured by idle time, so if an extra connection (beyond the initial amount) is idle for X seconds, it will be closed, shrinking the pool up to a minimum of the initial number.
But how do you know what values to use? This depends on many things, but here are a few guidelines:
- The first rule is to make sure that the database can handle the amount of work it gets.
- It all depends on utilization:
- If your application takes a connection from the pool but leaves it idle for long periods, you can have a larger number of connections in the pool.
- If the connections taken from the pool are doing a lot of database work for long period of times, make sure you don’t have too many connections in the pool.
- Follow the connection pool usage. If you see that you usually have a larger number of connections than the initial number, consider increasing the initial number.
- If you configure a pool increment, set a reasonable idle time for shrinking. There is no point in allocating and deallocating connections all the time, it’s a resource consuming operation.
Connection Leak
One of the main problems of connection pools is a connection leak. When you use a connection pool in the application, it’s your responsibility to release the connections back into the pool. Sometimes, unexpected things happen and the connection doesn’t get released. In theses cases, the connection is marked as “used” and other parts of the application cannot use it. If this happens often, at some point, the pool will run out of available connections and the application will get stuck. Even more than that, the session might have an open transaction and hold lock. When you “lose” this connection the locks will not be released and the transactions won’t be committed or rolled back, leading to all kind of issues.
Note that the shrinking option won’t shrink these connections as they are marked as “used”, it can only remove available connections.
When something like this happens it’s tricky to find the cause. In one case I encountered, the back-end team started an entire project to create an infrastructure that will allow following the used connections so they will be able to “find” them and get to the part of the code that used them last. Once they figured out when the leak happens, they managed to fix it so the connection will be properly released.
Tracing the database session might help as well. If the leak happens in a development environment, you can trace the entire database and then identify the lost session and figure out what happened last.
In any case, this is not an easy task and it requires time and effort.
Overload Problem
The other main problem with connection pools is overload. This is something that many developers are not even aware of, but it is very important. As I explained before, with the standard configuration the application takes a connection from the pool and uses it. If there is no available connection, the connection pool will start a new connection up to the maximum configured. The problem is what happens when the database server is loaded.
Think about a scenario where all 10 connections (let’s assume that this is the initial size) are used, but they are not just allocated, they are actually running heavy operations on the database. In this case the database might be quite loaded with high CPU or high I/O usage (or both). What happens when the application gets to a point where is wants to take another connection for something? The pool will not have available connections and will open a new connection to the database. The new connection will be used causing even more load to the database. And because the database is loaded, this connection will take time to complete its tasks so other parts of the application probably won’t have available connections as well, so the the pool will start even more connections.
This is where things go downhill very quickly. When the database is loaded, each operation takes longer, causing the connections to be busy for a longer time, and because they will not be released back to the pool we’ll see more and more connections being created, causing more and more load on the database. This will continue to happen until the the pool will reach the maximum connections configured and the application will likely hang, or the database will crash taking the application down with it.
The solution in these cases is a bit counter intuitive. We need to DECREASE the number of connections, so the active connections will have the chance to complete their operations without adding more and more connections and load to the system. During this time we can also find the problematic connections and kill them or stop their operation, allowing the system to get back to normal.
The long term solution for this kind of a problem is to configure the pool correctly. Just taking the default, whatever that might be, might be not adequate to your system and in certain scenarios cause this snowball behavior.
Summary
Connection pools are often overlooked. Developers just make up numbers or take the default values from the pool provider. This is problematic as the behavior of the application is crucial when thinking about the connection pool. You need to understand your application (how often you take a connection from the pool and release a connection to the pool, how many parts of the application need connections concurrently, etc.), how heavy your load is on each connection, the database configuration and the load the database can handle. Only with this information you can make the right decision about your connection pool configuration.
Don’t hesitate to add a comment with your thoughts or experience and if you wish to get an email every time I publish a new post, subscribe here:
There are some nice youtube videos on this:
Real-World Performance – 13 – Large Dynamic Connection Pools – Part 1
Real-World Performance – 14 – Large Dynamic Connection Pools – Part 2
You can find this list here: http://www.oracle.com/goto/oll/rwp
Thanks Rich, the Real World Performance Team is a great resource.
I’ve always loved to go to their sessions at conferences.
This is advice I have offered to several clients, along with an explanation such as the excellent one you have provided here.
The real problem is not technical in nature, but rather it is rooted in the minds if those that control the systems.
One example is the application manager that steadfastly refused to reduce the number of connections to the database, which was many thousands.
My next advice was to use a connection pool, and greatly reduce the number of connections allocated to the database.
I don’t know if that was ever done, though I have asked.