Volatile data sources in Business Intelligence

Recently I’ve been working on issues related to volatile data sources. Yellowfin – our Business Intelligence solution – uses a connection pool to manage its connections to each data source. This allows us to control how many connections are open to the data source at once, and to reuse connections rather than creating and destroying one each time it is needed.
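To make the idea concrete, here is a minimal sketch of a fixed-size pool in Python. It is not Yellowfin's implementation: the class name SimplePool, the connect factory and the max_size parameter are all made up for illustration. It only shows the two properties described above, a cap on open connections and reuse of released ones.

```python
import queue
import threading


class SimplePool:
    """Hypothetical fixed-size pool (illustration only, not Yellowfin's code)."""

    def __init__(self, connect, max_size):
        self._connect = connect            # assumed factory that opens one connection
        self._max_size = max_size          # upper bound on open connections
        self._idle = queue.LifoQueue()     # released connections waiting for reuse
        self._open = 0                     # connections created so far
        self._lock = threading.Lock()

    def acquire(self, timeout=None):
        # Prefer an idle connection over opening a new one.
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            pass
        # Reserve a slot under the lock so the cap is never exceeded.
        with self._lock:
            may_create = self._open < self._max_size
            if may_create:
                self._open += 1
        if may_create:
            try:
                return self._connect()
            except Exception:
                with self._lock:
                    self._open -= 1        # give the slot back if opening failed
                raise
        # Pool is full: wait for a release (raises queue.Empty on timeout).
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        # Hand the connection back for reuse instead of closing it.
        self._idle.put(conn)
```

A caller would acquire a connection, run its query, and release the connection in a finally block so that it always returns to the pool.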

Challenge: Connection pool maintenance job

One issue with using a connection pool is that connections remain open for long periods of time, and each one is usually a TCP socket connection. If there is a network issue and the socket is closed, or the database becomes unavailable, the connection will be lost.

Yellowfin has always featured a connection pool maintenance job that tests the integrity of each connection every minute. However, we recently discovered that this process could fail to detect corrupt connections. If a corrupt connection was handed to Yellowfin to run a query, the query could fail as soon as the connection was used.

Solution: Connection verification

Recent modifications to the maintenance job have added the ability to actively test each connection to the data source. The job now issues a command on the connection to verify its integrity. If the command fails, the job closes the connection and recreates it. This means that a connection which passed its most recent check was able to reach the data source at that time.
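The check-and-replace step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Yellowfin's code: sqlite3 stands in for the real database driver, "SELECT 1" is an assumed test statement, and the names is_alive and maintain are invented here.

```python
import sqlite3


def is_alive(conn, test_sql="SELECT 1"):
    """Return True if a trivial statement completes on the connection."""
    try:
        conn.execute(test_sql).fetchone()
        return True
    except sqlite3.Error:
        # Any driver error means the connection cannot be trusted.
        return False


def maintain(idle_connections, connect):
    """Test every idle connection once and replace the ones that fail.

    idle_connections is a list that is updated in place; connect is an
    assumed factory that opens a fresh connection. Returns how many
    connections were replaced.
    """
    replaced = 0
    for i, conn in enumerate(idle_connections):
        if is_alive(conn):
            continue
        try:
            conn.close()                   # best effort; it may already be dead
        except sqlite3.Error:
            pass
        idle_connections[i] = connect()    # recreate in the same slot
        replaced += 1
    return replaced
```

In a real pool, a pass like maintain would run on a timer (once a minute in the job described above) and would only touch connections that are not currently in use.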

Problem: Source disconnection destroys connections in the connection pool

Some Yellowfin clients are required to take reporting database sources offline periodically. Taking these sources offline enables clients to conduct a range of tasks, including data warehousing jobs, server replication or mirroring, and transaction log shipping. While these jobs only take a few seconds, they still require a full disconnection, and any data source disconnection breaks every connection in the connection pool. Another recent modification to the connection pool addresses this issue.

Solution: Connection Retry

Connection Retry functionality has been added to address these short-term outages. When a connection is requested and the data source is unavailable, Yellowfin will now loop through a retry process, waiting up to a configurable period for the source to become available. The user may have to wait in some instances, but the request can then complete once the source returns instead of failing with an error.
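The retry process can be sketched as a loop with a deadline. Again, this is a hedged illustration and not Yellowfin's implementation: the function name acquire_with_retry, the parameters max_wait_seconds and retry_interval_seconds, and the use of Python's ConnectionError to signal an unavailable source are all assumptions made for the example.

```python
import time


def acquire_with_retry(get_connection, max_wait_seconds=30.0,
                       retry_interval_seconds=1.0):
    """Keep asking for a connection until one is available or time runs out.

    get_connection is an assumed callable that returns a connection or
    raises ConnectionError while the data source is offline.
    """
    deadline = time.monotonic() + max_wait_seconds
    while True:
        try:
            return get_connection()
        except ConnectionError:
            # Give up once another wait would go past the configured period.
            if time.monotonic() + retry_interval_seconds > deadline:
                raise
            time.sleep(retry_interval_seconds)
```

The total wait is a trade-off: it should be long enough to cover the few seconds a source is typically offline, but short enough that a genuine outage is still reported to the user promptly.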