Yellowfin Evaluation Guide

Yellowfin is used for both enterprise analytics and embedded analytics use cases and for building bespoke analytical applications. Use this guide to ensure Yellowfin is the right technical fit for your requirements.

Scalability & Performance Overview

  • In this section

    Updated 15 June 2020
  • Introduction

    How can I scale Yellowfin?

    The Yellowfin application supports Clustering. Clustering allows for multiple Application Servers to share a single Yellowfin Repository.
    Yellowfin is “Cluster aware” and a new node will find and join an existing Yellowfin Cluster. Cluster Messaging allows for Yellowfin to send messages between nodes to maintain the correct application state.
    Background tasks can be distributed between nodes to allow for greater throughput of Broadcast and Signal jobs.

    Learn more in our guide to clustering.

    What metrics do we monitor to scale?

    Yellowfin application performance (as opposed to report performance covered in <aud) is primarily influenced by two factors:

    1. How many processes are active at any given point in time
    2. How many resources those processes consume

    The most obvious processes are those generated from users in the application, such as reading existing reports or creating a new report. Less obvious tasks include those run in the background of the application such as signals jobs, email broadcasts, or data transformation flows. At any given time, Yellowfin can only run as many tasks as there are threads available on the server. This is broadly tracked by the number of “Server Cores” available to the Yellowfin application.
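
    The thread limit described above can be illustrated with a minimal Python sketch. This is not Yellowfin code: `SERVER_CORES` stands in for a hypothetical four-core server, and `run_task` is a placeholder for a user request or background job. The sketch only shows the general behaviour of a fixed-size worker pool, where tasks beyond the limit wait until a thread is free.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

SERVER_CORES = 4  # assumption: hypothetical server with four cores

_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of tasks observed running at the same time


def run_task(task_id):
    """Placeholder for a report request or a background job."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.05)  # simulate work
    with _lock:
        _active -= 1
    return task_id


def run_all(task_count):
    # The pool never runs more than SERVER_CORES tasks at once;
    # the remaining tasks wait in the pool's queue.
    with ThreadPoolExecutor(max_workers=SERVER_CORES) as pool:
        return list(pool.map(run_task, range(task_count)))


results = run_all(10)
```

    Ten tasks are submitted, but `peak_active` never exceeds `SERVER_CORES`. Adding more users or background jobs lengthens the queue rather than increasing parallelism.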

    The resources these processes consume depend on several factors that need to be monitored individually <see monitoring (report performance)>, such as the number of rows returned or the type of charts generated. All of this work consumes memory on the JVM.

    Scaling the Yellowfin application to handle increased workloads is typically done by monitoring these resources <see debugging & auditing> and then allocating more as needed. In larger environments, we typically recommend enabling clustering so that these processes can be distributed across multiple servers.

  • Server Sizing

    How do I choose the right server size?

    Choosing the right server requires analyzing your users’ workflow today (and projecting into the future) based on the metrics described above. When looking at the number of users, consider not only how many may be in the application at any point in time, but also how active they will be while they are there. Are they logging in to quickly check one dashboard, or are they logging in to explore the dataset and create ad hoc reports for hours on end?
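
    The distinction between how many users are logged in and how active they are can be made concrete with a rough back-of-the-envelope sketch. This is not a Yellowfin sizing formula: the `UserGroup` fields, the group names, and all figures below are hypothetical and only illustrate why activity matters as much as head count.

```python
from dataclasses import dataclass


@dataclass
class UserGroup:
    name: str
    concurrent_users: int  # users logged in at the busiest time
    busy_fraction: float   # share of that time spent waiting on a request (0 to 1)


def estimated_active_requests(groups):
    """Rough count of requests in flight at peak: users x busy fraction, summed."""
    return sum(g.concurrent_users * g.busy_fraction for g in groups)


# Hypothetical figures, for illustration only.
groups = [
    UserGroup("dashboard viewers", concurrent_users=200, busy_fraction=0.02),
    UserGroup("ad hoc analysts", concurrent_users=20, busy_fraction=0.25),
]
peak = estimated_active_requests(groups)
```

    With these made-up numbers, the 20 analysts account for about as much simultaneous work as the 200 occasional viewers, which is why the workflow of each group should be reviewed alongside the user count.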

    For a full guide on choosing the right server

    Does Yellowfin licence based on server size?

    Yes, Yellowfin provides the option to licence by the number of server cores that can be used. In this model, rather than limiting the number of users, we limit the resources that can be allocated to those users at any given time.

    Will I be able to scale my data store?

    As Yellowfin is primarily a direct-read application that queries your reporting databases in place, the Yellowfin server does not need to be configured to host that data.

    Scaling of the reporting databases is managed entirely by the client, separately from the Yellowfin application.

  • Clustering

    Can I scale horizontally by clustering the application server?

    The Yellowfin application supports Clustering. Clustering allows for multiple Application Servers to share a single Yellowfin Repository.
    Yellowfin is “Cluster aware” and a new node will find and join an existing Yellowfin Cluster. Cluster Messaging allows for Yellowfin to send messages between nodes to maintain the correct application state.
    Background tasks can be distributed between nodes to allow for greater throughput of Broadcast and Signal jobs.

    Learn more in our guide to clustering.

    How can I cluster Yellowfin to provide for high-availability?

    Yellowfin can be clustered by layering multiple application servers on top of a single shared configuration database. Application nodes can reside wherever you choose, as long as they are able to communicate with each other. Additionally, each cluster node can be configured to run specific tasks, allowing you to dedicate servers to resource-intensive processes such as signals or broadcasts.

    How does licensing work?

    Yellowfin provides a single licence file, which is uploaded through the application’s UI (it can also be applied through web services). Once the licence has been applied, the licensing parameters are stored in the shared configuration database. As new application nodes connect to that database, they compare themselves against the stored licence.

  • Caching

    Application Caching

    Several caches are used to store content metadata in memory.
    Some of these caches include:

    • Report Definition Cache
    • View Cache
    • Cached Filter Cache

    These caches allow the application to be performant when under heavy load. The amount of data stored and maximum time to cache can be configured to best suit the use-case and system environment.

    Data Caching

    Yellowfin can cache data for commonly run reports in memory. This increases performance when concurrent users are viewing the same content.
    Data caching stops the same query from running against the source database within a particular time-frame. Yellowfin Reports can also be scheduled to be run at a particular time, and the cached results served to end users until the next scheduled run.
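
    The time-frame behaviour described above can be sketched as a simple time-to-live (TTL) cache. This is a generic illustration, not Yellowfin's implementation: `QueryCache`, `run_query`, and `ttl_seconds` are hypothetical names, and the `clock` parameter exists only so that the behaviour can be demonstrated without waiting.

```python
import time


class QueryCache:
    """Serve a cached result until it is older than ttl_seconds."""

    def __init__(self, run_query, ttl_seconds, clock=time.monotonic):
        self._run_query = run_query  # callable that runs SQL against the source database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # sql text -> (fetched_at, rows)

    def get(self, sql):
        now = self._clock()
        entry = self._entries.get(sql)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no trip to the source database
        rows = self._run_query(sql)  # missing or expired: run the query again
        self._entries[sql] = (now, rows)
        return rows
```

    Any number of users requesting the same query within the TTL share one database round trip. The scheduled variant described above would instead refresh the entry at a fixed time, independent of when users ask for it.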

  • Resource Management

    How do I manage database connection pools?

    Yellowfin uses a Connection Pool to manage database connections and reuses connections where possible. Each data source connection can have its own connection pool limits and sizes. The Connection Pool can be used to protect the database from being inundated with report queries from Yellowfin. Connection pool settings are managed via the administration console.
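
    How a pool limit protects a database can be shown with a minimal, generic sketch. It does not reflect Yellowfin's internals or settings: `ConnectionPool` and `max_size` are hypothetical names, and an in-memory SQLite database stands in for a reporting database.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: at most max_size connections ever reach the database."""

    def __init__(self, connect, max_size):
        self._idle = queue.Queue(maxsize=max_size)
        for _ in range(max_size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while every connection is in use; raises queue.Empty
        # if `timeout` seconds pass without one becoming available.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return the connection for reuse


pool = ConnectionPool(
    lambda: sqlite3.connect(":memory:", check_same_thread=False),
    max_size=2,
)
with pool.connection() as conn:
    value = conn.execute("SELECT 1").fetchone()[0]
```

    Because callers wait for an idle connection instead of opening a new one, the database never sees more than `max_size` simultaneous queries from this pool, however many reports are requested.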