Set up the Hive JDBC Driver
7 February, 2013
Hi,
Please can you provide instructions on how to set up the Hive JDBC driver in Yellowfin 6.2?
Regards,
Nick
Hi Nick,
This will be a two-step process:
Step 1:
- Stop the Yellowfin server.
- Save the attached hivejdbc.jar to a location on your PC.
This will now make the Hive driver selectable.
The next step is to actually point to your Hive JDBC driver files.
You can download a copy of the drivers we have here, though they may not be compatible with your Hive version (as there are quite a few versions).
Step 2:
- Start Yellowfin and create your Hive connection (see screenshot below).
- Point to your Hive JDBC drivers.
Note: These must be located on the same server as the Yellowfin application.
That should be it.
If the attached drivers do not work for your installation, you will need to find out the exact version of Hive you are using and look for a matching driver online, as we do not have other drivers.
We will do our best to help you with this, though please be aware that we did not create the drivers, and can only send across what we find.
Please let me know how you go.
Regards,
David
Hi,
I have a question regarding the Hive integration. By default, Hive queries are very slow. How does one get around the slowness? We have a large amount of data stored in Hive, and end users need to run interactive queries on parts of this data. How can this requirement be architected using Yellowfin BI? Please advise.
Thanks.
Hi,
Hive databases are very good for storage, as they can hold a large amount of data, but they are very slow to run queries against.
What most people do is transfer chunks of data from their Hive database into another database that can run the queries a lot quicker. Unless you're working with petabytes of data, you don't really need to use a Hive database.
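To give an idea of what such a transfer can look like, here is a minimal Python sketch, not a Yellowfin feature. It assumes the source is any Python DB-API cursor; in a real setup that cursor would come from a Hive client library, and the SQL placeholder style may differ per driver. In-memory SQLite stands in for both databases so the example runs on its own, and the function name `copy_in_chunks` and the table names are made up for illustration.

```python
import sqlite3

def copy_in_chunks(source_cursor, target_conn, insert_sql, chunk_size=1000):
    """Stream rows from an already-executed source cursor into the target."""
    copied = 0
    while True:
        rows = source_cursor.fetchmany(chunk_size)  # at most chunk_size rows in memory
        if not rows:
            break
        target_conn.executemany(insert_sql, rows)
        target_conn.commit()
        copied += len(rows)
    return copied

# Stand-in for the slow source database (this would be Hive in practice).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE events (day TEXT, clicks INTEGER)")
source.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2013-02-0%d" % d, d * 10) for d in range(1, 8)],
)

# Stand-in for the faster reporting database that users query interactively.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE events_recent (day TEXT, clicks INTEGER)")

# Copy only the slice of data that users need to query.
cursor = source.execute(
    "SELECT day, clicks FROM events WHERE day >= '2013-02-05'"
)
copied = copy_in_chunks(
    cursor, target, "INSERT INTO events_recent VALUES (?, ?)", chunk_size=2
)
print(copied)
```

Copying in chunks keeps memory use bounded, and restricting the source query to the slice users need keeps the reporting database small.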
Having said all of this, you could create your Yellowfin view and then cache it into a different database via view caching, found in the Summary tab mentioned here.
If you have any questions on this, we will need to move this to its own thread.
Regards,
David
I've tried the Hive driver provided here by David, but it doesn't work. It is the Hive 1 connector, which has been deprecated in favour of Hive2 for concurrency, so Yellowfin accepts the driver but then hangs.
I've also tried the drivers provided by Cloudera (Impala and Hive, JDBC 3 and 4).
I'm working with the Cloudera QuickStart CDH 5.2 and Yellowfin 7.1.
Thanks!
Hi Guest,
Whilst Yellowfin can definitely connect to Hive1 and Hive2 (we have many clients currently doing so), there are, as you may be aware, different "flavours" of Hive such as Cloudera, and unfortunately we haven't been able to test Yellowfin against all of these different Hive "flavours" yet.
So, because Cloudera is one of those that we haven't tested against yet, the first thing to do in your situation would be to connect to your Hive2/Impala database with a third-party database tool such as DB Visualiser or SQuirreL. This will prove that a JDBC connection is indeed possible. Yellowfin essentially makes the same connection as those database tools, so if one of them can connect successfully then Yellowfin should also be able to.
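For reference when filling in the connection details in one of those tools, here is a small Python sketch of the two JDBC URL forms. The driver class names, URL prefixes and default port 10000 are those of the Apache Hive drivers; Cloudera's own drivers use different class names and URL options, so check their documentation. The helper name `hive_jdbc_settings` and the host name are made up for illustration.

```python
# Driver class and URL prefix for the Apache Hive JDBC drivers.
HIVE_DRIVERS = {
    1: ("org.apache.hadoop.hive.jdbc.HiveDriver", "jdbc:hive://"),  # HiveServer1 (deprecated)
    2: ("org.apache.hive.jdbc.HiveDriver", "jdbc:hive2://"),        # HiveServer2
}

def hive_jdbc_settings(host, port=10000, database="default", server_version=2):
    """Return (driver_class, jdbc_url) for an Apache Hive server (hypothetical helper)."""
    if server_version not in HIVE_DRIVERS:
        raise ValueError("server_version must be 1 or 2")
    driver_class, prefix = HIVE_DRIVERS[server_version]
    return driver_class, f"{prefix}{host}:{port}/{database}"

# Example host name is a placeholder, not a real server.
driver_class, jdbc_url = hive_jdbc_settings("quickstart.example", server_version=2)
print(driver_class)
print(jdbc_url)
```

If the URL prefix and the driver class do not belong to the same server version, the tool may accept the driver and still fail to connect.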
Another point is that, regardless of what "flavour" of Hive is being used, the most common issue in previous attempts at connecting Yellowfin to Hive has been that dependencies were missed when copying the Hive drivers to the Yellowfin driver folder (as specified in the field "Driver Path"). The easiest way to ensure that all necessary dependencies are included in your Yellowfin installation is to add all jar files from your Hive/Impala lib folders. To give you an idea of what this may involve, I once couldn't connect to a specific version of Hive/Hadoop until I had copied all jars from the Hive lib folder and the Hadoop lib folder (about 70 of them in all), and then the connection was successful.
And one more point: once you've added all of your jar files from the Hive lib and the Hadoop lib folders to the Yellowfin field called "Driver Path" as shown in the below screenshot:
then also please make sure there are no Hive drivers in the folder yellowfinappserverweb-inflib so there is no chance of a mix-up of drivers for the class-loader.
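If you want to check both folders quickly, a rough Python sketch like the one below can list which jars in a folder contain a given driver class. It is not part of Yellowfin; the function name `jars_containing_class` is made up, and the class name you pass in depends on the driver you are using (`org.apache.hive.jdbc.HiveDriver` is the Apache HiveServer2 driver class).

```python
import zipfile
from pathlib import Path

def jars_containing_class(folder, class_name):
    """Return the names of the .jar files in `folder` that contain `class_name`."""
    # Inside a jar, a class is stored as a path, e.g. org/apache/hive/jdbc/HiveDriver.class
    entry = class_name.replace(".", "/") + ".class"
    found = []
    for jar in sorted(Path(folder).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    found.append(jar.name)
        except zipfile.BadZipFile:
            continue  # skip files that are not valid jar archives
    return found
```

Run against the Driver Path folder, `jars_containing_class(folder, "org.apache.hive.jdbc.HiveDriver")` should list at least one jar; run against the Yellowfin lib folder mentioned above, it should return an empty list.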
I hope that makes sense and helps with your connection. Please let us know how your DB Visualiser/SQuirreL connection with Hive/Impala goes, and also how things go after adding all Hive/Impala jars to the Yellowfin driver folder.
regards,
Dave