Hadoop is a free, open-source, Java-based framework from the Apache Software Foundation that supports the distribution and running of applications on clusters of servers with thousands of nodes and petabytes of data. Interest in it is enabling many organizations to surface previously untapped troves of potentially valuable corporate data.
Additional database support in Yellowfin 6.1 targets Big Data
To meet organizations’ growing Big Data needs, we added support for Hadoop, and a stack of other databases, in the latest release of our Business Intelligence software, Yellowfin 6.1, which was released last week.
But we’re not here to talk about Yellowfin. The point is that Hadoop enables the analysis of expansive and increasingly complex data, as well as social media analytics and text-mining applications, and allows businesses to analyze much larger volumes of data than traditional database systems would permit. Technology providers therefore need to offer technical support for it, and corporate IT professionals need to consider Hadoop as a viable means of leveraging their burgeoning data assets.
Big Data isn’t all about volume, and neither is Hadoop
However, Hadoop doesn’t just offer a means to utilize expansive data sets. This narrow perception doesn’t give the full story regarding the Big Data phenomenon, and the immense amount of attention it’s received over the last 18 months. For the purpose of this discussion, Big Data can be defined as: The overall volume of active data an organization stores as well as the size, variety and velocity of the data sets it uses for its BI and analysis.
The recent TDWI Best Practices report, Big Data Analytics, explores the influx of organizations initiating, or expanding existing, analytical solutions for Big Data. The report also reveals trends and provides best practice recommendations. Again, TDWI emphasized that the term Big Data, and the advantages it describes, are not restricted to data size.
“Most definitions of big data focus on the size of data in storage,” stated the report. “Size matters, but there are other important attributes of big data, namely data variety and data velocity. The three Vs of big data (volume, variety, and velocity) constitute a comprehensive definition, and they bust the myth that big data is only about data volume.”
In fact, whilst completing the end-user survey (of 325 data management professionals) to underpin its Big Data Analytics study, TDWI came across a number of respondents who had deployed Hadoop for reasons other than its ability to handle large data volumes: “[Respondents] said the same thing: Hadoop’s scalability for big data volumes is impressive, but the real reason they’re working with Hadoop is its ability to manage a very broad range of data types,” read the report.
Size matters; but it’s all relative
The continued and rapid expansion of corporate data assets has been well documented and discussed within the business analytics industry and the analyst community. However, much of the discussion has revolved around hype. This was clearly evidenced by the inclusion of Big Data on Gartner’s 2011 Hype Cycle, which, according to Gartner, is already crescendoing towards the “peak of inflated expectations”.
Now, does this mean that Big Data is somehow irrelevant? Not at all – just overhyped. Big Data doesn’t just mean attempting to collate and meaningfully interpret terabytes and terabytes and terabytes of data. Big Data is relative – if an SMB is struggling to work with gigabytes’ worth of data, then it has, relative to its situation, a Big Data challenge to resolve. As industry expert Colin White said: “Stop debating size of big data and focus on use cases. It’s not just size. Data variety and workload types are more important.” For more on this topic, check out Yellowfin CEO Glen Rabie’s presentation at the Australian Software Innovation Forum’s The New World of Data conference: //www.yellowfinbi.com/YFCommunityNews-Big-Data-It-s-not-the-size-it-s-how-you-use-it-107287
Tune back in for part two of this two-part series to discover the key drivers, benefits and recommended approach to Big Data analytics.