Apache Spark, part of the Hadoop ecosystem, is now a driving force behind “big data” deployments: Looker, a major player in the BI software market, has recently launched support for Hadoop ecosystem tools, including Spark. The aim is to make data easier to access.
“Big data” technologies received a massive push forward when Hadoop emerged ten years ago. Something similar has happened again recently. The ASF states that Spark, when used as an alternative to MapReduce, can speed up data processing and analysis by up to 100x, which is essential in modern business. Spark is becoming more and more popular as companies strive for real-time data analytics with next to no delay.
One of the software vendors that has recognized Spark’s efficiency is Looker. Not long ago this BI platform vendor announced that it had updated its Impala and Hive support and added support for Presto and Spark.
This enables Looker clients to process data via Hadoop with all its inherent efficiency. More importantly, the Hadoop project has finally reached the stage where it is no longer a scientific experiment but a useful tool. Big data infrastructure seems increasingly valuable for business now that companies are dealing with massive data flows. Facebook makes use of Presto, while many other companies worldwide are deploying Spark. Experts expect Looker to be just one of the first in a long line of software vendors to adopt Spark support.
A Blue Hill Research representative said in a recent interview that the amount of information stored by the company’s clients had increased significantly due to the Internet of Things and social media networks. These enormous quantities of data exceed the storage and analysis capacities of conventional systems. Hadoop is aimed at alleviating, if not entirely solving, the problem of data analysis.
The other way to deal with the ever-increasing flow of data is to store only a part of it and then process it portion by portion by means of other systems. This approach requires additional time, though, so it does not suit organizations that consider real-time business intelligence a top priority.
As Frank Bien of Looker noted, times are changing: the main point is to make massive data flows dynamic, to get them analyzed fast enough and to make them easily accessible for business. The analytics and BI markets are gradually merging with the “big data” market, so the future of the associated data processing lies with Hadoop and similar solutions.