Integration provides Apache Spark-enabled disparate data analysis, automated data harmonization, and robust data governance on Hadoop
MENLO PARK, CA — May 6, 2015 – ClearStory Data, the company bringing business-oriented Data Intelligence to everyone, today announced it has further integrated its Apache Spark-based business user application and data harmonization engine with CDH, the latest version of Cloudera’s industry-leading Hadoop distribution platform. With this integration, ClearStory Data brings its Apache Spark-based data processing capabilities to CDH customers as a native, integrated data source. ClearStory Data is a certified technology partner of Cloudera, the leader in enterprise analytic data management powered by Apache Hadoop™.
Hadoop-based data hubs have increasingly emerged as a single location for enterprise and consumer internet companies to store all data. As a result, data hubs hold a rich variety of data with the potential for extremely high-value insights when integrated with new-generation Apache Spark-based analytics solutions.
By pairing ClearStory Data with Cloudera, customers can speed data access and analysis by leveraging ClearStory’s unique capabilities around data inference, data harmonization, and metadata management. Further, given ClearStory’s intuitive application experience, it can be used by business analysts and other staff without specialized data or technical skills.
“Enterprises across various sectors such as insurance, banking, retail, and healthcare are using data hubs to answer questions about the business,” says Tim Stevens, vice president of corporate and business development at Cloudera. “ClearStory Data helps deliver enterprise-scale Apache Spark-enabled processing, self-service data analysis, and holistic, consumable insights for line of business users.”
The integrated solution is focused on answering important business questions on a fast cycle, leveraging data from CDH. ClearStory Data’s simple business application, user-guided interface, collaborative “Data Stories” and “Interactive, Collaborative StoryBoards™” let business users be more self-reliant with data analysis and free up IT resources, while enforcing appropriate user and data governance controls based on the nature and sensitivity of the data stored in CDH.
“Today’s enterprise data hubs require Apache Spark’s fast in-memory processing because they have terabytes, if not petabytes of information, and there’s complexity and security concerns about who gets access to what data both inside and outside the organization,” says Vaibhav Nivargi, chief architect and technical co-founder of ClearStory Data. “The combination of Cloudera and our platform makes large volumes of disparate data more consumable. It allows business users to freely explore and collaborate on data in Hadoop with a user-guided experience through our simple business application.”
For more information, please visit www.clearstorydata.com.
About ClearStory Data
ClearStory Data is bringing Data Intelligence to everyone to accelerate the way business leaders get answers from more data, on a faster cycle, across any number of disparate data sources. ClearStory Data’s solution simplifies data access to internal and external sources, automates data harmonization via Intelligent Data Harmonization™ across disparate data, enables fast, collaborative exploration, and reduces business wait times for insights via Interactive, Collaborative StoryBoards™. ClearStory Data lets business users be more self-reliant in reaching richer, faster insights. Its end-to-end solution includes an integrated Apache Spark-based data processing platform and an incredibly simple user application model for business consumption of insights. The company is headquartered in Menlo Park, CA and backed by Andreessen Horowitz, DAG Ventures, Google Ventures, Khosla Ventures and Kleiner Perkins Caufield & Byers (KPCB).