Press Release  |  August 12, 2015

ClearStory Data and Google Cloud Platform Collaborate to Combine ClearStory’s Spark-Powered Data Analysis with Google Cloud Dataflow for Business-ready, Fast-cycle Insights

Combined solution empowers data analysts and business users to reduce costly data wrangling cycles and speed analysis and exploration on large data volumes

MENLO PARK, CA, August 12, 2015 – ClearStory Data, the company bringing business-oriented Data Intelligence to everyone, today announced a new collaboration with Google Cloud Platform to speed data analysis on large, diverse data to answer new business questions. By integrating ClearStory’s Apache Spark-based analysis engine and its built-in data prep and data harmonization capabilities with Google Cloud Dataflow, users can analyze and explore more data and reach fast-cycle, business-ready insights. The combined solution is intuitive, scalable, and language-agnostic, and can be used to perform complex batch or streaming data processing and analytics. The new solution is generally available today.

The combination of Google Cloud Dataflow with ClearStory’s Apache Spark-based analysis engine, built-in data harmonization capabilities, and intuitive user interface democratizes access to fast-cycle insights so everyone in business can make informed, timely, data-driven decisions using up-to-the-minute data analysis from disparate sources. The resulting data analysis is visualized using ClearStory’s Interactive, Collaborative StoryBoards™ that let users share and collaborate on insights, ask new questions, and create fresh analysis on various data groupings.

“Through our new product integration with Google Cloud Dataflow, our customers can quickly distill meaningful insights from large, disparate data sets to reach smarter answers faster based on what’s happening that impacts their businesses now,” said Tim Howes, chief technology officer at ClearStory Data. “By combining Google Cloud Dataflow for its powerful data transformation pipeline capabilities with ClearStory’s Apache Spark-based harmonization and visual, Interactive StoryBoards™ for collaboration, businesses can ask data-driven questions and easily collaborate with less IT dependency.”

Customer benefits of using ClearStory Data and Google Cloud Dataflow include:

  •       Reduced cost and time when processing and analyzing large datasets: Cloud Dataflow automatically optimizes data-centric pipeline code by collapsing multiple logical passes into a single execution pass.
  •       Fast, out-of-the-box data access, data prep, and inference: ClearStory provides highly scalable, Apache Spark-powered data access in Google Cloud Dataflow and vice versa. Upon accessing data from Google Cloud Dataflow’s pipeline, ClearStory’s Data Inference Engine determines attributes in the source data to accelerate data prep and data harmonization, eliminating traditional, lengthy, complex data prep operations.
  •       Fast-cycle analysis: Users can select disparate data to be blended and harmonized. ClearStory’s solution and built-in Intelligent Data Harmonization™ automate the blending of more sources and complex data to deliver immediate, holistic, visual insights.
  •       Easy business consumption of data insights: Business users are empowered to be more self-reliant in asking new questions and iterating on answers quickly with ClearStory’s Interactive, Collaborative StoryBoards™ that capture the latest insights. As data refreshes, users can more easily see, collaborate, and answer key business questions on a fast cycle. This enables consistent, faster, data-driven decisions.
  •       Simplicity for technical users: Cloud Dataflow makes it easy to write language-agnostic data-processing pipelines that incorporate both batch and stream-processing capabilities.
  •       Increased efficiencies: Full lifecycle management of required compute resources reduces the burden of resource management and cluster operations. ClearStory’s data harmonization capabilities reduce IT dependency and free up those technical resources to work on more strategic company initiatives.


For more resources and information on ClearStory Data and Google Cloud Dataflow, please visit or


About ClearStory Data

ClearStory Data is bringing Data Intelligence to everyone to accelerate the way business leaders get answers from more data, on a faster cycle, across any number of disparate data sources. ClearStory Data’s solution simplifies data access to internal and external sources, automates data harmonization via Intelligent Data Harmonization™ across disparate data, enables fast, collaborative exploration, and reduces business wait time for insights via Interactive, Collaborative StoryBoards™. ClearStory Data lets business users be more self-reliant in reaching richer, faster insights. Its end-to-end solution includes an integrated Apache Spark-based data processing platform and an incredibly simple user application model for business consumption of insights. The company is headquartered in Menlo Park, CA and backed by Andreessen Horowitz, DAG Ventures, Google Ventures, Khosla Ventures and Kleiner Perkins Caufield & Byers (KPCB).

Media Contact
Carol Kimura
VP, Marketing
+1 650.322.2408
