ClearStory Data + Google Cloud Platform



ClearStory Data, a scalable data analytics solution, and Google Cloud Dataflow together speed data access, data prep, and analysis for fast interactive insights.

ClearStory Data is a scalable Apache Spark-based data analysis solution that speeds the cycle of accessing data from multiple disparate sources, blending it into a holistic view, and delivering interactive business insights. Data sources include on-premises and cloud-based sources, Google Cloud Platform data sources, and Big Data platforms such as Hadoop. ClearStory Data speeds data access, reduces data prep time, and enables faster visual exploration of data from Google Cloud data sources, including seamless integration with Dataflow.

ClearStory Data embeds Apache Spark as a native data processing engine to speed analysis of data that is both large and diverse. Its embedded Spark-based capabilities bring scale, speed, and simplicity to data analysis by accelerating data preparation, automating data blending and harmonization for fast answers, and enabling users to perform deep analytical explorations on more data. The result is new answers from more data for the business.

Integrated with Google Cloud Dataflow

ClearStory Data can function as both a data sink and a data source for Google Cloud Dataflow’s scalable data processing pipelines. ClearStory complements Dataflow’s developer-driven approach with built-in automated semantic profiling of diverse data and data blending capabilities. The combined solution offers speed, scale, and simplicity in data analysis for data stewards, analysts, and business users.


  • Save time: accelerate data preparation via automated data inference across large disparate datasets
  • Simplify: intelligently match data on common dimensions on the fly, making it easy to blend data originating from the Dataflow pipeline with other disparate data
  • Govern: leverage built-in user and data governance capabilities to control how data is accessed, and maintain an audit trail for every analysis
  • Share: enable collaborative data analysis so users can work together on data workflows, analytics, and insights
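
To make the "Simplify" point above more concrete, the following is a minimal sketch of blending two datasets on their common dimensions. It is an illustration only, not ClearStory Data's implementation: the function names (common_dimensions, blend), the sample records, and the rule of matching dimensions purely by identical column name are all assumptions made for this example. The source does not describe how the product actually infers matching dimensions.

```python
from collections import defaultdict


def common_dimensions(left_rows, right_rows):
    """Return the column names shared by every row of both datasets.

    Assumption for this sketch: two columns are the "same dimension"
    only if they have exactly the same name.
    """
    if not left_rows or not right_rows:
        return []
    left_cols = set.intersection(*(set(row) for row in left_rows))
    right_cols = set.intersection(*(set(row) for row in right_rows))
    return sorted(left_cols & right_cols)


def blend(left_rows, right_rows):
    """Inner-join two lists of dict records on their shared columns."""
    dims = common_dimensions(left_rows, right_rows)
    if not dims:
        raise ValueError("no common dimensions to blend on")

    # Index the right-hand dataset by its values for the shared dimensions.
    index = defaultdict(list)
    for row in right_rows:
        index[tuple(row[d] for d in dims)].append(row)

    # Merge each left-hand row with every right-hand row that matches.
    blended = []
    for row in left_rows:
        key = tuple(row[d] for d in dims)
        for match in index.get(key, []):
            blended.append({**row, **match})
    return blended


# Hypothetical sample data: one dataset standing in for pipeline output,
# the other for an unrelated source.
pipeline_output = [
    {"region": "west", "orders": 120},
    {"region": "east", "orders": 95},
]
other_source = [
    {"region": "west", "accounts": 14},
    {"region": "north", "accounts": 3},
]

print(blend(pipeline_output, other_source))
```

Here the only shared column is "region", so only the "west" records are combined. A production system would likely also need to reconcile differing column names, types, and value formats, which this sketch does not attempt.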