New advances in ClearStory’s Apache Spark-native solution bring smart data matching, rapid data harmonization, and an even simpler visual, guided application that eliminates onerous data prep and modeling, automates data blending, and speeds business results
O’Reilly Strata + Hadoop World, New York, NY – September 24, 2015 – ClearStory Data, the company bringing business-oriented Data Intelligence to everyone, today announced new enhancements to its Apache Spark-native Intelligent Data Harmonization™ capabilities to address a broad market shift and the accelerating requirement for everyday business people to access and blend disparate data and be more self-reliant in reaching deeper insights. ClearStory Data, the first industry solution natively built on Apache Spark, runs in the cloud and provides a simple business user application that speeds disparate data analysis, accelerates data discovery, and lets business users be self-reliant.
The Business Intelligence market is no longer merely in a period of rapid change; it has reached an inflection point of rapid transformation. Nearly two thirds (64 percent) of companies are already trying to combine five to 15 sources of data.1 Shockingly, 92 percent of business users are still using Excel for analysis for lack of better alternatives.1
Together with the urgent need to access more disparate data and speed business insights, the attractive economics and speed advantages of cloud-hosted solutions are further accelerating the shift to modern solutions. To address the need for scalable and fast data processing, Apache Spark has moved to the forefront, and solutions like ClearStory demonstrate Apache Spark’s advantages for scalable analysis of disparate data. The combination of these trends is why the modernization of BI and data analytics is experiencing its fastest pace of change ever.
Through 2017, Gartner predicts that the number of business users doing data analysis will grow five times faster than the number of highly skilled data scientists,2 and most business users and analysts will access self-service tools to prepare data for analysis.3
To address this major market shift, ClearStory today announces advances to its Apache Spark-native Intelligent Data Harmonization and blending capabilities. New capabilities include smart data matching and a visual, guided interface that makes blending complex, multi-source datasets simple, fast and auditable. Smart data matching is made visible and intuitive to the user through new Google-like visual maps that let business users see the specific route that resulted in holistic insights. The new, guided application interface introduces auto-alerts when users combine disparate data that may not blend well or that contains inconsistencies, so users can quickly navigate to what they want to see and explore in the harmonized data. ClearStory continues to accelerate modernization on every front, from its underlying Apache Spark-native capabilities through to the user application model, so tasks that were once overwhelming, complex and slow in traditional BI are now fast, simple and accessible to everyone.
Highlights of ClearStory’s new advances include:
- Turnkey access to a broad range of data and visual guidance for ad-hoc data blending and harmonization: Users can easily navigate a streamlined visual route to make changes on-the-fly and enrich insights with more data. Each step generates a dynamically updating “Data Story” that reveals insights that are explorable and flexible. The types of data that businesses need to combine and harmonize today include Excel files, relational data from existing data repositories, semi-structured data and data from cloud-based applications.
- Enhanced data intelligence and smart data matching with visual cues on data blending and best matches across complex data: The new ClearStory interface provides an interactive experience that informs the user about the auto-detected matching attributes of the datasets and the relationships between attributes across data sources. The system additionally detects and alerts on data dimension inconsistencies and guides users to the best way to harmonize disparate data. When the system recognizes that disparate data being combined doesn’t fit together, users are alerted via visual cues and can review and resolve any ambiguities and inconsistencies in a few clicks while blending data.
- New Data Lineage Visualizer™ for deep visibility into data lineage to ensure data integrity and insight integrity: Business users prefer to manipulate data through a visual, self-service experience so they can answer their questions faster, rather than being handed rigid IT-created dashboards that constrain their insights. ClearStory’s advanced interface provides users with a visual map and data-blending blueprint in one intuitive, unified view. Users can click on any dataset in the map to ascertain its underlying granular details – to see how the data was blended together, and to trace the data harmonization path that has been applied. In a single view, users can easily answer their questions and iterate quickly to answer new ones, without relying on data specialists to do it for them.
“Customers in hyper-competitive markets, driven by consumer demand or increased market complexity such as CPG, national retail, healthcare and logistics/manufacturing, need near real-time business-ready insights. They also urgently require solutions that automate complex tasks and provide an easy-to-use, intuitive application interface,” says Chief Product Officer Ali Tore at ClearStory Data. “Through deeper visibility into what data is being blended and harmonized, and how, paired with even more automated intelligence through smart data matching, ClearStory’s Apache Spark-native solution and business-friendly application let business users be more self-reliant with more data access and make combining data a lot easier and faster.”
With today’s announcement of ClearStory’s advances in Apache Spark-based data matching, and with the richness and simplicity of the ClearStory Data user model, business users can be self-sufficient in generating fast, business-ready insights across more data. ClearStory Data’s integrated, Apache Spark-native architecture and visual, guided application interface speed up every step, from data access and prep to modeling and data harmonization.
ClearStory Data is a strategic sponsor and exhibitor at O’Reilly Strata + Hadoop World from Sept 29-Oct 1, 2015 in New York City. If you attend the conference, please stop by booth #429 to see a live demo.
1 Harvard Business Review, Analytics Services Report: “Data Blending: A Powerful Method for Faster, Easier Decisions” – August 31, 2015
2 Gartner, Inc. Report: “Smart Data Discovery Will Enable a New Class of Citizen Data Scientist” – June 29, 2015
3 Gartner, Inc. Report: “The Rise of Data Discovery Has Set the Stage for a Major Strategic Shift in the BI and Analytics Platform Market” – June 15, 2015
About ClearStory Data
ClearStory Data is bringing Data Intelligence to everyone to accelerate the way business leaders get answers from more data, on a faster cycle, across any number of disparate data sources. ClearStory Data’s solution simplifies data access to internal and external sources, automates data harmonization via Intelligent Data Harmonization™ across disparate data, enables fast, collaborative exploration, and reduces business wait time for insights via Interactive, Collaborative StoryBoards™. ClearStory Data lets business users be more self-reliant in reaching richer, faster insights. Its end-to-end solution includes an integrated Apache Spark-based data processing platform and an incredibly simple user application model for business consumption of insights. The company is headquartered in Menlo Park, CA and backed by Andreessen Horowitz, DAG Ventures, Google Ventures, Khosla Ventures and Kleiner Perkins Caufield & Byers (KPCB).
Sr. Director, Corporate Marketing