Spark Speeds Towards the Next Data Processing Revolution

“A mighty flame followeth a tiny spark.” (Dante Alighieri)
If you know anything about Apache Spark, you know that its chief claim to fame is speed. With in-memory processing, Spark promises ten- or hundred-fold improvements or more in data processing times over traditional MapReduce. Spark is also more flexible, supporting a wider variety of workloads than… Read more »

A Little Spark to Wildfire

Open Source Project Birthed at U.C. Berkeley Takes Off in the Enterprise
This week the fast-growing Apache Spark community is gathering in New York City to celebrate and collaborate on one of the most popular open source projects today. Launched in U.C. Berkeley’s AMPLab in 2009, Apache Spark has begun to catch on like wildfire during the last year and a… Read more »

ClearStory Data and Hortonworks: Enabling Fast-Cycle, Business-Ready Analytics and the “Must-Haves” for Secure, Governed Data Lakes

We are excited to be working with Hortonworks and to announce ClearStory Data’s integration with Hortonworks Data Platform (HDP) during Strata + Hadoop World 2015. This partnership with Hortonworks is significant because it brings ClearStory’s business-ready, fast-cycle, scalable analysis to Hadoop Data Lakes, and specifically to HDP. ClearStory’s integration that includes a data… Read more »

The 2+2=5 Principle and the Perils of Analytics in a Vacuum

Strategic decision making in enterprises playing in a competitive field requires collaborative information seeking (CIS). Complex situations require analysis that spans multiple sessions with multiple participants (that collectively represent the entire context) who spend time jointly exploring, evaluating, and gathering relevant information to drive conclusions and decisions. This is the core of the 2+2=5 principle…. Read more »

Ending the Era of “Look But Don’t Touch”: Introducing Interactive, Collaborative StoryBoards

We’re very excited to be at the Strata Conference + Hadoop World in New York City this week. Last year we chose this conference to announce our Data Intelligence platform to the world. This year we are unveiling a major new capability that advances data storytelling: ClearStory’s new Interactive, Collaborative StoryBoards™ powered by Spark. But… Read more »

It’s the End of the (Analytics and BI) World as We Know It

“That’s great, it starts with an earthquake, birds and snakes, an aeroplane, and Lenny Bruce is not afraid.” –REM, “It’s the End of the World as We Know It (and I Feel Fine)”
REM’s famous “It’s the End of the World…” song rode high on the college radio circuit back in the late 1980s. It was a… Read more »

ClearStory + Spark = Data Exploration Freedom

The release of Spark 1.0 marks a significant step in the move away from MapReduce-based big data processing. In-memory. Distributed. Scale out. Machine Learning. 100X faster on initial benchmarks, and our Spark-inside solution is evidence of the blazing speed. Data scientists and data engineers are rejoicing – and drooling. Although for some users… Read more »

On the Meaning of Spark 1.0

The Spark 1.0 release marks some significant changes, and it is worth taking some time to consider just what this release means for users and developers of Spark and what they can now expect from the project. The answer to that question has clearly changed over time, and not all of that change is completely… Read more »

ClearStory and Databricks at the Spark Summit

To follow up on the success of the Spark Summit in San Francisco this past week, I wanted to share a great conversation I had with Reynold Xin, who is one of the co-founders of Databricks and the main author behind Shark. Here’s a discussion capturing highlights of our chat at the Summit, in which we discuss how… Read more »

Spark Summit 2013

On the eve of the very first Spark Summit, there is a lot to be excited about. The past few months have been quite eventful for the Spark community: the Apache Software Foundation accepted Spark as an official incubator project and graduated Apache Mesos to a Top-Level Project, Databricks was founded to commercialize Spark… Read more »