IBM launches 'data-centric cloud-based development environment' for Apache Spark

The Apache Spark Summit is being held in San Francisco this week. Yesterday, Microsoft announced an "extensive commitment" to using Spark to power its key big data and analytics products, including Cortana Intelligence Suite, Power BI, and Microsoft R Server.

Today, IBM joined in the action by announcing "the first cloud-based development environment for near real-time, high performance analytics".

IBM says it has invested $300 million in developing Spark as "a type of 'analytics operating system'". Its new Data Science Experience is intended to make it easier to interrogate and analyze big data, and to extract deeper insights more quickly.

The company describes the Data Science Experience as "an interactive, collaborative, cloud-based environment where data scientists can use multiple tools to activate their insights", explaining further:

IBM created the Data Science Experience to extend the speed and agility of Spark to more than two million members of the R community through new contributions to SparkR, SparkSQL and Apache SparkML.

[...]

The Data Science Experience’s open and collaborative environment allows data scientists to accelerate and simplify data ingestion, curation and analysis by bringing together the content, data, models, and open source resources from IBM and others including H2O, RStudio, Jupyter Notebooks on Apache Spark in a single security-rich managed environment.

Bob Picciano, Senior Vice President for IBM Analytics, said that the company's efforts will "accelerate innovation", adding: "IBM’s Data Science Experience is the killer enterprise app for Apache Spark, and gives data scientists new opportunities to deliver insight-driven models to developers, and opens the door for unprecedented innovation from the open source community."

IBM has already integrated Spark into many of its own platforms and products, and last year, it open-sourced its SystemML technology to boost Spark's machine learning capabilities.

Picciano said that, following IBM's announcement today, data scientists will not only enjoy greater access to large data sets, but will also be able to work more efficiently with such large volumes of data.

Further details on the IBM Data Science Experience can be found on the company's site.