Back in mid-March, Microsoft announced the general availability of Azure Databricks, which has now received a series of additions. Before diving into those announcements, however, a bit of context is required.
As some may know, Databricks is actually a standalone company, co-founded in 2013 by Matei Zaharia, formerly of UC Berkeley's AMPLab. The company's goal is to help clients who wish to take advantage of the Apache Spark cluster-computing framework.
The framework itself was created by Zaharia in 2009 as Spark and open-sourced in 2010. In 2013 it was donated to the Apache Software Foundation, where it became Apache Spark and moved to the Apache 2.0 open-source license.
With all that in mind, Azure Databricks is an analytics platform that leverages the Apache Spark framework, optimizing it for Microsoft's cloud. The platform is aimed at big data and AI workloads, which makes the general availability of its RStudio integration all the more welcome. RStudio is an IDE for the R language that's quite popular in the data science community, and it can now run inside Azure Databricks.
The platform is also extending to four new regions - namely UK South, UK West, Australia East, and Australia Southeast - and supports the recently announced preview of Azure Data Lake Storage Gen2. For those not familiar, this is another big data-focused solution, specifically a "hyper-scale repository" for data analytics.