AWS Databricks Cloud Integration Demo


Databricks runs on AWS and integrates with the major AWS services you already use, such as S3, EC2 and Redshift. In this demo, we’ll show you how Databricks connects to each of these services so that you can build a lakehouse architecture.


Video transcript

Databricks Lakehouse on AWS overview

The Databricks Lakehouse Platform sits at the heart of the AWS ecosystem and integrates with popular data and AI services such as Kinesis streams, S3 buckets, Glue, Athena, Redshift and QuickSight. In this demo, we’ll show you how Databricks integrates with each of these services.

Connecting to EC2, S3, Glue and IAM

When we start up a Spark cluster on Databricks, we can configure it to use the Glue Data Catalog and attach an IAM instance profile to it. The instance profile controls which S3 buckets and other AWS services the cluster's EC2 instances can access.

One of the first things you do when working with Databricks on AWS is set up a Spark cluster in your Virtual Private Cloud (VPC). The cluster can autoscale up and down to control cloud costs as your data workloads change. Databricks Spark clusters run on EC2 instances on the back end, and you can configure them to use the AWS Glue Data Catalog as their metastore. You can also attach an AWS instance profile to the cluster to control and manage its access to S3 buckets and other resources.
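The cluster settings described above can be sketched as a cluster definition of the kind accepted by the Databricks Clusters API. This is a minimal illustration, not the configuration used in the demo: the helper name `build_cluster_spec`, the cluster name, the runtime version, the node type and the instance profile ARN are all placeholder assumptions. The block only builds and prints the definition; it does not call any Databricks or AWS endpoint.

```python
import json


def build_cluster_spec(cluster_name, instance_profile_arn, min_workers=2, max_workers=8):
    """Return a dict shaped like a Databricks Clusters API request body.

    Hypothetical helper for illustration only; it is not part of any
    Databricks SDK.
    """
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return {
        "cluster_name": cluster_name,
        # Placeholder runtime and EC2 node type; pick values valid in your workspace.
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        # Autoscaling bounds: the cluster grows and shrinks between these limits.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Instance profile that governs the cluster's access to S3 and other resources.
        "aws_attributes": {"instance_profile_arn": instance_profile_arn},
        # Spark setting that makes the cluster use the AWS Glue Data Catalog as its metastore.
        "spark_conf": {
            "spark.databricks.hive.metastore.glueCatalog.enabled": "true"
        },
    }


# Placeholder ARN with a dummy account ID, used only to show the shape of the output.
spec = build_cluster_spec(
    "demo-cluster",
    "arn:aws:iam::123456789012:instance-profile/demo-profile",
)
print(json.dumps(spec, indent=2))
```

The same settings can be entered in the cluster creation UI; a definition like this is mainly useful when clusters are created through automation.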
