Our most popular open source projects
Apache Spark is a unified engine for executing data engineering, data science and ML workloads.
Apache Spark™
Delta Lake lets you build a lakehouse architecture on top of storage systems such as AWS S3, ADLS, GCS and HDFS.
MLflow manages the ML lifecycle, including experimentation, reproducibility, deployment and a central model registry.
Redash enables anyone to leverage SQL to explore, query, visualize, and share data from both big and small data sources.
Delta Sharing is the industry’s first open protocol for secure data sharing, making it simple to share data with other organizations.
Databricks supports these additional popular open source technologies
Databricks supports TensorFlow, a library for deep learning and general computation on clusters.
Facebook, the creator of PyTorch, and Databricks have collaborated on integrations.
Deep learning API written in Python, running on top of TensorFlow. Available in Databricks Runtime for ML.
An open source suite of tools for collaborative data science using R.
Widely used Python package for machine learning built on top of NumPy, SciPy and Matplotlib.
A distributed gradient boosting library that has bindings in languages such as Python, R and C++.
HashiCorp Terraform is a popular open source tool for creating safe and predictable cloud infrastructure across several cloud providers. The Databricks Terraform provider lets customers manage their entire Databricks workspaces alongside the rest of their infrastructure with a single flexible, powerful tool. Using Terraform also encourages customers to adopt infrastructure as code (IaC) best practices.