Databricks Runtime 6.0 for ML (Unsupported)

Databricks released this image in October 2019.

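Databricks Runtime 6.0 for Machine Learning provides a ready-to-go environment for machine learning and data science, based on Databricks Runtime 6.0 (Unsupported). Databricks Runtime ML contains many popular machine learning libraries, including TensorFlow, PyTorch, Keras, and XGBoost. It also supports distributed deep learning training using Horovod.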

For more information, including instructions for creating a Databricks Runtime ML cluster, see Introduction to Databricks Runtime for Machine Learning.

New features

Databricks Runtime 6.0 ML is built on top of Databricks Runtime 6.0. For information on what's new in Databricks Runtime 6.0, see the Databricks Runtime 6.0 (Unsupported) release notes.

Query MLflow experiment data at scale using the new MLflow Spark data source

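The Spark data source for MLflow experiments now provides a standard API to load MLflow experiment run data. This enables large-scale querying and analysis of MLflow experiment data using DataFrame APIs. For a given experiment, the DataFrame contains run_ids, metrics, params, tags, start_time, end_time, status, and the artifact_uri of the artifacts. See MLflow experiment.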
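As a rough illustration of loading experiment runs through the data source, the sketch below wraps the read in a small helper. The helper name load_experiment_runs and the placeholder experiment ID are hypothetical. It assumes a Databricks Runtime ML cluster where the data source is registered under the format name "mlflow-experiment" and where `spark` is the active SparkSession.

```python
def load_experiment_runs(spark, experiment_id):
    """Load the runs of one MLflow experiment as a Spark DataFrame.

    Assumption: the cluster registers the MLflow experiment data source
    under the format name "mlflow-experiment".
    """
    return spark.read.format("mlflow-experiment").load(experiment_id)


# Example usage in a Databricks notebook, where `spark` is predefined
# ("<experiment-id>" is a placeholder, not a real ID):
#   runs = load_experiment_runs(spark, "<experiment-id>")
#   runs.select("metrics", "params", "status").show()
```

Because the result is an ordinary DataFrame, the usual filtering, aggregation, and join operations apply to the run data.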

Improvements

  • Hyperopt GA

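    Hyperopt on Databricks is now generally available. Notable improvements since the public preview include support for MLflow logging on Spark workers, correct handling of PySpark broadcast variables, and a new guide on model selection using Hyperopt. We also fixed small bugs in log messages, error handling, and the UI, and made our documentation easier to read. For details, see the Hyperopt documentation.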

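    We have updated how Databricks logs Hyperopt experiments so that you can now log a custom metric during Hyperopt runs by passing the metric to the mlflow.log_metric function (see log_metric). This is useful if you want to log custom metrics in addition to the loss when the hyperopt.fmin function is called.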

  • MLflow

    • Added MLflow Java client 1.2.0

    • MLflow is now promoted to a top-tier library

  • Upgraded machine learning libraries

    • Horovod upgraded from 0.16.4 to 0.18.1

    • MLflow upgraded from 1.0.0 to 1.2.0

  • Anaconda distribution upgraded from 5.2.0 to 2019.03
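The custom metric logging described under Hyperopt GA above might look like the following sketch. The quadratic objective, the metric name abs_error, and the helper names make_objective and run_search are hypothetical. It assumes a Databricks Runtime ML cluster where mlflow, hyperopt, and SparkTrials are available.

```python
def make_objective(log_metric):
    """Build a Hyperopt objective that also records a custom metric.

    `log_metric` is expected to behave like mlflow.log_metric(key, value).
    """
    def objective(x):
        loss = (x - 3.0) ** 2                  # the value Hyperopt minimizes
        log_metric("abs_error", abs(x - 3.0))  # custom metric besides the loss
        return loss

    return objective


def run_search(max_evals=16, parallelism=4):
    """Run a small search with SparkTrials (needs a Spark cluster)."""
    # Imported here so the objective above can be defined and checked
    # without the cluster libraries installed.
    import mlflow
    from hyperopt import SparkTrials, fmin, hp, tpe

    return fmin(
        fn=make_objective(mlflow.log_metric),
        space=hp.uniform("x", -10.0, 10.0),
        algo=tpe.suggest,
        max_evals=max_evals,
        trials=SparkTrials(parallelism=parallelism),
    )
```

Passing the logging function in as an argument keeps the objective easy to check locally with a stand-in logger before running it on a cluster.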

Removals

  • Databricks ML Model Export is removed. Use MLeap for importing and exporting models instead.

  • The following properties of hyperopt.SparkTrials are removed:

    • SparkTrials.successful_trials_count

    • SparkTrials.failed_trials_count

    • SparkTrials.cancelled_trials_count

    • SparkTrials.total_trials_count

    They are replaced by the following functions:

    • SparkTrials.count_successful_trials()

    • SparkTrials.count_failed_trials()

    • SparkTrials.count_cancelled_trials()

    • SparkTrials.count_total_trials()
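Code that read the removed properties can switch to the replacement functions listed above. The sketch below shows one way to collect the counts. The helper name summarize_trials is hypothetical, and `trials` is assumed to be a hyperopt.SparkTrials instance that has already been passed to hyperopt.fmin.

```python
def summarize_trials(trials):
    """Collect trial counts through the method-based SparkTrials API.

    Before this release the same values were read as properties,
    for example trials.successful_trials_count.
    """
    return {
        "successful": trials.count_successful_trials(),
        "failed": trials.count_failed_trials(),
        "cancelled": trials.count_cancelled_trials(),
        "total": trials.count_total_trials(),
    }
```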

System environment

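The system environment in Databricks Runtime 6.0 ML differs from that of Databricks Runtime 6.0.

The following sections list the libraries included in Databricks Runtime 6.0 ML that differ from those included in Databricks Runtime 6.0.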

Top-tier libraries

Databricks Runtime 6.0 ML includes the following top-tier libraries:

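Python libraries

Databricks Runtime 6.0 ML uses Conda for Python package management and includes many popular ML packages. The following section describes the Conda environment for Databricks Runtime 6.0 ML.

Python 3 on CPU clusters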

name: databricks-ml
channels:
  - pytorch
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _py-xgboost-mutex=2.0=cpu_0
  - _tflow_select=2.3.0=mkl
  - absl-py=0.7.1=py37_0
  - asn1crypto=0.24.0=py37_0
  - astor=0.8.0=py37_0
  - backcall=0.1.0=py37_0
  - backports=1.0=py_2
  - bcrypt=3.1.6=py37h7b6447c_0
  - blas=1.0=mkl
  - boto=2.49.0=py37_0
  - boto3=1.9.162=py_0
  - botocore=1.12.163=py_0
  - c-ares=1.15.0=h7b6447c_1001
  - ca-certificates=2019.1.23=0
  - certifi=2019.3.9=py37_0
  - cffi=1.12.2=py37h2e261b9_1
  - chardet=3.0.4=py37_1003
  - click=7.0=py37_0
  - cloudpickle=0.8.0=py37_0
  - colorama=0.4.1=py37_0
  - configparser=3.7.4=py37_0
  - cryptography=2.6.1=py37h1ba5d50_0
  - cycler=0.10.0=py37_0
  - cython=0.29.6=py37he6710b0_0
  - decorator=4.4.0=py37_1
  - docutils=0.14=py37_0
  - entrypoints=0.3=py37_0
  - et_xmlfile=1.0.1=py37_0
  - flask=1.0.2=py37_1
  - freetype=2.9.1=h8a8886c_1
  - future=0.17.1=py37_0
  - gast=0.2.2=py37_0
  - gitdb2=2.0.5=py37_0
  - gitpython=2.1.11=py37_0
  - grpcio=1.16.1=py37hf8bcb03_1
  - gunicorn=19.9.0=py37_0
  - h5py=2.9.0=py37h7918eee_0
  - hdf5=1.10.4=hb1b8bf9_0
  - html5lib=1.0.1=py_0
  - icu=58.2=h9c2bf20_1
  - idna=2.8=py37_0
  - intel-openmp=2019.3=199
  - ipython=7.4.0=py37h39e3cac_0
  - ipython_genutils=0.2.0=py37_0
  - itsdangerous=1.1.0=py37_0
  - jdcal=1.4=py37_0
  - jedi=0.13.3=py37_0
  - jinja2=2.10=py37_0
  - jmespath=0.9.4=py_0
  - jpeg=9b=h024ee3a_2
  - keras=2.2.4=0
  - keras-applications=1.0.8=py_0
  - keras-base=2.2.4=py37_0
  - keras-preprocessing=1.1.0=py_1
  - kiwisolver=1.0.1=py37hf484d3e_0
  - krb5=1.16.1=h173b8e3_7
  - libedit=3.1.20181209=hc058e9b_0
  - libffi=3.2.1=hd88cf55_4
  - libgcc-ng=8.2.0=hdf63c60_1
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libpng=1.6.36=hbc83047_0
  - libpq=11.2=h20c2e04_0
  - libprotobuf=3.8.0=hd408876_0
  - libsodium=1.0.16=h1bed415_0
  - libstdcxx-ng=8.2.0=hdf63c60_1
  - libtiff=4.0.10=h2733197_2
  - libxgboost=0.90=he6710b0_0
  - libxml2=2.9.9=hea5a465_1
  - libxslt=1.1.33=h7d1a2b0_0
  - llvmlite=0.28.0=py37hd408876_0
  - lxml=4.3.2=py37hefd8a0e_0
  - mako=1.0.10=py_0
  - markdown=3.1.1=py37_0
  - markupsafe=1.1.1=py37h7b6447c_0
  - mkl=2019.3=199
  - mkl_fft=1.0.10=py37ha843d7b_0
  - mkl_random=1.0.2=py37hd81dba3_0
  - mock=3.0.5=py37_0
  - ncurses=6.1=he6710b0_1
  - networkx=2.2=py37_1
  - ninja=1.9.0=py37hfd86e86_0
  - nose=1.3.7=py37_2
  - numba=0.43.1=py37h962f231_0
  - numpy=1.16.2=py37h7e9f1db_0
  - numpy-base=1.16.2=py37hde5b4d6_0
  - olefile=0.46=py37_0
  - openpyxl=2.6.1=py37_1
  - openssl=1.1.1b=h7b6447c_1
  - pandas=0.24.2=py37he6710b0_0
  - paramiko=2.4.2=py37_0
  - parso=0.3.4=py37_0
  - pathlib2=2.3.3=py37_0
  - patsy=0.5.1=py37_0
  - pexpect=4.6.0=py37_0
  - pickleshare=0.7.5=py37_0
  - pillow=5.4.1=py37h34e0f95_0
  - pip=19.0.3=py37_0
  - ply=3.11=py37_0
  - prompt_toolkit=2.0.9=py37_0
  - protobuf=3.8.0=py37he6710b0_0
  - psutil=5.6.1=py37h7b6447c_0
  - psycopg2=2.7.6.1=py37h1ba5d50_0
  - ptyprocess=0.6.0=py37_0
  - py-xgboost=0.90=py37he6710b0_0
  - py-xgboost-cpu=0.90=py37_0
  - pyasn1=0.4.6=py_0
  - pycparser=2.19=py37_0
  - pygments=2.3.1=py37_0
  - pymongo=3.8.0=py37he6710b0_1
  - pynacl=1.3.0=py37h7b6447c_0
  - pyopenssl=19.0.0=py37_0
  - pyparsing=2.3.1=py37_0
  - pysocks=1.6.8=py37_0
  - python=3.7.3=h0371630_0
  - python-dateutil=2.8.0=py37_0
  - python-editor=1.0.4=py_0
  - pytorch-cpu=1.1.0=py3.7_cpu_0
  - pytz=2018.9=py37_0
  - pyyaml=5.1=py37h7b6447c_0
  - readline=7.0=h7b6447c_5
  - requests=2.21.0=py37_0
  - s3transfer=0.2.1=py37_0
  - scikit-learn=0.20.3=py37hd81dba3_0
  - scipy=1.2.1=py37h7c811a0_0
  - setuptools=40.8.0=py37_0
  - simplejson=3.16.0=py37h14c3975_0
  - singledispatch=3.4.0.3=py37_0
  - six=1.12.0=py37_0
  - smmap2=2.0.5=py37_0
  - sqlite=3.27.2=h7b6447c_0
  - sqlparse=0.3.0=py_0
  - statsmodels=0.9.0=py37h035aef0_0
  - tabulate=0.8.3=py37_0
  - tensorboard=1.13.1=py37hf484d3e_0
  - tensorflow=1.13.1=mkl_py37h54b294f_0
  - tensorflow-base=1.13.1=mkl_py37h7ce6ba3_0
  - tensorflow-estimator=1.13.0=py_0
  - tensorflow-mkl=1.13.1=h4fcabd2_0
  - termcolor=1.1.0=py37_1
  - tk=8.6.8=hbc83047_0
  - torchvision-cpu=0.3.0=py37_cuNone_1
  - tqdm=4.31.1=py37_1
  - traitlets=4.3.2=py37_0
  - urllib3=1.24.1=py37_0
  - virtualenv=16.0.0=py37_0
  - wcwidth=0.1.7=py37_0
  - webencodings=0.5.1=py37_1
  - websocket-client=0.56.0=py37_0
  - werkzeug=0.14.1=py37_0
  - wheel=0.33.1=py37_0
  - wrapt=1.11.1=py37h7b6447c_0
  - xz=5.2.4=h14c3975_4
  - yaml=0.1.7=had09818_2
  - zlib=1.2.11=h7b6447c_3
  - zstd=1.3.7=h0b5b093_0
  - pip:
    - argparse==1.4.0
    - databricks-cli==0.9.0
    - docker==4.0.2
    - fusepy==2.0.4
    - gorilla==0.3.0
    - horovod==0.18.1
    - hyperopt==0.1.2.db8
    - matplotlib==3.0.3
    - mleap==0.8.1
    - mlflow==1.2.0
    - nose-exclude==0.5.0
    - pyarrow==0.13.0
    - querystring-parser==1.2.4
    - seaborn==0.9.0
    - tensorboardx==1.8
prefix: /databricks/conda/envs/databricks-ml

Spark packages containing Python modules

Spark Package         Python Module   Version
graphframes           graphframes     0.7.0-db1-spark2.4
spark-deep-learning   sparkdl         1.5.0-db5-spark2.4
tensorframes          tensorframes    0.7.0-s_2.11

Java and Scala libraries (Scala 2.11 cluster)

In addition to the Java and Scala libraries in Databricks Runtime 6.0, Databricks Runtime 6.0 ML contains the following JARs:

Group ID            Artifact ID                       Version
com.databricks      spark-deep-learning               1.5.0-db5-spark2.4
com.typesafe.akka   akka-actor_2.11                   2.3.11
ml.combust.mleap    mleap-databricks-runtime_2.11     0.14.0
ml.dmlc             xgboost4j                         0.90
ml.dmlc             xgboost4j-spark                   0.90
org.graphframes     graphframes_2.11                  0.7.0-db1-spark2.4
org.mlflow          mlflow-client                     1.2.0
org.tensorflow      libtensorflow                     1.13.1
org.tensorflow      libtensorflow_jni                 1.13.1
org.tensorflow      spark-tensorflow-connector_2.11   1.13.1
org.tensorflow      tensorflow                        1.13.1
org.tensorframes    tensorframes                      0.7.0-s_2.11