Hello,
From my IDE, I use databricks-connect 9.1 LTS ML to connect to a Databricks cluster running Spark 3.1, in order to download a Spark model that was trained and saved with MLflow.
It seems to find the model and start copying it, but then it fails. The same code works fine in a Databricks notebook; the problem only occurs through databricks-connect from my IDE.
We get the same error with different models in different repositories, and it started appearing recently.
I also have the same problem in another environment, with a 10.4 LTS ML cluster and databricks-connect 10.4.6.
Do you have any idea?
Code:
mlflow.set_tracking_uri("databricks")
model_path = "dbfs:/databricks/mlflow-tracking/197830957424395/7c5e692873874dadae4f67f44c1aa310/artifacts/rfModel"
model_res = mlflow.spark.load_model(model_path)
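For context, the URI above hard-codes the DBFS layout that the Databricks tracking server uses for run artifacts; a `runs:/` URI refers to the same artifacts through the tracking server without hard-coding that layout. A minimal sketch of the path construction (the `dbfs_model_uri` helper is illustrative only, not an MLflow API):

```python
# Hypothetical helper: build the DBFS artifact URI used by the Databricks
# MLflow tracking server for a run's logged model. The path layout mirrors
# the one in the logs above.
def dbfs_model_uri(experiment_id: str, run_id: str, artifact_path: str) -> str:
    return (
        f"dbfs:/databricks/mlflow-tracking/{experiment_id}/{run_id}"
        f"/artifacts/{artifact_path}"
    )

uri = dbfs_model_uri(
    "197830957424395", "7c5e692873874dadae4f67f44c1aa310", "rfModel"
)
# Equivalent runs:/ form, resolved by the tracking server instead:
runs_uri = "runs:/7c5e692873874dadae4f67f44c1aa310/rfModel"
```

`mlflow.spark.load_model` also accepts a `dfs_tmpdir` argument, in case the default `/tmp/mlflow` staging directory seen in the logs is part of the problem.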
See stack trace error:
2022/10/06 15:17:11 INFO mlflow.spark: File 'dbfs:/databricks/mlflow-tracking/197830957424395/7c5e692873874dadae4f67f44c1aa310/artifacts/rfModel/sparkml' not found on DFS. Will attempt to upload the file.
22/10/06 15:17:39 WARN DBFS: DBFS create on /tmp/mlflow/f020cb9a-47b2-49ee-8b12-cf2754db61a9/metadata/part-00000 took 2299 ms
22/10/06 15:17:42 WARN DBFS: DBFS create on /tmp/mlflow/f020cb9a-47b2-49ee-8b12-cf2754db61a9/metadata/_SUCCESS took 1687 ms
22/10/06 15:17:46 WARN DBFS: DBFS mkdir /tmp/mlflow/f020cb9a-47b2-49ee-8b12-cf2754db61a9/stages/0_randomforestclassifier_77e9017cbf4d took 2302 ms
2022/10/06 15:19:13 INFO mlflow.spark: Copied SparkML model to /tmp/mlflow/f020cb9a-47b2-49ee-8b12-cf2754db61a9
View job details at ........ https....
View job details at ........ https.....
22/10/06 15:19:16 ERROR Instrumentation: java.io.StreamCorruptedException: invalid type code: 00
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1698)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
    at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:488)
    at sun.reflect.GeneratedMethodAccessor419.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1184)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2296)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2093)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1655)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2093)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1655)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
    at org.apache.spark.sql.util.ProtoSerializer.$anonfun$deserializeObject$1(ProtoSerializer.scala:6631)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at org.apache.spark.sql.util.ProtoSerializer.deserializeObject(ProtoSerializer.scala:6616)
    at com.databricks.service.SparkServiceRPCHandler.execute0(SparkServiceRPCHandler.scala:728)
    at com.databricks.service.SparkServiceRPCHandler.$anonfun$executeRPC0$1(SparkServiceRPCHandler.scala:477)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.service.SparkServiceRPCHandler.executeRPC0(SparkServiceRPCHandler.scala:372)
    at com.databricks.service.SparkServiceRPCHandler$$anon$2.call(SparkServiceRPCHandler.scala:323)
    at com.databricks.service.SparkServiceRPCHandler$$anon$2.call(SparkServiceRPCHandler.scala:309)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at com.databricks.service.SparkServiceRPCHandler.$anonfun$executeRPC$1(SparkServiceRPCHandler.scala:359)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.service.SparkServiceRPCHandler.executeRPC(SparkServiceRPCHandler.scala:336)
    at com.databricks.service.SparkServiceRPCServlet.doPost(SparkServiceRPCServer.scala:167)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:550)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    at org.eclipse.jetty.server.Server.handle(Server.java:516)
    at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
    at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:383)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
    at java.lang.Thread.run(Thread.java:748)
...
py4j.protocol.Py4JJavaError: An error occurred while calling o588.load.
: java.io.StreamCorruptedException: invalid type code: 00
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1698)
Thanks for your help.
Hello,
I changed the databricks-connect version to 10.4.12 (mlflow version 1.26), but it still doesn't work.
I have winutils.exe under my venv in Lib\site-packages\pyspark\bin.
My HADOOP_HOME environment variable is set correctly.
Python version: 3.8.10.
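The UnsatisfiedLinkError on NativeIO$Windows.access0 in the trace below is the usual symptom of Hadoop's Windows native shims not being found: Hadoop looks under %HADOOP_HOME%\bin, so winutils.exe (and, for access0 specifically, hadoop.dll) need to be there; a copy inside pyspark\bin alone is not enough. A minimal self-check sketch (`find_winutils` is an illustrative helper, not part of any library):

```python
import os
import tempfile
from typing import Optional


def find_winutils(hadoop_home: str) -> Optional[str]:
    """Return the path of winutils.exe under <hadoop_home>/bin, or None.

    Hadoop resolves its Windows native binaries relative to HADOOP_HOME,
    so this is the location that matters, not the pyspark package dir.
    """
    candidate = os.path.join(hadoop_home, "bin", "winutils.exe")
    return candidate if os.path.isfile(candidate) else None


# Demo against a throwaway directory standing in for %HADOOP_HOME%:
with tempfile.TemporaryDirectory() as home:
    os.makedirs(os.path.join(home, "bin"))
    open(os.path.join(home, "bin", "winutils.exe"), "w").close()
    print(find_winutils(home) is not None)  # True
```

In practice the same check would be run with `os.environ["HADOOP_HOME"]`, and hadoop.dll should sit next to winutils.exe with %HADOOP_HOME%\bin also on PATH.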
Thanks for your help.
See stack trace error:
2022/10/14 17:18:09 INFO mlflow.spark: File 'dbfs:/databricks/mlflow-tracking/67260056032267/6580b479a0ba43beaa3dd7971561fbb7/artifacts/model_rf/sparkml' not found on DFS. Will attempt to upload the file.
Traceback (most recent call last):
  File "C:\Users\NSR\py-packages\tests\test_mlflow.py", line 21, in <module>
    model = exp.get_model(alias="model_rf")
  File "C:\Users\NSR\py-packages\ircem\mlflow_user.py", line 178, in get_model
    return mlflow.spark.load_model(model_path)
  File "D:\venv_python\Python38\lib\site-packages\mlflow\spark.py", line 711, in load_model
    return _load_model(model_uri=model_uri, dfs_tmpdir_base=dfs_tmpdir)
  File "D:\venv_python\Python38\lib\site-packages\mlflow\spark.py", line 659, in _load_model
    model_uri = _HadoopFileSystem.maybe_copy_from_uri(model_uri, dfs_tmpdir)
  File "D:\venv_python\Python38\lib\site-packages\mlflow\spark.py", line 382, in maybe_copy_from_uri
    return cls.maybe_copy_from_local_file(_download_artifact_from_uri(src_uri), dst_path)
  File "D:\venv_python\Python38\lib\site-packages\mlflow\spark.py", line 349, in maybe_copy_from_local_file
    cls.copy_from_local_file(src, dst, remove_src=False)
  File "D:\venv_python\Python38\lib\site-packages\mlflow\spark.py", line 331, in copy_from_local_file
    cls._fs().copyFromLocalFile(remove_src, cls._local_path(src), cls._remote_path(dst))
  File "D:\venv_python\Python38\lib\site-packages\py4j\java_gateway.py", line 1304, in __call__
    return_value = get_return_value(
  File "D:\venv_python\Python38\lib\site-packages\pyspark\sql\utils.py", line 117, in deco
    return f(*a, **kw)
  File "D:\venv_python\Python38\lib\site-packages\py4j\protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o334.copyFromLocalFile.
: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:793)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1215)
    at org.apache.hadoop.fs.FileUtil.list(FileUtil.java:1420)
    at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:601)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1972)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:2014)
    at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:761)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:406)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:390)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2482)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2448)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
    at py4j.Gateway.invoke(Gateway.java:295)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
    at java.lang.Thread.run(Unknown Source)