I am facing an error when I try to read data from any MongoDB collection using the MongoDB Spark Connector v10.x on Databricks v13.x.
The following error seems to originate at line #113 of the MongoDB Spark Connector library (v10.2.0):
java.lang.NoSuchMethodError: org.apache.spark.sql.types.DataType.sameType(Lorg/apache/spark/sql/types/DataType;)Z
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
File <command-3492412077247672>:6
      1 mongo_opts = {"connection.uri": conf.mongodb.read_uri,
      2               "database": "setorizacao",
      3               "collection": "export",
      4               "outputExtendedJson": "true"}
----> 6 mongo_outl = spark.read.load(format="mongodb", **mongo_opts)

File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.<locals>.wrapper(*args, **kwargs)
     46 start = time.perf_counter()
     47 try:
---> 48     res = func(*args, **kwargs)
     49     logger.log_success(
     50         module_name, class_name, function_name, time.perf_counter() - start, signature
     51     )
     52     return res

File /databricks/spark/python/pyspark/sql/readwriter.py:314, in DataFrameReader.load(self, path, format, schema, **options)
    312     return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
    313 else:
--> 314     return self._df(self._jreader.load())

File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
   1316 command = proto.CALL_COMMAND_NAME +\
   1317     self.command_header +\
   1318     args_command +\
   1319     proto.END_COMMAND_PART
   1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
   1323     answer, self.gateway_client, self.target_id, self.name)
   1325 for temp_arg in temp_args:
   1326     if hasattr(temp_arg, "_detach"):

File /databricks/spark/python/pyspark/errors/exceptions/captured.py:188, in capture_sql_exception.<locals>.deco(*a, **kw)
    186 def deco(*a: Any, **kw: Any) -> Any:
    187     try:
--> 188         return f(*a, **kw)
    189     except Py4JJavaError as e:
    190         converted = convert_exception(e.java_exception)

File /databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
    324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325 if answer[1] == REFERENCE_TYPE:
--> 326     raise Py4JJavaError(
    327         "An error occurred while calling {0}{1}{2}.\n".
    328         format(target_id, ".", name), value)
    329 else:
    330     raise Py4JError(
    331         "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
    332         format(target_id, ".", name, value))

Py4JJavaError: An error occurred while calling o1020.load.
: java.lang.NoSuchMethodError: org.apache.spark.sql.types.DataType.sameType(Lorg/apache/spark/sql/types/DataType;)Z
	at com.mongodb.spark.sql.connector.schema.InferSchema.lambda$inferSchema$4(InferSchema.java:103)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
	at com.mongodb.spark.sql.connector.schema.InferSchema.inferSchema(InferSchema.java:112)
	at com.mongodb.spark.sql.connector.schema.InferSchema.inferSchema(InferSchema.java:78)
	at com.mongodb.spark.sql.connector.MongoTableProvider.inferSchema(MongoTableProvider.java:60)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils$.getTableFromProvider(DataSourceV2Utils.scala:91)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils$.loadV2Source(DataSourceV2Utils.scala:138)
	at org.apache.spark.sql.DataFrameReader.$anonfun$load$1(DataFrameReader.scala:333)
	at scala.Option.flatMap(Option.scala:271)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:331)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:226)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
	at py4j.Gateway.invoke(Gateway.java:306)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
	at java.lang.Thread.run(Thread.java:750)
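For reference, here is a minimal sketch of the failing read, plus a possible mitigation I have been considering: passing an explicit schema, which should let the connector skip the InferSchema path where the NoSuchMethodError is thrown. This is only a sketch under assumptions; the URI, collection name, and schema fields are placeholders, not the real values from my job.

```python
# Sketch of the failing read with a possible mitigation (explicit schema).
# All names below (URI, collection, schema fields) are placeholders.
mongo_opts = {
    "connection.uri": "mongodb+srv://<user>:<password>@<cluster>/",  # placeholder
    "database": "setorizacao",
    "collection": "export",          # placeholder collection name
    "outputExtendedJson": "true",
}

def read_with_schema(spark, schema):
    # With an explicit schema, the connector should not need to call
    # InferSchema.inferSchema (the frame raising NoSuchMethodError),
    # so this may sidestep the incompatibility.
    return (
        spark.read.format("mongodb")
        .options(**mongo_opts)
        .schema(schema)
        .load()
    )

if __name__ == "__main__":
    # Guarded so the snippet can be inspected without a running cluster.
    from pyspark.sql import SparkSession
    from pyspark.sql.types import StringType, StructField, StructType

    spark = SparkSession.builder.getOrCreate()
    schema = StructType([StructField("_id", StringType(), True)])  # placeholder schema
    read_with_schema(spark, schema).show(5)
```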
I have tested every connector version from 10.1.0 to 10.2.0. I have also tested every Databricks 13 runtime from 13.0 to 13.2, and MongoDB server versions 5 and 6 (Atlas).
Right now I am installing the library from the Maven repository with the coordinates org.mongodb.spark:mongo-spark-connector_2.12:10.2.0, but previously I also used the official jar file available at this link.
With version 3.0.2 of the Spark connector, both read and write operations work. Write operations also work fine with the v10.x connector.
I tried reading the same MongoDB collection on a local Spark setup, and that works fine. For that test I used Spark 3.4.1, Java 11.0.19 (Azul Zulu), and Python 3.10.6 (PySpark).
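To compare the two environments side by side, a small helper along these lines can collect the versions involved. This is my own hypothetical sketch (`environment_report` is not part of any library); it only reports what is installed where it runs.

```python
# Sketch of an environment-comparison helper (hypothetical, not from any library).
import platform
import sys

def environment_report():
    """Collect the interpreter and Spark details relevant to the comparison."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    try:
        import pyspark  # present only where Spark is installed
        report["pyspark"] = pyspark.__version__
    except ImportError:
        report["pyspark"] = None
    return report
```

Running this on both the local setup and the Databricks cluster makes it easy to spot which Spark/Python versions actually differ.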
This error does not happen on Databricks 12.2 and earlier.