I'm reading data from a URL using Spark on Community Edition and getting a path-related error. Any suggestions?
url = "https://raw.githubusercontent.com/thomaspernet/data_csv_r/master/data/adult.csv"
from pyspark import SparkFiles
spark.sparkContext.addFile(url)
# sc.addFile(url)
# sqlContext = SQLContext(sc)
# df = sqlContext.read.csv(SparkFiles.get("adult.csv"), header=True, inferSchema=True)
df = spark.read.csv(SparkFiles.get("adult.csv"), header=True, inferSchema=True)
Error:
Path does not exist: dbfs:/local_disk0/spark-9f23ed57-133e-41d5-91b2-12555d641961/userFiles-d252b3ba-499c-42c9-be48-96358357fb75/adult.csv
Hi, regarding SparkFiles: as far as I know, this functionality does not behave correctly on Azure.
The discussion is here:
https://community.m.eheci.com/s/question/0D53f00001XD3pjCAD/sparkfiles-strange-behavior-on-azure-...
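For what it's worth, a workaround that often gets suggested for this kind of "Path does not exist: dbfs:/..." error is to pass an explicit file: URI, so that the local path returned by SparkFiles.get is not resolved against the default filesystem. A minimal sketch, assuming the file really was downloaded to the driver's local disk; to_file_uri and read_added_csv are hypothetical helper names, and the path in the last line is made up for illustration:

```python
def to_file_uri(local_path: str) -> str:
    """Prefix an absolute local path with the file: scheme."""
    if not local_path.startswith("/"):
        raise ValueError("expected an absolute path, got: " + local_path)
    return "file://" + local_path


def read_added_csv(spark, file_name: str):
    """Read a file previously registered with addFile from local disk.

    `spark` is an existing SparkSession; `file_name` is the base name
    of the file that was passed to spark.sparkContext.addFile.
    """
    # Imported lazily so to_file_uri can be used without Spark installed.
    from pyspark import SparkFiles

    uri = to_file_uri(SparkFiles.get(file_name))
    return spark.read.csv(uri, header=True, inferSchema=True)


# Hypothetical path, only to show the shape of the resulting URI.
print(to_file_uri("/local_disk0/tmp/adult.csv"))  # file:///local_disk0/tmp/adult.csv
```

On a multi-node cluster the executors may not see the file at that same path, so treat this as a starting point rather than a fix.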
Sorry to bring this back up…
from pyspark import SparkFiles
url = "http://raw.githubusercontent.com/ltregan/ds-data/main/authors.csv"
spark.sparkContext.addFile(url)
df = spark.read.csv("file://" + SparkFiles.get("authors.csv"), header=True, inferSchema=True)
df.show()
I get this empty output:
++
||
++
++
Any ideas? This is Spark 3.2.2 on a Mac M1.
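One thing that might be worth checking first is whether the file that addFile fetched actually has any content. The snippet above uses an http:// URL where the first post used https://, and if the redirect to HTTPS was not followed (an assumption, not something confirmed in this thread), the local copy could be empty. A small sketch for inspecting it; local_file_report is a hypothetical helper name:

```python
import os


def local_file_report(path: str) -> dict:
    """Report whether a local file exists and how many bytes it holds."""
    exists = os.path.isfile(path)
    size = os.path.getsize(path) if exists else 0
    return {"path": path, "exists": exists, "size_bytes": size}
```

After spark.sparkContext.addFile(url), calling local_file_report(SparkFiles.get("authors.csv")) shows what landed on disk. If size_bytes is 0, the problem is in the download rather than in read.csv, and trying the https:// form of the URL would be the next step.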