Hello,
I want to set the default "spark.driver.maxResultSize" for a notebook on my cluster. I know I can do that in the cluster settings, but is there a way to set it by code?
I also know how to do it when I start a Spark session, but in my case I load directly from the Feature Store and want to convert my PySpark DataFrame to a pandas DataFrame.
from databricks import feature_store
import pandas as pd
from pyspark.sql import functions as f
from os.path import join

fs = feature_store.FeatureStoreClient()
prediction_data = fs.read_table(name=name)
prediction_data_pd = prediction_data.toPandas()
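For reference, this is how I check the limit currently in effect from the notebook (a sketch, assuming the built-in spark session that Databricks provides; the 'not set' fallback string is mine):

from pyspark.sql import SparkSession

# In a Databricks notebook a session already exists, so this just returns it.
spark = SparkSession.builder.getOrCreate()

# toPandas() collects the whole result to the driver, which is exactly what
# spark.driver.maxResultSize gates, so it is worth reading the value first.
print(spark.conf.get('spark.driver.maxResultSize', 'not set'))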
Hi @Maximilian Hansinger,
Please try this:
from pyspark import SparkContext
from pyspark import SparkConf

conf = SparkConf().setMaster('yarn') \
    .setAppName('xyz') \
    .set('spark.driver.extraClassPath', '/usr/local/bin/postgresql-42.2.5.jar') \
    .set('spark.executor.instances', '4') \
    .set('spark.executor.cores', '4') \
    .set('spark.executor.memory', '10g') \
    .set('spark.driver.memory', '15g') \
    .set('spark.memory.offHeap.enabled', 'true') \
    .set('spark.memory.offHeap.size', '20g') \
    .set('spark.driver.maxResultSize', '4096')
spark_context = SparkContext(conf=conf)
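One caveat: SparkContext(conf=conf) raises an error if a context is already running, which is always the case in a Databricks notebook. A minimal variant, assuming you still want to go through the conf object built above, is:

from pyspark import SparkContext

# Reuses the running context if one exists; in that case the conf passed
# here is ignored, so driver settings must be in place before first startup.
spark_context = SparkContext.getOrCreate(conf=conf)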
Hi @Maximilian Hansinger, alternatively try this:
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master('yarn')  # depending on your cluster manager of choice
         .appName('xyz')
         .config('spark.driver.extraClassPath', '/usr/local/bin/postgresql-42.2.5.jar')
         .config('spark.executor.instances', '4')
         .config('spark.executor.cores', '4')
         .config('spark.executor.memory', '10g')
         .config('spark.driver.memory', '15g')
         .config('spark.memory.offHeap.enabled', 'true')
         .config('spark.memory.offHeap.size', '20g')
         .config('spark.driver.maxResultSize', '4096')
         .getOrCreate())
sc = spark.sparkContext
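To check whether the setting actually took effect, you can read it back from the active session (a sketch; note that getOrCreate() returns any pre-existing session, in which case static driver settings such as spark.driver.maxResultSize keep their startup values, and that a bare '4096' is likely read as bytes, so a unit suffix such as '4g' is probably what you want):

# Read the value back; the 'not set' fallback is just for illustration.
print(spark.conf.get('spark.driver.maxResultSize', 'not set'))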