
PySpark code runs with errors. How can they be solved? (Language: Python)

Problem description and background

Operating system: Ubuntu 20.04
Spark version: 3.2.1
Hadoop version: 3.3.1
Python version: 3.8.10
Java version: 1.8.202
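
Before digging into the stack trace, it can help to confirm that the interpreter and libraries PyCharm actually runs match the versions listed above. A minimal check, assuming pyspark and its bundled py4j are importable from the active interpreter:

import sys
import subprocess
import pyspark
import py4j

print("Python:", sys.version)            # should report 3.8.10
print("PySpark:", pyspark.__version__)   # should report 3.2.1
print("py4j:", py4j.__version__)         # the py4j build that this PySpark ships with
subprocess.run(["java", "-version"])     # should report a 1.8.x JDK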

from pyspark import SparkConf, SparkContext

# Run Spark locally, with the application named "WordCount"
conf = SparkConf().setAppName("WordCount").setMaster("local")
sc = SparkContext(conf=conf)

# Read the input file from HDFS
inputFile = "hdfs://localhost:9000/user/way/word.txt"
textFile = sc.textFile(inputFile)

# Split each line into words, map each word to (word, 1), then sum the counts per word
wordCount = textFile.flatMap(lambda line: line.split(" ")).map(lambda word: (word, 1)).reduceByKey(lambda a, b: a + b)
wordCount.foreach(print)
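
As a side note (not a fix for the reported error): foreach(print) executes on the executors, so depending on the environment the output may not appear in the driver console. A hedged alternative for a small result set is to collect the pairs back to the driver and print them there, which also shows up in PyCharm's console:

# Assumes the word-count result is small enough to fit in driver memory
for word, count in wordCount.collect():
    print(word, count)

sc.stop()  # release the SparkContext when done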

Results and error output when running in Spark

Results and error output when running in PyCharm

Process finished with exit code 1

My ideas and attempted solutions

I thought the problem was with the py4j file directory, but it turned out not to be. Later I suspected that the error in PyCharm comes from the imported package files; it may be caused by a version-compatibility issue.
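
If version compatibility is the suspect, one thing worth trying is to pin the Python interpreter and Java installation that PySpark uses before the SparkContext is created, so that PyCharm and the terminal resolve the same environment. This is only a sketch; the paths below are placeholders, not known values from this setup, and must be replaced with the actual installation paths:

import os

# Hypothetical paths -- adjust to the interpreter/JDK actually installed
os.environ["PYSPARK_PYTHON"] = "/usr/bin/python3.8"           # Python used by Spark workers
os.environ["PYSPARK_DRIVER_PYTHON"] = "/usr/bin/python3.8"    # Python used by the driver
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"  # Java 1.8 installation

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("WordCount").setMaster("local")
sc = SparkContext(conf=conf)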

What I want to achieve

Get the code to run normally.

