1. Version conflicts when running Wang Zhe's https://github.com/wzhe06/SparrowRecSys code with my own setup
Kept hitting Exception in thread "main" java.lang.IllegalArgumentException: Unsupported class file major version 55. Searching online said it is a version mismatch, and it turned out the Java versions in the IDEA settings were inconsistent: the project was set to 11 (File -> Project Structure -> Project Settings -> Project) while IntelliJ IDEA itself used 8. Setting both to 1.8 solved it completely.
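For reference, the "major version" in this error comes straight out of the .class file header (bytes 6-7), and each number maps to a JDK release: 52 is Java 8, 55 is Java 11. A minimal stdlib sketch to check which JDK a class file targets:

```python
import struct

# Class-file major versions and the JDK releases they correspond to
# (from the JVM specification): 52 -> Java 8, ..., 55 -> Java 11.
JDK_BY_MAJOR = {52: "Java 8", 53: "Java 9", 54: "Java 10", 55: "Java 11"}

def class_file_jdk(header: bytes) -> str:
    """Return the JDK release a .class file header was compiled for."""
    # Header layout: u4 magic (0xCAFEBABE), u2 minor_version, u2 major_version.
    magic, minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a class file")
    return JDK_BY_MAJOR.get(major, f"major {major}")
```

So "major version 55" means the class was compiled for Java 11, which an 8-level runtime cannot load.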
2. Error setting up a Spark environment with Maven
Error:(3, 12) object apache is not a member of package org
import org.apache.spark.SparkConf
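The notes stop at the error here; in general this compile error means the Spark dependency never made it onto the Maven classpath. A hedged sketch of the pom.xml fragment to add (the version numbers are assumptions, mirroring the spark 2.3.0 / Scala 2.11 build from the next item):

```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.0</version>
</dependency>
```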
3. Errors compiling Spark
(1) Spark Project Parent POM ........................... FAILURE
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-clean-plugin:3.0.0:clean (default-clean) on project spark-parent_2.11: Execution default-clean of goal org.apache.maven.plugins:maven-clean-plugin:3.0.0:clean failed: Plugin org.apache.maven.plugins:maven-clean-plugin:3.0.0 or one of its dependencies could not be resolved: Could not transfer artifact org.codehaus.plexus:plexus-component-annotations:jar:1.5.5 from/to central (https://repo.maven.apache.org/maven2): transfer failed for https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-component-annotations/1.5.5/plexus-component-annotations-1.5.5.jar: Operation timed out (Read failed) ->
Solution: before compiling, first run on the command line: mvn clean -Dmaven.clean.failOnError=false
(2)[INFO] Spark Project Launcher ............................. FAILURE
[ERROR] Failed to execute goal on project spark-launcher_2.11: Could not resolve dependencies for project org.apache.spark:spark-launcher_2.11:jar:2.3.0: Could not find artifact org.apache.hadoop:hadoop-client:jar:2.6.0-cdh5.7.0 in central (https://repo.maven.apache.org/maven2) -> [Help 1]
Solution: add the Cloudera repository to pom.xml (inside the <repositories> section):
<repository>
  <id>cloudera</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
4. Error submitting in Spark standalone mode
Exception: Python in worker has different version 2.7 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
Solution: this is a Python version issue; configure PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON in spark-env.sh under the conf directory. My machine runs Python 3.7; locate the install path with which python3.7 and copy the printed path. Mine is /Users/hh/anaconda3/bin/python3.7, so add the following lines to spark-env.sh:
PYSPARK_PYTHON=/Users/hh/anaconda3/bin/python3.7
PYSPARK_DRIVER_PYTHON=/Users/hh/anaconda3/bin/python3.7
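The check Spark is complaining about can be sketched roughly as follows (a simplified illustration, not Spark's actual code; the function names are hypothetical):

```python
import os
import sys

def effective_worker_python() -> str:
    # PySpark launches workers with PYSPARK_PYTHON when it is set;
    # otherwise it falls back to the interpreter that started the driver.
    return os.environ.get("PYSPARK_PYTHON", sys.executable)

def minor_versions_match(driver: str, worker: str) -> bool:
    # Driver and worker must agree on major.minor:
    # "2.7" vs "3.7" fails, while "3.7.4" vs "3.7.9" is fine.
    return driver.split(".")[:2] == worker.split(".")[:2]
```

Pointing both environment variables at the same interpreter guarantees the two sides agree.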
Another error in standalone mode:
21/06/05 16:07:58 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Solution: the warning says resources are insufficient. The cluster UI showed Memory in use: 7.0 GB Total, 1024.0 MB Used; a PySpark shell was also running on the same machine, and stopping it fixed the problem.
5. Error submitting in Spark YARN mode
Exception in thread "main" java.lang.Exception: When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
Solution: the official docs say to ensure that HADOOP_CONF_DIR or YARN_CONF_DIR points to the directory which contains the (client side) configuration files for the Hadoop cluster, so configure it in spark-env.sh under the conf directory:
HADOOP_CONF_DIR=/Users/hh/app/hadoop-2.6.0-cdh5.15.1/etc/hadoop
6. NLP: PyTorch data.Field
torch version: 1.10.1, spacy version: 3.2.1, torchtext version: 0.11.1
import torch
from torchtext.legacy import data
from spacy.lang import en
TEXT = data.Field(tokenize='spacy', tokenizer_language='en')
Error: Can't find model 'en'. It looks like you're trying to load a model from a shortcut, which is obsolete as of spaCy v3.0.
Solution: pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0.tar.gz