A Record of Errors Encountered While Setting Up Datasophon

一、HDFS

Couldn’t preview the file. NetworkError: Failed to execute ‘send’ on ‘XMLHttpRequest’: Failed to load ‘http://slave1:9864/webhdfs/v1/HelloHadoop.txt?op=OPEN&namenoderpcaddress=master:9820&offset=0&_=1609724219001’

Reference:

 

https://blog.csdn.net/llwy1428/article/details/112168574?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522168843459416800192232044%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=168843459416800192232044&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v31_ecpm-1-112168574-null-null.142^v88^insert_down38v5,239^v2^insert_chatgpt&utm_term=Couldnt%20preview%20the%20file.%20NetworkError%3A%20Failed%20to%20execute%20send%20on%20XMLHttpRequest%3A%20Failed%20to%20load%20http%3A%2F%2Fdatasophon02%3A1025%2Fwebhdfs%2Fv1%2Fwzwtest%2F01%2Fpart-00000op%3DOPEN%26namenoderpcaddress%3Dnameservice1&spm=1018.2226.3001.4187

Solution: modify hdfs-site.xml

 

<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>

This is an XML configuration property for Hadoop's HDFS (Hadoop Distributed File System) service. The property name is "dfs.webhdfs.enabled" and its value is "true". It enables the WebHDFS REST API, which allows HDFS to be accessed over HTTP or HTTPS. WebHDFS provides a simple, flexible way for remote systems to interact with HDFS, supporting operations such as reading, writing, and deleting files, and creating and deleting directories. The WebHDFS REST API is typically used by applications that need to reach HDFS data from outside the Hadoop cluster, for example to integrate with other systems or to enable web-based access to HDFS data. Enabling WebHDFS is a common configuration step in Hadoop clusters that require remote access to HDFS data.
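To verify that WebHDFS is reachable after the change, a quick check like the one below can help. It is only a sketch: the NameNode web port 9870 is an assumption (the error above only shows the RPC address master:9820), and HelloHadoop.txt is the sample file from the error.

# Read a file through the WebHDFS REST API; -L follows the redirect to the DataNode (port 9870 assumed)
curl -i -L "http://master:9870/webhdfs/v1/HelloHadoop.txt?op=OPEN"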

二、Flink

Running the following command fails:

flink run -t yarn-per-job -ys 1 -yjm 1G -ytm 1G -yqu default -p 1 -sae -c test01.flink_task01 -Djobmanager.memory.heap.size=512m Flink_Demo02.jar

 

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: TaskManager memory configuration failed: Either required fine-grained memory (taskmanager.memory.task.heap.size and taskmanager.memory.managed.size), or Total Flink Memory size (Key: 'taskmanager.memory.flink.size' , default: null (fallback keys: [])), or Total Process Memory size (Key: 'taskmanager.memory.process.size' , default: null (fallback keys: [])) need to be configured explicitly. at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222) at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114) at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:836) at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:247) at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1078) at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1156) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1156) Caused by: org.apache.flink.configuration.IllegalConfigurationException: TaskManager memory configuration failed: Either required fine-grained memory (taskmanager.memory.task.heap.size and taskmanager.memory.managed.size), or Total Flink Memory size (Key: 'taskmanager.memory.flink.size' , default: null (fallback keys: [])), or Total Process Memory size (Key: 'taskmanager.memory.process.size' , default: null (fallback keys: [])) need to be configured explicitly. at org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils.processSpecFromConfig(TaskExecutorProcessUtils.java:166) at org.apache.flink.client.deployment.AbstractContainerizedClusterClientFactory.getClusterSpecification(AbstractContainerizedClusterClientFactory.java:49) at org.apache.flink.client.deployment.executors.AbstractJobClusterExecutor.execute(AbstractJobClusterExecutor.java:79) at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:2095) at org.apache.flink.client.program.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:188) at org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:119) at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1951) at test01.flink_task01.main(flink_task01.java:41) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355) ... 
11 more Caused by: org.apache.flink.configuration.IllegalConfigurationException: Either required fine-grained memory (taskmanager.memory.task.heap.size and taskmanager.memory.managed.size), or Total Flink Memory size (Key: 'taskmanager.memory.flink.size' , default: null (fallback keys: [])), or Total Process Memory size (Key: 'taskmanager.memory.process.size' , default: null (fallback keys: [])) need to be configured explicitly. at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.failBecauseRequiredOptionsNotConfigured(ProcessMemoryUtils.java:129) at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.memoryProcessSpecFromConfig(ProcessMemoryUtils.java:86) at org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils.processSpecFromConfig(TaskExecutorProcessUtils.java:163) ... 23 more
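The post does not record a fix for this particular message, but the exception itself names the keys that must be set explicitly. A minimal sketch under that assumption, with illustrative values, either in flink-conf.yaml or passed per job on the command line:

# flink-conf.yaml (value is illustrative, tune to the cluster)
taskmanager.memory.process.size: 1728m

# or per job on the command line
flink run -t yarn-per-job -Dtaskmanager.memory.process.size=1g ... Flink_Demo02.jar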

2.1 Error starting the Flink cluster

Solution

 

hadoop fs -chmod -R 777 /

Per-job submission on YARN fails

 

Edit flink-conf.yaml and add:

classloader.check-leaked-classloader: false

三、Hive

HiveServer2 suddenly reports an error:

 

Caused by: java.lang.RuntimeException: The dir: /tmp/hive on HDFS should be writable. Current permissions are: rwxr-xr-x at org.apache.hadoop.hive.ql.exec.Utilities.ensurePathIsWritable(Utilities.java:4501) at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:757) at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:698) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:624) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:583) at org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:130) at org.apache.hive.service.cli.CLIService.init(CLIService.java:115) ... 12 more Hive Session ID = c0e0c57c-27fd-4f55-9612-b34ec23acc8b 1741618 [main] ERROR org.apache.hive.service.server.HiveServer2 - Error starting HiveServer2 java.lang.Error: Max start attempts 30 exhausted at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1062) at org.apache.hive.service.server.HiveServer2.access$1600(HiveServer2.java:140) at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1305) at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1149) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) Caused by: java.lang.RuntimeException: Error applying authorization policy on hive configuration: The dir: /tmp/hive on HDFS should be writable. Current permissions are: rwxr-xr-x

Solution

 

hdfs dfs -chmod -R 777 /tmp
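An optional sanity check (not from the original post) that the permission change took effect:

# /tmp/hive should now be listed as drwxrwxrwx
hdfs dfs -ls -d /tmp/hive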

After uninstalling Ranger, starting HiveServer2 fails

Solution: delete the leftover Ranger-related configuration files.

Writing into Hive through Beeline fails

Check the logs:

 

Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client 'abe5bec7-d2b9-4d16-9f94-19dd0a0d1270'. Error: Child process (spark-submit) exited before connecting back with error log SLF4J: Class path contains multiple SLF4J bindings.

Solution

 

Start HiveServer2 from the command line instead:

nohup hiveserver2 >> /opt/datasophon/hive-3.1.3/logs/hiveserver2.log 2>&1 &
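Once it is up, a quick connection test can confirm it; the host and the default port 10000 are assumptions:

beeline -u "jdbc:hive2://localhost:10000" -e "show databases;"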

四、Doris

4.1 FE fails to start

 

java.io.IOException: the self host ip does not equal to the host in ROLE file 127.0.0.1. You need to set 'priority_networks' config in fe.conf to match the host 127.0.0.1 at org.apache.doris.catalog.Catalog.getClusterIdAndRole(Catalog.java:945) at org.apache.doris.catalog.Catalog.initialize(Catalog.java:847) at org.apache.doris.PaloFe.start(PaloFe.java:128) at org.apache.doris.PaloFe.main(PaloFe.java:63)

Solution (note: this is only appropriate during initial setup; be careful in production)

 

Delete all directories and files under the doris-meta directory, set priority_networks in fe.conf, and restart; that resolves it.
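For reference, priority_networks takes a CIDR that must match the FE host's real address; the subnet below is a placeholder, not a value from the original post:

# fe.conf: replace with the subnet of the FE host's actual IP
priority_networks = 192.168.1.0/24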

五、Ranger

If a configured policy keeps failing to synchronize

 

Restart the Ranger UserSync service.

六、DolphinScheduler

6.1 Data Source Center

Hive

Creating a Hive data source in the Data Source Center fails with:

 

[ERROR] 2023-07-24 03:20:21.134 +0000 com.zaxxer.hikari.pool.HikariPool:[594] - HikariPool-1 - Exception during pool initialization. java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://:10000/test: Peer indicated failure: Unsupported mechanism type PLAIN at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:224) at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107) at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:138) at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:364) at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206) at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:476) at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:561) at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:115) at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:112) at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:159) at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:117) at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80) at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:376) at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:431) at org.apache.dolphinscheduler.plugin.datasource.api.client.CommonDataSourceClient.checkClient(CommonDataSourceClient.java:96) at org.apache.dolphinscheduler.plugin.datasource.api.client.CommonDataSourceClient.<init>(CommonDataSourceClient.java:54) at org.apache.dolphinscheduler.plugin.datasource.hive.HiveDataSourceClient.<init>(HiveDataSourceClient.java:64) at org.apache.dolphinscheduler.plugin.datasource.hive.HiveDataSourceChannel.createDataSourceClient(HiveDataSourceChannel.java:29) at org.apache.dolphinscheduler.plugin.datasource.api.plugin.DataSourceClientProvider.lambda$getConnection$1(DataSourceClientProvider.java:79) at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4868) at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3533) at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2282) at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2159) at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2049) at com.google.common.cache.LocalCache.get(LocalCache.java:3966) at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4863) at org.apache.dolphinscheduler.plugin.datasource.api.plugin.DataSourceClientProvider.getConnection(DataSourceClientProvider.java:73) at org.apache.dolphinscheduler.api.service.impl.DataSourceServiceImpl.checkConnection(DataSourceServiceImpl.java:364) at org.apache.dolphinscheduler.api.service.impl.DataSourceServiceImpl$$FastClassBySpringCGLIB$$a86d54aa.invoke(<generated>) at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) at org.springframework.aop.framework.CglibAopProxy.invokeMethod(CglibAopProxy.java:386) at org.springframework.aop.framework.CglibAopProxy.access$000(CglibAopProxy.java:85) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:704) at org.apache.dolphinscheduler.api.service.impl.DataSourceServiceImpl$$EnhancerBySpringCGLIB$$d5085dd3.checkConnection(<generated>) at org.apache.dolphinscheduler.api.controller.DataSourceController.connectDataSource(DataSourceController.java:220) at 
org.apache.dolphinscheduler.api.controller.DataSourceController$$FastClassBySpringCGLIB$$835fdd04.invoke(<generated>) at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:793) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763) at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:89)

 

Solution: replace the Hive JARs with the Hive 3.1.3 versions.

Error 2

Solution

 

hive-exec-2.3.9.jar is missing. Download hive-exec-2.3.9.jar and place it in the libs directory under master-server, worker-server, api-server, and alert-server on every DolphinScheduler node.
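A small sketch of distributing the JAR; DS_HOME is a hypothetical install path, adjust it per node:

# copy the missing JAR into every DolphinScheduler service's libs directory
DS_HOME=/opt/datasophon/dolphinscheduler   # hypothetical path
for svc in master-server worker-server api-server alert-server; do
  cp hive-exec-2.3.9.jar "${DS_HOME}/${svc}/libs/"
done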

6.2 Kerberos authentication keeps dropping

After Doris was configured, authentication drops every 7 days and the following error appears.

This is a bug. For now I am testing a longer Kerberos renewal lifetime (this approach does not work either).

 

# On the KDC: change the original 7 days to 90 days
vim /var/kerberos/krb5kdc/kdc.conf
# then restart krb5kdc

# Change the maximum renewable life of krbtgt to 90 days
modprinc -maxlife 1days -maxrenewlife 100days +allow_renewable krbtgt/HADOOP.COM
# Change the maximum renewable life of hive/datasophon01 to 90 days
modprinc -maxlife 1days -maxrenewlife 70days +allow_renewable hive/datasophon01

# In the client krb5.conf, change renew_lifetime to 90 days
vim /etc/krb5.conf
[libdefaults]
 dns_lookup_realm = false
 ticket_lifetime = 24h
 renew_lifetime = 90d
 forwardable = true
 rdns = false
 pkinit_anchors = FILE:/etc/pki/tls/certs/ca-bundle.crt
 default_realm = HADOOP.COM
 #default_ccache_name = KEYRING:persistent:%{uid}

# Re-authenticate afterwards
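The re-authentication step is not spelled out above; a sketch, assuming the hive service principal and a hypothetical keytab path:

kinit -kt /etc/security/keytab/hive.keytab hive/datasophon01@HADOOP.COM
klist   # confirm the new ticket and its renew-until time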

6.3 A stopped workflow instance cannot be deleted in DolphinScheduler, causing a NullPointerException in the master

State descriptions

 

1 = running, 4 = ready to stop, 5 = stopped, 6 = failed, 7 = succeeded

Solution 1

 

In the database that backs the DolphinScheduler metadata:

update t_ds_process_instance set state = 6 where state = 5;

Solution 2

Copy the workflow instance name.

Query the DolphinScheduler MySQL database to get the process_definition_code:

 

select * from t_ds_process_instance where name ='Sparkjar包传参-17-20230824172313123'

Delete the row or change its state:

 

DELETE FROM t_ds_process_instance where process_definition_code =10678698667648

Solution 3

Use the code reported in the master's error log to look up the faulty workflow:

 

-- Copy the code from the log and query the DolphinScheduler MySQL database
select * from t_ds_process_task_relation where post_task_code = 10679185661696;
-- this returns the process_definition_code
select * from t_ds_process_instance where `process_definition_code` = '10678698667648';
-- delete the row or change its state
DELETE FROM t_ds_process_instance where process_definition_code = 10678698667648;

Final fix: write a database trigger.
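The trigger itself is not shown in the post. A minimal sketch, assuming the intent is to automatically map the stopped state (5) to failed (6), mirroring Solution 1; the trigger name and database name are hypothetical:

mysql -u root -p dolphinscheduler <<'SQL'
-- rewrite state 5 to 6 whenever an instance row is updated
CREATE TRIGGER trg_map_stopped_to_failed
BEFORE UPDATE ON t_ds_process_instance
FOR EACH ROW
SET NEW.state = IF(NEW.state = 5, 6, NEW.state);
SQL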

 

七、Hudi

7.1 Error when syncing MySQL to Doris with Hudi

 

Caused by: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apache.hudi.org.apache.avro.specific.SpecificRecordBuilderBase.<init>(Lorg/apache/hudi/org/apache/avro/Schema;Lorg/apache/hudi/org/apache/avro/specific/SpecificData;)V

Solution:

 

Use hudi-flink1.15-bundle-0.12.2.jar, and add avro-1.11.1.jar to Flink.
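A sketch of where the JARs go; the Flink lib path is a hypothetical Datasophon install location:

# both JARs belong in Flink's lib directory (path is an assumption)
cp hudi-flink1.15-bundle-0.12.2.jar /opt/datasophon/flink-1.15.2/lib/
cp avro-1.11.1.jar /opt/datasophon/flink-1.15.2/lib/
# restart the Flink job afterwards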

八、ZooKeeper

8.1 ZooKeeper client fails to start after Kerberos is configured

 

at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1166) Caused by: java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)] at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslToken(ZooKeeperSaslClient.java:320) ... 4 more Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)] at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) at org.apache.zookeeper.client.ZooKeeperSaslClient$1.run(ZooKeeperSaslClient.java:323) at org.apache.zookeeper.client.ZooKeeperSaslClient$1.run(ZooKeeperSaslClient.java:320) ... 7 more Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER) at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:772) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java

Solution

 

Reference: https://blog.csdn.net/zhanglong_4444/article/details/118893733?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522169094217016800211542539%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=169094217016800211542539&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v31_ecpm-5-118893733-null-null.142^v92^insert_down28v1&utm_term=%E8%BF%9E%E6%8E%A5zookeeper%E5%AE%A2%E6%88%B7%E7%AB%AFCaused%20by%3A%20GSSException%3A%20No%20valid%20credentials%20provided%20%28Mechanism%20level%3A%20Server%20not%20found%20in%20Kerberos%20database%20%287%29%20-%20LOOKING_UP_SERVER%29&spm=1018.2226.3001.4187

 

# Go to the ZooKeeper bin directory
cd ${ZOOKEEPER_HOME}/bin
# Start the server
sh zkServer.sh start
# Start the client (you must pass the hostname, otherwise it fails!)
sh zkCli.sh -server master01:2181
