Hadoop iterates so quickly that plenty of bugs slip through. While running SLS in SYNTH mode I ran into two problems.
1. The official SYNTH JSON example:
{
"description" : "tiny jobs workload", //description of the meaning of this collection of workloads
"num_nodes" : 10, //total nodes in the simulated cluster
"nodes_per_rack" : 4, //number of nodes in each simulated rack
"num_jobs" : 10, // total number of jobs being simulated
"rand_seed" : 2, //the random seed used for deterministic randomized runs
// a list of “workloads”, each of which has job classes, and temporal properties
"workloads" : [
{
"workload_name" : "tiny-test", // name of the workload
"workload_weight": 0.5, // used for weighted random selection of which workload to sample from
"queue_name" : "sls_queue_1", //queue the job will be submitted to
//different classes of jobs for this workload
"job_classes" : [
{
"class_name" : "class_1", //name of the class
"class_weight" : 1.0, //used for weighted random selection of class within workload
//next group controls average and standard deviation of a LogNormal distribution that
//determines the number of mappers and reducers for the job.
"mtasks_avg" : 5,
"mtasks_stddev" : 1,
"rtasks_avg" : 5,
"rtasks_stddev" : 1,
//average and stdev input param of LogNormal distribution controlling job duration
"dur_avg" : 60,
"dur_stddev" : 5,
//average and stdev input param of LogNormal distribution controlling mapper and reducer durations
"mtime_avg" : 10,
"mtime_stddev" : 2,
"rtime_avg" : 20,
"rtime_stddev" : 4,
//average and stdev input param of LogNormal distribution controlling memory and cores for map and reduce
"map_max_memory_avg" : 1024,
"map_max_memory_stddev" : 0.001,
"reduce_max_memory_avg" : 2048,
"reduce_max_memory_stddev" : 0.001,
"map_max_vcores_avg" : 1,
"map_max_vcores_stddev" : 0.001,
"reduce_max_vcores_avg" : 2,
"reduce_max_vcores_stddev" : 0.001,
//probability of running this job with a reservation
"chance_of_reservation" : 0.5,
//input parameters of LogNormal distribution that determines the deadline slack (as a multiplier of job duration)
"deadline_factor_avg" : 10.0,
"deadline_factor_stddev" : 0.001,
}
],
// for each workload, determines with what probability each time bucket is picked to choose the job start time.
// In the example below the jobs have twice as much chance to start in the first minute than in the second minute
// of simulation, and then zero chance thereafter.
"time_distribution" : [
{ "time" : 1, "weight" : 66 },
{ "time" : 60, "weight" : 33 },
{ "time" : 120, "jobs" : 0 }
]
}
]
}
First, a JSON file may not contain comments, so every comment above has to be deleted before the file will parse. Second, the trailing comma at the end of the line
"deadline_factor_stddev" : 0.001,
is not valid JSON either, and makes the run fail with a parse error.
A handy online JSON validator:
https://jsonlint.com/
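For convenience, here is the same trace with every comment and the trailing comma removed; this version passes jsonlint. (I have kept the "jobs" key in the last time_distribution entry exactly as the official doc writes it, though it looks like it should be "weight" : 0, so adjust it if your SLS build rejects it.)

{
  "description" : "tiny jobs workload",
  "num_nodes" : 10,
  "nodes_per_rack" : 4,
  "num_jobs" : 10,
  "rand_seed" : 2,
  "workloads" : [
    {
      "workload_name" : "tiny-test",
      "workload_weight" : 0.5,
      "queue_name" : "sls_queue_1",
      "job_classes" : [
        {
          "class_name" : "class_1",
          "class_weight" : 1.0,
          "mtasks_avg" : 5,
          "mtasks_stddev" : 1,
          "rtasks_avg" : 5,
          "rtasks_stddev" : 1,
          "dur_avg" : 60,
          "dur_stddev" : 5,
          "mtime_avg" : 10,
          "mtime_stddev" : 2,
          "rtime_avg" : 20,
          "rtime_stddev" : 4,
          "map_max_memory_avg" : 1024,
          "map_max_memory_stddev" : 0.001,
          "reduce_max_memory_avg" : 2048,
          "reduce_max_memory_stddev" : 0.001,
          "map_max_vcores_avg" : 1,
          "map_max_vcores_stddev" : 0.001,
          "reduce_max_vcores_avg" : 2,
          "reduce_max_vcores_stddev" : 0.001,
          "chance_of_reservation" : 0.5,
          "deadline_factor_avg" : 10.0,
          "deadline_factor_stddev" : 0.001
        }
      ],
      "time_distribution" : [
        { "time" : 1, "weight" : 66 },
        { "time" : 60, "weight" : 33 },
        { "time" : 120, "jobs" : 0 }
      ]
    }
  ]
}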
2. While running
$HADOOP_HOME/share/hadoop/tools/sls/bin/slsrun.sh --tracetype=SYNTH --tracelocation=/home/c/sls/output2/SYNTH.json --output-dir=/home/c/sls/output1 --print-simulation
the simulator throws:
java.lang.IllegalArgumentException: Null user
at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1225)
at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1212)
at org.apache.hadoop.yarn.sls.appmaster.AMSimulator.submitReservationWhenSpecified(AMSimulator.java:177)
at org.apache.hadoop.yarn.sls.appmaster.AMSimulator.firstStep(AMSimulator.java:154)
at org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:88)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
The exception complains that a null user was handed to UserGroupInformation.createRemoteUser.
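For context, createRemoteUser rejects a null or empty user name up front; a paraphrased sketch of that guard (quoted from memory of UserGroupInformation, not verbatim):

  if (user == null || user.isEmpty()) {
    throw new IllegalArgumentException("Null user");
  }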
Tracing back to where that user string originates, in SLSRunner.startAMFromSynthGenerator():
private void startAMFromSynthGenerator() throws YarnException, IOException {
  Configuration localConf = new Configuration();
  localConf.set("fs.defaultFS", "file:///");
  long baselineTimeMS = 0;
  // if we use the nodeFile this could have been not initialized yet.
  if (stjp == null) {
    stjp = new SynthTraceJobProducer(getConf(), new Path(inputTraces[0]));
  }
  SynthJob job = null;
  // we use stjp, a reference to the job producer instantiated during node
  // creation
  while ((job = (SynthJob) stjp.getNextJob()) != null) {
    // only support MapReduce currently
    String user = job.getUser();
The value returned by getUser() is never checked for null, and that null eventually reaches createRemoteUser, hence the error. By contrast, the code paths that build AMs from SLS and Rumen traces do guard against a missing user:
private void createAMForJob(Map jsonJob) throws YarnException {
  long jobStartTime = Long.parseLong(
      jsonJob.get(SLSConfiguration.JOB_START_MS).toString());
  long jobFinishTime = 0;
  if (jsonJob.containsKey(SLSConfiguration.JOB_END_MS)) {
    jobFinishTime = Long.parseLong(
        jsonJob.get(SLSConfiguration.JOB_END_MS).toString());
  }
  String user = (String) jsonJob.get(SLSConfiguration.JOB_USER);
  if (user == null) {
    user = "default";
  }
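Given that, a minimal local workaround (my own sketch against SLSRunner, not an upstream patch) is to give startAMFromSynthGenerator the same fallback:

  // hypothetical patch inside the while loop of startAMFromSynthGenerator:
  String user = job.getUser();
  if (user == null) {
    user = "default"; // same fallback as createAMForJob, avoids "Null user"
  }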
So I suspect that Hadoop's official support for SYNTH mode is simply not very mature yet.