Hadoop MapReduce job hangs at "mapreduce.Job: Running job"

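The root cause is simply that there is not enough memory. Argh, this makes me so angry: big data work really does need a powerful machine. My computer is already maxed out with two virtual machines running (CPU usage at 90%), and it freezes completely if I start a third. So sad. When will Xiao Wang (that's me) finally be able to buy a MacBook Pro?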

Acknowledgments

Thanks to this blogger for his post and for his comments. Thanks again!

Solution

1. Edit yarn-site.xml (the newly added properties are marked with a comment)

<configuration>
        <!-- Use the shuffle auxiliary service for MapReduce -->
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <!-- Address of the ResourceManager -->
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>hadoop102</value>
        </property>

        <!-- Environment variables inherited by containers -->
        <property>
                <name>yarn.nodemanager.env-whitelist</name>
                <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
        </property>
        <!-- Newly added -->
        <property>
                <name>yarn.nodemanager.resource.memory-mb</name>
                <value>20480</value>
        </property>
        <property>
                <name>yarn.scheduler.minimum-allocation-mb</name>
                <value>2048</value>
        </property>
        <property>
                <name>yarn.nodemanager.vmem-pmem-ratio</name>
                <value>2.1</value>
        </property>
</configuration>
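
To make these numbers concrete, here is a small Python sketch of the arithmetic they imply. It assumes the scheduler rounds each container request up to a multiple of yarn.scheduler.minimum-allocation-mb, which may not hold for every scheduler configuration. The function names and the 1024 MB request are illustrative assumptions, not part of Hadoop.

```python
import math

# Values taken from the yarn-site.xml above.
NODE_MEMORY_MB = 20480      # yarn.nodemanager.resource.memory-mb
MIN_ALLOCATION_MB = 2048    # yarn.scheduler.minimum-allocation-mb
VMEM_PMEM_RATIO = 2.1       # yarn.nodemanager.vmem-pmem-ratio


def round_up_to_minimum(request_mb: int, minimum_mb: int) -> int:
    """Round a request up to a multiple of the minimum allocation.

    Assumption: the scheduler normalizes requests this way.
    """
    return max(minimum_mb, math.ceil(request_mb / minimum_mb) * minimum_mb)


def max_containers(node_memory_mb: int, container_mb: int) -> int:
    """How many containers of this size fit in the memory the NodeManager advertises."""
    return node_memory_mb // container_mb


def vmem_limit_mb(container_mb: int, vmem_pmem_ratio: float) -> float:
    """Virtual memory a container may use before the vmem check complains."""
    return container_mb * vmem_pmem_ratio


# Hypothetical request of 1024 MB for one task.
container_mb = round_up_to_minimum(1024, MIN_ALLOCATION_MB)
print(container_mb)                                   # 2048
print(max_containers(NODE_MEMORY_MB, container_mb))   # 10
print(vmem_limit_mb(container_mb, VMEM_PMEM_RATIO))   # 4300.8
```

Note that yarn.nodemanager.resource.memory-mb is only what the NodeManager advertises to YARN. Setting it to 20480 does not give the virtual machine more physical memory.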

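2. Delete the following property from mapred-site.xml. Without it, mapreduce.framework.name falls back to its default value, local, so jobs run in the local job runner rather than being submitted to YARN.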

<!-- Run MapReduce programs on YARN -->
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>

After the deletion, the file looks like this:

<configuration>
        <!-- JobHistory server address -->
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>hadoop100:10020</value>
        </property>
        <!-- JobHistory server web UI address -->
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>hadoop100:19888</value>
        </property>
</configuration>
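
A quick way to confirm which framework a given mapred-site.xml selects is to parse it and look up mapreduce.framework.name, falling back to the default from mapred-default.xml (local) when the property is absent. The sketch below does that with Python's standard library. The helper names are hypothetical, and the embedded XML is the file from step 2 with the comments omitted.

```python
import xml.etree.ElementTree as ET

# Default from mapred-default.xml: with no override, jobs use the local runner.
DEFAULT_FRAMEWORK = "local"

# The mapred-site.xml from step 2, after the deletion (comments omitted).
MAPRED_SITE = """<configuration>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop100:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop100:19888</value>
    </property>
</configuration>"""


def read_properties(xml_text: str) -> dict:
    """Collect the <name>/<value> pairs of a Hadoop-style configuration file."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def effective_framework(xml_text: str) -> str:
    """Return the configured framework, or the default if none is set."""
    return read_properties(xml_text).get("mapreduce.framework.name", DEFAULT_FRAMEWORK)


print(effective_framework(MAPRED_SITE))  # local
```

If the printed value is local, the job never asks YARN for containers, which would explain why it no longer waits on cluster memory.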

3. Screenshot after the job succeeded
[Screenshot of the successful run]
That's all.
Over.

