

47. Install the Mahout Client on the master node, open a Linux shell, and run the mahout command to list the example programs that ship with Mahout. The output is shown below.

[root@master ~]# mahout

MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.

Running on hadoop, using /usr/hdp/ and HADOOP_CONF_DIR=/usr/hdp/


WARNING: Use "yarn jar" to launch YARN applications.

An example program must be given as the first argument.

Valid program names are:

  arff.vector: : Generate Vectors from an ARFF file or directory

  baumwelch: : Baum-Welch algorithm for unsupervised HMM training

  buildforest: : Build the random forest classifier

  canopy: : Canopy clustering

  cat: : Print a file or resource as the logistic regression models would see it

  cleansvd: : Cleanup and verification of SVD output

  clusterdump: : Dump cluster output to text

  clusterpp: : Groups Clustering Output In Clusters

  cmdump: : Dump confusion matrix in HTML or text formats

  concatmatrices: : Concatenates 2 matrices of same cardinality into a single matrix

  cvb: : LDA via Collapsed Variation Bayes (0th deriv. approx)

  cvb0_local: : LDA via Collapsed Variation Bayes, in memory locally.

  describe: : Describe the fields and target variable in a data set

  evaluateFactorization: : compute RMSE and MAE of a rating matrix factorization against probes

  fkmeans: : Fuzzy K-means clustering

  hmmpredict: : Generate random sequence of observations by given HMM

  itemsimilarity: : Compute the item-item-similarities for item-based collaborative filtering

  kmeans: : K-means clustering

  lucene.vector: : Generate Vectors from a Lucene index

  lucene2seq: : Generate Text SequenceFiles from a Lucene index

  matrixdump: : Dump matrix in CSV format

  matrixmult: : Take the product of two matrices

  parallelALS: : ALS-WR factorization of a rating matrix

  qualcluster: : Runs clustering experiments and summarizes results in a CSV

  recommendfactorized: : Compute recommendations using the factorization of a rating matrix

  recommenditembased: : Compute recommendations using item-based collaborative filtering

  regexconverter: : Convert text files on a per line basis based on regular expressions

  resplit: : Splits a set of SequenceFiles into a number of equal splits

  rowid: : Map SequenceFile<Text,VectorWritable> to {SequenceFile<IntWritable,VectorWritable>, SequenceFile<IntWritable,Text>}

  rowsimilarity: : Compute the pairwise similarities of the rows of a matrix

  runAdaptiveLogistic: : Score new production data using a probably trained and validated AdaptivelogisticRegression model

  runlogistic: : Run a logistic regression model against CSV data

  seq2encoded: : Encoded Sparse Vector generation from Text sequence files

  seq2sparse: : Sparse Vector generation from Text sequence files

  seqdirectory: : Generate sequence files (of Text) from a directory

  seqdumper: : Generic Sequence File dumper

  seqmailarchives: : Creates SequenceFile from a directory containing gzipped mail archives

  seqwiki: : Wikipedia xml dump to sequence file

  spectralkmeans: : Spectral k-means clustering

  split: : Split Input data into test and train sets

  splitDataset: : split a rating dataset into training and probe parts

  ssvd: : Stochastic SVD

  streamingkmeans: : Streaming k-means clustering

  svd: : Lanczos Singular Value Decomposition

  testforest: : Test the random forest classifier

  testnb: : Test the Vector-based Bayes classifier

  trainAdaptiveLogistic: : Train an AdaptivelogisticRegression model

  trainlogistic: : Train a logistic regression using stochastic gradient descent

  trainnb: : Train the Vector-based Bayes classifier

  transpose: : Take the transpose of a matrix

  validateAdaptiveLogistic: : Validate an AdaptivelogisticRegression model against hold-out data set

  vecdist: : Compute the distances between a set of Vectors (or Cluster or Canopy, they must fit in memory) and a list of Vectors

  vectordump: : Dump vectors from a sequence file to text

  viterbi: : Viterbi decoding of hidden states from given output states sequence
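
With more than fifty entries, the listing is easier to search when it is reduced to the program names alone. The following is a minimal sketch, assuming the listing keeps the "name: : description" layout of the output above. The function name list_mahout_programs is hypothetical and is not part of Mahout.

```shell
# Hypothetical helper: read a Mahout program listing on stdin and
# print only the program names, one per line.
list_mahout_programs() {
  # A listing line looks like "  kmeans: : K-means clustering".
  # Keep the text before the first colon; skip every other line.
  sed -n 's/^[[:space:]]*\([A-Za-z0-9_.]\{1,\}\):[[:space:]]*:.*$/\1/p'
}

# On the cluster this could be used as (2>&1 because the stream that
# carries the listing is an assumption):
#   mahout 2>&1 | list_mahout_programs | grep kmeans
```

Header and warning lines such as "Valid program names are:" do not contain the double colon and are dropped by the filter.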



[root@master ~]# mkdir 20news

[root@master ~]# tar -xzf 20news-bydate.tar.gz -C 20news

[root@master ~]# hadoop fs -mkdir -p /data/mahout/20news/20news-all

[root@master ~]# hadoop fs -put 20news/* /data/mahout/20news/20news-all

[root@master ~]# mahout seqdirectory -i /data/mahout/20news/20news-all -o /data/mahout/20news/output/20news-seq

MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.

Running on hadoop, using /usr/hdp/ and HADOOP_CONF_DIR=/usr/hdp/


WARNING: Use "yarn jar" to launch YARN applications.

17/05/12 05:04:32 WARN driver.MahoutDriver: No seqdirectory.props found on classpath, will use command-line arguments only

17/05/12 05:04:32 INFO common.AbstractJob: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/data/mahout/20news/20news-all], --keyPrefix=[], --method=[mapreduce], --output=[/data/mahout/20news/output/20news-seq], --startPhase=[0], --tempDir=[temp]}

17/05/12 05:04:35 INFO impl.TimelineClientImpl: Timeline service address: http://slaver1:8188/ws/v1/timeline/

17/05/12 05:04:35 INFO client.RMProxy: Connecting to ResourceManager at slaver1/

17/05/12 05:04:53 INFO input.FileInputFormat: Total input paths to process : 4262

17/05/12 05:04:53 INFO input.CombineFileInputFormat: DEBUG: Terminated node allocation with : CompletedNodes: 2, size left: 8691977

17/05/12 05:05:10 INFO mapreduce.JobSubmitter: number of splits:1

17/05/12 05:05:20 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494563840869_0001

17/05/12 05:05:21 INFO impl.YarnClientImpl: Submitted application application_1494563840869_0001

17/05/12 05:05:21 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1494563840869_0001/

17/05/12 05:05:21 INFO mapreduce.Job: Running job: job_1494563840869_0001

17/05/12 05:06:34 INFO mapreduce.Job: Job job_1494563840869_0001 running in uber mode : false

17/05/12 05:06:34 INFO mapreduce.Job:  map 0% reduce 0%

17/05/12 05:06:59 INFO mapreduce.Job:  map 14% reduce 0%

17/05/12 05:07:02 INFO mapreduce.Job:  map 37% reduce 0%

17/05/12 05:07:05 INFO mapreduce.Job:  map 66% reduce 0%

17/05/12 05:07:08 INFO mapreduce.Job:  map 100% reduce 0%

17/05/12 05:07:15 INFO mapreduce.Job: Job job_1494563840869_0001 completed successfully

17/05/12 05:07:15 INFO mapreduce.Job: Counters: 30

        File System Counters

               FILE: Number of bytes read=0

               FILE: Number of bytes written=140047

               FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

               HDFS: Number of bytes read=9148620

               HDFS: Number of bytes written=3244364

               HDFS: Number of read operations=17052

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=2


               Launched map tasks=1

               Other local map tasks=1

               Total time spent by all maps in occupied slots (ms)=49028

               Total time spent by all reduces in occupied slots (ms)=0

               Total time spent by all map tasks (ms)=24514

               Total vcore-seconds taken by all map tasks=24514

               Total megabyte-seconds taken by all map tasks=24612056

       Map-Reduce Framework

               Map input records=4262

               Map output records=4262

               Input split bytes=456643

               Spilled Records=0

               Failed Shuffles=0

               Merged Map outputs=0

               GC time elapsed (ms)=176

               CPU time spent (ms)=15710

               Physical memory (bytes) snapshot=250163200

               Virtual memory (bytes) snapshot=2792144896

               Total committed heap usage (bytes)=123207680

        File Input Format Counters

               Bytes Read=0

        File Output Format Counters

               Bytes Written=3244364

17/05/12 05:07:15 INFO driver.MahoutDriver: Program took 163575 ms (Minutes: 2.72625)


[root@master ~]# hadoop fs -ls /data/mahout/20news/output/20news-seq

Found 2 items

-rw-r--r--   3 root hdfs          0 2017-05-12 05:07 /data/mahout/20news/output/20news-seq/_SUCCESS

-rw-r--r--   3 root hdfs    3244364 2017-05-12 05:07 /data/mahout/20news/output/20news-seq/part-m-00000
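
The counters agree with the listing: 4262 input paths produced 4262 map output records, written by a single map task into one part-m-00000 file of 3244364 bytes. When a run has been saved to a file, individual counters can be pulled out for checks of this kind. The following is a minimal sketch, assuming the console log was captured to a file. counter_value is a hypothetical helper, not a Hadoop or Mahout command.

```shell
# Hypothetical helper: print the value of one named counter from a
# saved job log read on stdin.
# Usage: counter_value "Map input records" < job.log
counter_value() {
  awk -F= -v name="$1" '
    {
      key = $1
      sub(/^[[:space:]]+/, "", key)   # drop the indentation in front of the counter name
      if (key == name) { print $2; exit }
    }'
}

# Capturing the log is assumed to look like this (2>&1 because the
# stream that carries the log lines is an assumption):
#   mahout seqdirectory -i ... -o ... 2>&1 | tee job.log
```

Only the first match is printed, so a log that contains several jobs should be split per job before it is queried.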



[root@master ~]# hadoop fs -text /data/mahout/20news/output/20news-seq/part-m-00000 | head -n 20

17/05/12 05:26:18 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library

17/05/12 05:26:18 INFO compress.CodecPool: Got brand-new decompressor [.deflate]

17/05/12 05:26:18 INFO compress.CodecPool: Got brand-new decompressor [.deflate]

17/05/12 05:26:18 INFO compress.CodecPool: Got brand-new decompressor [.deflate]

17/05/12 05:26:18 INFO compress.CodecPool: Got brand-new decompressor [.deflate]

/20news-bydate-test/alt.atheism/53068   From: decay@cbnewsj.cb.att.com (dean.kaflowitz)

Subject: Re: about the bible quiz answers

Organization: AT&T

Distribution: na

Lines: 18


In article <healta.153.735242337@saturn.wwc.edu>, healta@saturn.wwc.edu (Tammy R Healy) writes:



> #12) The 2 cheribums are on the Ark of the Covenant.  When God said make no

> graven image, he was refering to idols, which were created to be worshipped.

> The Ark of the Covenant wasn't wrodhipped and only the high priest could

> enter the Holy of Holies where it was kept once a year, on the Day of

> Atonement.


I am not familiar with, or knowledgeable about the original language,

but I believe there is a word for "idol" and that the translator

would have used the word "idol" instead of "graven image" had

the original said "idol."  So I think you're wrong here, but

then again I could be too.  I just suggesting a way to determine

text: Unable to write to output stream.
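
The closing message does not indicate a damaged sequence file. head exits after printing 20 lines and closes the pipe, so hadoop fs -text has nowhere left to write and reports it. The following is a minimal sketch of a wrapper that keeps the first lines and hides that noise. The name first_lines is hypothetical, and the sketch assumes the INFO lines and the notice are written to stderr.

```shell
# Hypothetical wrapper: run a command, keep the first N lines of its
# stdout, and discard stderr (assumed to carry the INFO lines and the
# "Unable to write to output stream" notice).
# Usage: first_lines N command [args...]
first_lines() {
  n="$1"
  shift
  "$@" 2>/dev/null | head -n "$n"
}

# On the cluster this could be used as:
#   first_lines 20 hadoop fs -text /data/mahout/20news/output/20news-seq/part-m-00000
```

Discarding stderr also hides real errors, so the wrapper is best kept for quick previews of output that is already known to be readable.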



[hdfs@master ~]$ hadoop fs -mkdir -p /data/mahout/project

[hdfs@master ~]$ hadoop fs -put user-item-score.txt /data/mahout/project
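
The uploaded file is assumed to hold one comma-separated userID,itemID,score triple per line, which is the text layout the item-based recommender reads. The -s SIMILARITY_EUCLIDEAN_DISTANCE option in the next command selects a similarity derived from the Euclidean distance between item rating vectors. A common way to turn a distance d into a similarity is 1/(1+d). The sketch below uses that form as an assumption and may differ in detail from what Mahout computes. The helper item_similarity and the sample ratings in its usage line are hypothetical.

```shell
# Hypothetical helper: similarity between two items as 1/(1+d), where d
# is the Euclidean distance between their rating vectors. A user who has
# not rated an item counts as 0 for that item (an assumption).
# Input on stdin: userID,itemID,score lines.
# Usage: item_similarity ITEM_A ITEM_B < user-item-score.txt
item_similarity() {
  awk -F, -v a="$1" -v b="$2" '
    $2 == a { ra[$1] = $3; users[$1] = 1 }   # ratings of item A per user
    $2 == b { rb[$1] = $3; users[$1] = 1 }   # ratings of item B per user
    END {
      sum = 0
      for (u in users) { d = ra[u] - rb[u]; sum += d * d }
      printf "%.4f\n", 1 / (1 + sqrt(sum))
    }'
}

# Example with made-up ratings:
#   printf '1,101,5\n1,102,2\n2,101,3\n2,102,3\n' | item_similarity 101 102
```

Identical rating vectors give a similarity of 1, and the value falls toward 0 as the ratings diverge.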

[hdfs@master ~]$ mahout recommenditembased -i /data/mahout/project/user-item-score.txt -o /data/mahout/project/output -n 3 -b false -s SIMILARITY_EUCLIDEAN_DISTANCE --maxPrefsPerUser 4 --minPrefsPerUser 1 --maxPrefsInItemSimilarity 4 --tempDir /data/mahout/project/temp

MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.

Running on hadoop, using /usr/hdp/ and HADOOP_CONF_DIR=/usr/hdp/


WARNING: Use "yarn jar" to launch YARN applications.

17/05/15 19:33:06 WARN driver.MahoutDriver: No recommenditembased.props found on classpath, will use command-line arguments only

17/05/15 19:33:07 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[/data/mahout/project/user.txt], --maxPrefsInItemSimilarity=[4], --maxPrefsPerUser=[4], --maxSimilaritiesPerItem=[100], --minPrefsPerUser=[1], --numRecommendations=[3], --output=[/data/mahout/project/output], --similarityClassname=[SIMILARITY_EUCLIDEAN_DISTANCE], --startPhase=[0], --tempDir=[/data/mahout/project/temp]}

17/05/15 19:33:07 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[/data/mahout/project/user.txt], --minPrefsPerUser=[1], --output=[/data/mahout/project/temp/preparePreferenceMatrix], --ratingShift=[0.0], --startPhase=[0], --tempDir=[/data/mahout/project/temp]}

17/05/15 19:33:08 INFO impl.TimelineClientImpl: Timeline service address: http://slaver1:8188/ws/v1/timeline/

17/05/15 19:33:08 INFO client.RMProxy: Connecting to ResourceManager at slaver1/

17/05/15 19:33:10 INFO input.FileInputFormat: Total input paths to process : 1

17/05/15 19:33:10 INFO mapreduce.JobSubmitter: number of splits:1

17/05/15 19:33:10 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494874269419_0013

17/05/15 19:33:11 INFO impl.YarnClientImpl: Submitted application application_1494874269419_0013

17/05/15 19:33:11 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1494874269419_0013/

17/05/15 19:33:11 INFO mapreduce.Job: Running job: job_1494874269419_0013

17/05/15 19:33:18 INFO mapreduce.Job: Job job_1494874269419_0013 running in uber mode : false

17/05/15 19:33:18 INFO mapreduce.Job:  map 0% reduce 0%

17/05/15 19:33:25 INFO mapreduce.Job:  map 100% reduce 0%

17/05/15 19:33:33 INFO mapreduce.Job:  map 100% reduce 100%

17/05/15 19:33:36 INFO mapreduce.Job: Job job_1494874269419_0013 completed successfully

17/05/15 19:33:37 INFO mapreduce.Job: Counters: 49

        File System Counters

               FILE: Number of bytes read=54

               FILE: Number of bytes written=272323

               FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

               HDFS: Number of bytes read=341

               HDFS: Number of bytes written=187

               HDFS: Number of read operations=6

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=2


               Launched map tasks=1

               Launched reduce tasks=1

               Data-local map tasks=1

               Total time spent by all maps in occupied slots (ms)=3313

               Total time spent by all reduces in occupied slots (ms)=12410

               Total time spent by all map tasks (ms)=3313

               Total time spent by all reduce tasks (ms)=6205

               Total vcore-seconds taken by all map tasks=3313

               Total vcore-seconds taken by all reduce tasks=6205

               Total megabyte-seconds taken by all map tasks=1696256

               Total megabyte-seconds taken by all reduce tasks=6353920

       Map-Reduce Framework

               Map input records=21

               Map output records=21

               Map output bytes=84

               Map output materialized bytes=46

               Input split bytes=112

               Combine input records=21

               Combine output records=7

               Reduce input groups=7

               Reduce shuffle bytes=46

               Reduce input records=7

               Reduce output records=7

               Spilled Records=14

                Shuffled Maps =1

               Failed Shuffles=0

               Merged Map outputs=1

               GC time elapsed (ms)=116

               CPU time spent (ms)=2080

               Physical memory (bytes) snapshot=656359424

               Virtual memory (bytes) snapshot=5180207104

               Total committed heap usage (bytes)=484442112








        File Input Format Counters

               Bytes Read=229

        File Output Format Counters

               Bytes Written=187

17/05/15 19:33:37 INFO impl.TimelineClientImpl: Timeline service address: http://slaver1:8188/ws/v1/timeline/

17/05/15 19:33:37 INFO client.RMProxy: Connecting to ResourceManager at slaver1/

17/05/15 19:33:38 INFO input.FileInputFormat: Total input paths to process : 1

17/05/15 19:33:40 INFO mapreduce.JobSubmitter: number of splits:1

17/05/15 19:33:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494874269419_0014

17/05/15 19:33:43 INFO impl.YarnClientImpl: Application submission is not finished, submitted application application_1494874269419_0014 is still in NEW_SAVING

17/05/15 19:33:44 INFO impl.YarnClientImpl: Submitted application application_1494874269419_0014

17/05/15 19:33:44 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1494874269419_0014/

17/05/15 19:33:44 INFO mapreduce.Job: Running job: job_1494874269419_0014

17/05/15 19:33:55 INFO mapreduce.Job: Job job_1494874269419_0014 running in uber mode : false

17/05/15 19:33:55 INFO mapreduce.Job:  map 0% reduce 0%

17/05/15 19:34:03 INFO mapreduce.Job:  map 100% reduce 0%

17/05/15 19:34:12 INFO mapreduce.Job:  map 100% reduce 100%

17/05/15 19:34:27 INFO mapreduce.Job: Job job_1494874269419_0014 completed successfully

17/05/15 19:34:27 INFO mapreduce.Job: Counters: 50

        File System Counters

                FILE: Number of bytes read=113

               FILE: Number of bytes written=273073

               FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

                HDFS: Number of bytes read=341

               HDFS: Number of bytes written=288

               HDFS: Number of read operations=6

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=2


               Launched map tasks=1

               Launched reduce tasks=1

               Data-local map tasks=1

               Total time spent by all maps in occupied slots (ms)=4912

               Total time spent by all reduces in occupied slots (ms)=13474

               Total time spent by all map tasks (ms)=4912

               Total time spent by all reduce tasks (ms)=6737

               Total vcore-seconds taken by all map tasks=4912

               Total vcore-seconds taken by all reduce tasks=6737

               Total megabyte-seconds taken by all map tasks=2514944

               Total megabyte-seconds taken by all reduce tasks=6898688

       Map-Reduce Framework

               Map input records=21

               Map output records=21

               Map output bytes=147

               Map output materialized bytes=105

               Input split bytes=112

               Combine input records=0

               Combine output records=0

               Reduce input groups=5

               Reduce shuffle bytes=105

               Reduce input records=21

               Reduce output records=5

               Spilled Records=42

               Shuffled Maps =1

               Failed Shuffles=0

               Merged Map outputs=1

               GC time elapsed (ms)=125

               CPU time spent (ms)=1830

               Physical memory (bytes) snapshot=666886144

               Virtual memory (bytes) snapshot=5178028032

               Total committed heap usage (bytes)=488636416








        File Input Format Counters

               Bytes Read=229

        File Output Format Counters

               Bytes Written=288



17/05/15 19:34:28 INFO impl.TimelineClientImpl: Timeline service address: http://slaver1:8188/ws/v1/timeline/

17/05/15 19:34:28 INFO client.RMProxy: Connecting to ResourceManager at slaver1/

17/05/15 19:34:29 INFO input.FileInputFormat: Total input paths to process : 1

17/05/15 19:34:29 INFO mapreduce.JobSubmitter: number of splits:1

17/05/15 19:34:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494874269419_0015

17/05/15 19:34:29 INFO impl.YarnClientImpl: Submitted application application_1494874269419_0015

17/05/15 19:34:29 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1494874269419_0015/

17/05/15 19:34:29 INFO mapreduce.Job: Running job: job_1494874269419_0015

17/05/15 19:34:36 INFO mapreduce.Job: Job job_1494874269419_0015 running in uber mode : false

17/05/15 19:34:36 INFO mapreduce.Job:  map 0% reduce 0%

17/05/15 19:34:45 INFO mapreduce.Job:  map 100% reduce 0%

17/05/15 19:34:51 INFO mapreduce.Job:  map 100% reduce 100%

17/05/15 19:34:52 INFO mapreduce.Job: Job job_1494874269419_0015 completed successfully

17/05/15 19:34:52 INFO mapreduce.Job: Counters: 49

        File System Counters

               FILE: Number of bytes read=126

               FILE: Number of bytes written=272583

               FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

               HDFS: Number of bytes read=445

               HDFS: Number of bytes written=335

               HDFS: Number of read operations=7

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=2


               Launched map tasks=1

               Launched reduce tasks=1

               Data-local map tasks=1

               Total time spent by all maps in occupied slots (ms)=5388

               Total time spent by all reduces in occupied slots (ms)=6558

               Total time spent by all map tasks (ms)=5388

               Total time spent by all reduce tasks (ms)=3279

               Total vcore-seconds taken by all map tasks=5388

               Total vcore-seconds taken by all reduce tasks=3279

               Total megabyte-seconds taken by all map tasks=2758656

               Total megabyte-seconds taken by all reduce tasks=3357696

       Map-Reduce Framework

               Map input records=5

               Map output records=21

               Map output bytes=336

               Map output materialized bytes=118

               Input split bytes=157

               Combine input records=21

               Combine output records=7

               Reduce input groups=7

               Reduce shuffle bytes=118

               Reduce input records=7

                Reduce output records=7

               Spilled Records=14

               Shuffled Maps =1

               Failed Shuffles=0

               Merged Map outputs=1

               GC time elapsed (ms)=127

               CPU time spent (ms)=1970

               Physical memory (bytes) snapshot=661454848

               Virtual memory (bytes) snapshot=5178253312

               Total committed heap usage (bytes)=486014976








        File Input Format Counters

               Bytes Read=288

        File Output Format Counters

               Bytes Written=335

17/05/15 19:34:52 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --excludeSelfSimilarity=[true], --input=[/data/mahout/project/temp/preparePreferenceMatrix/ratingMatrix], --maxObservationsPerColumn=[4], --maxObservationsPerRow=[4], --maxSimilaritiesPerRow=[100], --numberOfColumns=[5], --output=[/data/mahout/project/temp/similarityMatrix], --randomSeed=[-9223372036854775808], --similarityClassname=[SIMILARITY_EUCLIDEAN_DISTANCE], --startPhase=[0], --tempDir=[/data/mahout/project/temp], --threshold=[4.9E-324]}

17/05/15 19:34:52 INFO impl.TimelineClientImpl: Timeline service address: http://slaver1:8188/ws/v1/timeline/

17/05/15 19:34:52 INFO client.RMProxy: Connecting to ResourceManager at slaver1/

17/05/15 19:34:52 INFO input.FileInputFormat: Total input paths to process : 1

17/05/15 19:34:53 INFO mapreduce.JobSubmitter: number of splits:1

17/05/15 19:34:53 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494874269419_0016

17/05/15 19:34:53 INFO impl.YarnClientImpl: Submitted application application_1494874269419_0016

17/05/15 19:34:53 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1494874269419_0016/

17/05/15 19:34:53 INFO mapreduce.Job: Running job: job_1494874269419_0016

17/05/15 19:35:00 INFO mapreduce.Job: Job job_1494874269419_0016 running in uber mode : false

17/05/15 19:35:00 INFO mapreduce.Job:  map 0% reduce 0%

17/05/15 19:35:05 INFO mapreduce.Job:  map 100% reduce 0%

17/05/15 19:35:11 INFO mapreduce.Job:  map 100% reduce 100%

17/05/15 19:35:13 INFO mapreduce.Job: Job job_1494874269419_0016 completed successfully

17/05/15 19:35:13 INFO mapreduce.Job: Counters: 49

        File System Counters

               FILE: Number of bytes read=50

               FILE: Number of bytes written=272971

               FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

               HDFS: Number of bytes read=493

               HDFS: Number of bytes written=150

               HDFS: Number of read operations=7

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=3


               Launched map tasks=1

               Launched reduce tasks=1

               Data-local map tasks=1

               Total time spent by all maps in occupied slots (ms)=3341

               Total time spent by all reduces in occupied slots (ms)=6930

               Total time spent by all map tasks (ms)=3341

               Total time spent by all reduce tasks (ms)=3465

               Total vcore-seconds taken by all map tasks=3341

               Total vcore-seconds taken by all reduce tasks=3465

               Total megabyte-seconds taken by all map tasks=1710592

               Total megabyte-seconds taken by all reduce tasks=3548160

       Map-Reduce Framework

               Map input records=7

               Map output records=1

               Map output bytes=52

                Map output materialized bytes=42

               Input split bytes=158

               Combine input records=1

               Combine output records=1

               Reduce input groups=1

               Reduce shuffle bytes=42

                Reduce input records=1

               Reduce output records=0

               Spilled Records=2

               Shuffled Maps =1

               Failed Shuffles=0

               Merged Map outputs=1

               GC time elapsed (ms)=129

               CPU time spent (ms)=1890

               Physical memory (bytes) snapshot=665804800

               Virtual memory (bytes) snapshot=5178810368

               Total committed heap usage (bytes)=488636416








        File Input Format Counters

               Bytes Read=335

        File Output Format Counters

               Bytes Written=98

17/05/15 19:35:14 INFO impl.TimelineClientImpl: Timeline service address: http://slaver1:8188/ws/v1/timeline/

17/05/15 19:35:14 INFO client.RMProxy: Connecting to ResourceManager at slaver1/

17/05/15 19:35:15 INFO input.FileInputFormat: Total input paths to process : 1

17/05/15 19:35:16 INFO mapreduce.JobSubmitter: number of splits:1

17/05/15 19:35:16 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494874269419_0017

17/05/15 19:35:17 INFO impl.YarnClientImpl: Submitted application application_1494874269419_0017

17/05/15 19:35:17 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1494874269419_0017/

17/05/15 19:35:17 INFO mapreduce.Job: Running job: job_1494874269419_0017

17/05/15 19:35:24 INFO mapreduce.Job: Job job_1494874269419_0017 running in uber mode : false

17/05/15 19:35:24 INFO mapreduce.Job:  map 0% reduce 0%

17/05/15 19:35:33 INFO mapreduce.Job:  map 100% reduce 0%

17/05/15 19:35:39 INFO mapreduce.Job:  map 100% reduce 100%

17/05/15 19:35:40 INFO mapreduce.Job: Job job_1494874269419_0017 completed successfully

17/05/15 19:35:40 INFO mapreduce.Job: Counters: 52

        File System Counters

               FILE: Number of bytes read=166

               FILE: Number of bytes written=276957

               FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

               HDFS: Number of bytes read=545

                HDFS: Number of bytes written=447

               HDFS: Number of read operations=8

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=5


               Launched map tasks=1

               Launched reduce tasks=1

               Data-local map tasks=1

               Total time spent by all maps in occupied slots (ms)=6262

               Total time spent by all reduces in occupied slots (ms)=7262

               Total time spent by all map tasks (ms)=6262

               Total time spent by all reduce tasks (ms)=3631

               Total vcore-seconds taken by all map tasks=6262

               Total vcore-seconds taken by all reduce tasks=3631

               Total megabyte-seconds taken by all map tasks=3206144

               Total megabyte-seconds taken by all reduce tasks=3718144

       Map-Reduce Framework

               Map input records=7

               Map output records=22

               Map output bytes=476

               Map output materialized bytes=158

               Input split bytes=158

               Combine input records=22

               Combine output records=8

               Reduce input groups=8

                Reduce shuffle bytes=158

               Reduce input records=8

               Reduce output records=5

               Spilled Records=16

               Shuffled Maps =1

               Failed Shuffles=0

               Merged Map outputs=1

                GC time elapsed (ms)=154

               CPU time spent (ms)=4190

               Physical memory (bytes) snapshot=666284032

               Virtual memory (bytes) snapshot=5179322368

               Total committed heap usage (bytes)=489684992

        Shuffle Errors







        File Input Format Counters

               Bytes Read=335

        File Output Format Counters

               Bytes Written=363





17/05/15 19:35:40 INFO impl.TimelineClientImpl: Timeline service address: http://slaver1:8188/ws/v1/timeline/

17/05/15 19:35:40 INFO client.RMProxy: Connecting to ResourceManager at slaver1/

17/05/15 19:35:44 INFO input.FileInputFormat: Total input paths to process : 1

17/05/15 19:35:45 INFO mapreduce.JobSubmitter: number of splits:1

17/05/15 19:35:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494874269419_0018

17/05/15 19:35:45 INFO impl.YarnClientImpl: Submitted application application_1494874269419_0018

17/05/15 19:35:45 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1494874269419_0018/

17/05/15 19:35:45 INFO mapreduce.Job: Running job: job_1494874269419_0018

17/05/15 19:35:57 INFO mapreduce.Job: Job job_1494874269419_0018 running in uber mode : false

17/05/15 19:35:57 INFO mapreduce.Job:  map 0% reduce 0%

17/05/15 19:36:07 INFO mapreduce.Job:  map 100% reduce 0%

17/05/15 19:36:14 INFO mapreduce.Job:  map 100% reduce 100%

17/05/15 19:36:15 INFO mapreduce.Job: Job job_1494874269419_0018 completed successfully

17/05/15 19:36:15 INFO mapreduce.Job: Counters: 51

        File System Counters

               FILE: Number of bytes read=160

               FILE: Number of bytes written=275869

               FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

               HDFS: Number of bytes read=576

               HDFS: Number of bytes written=365

               HDFS: Number of read operations=10

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=2


               Launched map tasks=1

               Launched reduce tasks=1

               Data-local map tasks=1

               Total time spent by all maps in occupied slots (ms)=8191

               Total time spent by all reduces in occupied slots (ms)=8114

               Total time spent by all map tasks (ms)=8191

               Total time spent by all reduce tasks (ms)=4057

               Total vcore-seconds taken by all map tasks=8191

               Total vcore-seconds taken by all reduce tasks=4057

               Total megabyte-seconds taken by all map tasks=4193792

               Total megabyte-seconds taken by all reduce tasks=4154368

       Map-Reduce Framework

               Map input records=5

               Map output records=19

               Map output bytes=632

               Map output materialized bytes=152

               Input split bytes=129

               Combine input records=19

               Combine output records=7

               Reduce input groups=7

               Reduce shuffle bytes=152

               Reduce input records=7

               Reduce output records=7

               Spilled Records=14

               Shuffled Maps =1

               Failed Shuffles=0

               Merged Map outputs=1

               GC time elapsed (ms)=195

               CPU time spent (ms)=6470

               Physical memory (bytes) snapshot=681738240

               Virtual memory (bytes) snapshot=5182803968

               Total committed heap usage (bytes)=490733568








        File Input Format Counters

               Bytes Read=363

        File Output Format Counters

               Bytes Written=365




17/05/15 19:36:16 INFO impl.TimelineClientImpl: Timeline service address: http://slaver1:8188/ws/v1/timeline/

17/05/15 19:36:16 INFO client.RMProxy: Connecting to ResourceManager at slaver1/

17/05/15 19:36:17 INFO input.FileInputFormat: Total input paths to process : 1

17/05/15 19:36:17 INFO mapreduce.JobSubmitter: number of splits:1

17/05/15 19:36:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494874269419_0019

17/05/15 19:36:18 INFO impl.YarnClientImpl: Submitted application application_1494874269419_0019

17/05/15 19:36:18 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1494874269419_0019/

17/05/15 19:36:18 INFO mapreduce.Job: Running job: job_1494874269419_0019

17/05/15 19:36:25 INFO mapreduce.Job: Job job_1494874269419_0019 running in uber mode : false

17/05/15 19:36:25 INFO mapreduce.Job:  map 0% reduce 0%

17/05/15 19:36:31 INFO mapreduce.Job:  map 100% reduce 0%

17/05/15 19:36:37 INFO mapreduce.Job:  map 100% reduce 100%

17/05/15 19:36:38 INFO mapreduce.Job: Job job_1494874269419_0019 completed successfully

17/05/15 19:36:38 INFO mapreduce.Job: Counters: 49

        File System Counters

               FILE: Number of bytes read=249

               FILE: Number of bytes written=273343

                FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

               HDFS: Number of bytes read=505

               HDFS: Number of bytes written=500

                HDFS: Number of read operations=7

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=2


               Launched map tasks=1

               Launched reduce tasks=1

                Data-local map tasks=1

               Total time spent by all maps in occupied slots (ms)=4469

               Total time spent by all reduces in occupied slots (ms)=6400

               Total time spent by all map tasks (ms)=4469

               Total time spent by all reduce tasks (ms)=3200

               Total vcore-seconds taken by all map tasks=4469

               Total vcore-seconds taken by all reduce tasks=3200

               Total megabyte-seconds taken by all map tasks=2288128

               Total megabyte-seconds taken by all reduce tasks=3276800

       Map-Reduce Framework

               Map input records=7

               Map output records=22

               Map output bytes=512

               Map output materialized bytes=241

               Input split bytes=140

               Combine input records=22

               Combine output records=7

               Reduce input groups=7

               Reduce shuffle bytes=241

               Reduce input records=7

                Reduce output records=7

               Spilled Records=14

               Shuffled Maps =1

               Failed Shuffles=0

               Merged Map outputs=1

               GC time elapsed (ms)=113

               CPU time spent (ms)=3290

               Physical memory (bytes) snapshot=665755648

               Virtual memory (bytes) snapshot=5179650048

               Total committed heap usage (bytes)=486014976








        File Input Format Counters

               Bytes Read=365

        File Output Format Counters

               Bytes Written=500

17/05/15 19:36:38 INFO impl.TimelineClientImpl: Timeline service address: http://slaver1:8188/ws/v1/timeline/

17/05/15 19:36:38 INFO client.RMProxy: Connecting to ResourceManager at slaver1/

17/05/15 19:36:38 INFO input.FileInputFormat: Total input paths to process : 1

17/05/15 19:36:38 INFO input.FileInputFormat: Total input paths to process : 1

17/05/15 19:36:39 INFO mapreduce.JobSubmitter: number of splits:2

17/05/15 19:36:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494874269419_0020

17/05/15 19:36:39 INFO impl.YarnClientImpl: Submitted application application_1494874269419_0020

17/05/15 19:36:39 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1494874269419_0020/

17/05/15 19:36:39 INFO mapreduce.Job: Running job: job_1494874269419_0020

17/05/15 19:36:47 INFO mapreduce.Job: Job job_1494874269419_0020 running in uber mode : false

17/05/15 19:36:47 INFO mapreduce.Job:  map 0% reduce 0%

17/05/15 19:36:54 INFO mapreduce.Job:  map 50% reduce 0%

17/05/15 19:36:55 INFO mapreduce.Job:  map 100% reduce 0%

17/05/15 19:37:00 INFO mapreduce.Job:  map 100% reduce 100%

17/05/15 19:37:01 INFO mapreduce.Job: Job job_1494874269419_0020 completed successfully

17/05/15 19:37:01 INFO mapreduce.Job: Counters: 49

        File System Counters

               FILE: Number of bytes read=309

               FILE: Number of bytes written=410207

               FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

               HDFS: Number of bytes read=1453

               HDFS: Number of bytes written=542

               HDFS: Number of read operations=11

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=2


               Launched map tasks=2

               Launched reduce tasks=1

               Data-local map tasks=2

               Total time spent by all maps in occupied slots (ms)=11488

               Total time spent by all reduces in occupied slots (ms)=6586

               Total time spent by all map tasks (ms)=11488

               Total time spent by all reduce tasks (ms)=3293

               Total vcore-seconds taken by all map tasks=11488

               Total vcore-seconds taken by all reduce tasks=3293

               Total megabyte-seconds taken by all map tasks=5881856

               Total megabyte-seconds taken by all reduce tasks=3372032

       Map-Reduce Framework

               Map input records=12

               Map output records=28

               Map output bytes=423

               Map output materialized bytes=306

               Input split bytes=665

                Combine input records=0

               Combine output records=0

               Reduce input groups=7

               Reduce shuffle bytes=306

               Reduce input records=28

               Reduce output records=7

               Spilled Records=56

               Shuffled Maps =2

               Failed Shuffles=0

               Merged Map outputs=2

               GC time elapsed (ms)=244

               CPU time spent (ms)=6050

               Physical memory (bytes) snapshot=1123991552

               Virtual memory (bytes) snapshot=7530958848

               Total committed heap usage (bytes)=849870848








        File Input Format Counters

               Bytes Read=0

        File Output Format Counters

               Bytes Written=542

17/05/15 19:37:01 INFO impl.TimelineClientImpl: Timeline service address: http://slaver1:8188/ws/v1/timeline/

17/05/15 19:37:01 INFO client.RMProxy: Connecting to ResourceManager at slaver1/

17/05/15 19:37:02 INFO input.FileInputFormat: Total input paths to process : 1

17/05/15 19:37:03 INFO mapreduce.JobSubmitter: number of splits:1

17/05/15 19:37:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494874269419_0021

17/05/15 19:37:03 INFO impl.YarnClientImpl: Submitted application application_1494874269419_0021

17/05/15 19:37:03 INFO mapreduce.Job: The url to track the job: http://slaver1:8088/proxy/application_1494874269419_0021/

17/05/15 19:37:03 INFO mapreduce.Job: Running job: job_1494874269419_0021

17/05/15 19:37:10 INFO mapreduce.Job: Job job_1494874269419_0021 running in uber mode : false

17/05/15 19:37:10 INFO mapreduce.Job:  map 0% reduce 0%

17/05/15 19:37:17 INFO mapreduce.Job:  map 100% reduce 0%

17/05/15 19:37:24 INFO mapreduce.Job:  map 100% reduce 100%

17/05/15 19:37:25 INFO mapreduce.Job: Job job_1494874269419_0021 completed successfully

17/05/15 19:37:25 INFO mapreduce.Job: Counters: 49

        File System Counters

               FILE: Number of bytes read=274

               FILE: Number of bytes written=273455

               FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

               HDFS: Number of bytes read=866

               HDFS: Number of bytes written=185

               HDFS: Number of read operations=10

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=2


               Launched map tasks=1

               Launched reduce tasks=1

               Data-local map tasks=1

               Total time spent by all maps in occupied slots (ms)=4874

               Total time spent by all reduces in occupied slots (ms)=6604

               Total time spent by all map tasks (ms)=4874

               Total time spent by all reduce tasks (ms)=3302

               Total vcore-seconds taken by all map tasks=4874

               Total vcore-seconds taken by all reduce tasks=3302

               Total megabyte-seconds taken by all map tasks=2495488

               Total megabyte-seconds taken by all reduce tasks=3381248

       Map-Reduce Framework

               Map input records=7

               Map output records=19

               Map output bytes=768

               Map output materialized bytes=266

               Input split bytes=137

               Combine input records=0

               Combine output records=0

               Reduce input groups=5

               Reduce shuffle bytes=266

               Reduce input records=19

                Reduce output records=5

               Spilled Records=38

               Shuffled Maps =1

               Failed Shuffles=0

               Merged Map outputs=1

               GC time elapsed (ms)=124

               CPU time spent (ms)=2150

               Physical memory (bytes) snapshot=597028864

               Virtual memory (bytes) snapshot=5181710336

               Total committed heap usage (bytes)=401080320








        File Input Format Counters

               Bytes Read=542

        File Output Format Counters

               Bytes Written=185

17/05/15 19:37:25 INFO driver.MahoutDriver: Program took 259068 ms (Minutes: 4.3178)


[hdfs@master ~]$ hadoop fs -cat /data/mahout/project/output/part-r-00000

1      [105:3.5941463,104:3.4639049]

2      [106:3.5,105:2.714964,107:2.0]

3      [103:3.59246,102:3.458911]

4      [107:4.7381864,105:4.2794304,102:4.170158]

5      [103:3.8962872,102:3.8564017,107:3.7692602]
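
Each line of the result file above appears to hold a user ID followed by a bracketed, comma-separated list of `itemID:score` pairs. As a minimal sketch of how this output could be consumed downstream, the Python snippet below parses that layout into a dictionary. The function name `parse_recommendations` and the regular expression are illustrative assumptions based only on the lines shown here; they are not part of Mahout.

```python
import re

# Assumed line layout, inferred from the output above:
#   <userID><whitespace>[<itemID>:<score>,<itemID>:<score>,...]
LINE_PATTERN = re.compile(r"^(\d+)\s+\[(.*)\]$")


def parse_recommendations(text):
    """Return {user_id: [(item_id, score), ...]} for each well-formed line."""
    result = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip the blank lines between records
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # ignore anything that does not fit the assumed layout
        user_id = int(match.group(1))
        pairs = []
        for entry in match.group(2).split(","):
            if not entry:
                continue  # an empty list "[]" yields no pairs
            item_id, score = entry.split(":")
            pairs.append((int(item_id), float(score)))
        result[user_id] = pairs
    return result


# The five lines printed by `hadoop fs -cat` in the session above.
sample = """
1      [105:3.5941463,104:3.4639049]
2      [106:3.5,105:2.714964,107:2.0]
3      [103:3.59246,102:3.458911]
4      [107:4.7381864,105:4.2794304,102:4.170158]
5      [103:3.8962872,102:3.8564017,107:3.7692602]
"""

recs = parse_recommendations(sample)
for user_id in sorted(recs):
    # The first pair on each line carries the highest score in this sample.
    top_item, top_score = recs[user_id][0]
    print(f"user {user_id}: top item {top_item} (score {top_score})")
```

In practice the text would come from the saved output of `hadoop fs -cat` rather than an inline string.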







