[Mahout 1] Meaning of Mahout Command Parameters

axxbc123

Category: Mahout. Tags: artificial intelligence, big data

Article link: https://blog.csdn.net/axxbc123/article/details/84720191

1. mahout seqdirectory

    $ mahout seqdirectory
        --input (-i) input               Path to the job input directory (raw text files).
        --output (-o) output             The directory pathname for output (a <Text, Text> SequenceFile).
        --overwrite (-ow)                If present, overwrite the output directory before running the job.

Purpose: converts the raw text dataset into a <Text, Text> SequenceFile.
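To make the <Text, Text> structure concrete, here is a minimal Python sketch of the idea: each file under the input directory becomes one key/value pair of text. It does not write a Hadoop SequenceFile and is not Mahout's implementation. The function name `directory_to_pairs`, the sample file names, and the choice of a slash-prefixed relative path as the key are assumptions made for illustration.

```python
import os
import tempfile


def directory_to_pairs(input_dir, encoding="utf-8"):
    """Return one (key, value) text pair per file found under input_dir.

    Assumption for this sketch: the key is the file path relative to
    input_dir with a leading slash, and the value is the file's full text.
    """
    pairs = []
    for root, _dirs, files in os.walk(input_dir):
        for name in files:
            path = os.path.join(root, name)
            rel = os.path.relpath(path, input_dir).replace(os.sep, "/")
            with open(path, encoding=encoding) as fh:
                pairs.append(("/" + rel, fh.read()))
    # Sort so the result does not depend on directory traversal order.
    return sorted(pairs)


# Build a tiny hypothetical input directory and convert it.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sports"))
    for fname, text in [("a.txt", "first doc"), ("b.txt", "second doc")]:
        with open(os.path.join(tmp, "sports", fname), "w", encoding="utf-8") as fh:
            fh.write(text)
    pairs = directory_to_pairs(tmp)

for key, value in pairs:
    print(key, "->", value)
```

Both the key and the value are plain text here, which mirrors the <Text, Text> typing of the output described above.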

2. mahout seq2sparse

Purpose: converts and preprocesses the dataset (a <Text, Text> SequenceFile) into a <Text, VectorWritable> SequenceFile containing term frequencies for each document.

In other words, it converts the SequenceFile into TF-IDF vector files.

Note: to use different parsing methods or transformations on the term frequency vectors, supply different options here, e.g. -ng 2 for bigrams or -n 2 for L2 length normalization.
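The effect of the two example options can be illustrated with a small Python sketch: counting terms with n-grams up to size 2, then scaling the vector to unit L2 length. This is a conceptual illustration only. It uses whitespace tokenization and no pruning, so it will not reproduce Mahout's actual tokenizer or output, and the helper names (`ngrams`, `term_frequencies`, `l2_normalize`) are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, max_n):
    """Return all n-grams of size 1..max_n, each joined by a single space."""
    out = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def term_frequencies(text, max_ngram=1):
    """Count n-gram occurrences in one document (whitespace tokenization)."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, max_ngram))


def l2_normalize(vec):
    """Scale a sparse vector so its Euclidean (L2) length is 1."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    if norm == 0:
        return dict(vec)
    return {term: v / norm for term, v in vec.items()}


# Unigrams plus bigrams, analogous in spirit to "-ng 2".
tf = term_frequencies("the cat saw the dog", max_ngram=2)
# Unit-length vector, analogous in spirit to "-n 2".
unit = l2_normalize(tf)

for term in sorted(tf):
    print(f"{term!r}: tf={tf[term]} normalized={unit[term]:.3f}")
```

Normalizing by the L2 length keeps long documents from dominating purely because they contain more terms.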
