2021SC@SDUSC
Overview
This post continues the analysis of Pig, Hadoop's lightweight scripting language for operating on Hadoop, looking at the code of the LimitAdjuster class under the executionengine package.
The LimitAdjuster class
LimitAdjuster: literally a "limit adjuster", it rewrites the MapReduce plan so that a LIMIT is enforced correctly.
The visitMROp method
Finds map-reduce operators that contain a limit operator. Each one found is recorded in opsToAdjust, so that adjust() can later add an extra single-reducer map-reduce job to the plan.
public void visitMROp(MapReduceOper mr) throws VisitorException {
    if (mr.limit != -1 || mr.limitPlan != null) {
        opsToAdjust.add(mr);
    }
}
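The collection step above can be sketched in plain Java. MROp here is a hypothetical stand-in carrying only the two fields visitMROp inspects, not Pig's real MapReduceOper:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Pig's MapReduceOper, reduced to the two
// fields that visitMROp checks (illustration only).
class MROp {
    int limit = -1;          // -1 means "no limit set"
    Object limitPlan = null;
    MROp(int limit) { this.limit = limit; }
}

public class LimitCollector {
    // Mirrors visitMROp: collect every operator that carries a limit,
    // so a later adjust() pass can rewrite each of them.
    static List<MROp> collect(List<MROp> all) {
        List<MROp> opsToAdjust = new ArrayList<>();
        for (MROp mr : all) {
            if (mr.limit != -1 || mr.limitPlan != null) {
                opsToAdjust.add(mr);
            }
        }
        return opsToAdjust;
    }
}
```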
The splitReducerForLimit method
Moves all operators between POLimit and POStore in the reduce plan of firstMROp into secondMROp.
private void splitReducerForLimit(MapReduceOper secondMROp,
        MapReduceOper firstMROp) throws PlanException, VisitorException {
    PhysicalOperator op = firstMROp.reducePlan.getRoots().get(0);
    assert(op instanceof POPackage);
    // Walk down from the root until just past POLimit; the operators
    // from this point on are the ones to move into secondMROp.
    while (true) {
        List<PhysicalOperator> succs = firstMROp.reducePlan
                .getSuccessors(op);
        if (succs == null) break;
        op = succs.get(0);
        if (op instanceof POLimit) {
            op = firstMROp.reducePlan.getSuccessors(op).get(0);
            break;
        }
    }
}
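The walk above can be illustrated on a simplified, linear "reduce plan" modeled as a list of operator names. This is a hypothetical model (Pig's real plan is a DAG of PhysicalOperators), meant only to show which span of operators the method ends up moving:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the splitReducerForLimit walk on a linear plan of
// operator names (hypothetical; not Pig's actual plan API).
public class ReducerSplitter {
    // Returns the operators that come after "POLimit" (through the
    // store), i.e. the span the real method moves to the second job.
    static List<String> operatorsAfterLimit(List<String> reducePlan) {
        List<String> moved = new ArrayList<>();
        int i = reducePlan.indexOf("POLimit");
        if (i < 0) return moved;  // no limit in this plan: nothing to move
        for (int j = i + 1; j < reducePlan.size(); j++) {
            moved.add(reducePlan.get(j));
        }
        return moved;
    }
}
```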
The adjust method
Splits the original reduce plan into two MapReduce jobs:
First: from the root (POPackage) to POLimit
Second: from POLimit to the leaf (POStore), duplicating POLimit
public void adjust() throws IOException, PlanException {
    FileSpec fSpec = new FileSpec(FileLocalizer.getTemporaryPath(pigContext).toString(),
            new FuncSpec(Utils.getTmpFileCompressorName(pigContext)));
    // Turn the original store into a temporary store that the
    // follow-up job will read back.
    POStore storeOp = (POStore) mpLeaf;
    storeOp.setSFile(fSpec);
    storeOp.setIsTmpStore(true);
    mr.setReduceDone(true);
    // Build the extra map-reduce job that re-applies the limit.
    MapReduceOper limitAdjustMROp = new MapReduceOper(new OperatorKey(scope, nig.getNextNodeId(scope)));
    POLoad ld = new POLoad(new OperatorKey(scope, nig.getNextNodeId(scope)));
    ld.setPc(pigContext);
    ld.setLFile(fSpec);
    ld.setIsTmpLoad(true);
    limitAdjustMROp.mapPlan.add(ld);
    if (mr.isGlobalSort()) {
        connectMapToReduceLimitedSort(limitAdjustMROp, mr);
    } else {
        MRUtil.simpleConnectMapToReduce(limitAdjustMROp, scope, nig);
    }
    splitReducerForLimit(limitAdjustMROp, mr);
    if (mr.isGlobalSort()) {
        limitAdjustMROp.setLimitAfterSort(true);
        limitAdjustMROp.setSortOrder(mr.getSortOrder());
    }
}
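Conceptually, adjust() splits one reduce pipeline into two jobs at POLimit, duplicating POLimit so the second (single-reducer) job enforces the final count. A minimal sketch of that split, again on a hypothetical linear model of the plan rather than Pig's real API:

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch of the adjust() split: one reduce pipeline
// becomes two jobs, with POLimit duplicated into the second one.
public class AdjustSketch {
    static List<List<String>> split(List<String> reducePlan) {
        int i = reducePlan.indexOf("POLimit");
        // Job 1: root (POPackage) through POLimit, ending in a temp store.
        List<String> first = new ArrayList<>(reducePlan.subList(0, i + 1));
        first.add("POStore(tmp)");
        // Job 2: temp load, the duplicated POLimit, then the rest of the plan.
        List<String> second = new ArrayList<>();
        second.add("POLoad(tmp)");
        second.add("POLimit");
        second.addAll(reducePlan.subList(i + 1, reducePlan.size()));
        return List.of(first, second);
    }
}
```

The temp store/load pair in the sketch corresponds to the fSpec temporary file that adjust() writes with setIsTmpStore and reads back with setIsTmpLoad.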