LLVM Study Notes (42)

3.6.2.4. Scheduling Based on Resource Descriptions

3.6.2.4.1. Collecting SchedReadWrite Data

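Processors such as SandyBridge describe instruction execution details differently from processors such as Atom, so the processing covered above does not apply to them. These processors use WriteRes or SchedWriteRes to describe how a SchedWrite consumes resources, and rely on ReadAdvance or SchedReadAdvance to describe the read-advance behavior of a particular SchedRead with respect to particular SchedWrite definitions. To process these definitions, we first need to find the SchedWrite and SchedRead definitions associated with the scheduling class currently being handled.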

SubtargetEmitter::EmitSchedModel (continued)

1266    OS << "\n// ===============================================================\n"

1267       << "// Data tables for the new per-operand machine model.\n";

1268 

1269    SchedClassTables SchedTables;

1270    for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),

1271           PE = SchedModels.procModelEnd(); PI != PE; ++PI) {

1272      GenSchedClassTables(*PI, SchedTables);

1273    }

1274    EmitSchedClassTables(SchedTables, OS);

1275 

1276    // Emit the processor machine model

1277    EmitProcessorModels(OS);

1278    // Emit the processor lookup data

1279    EmitProcessorLookup(OS);

1280 

1281    OS << "#undef DBGFIELD";

1282  }

SchedClassTables at line 1269 is a nested class of SubtargetEmitter. Its ProcSchedClasses table holds, for every processor, one MCSchedClassDesc entry per scheduling class.

38         struct SchedClassTables {

39           std::vector<std::vector<MCSchedClassDesc> > ProcSchedClasses;

40          std::vector<MCWriteProcResEntry> WriteProcResources;

41           std::vector<MCWriteLatencyEntry> WriteLatencies;

42           std::vector<std::string> WriterNames;

43          std::vector<MCReadAdvanceEntry> ReadAdvanceEntries;

44      

45           // Reserve an invalid entry at index 0

46           SchedClassTables() {

47             ProcSchedClasses.resize(1);

48             WriteProcResources.resize(1);

49             WriteLatencies.resize(1);

50             WriterNames.push_back("InvalidWrite");

51             ReadAdvanceEntries.resize(1);

52           }

53         };

MCSchedClassDesc is defined as follows. It is how the MC layer describes resource scheduling.

101     struct MCSchedClassDesc {

102       static const unsigned short InvalidNumMicroOps = UINT16_MAX;

103       static const unsigned short VariantNumMicroOps = UINT16_MAX - 1;

104    

105     #ifndef NDEBUG

106       const char* Name;

107     #endif

108       unsigned short NumMicroOps;

109       bool     BeginGroup;

110       bool     EndGroup;

111       unsigned WriteProcResIdx; // First index into WriteProcResTable.

112       unsigned NumWriteProcResEntries;

113       unsigned WriteLatencyIdx; // First index into WriteLatencyTable.

114       unsigned NumWriteLatencyEntries;

115       unsigned ReadAdvanceIdx; // First index into ReadAdvanceTable.

116       unsigned NumReadAdvanceEntries;

117    

118       bool isValid() const {

119         return NumMicroOps != InvalidNumMicroOps;

120       }

121       bool isVariant() const {

122         return NumMicroOps == VariantNumMicroOps;

123       }

124     };

The type MCWriteProcResEntry describes a given scheduling class consuming a given processor resource for a given number of cycles.

55       struct MCWriteProcResEntry {

56         unsigned ProcResourceIdx;

57         unsigned Cycles;

58      

59         bool operator==(const MCWriteProcResEntry &Other) const {

60           return ProcResourceIdx == Other.ProcResourceIdx && Cycles == Other.Cycles;

61         }

62       };

The type MCWriteLatencyEntry records the number of processor cycles needed to execute a given SchedWrite definition.

69       struct MCWriteLatencyEntry {

70         int Cycles;

71         unsigned WriteResourceID;

72      

73         bool operator==(const MCWriteLatencyEntry &Other) const {

74           return Cycles == Other.Cycles && WriteResourceID == Other.WriteResourceID;

75         }

76       };

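An MCReadAdvanceEntry is created from a ReadAdvance definition and describes a pipeline bypass in the processor, where the result of a write can be forwarded to a subsequent read some cycles early (the count is given in the ReadAdvance definition). UseIdx is the index of the read operand that this ReadAdvance applies to (its position in the scheduling class's Reads list, see line 1000), WriteResourceID is the index of a SchedWrite that the bypass supports, and Cycles is the number of cycles by which the latency is shortened (a negative value lengthens it).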

86       struct MCReadAdvanceEntry {

87         unsigned UseIdx;

88         unsigned WriteResourceID;

89         int Cycles;

90      

91         bool operator==(const MCReadAdvanceEntry &Other) const {

92           return UseIdx == Other.UseIdx && WriteResourceID == Other.WriteResourceID

93             && Cycles == Other.Cycles;

94         }

95       };
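
To make the role of the three fields concrete, here is a minimal sketch of how a consumer might look up the advance for one operand. The names ReadAdvanceEntry, readAdvanceCycles, and effectiveLatency are hypothetical stand-ins, not LLVM's API. The sketch assumes the entries are sorted by UseIdx and then by WriteResourceID, that a WriteResourceID of 0 matches any write, and that the resulting latency is clamped at zero.

```cpp
#include <cassert>
#include <vector>

// Illustrative stand-in for MCReadAdvanceEntry (same three fields).
struct ReadAdvanceEntry {
  unsigned UseIdx;
  unsigned WriteResourceID;
  int Cycles;
};

// Returns the advance for operand UseIdx when it is fed by write WriteResID.
// Assumption: entries are sorted by UseIdx, then by WriteResourceID, and an
// entry whose WriteResourceID is 0 matches any write.
int readAdvanceCycles(const std::vector<ReadAdvanceEntry> &Entries,
                      unsigned UseIdx, unsigned WriteResID) {
  for (const ReadAdvanceEntry &E : Entries) {
    if (E.UseIdx < UseIdx)
      continue;
    if (E.UseIdx > UseIdx)
      break; // sorted input: no later entry can match
    if (E.WriteResourceID == 0 || E.WriteResourceID == WriteResID)
      return E.Cycles;
  }
  return 0; // no bypass described for this operand/write pair
}

// Latency seen by the reader. Assumption: a positive advance never pushes
// the latency below zero; a negative advance lengthens it.
int effectiveLatency(int WriteLatency, int Advance) {
  assert(WriteLatency >= 0 && "latency is expected to be non-negative");
  int Latency = WriteLatency - Advance;
  return Latency < 0 ? 0 : Latency;
}
```

With a table such as {0, 0, 1}, {1, 5, 2}, {1, 7, 3}, operand 0 gets an advance of 1 from any write, while operand 1 gets an advance only from writes 5 and 7.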

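The GenSchedClassTables() method called at line 1272 generates the MCSchedClassDesc instances for a particular processor. Line 816 calls CodeGenProcModel::hasInstrSchedModel(), which returns true when the WriteResDefs or ItinRWDefs container is non-empty, meaning that some WriteRes or ItinRW definition references this processor. Note line 815: in every case one element is appended to the ProcSchedClasses container of SchedTables, so it ends up with one element per entry of the ProcModels container of CodeGenSchedModels, in addition to the invalid element reserved at index 0.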

813     void SubtargetEmitter::GenSchedClassTables(const CodeGenProcModel &ProcModel,

814                                                SchedClassTables &SchedTables) {

815       SchedTables.ProcSchedClasses.resize(SchedTables.ProcSchedClasses.size() + 1);

816       if (!ProcModel.hasInstrSchedModel())

817         return;

818    

819       std::vector<MCSchedClassDesc> &SCTab = SchedTables.ProcSchedClasses.back();

820       for (CodeGenSchedModels::SchedClassIter SCI = SchedModels.schedClassBegin(),

821              SCE = SchedModels.schedClassEnd(); SCI != SCE; ++SCI) {

822         DEBUG(SCI->dump(&SchedModels));

823    

824         SCTab.resize(SCTab.size() + 1);

825         MCSchedClassDesc &SCDesc = SCTab.back();

826         // SCDesc.Name is guarded by NDEBUG

827         SCDesc.NumMicroOps = 0;

828         SCDesc.BeginGroup = false;

829         SCDesc.EndGroup = false;

830         SCDesc.WriteProcResIdx = 0;

831         SCDesc.WriteLatencyIdx = 0;

832         SCDesc.ReadAdvanceIdx = 0;

833    

834         // A Variant SchedClass has no resources of its own.

835         bool HasVariants = false;

836         for (std::vector<CodeGenSchedTransition>::const_iterator

837                TI = SCI->Transitions.begin(), TE = SCI->Transitions.end();

838              TI != TE; ++TI) {

839           if (TI->ProcIndices[0] == 0) {

840             HasVariants = true;

841             break;

842           }

843           IdxIter PIPos = std::find(TI->ProcIndices.begin(),

844                                     TI->ProcIndices.end(), ProcModel.Index);

845           if (PIPos != TI->ProcIndices.end()) {

846             HasVariants = true;

847             break;

848           }

849         }

850         if (HasVariants) {

851           SCDesc.NumMicroOps = MCSchedClassDesc::VariantNumMicroOps;

852           continue;

853         }

854    

855         // Determine if the SchedClass is actually reachable on this processor. If

856         // not don't try to locate the processor resources, it will fail.

857         // If ProcIndices contains 0, this class applies to all processors.

858         assert(!SCI->ProcIndices.empty() && "expect at least one procidx");

859         if (SCI->ProcIndices[0] != 0) {

860           IdxIter PIPos = std::find(SCI->ProcIndices.begin(),

861                                     SCI->ProcIndices.end(), ProcModel.Index);

862           if (PIPos == SCI->ProcIndices.end())

863             continue;

864         }

865         IdxVec Writes = SCI->Writes;

866         IdxVec Reads = SCI->Reads;

867         if (!SCI->InstRWs.empty()) {

868           // This class has a default ReadWrite list which can be overriden by

869           // InstRW definitions.

870           Record *RWDef = nullptr;

871           for (RecIter RWI = SCI->InstRWs.begin(), RWE = SCI->InstRWs.end();

872                RWI != RWE; ++RWI) {

873             Record *RWModelDef = (*RWI)->getValueAsDef("SchedModel");

874             if (&ProcModel == &SchedModels.getProcModel(RWModelDef)) {

875               RWDef = *RWI;

876               break;

877             }

878           }

879           if (RWDef) {

880             Writes.clear();

881             Reads.clear();

882             SchedModels.findRWs(RWDef->getValueAsListOfDefs("OperandReadWrites"),

883                                 Writes, Reads);

884           }

885         }

886         if (Writes.empty()) {

887           // Check this processor's itinerary class resources.

888           for (RecIter II = ProcModel.ItinRWDefs.begin(),

889                  IE = ProcModel.ItinRWDefs.end(); II != IE; ++II) {

890             RecVec Matched = (*II)->getValueAsListOfDefs("MatchedItinClasses");

891             if (std::find(Matched.begin(), Matched.end(), SCI->ItinClassDef)

892                 != Matched.end()) {

893               SchedModels.findRWs((*II)->getValueAsListOfDefs("OperandReadWrites"),

894                                   Writes, Reads);

895               break;

896             }

897           }

898           if (Writes.empty()) {

899             DEBUG(dbgs() << ProcModel.ModelName

900                   << " does not have resources for class " << SCI->Name << '\n');

901           }

902         }

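The loop at line 820 spans the whole function. It is too large to read in one go, so we look at it piece by piece. The loop iterates over all scheduling classes, and SCTab is the table of MCSchedClassDesc instances for the current processor. As we saw earlier, for a scheduling class (CodeGenSchedClass) and a particular processor, scheduling classes are inferred if there is an ItinRW or InstRW definition that maps the SchedReadWrite definitions or the InstrItinClass definition of this class to another set of SchedReadWrite definitions, or if the class itself carries a SchedReadWrite definition that contains a SchedVariant. The Transitions container of this CodeGenSchedClass instance then records, for that processor, the transitions to the new scheduling classes.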

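In addition, definitions derived from SchedVariant map a SchedReadWrite definition to other SchedReadWrite definitions through specific predicates (for example, in the v7.0 X86 target, JWriteFZeroIdiom maps to JWriteZeroLatency or WriteFLogic). These mappings are recorded in the Transitions of the CodeGenSchedClass. The MCSchedClassDesc instances of all mapped scheduling classes are identical (their fields are initialized at lines 827 to 832), because for this processor the mapped class is no longer useful in itself: the newly inferred classes replace it. Line 851 sets its NumMicroOps to the special value MCSchedClassDesc::VariantNumMicroOps, which triggers a lookup of the inferred scheduling class at run time (see MCSchedClassDesc::isVariant()).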

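As long as this CodeGenSchedClass object is not mapped to other classes for the current processor, control reaches line 858 and below. Lines 859 to 864 make sure that the current CodeGenSchedClass object applies to the current processor.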
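
The two checks in lines 836 to 864 share one convention: a ProcIndices list whose first element is 0 applies to every processor, otherwise it must contain the processor's index. The sketch below restates that logic in isolation. The names coversProcessor, ClassKind, and classify are hypothetical and only mirror the control flow of the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// True if a ProcIndices list covers processor ProcIdx.
// Convention taken from the listing: a leading 0 means "all processors".
bool coversProcessor(const std::vector<unsigned> &ProcIndices,
                     unsigned ProcIdx) {
  assert(!ProcIndices.empty() && "expect at least one procidx");
  if (ProcIndices[0] == 0)
    return true;
  return std::find(ProcIndices.begin(), ProcIndices.end(), ProcIdx) !=
         ProcIndices.end();
}

// Hypothetical three-way outcome for one scheduling class on one processor.
enum class ClassKind { Variant, Unreachable, Concrete };

// TransitionProcs holds the ProcIndices of each transition of the class;
// ClassProcs holds the ProcIndices of the class itself.
ClassKind classify(const std::vector<std::vector<unsigned>> &TransitionProcs,
                   const std::vector<unsigned> &ClassProcs,
                   unsigned ProcIdx) {
  // Any transition that applies to this processor makes the class a variant.
  for (const std::vector<unsigned> &TP : TransitionProcs)
    if (coversProcessor(TP, ProcIdx))
      return ClassKind::Variant;
  // Otherwise the class is processed only if it is reachable here.
  return coversProcessor(ClassProcs, ProcIdx) ? ClassKind::Concrete
                                              : ClassKind::Unreachable;
}
```

In terms of the listing, Variant corresponds to the `continue` at line 852, Unreachable to the `continue` at line 863, and Concrete to falling through to line 865.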

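We also saw earlier that, for a CodeGenSchedClass object mapped by InstRW definitions, CodeGenSchedModels::createInstRWClass() creates a new CodeGenSchedClass object whose InstRWs container records those InstRW definitions, and that this object inherits the contents of the Writes and Reads containers of the mapped object. In this case, however, the inherited contents of Writes and Reads are not valid; they should instead be the SchedReadWrite definitions listed in these InstRW definitions (field OperandReadWrites). Lines 867 to 885 handle this case. Lines 886 to 897 handle ItinRW definitions.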

3.6.2.4.2. Associating SchedReadWrite with Resources

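As we know, the core definitions in the description of a complex processor are WriteRes, which associates an operand's SchedWrite definition with resources, the similar SchedWriteRes, and ReadAdvance and SchedReadAdvance, which describe read advances. These are what we process next. Once the SchedWrite and SchedRead definitions of the scheduling class have been stored in the containers Writes and Reads respectively, line 908 first iterates over the SchedWrite definitions.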

SubtargetEmitter::GenSchedClassTables (continued)

903         // Sum resources across all operand writes.

904       std::vector<MCWriteProcResEntry> WriteProcResources;

905         std::vector<MCWriteLatencyEntry> WriteLatencies;

906         std::vector<std::string> WriterNames;

907       std::vector<MCReadAdvanceEntry> ReadAdvanceEntries;

908         for (IdxIter WI = Writes.begin(), WE = Writes.end(); WI != WE; ++WI) {

909           IdxVec WriteSeq;

910           SchedModels.expandRWSeqForProc(*WI, WriteSeq, /*IsRead=*/false,

911                                          ProcModel);

912    

913           // For each operand, create a latency entry.

914           MCWriteLatencyEntry WLEntry;

915           WLEntry.Cycles = 0;

916           unsigned WriteID = WriteSeq.back();

917           WriterNames.push_back(SchedModels.getSchedWrite(WriteID).Name);

918           // If this Write is not referenced by a ReadAdvance, don't distinguish it

919           // from other WriteLatency entries.

920           if (!SchedModels.hasReadOfWrite(

921                 SchedModels.getSchedWrite(WriteID).TheDef)) {

922             WriteID = 0;

923           }

924           WLEntry.WriteResourceID = WriteID;

925    

926           for (IdxIter WSI = WriteSeq.begin(), WSE = WriteSeq.end();

927                WSI != WSE; ++WSI) {

928    

929             Record *WriteRes =

930               FindWriteResources(SchedModels.getSchedWrite(*WSI), ProcModel);

931    

932             // Mark the parent class as invalid for unsupported write types.

933             if (WriteRes->getValueAsBit("Unsupported")) {

934               SCDesc.NumMicroOps = MCSchedClassDesc::InvalidNumMicroOps;

935               break;

936             }

937             WLEntry.Cycles += WriteRes->getValueAsInt("Latency");

938             SCDesc.NumMicroOps += WriteRes->getValueAsInt("NumMicroOps");

939             SCDesc.BeginGroup |= WriteRes->getValueAsBit("BeginGroup");

940             SCDesc.EndGroup |= WriteRes->getValueAsBit("EndGroup");

        SCDesc.BeginGroup |= WriteRes->getValueAsBit("SingleIssue");    // added in v7.0

        SCDesc.EndGroup |= WriteRes->getValueAsBit("SingleIssue");

941    

942             // Create an entry for each ProcResource listed in WriteRes.

943             RecVec PRVec = WriteRes->getValueAsListOfDefs("ProcResources");

944             std::vector<int64_t> Cycles =

945               WriteRes->getValueAsListOfInts("ResourceCycles");

 

        if (Cycles.empty()) {    // added in v7.0

          // If ResourceCycles is not provided, default to one cycle per

          // resource.

          Cycles.resize(PRVec.size(), 1);

        } else if (Cycles.size() != PRVec.size()) {

          // If ResourceCycles is provided, check consistency.

          PrintFatalError(

              WriteRes->getLoc(),

              Twine("Inconsistent resource cycles: !size(ResourceCycles) != "

                    "!size(ProcResources): ")

                  .concat(Twine(PRVec.size()))

                  .concat(" vs ")

                  .concat(Twine(Cycles.size())));

        }

946    

947             ExpandProcResources(PRVec, Cycles, ProcModel);

948    

949             for (unsigned PRIdx = 0, PREnd = PRVec.size();

950                  PRIdx != PREnd; ++PRIdx) {

951               MCWriteProcResEntry WPREntry;

952               WPREntry.ProcResourceIdx = ProcModel.getProcResourceIdx(PRVec[PRIdx]);

953               assert(WPREntry.ProcResourceIdx && "Bad ProcResourceIdx");

954               WPREntry.Cycles = Cycles[PRIdx];

955               // If this resource is already used in this sequence, add the current

956               // entry's cycles so that the same resource appears to be used

957               // serially, rather than multiple parallel uses. This is important for

958               // in-order machine where the resource consumption is a hazard.

959               unsigned WPRIdx = 0, WPREnd = WriteProcResources.size();

960               for( ; WPRIdx != WPREnd; ++WPRIdx) {

961                 if (WriteProcResources[WPRIdx].ProcResourceIdx

962                     == WPREntry.ProcResourceIdx) {

963                   WriteProcResources[WPRIdx].Cycles += WPREntry.Cycles;

964                   break;

965                 }

966               }

967               if (WPRIdx == WPREnd)

968                 WriteProcResources.push_back(WPREntry);

969             }

970           }

971           WriteLatencies.push_back(WLEntry);

972         }

973         // Create an entry for each operand Read in this SchedClass.

974         // Entries must be sorted first by UseIdx then by WriteResourceID.

975         for (unsigned UseIdx = 0, EndIdx = Reads.size();

976              UseIdx != EndIdx; ++UseIdx) {

977           Record *ReadAdvance =

978             FindReadAdvance(SchedModels.getSchedRead(Reads[UseIdx]), ProcModel);

979           if (!ReadAdvance)

980             continue;

981    

982           // Mark the parent class as invalid for unsupported write types.

983           if (ReadAdvance->getValueAsBit("Unsupported")) {

984             SCDesc.NumMicroOps = MCSchedClassDesc::InvalidNumMicroOps;

985             break;

986           }

987          RecVec ValidWrites = ReadAdvance->getValueAsListOfDefs("ValidWrites");

988           IdxVec WriteIDs;

989           if (ValidWrites.empty())

990             WriteIDs.push_back(0);

991           else {

992             for (RecIter VWI = ValidWrites.begin(), VWE = ValidWrites.end();

993                  VWI != VWE; ++VWI) {

994               WriteIDs.push_back(SchedModels.getSchedRWIdx(*VWI, /*IsRead=*/false));

995             }

996           }

997           std::sort(WriteIDs.begin(), WriteIDs.end());

998           for(IdxIter WI = WriteIDs.begin(), WE = WriteIDs.end(); WI != WE; ++WI) {

999             MCReadAdvanceEntry RAEntry;

1000          RAEntry.UseIdx = UseIdx;

1001          RAEntry.WriteResourceID = *WI;

1002          RAEntry.Cycles = ReadAdvance->getValueAsInt("Cycles");

1003          ReadAdvanceEntries.push_back(RAEntry);

1004        }

1005      }

1006      if (SCDesc.NumMicroOps == MCSchedClassDesc::InvalidNumMicroOps) {

1007        WriteProcResources.clear();

1008        WriteLatencies.clear();

1009        ReadAdvanceEntries.clear();

1010      }

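Because a SchedAlias definition can map one SchedReadWrite definition to another, expandRWSeqForProc() must be called so that the resulting SchedReadWrite definitions are correct when a SchedAlias exists. For a given SchedReadWrite there can be only one SchedAlias definition per processor. Since the alias target may itself be aliased, or may be a WriteSequence whose members are aliased, expandRWSeqForProc() is called recursively on it (lines 441 and 454).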

420     void CodeGenSchedModels::expandRWSeqForProc(

421       unsigned RWIdx, IdxVec &RWSeq, bool IsRead,

422       const CodeGenProcModel &ProcModel) const {

423    

424       const CodeGenSchedRW &SchedWrite = getSchedRW(RWIdx, IsRead);

425       Record *AliasDef = nullptr;

426       for (RecIter AI = SchedWrite.Aliases.begin(), AE = SchedWrite.Aliases.end();

427            AI != AE; ++AI) {

428         const CodeGenSchedRW &AliasRW = getSchedRW((*AI)->getValueAsDef("AliasRW"));

429         if ((*AI)->getValueInit("SchedModel")->isComplete()) {

430          Record *ModelDef = (*AI)->getValueAsDef("SchedModel");

431           if (&getProcModel(ModelDef) != &ProcModel)

432             continue;

433         }

434         if (AliasDef)

435           PrintFatalError(AliasRW.TheDef->getLoc(), "Multiple aliases "

436                           "defined for processor " + ProcModel.ModelName +

437                           " Ensure only one SchedAlias exists per RW.");

438         AliasDef = AliasRW.TheDef;

439      }

440       if (AliasDef) {

441       expandRWSeqForProc(getSchedRWIdx(AliasDef, IsRead),

442                            RWSeq, IsRead,ProcModel);

443         return;

444       }

445       if (!SchedWrite.IsSequence) {

446         RWSeq.push_back(RWIdx);

447         return;

448       }

449       int Repeat =

450         SchedWrite.TheDef ? SchedWrite.TheDef->getValueAsInt("Repeat") : 1;

451       for (int i = 0; i < Repeat; ++i) {

452         for (IdxIter I = SchedWrite.Sequence.begin(), E = SchedWrite.Sequence.end();

453              I != E; ++I) {

454           expandRWSeqForProc(*I, RWSeq, IsRead, ProcModel);

455         }

456       }

457     }
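
The recursion in expandRWSeqForProc() can be sketched on a simplified model. RWNode and expandRW below are hypothetical: each node is either a leaf, a sequence with a repeat count, or an alias to another node that has already been resolved for the current processor. The per-processor alias filtering and the multiple-alias error of the real method are left out.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Illustrative model of a SchedReadWrite node; not LLVM's CodeGenSchedRW.
struct RWNode {
  bool IsSequence = false;
  int Repeat = 1;                 // used only when IsSequence is true
  std::vector<unsigned> Sequence; // member indices when IsSequence is true
  unsigned Alias = 0;             // 0 = no alias for the current processor
};

// Flattens RWIdx into leaf indices: an alias is followed first, a leaf is
// appended as is, and a sequence is expanded Repeat times.
void expandRW(const std::map<unsigned, RWNode> &Nodes, unsigned RWIdx,
              std::vector<unsigned> &Out) {
  assert(Nodes.count(RWIdx) && "unknown SchedReadWrite index");
  const RWNode &N = Nodes.at(RWIdx);
  if (N.Alias != 0) {
    expandRW(Nodes, N.Alias, Out);
    return;
  }
  if (!N.IsSequence) {
    Out.push_back(RWIdx);
    return;
  }
  for (int i = 0; i < N.Repeat; ++i)
    for (unsigned Member : N.Sequence)
      expandRW(Nodes, Member, Out);
}
```

For example, if node 4 is aliased to node 3, and node 3 is a sequence of leaves 1 and 2 with Repeat set to 2, expanding node 4 yields 1, 2, 1, 2.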

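Note the WriteSeq argument passed to expandRWSeqForProc(): it is a container local to the loop at line 908, so it holds the SchedReadWrite definitions produced by the call at line 910 for one write at a time. Line 920 calls the method below to determine whether the expanded SchedWrite definition (the last one in the sequence) is referenced by a ProcReadAdvance definition. If it is not, this write does not need to be distinguished from other writes, and WriteID can be set to 0.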

352     bool CodeGenSchedModels::hasReadOfWrite(Record *WriteDef) const {

353       for (unsigned i = 0, e = SchedReads.size(); i < e; ++i) {

354         Record *ReadDef = SchedReads[i].TheDef;

355         if (!ReadDef || !ReadDef->isSubClassOf("ProcReadAdvance"))

356           continue;

357    

358         RecVec ValidWrites = ReadDef->getValueAsListOfDefs("ValidWrites");

359         if (std::find(ValidWrites.begin(), ValidWrites.end(), WriteDef)

360             != ValidWrites.end()) {

361           return true;

362         }

363       }

364       return false;

365     }

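Next, the loop at line 926 iterates over the expanded CodeGenSchedRW objects (more precisely, their indices in the SchedWrites container of the CodeGenSchedModels instance) and uses the method below to find the WriteRes or SchedWriteRes definition that associates each of them with processor resources. SchedWriteRes is itself derived from SchedWrite, so if the SchedWrite definition is a SchedWriteRes, it is returned directly (line 660). Otherwise we check whether the SchedWrite definition has an alias, and whether that alias is a SchedWriteRes applicable to the given processor.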

654     Record *SubtargetEmitter::FindWriteResources(

655       const CodeGenSchedRW &SchedWrite, const CodeGenProcModel &ProcModel) {

656    

657       // Check if the SchedWrite is already subtarget-specific and directly

658       // specifies a set of processor resources.

659       if (SchedWrite.TheDef->isSubClassOf("SchedWriteRes"))

660         return SchedWrite.TheDef;

661    

662       Record *AliasDef = nullptr;

663       for (RecIter AI = SchedWrite.Aliases.begin(), AE = SchedWrite.Aliases.end();

664            AI != AE; ++AI) {

665         const CodeGenSchedRW &AliasRW =

666           SchedModels.getSchedRW((*AI)->getValueAsDef("AliasRW"));

667         if (AliasRW.TheDef->getValueInit("SchedModel")->isComplete()) {

668           Record *ModelDef = AliasRW.TheDef->getValueAsDef("SchedModel");

669           if (&SchedModels.getProcModel(ModelDef) != &ProcModel)

670             continue;

671         }

672         if (AliasDef)

673           PrintFatalError(AliasRW.TheDef->getLoc(), "Multiple aliases "

674                         "defined for processor " + ProcModel.ModelName +

675                         " Ensure only one SchedAlias exists per RW.");

676         AliasDef = AliasRW.TheDef;

677       }

678       if (AliasDef && AliasDef->isSubClassOf("SchedWriteRes"))

679         return AliasDef;

680    

681       // Check this processor's list of write resources.

682       Record *ResDef = nullptr;

683       for (RecIter WRI = ProcModel.WriteResDefs.begin(),

684              WRE = ProcModel.WriteResDefs.end(); WRI != WRE; ++WRI) {

685         if (!(*WRI)->isSubClassOf("WriteRes"))

686           continue;

687         if (AliasDef == (*WRI)->getValueAsDef("WriteType")

688             || SchedWrite.TheDef == (*WRI)->getValueAsDef("WriteType")) {

689           if (ResDef) {

690             PrintFatalError((*WRI)->getLoc(), "Resources are defined for both "

691                           "SchedWrite and its alias on processor " +

692                           ProcModel.ModelName);

693           }

694           ResDef = *WRI;

695         }

696       }

697       // TODO: If ProcModel has a base model (previous generation processor),

698       // then call FindWriteResources recursively with that model here.

699       if (!ResDef) {

700         PrintFatalError(ProcModel.ModelDef->getLoc(),

701                       std::string("Processor does not define resources for ")

702                       + SchedWrite.TheDef->getName());

703       }

704       return ResDef;

705     }

If neither case applies, we look for a WriteRes definition that references the SchedWrite (and report a fatal error if there is none).

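The Record objects of WriteRes and SchedWriteRes definitions are all stored in the WriteResDefs container of the processor's CodeGenProcModel instance, so iterating over this container finds the WriteRes that references the SchedWrite definition. A processor can set the Unsupported field of a WriteRes to indicate that it does not support the association. In that case the NumMicroOps of the corresponding MCSchedClassDesc object is set to InvalidNumMicroOps and processing of the write sequence stops. The LLVM version examined here does not use this mechanism yet (v7.0 does; for example, the Broadwell model does not support WriteCvtPH2PSZ, the conversion from half-precision floating point to floating point on ZMM registers).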

For a supported WriteRes definition, the method below collects the resources it references together with the super resources that contain them.

763     void SubtargetEmitter::ExpandProcResources(RecVec &PRVec,

764                                                std::vector<int64_t> &Cycles,

765                                                const CodeGenProcModel &PM) {

766       // Default to 1 resource cycle.

767       Cycles.resize(PRVec.size(), 1);    // removed in v7.0

768       for (unsigned i = 0, e = PRVec.size(); i != e; ++i) {

769         Record *PRDef = PRVec[i];

770         RecVec SubResources;

771         if (PRDef->isSubClassOf("ProcResGroup"))

772           SubResources = PRDef->getValueAsListOfDefs("Resources");

773         else {

774           SubResources.push_back(PRDef);

775         PRDef = SchedModels.findProcResUnits(PRVec[i], PM);

776           for (Record *SubDef = PRDef;

777                SubDef->getValueInit("Super")->isComplete();) {

778             if (SubDef->isSubClassOf("ProcResGroup")) {

779               // Disallow this for simplicitly.

780               PrintFatalError(SubDef->getLoc(), "Processor resource group "

781                               " cannot be a super resources.");

782             }

783             Record *SuperDef =

784               SchedModels.findProcResUnits(SubDef->getValueAsDef("Super"), PM);

785             PRVec.push_back(SuperDef);

786             Cycles.push_back(Cycles[i]);

787             SubDef = SuperDef;

788           }

789         }

790         for (RecIter PRI = PM.ProcResourceDefs.begin(),

791                PRE = PM.ProcResourceDefs.end();

792              PRI != PRE; ++PRI) {

793           if (*PRI == PRDef || !(*PRI)->isSubClassOf("ProcResGroup"))

794             continue;

795           RecVec SuperResources = (*PRI)->getValueAsListOfDefs("Resources");

796           RecIter SubI = SubResources.begin(), SubE = SubResources.end();

797           for( ; SubI != SubE; ++SubI) {

798             if (std::find(SuperResources.begin(), SuperResources.end(), *SubI)

799                 == SuperResources.end()) {

800               break;

801             }

802           }

803           if (SubI == SubE) {

804             PRVec.push_back(*PRI);

805             Cycles.push_back(Cycles[i]);

806           }

807         }

808       }

809     }

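The loop at line 768 walks the members of the PRVec argument (which comes from the ProcResources field of WriteRes, of type list<ProcResourceKind>), that is, the ProcResourceKind members that make up ProcResources. A definition of type ProcResourceUnits can additionally name a super resource that contains it; these super resources also need to be recorded in PRVec, level by level.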

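The loop at line 790 iterates over all ProcResGroup definitions associated with this processor. A ProcResGroup definition is treated as a super resource, and recorded in PRVec, only if its resources completely cover SubResources. Note also that when a WriteRes gives no ResourceCycles, every resource defaults to 1 cycle.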
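
The coverage rule of the loop at line 790 can be illustrated with a small sketch. Group, groupCovers, and addCoveringGroups are hypothetical names, and resources are represented by plain strings instead of Record pointers. The port names used in the usage note (P0, P05, and so on) are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

using ResList = std::vector<std::string>;

// Hypothetical stand-in for a ProcResGroup: a name plus its member units.
struct Group {
  std::string Name;
  ResList Units;
};

// True if every resource in Sub also appears in GroupUnits.
bool groupCovers(const ResList &GroupUnits, const ResList &Sub) {
  return std::all_of(Sub.begin(), Sub.end(), [&](const std::string &U) {
    return std::find(GroupUnits.begin(), GroupUnits.end(), U) !=
           GroupUnits.end();
  });
}

// Appends every group (other than Self) that covers Sub, charging it the
// same cycle count as the resource that was originally listed.
void addCoveringGroups(const std::vector<Group> &Groups,
                       const std::string &Self, const ResList &Sub,
                       int Cycles, ResList &PRVec,
                       std::vector<int> &CycleVec) {
  assert(PRVec.size() == CycleVec.size() && "vectors must stay parallel");
  for (const Group &G : Groups) {
    if (G.Name == Self || !groupCovers(G.Units, Sub))
      continue;
    PRVec.push_back(G.Name);
    CycleVec.push_back(Cycles);
  }
}
```

With groups P05 = {P0, P5}, P15 = {P1, P5}, and P015 = {P0, P1, P5}, a write that uses unit P0 for 2 cycles is also charged 2 cycles on P05 and P015, but not on P15, which does not contain P0.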

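Back in GenSchedClassTables(), line 949 iterates over the resource definitions just collected. If the WriteProcResources container already has a record for a resource, the cycles are added to that record. The comment at line 955 explains that this makes the resource appear to be used serially rather than in parallel, which matters for in-order machines. If WriteProcResources has no record for the resource yet, the entry is appended to the container. Line 971 then appends the WLEntry object, which records the latency of the current operand, to the WriteLatencies container.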
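
The accumulate-or-append step of lines 959 to 968 is small enough to restate on its own. ResUse and addResourceUse are hypothetical names; the struct mirrors the two fields of MCWriteProcResEntry.

```cpp
#include <cassert>
#include <vector>

// Illustrative stand-in for MCWriteProcResEntry.
struct ResUse {
  unsigned ProcResourceIdx;
  unsigned Cycles;
};

// Records one resource use. If the resource is already present, its cycles
// are summed so that the uses look serial instead of parallel.
void addResourceUse(std::vector<ResUse> &Uses, ResUse New) {
  assert(New.ProcResourceIdx != 0 && "index 0 is reserved as invalid");
  for (ResUse &U : Uses) {
    if (U.ProcResourceIdx == New.ProcResourceIdx) {
      U.Cycles += New.Cycles;
      return;
    }
  }
  Uses.push_back(New);
}
```

Adding {1, 1}, {2, 3}, and {1, 2} in that order leaves two entries: resource 1 with 3 cycles and resource 2 with 3 cycles.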

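Line 975 starts iterating over the SchedRead definitions associated with the scheduling class being processed (in practice we only care about the ReadAdvance or SchedReadAdvance definitions associated with them).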

709     Record *SubtargetEmitter::FindReadAdvance(const CodeGenSchedRW &SchedRead,

710                                               const CodeGenProcModel &ProcModel) {

711       // Check for SchedReads that directly specify a ReadAdvance.

712     if (SchedRead.TheDef->isSubClassOf("SchedReadAdvance"))

713         return SchedRead.TheDef;

714    

715       // Check this processor's list of aliases for SchedRead.

716       Record *AliasDef = nullptr;

717       for (RecIter AI = SchedRead.Aliases.begin(), AE = SchedRead.Aliases.end();

718            AI != AE; ++AI) {

719         const CodeGenSchedRW &AliasRW =

720           SchedModels.getSchedRW((*AI)->getValueAsDef("AliasRW"));

721         if (AliasRW.TheDef->getValueInit("SchedModel")->isComplete()) {

722           Record *ModelDef = AliasRW.TheDef->getValueAsDef("SchedModel");

723           if (&SchedModels.getProcModel(ModelDef) != &ProcModel)

724             continue;

725         }

726         if (AliasDef)

727           PrintFatalError(AliasRW.TheDef->getLoc(), "Multiple aliases "

728                         "defined for processor " + ProcModel.ModelName +

729                         " Ensure only one SchedAlias exists per RW.");

730         AliasDef = AliasRW.TheDef;

731       }

732       if (AliasDef && AliasDef->isSubClassOf("SchedReadAdvance"))

733         return AliasDef;

734    

735       // Check this processor's ReadAdvanceList.

736       Record *ResDef = nullptr;

737       for (RecIter RAI = ProcModel.ReadAdvanceDefs.begin(),

738              RAE = ProcModel.ReadAdvanceDefs.end(); RAI != RAE; ++RAI) {

739         if (!(*RAI)->isSubClassOf("ReadAdvance"))

740           continue;

741         if (AliasDef == (*RAI)->getValueAsDef("ReadType")

742             || SchedRead.TheDef == (*RAI)->getValueAsDef("ReadType")) {

743           if (ResDef) {

744             PrintFatalError((*RAI)->getLoc(), "Resources are defined for both "

745                           "SchedRead and its alias on processor " +

746                           ProcModel.ModelName);

747           }

748           ResDef = *RAI;

749         }

750       }

751       // TODO: If ProcModel has a base model (previous generation processor),

752       // then call FindReadAdvance recursively with that model here.

753       if (!ResDef && SchedRead.TheDef->getName() != "ReadDefault") {

754         PrintFatalError(ProcModel.ModelDef->getLoc(),

755                       std::string("Processor does not define resources for ")

756                       + SchedRead.TheDef->getName());

757       }

758       return ResDef;

759     }

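Like FindWriteResources(), FindReadAdvance() finds the SchedReadAdvance definition that the SchedRead represents, or the ReadAdvance definition that references the SchedRead definition. A null pointer may be returned only when the SchedRead is ReadDefault (the default setting for reads).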

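Back in GenSchedClassTables(): likewise, both ReadAdvance and SchedReadAdvance can set the Unsupported field, which removes them from a particular processor's view. Lines 987 to 1004 extract the SchedWrite definitions referenced by the ReadAdvance and sort them; the loop at line 998 creates an MCReadAdvanceEntry instance for each of these SchedWrite definitions and stores it in the ReadAdvanceEntries container.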
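
As a rough illustration of lines 987 to 1004, the sketch below builds the entries for a list of read operands. ReadInfo, AdvanceEntry, and buildAdvanceEntries are hypothetical; ReadInfo stands in for the result of FindReadAdvance(), with HasAdvance set to false where the real code would see a null pointer. The Unsupported handling is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative stand-in for MCReadAdvanceEntry.
struct AdvanceEntry {
  unsigned UseIdx;
  unsigned WriteResourceID;
  int Cycles;
};

// Hypothetical per-operand input: whether a ReadAdvance applies, its cycle
// count, and the indices of the writes it is valid for (may be empty).
struct ReadInfo {
  bool HasAdvance;
  int Cycles;
  std::vector<unsigned> ValidWrites;
};

// Emits entries ordered by UseIdx, then by WriteResourceID. An empty
// ValidWrites list produces a single entry with WriteResourceID 0.
std::vector<AdvanceEntry>
buildAdvanceEntries(const std::vector<ReadInfo> &Reads) {
  std::vector<AdvanceEntry> Out;
  for (unsigned UseIdx = 0; UseIdx != Reads.size(); ++UseIdx) {
    const ReadInfo &R = Reads[UseIdx];
    if (!R.HasAdvance)
      continue;
    std::vector<unsigned> IDs = R.ValidWrites;
    if (IDs.empty())
      IDs.push_back(0);
    std::sort(IDs.begin(), IDs.end());
    for (unsigned ID : IDs)
      Out.push_back(AdvanceEntry{UseIdx, ID, R.Cycles});
  }
  assert(std::is_sorted(Out.begin(), Out.end(),
                        [](const AdvanceEntry &A, const AdvanceEntry &B) {
                          return A.UseIdx != B.UseIdx
                                     ? A.UseIdx < B.UseIdx
                                     : A.WriteResourceID < B.WriteResourceID;
                        }));
  return Out;
}
```

Because the outer loop runs in operand order and each operand's write IDs are sorted before emission, the result satisfies the ordering requirement stated in the comment at line 974 without a final sort.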

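By line 1015 we have finished processing the resources associated with the SchedWrite definitions of the current scheduling class, as well as its SchedReadAdvance and ReadAdvance definitions, and have generated the corresponding MCWriteProcResEntry, MCWriteLatencyEntry, and MCReadAdvanceEntry instances.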

SubtargetEmitter::GenSchedClassTables (continued)

1011      // Add the information for this SchedClass to the global tables using basic

1012      // compression.

1013      //

1014      // WritePrecRes entries are sorted by ProcResIdx.

1015      std::sort(WriteProcResources.begin(), WriteProcResources.end(),

1016                LessWriteProcResources());

1017 

1018      SCDesc.NumWriteProcResEntries = WriteProcResources.size();

1019      std::vector<MCWriteProcResEntry>::iterator WPRPos =

1020        std::search(SchedTables.WriteProcResources.begin(),

1021                   SchedTables.WriteProcResources.end(),

1022                    WriteProcResources.begin(), WriteProcResources.end());

1023      if (WPRPos != SchedTables.WriteProcResources.end())

1024        SCDesc.WriteProcResIdx = WPRPos - SchedTables.WriteProcResources.begin();

1025      else {

1026        SCDesc.WriteProcResIdx = SchedTables.WriteProcResources.size();

1027        SchedTables.WriteProcResources.insert(WPRPos, WriteProcResources.begin(),

1028                                              WriteProcResources.end());

1029      }

1030      // Latency entries must remain in operand order.

1031     SCDesc.NumWriteLatencyEntries = WriteLatencies.size();

1032      std::vector<MCWriteLatencyEntry>::iterator WLPos =

1033        std::search(SchedTables.WriteLatencies.begin(),

1034                    SchedTables.WriteLatencies.end(),

1035                    WriteLatencies.begin(), WriteLatencies.end());

1036      if (WLPos != SchedTables.WriteLatencies.end()) {

1037        unsigned idx = WLPos - SchedTables.WriteLatencies.begin();

1038        SCDesc.WriteLatencyIdx = idx;

1039        for (unsigned i = 0, e = WriteLatencies.size(); i < e; ++i)

1040          if (SchedTables.WriterNames[idx + i].find(WriterNames[i]) ==

1041              std::string::npos) {

1042            SchedTables.WriterNames[idx + i] += std::string("_") + WriterNames[i];

1043          }

1044      }

1045      else {

1046        SCDesc.WriteLatencyIdx = SchedTables.WriteLatencies.size();

1047        SchedTables.WriteLatencies.insert(SchedTables.WriteLatencies.end(),

1048                                          WriteLatencies.begin(),

1049                                          WriteLatencies.end());

1050        SchedTables.WriterNames.insert(SchedTables.WriterNames.end(),

1051                                       WriterNames.begin(), WriterNames.end());

1052      }

1053      // ReadAdvanceEntries must remain in operand order.

1054      SCDesc.NumReadAdvanceEntries = ReadAdvanceEntries.size();

1055      std::vector<MCReadAdvanceEntry>::iterator RAPos =

1056        std::search(SchedTables.ReadAdvanceEntries.begin(),

1057                    SchedTables.ReadAdvanceEntries.end(),

1058                    ReadAdvanceEntries.begin(), ReadAdvanceEntries.end());

1059      if (RAPos != SchedTables.ReadAdvanceEntries.end())

1060        SCDesc.ReadAdvanceIdx = RAPos - SchedTables.ReadAdvanceEntries.begin();

1061      else {

1062        SCDesc.ReadAdvanceIdx = SchedTables.ReadAdvanceEntries.size();

1063        SchedTables.ReadAdvanceEntries.insert(RAPos, ReadAdvanceEntries.begin(),

1064                                              ReadAdvanceEntries.end());

1065      }

1066    }

1067  }

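In the remaining code of GenSchedClassTables(), lines 1015 to 1029 add all MCWriteProcResEntry objects of the current scheduling class to the WriteProcResources container of SchedTables (this container and the ones mentioned below are shared by all processors). If the same run of entries already exists in the container (the std::search call at line 1020), it is reused instead of being added again. Note that a scheduling class may use several resources and therefore have several MCWriteProcResEntry instances; they are stored contiguously in WriteProcResources, so the scheduling class only needs to record the index of the first instance together with the number of entries.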

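Likewise, lines 1031 to 1052 add all MCWriteLatencyEntry objects of the current scheduling class to the WriteLatencies container of SchedTables, and record the names of the corresponding SchedWrite definitions in the WriterNames container. At line 1040, if one MCWriteLatencyEntry object corresponds to several SchedWrite definitions, a composite name is built by joining their names with underscores.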

Finally, lines 1054 to 1065 add all MCReadAdvanceEntry objects of the current scheduling class to the ReadAdvanceEntries container of SchedTables.

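Throughout these steps, SCDesc is the MCSchedClassDesc instance that represents the scheduling class being processed. It lives in the last element of the ProcSchedClasses container of SchedTables (the element for the current processor).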
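
The "basic compression" used for all three shared tables follows one pattern: search the global table for the local run of entries, reuse the position if found, otherwise append. A minimal generic sketch, with the hypothetical name internRun, could look like this. It assumes the element type supports operator==, as the three MC entry types above do.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the index in Global where the run Local starts. If Local does not
// already occur as a contiguous run, it is appended to Global first.
template <typename T>
unsigned internRun(std::vector<T> &Global, const std::vector<T> &Local) {
  auto Pos =
      std::search(Global.begin(), Global.end(), Local.begin(), Local.end());
  if (Pos != Global.end())
    return static_cast<unsigned>(Pos - Global.begin());
  unsigned Idx = static_cast<unsigned>(Global.size());
  Global.insert(Global.end(), Local.begin(), Local.end());
  assert(Idx + Local.size() == Global.size() && "run appended at the end");
  return Idx;
}
```

One consequence worth noting: std::search with an empty run returns the beginning of the searched range, so a class with no entries gets index 0 as long as the global table is non-empty, which is consistent with the invalid entry that the SchedClassTables constructor reserves at index 0.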
