A Test Framework of BMC for platform team

Table of Contents

A Test Framework of BMC for platform team

Introduction

Requirement Analysis

General Design

Requirement

Not requirement

Model Design

Pre-run model

Functional model

IPMI device “Global” Commands

BMC Watchdog Timer Commands

Chassis Commands

Event commands

PEF and Alerting command

SEL commands

SDR Repository command

FRU Inventory Device Commands

Event Filter Table

Sensor Device commands

Platform-specific OEM command

Post-run model

Implement

Requirement

User interface

Schedule

 

Introduction

As one of the most important components of a server, the BMC monitors the system environment, detects hardware errors, and records the related logs, so its availability and stability are critical to us. Currently the platform team tests the BMC with ad hoc manual scripts and commands. These scripts and commands could be collected and reused for other platforms, and we could leverage other teams’ test frameworks, such as the CTH test platform, to run them automatically. A test framework for BMC testing is therefore needed, and this document describes its design.

Requirement Analysis

Typical use cases include:

1.      OEM SEL decoding Test for customer ipmitool/unit test

2.      BMC stress test

3.      BMC function test

4.      BMC regression test

5.      System monitor

6.      Platform monitor

7.      Firmware upgrade

8.      Netmon

 

General Design

Requirement

·        Support both local KCS tests and LAN tests

·        Can be integrated with the EVT test framework

·        Can be integrated with the CTH test tools

o  Graphical user interface;

o  Automatic logging;

o  Automatic reporting;

o  Configurable test modes and test collections;

·        To be easily expanded for new platform;

·        To be easily expanded for other component

This requires that other software components owned by the platform team, such as platmon and error handling, can also be tested easily with this framework.

·        Support auto-training;

That means the framework records the test cases that have once detected a firmware or software problem and automatically adds them to the test case set for later runs. In our experience, problems most often appear during BMC reset, firmware upgrade, sensor reading, and power cycle tests, so these types of commands should be collected by the auto-training mechanism.
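The auto-training idea can be sketched as a small persistent list of case names. This is only an illustration under stated assumptions: the class name `AutoTrainingStore`, the JSON file layout, and the case names are hypothetical and not part of the framework design.

```python
import json
from pathlib import Path


class AutoTrainingStore:
    """Remembers test cases that have exposed a firmware/software problem."""

    def __init__(self, path):
        self.path = Path(path)
        # Load previously recorded case names if the file already exists.
        if self.path.exists():
            self.cases = json.loads(self.path.read_text())
        else:
            self.cases = []

    def record_failure(self, case_name):
        """Add a failing case once and persist the updated list."""
        if case_name not in self.cases:
            self.cases.append(case_name)
            self.path.write_text(json.dumps(self.cases, indent=2))

    def extend_plan(self, plan):
        """Return the plan plus any recorded cases it does not already contain."""
        return list(plan) + [c for c in self.cases if c not in plan]
```

A later run would call `extend_plan` on its configured test collection so that previously failing cases, for example a BMC reset case, are always exercised again.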

 

·        Support multiple test modes: function test, random test, and stress test

o  Function test: test every component/unit one by one;

o  Random test: the random test generator produces random test cases, such as random raw data and random SELs sent to the BMC;

o  Stress test:

a.      Component level stress test:

Many commands run at the same time against the same component, such as SEL / SDR / FRU / Watchdog / CPLD, to verify its availability.

b.     System level stress test:

Many components run stress tests at the same time.
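A component level stress run could look roughly like the sketch below. It assumes a caller-supplied `runner` function that executes one command over KCS or LAN and returns True on success; `run_stress`, `runner`, and the command strings are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stress(commands, runner, workers=4, rounds=10):
    """Run every command `rounds` times across `workers` threads.

    `runner` is a placeholder for the real transport: it executes one
    command and returns True on success. The result maps each command
    to the number of failed executions.
    """
    jobs = [cmd for _ in range(rounds) for cmd in commands]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the outcomes in the same order as `jobs`.
        outcomes = list(pool.map(runner, jobs))
    failures = {cmd: 0 for cmd in commands}
    for cmd, ok in zip(jobs, outcomes):
        if not ok:
            failures[cmd] += 1
    return failures
```

A system level stress test could reuse the same helper by passing commands that target several components in a single call.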

 

·        Support an interface for adding platform-specific tests;

·        The framework makes full use of existing DDOS code

The customer ipmitool is implemented in our DDOS according to the coming BMC specification, so we may be able to reuse its code and test scripts.

·        The framework should define a test level and a log level

The test level defines the test granularity (sensor level, component level, bus level, function level, chip level), while the log level helps debugging by logging more detail about the execution and output of the test cases.
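One possible way to model the two settings is sketched below. The names `Granularity`, `select_cases`, and `make_logger` are assumptions for illustration; the five granularity values come from the text, and the log level is mapped onto Python's standard `logging` levels.

```python
import logging
from enum import Enum


class Granularity(Enum):
    """Test granularity values named in the requirement."""
    SENSOR = "sensor"
    COMPONENT = "component"
    BUS = "bus"
    FUNCTION = "function"
    CHIP = "chip"


def select_cases(cases, level):
    """Keep the names of the (name, granularity) pairs tagged with `level`."""
    return [name for name, granularity in cases if granularity is level]


def make_logger(name, verbose=False):
    """Return a logger whose level controls how much detail is recorded."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    return logger
```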

 

·        Based on a small generic test set

To start this work easily, we will choose a basic collection of test cases likely to be shared by all platforms, such as BMC SEL, user, SDR, reset, LAN, fan, FRU, and power control commands.

Non-requirements:

·        Tests for OEM commands that require manual operation of the hardware;

·        Tests whose results may need manual investigation.

Model Design

Pre-run model

Check the BMC version requirement;

Check the BIOS version requirement;

Check extra parameters;

Check that the BMC is in a normal state;

Save the BMC IPMI user/SOL/serial/channel/MAC/DHCP/IP default configuration;

Stop the system and platform monitors;
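The BMC version check in the pre-run steps could be sketched as follows. The sketch assumes the firmware version is taken from the "Firmware Revision" line that `ipmitool mc info` prints and that it parses as two decimal fields; the function names and the minimum version are hypothetical.

```python
import re


def firmware_revision(mc_info_text):
    """Extract (major, minor) from the 'Firmware Revision' line, or None."""
    match = re.search(
        r"^Firmware Revision\s*:\s*(\d+)\.(\d+)", mc_info_text, re.MULTILINE
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(mc_info_text, minimum):
    """True when a revision was found and is at least `minimum`."""
    revision = firmware_revision(mc_info_text)
    return revision is not None and revision >= minimum
```

The text passed in would normally be the captured output of the command; a run that fails this check can be reported as Not Run rather than FAIL.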

Functional Test

IPMI device “Global” Commands

·        Get Device ID command

·        Warm Reset command

·        Cold Reset command

·        Get Self Tests Results command

·        Manufacturing Test on command

·        Set ACPI power state command

·        Get ACPI power state command

·        Get Device GUID command

·        Broadcast “Get Device ID” command

·        Firmware Firewall & Command Discovery commands

·        IPMI Messaging Support Commands

o  Set BMC Global Enables

o  Get BMC Global Enables

o  Clear Message Flags

o  Get Message Flags

o  Get Message

o  Send message

§ Get BT interface Capabilities

§ Master write-read

§ IPMI serial/Modem Commands

§ Set Serial/Modem Configuration

§ Get Serial/Modem configuration

§ Serial / Modem connection Activ

§ SOL command ( Optional)

§ SOL Activating

§ Get SOL configuration Parameters

§ Set SOL configuration Parameters

PEF and Alerting command

Get PEF Capabilities

Arm PEF Postpone Timer

Set PEF Configuration Parameters

Get PEF Configuration Parameters

Set Last Processed Event ID

Get Last Processed Event ID

BMC Watchdog Timer Commands

Reset watchdog timer

Set Watchdog timer

Get Watchdog timer

On expiration of the Watchdog timeout:

v System Reset

v System Power Off

v System power cycle

v Pre-timeout interrupt (Optional)

Chassis Commands

Get Chassis capabilities

Get Chassis Status

Chassis control

Chassis Reset

Event commands

Set event receiver

Get event receiver

Platform event message command

SEL component

Trigger all possible SELs in scripts;

Delete SELs;

Full SELs;

Empty SELs;

Get SEL Info

Get SEL Entry

Add SEL Entry

Partial Add SEL entry

Clear SEL

Get SEL time

Set SEL time

OEM SEL decoding test (How can the test be done with our OEM SEL decoding code?)

SDR Repository command

Get SDR Repository Info

Reserve SDR Repository

Get SDR

Add SDR

Partial Add SDR

Clear SDR Repository

Get SDR Repository Time

Set SDR Repository Time

FRU Inventory Device Commands

Get Device ID

Get Self Test Results

Broadcast Get Device ID

Get Sensor Reading

Set Event Receiver

Get Event Receiver

Platform Event

 

Event Filter Table

Get EFT list

Check whether EFT works

How to do PEF auto test?

Sensor Device commands

Static and Dynamic Sensor Devices

Get Device SDR Info command

Get Device SDR command

Reserve Device SDR Repository command

Get sensor Reading Factors command

Set sensor Hysteresis command

Get sensor Hysteresis command

Set Sensor Threshold command

Get sensors event Enable command

Set sensors event disabled command

Re-arm sensor Events command

User controlling command

Lan command

Platform-specific OEM command

OEM SELs;

OEM  SLIC command;

OEM NVRAM/BBU command

(How should the different OEM commands be classified per platform?)

Random Test

The “ipmitool event” command can be used to cover all OEM SEL decoding; this is easy to do with a script:

Step 1: get all sensors;

Step 2: for every sensor, list all possible states;

Step 3: assert/deassert every state;

Step 4: check whether the SEL decodes each event correctly.
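Steps 2 and 3 can be sketched as a function that expands a sensor-to-states mapping into `ipmitool event` argument lists. The mapping itself would come from steps 1 and 2; the function name `build_event_commands` and the sample sensor and state names are assumptions for illustration.

```python
def build_event_commands(sensor_states):
    """Expand {sensor: [state, ...]} into ipmitool argument lists.

    Each state is asserted and then deasserted, mirroring steps 2 and 3.
    """
    commands = []
    for sensor, states in sensor_states.items():
        for state in states:
            for direction in ("assert", "deassert"):
                commands.append(["ipmitool", "event", sensor, state, direction])
    return commands
```

Each argument list can be handed to `subprocess.run`, after which step 4 compares the newly added SEL entries with the expected decoded text.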

 

Another method is to use the “raw 0x40 0x41” command to emulate all possible SELs, and then use “ipmitool sel list” to check whether they are decoded successfully.

Stress and Integration Test

Power button test, repeated many times

Short press pre-boot

Long press pre-boot

Short press post-boot

Long press post-boot

Fan control test, repeated many times

One fan enters auto-manual control mode

Control method switch on multiple fans

Sequential fan control commands

LED test, repeated many times

Chassis LED Test

SP LED Test

FAN LED Test

SLIC LED Test

PSU LED Test

Disk LED Test

PSU SEL Test

Unpowered PSU inserted

Powered PSU inserted

Unpowered PSU removed

Powered PSU removed

Firmware program Test

Firmware version get

Firmware upgrade

Firmware downgrade

Firmware checksum

Firmware other options

Performance Test

Measure key performance figures for the most frequently used commands, and compare them with other platforms
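A minimal timing sketch for the performance test is shown below. It assumes each command is wrapped in a Python callable and that a baseline summary from another platform is available; `measure`, `slower_than`, and the 20% tolerance are hypothetical choices.

```python
import statistics
import time


def measure(command, runs=20):
    """Time `command()` `runs` times and summarise the latencies in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        command()
        samples.append(time.perf_counter() - start)
    return {"mean": statistics.mean(samples), "max": max(samples), "runs": runs}


def slower_than(current, baseline, tolerance=0.2):
    """True when the mean latency exceeds the baseline by more than `tolerance`."""
    return current["mean"] > baseline["mean"] * (1 + tolerance)
```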

Post-run model

check BMC is in normal state;

restore BMC IPMI user/sol/serial/FAN/LED/channel/mac/DHCP/IP default configuration;

restore system and platform monitor;

Record logs at three different levels:

Then analyze the logs above for PASS rate and FAIL rate:

·        Command level log: OK/ERROR

·        Case level log

·        Component level log

Log upload;
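The log analysis in the post-run steps could roll command level results up to case level and component level roughly as sketched here. The tuple layout of `command_log` and the rule that a case passes only when all of its commands are OK are assumptions made for illustration.

```python
from collections import defaultdict


def summarise(command_log):
    """Roll command level results up to case level and component level.

    `command_log` holds (component, case, status) tuples where status is
    "OK" or "ERROR". A case passes only if all of its commands are OK.
    """
    case_ok = {}
    for component, case, status in command_log:
        key = (component, case)
        case_ok[key] = case_ok.get(key, True) and status == "OK"

    per_component = defaultdict(lambda: {"pass": 0, "fail": 0})
    for (component, _case), ok in case_ok.items():
        per_component[component]["pass" if ok else "fail"] += 1

    report = {}
    for component, counts in per_component.items():
        total = counts["pass"] + counts["fail"]
        report[component] = {**counts, "pass_rate": counts["pass"] / total}
    return report
```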

Implement

Requirement

·        Unified final result format: $component test: PASS / FAIL / Unsupported / Not Run

·        Easy to expand to a new platform for all OEM commands

(Try to reuse all command functions in sub-scripts)

·        Cover all IPMI commands that may be used by system daemon processes
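The unified result format from the first requirement in this list can be captured with a small enumeration. This is a sketch only; the names `Result` and `format_result` are hypothetical, while the four result values are taken from the requirement.

```python
from enum import Enum


class Result(Enum):
    """The four final results allowed by the unified format."""
    PASS = "PASS"
    FAIL = "FAIL"
    UNSUPPORTED = "Unsupported"
    NOT_RUN = "Not Run"


def format_result(component, result):
    """Render one line in the '$component test: RESULT' format."""
    return f"{component} test: {result.value}"
```

Restricting results to an enumeration keeps every sub-script from inventing its own status strings, which makes the final report easier to parse.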