Dan Joyce's 16 bug types only found with gate-level simulation


Subject: Dan Joyce's 16 bug types only found with gate-level simulation


The three takeaways I got from Wally Rhines' DVcon'16 keynote were:

  • Static verification is the fastest-growing category in EDA. Emulation closely follows as number 3. (#2 was not mentioned)
  • New kinds of focused verification will continue to be adopted, such as reset-domain crossing and constraints verification. (At Real Intent, we see new requirements for focused static solutions at the gate-level also.)
  • ...

Followed by Jim Hogan doing a panel at that DVcon asking:
 

"Q. Will formal and emulation replace SW simulation tools?"


Followed by Lauro Rizzatti saying:

"As my colleagues have said, complexity is killing simulation. There is good news in the emulation camp, which I have followed for 22 years. Today emulation does most of what simulation does with the exception of timing and multi-value logic. It works 4 to 6 orders of magnitude faster."

And Brian Hunter of Cavium:

"Anytime we find a bug with an emulation platform, we always go back and question what did we do wrong in simulation." ...

    - Prakash on DVcon'16, portable stimulus, & end of simulation

From: "Dan Joyce" <user=danj domain=correctdesigns not calm>

Hi, John,

First off, I want to thank Prakash for writing his DVcon'16 Trip Report.
I love the fact that these critical issues are being discussed and then 
disseminated in such a public way.  I couldn't make that DVcon, but I 
found it disturbing to see how many engineers were nonchalantly drinking
the Jonestown Kool-Aid that formal tools and much faster emulation boxes are
going to replace old school Verilog/VHDL RTL software simulators in chip
design -- and especially with that truly dangerous notion of not bothering
to do gate-level simulations at all -- or even considering just dropping
timing from their Gatesims altogether.

If I had been there at that DVcon, I would have been seriously fighting the
urge to shout down those engineers thinking of skipping gate-level sims.

Why should anyone listen to Dan Joyce about gate-level sims?  I have 25
years in chip design and verification.  I taped out 22 chips in that time,
and only one of those 22 chips, back in 1995, had to be respun.  So my
record is 21 out of 22 -- which is a 95.5% success rate -- something I'm 
proud of.  Many of these were huge chips pushing the technology of the day.  
My most recent chip was a 1.25 billion instance 16nm TSMC FinFET design. 

        ----    ----    ----    ----    ----    ----    ----

WHY YOU MUST STILL DO GATE-LEVEL SIMULATIONS (GLS) TODAY

Using gate-level simulations (GLS), I've found both functional and timing
bugs in chips at every stage of chip design -- from the start of early
functional development -- all the way down to subtle yet chip-fatal timing
bugs just 2 days before final tapeout.

Going in, I have to issue a "heads-up" warning: GLS can be an extremely
expensive task that fails to find critical design flaws before chips are
released to manufacturing -- if done wrong.  However if done right I see
the Cost/Benefit of Gatesims getting better today.  Regardless, GLS finds
chip-killing bugs that formal, STA, ABV, lint, and emulation won't even
notice.  You are taking a big risk if you don't do GLS.

That said, let me set the stage...


THAT DAMNED ZERO-DELAY PROBLEM...

The goal of any chip development team is to have processes (lint, LEC, STA,
verification) so solid that GLS will never find any bugs and not be needed.

But GLS does find bugs; most very late in the design process and close to
tape-out.  GLS bugs are often very serious, and tend to cause problems with
no possible workaround.  Even when workarounds are possible, finding and
debugging GLS failures is much easier than debugging silicon in the lab.

It's not uncommon to chase bugs in the lab for months which could have
been found and debugged in a couple hours in GLS.  Although GLS is harder
to debug than RTL, it is much easier than silicon.

GLS bugs exist because practically all chip Verilog/SystemVerilog/VHDL
simulations are done with "ideal world" zero-delay tests that are run on
pre-synthesized Verilog/VHDL/SystemVerilog RTL code.  These sims are
tailored to speed-up simulation runtime performance -- but sacrifice their
ability to catch certain types of bugs.

Additionally, many steps are performed on the design after the RTL is
verified to produce the gate netlist that is used to manufacture the 
silicon.  Some of these post-RTL steps include synthesis to gates; place
and route; power insertion; adding logic for Built In Self Test (BIST) and
Built In Self Repair (BISR); and insertion of Design For Testability (DFT)
logic.  RTL tests won't find bugs from any of those steps since the logic
added during those steps wasn't in the original RTL to begin with!
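
As a minimal sketch of why RTL tests cannot see this added logic, assume a
hypothetical flop "rtl_flop" and the scan version of it, "scan_flop", that
DFT insertion might produce.  The module names, the scan_en/scan_in ports,
and the tie-off mistake are illustrative assumptions only; they are not
taken from any real netlist or library.

```verilog
// Hypothetical RTL flop: this is all the RTL regressions ever exercised.
module rtl_flop (input wire clk, input wire d, output reg q);
  always @(posedge clk) q <= d;
endmodule

// Hypothetical post-DFT flop: a scan mux now sits in front of D.
module scan_flop (input wire clk, input wire d,
                  input wire scan_en, input wire scan_in,
                  output reg q);
  always @(posedge clk) q <= scan_en ? scan_in : d;
endmodule

module tb;
  reg clk = 1'b0;
  reg d   = 1'b0;
  // Assumed integration mistake: scan_en tied high in functional mode.
  wire scan_en = 1'b1;
  wire scan_in = 1'b0;
  wire q_rtl, q_gate;

  rtl_flop  u_rtl  (.clk(clk), .d(d), .q(q_rtl));
  scan_flop u_gate (.clk(clk), .d(d), .scan_en(scan_en),
                    .scan_in(scan_in), .q(q_gate));

  always #5 clk = ~clk;

  initial begin
    @(negedge clk) d = 1'b1;   // change data away from the capturing edge
    @(negedge clk);            // a rising edge has now captured d
    if (q_rtl !== q_gate)
      $display("MISMATCH: q_rtl=%b q_gate=%b", q_rtl, q_gate);
    else
      $display("match: q=%b", q_rtl);
    $finish;
  end
endmodule
```

With scan_en stuck high, the gate-level flop captures scan_in instead of d,
so the two flops should disagree.  A testbench that only ever instantiates
rtl_flop has no way to notice, because the scan mux is simply not there.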

In addition GLS finds chip timing issues missed by Primetime/Tempus STA
due to bad timing constraints.  The gate-level model is much closer to
the real silicon design -- testing at gates frequently finds timing bugs
in asynchronous logic that cannot be found in RTL.

SPECIAL WARNING: GLS has gotten harder as today's chip designs have gotten
bigger; but at the same time the need for GLS is greater than ever before.
The amount of logic in your chip that does not exist in the RTL design has
increased.  The added complexity of 100's of unrelated clocks has
dramatically increased the risk of your final physical blocks not closing
timing in the last few years.  Finally, the man-hours and calendar time
needed to go from tapeout to silicon have increased -- making the cost of a
gate-level bug much higher -- and making it more important than ever to
get back good working silicon on the 1st pass.


DAN JOYCE'S 16 BUG TYPES ONLY FOUND BY GATE-LEVEL SIMULATION

The following is the list of chip design bugs that can only be found cheaply
by using GLS.  Keep in mind, I'm in gate-sims.  This is at the tail end of
the project where the design team tells me "this chip is ready!  We're good
to go!"; and then I've caught at least 1 of these chip-killer bugs after
that point.

  1. Timing Bugs.  Incorrect constraints can actually cause your DC or
     Genus synthesis tool to create timing bugs -- and then those same
     bad constraints are used to run Primetime or Tempus STA.  So the
     same constraint error will cause both the bug and the bad check that
     will miss detecting that bug.

  2. Linting Bugs.  Lint tools like Real Intent or Spyglass look in your
     source Verilog/SystemVerilog/VHDL RTL for code that produces bad gates.
     The bad news is that some lint tools (I won't say which ones) have such
     a poor signal-to-noise ratio that they produce too many warnings, all
     of which need to be reviewed and waived by hand.  The human error of
     waiving the wrong lint warning creates a difference between RTL and
     gate functionality.  Worse, LEC will not find these waived bugs since
     LEC starts with the same wrong gate functionality.

  3. BFM-masked Bugs.  RTL verification typically uses BFMs (Bus Functional
     Models) to simplify test generation and checking of results.  BFMs that
     incorrectly model part of your DUT can cause bugs to be missed.  Your
     GLS must do some tests driven by gate-level cores instead of just
     internal BFMs.

  4. IP Bugs.  You can have 3rd party IP that works perfectly in RTL, but
     quietly contains timing/functional/ifdef/pragma bugs that can only be
     caught in GLS.  These quiet IP bugs can kill a chip.

  5. Clocking Bugs.  Quiet real-life glitches, over-max frequencies, and
     duty cycle bugs in your clocks are often only seen in GLS tests with
     full SDF timing.

  6. Reset Timing Bugs.  These are typically clock zones where the reset
     is released at different clock edges on their D-FF's.  These are also
     called initialization bugs.  They can only be detected in gate
     simulations with delays.

  7. `ifdef Bugs.  From `ifdefs in your code where RTL simulation uses one
     set of `ifdefs different from the `ifdefs synthesis used.  LEC does
     not catch this and you won't suspect anything until you run GLS.

  8. Dynamic Frequency Change Clock Bugs.  High performance, yet low
     power chips often must be able to switch frequencies without quiescing
     their logic.  This logic can only be verified by GLS with full timing,
     which is what detects these clock issues.

  9. Multi-Cycle Path (MCP) Bugs.  For example, say you have a chip with a
     12-cycle MCP in it.  Then:

       a) your source signals must be held stable for the full 12-cycle
          period,

     and

       b) your destination flops must only capture the results at the
          12th cycle -- and not earlier nor later.

     If you fail to do "a" or "b" above, it will create an MCP set-up/hold
     issue that causes metastability ("X's") on your final output flop.

 10. Force/Release Bugs.  Often in testbenches to get past some bottleneck,
     code like this will be used:
     
                   force load_fifo_name_here = 1'b1;
                   force ecc_error = 1'b0;
                   force aix_bus = 32'hFFFFFFFF;

     What happens is the verification guys forget to remove or "release"
     all or some of these "force" commands -- causing tests to pass and
     their bugs to go undetected.  GLS throws up compile errors for most
     internal "forces" when signals are renamed through synthesis; with
     the few "forces" remaining to be reviewed and removed if possible.

 11. BIST/BISR Bugs.  If your design's original Verilog/VHDL source RTL
     does not include BISR or BIST logic, bugs involving the BIST/BISR
     logic can only be found in GLS.

 12. DFT Bugs.  Usually RTL does not include DFT logic so those bugs in
     the DFT logic can only be found in GLS.

 13. Power Insertion Bugs.  Usually your RTL is not power inserted.  UPF
     testing is an attempt to find these in RTL, but since most power
     logic is not included in RTL, the only true test of power logic is a
     "power aware" gate-level simulation.  This is where the simulation
     models of your gate-level library cells only work when your
     power-enabled netlist is connected and driven correctly by your
     clamp cells, voltage translators, and power islands.

 14. Delta-Delay Race Conditions.  Occasionally RTL is written with #0 or
     #1 delays, or with a mix of blocking and non-blocking assignments, that
     hide an RTL delta-delay race condition.  These are simulation
     artifacts.  If your source RTL simulates wrong, people will design
     their chip to "pass" wrong.  They are assuming everything is OK.  But
     their final Gates will work differently than their RTL -- and this is
     a rare case where only GLS will detect the real silicon behavior
     mismatch.

 15. LEC Holes.  All LEC tools work by doing a logical equivalence check
     between two gate-level models.  If you're doing LEC between your RTL
     and your Gates of the same design, the LEC tool starts by doing a
     synthesis of your RTL to a simple gate implementation.  If that "RTL"
     gate model does not 100% match your RTL's functionality, your LEC run
     will be comparing your Gate netlist with an "RTL" gate model that is
     already broken.  You can get incorrect LEC results from this.

 16. LEC Waivers.  Large designs are divided into pieces to allow LEC to
     handle them within a reasonable time.  Any tool mistake in this cutting
     process or any incorrect waivers can result in functional differences
     between RTL and Gates.  Only GLS detects this.
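
To make bug type 14 in the list above concrete, here is a minimal sketch of
a delta-delay race.  The module and signal names (race_demo, b_racy,
b_safe, and so on) are hypothetical and chosen only for illustration.  The
two blocking-assignment blocks may be evaluated in either order at the
clock edge, so the value that b_racy picks up is simulator-dependent, while
the non-blocking pair behaves like two flops in series.

```verilog
module race_demo;
  reg clk  = 1'b0;
  reg d    = 1'b1;
  reg a    = 1'b0;
  reg a_nb = 1'b0;
  reg b_racy, b_safe;

  // Racy style: blocking assignments in two separate clocked blocks.
  // The simulator is free to run these two blocks in either order.
  always @(posedge clk) a      = d;
  always @(posedge clk) b_racy = a;     // may see the old OR the new "a"

  // Race-free style: non-blocking assignments sample before any update.
  always @(posedge clk) a_nb   <= d;
  always @(posedge clk) b_safe <= a_nb; // always sees the old a_nb

  always #5 clk = ~clk;

  initial begin
    #7;  // just after the first rising edge at time 5
    // b_safe is 0 here, as a second flop in a 2-flop chain would be.
    // b_racy may be 0 or 1 depending on the simulator's event ordering.
    $display("b_racy=%b b_safe=%b", b_racy, b_safe);
    $finish;
  end
endmodule
```

If the RTL simulator happens to pick the ordering that makes b_racy equal 1,
every RTL test is tuned against a 1-cycle path, while the synthesized gates
would be expected to behave like the 2-flop b_safe chain.  That difference
stays invisible until the netlist itself is simulated.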

With all this warning, I keep expecting my next chip to be the one where GLS
does not find a chip-killing bug that formal, STA, ABV, lint, and emulation
didn't catch.  At 22 chips it still hasn't happened.

    - Dan Joyce
      Correct Designs, Inc.                      Austin, TX

P.S. What follows are details of how I found these 16 bug types using GLS.