By admin / Uncategorized / 0 Comments

Blog: How Good Is Your Library? Are You Sure?

Published on SemiWiki (Article Link)

One task that is not very exciting but is critical is that of library quality assurance. Many design groups have created their own procedures, often having been burned in the past, to ensure that the libraries that they use are good. Failure to do so has resulted in:

  • example 1: just before tapeout it was discovered that the layout and the LEF for a cell did not match. It took days to track this down
  • example 2: the timing files did not match the netlist causing post-layout simulation to fail
  • example 3: after weeks of iterating to try and achieve timing closure it was found that there was a constraint error in the library with setup+hold<0

These types of errors are at one level almost trivial, but the consequences can be severe. Empyrean Software has a tool, Qualib, to address this. As you might guess from its name, Qualib is a library quality inspection tool: a comprehensive platform for qualifying libraries and IP, with advanced analysis features. Design groups can use Qualib as a sort of incoming inspection on third-party libraries and IP, and the creators of libraries and IP can use it to perform an outgoing inspection, ensuring that they are shipping good product.

Qualib performs a number of important checks:

  • Cell Presence Check: based on a cell list ensure that all views of all cells are present and that there are no additional cells or views
  • LEF vs GDS Check: ensure that the reduced cell view in the LEF matches the actual layout: pin name, pin shape, boundary, obstructions (layers should either be pins or obstructed)
  • Timing versus Verilog Check: ensure that the timing in the .lib matches the Verilog: pin name and direction, function, timing arcs
  • LEF Check: make sure that the LEF is consistent: cell properties, pin properties, DRC issues (such as off-grid), routability issues (unreachable pins)
  • GDS Check: label errors, tag errors, DRC issues (layout outside the boundary, etc.)
  • Timing Check: ensure timing constraint consistency, setup+hold>0, timing arc presence, power arc presence, condition consistency, timing table monotonicity (if the load doubles the cell should get slower not faster)
  • For transistor-level designs there are also equivalent checks that the circuit description language (CDL) matches LEF, timing and Verilog
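
To make two of the timing checks above concrete, here is a minimal Python sketch of the setup+hold>0 test and the table monotonicity test. It is not Qualib's implementation or interface; the function names, the dictionary layout and the cell and pin names are all hypothetical, and a real check would read the values from the .lib file.

```python
from typing import Dict, List


def check_setup_hold(constraints: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the pins whose setup + hold window is not positive."""
    bad = []
    for pin, c in constraints.items():
        # The library rule quoted above is setup + hold > 0.
        if c["setup"] + c["hold"] <= 0:
            bad.append(pin)
    return bad


def non_monotonic_rows(loads: List[float], delays: List[float]) -> List[int]:
    """Return indices i where the delay drops although the load rose from i-1 to i."""
    return [
        i
        for i in range(1, len(loads))
        if loads[i] > loads[i - 1] and delays[i] < delays[i - 1]
    ]


# Hypothetical constraint values in ns, keyed by "cell/pin".
constraints = {
    "DFFX1/D": {"setup": 0.08, "hold": -0.02},  # window is positive: fine
    "DFFX2/D": {"setup": 0.03, "hold": -0.05},  # window is negative: error
}
print(check_setup_hold(constraints))  # ['DFFX2/D']

# Hypothetical delay table: the last entry gets faster as the load grows.
print(non_monotonic_rows([0.01, 0.02, 0.04], [0.10, 0.15, 0.13]))  # [2]
```

Both functions only flag suspicious entries; deciding whether a flagged entry is a real characterization error is still up to the library team.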

The flow is straightforward: select the rules, run the checks, and then review the reports in either HTML or text format. There is an interactive environment for setting up the checks and examining issues.
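
The select, run and report flow can be pictured with a small Python sketch. It uses a simplified cell presence check as the only rule. Everything here is an assumption made for illustration (the rule registry, the function names, the view names and the report layout); it does not reflect how Qualib is actually configured or what its reports look like.

```python
from typing import Callable, Dict, List, Set


def cell_presence(cell_list: Set[str], views: Dict[str, Set[str]]) -> List[str]:
    """Compare each view against the expected cell list."""
    issues = []
    for view, cells in sorted(views.items()):
        for cell in sorted(cell_list - cells):
            issues.append(f"{view}: missing cell {cell}")
        for cell in sorted(cells - cell_list):
            issues.append(f"{view}: unexpected cell {cell}")
    return issues


def run_checks(selected: List[str], rules: Dict[str, Callable[[], List[str]]]) -> str:
    """Run the selected rules and return a plain-text report."""
    lines = []
    for name in selected:
        issues = rules[name]()
        lines.append(f"[{name}] {'PASS' if not issues else 'FAIL'}")
        lines.extend(f"  {issue}" for issue in issues)
    return "\n".join(lines)


# Hypothetical library contents: the GDS view lacks one cell and has an extra one.
cell_list = {"INVX1", "NAND2X1"}
views = {
    "lef": {"INVX1", "NAND2X1"},
    "gds": {"INVX1", "BUFX1"},
}
rules = {"cell_presence": lambda: cell_presence(cell_list, views)}
report = run_checks(["cell_presence"], rules)
print(report)
```

Keeping each rule as an independent function that returns a list of issues makes it easy to select a subset of rules per run and to render the same results as either text or HTML.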

The benefits of Qualib are not so much that errors are found that would otherwise be missed. Modern design practice almost guarantees that the problems will eventually come to light. But what is important is to find problems that will be tapeout show-stoppers earlier in the design cycle, with the reduced risk of major panic just before tapeout. By having “known good” libraries early in the design cycle there is a reduced risk of missing the tapeout schedule when errors only get discovered during final verification. This is another example of what has become almost universally known as “shift left”, moving the discovery and fixing of problems earlier in the design cycle so that the design cycle is shortened and issues are discovered when there is still time to fix them without slipping the tapeout.

Blog: How Good Are Your Clocks?

Published on SemiWiki (Article Link)

One of the trickiest tasks in designing a modern SoC is getting the clock tree(s) right. The two big reasons for this:

  • the clocks can consume 30% or more of the power of the whole chip, so minimizing the number of buffers inserted is critical to keeping power under control
  • the clock insertion delay and clock skew have a major impact on timing. If a flop on the early side of the skew window drives a flop on the late side, or vice versa, it can consume a large part of the setup/hold margin and so affect the maximum clock frequency that the chip will work at
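
The effect of skew on the setup and hold margins can be shown with the standard textbook slack equations. The Python sketch below is only an illustration: the function names and all the numbers are made up, and skew is defined here as the capture clock arrival minus the launch clock arrival.

```python
def setup_slack(period: float, clk_to_q: float, logic_max: float,
                setup: float, skew: float) -> float:
    """Setup slack in ns; skew = capture clock arrival - launch clock arrival."""
    return period + skew - (clk_to_q + logic_max + setup)


def hold_slack(clk_to_q: float, logic_min: float, hold: float, skew: float) -> float:
    """Hold slack in ns, using the same skew convention as setup_slack."""
    return clk_to_q + logic_min - hold - skew


# Hypothetical 500 MHz path: 0.1 ns of setup margin when the clocks are balanced.
balanced = setup_slack(period=2.0, clk_to_q=0.2, logic_max=1.6, setup=0.1, skew=0.0)

# Late launch flop driving an early capture flop: negative skew eats the setup margin.
late_to_early = setup_slack(period=2.0, clk_to_q=0.2, logic_max=1.6, setup=0.1, skew=-0.3)

# Early launch flop driving a late capture flop: positive skew eats the hold margin.
early_to_late = hold_slack(clk_to_q=0.2, logic_min=0.05, hold=0.05, skew=0.3)

print(f"balanced setup slack: {balanced:.2f} ns")
print(f"setup slack with -0.3 ns skew: {late_to_early:.2f} ns")
print(f"hold slack with +0.3 ns skew: {early_to_late:.2f} ns")
```

With these numbers, 0.3 ns of skew in either direction turns a passing path into a failing one, which is why the skew window matters so much for the achievable clock frequency.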

The clock tree is actually constructed during the clock-tree synthesis (CTS) phase of physical design. CTS is driven by constraints provided by the design team, so a large part of producing a good clock tree is creating good constraints.

An additional issue is that SoCs are increasingly built out of blocks of IP assembled together. Typically the IP blocks are designed by a “front-end” design team, often overseas, and the physical design and assembly is done by a “back-end” team at the headquarters.

But this leads to another problem. The front-end designers have to come up with good constraints and avoid producing inherently unbalanced logic that will be difficult to clock. However, they don’t think like back-end designers and don’t understand the physical CTS process well.

Meanwhile the back-end team doesn’t understand the clock structure well, and by that stage in the design process has little time for interaction. They will typically run with whatever the front-end teams gave them and do their best to close timing with what they have. But it is frustrating and may be impossible to close timing with a suboptimal clock tree.

Empyrean Software has a tool, ClockExplorer, that addresses these problems. It provides front-end designers with feedback on the quality of the clock tree to find errors or suboptimal design. Structure and constraint checking can also evaluate clock quality, and help front-end and back-end designers to identify design problems that should be fixed early.

It then allows the front-end designers to communicate this information to the back-end designers and gives them similar feedback. It can also be used after CTS to do a more in-depth analysis that takes the physical information into account. At this point it can also display a layout view, showing where the actual clock paths run on the physical chip. For each problem, ClockExplorer can identify the problem, detail what issue it will cause and explain what needs to be changed to fix it. In this way it allows less experienced designers to be effective and avoid creating problems that will only show up later.

Note that ClockExplorer does not create the actual clock tree; that is still left to the CTS tool. ClockExplorer allows front-end and back-end designers to work together to create good clock constraints, which in turn lead to better clocks, lower power, and a faster timing closure process. In short, better CTS QoR.

ClockExplorer allows designers to look at a schematic of the clock tree. Since all the datapath elements are suppressed, it can handle extremely large designs very fast. For front-end designers it produces a timing dependency report and reports suboptimal structures, missing constraints and so on. It can automatically identify false paths or unnecessary balancing, and so minimize the number of buffers that will need to be inserted. The clock tree can be displayed by level or by delay.

As an example of its use on a 28nm design with 600K instances it reduced the clock tree buffer count by 40%, the hold time total negative slack (TNS) by 80% and so on. See the table below.

In summary, ClockExplorer is a tool offering structure and constraint checking, constraint optimization, and clock tree debugging.

Blog: What is Skipper?

Published on SemiWiki (Article Link)

What is Skipper? Well, it seems it’s a penguin in the movie Madagascar. And one of Barbie’s sisters. Who knew? But for SemiWiki readers it’s an integrated chip finishing platform from Empyrean Software. Skipper can read in a full-chip layout extremely fast, examine it and manipulate it in various ways, and write it out again.

Skipper solves a number of different problems, both before tapeout and when debugging silicon exhibiting problems:

Read more “Blog: What is Skipper?”

P&R Aware ECO

Published on SemiWiki (Blog Link)

Modern SoC designs require a placement- and routing-aware ECO solution to close timing

As an applications engineer for over 15 years supporting physical design tools that enable implementation closure, I have seen the complexity of timing closure grow continuously from one process node to the next. At 28nm, the number of scenarios for timing sign-off has grown far beyond what a Place & Route tool can handle. Most designers have turned to Static Timing Analysis (STA) tools for a solution. But the STA tools have two limitations:

  1. STA tools usually run in a scenario-by-scenario fashion. For STA tools to generate ECOs that close timing for all scenarios, one would need to run multiple sessions at the same time, one session for each scenario. This requires the STA tools to be run simultaneously on multiple servers, with each server needing a license.
  2. Current STA tools do not have or use physical information. As a result, many ECOs (Engineering Change Orders) generated by STA tools may turn out not to be implementable in the physical world because of placement and/or routing congestion.

These limitations call for a new solution that can:

  1. Simultaneously handle a large number of scenarios without requiring a large number of licenses and server machines
  2. Understand the impact that placement and routing have on those scenarios, and implement ECO directives accordingly

These requirements are critical to achieving timing closure effectively and efficiently. Without these capabilities, designers are forced into a process that takes too many iterations and too long to reach closure, and they often have to accept lower chip performance to meet time to market.

In a recent customer engagement, I had to help the customer close timing on a design that was highly congested in both placement and routing. In addition, the design required timing closure on more than 100 sign-off scenarios. It would have taken multiple engineers and many weeks to close timing using an STA based methodology.

A key point to note is that not all routing-congested areas are also placement-congested; the channels between the macros at the top level of an SoC design are one example. Hence, to effectively address timing violations, the tools and flow must understand both placement and routing congestion. Otherwise, detoured ECO routes might cause new setup violations while fixing the hold violations. This is the primary reason why an STA-based flow that is not placement-aware and, most importantly, routing-aware takes many iterations to close timing.
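
The interaction between hold fixing, multiple scenarios and detoured routes can be sketched in a few lines of Python. This is a deliberately simplified model and not the algorithm of any particular tool: it assumes that a hold fix adds the same delay to the path in every scenario (real delays differ per corner), and the scenario names, slack values and detour penalty are hypothetical.

```python
from typing import Dict, NamedTuple, Tuple


class Slack(NamedTuple):
    setup: float  # setup slack of the path in ns
    hold: float   # hold slack of the path in ns


def hold_fix_budget(scenarios: Dict[str, Slack]) -> Tuple[float, float]:
    """Return (delay needed to fix hold, delay the setup side can absorb)."""
    needed = max(0.0, max(-s.hold for s in scenarios.values()))
    allowed = min(s.setup for s in scenarios.values())
    return needed, allowed


def can_fix_hold(scenarios: Dict[str, Slack], detour_penalty: float = 0.0) -> bool:
    """True if the hold fix plus any unplanned route detour still meets setup everywhere."""
    needed, allowed = hold_fix_budget(scenarios)
    return needed + detour_penalty <= allowed


# Hypothetical slacks for one path in three sign-off scenarios.
scenarios = {
    "ss_125c_func": Slack(setup=0.12, hold=0.02),
    "ff_m40c_func": Slack(setup=0.40, hold=-0.08),
    "ff_m40c_scan": Slack(setup=0.30, hold=-0.05),
}

print(can_fix_hold(scenarios))                       # True: 0.08 ns needed, 0.12 ns allowed
print(can_fix_hold(scenarios, detour_penalty=0.06))  # False: the detour breaks setup
```

The worst hold violation and the tightest setup slack come from different scenarios, so a tool that looks at one scenario at a time cannot see the whole budget. The detour penalty is only known once routing congestion is taken into account.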

We identified the congestion issues and used a placement and routing aware timing closure solution that could simultaneously handle all MMMC scenarios. Results: quicker timing closure with far fewer iterations!

At 20nm, a timing closure solution must be routing-aware, because the additional requirements of double patterning and Vt implant rules have a direct impact on timing and hence closure.

I welcome your comments and your experiences with timing closure.

Empyrean Software (San Jose, California) develops and markets solutions that accelerate SoC design closure. Its flagship products, ClockExplorer and TimingExplorer, were released to the market in 2006 and 2009 respectively. They have been used successfully in over 100 taped-out SoC designs. Other products from ICScape include PowerExplorer, RCExplorer and LibExplorer. The company offers sales and technical support for its products in the US, China, Japan, South Korea and Taiwan.

Blog: CTS Specs

Published on SemiWiki (Article Link)

Properly Handing Of Clock Tree Synthesis Specifications

Given today’s design requirements with respect to low power, there is increasing focus on the contribution to total power made by a design’s clock trees. The design decisions made by the front-end team to achieve high performance without wasting power must be conveyed to the back-end team. This hand-off must be accurate and complete. A key component of that hand-off is the clock tree synthesis (CTS) constraints. Let’s look at what can go wrong and how to avoid these pitfalls.

The clock trees in chips ten years ago were fairly simple and most chips had only a handful of clock trees. In today’s technologies this has exploded into a forest of clock trees. Sheer volume alone points to the need for automation. But even more daunting are the complexities of today’s clock trees. Clock gating has been in use for a while now to help reduce power. Included IP blocks will have their own clock requirements. There are generated clocks, overlapping clocks, clock dividers, and on and on. All of this information needs to be packaged by the front-end team into the SDC file and clock specification (clock constraint) file for use by the back-end team.

Empyrean Software’s ClockExplorer tool was developed to provide analysis tools that help both teams understand the entire clock graph being developed. It crosschecks the equivalence of the constraints generated by the front-end and back-end teams. Both teams can use ClockExplorer to analyze and sign off the netlist and clock constraints. ClockExplorer’s platform checks the clock structure and aids in the generation of constraints for a CTS tool, including CTS sequencing for complex situations with multiple SDC files and overlapping clock trees. If these tasks are done manually by either team, mistakes are much more likely to occur.
Beyond the important capabilities of simply generating and checking the constraints, ClockExplorer also optimizes the clock topology to reduce latency. As a visual aid, ClockExplorer also generates a clock schematic, greatly assisting in reviews and discussions between the teams.

By using tools such as Empyrean Software’s ClockExplorer, I think that front-end and back-end design teams will be able to cut design errors caused by misunderstanding or incorrectly generating clock tree synthesis constraints. They will have a common view of the clock system, consistent checking, and automated generation that handles the key aspects of the constraint files. This should make a difficult task much easier and more reliable. Where discrepancies do crop up, the automatically generated clock schematics should make debugging and communication between the teams much easier.
