Sunday, February 3, 2008

Correlation issues in a typical CAD flow

There can be numerous issues relating to correlation across various steps in a typical ASIC CAD flow. These issues can cause timing/area/power convergence problems in meeting ASIC tapeouts.

1. Logic Synthesis:
-------------------
At the logic synthesis step, not much is known about the wires in the design. Wire parasitics can only be estimated using statistical models based on design size; they cannot be computed from real geometry.

The delay calculation engine at the logic synthesis stage uses a wire load model from the library, which may turn out to be either pessimistic or optimistic for a given design.
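As a rough illustration of how a wire load model turns fanout into an estimated wire RC, here is a minimal sketch. The class name, field names and all numbers are hypothetical and not taken from any real library; the only idea it shows is a fanout-to-length table with a slope used for extrapolation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WireLoadModel:
    """Hypothetical wire load model; every value used with it is illustrative."""
    res_per_unit: float          # resistance per unit of estimated length
    cap_per_unit: float          # capacitance per unit of estimated length
    slope: float                 # extra length per fanout beyond the table
    fanout_length: List[float]   # fanout_length[k-1] = length for fanout k

    def estimate_length(self, fanout: int) -> float:
        """Look up the statistical net length for a given fanout."""
        if fanout < 1:
            raise ValueError("fanout must be at least 1")
        table = self.fanout_length
        if fanout <= len(table):
            return table[fanout - 1]
        # Beyond the table, extrapolate linearly with the slope.
        return table[-1] + self.slope * (fanout - len(table))

    def estimate_rc(self, fanout: int) -> Tuple[float, float]:
        """Return (resistance, capacitance) estimated from fanout alone."""
        length = self.estimate_length(fanout)
        return self.res_per_unit * length, self.cap_per_unit * length
```

Note that nothing in this estimate depends on where the cells actually sit, which is exactly why it can miscorrelate with post-placement numbers.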

Only after placement can we back-annotate reasonably accurate wire delays and re-run logic optimizations against something closer to reality.

The floorplan, which plays a significant part in timing closure, is not known at the logic synthesis stage.

High-fanout buffering also has to be estimated and cannot be accurately performed.

Delay models like logical effort (a linear delay model) are meant for speed, not accuracy. That might be a very good thing to use at this stage.
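To show why a linear model is so cheap to evaluate, here is a small sketch of the logical effort delay equation, d = g*h + p, in units of the process time constant tau. The function names are my own, and the gate parameters in the usage note are the usual textbook values, given only as an illustration.

```python
def stage_delay(g, h, p):
    """Delay of one stage in units of tau.

    g: logical effort of the gate
    h: electrical effort (load capacitance / input capacitance)
    p: parasitic delay of the gate
    """
    return g * h + p


def path_delay(stages, tau=1.0):
    """Sum the stage delays along a path; stages is a list of (g, h, p)."""
    return tau * sum(stage_delay(g, h, p) for g, h, p in stages)
```

For example, an inverter (g = 1, p = 1) driving four times its own input capacitance has a delay of `stage_delay(1.0, 4.0, 1.0)`, i.e. 5 tau. There are no table lookups and no slew dependence, which is what makes the model fast and also what makes it approximate.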

NLDM (non-linear delay model) and SPDM (scalable polynomial delay model) are not really used inside a logic synthesis system for optimization, as they are not scalable/optimizable models.

There is no real value in using accurate delay models (SPDM) for delay calculation, as the design is far from complete at this point.

There is no clock tree. Insertion delay and skew have to be modeled based on design experience, or derived from a quick prototype CTS run.

There are synthesis tools on the market now that read in floorplanning information as part of the synthesis process and quickly estimate block placements. It would be nice if these tools generated their own floorplan.

2. Post placement/Global route:
-------------------------------
Elmore delay is a single-moment (first-moment) delay computation technique. The model does not account for proper slew computation/propagation.

It also does not account for resistive shielding (i.e., the gate sees the total capacitance as its load rather than an effective capacitance). This makes the model pessimistic.
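For reference, the Elmore delay of an RC tree can be computed in two linear passes. The sketch below assumes a simple array representation of the tree that I made up for illustration (parent index, segment resistance, node capacitance); it is not how any particular tool stores parasitics.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay from the driver to every node of an RC tree.

    parent[i]: index of node i's parent, or -1 if it hangs off the driver
    res[i]:    resistance of the segment between parent[i] and node i
    cap[i]:    capacitance lumped at node i
    Nodes must be ordered so that every parent index is smaller than
    the index of its children.
    """
    n = len(parent)

    # Pass 1 (leaves to root): total capacitance at and below each node.
    downstream = list(cap)
    for i in range(n - 1, -1, -1):
        if parent[i] >= 0:
            downstream[parent[i]] += downstream[i]

    # Pass 2 (root to leaves): each segment adds R * (all downstream C).
    delay = [0.0] * n
    for i in range(n):
        upstream = delay[parent[i]] if parent[i] >= 0 else 0.0
        delay[i] = upstream + res[i] * downstream[i]
    return delay
```

For a two-segment ladder with `res=[1.0, 2.0]` and `cap=[3.0, 4.0]`, the far-end delay is 1*(3+4) + 2*4 = 15. Every resistor is charged with the full downstream capacitance, and only one number per node comes out, so there is no waveform shape to derive a slew from.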

This is good news for closure between global route and final route, as this pessimism is inherently added to the design flow and over-constrains the physical optimization algorithms. It can be a bad thing in terms of over-design (more area/power).

Vias are not inserted at the global route stage of the design flow; they can only be modeled. The modeling technique used can affect the via resistance calculated at this stage.

Estimating crosstalk at this stage can be very tricky as track assignment is not yet complete and coupling capacitance can only be predicted and not computed. This could be achieved by some means of virtual track assignment.

How many tools do you know of that account for crosstalk pessimism at the global route stage?

A reasonable thing to do in STA here is to propagate the worst arrival time (AT) together with the worst slew forward in the timing engine, so that some more pessimism is built into the physical synthesis process.
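The difference between the two merge policies at a multi-input pin can be sketched as follows. Both function names are hypothetical, and a real timing engine tracks far more state per pin than an (arrival, slew) pair.

```python
def merge_latest(arrivals):
    """Keep the slew that belongs to the latest-arriving input.

    arrivals: list of (arrival_time, slew) tuples, one per input.
    """
    return max(arrivals, key=lambda a: a[0])


def merge_worst(arrivals):
    """Pessimistic merge: worst arrival and worst slew, chosen
    independently, even if they come from different inputs."""
    worst_at = max(at for at, _ in arrivals)
    worst_slew = max(slew for _, slew in arrivals)
    return worst_at, worst_slew
```

With inputs `[(1.0, 0.3), (1.2, 0.1)]`, `merge_latest` returns `(1.2, 0.1)` while `merge_worst` returns `(1.2, 0.3)`. The second pair never physically occurs together, which is where the extra pessimism comes from.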

Another valuable thing to do at this stage is to monitor design congestion. Maybe we could combine the congestion metric with the x-talk metric and derive a statistical model for x-talk at the GR stage of the design flow, to account for extra margins during the physical synthesis step?

3. Post final route:
--------------------
The problem is that even small restructuring still cannot be performed post layout (final routing), and the final route could have detoured from the original global route topology while fixing LVS/DRC violations.

The chance of this happening is much lower in less congested designs, as the final router likes to follow the global routing topology as much as possible.

Whatever optimization happens post layout is an incremental/ECO optimization, which affects steps like sizing, buffering, placement and routing only to an incremental extent.

Sizing/buffering using an accurate delay model is absolutely essential at this stage, so tools should ideally make use of multiple-pole models. But is this the case?

Most optimization commands make an internal call to the extractor and the delay calculator. The extractor comes up with an unreduced RC network.

The job of the delay calculator is to build a reduced RC network taking into account only the dominant poles (and zeros). This process is called model order reduction, and a reasonably accurate technique for it is asymptotic waveform evaluation (AWE).

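AWE matches moments explicitly to build a Padé approximation, which can become numerically ill-conditioned at higher orders. Computing the Padé approximation via Lanczos (PVL), or using a block Arnoldi method, gives a reasonably accurate and more stable delay calculation.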
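The moment-matching idea behind this kind of reduction can be sketched for an RC tree as follows. This is a minimal illustration under my own assumptions (the array-based tree representation and both function names are made up here), using explicit moments and a two-pole fit rather than a Lanczos or Arnoldi iteration.

```python
import cmath


def rc_tree_moments(parent, res, cap, order):
    """Moments m0..m_order of the step-input transfer function at
    every node of an RC tree. Returns moments[k][i] for node i.

    parent[i] is the parent index (-1 if attached to the driver) and
    must be smaller than i; res[i] is the segment resistance into
    node i; cap[i] is the capacitance at node i.
    """
    n = len(parent)
    moments = [[1.0] * n]  # m0 = 1 everywhere (unity DC gain)
    for _ in range(order):
        prev = moments[-1]
        # Weight each capacitor by the previous moment, then
        # accumulate the weights from the leaves toward the root.
        down = [cap[i] * prev[i] for i in range(n)]
        for i in range(n - 1, -1, -1):
            if parent[i] >= 0:
                down[parent[i]] += down[i]
        cur = [0.0] * n
        for i in range(n):
            up = cur[parent[i]] if parent[i] >= 0 else 0.0
            cur[i] = up - res[i] * down[i]
        moments.append(cur)
    return moments


def two_pole_fit(m0, m1, m2, m3):
    """Poles of the Pade approximant with denominator
    1 + b1*s + b2*s^2 that matches the moments m0..m3."""
    det = m1 * m1 - m0 * m2
    b1 = (m0 * m3 - m1 * m2) / det
    b2 = (m2 * m2 - m1 * m3) / det
    root = cmath.sqrt(b1 * b1 - 4.0 * b2)
    return (-b1 + root) / (2.0 * b2), (-b1 - root) / (2.0 * b2)
```

The first moment is the negative of the Elmore delay, so a one-pole model carries no more information than Elmore does; the second pole is what starts to capture resistive shielding. For a two-segment ladder with all R and C equal to 1, the far-end moments are 1, -3, 8, -21 and the fit recovers the exact poles of 1/(1 + 3s + s^2).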

Adding margins sounds like an elegant/good solution, but margins are design dependent, could end up being pessimistic and can have a detrimental effect on chip area and power.

There are so many sources of pessimism and optimism in the entire ASIC design process.
