I have decided to cover the subsections one by one.
First, let's start with test.
One major ingredient of CAD is testing and design for test (DFT).
The rule of thumb in chip design is "you can't market it until you test it".
Testability has become an essential part of today's chips. Once fabricated, each chip sits on a tester where functional as well as ATPG patterns are run on it to catch fabrication issues. It also goes through burn-in, which accelerates early failures that would otherwise happen in the field. Only working chips come out into the market.
Some chips fail along the way, which is why we define yield: yield (%) = (working chips / total chips manufactured) * 100.
Defects in silicon are targeted after we define them at a higher level of abstraction called faults (fault modeling is a major area of research). Most defects in silicon can be modeled using these fault models. This higher level of abstraction simplifies the process of DFT.
Here are the most popular fault models:
stuck-at-0 and stuck-at-1 (single/multiple), coupling faults, bridging faults, delay faults.
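To see why the single stuck-at model is the workhorse, here is a small sketch that counts faults for a circuit. The net names are made up for illustration. With n nets there are 2n single stuck-at faults, but 3^n - 1 multiple stuck-at combinations (each net is fault-free, stuck-at-0 or stuck-at-1, minus the all-fault-free case).

```python
def single_stuck_at_faults(nets):
    """List every single stuck-at fault as a (net, stuck_value) pair."""
    return [(net, value) for net in nets for value in (0, 1)]


def count_multiple_stuck_at(num_nets):
    """Each net is fault-free, s-a-0 or s-a-1; exclude the all-fault-free case."""
    return 3 ** num_nets - 1


# Hypothetical net names for a tiny circuit y = (a AND b) OR c
nets = ["a", "b", "c", "n1", "y"]
print(len(single_stuck_at_faults(nets)))   # 2 faults per net
print(count_multiple_stuck_at(len(nets)))  # grows exponentially with net count
```

The exponential growth of the multiple-fault list is the usual reason tools target single faults first.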
Testability is most commonly accomplished in ASICs by full scan or partial scan. Although it is an overhead in terms of area (muxed scan flip-flops or LSSD), full scan breaks a sequential test problem down into a combinational one and thereby reduces the complexity of test.
Partial scan is less of an overhead on area, but it still requires an ATPG (automatic test pattern generation) tool that can generate sequential as well as combinational test patterns.
These scan flops are connected in the form of a serial shift register configuration. The test vectors are scanned in serially (during shift mode) and then applied to logic during capture mode to catch defects. Then the responses are serially scanned out and compared to golden responses.
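The shift/capture/shift-out sequence can be sketched in a few lines. This is only a toy model: the 3-flop chain, the `good_logic` function and the injected fault are all assumptions made up for illustration, not anything from a real design.

```python
def scan_test(pattern, comb_logic):
    """Shift a pattern in, pulse one capture clock, shift the response out."""
    chain_len = len(pattern)
    flops = [0] * chain_len
    # Shift mode: one bit enters at scan-in per clock, the rest move down the chain.
    for bit in pattern:
        flops = [bit] + flops[:-1]
    # Capture mode: the flops load the outputs of the combinational logic.
    flops = list(comb_logic(flops))
    # Shift mode again: scan-out is the last flop in the chain.
    response = []
    for _ in range(chain_len):
        response.append(flops[-1])
        flops = [0] + flops[:-1]
    return response


def good_logic(state):
    a, b, c = state
    return [a & b, b ^ c, a | c]


def faulty_logic(state):
    a, b, c = state
    return [0, b ^ c, a | c]  # output of the AND gate stuck-at-0


pattern = [0, 1, 1]
golden = scan_test(pattern, good_logic)
observed = scan_test(pattern, faulty_logic)
print("golden:", golden, "observed:", observed, "fail:", golden != observed)
```

In a real flow the response of one pattern is shifted out while the next pattern is shifted in; the sketch keeps them separate for clarity.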
A test pattern is an input vector which distinguishes between a good and a faulty circuit.
So any satisfying assignment of f(good) xor f(faulty) = 1 is a test pattern for that fault. All ATPG tools work on solving this problem.
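Here is a minimal sketch of that idea, using brute-force enumeration instead of a real ATPG algorithm. The circuit y = (a AND b) OR c and its net names are assumptions chosen only to keep the example small.

```python
from itertools import product


def circuit(a, b, c, fault=None):
    """y = (a AND b) OR c, with an optional stuck-at fault given as (net, value)."""
    def net(name, value):
        # Force the net to its stuck value if this is the faulty net.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a, b, c = net("a", a), net("b", b), net("c", c)
    n1 = net("n1", a & b)
    return net("y", n1 | c)


def find_tests(fault):
    """Return every input vector where good and faulty outputs differ."""
    return [
        vec for vec in product((0, 1), repeat=3)
        if circuit(*vec) ^ circuit(*vec, fault=fault) == 1
    ]


print(find_tests(("n1", 0)))  # vectors that detect n1 stuck-at-0
print(find_tests(("n1", 1)))  # vectors that detect n1 stuck-at-1
```

Exhaustive enumeration is exponential in the number of inputs, which is why real tools use structural search or SAT solvers on the same good-xor-faulty formulation.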
There are various algorithms for generating test vectors. The most fundamental ones are the D-algorithm (Roth), PODEM (Path-Oriented Decision Making) and FAN (Fujiwara). More recent are the SAT (satisfiability) based techniques (Larrabee).
The test vectors can be driven from a tester or by using BIST (built-in self-test) logic. Each has its advantages and disadvantages. BIST can sometimes be an overhead if we also make it drive deterministic test patterns (as they need to be stored in on-chip memory if the test is happening at speed). Random BIST is a small overhead on area but might not give the desired coverage.
So there are some pseudo-random pattern generating BISTs, which are a tradeoff.
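Pseudo-random BIST patterns are commonly produced by a linear feedback shift register (LFSR). Below is a minimal sketch of a 4-bit LFSR; the width, seed and tap choice are assumptions for illustration (feedback from the two most significant bits, which is a maximal-length choice for 4 bits).

```python
from itertools import islice


def lfsr4(seed=0b0001):
    """4-bit Fibonacci LFSR; yields one 4-bit pattern per clock."""
    if seed & 0xF == 0:
        raise ValueError("an all-zero seed locks the LFSR up")
    state = seed & 0xF
    while True:
        yield state
        # Feedback is the XOR of bit 3 and bit 2, shifted in at bit 0.
        feedback = ((state >> 3) ^ (state >> 2)) & 1
        state = ((state << 1) | feedback) & 0xF


patterns = list(islice(lfsr4(), 16))
print([format(p, "04b") for p in patterns])
```

The sequence visits all 15 non-zero states before repeating, so a tiny amount of hardware generates every non-zero pattern, but in an order you do not control. That is the coverage tradeoff mentioned above.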
Apart from the BIST types discussed so far, there is mixed-signal BIST for test coverage on mixed-signal circuits.
Memories have their own class of defects (such as pattern sensitivity faults, e.g. NPSF) and hence have their own test requirements. So memory BIST uses other techniques such as march tests and GALPAT (galloping pattern test) to get test coverage on memory. The complexity involved in testing for NPSF is very high, so some people skip testing for them altogether.
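A march test is just a fixed sequence of read/write operations applied to every address in ascending or descending order. The sketch below runs the March C- sequence against a toy memory model; the `Memory` class and its stuck-cell injection are assumptions made up for illustration.

```python
class Memory:
    """Toy bit-wide memory; `stuck` maps an address to a forced cell value."""

    def __init__(self, size, stuck=None):
        self.cells = [0] * size
        self.stuck = stuck or {}

    def write(self, addr, value):
        self.cells[addr] = self.stuck.get(addr, value)

    def read(self, addr):
        return self.stuck.get(addr, self.cells[addr])


# March C-: (w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); (r0)
MARCH_C_MINUS = [
    ("up", [("w", 0)]),
    ("up", [("r", 0), ("w", 1)]),
    ("up", [("r", 1), ("w", 0)]),
    ("down", [("r", 0), ("w", 1)]),
    ("down", [("r", 1), ("w", 0)]),
    ("up", [("r", 0)]),
]


def run_march(mem, size, elements=MARCH_C_MINUS):
    """Apply each march element to every address; return failing addresses."""
    failures = set()
    for direction, ops in elements:
        addrs = range(size) if direction == "up" else range(size - 1, -1, -1)
        for addr in addrs:
            for op, value in ops:
                if op == "w":
                    mem.write(addr, value)
                elif mem.read(addr) != value:
                    failures.add(addr)
    return sorted(failures)


print(run_march(Memory(8), 8))                # fault-free memory
print(run_march(Memory(8, stuck={5: 1}), 8))  # cell 5 stuck-at-1
```

The test length is linear in the memory size (10 operations per cell here), which is what makes march tests practical compared to NPSF tests.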
A major advantage of BIST is at-speed testing. This is needed to catch some path delay faults which can only be caught at speed, particularly when the chip frequency is high (500 MHz and above).
Since every chip has to be tested to a desired coverage before it can be marketed, a major area of research is test compression. This is needed to reduce test time (and with it test cost and time to market). During compaction we favor vectors which each catch multiple faults and use those as patterns on the tester. This way we reduce the number of test vectors and hence test time. For scan test, test time is roughly (number of vectors * shift cycles per vector) / shift frequency. Most ATPG tools these days are capable of test compression.
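Picking vectors that each catch many faults is a set-cover problem, so a greedy heuristic gives a feel for it. This is a sketch only: the pattern and fault names are invented, and the test-time estimate assumes a single scan chain where unloading one response overlaps with loading the next pattern.

```python
def compact(patterns):
    """Greedy set cover: patterns maps pattern name -> set of detected faults."""
    remaining = set().union(*patterns.values())
    chosen = []
    while remaining:
        # Pick the pattern that detects the most still-undetected faults.
        best = max(patterns, key=lambda p: len(patterns[p] & remaining))
        chosen.append(best)
        remaining -= patterns[best]
    return chosen


def scan_test_time(num_patterns, chain_length, shift_freq_hz):
    """Seconds on the tester: shift-in plus one capture per pattern, plus a final unload."""
    cycles = num_patterns * (chain_length + 1) + chain_length
    return cycles / shift_freq_hz


# Hypothetical fault-detection data, as a fault simulator might report it.
patterns = {
    "p0": {"f1", "f2"},
    "p1": {"f2", "f3", "f4"},
    "p2": {"f1"},
    "p3": {"f4", "f5"},
}
print(compact(patterns))
print(scan_test_time(num_patterns=1000, chain_length=500, shift_freq_hz=50e6))
```

Note that the time scales with pattern count and chain length and inversely with shift frequency, so fewer patterns and shorter chains both pay off directly.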
Mentor Graphics has come up with a unique way of compressing test patterns using EDT (embedded deterministic test). There is compression and decompression logic on chip. Externally, only a few (compressed) test patterns are applied from the tester. The decompressor expands these few external patterns into a lot of deterministic patterns internally, which are then applied to the logic to get the desired fault/test coverage. The resulting responses are then compressed back and collected by the tester. The decompressor is a one-to-many mapping and the compactor a many-to-one mapping; by working back through these mappings, defects in the chip can be pinpointed.
The advantage of this mechanism is that only a few patterns are applied externally from the tester. This reduces tester memory and also test time, and hence saves a lot of $$$. A lot of (decompressed) patterns are applied at speed, further reducing test time and also catching tough-to-catch at-speed defects, which translates to higher test coverage.
One major concern remains about this mechanism: "how will heat dissipation be handled during test?". This concern is common to most at-speed testing scenarios.
There are other DFT mechanisms which reconfigure scan chains on the fly and reduce test time.
In addition to these mechanisms, there are various parametric tests like IDDQ and IDDT (quiescent and transient power supply current testing). The defects they target are caught by various current/voltage sensors built on and off chip. This area has little to do with CAD automation except when it comes to generating the appropriate vectors for these tests during ATPG. :)
Wednesday, February 28, 2007
Tuesday, February 20, 2007
Breaking down the VLSI problem into sub-problems
In this section we discuss some of the major tasks involved in the design of an SoC (System on Chip).
1. Verification
2. Logic Design
3. Logic synthesis and physical optimization.
4. DFM (design for Manufacturability)/DFY (design for yield)
5. Analog/RF (Radio Frequency), Mixed signal design.
6. Testing and design for Testability.
I will describe each one in a bit more detail now...
1. Verification
Verification can be split up into--
1.1 Black Box: Not much is known about the internals of the Design
Grey Box: Few Internals exposed.
White Box: Everything internal to the design is well understood and RTL is accessible.
1.2 Assertions (ABV), Model checking (MC), Equivalence checking (EC)(sequential/combinational), Simulation, Symbolic simulation, Emulation.
1.3 Verification Languages: PSL, Sugar, SystemC, SystemVerilog, e, vera, VHDL, Verilog, c++
2. Logic design
-----------------
2.1 behavioral level : usually the specification is written at this level!
Languages: VHDL, Verilog, System Verilog, C++, systemC
2.2 Register Transfer Level : Primarily to infer hardware for ASIC and FPGA through the process of logic synthesis
2.3 Hardware/software co-design. Most Emulation technologies currently use FPGA on emulator platform and have transactor functions to communicate with software on a work station. Hence this is hardware/software codesign.
C++ and behavioral-level (Verilog/VHDL/SystemVerilog/SystemC) synthesis tools are available in the market from some of the commercial CAD vendors.
Tools: Behavioral Compiler (SNPS), Catapult (MENT)
Currently, to infer hardware for SoCs, logic synthesis tools are the most popular, as they give a finer level of control over the logic which gets inferred. Behavioral synthesis tools just don't offer that yet! Hence if you have to infer logic, you have to write RTL.
It will take a couple of years for behavioral synthesis to mature to a level where it can give hardware designers more degrees of freedom (area/timing/power) and a finer level of control.
3. Logic synthesis and Physical design Tools :
A few examples are: Design Compiler, Physical Compiler, ICC, First Encounter, NanoRoute, etc.
3.1 Lots of sub problems to solve in this particular domain.
3.2 Primarily split up into these major sub problems
3.2.1 Analysis/Elaboration (compiler theory).
High level optimization
FSM state encoding
sharing and Merging of operators
RTL clock gating
Module Compilers
3.2.2 Logic optimization
Primarily split into 2 level, multi-level, Multi-valued
Literal reduction [boolean and algebraic factorization]
Decomposition
Extraction
Tree height reduction (KMS, Lengauer)
Technology Mapping for area/timing/power [dag covering, BDD mapper]
Logic Restructuring [for speeding up a circuit]
Making architectural tradeoffs between design ware components for speed/area
Handling Fanout/Load
Removing Hanging logic
Redundancy Removal
Collapsing logic
Loop breaking
Dont care optimization
Retiming
gate Sizing
Gain Based synthesis
3.2.3 Floorplanning/ Floorplacement (simultaneous macro + std cell placement)
Recursive Bisectioning
Analytical Placers
Multi level Placers
Simulated Annealing (few iterations)
3.2.4 Physical optimization, which consists of
Partitioning:
Objectives: wirelength
Algorithms: KL/FM/HMetis (k-way Hypergraph partitioning)/Multilevel Methods
Placement (Global/detailed)
Objectives: wirelength, Congestion, Timing, Legalization, noise
Algorithms: Recursive Bisectioning, MLP, APlace, capo, fengShui, Timberwolf, QPlace
Wirelength prediction, buffer estimation are also included in every iteration of placement. some placers quickly calculate steiner wirelength in each iteration
Buffering
Objectives: Load, fanout, Area, Power, slew, slack, Crosstalk
Algorithms: VGBA (Van Ginneken Buffering Algorithm)
Gate Sizing
Objectives: std cell area, slew, slack, Load, leakage, Crosstalk
Algorithms:: LP, GP, Convex Programming
Research: Transistor sizing. On the fly library characterization
Routing (Global, track and detailed)
Objectives: wirelength reduction, timing, congestion, Crosstalk, DRC
Algorithms: Maze running, Line searching, Channel Routing, Stub routing, Steiner Trees (global), BRBC heuristics, Lots of approximation algorithms, Dijkstra's algorithm, Linear Programming, Randomized algorithms
There are gridded as well as gridless routers now. But fundamentally I would guess all routers work on a grid; the grid only keeps getting finer and is limited by the manufacturing grid.
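As a taste of the routing algorithms listed above, here is a minimal sketch of maze running (Lee-style breadth-first wave expansion) on a single-layer grid. The grid encoding ('#' for a blocked cell) and the function name are assumptions for illustration; real routers add layers, costs and design rules on top.

```python
from collections import deque


def maze_route(grid, src, dst):
    """Shortest path on a grid of strings; returns a list of (row, col) or None."""
    rows, cols = len(grid), len(grid[0])
    prev = {src: None}          # also serves as the visited set
    queue = deque([src])
    while queue:
        cell = queue.popleft()
        if cell == dst:
            # Backtrace from the target to the source.
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and (nr, nc) not in prev):
                prev[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # no route exists


grid = [
    "....",
    ".##.",
    "....",
]
print(maze_route(grid, (1, 0), (1, 3)))
```

Breadth-first expansion guarantees a shortest path if one exists, at the cost of visiting many grid cells, which is why line-search and hierarchical (global then detailed) approaches exist.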
3.3 Power distribution
Objective: to supply power to all logic elements with minimum IR drop
Minimize congestion because of metal utilization
Reduce transients during switching of power (Decap Insertion).
Automated power grid synthesis tools are getting stronger day by day. These tools are making use of Algebraic Multi-grid methods, which previously were used for solving elliptic partial differential equations.
3.4 power/rail/em analysis and power grid design
Objective: to check reliability of power grid
Tools: Incremental rail analysis/optimization is an active area of research. Matrix solvers are used to solve the KCL/KVL equations. The conjugate gradient method is quite a popular candidate here for solving the resulting sparse linear systems.
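To make the rail analysis idea concrete, here is a small sketch that solves for IR drop on a 3-node power rail with conjugate gradient. Everything about the rail is an assumption for illustration: one pad, 0.1 ohm segments, 10 mA drawn at each node. The matrix is stored densely for brevity; real tools use sparse storage and preconditioning.

```python
def conjugate_gradient(A, b, tol=1e-20, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (dense lists)."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                    # residual b - A x, with x = 0
    p = r[:]
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old < tol:        # squared residual norm small enough
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return x


# KCL at each node, written in terms of IR drop d = VDD - v:  G d = I
g = 1 / 0.1                     # conductance of each 0.1 ohm segment
G = [
    [2 * g, -g, 0.0],           # node 0: tied to the pad and to node 1
    [-g, 2 * g, -g],            # node 1: tied to nodes 0 and 2
    [0.0, -g, g],               # node 2: end of the rail
]
I = [0.01, 0.01, 0.01]          # amps drawn at each node
print(conjugate_gradient(G, I))
```

The drop grows with distance from the pad (3 mV, 5 mV, 6 mV here), which is the effect power grid design tries to bound.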
3.5 Clock Tree synthesis
Objectives:: Minimize Insertion delay/Minimize skew/Minimize power in distributing clocks to sequential elements
Algorithms:: MMM (Method of Means and Medians), DME (Deferred-Merge Embedding)
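The method of means and medians is simple enough to sketch: split the sinks in half at the median (alternating x and y), and connect the center of mass of the whole set to the center of mass of each half. The sink coordinates below are made up, and the sketch ignores wire delay and buffering entirely.

```python
def center_of_mass(sinks):
    n = len(sinks)
    return (sum(x for x, _ in sinks) / n, sum(y for _, y in sinks) / n)


def mmm_tree(sinks, split_x=True):
    """Return clock tree edges as (from_point, to_point) pairs."""
    if len(sinks) == 1:
        return []
    # Median split along the current axis gives two halves of (nearly) equal size.
    ordered = sorted(sinks, key=lambda p: p[0] if split_x else p[1])
    mid = len(ordered) // 2
    root = center_of_mass(sinks)
    edges = []
    for half in (ordered[:mid], ordered[mid:]):
        edges.append((root, center_of_mass(half)))
        edges += mmm_tree(half, not split_x)   # alternate the cut direction
    return edges


sinks = [(0, 0), (0, 2), (2, 0), (2, 2)]
for edge in mmm_tree(sinks):
    print(edge)
```

For the four corner sinks this produces an H-like tree rooted at the center, which is why the method gives roughly balanced (but not exactly zero) skew.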
3.6 leakage/dynamic power optimization
Objectives:: To save battery power.
Techniques: Multi Vt swapping (LP Formulation), Multi-Vdd (voltage islands), sleep modes, reduce switching probabilities in logic re-structuring
3.6 LVS/DRC
Objectives:: To check manufacturability of the chip
Mostly geometric algorithms for fast polygon manipulation to detect DRC violations. Fixing is done by the router adhering to routing rules.
3.7 RLC Extraction
Objectives:: Accuracy of extraction is important as this information is passed to delay calculation engine
Targets for algorithms:: Accuracy of model, memory, run time
Types of extractors: 2.5D and 3D
3D extraction uses Maxwell's equations/Green's functions to model the RLC of small geometries, and is also used to generate rules for the 2.5D extractor. This is the most accurate form of extraction (QuickCap/Raphael)
2.5D extractor uses the rules generated by 3D extractor with interpolation/Extrapolation (starRCXT)
Algorithms:: Model order reduction (MOR) techniques for reducing the size of RC networks so that they become manageable (PRIMA/PRIMO). Lots of new MOR techniques are coming up every day. AWE is the most popular in digital logic. Refer to Larry Pileggi's link in the Links section.. :)
3.8 Timing (Static timing analysis/statistical static timing analysis!)
Statistical timing is gaining popularity
Objective: Accuracy in timing without Multi-corner analysis
Algorithms: use process spread information in calculating timing spread
predict yield
Path based and block based approaches
Tool: PTSI
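To illustrate "use process spread to calculate timing spread and predict yield", here is a Monte Carlo sketch for a single path. It is not how block-based or path-based SSTA engines work internally; it assumes independent Gaussian stage delays, and all the numbers (in ps) are made up.

```python
import random


def timing_yield(stage_means, stage_sigmas, clock_period, samples=20000, seed=0):
    """Fraction of sampled chips whose path delay meets the clock period."""
    rng = random.Random(seed)   # fixed seed so runs are repeatable
    passing = 0
    for _ in range(samples):
        # One sample = one manufactured chip with its own stage delays.
        delay = sum(rng.gauss(m, s) for m, s in zip(stage_means, stage_sigmas))
        if delay <= clock_period:
            passing += 1
    return passing / samples


means = [100.0] * 5             # five stages, 100 ps nominal each
sigmas = [5.0] * 5              # 5 ps standard deviation per stage
for period in (500.0, 520.0, 540.0):
    print(period, timing_yield(means, sigmas, period))
```

A corner-based analysis would add up five worst-case stage delays; the statistical view shows how much of that margin is actually needed for a target yield.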
3.9 Noise avoidance and Repair
Extraction of coupling cap fed to delay calculator to find out crosstalk delta
Fixing done using track spacing and non default rules
Sizing of gates/buffers to reduce Aggressor slew
Timing window methods to reduce crosstalk delta pessimism
3.10 OCV analysis and Optimization
Objective:: To account for process spread within and across wafers
Might be nullified with wide adoption of SSTA
Algorithms are for common path pessimism removal
4. DFM/DFY
4.1 Litho simulation. Objective: How litho-aware is your GDS?
4.2 CMP checks. Objective: Make ECMP (electrochemical mechanical polishing) a robust process during manufacturing by checking metal density on the layout.
4.3 Std cell library : tweaks to std cell libraries to increase yield
4.4 leakage power: leakage power has an impact on yield as well
Tools: Refer to Blaze DFM's wonderful site
Pondering further on point:4
-----------------------------
As process technology shrinks from 90 nm to 65 nm and below (45 nm), it is difficult to manufacture such small geometries and still get a high yield (number of working chips per wafer). So manufacturability is getting tightly integrated with item 3, which we discussed above. One reason which immediately comes to mind is lithography. While the process is 45 nm, the wavelength of light used in lithography is still 193 nm (argon-fluoride laser; the F2 laser wavelength is 157 nm). Effects like diffraction are prominent. This means litho should be taken care of in design so that these effects don't translate to silicon (done using rules for the router in physical design, but this might not be the only technique). We will discuss more on each as we keep posting.
5. Analog (A/D, D/A, bandgap, current mirrors, diff amps, PLL's, DLL's)
5.1 Spice
Objective: Circuit level simulator
Tools: HSPICE, PSPICE, ELDO
5.2 Analog synthesis (An active research area)
5.3 RFIC
5.4 Analog layout/Extraction
5.5 Characterization and design of std cell libraries
5.5.1 Create .lib (liberty) timing/leakage/Noise models for std cell libraries
5.6 Memory modeling and characterization
6. Test and design for Testability.
Major areas: SCAN/BIST/Test Compression/Fault Modeling/ATPG/Testers
Tester Manufacturer: Teradyne