The only constant is change…

Wednesday, September 9th, 2009

“The only constant is change, continuing change, inevitable change, that is the dominant factor” during the implementation of complex chips today.

Before I go further, I would like to acknowledge that the quote above is from one of the most famous writers of our time, Isaac Asimov, and that it is a translated version of a quote by Heraclitus, a Greek philosopher who lived around 500 BC. I am sure these great writers and philosophers did not contemplate the use of this phrase in the context of implementing chips! But do you relate to it in your everyday timing constraint and verification work? I would go out on a limb and say that each of you would answer with an emphatic “YES”.

Here is a typical example of what I have recently experienced:

“I was working on a highly complex block, with 30+ clocks and soft IP components from several vendors. After weeks of careful, diligent work, all timing constraints (IO, inter-clock false paths etc.) were cleaned up, the post-synthesis netlist was meeting pre-layout timing, and the backend was making decent progress towards timing closure. Out of nowhere, one of the soft IP vendors delivered a new drop with some “minor” bug fixes. And guess what happened? The block netlist was a disaster after synthesis. Why? The new soft IP drop had introduced some half-cycle paths, along with additional inter-clock paths from some config registers (on clk1) to all other clock domains (clk2 … clkn). The half-cycle paths turned out to be legitimate, and I had to tweak synthesis variables and flows to meet timing. After carefully reviewing each new inter-clock path using the soft IP documentation, some vendor help, and PrimeTime analysis, I concluded that all of them were false paths. I updated the constraints/SDC to add the corresponding set_false_path commands, and synthesis was back to normal again. The entire process took about a week to finish.”

Can you relate to this experience when working on large SoCs? Have you ever had to spend several days identifying and fully understanding what changed in the design RTL, carefully analyzing and reviewing those changes, updating the timing constraints appropriately, iterating a few times in synthesis, and making sure that everything is back to normal? And just when you finished, guess what happens: there is yet another change that again causes havoc during implementation, and you repeat everything. This cycle repeats several times over the duration of the project.

Now, in this example, I am not ranting and raving about the soft IP vendor creating such a major bottleneck for implementation. They are just doing their job by providing a more robust IP to their customers, and they are not fully aware of the issues that can come up during implementation. The key issue here is how to make meaningful progress and converge on the implementation phase of a complex design in a chaotic, constantly changing environment, in many cases without a direct line of communication to the designer of the RTL code in question.

To do so, we must first acknowledge, as Isaac Asimov and Heraclitus said, that “The only constant is change”, and that it is going to stay that way. Schedule and cost planning for chip implementation puts a heavy focus on when the RTL netlist will “freeze”. Now “freezing” the RTL should certainly be a very high priority, and everyone in the design team should strive towards that goal. However, the reality is that freezing an RTL netlist is beyond the control of implementation teams, and it is an unreliable, unpredictable metric to be totally dependent upon. Think about it: if a major bug is found in the RTL, or some IP vendor drops a new version with critical fixes just days before tapeout, you have no choice but to accommodate that change, right? Or the marketing team comes along at the 11th hour and says that a major customer wants to see some minor changes in the feature set of the chip, which obviously result in RTL changes, and you have to suck it up and accept that change. This is just a fact of life in this business: chips only succeed when they sell (hopefully in large volumes!).

How we respond to these unpredictable, random changes is something we do control. Having a solution that lets us deal with these day-to-day changes in an efficient, targeted way will be a big factor in ensuring on-time delivery of projects while maintaining a high degree of flexibility. For example:

  • Automatically identify (in minutes) any new clocks, clock logic, or inter-clock paths (false or not) that were introduced in a new netlist, compared to a previous netlist
  • Quickly compare any two SDC files and get insight into the meaningful changes
  • Identify whether constraints have been modified, or need to be modified, on any critical IO ports
  • Find out whether any new half-cycle paths were introduced in the latest netlist, compared to the previous one
  • Determine whether register clock propagation has changed from one version to the next
  • Quickly detect any significant change in the number of registers driven by any clock in the design that might impact CTS implementation
  • In general, get quick, upfront, automated visibility into what changed in the design netlist and constraints that would cause implementation problems in the downstream flow, before these problems bubble back up in a big, messy way
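
To make the second item a little more concrete, here is a minimal sketch in Python of what a first-pass SDC comparison could look like. It is only an illustration under some loud assumptions: the function names (normalize_sdc, diff_sdc) and the sample constraints are made up for this example, and the comparison is purely textual. A real SDC file is Tcl and can contain variables, loops, and wildcards, so a production tool would need to evaluate the constraints against the design rather than compare strings.

```python
from collections import defaultdict


def normalize_sdc(text):
    """Return the set of commands in an SDC file, one normalized string each.

    Simplifications (assumptions of this sketch): everything after '#' is
    treated as a comment, and a trailing backslash joins a command with the
    next line. Runs of whitespace are collapsed so that pure formatting
    changes do not show up as differences.
    """
    commands = set()
    pending = ""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        commands.add(" ".join((pending + line).split()))
        pending = ""
    return commands


def diff_sdc(old_text, new_text):
    """Group added and removed commands by command name (first word)."""
    old, new = normalize_sdc(old_text), normalize_sdc(new_text)
    report = defaultdict(lambda: {"added": [], "removed": []})
    for cmd in sorted(new - old):
        report[cmd.split()[0]]["added"].append(cmd)
    for cmd in sorted(old - new):
        report[cmd.split()[0]]["removed"].append(cmd)
    return dict(report)


# Hypothetical constraints before and after a new IP drop.
old_sdc = r"""
create_clock -name clk1 -period 10 [get_ports clk1]
create_clock -name clk2 -period 8 [get_ports clk2]
set_input_delay 2.0 -clock clk1 [get_ports din]
"""

new_sdc = r"""
# updated after the new IP drop
create_clock -name clk1 -period 10 [get_ports clk1]
create_clock -name clk2   -period 8 [get_ports clk2]
set_input_delay 2.5 -clock clk1 [get_ports din]
set_false_path -from [get_clocks clk1] \
    -to [get_clocks clk2]
"""

report = diff_sdc(old_sdc, new_sdc)
for name, changes in sorted(report.items()):
    for cmd in changes["removed"]:
        print(f"- {cmd}")
    for cmd in changes["added"]:
        print(f"+ {cmd}")
```

In this made-up case the reformatted create_clock line is ignored, while the changed input delay and the new set_false_path stand out, which is the kind of signal you want to see within minutes of receiving a new drop rather than after a week of debugging.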

In the absence of such robust capabilities, design changes today are dealt with using painstaking, laborious methods: manually going through timing reports, browsing through IP documentation that runs to hundreds of pages, debugging check_timing report files, interactively analyzing issues in native EDA tools using ad-hoc scripts, and exchanging numerous back-and-forth emails and meetings with the RTL, IP vendor, and physical design teams. This process takes a tremendous amount of time, which kills schedules. Moreover, it requires a lot of bandwidth and engineering resources (especially for complex chips with numerous P&R blocks), and that kills budgets. Furthermore, this pseudo-manual process is error prone, causing engineers to miss real issues that are then identified much later in the design cycle, causing even more pain for everyone. Lastly, it involves a great deal of mundane, repetitive grunt work.

How can we drastically minimize all of the above and handle design changes better? Couple automation with good methodologies, and a large amount of pain and uncertainty will go away!