Outlined below is a summary of our past research projects. Many of these projects laid the foundation for our current research interests. More information about these pursuits can be obtained from the relevant papers or by contacting us.
Dynamic Thermal Sensing for ICs
Dynamic Thermal Sensing for Runtime Thermal Profile Estimation:
Temperature has become a foremost challenge affecting chip reliability and mean time to failure. On-chip temperature exhibits significant variation in time and space, caused by the differing activity patterns of the various modules on the chip and the varying power footprints of different applications. This underlying spatio-temporal unpredictability means that a purely design-time approach to curtailing thermal effects is not sufficient. Runtime thermal management cannot be performed effectively without accurate knowledge of the chip's thermal state (the temperature at each location of interest at any given time, together with the associated trends). Modern chips, especially CPUs, are being equipped with on-chip thermal sensors that can provide runtime information. Although thermal sensors are promising, there is no systematic methodology that intelligently incorporates the thermal sensing infrastructure into the design (where the sensors should go, how the data should be collected, and so on). Sensors cannot be placed everywhere on the chip, and they tend to be noisy and error-prone. Generating the thermal profile of the entire chip from the noisy readings of a few sensors presents an intriguing theoretical and practical challenge.
We have investigated a statistical methodology that develops a scientific basis for deciding the locations of the thermal sensors, the number of bits assigned to each sensor, and the size of the central register that stores the readings of all the sensors. Since placing sensors on chip is costly, our methods strive to minimize the redundancy in the sensor observations: two sensors placed at locations with highly correlated temperatures are wasteful, while placing them at locations with weaker thermal correlation yields more information about the thermal profile. Once the thermal sensing infrastructure is in place, we use the noisy sensor readings to estimate the chip's thermal profile accurately. We have developed statistical schemes in which the correlations between the sensor and non-sensor locations are exploited to estimate the entire thermal profile, using concepts from Kalman filters and other dynamic state estimation paradigms while accounting for sensor noise.
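As an illustration of the estimation step, here is a minimal Kalman-filter sketch that recovers a full thermal profile from a few noisy sensors. The dynamics matrix, noise levels, and sensor placement are made-up assumptions for the example, not the parameters used in our work.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 4                                   # locations of interest on the chip
A = 0.9 * np.eye(n) + 0.02              # assumed coupled thermal dynamics
H = np.array([[1., 0., 0., 0.],         # sensors only at locations 0 and 2
              [0., 0., 1., 0.]])
Q = 0.01 * np.eye(n)                    # process (power-activity) noise
R = 0.25 * np.eye(2)                    # sensor noise: sensors are noisy

x_true = np.full(n, 50.0)               # true temperatures (deg C)
x_est = np.zeros(n)                     # filter state
P = np.eye(n)                           # estimate covariance

for _ in range(100):
    # simulate the chip and its two sensor readings
    x_true = A @ x_true + rng.multivariate_normal(np.zeros(n), Q)
    y = H @ x_true + rng.multivariate_normal(np.zeros(2), R)
    # predict
    x_est = A @ x_est
    P = A @ P @ A.T + Q
    # update: the cross-covariances in P let two sensors
    # correct the estimate at all four locations
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x_est = x_est + K @ (y - H @ x_est)
    P = (np.eye(n) - K @ H) @ P

print(x_est.shape)                      # full 4-location profile
```

The key point mirrored from the text is that the filter exploits correlation: the gain `K` spreads the two sensor innovations across all four locations.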
Autonomous Dynamic Thermal Management:
Managing the temperature surges of computer systems has become an active topic of research, with much of the existing effort focused on controlling parameters such as task scheduling, CPU voltage, and speed using a combination of reactive and predictive-proactive schemes. With the growing complexity of computer systems, radical approaches to thermal management are needed. We have developed thermal management schemes at different levels of abstraction: from data centers down to individual chips, and from application software down to the hardware platform. The methods we investigated strive for an autonomous balance of schemes across these levels. For example, if a simple redistribution of tasks in a data center can address a thermal issue, then the costly CPU-level thermal management schemes need not be activated. A key aspect of our work deals with autonomy and self-healing of systems.
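The escalation idea can be sketched as follows. The threshold, actions, and temperature arithmetic below are illustrative assumptions, not the actual controller from our work: the point is only that the cheap datacenter-level action is tried first and the costly CPU-level action is a fallback.

```python
# Hypothetical sketch: try the cheapest thermal action first, escalate only
# if the cheaper level cannot resolve the hotspot. All numbers are assumed.
THRESHOLD_C = 85.0

def migrate_tasks(temps):
    """Datacenter level: move load from the hottest to the coolest node."""
    hot = max(temps, key=temps.get)
    cool = min(temps, key=temps.get)
    temps[hot] -= 5.0       # crude assumed model of the effect of migration
    temps[cool] += 2.0
    return temps

def scale_cpu_frequency(temps, node):
    """CPU level: costly DVFS action, used only as a fallback."""
    temps[node] -= 10.0
    return temps

def manage(temps):
    for node, t in list(temps.items()):
        if t > THRESHOLD_C:
            temps = migrate_tasks(temps)         # cheap action first
            if temps[node] > THRESHOLD_C:        # escalate if insufficient
                temps = scale_cpu_frequency(temps, node)
    return temps

print(manage({"node0": 92.0, "node1": 70.0}))
```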
Resource Constrained Distributed Data Collection and Estimation:
Although resource management (such as energy) in sensor networks has been a very active topic of research for the past decade, the fundamental tradeoff between sensor computing capability and the quality and timeliness of sensing is yet to be understood. Sensor networks are examples of distributed collaborative systems that autonomously collect data and make inferences about the system state. Fundamental algorithms such as Kalman/particle filters and Wyner-Ziv coding are used extensively alongside protocols for data communication. Imposing resource constraints such as finite precision, limited energy, and fixed sensor node memory changes the nature of these distributed inference problems entirely. We have investigated such research agendas in the context of distributed video coding, where large visual sensor networks collect real-time visual data to perform target tracking and achieve perpetual surveillance.
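A toy sketch of one such resource constraint, finite precision: each sensor quantizes its reading to a fixed number of bits before fusion, and the fused estimate degrades as the bit budget shrinks. The numbers and the simple averaging fusion are assumptions for illustration, not our actual system.

```python
import numpy as np

rng = np.random.default_rng(1)
true_value = 3.7
readings = true_value + 0.1 * rng.standard_normal(50)   # 50 noisy sensors

def quantize(x, bits, lo=0.0, hi=8.0):
    """Uniform mid-rise quantizer over [lo, hi] with 2**bits levels."""
    step = (hi - lo) / (2 ** bits)
    return lo + step * (np.floor((x - lo) / step) + 0.5)

errs = {}
for bits in (2, 4, 8):
    fused = quantize(readings, bits).mean()    # fusion = simple averaging
    errs[bits] = abs(fused - true_value)
print(errs)   # fused-estimate error shrinks as the bit budget grows
```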
Statistical Analysis and Optimization of VLSI Circuits in Presence of Fabrication Variability:
To keep up with society's growing appetite for electronic systems such as faster computers and ultra-low-energy mobile devices, the semiconductor industry has continuously reduced the dimensions of fabricated features on VLSI chips. This has enabled us to pack billions of transistors onto modern chips, thereby unlocking applications that were unimaginable just a few years ago.
VLSI fabrication at nano-scale dimensions is very hard to control and is therefore accompanied by significant randomness. This causes many fabricated chips to deviate from their power and performance/speed specifications, leading to losses in reliability and yield. State-of-the-art design schemes ignore this randomness, which is becoming increasingly important. We have developed several effective techniques for understanding and reducing the impact of fabrication randomness through intelligent, automated VLSI design schemes that proactively account for it statistically.
Fabrication variability causes a chip's timing and power to deviate randomly from the specifications. The state of the art either ignores this variability or makes highly simplifying assumptions about its nature and impact. We have made several significant contributions to modeling the impact of fabrication randomness on a chip's timing characteristics, developing techniques that accurately and efficiently predict the probability density function (PDF), i.e., the statistical spread, of chip timing in the presence of fab randomness. Using this knowledge of the statistical spread in design characteristics (such as timing and power), we have developed schemes for controlling design parameters, such as device and wire dimensions, to reduce the detrimental impact of fabrication randomness on yield and reliability. Our approach first develops a precise mathematical understanding of how controlling design parameters impacts yield and reliability, and then applies problem-specific customizations to make this theory practically applicable. This two-pronged approach was applied to two distinct but related schemes for countering variability. The first optimizes the circuit parameters during the design phase so as to leave enough slack around the specifications for immunity to variability (a low probability of violating specifications). The second corrects the impact of variability on design constraints after fabrication, on each chip separately.
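The notion of a statistical spread in chip timing can be illustrated with a Monte Carlo sketch. Our work develops analytical prediction techniques; the delay model, variation level, spec, and independence assumption below are made up for the example (a real analysis must also model spatial correlation between gates).

```python
import numpy as np

rng = np.random.default_rng(2)
n_gates, n_samples = 10, 100_000
nominal_delay = 20.0                       # ps per gate (assumed)
sigma = 0.05 * nominal_delay               # 5% process variation (assumed)

# each sample is one fabricated chip; gate delays vary independently here
gate_delays = nominal_delay + sigma * rng.standard_normal((n_samples, n_gates))
path_delay = gate_delays.sum(axis=1)       # empirical spread of path timing

spec = 210.0                               # timing specification, ps (assumed)
yield_est = np.mean(path_delay <= spec)    # fraction of chips meeting spec
print(round(path_delay.mean(), 1), round(yield_est, 3))
```

The histogram of `path_delay` is the empirical timing PDF; yield is the mass of that PDF below the specification, which is exactly the quantity the design-time optimization tries to push up.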
Our work exposes several strong mathematical properties of the binning yield loss function (a variant of yield loss) with respect to circuit parameters such as transistor sizes. By exploiting these properties, we could formulate the problem of assigning sizes to logic gates to minimize binning yield loss as a convex program. This is a significant result because convexity allows the formulation to be solved optimally and efficiently: using developments in convex optimization theory, we could solve the logic gate sizing problem to generate highly optimized designs with very small yield loss.
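To show why convexity matters here, consider a hypothetical toy sizing instance (not the binning-yield formulation from the papers): with gate sizes x_i and the substitution y_i = log x_i, a posynomial objective such as delay plus weighted area becomes convex, so even plain gradient descent finds the global optimum. All coefficients are assumed.

```python
import numpy as np

a = np.array([4.0, 2.0, 1.0])   # per-gate delay coefficients (assumed)
lam = 0.5                        # area-penalty weight (assumed)

def objective(y):
    x = np.exp(y)                # y = log(size) substitution
    return np.sum(a / x) + lam * np.sum(x)   # delay + weighted area

def gradient(y):
    x = np.exp(y)
    return -a / x + lam * x      # derivative of each convex term w.r.t. y

y = np.zeros(3)
for _ in range(500):
    y -= 0.1 * gradient(y)       # convexity guarantees global convergence

sizes = np.exp(y)
print(np.round(sizes, 3))        # analytic optimum here is sqrt(a / lam)
```

For this separable toy objective the optimum can be checked by hand (x_i = sqrt(a_i / lam)); the value of the convexity result in the papers is that the same guarantee of a global optimum carries over to the much harder binning-yield objective.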
Such design-time approaches tend to over-constrain the circuit and rely on accurate knowledge of variability statistics, which is not always available. We therefore also investigated a different paradigm in which the impact of variability on design constraints was corrected after fabrication, using specific knobs installed in the design that allow detection and correction of timing constraint violations due to fab randomness. Our work exposed several significant mathematical properties that allow these knobs to be controlled efficiently and effectively, reducing yield loss and improving reliability.
Methods for Leakage and Dynamic Power Optimization:
The growing desire for lightweight mobile electronics and the ever-increasing transistor counts integrated on-chip have made reducing power/energy dissipation an extremely significant part of the VLSI design process. We have developed several techniques for reducing power dissipation, thereby controlling on-chip thermal hotspots (one of the primary causes of reliability loss) and improving battery life. We followed a two-pronged approach: first developing a precise mathematical understanding of how controlling design parameters impacts power dissipation, and then applying problem-specific customizations to this theory.
Techniques for Leakage Power Reduction: As fabrication dimensions scale into the mid/lower nanometers, leakage (static) power becomes a significant contributor to overall chip power. We have investigated several automated design techniques, including forward body biasing (FBB), MTCMOS sleep transistors, and dual-threshold technology, that can provide a 10x reduction in overall leakage power. Reducing leakage by controlling these circuit parameters increases delay, so by developing a rigorous mathematical understanding of the leakage-versus-timing tradeoff we derived effective schemes for reducing leakage under circuit timing constraints. For example, our approaches investigated ways of controlling the transistor body bias such that leakage is minimized when a device (like a cell phone) is in standby and does not require high speed, and the device is switched back to high-speed mode (with high leakage) whenever high performance is needed. A similar approach can be enabled by MTCMOS (multiple-threshold CMOS) sleep transistors: using a control signal connected to these sleep transistors, the gates and functional modules of the device can be put into a sleep state that dissipates extremely low leakage power, and the same control knob can wake the device whenever it is needed. We proposed highly effective fine-grained sleep transistor placement and sizing schemes with significantly better results than traditional approaches. We also investigated leakage minimization techniques based on assigning threshold voltages to logic gates from a choice of two available thresholds; in these papers, we studied the theoretical properties of the threshold assignment problem and proved its continuous version to be a convex program.
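The leakage-versus-timing tradeoff in dual-threshold assignment can be illustrated with a toy greedy sketch (the papers establish stronger results, including convexity of the continuous version; this heuristic and all of its numbers are assumptions). Each gate starts at low Vth (fast but leaky), and gates are flipped to high Vth while the path still meets timing.

```python
# (name, extra delay if high Vth, leakage saved if high Vth) - all assumed
gates = [
    ("g0", 2.0, 10.0),
    ("g1", 1.0, 8.0),
    ("g2", 3.0, 4.0),
]
slack = 3.5   # timing slack available on the path (assumed units)

# start everything at low Vth, then flip gates to high Vth in order of
# leakage saved per unit of delay cost, while slack remains
assignment = {name: "low" for name, _, _ in gates}
for name, d, leak in sorted(gates, key=lambda g: g[2] / g[1], reverse=True):
    if d <= slack:
        assignment[name] = "high"
        slack -= d

print(assignment, round(slack, 1))
```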
Overall, our methods resulted in orders-of-magnitude reductions in leakage with minimal penalty on overall circuit timing, compared with designs optimized solely for high-speed operation.
Controlling Dynamic Power in VLSI Systems: Dynamic power dissipation occurs due to the switching of transistors, which dissipates power/energy in the device capacitances. Typical high-performance computing workloads carry a high dynamic power overhead. We investigated the problem of distributing tasks to a chip's functional resources so that overall switching is minimized; since the problem is NP-complete, we proposed effective graph-theoretic heuristics. Allocating dual supply voltages to gates is another very effective way of reducing dynamic power. We also addressed clock power dissipation by judiciously synthesizing the clock tree. Experimental results highlight the effectiveness of these schemes in significantly reducing the dynamic power dissipation of modern multi-million-transistor designs.
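A toy illustration of the task-distribution idea (our actual graph-theoretic heuristics are more sophisticated, and all data here are assumptions): switching activity grows with the bit-level difference between consecutive operand patterns on a functional unit, so a greedy scheme appends each task to the unit whose last task it most resembles.

```python
# assumed operand patterns for 5 tasks, to be spread over 2 functional units
tasks = [0b1100, 0b1101, 0b0011, 0b0010, 0b1111]
n_units = 2

def hamming(a, b):
    """Bit flips between consecutive patterns ~ switching activity."""
    return bin(a ^ b).count("1")

units = [[] for _ in range(n_units)]
for t in tasks:
    # cost of appending t to a unit = switching vs. that unit's last task
    costs = [hamming(u[-1], t) if u else 0 for u in units]
    units[costs.index(min(costs))].append(t)

total_switching = sum(hamming(u[i], u[i + 1])
                      for u in units for i in range(len(u) - 1))
print(units, total_switching)
```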
Computational Efficiency in Particle Filters with Applications to Computer Vision:
The technique of particle filtering has been widely applied to inference problems for non-linear systems such as target tracking and navigation. Computational efficiency, however, is a major hindrance to the practicality of particle filters. We have investigated several techniques for improving their computational efficiency using algorithmic modifications, parallelization, and pipelining on multi-processor machines. Our schemes could solve several particle filtering instances (such as target tracking in video) almost in real time, while traditional implementations could not come close in terms of computational efficiency.
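A minimal particle filter for 1D target tracking shows where the computational cost lives (the motion and measurement models and all noise levels below are illustrative assumptions; our contribution was making such filters fast, not this basic algorithm).

```python
import numpy as np

rng = np.random.default_rng(3)
n_particles, n_steps = 1000, 50
x_true, v = 0.0, 1.0                        # target moves at assumed speed 1

particles = rng.normal(0.0, 1.0, n_particles)
weights = np.full(n_particles, 1.0 / n_particles)

for _ in range(n_steps):
    x_true += v
    z = x_true + rng.normal(0.0, 0.5)       # noisy measurement
    # propagate + weight: these per-particle operations are the dominant
    # cost and are what parallelization/pipelining target
    particles += v + rng.normal(0.0, 0.2, n_particles)
    weights *= np.exp(-0.5 * ((z - particles) / 0.5) ** 2)
    weights /= weights.sum()
    # resample when the effective sample size collapses
    if 1.0 / np.sum(weights ** 2) < n_particles / 2:
        idx = rng.choice(n_particles, n_particles, p=weights)
        particles = particles[idx]
        weights = np.full(n_particles, 1.0 / n_particles)

estimate = np.sum(weights * particles)
print(round(estimate, 2), round(x_true, 2))
```

The resampling step is the classic sequential bottleneck: propagation and weighting are embarrassingly parallel across particles, while resampling needs the global weight vector, which is why algorithmic restructuring is needed for real-time operation.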