2015 CICC Technical Papers - Tuesday
Session 11 – Wireline Building Blocks
Tuesday, May 2, 9:00 – 12:00, Lady Bird 1 Room
Session Chair: Eric Naviasky, Cadence
Session Co-Chair: Mohammad Hekmat, Samsung
11-1 A 10 GHz 56 fsrms-Integrated-Jitter and -247 dB FOM Ring-VCO Based Injection-Locked Clock Multiplier with a Continuous Frequency-Tracking Loop in 65 nm CMOS
Xuqiang Zheng, Fangxu Lv, Feng Zhao*, Shigang Yue*, Chun Zhang, Ziqiang Wang, Fule Li, Hanjun Jiang, Zhihua Wang, Tsinghua University, *University of Lincoln
This paper develops a 10GHz Ring-VCO based injection-locked clock multiplier (RILCM) using a new timing-adjusted PD based hybrid frequency tracking loop in 65nm CMOS. The measured results show that it achieves 56.1fs rms-jitter and -57.13dBc spur level. The calculated figure-of-merit (FOM) is -247.3dB.
11-2 Jitter Injection for On-Chip Jitter Measurement in PI-Based CDRs
J. Liang, A. Sheikholeslami, University of Toronto
H. Tamura, H. Yamaguchi, Fujitsu Laboratories Limited
The RMS relative jitter between the clock and data of a 28Gb/s half-rate PI-based digital CDR fabricated in 28nm CMOS is measured with sub-picosecond accuracy by injecting square-wave jitter through the CDR’s PI code and measuring its effect on the autocorrelation function of the bang-bang PD output.
11-3 A 27.1 mW, 7.5-to-11.1 Gb/s Single-Loop Referenceless CDR With Direct Up/dn Control
K. Park, W. Bae, D.-K. Jeong, Seoul National University
A 7.5-to-11.1 Gb/s half-rate referenceless CDR with a compact frequency acquisition scheme is proposed. Using the bang-bang phase-frequency detector with a direct up/dn control, the referenceless CDR is realized by a single-loop architecture. The proposed CDR achieves a wide capture range, low power, and small area.
11-4 A 40-Gbps 0.5-pJ/bit VCSEL Driver in 28nm CMOS with Complex Zero Equalizer
A. Sharif-Bakhtiar, M. G. Lee*, A. Chan Carusone, University of Toronto, *Fujitsu Labs of America
This paper presents a 40Gbps VCSEL driver in 28nm CMOS technology achieving 1.3dBm OMA at a record-low 0.5pJ/b power efficiency. The transmitter uses a new type of low-power equalizer with a pair of tunable complex zeros in its transfer function to compensate for VCSEL electro-optical ringing, enabling 40Gbps operation.
11-5 Low-Power CMOS Receivers For Short Reach Optical Communication
A. Sharif-Bakhtiar, M. G. Lee*, A. Chan Carusone, University of Toronto, *Fujitsu Labs of America
This paper explains the motivation behind low-bandwidth frontend optical receivers in CMOS. Reported receivers with low-bandwidth frontends using DFE, CDS, and integrate-and-dump (ID) are analyzed. Finally, the design of an ID receiver fabricated in 28nm CMOS with -8.5dBm sensitivity and 0.7pJ/b power efficiency at 20Gbps is explained.
Session 12 – Analog Techniques I
Tuesday, May 2, 9:00 – 12:00, Lady Bird 2 Room
Session Chair: Nagendra Krishnapura, IIT Madras
Session Co-Chair: Ken Suyama, Epoch Microelectronics
12-1 A 0.5V Supply, 49nW Band-Gap Reference and Crystal Oscillator in 40nm CMOS
Abhirup Lahiri, Pradeep Badrathwal, Nitin Jain, Kallol Chatterjee, STMicroelectronics
Operating from 0.5V supply, a band-gap reference (BGR) and 32kHz crystal oscillator (XO) are co-designed in 40nm CMOS process with <49nW power consumption from -40°C to 120°C and temperature coefficients of 8ppm/°C and 0.25ppm/°C, respectively. The power consumptions of both XO and BGR at 120°C are 2x lower than previous works.
12-2 A Start-up Boosting Circuit with 133x Speed Gain for 2-Transistor Voltage Reference
Dongkwun Kim, Wanyeong Jung, Sechang Oh, Kyojin D. Choo, Dennis Sylvester, David Blaauw, University of Michigan
This work presents a start-up boosting circuit designed for fast stabilization of a 2-transistor voltage reference. A clock injection method is used to induce a large bias on the 2-transistor voltage reference, resulting in fast output-voltage settling, which is critical to reducing initialization time and energy consumption.
12-3 A Precisely-Timed Energy Injection Technique Achieving 58/10/2µs Start-Up in 1.84/10/50MHz Crystal Oscillators
H. Esmaeelzadeh, S. Pamarti, University of California, Los Angeles
A fast start-up crystal oscillator using a precisely-timed injection technique is proposed. The prototype 65nm CMOS IC includes 3 crystal oscillators, targeting 1.84/10/50MHz with measured start-up times of 58/10/2μs while consuming 6.7/45.5/195μW respectively. This corresponds to 15x faster start-up than prior art. For each oscillator, two crystals with different package sizes and Q-factors were tested to verify the technique’s robustness across crystal parameters and frequency variations.
12-4 A 0.7V Time-based Inductor for Fully Integrated Low Bandwidth Filter Applications
B. Salz, M. Talegaonkar*, G. Shu**, A. Elmallah, R. Nandwana, B. Sahoo, P. K. Hanumolu, University of Illinois at Urbana-Champaign, *InPhi, **Oracle
A fully digital inductor is demonstrated in 65nm CMOS with wide tuning range and small area. The proposed technique uses novel time-domain signal processing to generate an inductance. Realizing the gyrator in this way achieves small area and takes advantage of technology scaling.
12-5 A 0.65mW 20MHz 5th-Order Low-Pass Filter with +28.8dBm IIP3 Using Source Follower Coupling
Y. Xu, J. Muhlestein, U. Moon, Oregon State University
A highly linear continuous-time low-pass filter (LPF) topology using source follower coupling is presented with excellent power efficiency. It synthesizes a 3rd-order low-pass transfer function in a single stage using coupled source followers and three capacitors, and can be configured to 2nd-order by disconnecting a capacitor. A 5th-order Butterworth prototype is designed with a cascade of two stages in 0.18μm CMOS, and occupies a core area of 0.12mm^2. Operating with a 1.3V supply, the filter consumes 0.5mA current, and achieves a bandwidth of 20MHz with 82dB stop-band rejection. The measured in-band IIP3 is +28.8dBm. The dynamic range is 74dB, with 15.3nV/√Hz averaged in-band input-referred noise.
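As a quick consistency check on the 12-5 figures, the supply power can be recomputed from the quoted supply voltage and current. This is only a sketch of the arithmetic; the function name `supply_power_w` is introduced here for illustration and does not come from the paper.

```python
def supply_power_w(supply_v: float, current_a: float) -> float:
    """Return DC supply power in watts as supply voltage times supply current."""
    return supply_v * current_a

# Values quoted in the 12-5 abstract: 1.3V supply, 0.5mA current.
filter_power_w = supply_power_w(1.3, 0.5e-3)

# Report in milliwatts for comparison with the 0.65mW in the paper title.
print(f"12-5 filter power: {filter_power_w * 1e3:.2f} mW")
```

The result agrees with the 0.65mW stated in the title.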
Session 13 – Security Circuits and Systems
Tuesday, May 2, 9:00 – 12:00, Lady Bird 3 Room
Session Chair: Swaroop Ghosh, Pennsylvania State University
Session Co-Chair: Xin Li, Carnegie Mellon
13-1 Energy Efficient and Ultra Low Voltage Security Circuits for Nanoscale CMOS Technologies
Sanu Mathew, Sudhir Satpathy, Vikram Suresh, Ram K. Krishnamurthy, Circuit Research Lab, Intel Corporation
Low-area energy-efficient security primitives are key building blocks for enabling end-to-end content protection, user authentication, and consumer confidentiality in the IoT world that is estimated to surpass 50 billion smart and connected devices by 2020. This paper describes design approaches that blend energy-efficient circuit techniques with optimal accelerator micro-architecture datapath and hardware-friendly arithmetic to achieve ultra-low energy consumption in security platforms for seamless adoption in area/battery constrained and self-powered systems. Industry-leading energy efficiency is demonstrated with three designs, fabricated and measured in advanced process technologies: 1) a 2040-gate arithmetically optimized composite-field Sbox based AES accelerator achieves 289Gbps/W peak energy-efficiency while offering 432Mbps throughput in 22nm tri-gate CMOS; 2) a hybrid Physically Unclonable Function (PUF) circuit leverages burn-in induced aging to reduce bit-error, followed by temporal-majority-voting, dark-bit masking, and error-correction conditioning techniques to generate a 100% stable full-entropy key with 190fJ/bit energy consumption in 22nm tri-gate CMOS; 3) a light-weight all-digital TRNG uses in-line correlation suppressor and entropy-extractor circuits to achieve >0.99 min-entropy with 3pJ/bit measured energy-efficiency while operating down to 300mV in 14nm tri-gate CMOS.
13-2 A DRAM based Physical Unclonable Function Capable of Generating >10^32 Challenge Response Pairs per 1Kbit Array for Secure Chip Authentication
Qianying Tang, Chen Zhou, Woong Choi*, Gyuseong Kang*, Jongsun Park*, Keshab Parhi, Chris H. Kim, University of Minnesota, *Korea University
A Physically Unclonable Function (PUF) based on a 65nm logic-compatible DRAM achieves a higher level of security compared to previous memory based PUFs by supporting >10^32 possible challenge response pairs per 1Kbit array. Hardware data shows an intra-chip Hamming Distance (HD) of 0.0039 by utilizing a zero-overhead repetitive write-back technique along with bit-masking. The proposed eDRAM based PUF has a 0.68µm^2 bit cell area and consumes 0.89pJ/bit.
13-3 Trustworthy System-on-Chip Design for Internet of Things
Sandip Ray, NXP Semiconductors
The Internet of Things (IoT) regime arguably began about a decade ago, when the number of connected computing devices exceeded the human population. Today our environment includes billions of connected devices, coordinating and communicating to implement applications on the scale of intelligent homes, self-driving automobiles, and smart cities. The trend is towards even more proliferation of these devices, with estimates of trillions within the next 15 years, representing the fastest growth for any sector at any time in human history. Security and trustworthiness are critical requirements for computing systems in IoT applications. In particular, these systems track, collect, and analyze some of our most private, personal information including health, sleep patterns, contact information, browsing patterns, etc. In addition, the system may contain other sensitive assets built in by the manufacturer, e.g., cryptographic and DRM keys, fuses, etc. It is crucial to ensure that all such sensitive information is protected from malicious, unauthorized access. Consequently, a significant component of the development of a modern System-on-Chip (SoC) design is expended on architecting, designing, and validating security mechanisms. In this talk, we will look at security assurance challenges for modern SoC designs targeted for Internet-of-Things applications. Security assurance in current industrial practice is a highly complex activity, spanning the entire system life-cycle, and involving trade-offs and collaboration among a large number of stakeholders. We will discuss the gaps between the current state of the practice and the assurance requirements, and some of the research initiatives undertaken to bridge these gaps.
Research in the area marries several research topics in computer science and engineering, including architecture, power/performance management, hardware/software co-design, and verification, and the talk will give a flavor of the nature of the cross-cutting research necessary to develop trustworthy computing devices in the IoT era.
13-4 An Area-Efficient Microcontroller with an Instruction-Cache Transformable to an Ambient Temperature Sensor and a Physically Unclonable Function
Teng Yang, Jiangyi Li, Minhao Yang, Peter R. Kinget, Mingoo Seok, Columbia University
This paper presents a highly area-efficient SoC design with ambient temperature sensing and PUF operations based on a unique transformation of the microcontroller’s instruction cache (I$) into a temperature sensor and a PUF. It has performance comparable to the state of the art but consumes 9.8X smaller sensor frontend area.
Session 14 – Forum – Self-Sustaining IoTs – Fact or Fiction
Tuesday, May 2, 9:00 – 12:00, Lady Bird Studio Room
Session 15 – Energy Efficient Wireless for 5G and IoT
Tuesday, May 2, 2:00 – 5:30, Lady Bird 1 Room
Session Chair: Woogeun Rhee, Tsinghua University
Session Co-Chair: Swaminathan Sankaran, Texas Instruments
15-1 Energy Efficiency Maxima for Wireless Communications: 5G, IoT, and Massive MIMO
Earl McCune, Eridan Communications
Maximum energy efficiency of any wireless communication link requires a global optimization across the entire block diagram, the signal modulation, and the link operating protocol. Important aspects of signal modulation are presented, followed by protocol aspects needed for link efficiency. Operating temperature consequences of LTE for massive-MIMO arrays are explored.
15-2 An Ultra-Low-Power Wake-Up Receiver with Voltage-Multiplying Self-Mixer and Interferer-Enhanced Sensitivity
Vivek Mangal, Peter R. Kinget, Columbia University
A 0.5V self-mixer-first 550MHz 220nW wake-up receiver in 0.13µm CMOS has a -56.4dBm sensitivity at 36.36kbps and an energy consumption of 6.1pJ/bit. A 10-stage voltage-multiplying self-mixer using MOS transistors in weak inversion consumes 2.7nW and offers multi-stage conversion gain at baseband. In the presence of a -43.5dBm PM interferer, the alternate 1.1µW high-frequency baseband path in the receiver offers an enhanced sensitivity of -63.6dBm.
15-3 A 6.1mW 5Mb/s 2.4GHz Transceiver with F-OOK Modulation for High Bandwidth and Energy Efficiencies
Y. Zhang, R. Zhou, W. Rhee, Z. Wang, Tsinghua University
This paper presents an energy/bandwidth efficient frequency-domain OOK (F-OOK) transceiver for short-range communications. The transmitter performs the F-OOK modulation using a PLL based high-point modulator and a constant-envelope PA for low power consumption. The receiver consists of a sliding-IF RF front end and an 8-bit dual-channel ADC. Digital signal processing is done by an off-chip FPGA to provide F-OOK demodulation with an interference-robust algorithm based on sliding-window FFT and magnitude comparison methods. A 2.4GHz 5Mb/s F-OOK transceiver is implemented in 65nm CMOS. The sensitivity is -96dBm at 5Mb/s. The transceiver consumes 6.1mW from a 0.8V supply, achieving an energy efficiency of 1.22nJ/b with a bandwidth efficiency of 96%.
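The energy-per-bit figures quoted in the 15-2 and 15-3 abstracts follow from dividing average power by bit rate. The sketch below reproduces that arithmetic from the quoted numbers only; the helper name `energy_per_bit` is an illustrative assumption, not something defined in either paper.

```python
def energy_per_bit(power_w: float, bit_rate_bps: float) -> float:
    """Return energy per bit in joules: average power divided by bit rate."""
    if bit_rate_bps <= 0:
        raise ValueError("bit rate must be positive")
    return power_w / bit_rate_bps

# Figures quoted in the Session 15 abstracts.
wake_up_rx = energy_per_bit(220e-9, 36.36e3)  # 15-2: 220nW at 36.36kbps
f_ook_trx = energy_per_bit(6.1e-3, 5e6)       # 15-3: 6.1mW at 5Mb/s

print(f"15-2: {wake_up_rx * 1e12:.2f} pJ/bit")
print(f"15-3: {f_ook_trx * 1e9:.2f} nJ/bit")
```

Both results agree with the abstracts after rounding: about 6.1pJ per bit for the wake-up receiver and 1.22nJ per bit for the F-OOK transceiver.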
Session 16 – Switching Regulators
Tuesday, May 2, 2:00 – 5:30, Lady Bird 2 Room
Session Chair: Jeff Morroni, Texas Instruments
Session Co-Chair: Mike Mulligan, Silicon Laboratories
16-1 A Digital Pulse Width Modulation Closed Loop Control LDMOS Gate Driver for LED Drivers Implemented in a 0.18µm HV CMOS Technology
S. Strache, L. Rolff*, S. Dietrich*, M. Hanhart*, T. Zekorn*, R. Wunderlich*, S. Heinen*, Robert Bosch GmbH, *RWTH Aachen University
For multicolor LED driver applications, several integrated HV transistors have to be driven. The presented digital PWM gate driver is based on GCM and employs a digital-centric closed-loop gate-source voltage control. It drives the transistors directly from the HV supply without requiring external components.
16-2 A 10MHz 2mA-800mA 0.5V-1.5V 90% Peak Efficiency Time-Based Buck Converter with Seamless Transition between PWM/PFM Modes
S. J. Kim, W. Choi, R. Pilawa, P. K. Hanumolu, University of Illinois at Urbana-Champaign
We present a 10MHz buck converter with enhanced light-load efficiency achieved by combining time-based PWM control with PFM. The proposed seamless transition techniques allow the control mode to be switched freely between PFM and PWM, which greatly enhances system power management. Fabricated in 65nm CMOS, the prototype achieves 90% peak efficiency and >80% efficiency over a load current range of 2mA to 800mA. VO changes by less than 40mV during PWM to PFM transitions.
16-3 An isolated DC-DC converter with fully integrated magnetic core transformer
Zhao Tianting, Zhuo Yue, Chen Baoxing, Analog Devices
This work presents an isolated DC-DC converter with a fully integrated magnetic core transformer. The converter achieves best-in-class efficiency (46%) and EMI performance (passes the CISPR22 Class B limit with 10dB margin).
16-4 A 92.1% Efficient DC-DC Converter for Ultra-Low Power Microcontrollers with Fast Wake-up
F. Santoro, R. Kuhn*, N. Gibson*, N. Rasera*, T. Tost*, D. Schmitt-Landsiedel, R. Brederlow*, TUM, Munich, Germany, *Texas Instruments, Freising, Germany
We present a DC-DC converter (1.8-3.3V input / 1.2V output) for integration in an ultra-low power system on chip. The converter is designed to minimize the wake-up energy of the system by reducing the output cap to only 56nF – still guaranteeing an output ripple smaller than 30mV at 2.56mA load.
16-5 Buck Converter with Higher Than 87% Efficiency over 500nA to 20mA Load Current Range for IoT Sensor Nodes by Clocked Hysteresis Control
C.-S. Wu, M. Takamiya, T. Sakurai, The University of Tokyo
A buck converter with the newly proposed Clocked Hysteresis Control has been developed that achieves a conversion efficiency of 90.4% at 1μA load current and almost flat efficiency over the whole load current range. The continuously-on comparators of conventional hysteresis control are removed to improve the conversion efficiency under light load current conditions while maintaining a fast transient response.
16-6 A 220-mV Input, 8.6 Step-Up Voltage Conversion Ratio, 10.45-µW Output Power, Fully Integrated Switched-Capacitor Converter for Energy Harvesting
Luca Intaschi, Francesco Dalena*, Paolo Bruschi, Giuseppe Iannaccone, University of Pisa, *Dialog Semiconductor
We present a 10.45µW fully integrated step-up switched-capacitor DC-DC converter for energy harvesting, with 8.6 voltage-conversion ratio and 37.4% power-conversion efficiency from a 220mV voltage source. The circuit, implemented in 55nm CMOS, can supply power to a Bluetooth beacon with a thermoelectric generator exploiting a 3.5°C temperature difference.
16-7 A 1.2A Auto-Configurable Dual-Output Switched-Capacitor DC-DC Regulator with Continuous Gate-Drive Modulation Achieving ≤0.01mV/mA Cross Regulation
Z. Hua and H. Lee, University of Texas at Dallas
An auto-configurable 2-output SC DC-DC regulator in 0.13µm CMOS is reported. The continuous gate-drive modulation makes the converter the first capable of handling 100s of mA of load per output with minimized output cross regulation (OCR) and a small required load capacitance. The proposed regulator supports 600mA/output load with only a 2.2µF capacitor, offers 87.6% peak power efficiency, and achieves >4x and 3.4x reductions in the OCR and total passive volume compared to prior SIMO converters.
16-8 Fully Tunable Software Defined DC-DC Converter with 3000X Output Current & 4X Output Voltage Ranges
Saurabh Chaubey, Ramesh Harjani
This paper presents a fully integrated, software defined capacitive DC-DC converter. The converter implements K-F-C tuning (K = conversion ratio, F = frequency and C = capacitance) in real time so as to accommodate any output load. It has a 4X tunable output voltage and supports a 3269X output load current range while achieving a peak efficiency of 82.1%. This design introduces an accumulation floating junction MOS capacitor that is used for the 18.3fF/µm^2 bucket-capacitors with less than 2µA/mm^2 leakage. This leakage is 40X lower than standard MOS capacitors. The converter transforms a 1.0V input to a 0.25-0.95V output for a 0.13mA-425mA load range while maintaining better than 70% efficiency. The power density at better than 70% efficiency is 1.05W/mm^2 (@ Vout=430mV). Load regulation is implemented using capacitive and frequency tuning in digital and analog domains respectively. The design was fabricated in TSMC GP 65nm.
Session 17 – Non-Traditional Computing Hardware
Tuesday, May 2, 2:00 – 5:30, Lady Bird 3 Room
Session Chair: Axel Thomsen, Cirrus Logic
Session Co-Chair: Dinesh Somasekhar, Intel
17-1 Hardware for Machine Learning: Challenges and Opportunities
V. Sze, Y.-H. Chen, J. Emer, A. Suleiman, Z. Zhang, Massachusetts Institute of Technology
17-2 A Scalable Time-based Integrate-and-Fire Neuromorphic Core with Brain-Inspired Leak and Local Lateral Inhibition Capabilities
Muqing Liu, Luke R. Everson, and Chris H. Kim, University of Minnesota
A fully scalable light-weight integrate-and-fire neuromorphic core with brain-inspired leak and local lateral inhibition features is implemented in 65nm. The core computes the neural net algorithm entirely in the time domain using standard digital circuits. A parallel two-layer architecture realized using the proposed core achieves a 91% digit recognition accuracy.
17-3 Temperature-Insensitive Analog Vector-by-Matrix Multiplier Based on 55 nm NOR Flash Memory Cells
X. Guo, F. Merrikh Bayat, M. Prezioso, Y. Chen*, B. Nguyen*, N. Do*, D. B. Strukov, UC Santa Barbara, *Silicon Storage Technology Inc.
We have fabricated , 85 °C.
17-4 Analog In-Memory Subthreshold Deep Neural Network Accelerator
L. Fick, D. Blaauw, D. Sylvester, University of Michigan, S. Skrzyniarz, M. Parikh, D. Fick, Isocline Engineering
Low duty-cycle mobile systems could benefit from ultra-low power DNN accelerators. Analog in-memory computational units store synaptic weights in on-chip non-volatile arrays to perform subthreshold current calculations. In-memory computation entirely eliminates off-chip weight accesses and amortizes read power through current re-use. The proposed system consumes 900nW in a 130nm process.
17-5 A 4-mm^2 180-nm-CMOS 15-Giga-Cell-Updates-per-Second DNA Sequence Alignment Engine Based on Asynchronous Race Conditions
A. Madhavan, T. Sherwood, D. B. Strukov, University of California, Santa Barbara
This paper presents a 2X2mm chip implementing a Race Logic based system, which uses race conditions to accelerate DNA sequence alignment. In Race Logic, information is encoded in propagation delay and the computation is performed by observing the outcome of races in a configurable circuit. The reported performance and power results compare favourably against the state of the art.
17-6 Using Quantum Emulation for Advanced Computation
Brian R. La Cour, Granville E. Ott, S. Andrew Lanham, Applied Research Laboratories, The University of Texas at Austin
A novel concept for advanced computation is considered using an analog electronic emulation of a gate-based quantum computer. We discuss a general class of problems for which such a device is well suited, examine the expected computational speedup versus bandwidth, and describe the measured performance of a small-scale hardware prototype.
Session 18 – Panel – Your Favorite Analog/Mixed-signal/RF Circuits
Tuesday, May 2, 2:00 – 5:30, Lady Bird Studio Room
Session 19 – High-Performance and Low-Power Frequency Generation
Tuesday, May 2, 2:00 – 5:30, Lady Bird 1 Room
Session Chair: Yanjie Wang, Intel
Session Co-Chair: Hua Wang, Georgia Tech
19-1 Multi-Phase Sub-Sampling Fractional-N PLL with Soft Loop Switching for Fast Robust Locking
Dongyi Liao, Fa Foster Dai, Bram Nauta*, and Eric Klumperink*, Dept. of Electrical and Computer Eng., Auburn University, USA *University of Twente, Enschede, Netherlands
This paper presents a low phase noise sub-sampling PLL (SSPLL) with multi-phase outputs. Automatic soft switching between the sub-sampling phase loop and frequency loop is proposed to improve robustness. A quadrature LC oscillator with capacitive phase interpolation network is employed to achieve fractional-N frequency synthesis.
19-2 A 0.8-1.3 GHz Multi-phase Injection-locked PLL Using Capacitive Coupled Multi-ring Oscillator with Reference Spur Suppression
Ruixin Wang, Fa Foster Dai, Auburn University
This paper presents an inductor-less injection-locked PLL (IL-PLL) using a capacitive coupled multi-ring oscillator (MRO). With a 50 MHz reference, the MRO IL-PLL generates 24 multi-phase outputs covering 0.8-1.3 GHz with a reference spur of -63 dBc, in-band phase noise of -121 dBc/Hz @ 1MHz offset, and 513 fs jitter.
19-3 A 330μW 1.25ps 400fs-INL Vernier Time-to-Digital Converter with 2D Reconfigurable Spiral Arbiter Array and 2nd-Order ΔΣ Linearization
H. Wang*, F. F. Dai*, H. Wang**, *Auburn University, **Georgia Institute of Technology College of Engineering
This paper presents an 8-bit 1.25ps Vernier TDC with a 2D reconfigurable spiral arbiter array. The 2D spiral arbiter array improves both linearity and detection range. The quantization errors are minimized using a reconfigurable arbiter array with 2nd-order SDM. The prototype consumes 0.3mW under a 1V supply, achieving 0.4ps INL.
19-4 A 350uW 2GHz FBAR transformer coupled Colpitts oscillator with close-in phase noise reduction
Jabeom Koo, Keping Wang, Richard Ruby*, Brian Otis, University of Washington, *Avago Technologies Inc.
The proposed oscillator reduces both the close-in phase noise and the power consumption compared to a conventional Colpitts oscillator by using a transformer. Measurement results show a 12dB reduction at 100Hz offset frequency with 350µW power consumption, which is almost half the power of a conventional oscillator using the same 2GHz FBAR device.