

# 2022 PRESS KIT



**ISSCC Press Kit Disclaimer** 

The material presented here is preliminary.

As of November 11, 2021, there is not enough information to guarantee its correctness.

Thus, it must be used with some caution.

# **ISSCC 2022** VISION STATEMENT

The International Solid-State Circuits Conference is the foremost global forum for presentation of advances in solid-state circuits and systems-on-a-chip. The Conference offers a unique opportunity for engineers working at the cutting edge of IC design and use to maintain technical currency, and to network with leading experts.

### **Table of Contents**

- Preamble
  - FAQ on ISSCC
  - Plenary Session (Session 1)
  - Plenary Session – Invited Papers
- Evening Events (EE)
  - SE1: Student Research Preview
  - SE2: The Next-Generation Circuit Designer Workshop
  - EE3: Semiconductor Supply Chain
  - EE4: Bright and Dark Side of AI
  - EE5: Shifting Tides of Innovation – Where is Cutting Edge Research Happening Today?
  - EE6: Next Trillion Dollar Market
- Session Overviews and Highlights
  - Conditions of Publication
  - PREAMBLE
  - FOOTNOTE
  - Session 2 Overview: Processors
  - Session 2: Processors
  - Session 3 Overview: Analog Techniques & Sensor Interfaces
  - Session 3 Highlights: Analog Techniques & Sensor Interfaces
  - Session 4 Overview: mm-Wave and subTHz ICs for Communication and Sensing
  - Session 4 Highlights: mm-Wave and subTHz ICs for Communication and Sensing
  - Session 5 Overview: Imagers, Range Sensors and Displays
  - Session 5 Highlights: Imagers, Range Sensors and Displays
  - Session 6 Overview: Ultra-High-Speed Wireline
  - Session 6 Highlights: Ultra-High-Speed Wireline
  - Session 7 Overview: NAND Flash Memory
  - Session 7 Highlights: NAND Flash Memory
  - Session 8 Overview: Advanced RF Building Blocks
  - Session 8 Highlights: Advanced RF Building Blocks
  - Session 9 Overview: High-Quality GHz-to-THz Frequency Generation and Radiation
  - Session 9 Highlights: High-Quality GHz-to-THz Frequency Generation and Radiation
  - Session 10 Overview: Nyquist and Incremental ADCs
  - Session 10 Highlights: Nyquist and Incremental ADCs
  - Session 11 Overview: Compute-in-Memory & SRAM
  - Session 11 Highlights: Compute-in-Memory & SRAM
  - Session 12 Overview: Monolithic System for Robot and Bio Applications
  - Session 12 Highlights: Monolithic System for Robot and Bio Applications
  - Session 13 Overview: Digital Techniques for Clocking, Variation Tolerance and Power Management
  - Session 13 Highlights: Digital Techniques for Clocking, Variation Tolerance and Power Management
  - Session 14 Overview: GaN, High-Voltage and Wireless Power
  - Session 14 Highlights: GaN, High-Voltage and Wireless Power
  - Session 15 Overview: ML Processors
  - Session 15 Highlights: ML Processors
  - Session 16 Overview: Emerging Domain-Specific Digital Circuits and Systems
  - Session 16 Highlights: Emerging Domain-Specific Digital Circuits and Systems
  - Session 17 Overview: Advanced Wireline Links and Techniques
  - Session 17 Highlights: Advanced Wireline Links and Techniques
  - Session 18 Overview: DC-DC Converters
  - Session 18 Highlights: DC-DC Converters
  - Session 19 Overview: Power Amplifiers and Building Blocks
  - Session 19 Highlights: Power Amplifiers and Building Blocks
  - Session 20 Overview: Body and Brain Interfaces
  - Session 20 Highlights: Body and Brain Interfaces
  - Session 21 Overview: Highlighted Chip Releases: Machine Learning and Digital Processing
  - Session 21 Highlights: Highlighted Chip Releases: Machine Learning and Digital Processing
  - Session 22 Overview: Cryo-Circuits and Ultra-Low-Power Intelligent IoT
  - Session 22 Highlights: Cryo-Circuits and Ultra-Low-Power Intelligent IoT
  - Session 23 Overview: Frequency Synthesizers
  - Session 23 Highlights: Frequency Synthesizers
  - Session 24 Overview: Low-Power and UWB Radios for Communication and Ranging
  - Session 24 Highlights: Low-Power and UWB Radios for Communication and Ranging
  - Session 25 Overview: Noise-Shaping ADCs
  - Session 25 Highlights: Noise-Shaping ADCs
  - Session 26 Overview: Highlighted Chip Releases: Systems and Quantum Computing
  - Session 26 Highlights: Quantum Computing Invited Papers
  - Session 26 Highlights: Augmented Reality Invited Paper
  - Session 27 Overview: mm-Wave & Sub-6GHz Transmitters & Receivers for 5G Radios
  - Session 27 Highlights: mm-Wave & Sub-6GHz Transmitters & Receivers for 5G Radios
  - Session 28 Overview: DRAM and Interface
  - Session 28 Highlights: DRAM and Interface
  - Session 29 Overview: ML Chips for Emerging Applications
  - Session 29 Highlights: ML Chips for Emerging Applications
  - Session 30 Overview: Power-Management Techniques
  - Session 30 Highlights: Power-Management Techniques
  - Session 31 Overview: Audio Amplifiers
  - Session 31 Highlights: High-Performance Audio Amplifiers
  - Session 32 Overview: Ultrasound and Beamforming Applications
  - Session 32 Highlights: Ultrasound and Beamforming Applications
  - Session 33 Overview: Domain-Specific Processors
  - Session 33 Highlights: Domain-Specific Processors
  - Session 34 Overview: Hardware Security
  - Session 34 Highlights: Hardware Security
- Trends
  - Conditions of Publication
  - PREAMBLE
  - FOOTNOTE
  - Analog – 2022 Trends
  - Data Converters – 2022 Trends
  - RF – 2022 Trends
  - Wireless – 2022 Trends
  - Wireline – 2022 Trends
  - Digital Circuits – 2022 Trends
  - Machine Learning (ML) & AI – 2022 Trends
  - Memory – 2022 Trends
  - Technology Directions – 2022 Trends
- INDEX
  - Technical Topics Mapped to Papers
  - Selected Presenting Companies/Institutions Mapped to Papers
- Contact Information

#### FAQ on ISSCC

#### What is ISSCC?

ISSCC (International Solid-State Circuits Conference) is the **flagship** conference of the IEEE Solid-State Circuits Society. According to the SIA, the semiconductor industry generated US\$464 billion in sales in 2020, and ISSCC continues to be the premier technical forum for presenting advances in solid-state circuits and systems. The SIA expects worldwide semiconductor sales to reach US\$522 billion in 2021, a year-on-year growth rate of 12.5 percent. Semiconductors are crucial components of electronic devices, and the industry is highly competitive.

#### Who Attends ISSCC?

Attendance at ISSCC 2022 is expected to be around **3000**. Corporate attendees from the semiconductor and system industries typically represent around **60%** of the total.

#### Where is ISSCC?

The 69th ISSCC will be held in-person and virtually from February 20th through February 24th, 2022.

#### Are there Keynote Speakers?

After a day devoted to educational events, ISSCC 2022 begins formally on Monday, February 21, 2022, with four exciting plenary talks:

- Aart de Geus, Chairman & Co-CEO, Synopsys, Mountain View, CA
- Renée James, Founder, Chairman, & CEO, Ampere Computing, Santa Clara, CA
- Marco Cassis, President, Sales, Marketing, Communications & Strategy Development, STMicroelectronics, Geneva, Switzerland / Tokyo, Japan
- Inyup Kang, President, Samsung Electronics, Hwaseong, Korea

#### What is the Technical Coverage at ISSCC?

ISSCC covers a full spectrum of design approaches in advanced technical areas broadly categorized as: (1) Communication Systems, (2) Analog Systems, (3) Digital Systems, and (4) Innovations including micro-machines and MEMS, imagers, sensors, biomedical devices, as well as forward-looking developments that may take three or more years for commercialization.

© COPYRIGHT 2022 ISSCC—DO NOT REPRODUCE WITHOUT PERMISSION

#### How are ISSCC Papers Selected?

Currently, around 650 submissions are received each year across the broad spectrum of specified topics. Submissions are reviewed by a team of over 150 scientific and industry experts from the Far East, Europe, and North America. These experts are organized into 12 Subcommittees that cover the 4 broad areas described earlier:

- Communication Systems includes Wireless, RF, and Wireline Subcommittees
- Analog Systems includes Analog, Power Management, and Data Converter Subcommittees
- Digital Systems includes Memory, Digital Circuits, Digital Architectures and Systems, and Machine Learning and AI Subcommittees
- Innovative Topics includes Imagers/MEMS/Medical Devices/Displays and Technology Directions Subcommittees

#### What Companies are Presenting this year?

Companies presenting papers at ISSCC 2022 include Analog Devices, Chan Zuckerberg Biohub, Meta Reality Labs, Fujitsu, IBM, Intel, Marvell, MaxLinear, MediaTek, Nano Core Chip Electronic Technology, NXP Semiconductors, Samsung, SK Hynix, Sony, Tenstorrent, and TSMC, to name a few. A more complete list can be found in the Index.

#### Are there educational sessions?

ISSCC features a variety of educational events which include:

- Twelve Tutorials (targeted toward participants looking to broaden their horizon)
- Six Forums (targeted toward experts in an information sharing context)
- One Short Course (targeted toward in-depth appreciation of a current hot topic)

#### Are There Other Events?

A more complete list of all activities at ISSCC 2022:

- Four Plenary Presentations
- Eight Invited Industry Talks on Highlighted Chip Releases
- Technical Sessions (34 distinct sessions)
- Six Special Events and Panels, including:
  - Semiconductor Supply Chain
  - Bright and Dark Side of AI
  - Shifting Tides of Innovation
  - Next Trillion Dollar Market
  - Next-Generation Circuit Designer Workshop
  - Student Research Preview (for the introduction of graduate-student research-in-progress)
- Educational Sessions Featuring:
  - Twelve Tutorials
  - Six Forums
  - One Short Course
- Demonstration Sessions from Academia and Industry
- Networking Events
- Author Interview Sessions
- A Number of University Alumni Events

#### How Do I Use this Press Kit?

The Press Kit provides a PREAMBLE section that features this FAQ and other general information. The kit also includes SESSION OVERVIEWS AND HIGHLIGHTS of all 34 technical sessions into which the 208 papers are grouped, together with brief descriptions and context for each. There is also an abstract for each of the Plenary talks. For your convenience, the Kit includes two structural charts in the INDEX section: (a) a list of the 4 Technical Topics and their associated Subcommittees and Sessions; (b) a list of contributing companies and institutions with their associated papers. Thus, to locate information of interest, you can use Chart 4.1 to identify sessions of interest, and then turn to the corresponding Session Overview or Highlights section. Alternatively, if your interest is in a particular organization, then Chart 4.1 will direct you immediately to papers of interest, each of which is detailed in its corresponding Session Overview and possibly in the Highlights section. Whatever your interest, it is also useful to use Chart 4.1 to find the appropriate Trends information, which provides a broad historical view of the context of your interest and often refers to current ISSCC 2022 papers.

#### Anything New This Year?

ISSCC will hold an invited Industry Track (Sessions 21 and 26) that highlights recent hot-product releases. Speakers from AMD, Meta Reality Labs, Fujitsu, Google Quantum AI, IBM, Intel, SambaNova Systems, Tenstorrent, and Texas Instruments will discuss innovative ways they solved product-level challenges.

#### Overview: ISSCC 2022 – "Intelligent Silicon for a Sustainable World"

After a year of worldwide pandemic and strong environmental changes, the circuit design community is evolving to support a more sustainable world. Without change, integrated circuits will become the dominant energy consumer and source of carbon emissions in the future. Low-power circuit design is becoming more sophisticated across digital computation, machine learning, and analog design. Wireless and wireline advances will help optimize the energy of local and global communications. Sensor designs can be deployed to measure environmental data and optimize energy consumption. Power management techniques are required in a majority of circuits to extend system lifetime. Finally, new technologies are creating opportunities to develop innovative circuits, reduce fabrication impact, and improve recyclability.

#### **Plenary Session (Session 1)**

The Plenary Session on the morning of February 21, 2022, will feature four renowned speakers:

- Aart de Geus, Chairman & Co-CEO, Synopsys, Mountain View, CA
- Renée James, Founder, Chairman, & CEO, Ampere Computing, Santa Clara, CA
- Marco Cassis, President, Sales, Marketing, Communications & Strategy Development, STMicroelectronics, Geneva, Switzerland / Tokyo, Japan
- Inyup Kang, President, Samsung Electronics, Hwaseong, Korea

Highlights of these Plenary talks are provided in the following section.

# **ISSCC 2022** Plenary session – invited papers



Chair: Kevin Zhang, Taiwan Semiconductor Manufacturing Company, Hsinchu, Taiwan (ISSCC Conference Chair)

Associate Chair: Edith Beigné, Meta, Menlo Park, CA (ISSCC International Technical-Program Chair)

#### 1.1 Catalysts of the Impossible: Silicon, Software, and Smarts for the Era of SysMoore

Aart de Geus, Chairman & Co-CEO, Synopsys, Mountain View, CA

As we confront global-scale challenges with immense intertwined datasets, the distillation of usable insights will require an exponential increase in AI processing capability. The impossibility horizon – sustained by the slowing of Moore's Law – will be pierced by rapid advancements in materials, devices, software, and architecture (the '*SysMoore*' era). Autonomous design *instruments* – super-tools fusing together hundreds of algorithms precision-guided by AI – are unlocking opportunity for circuit designers, ushering in a new wave of architectural vitality.

In the follow-up to 'Builders of the Imaginary', we will unveil the next chapter in autonomous design, piecing together a new breed of super-monolithic devices, dense interconnects, and chiplets into software-defined, heterogeneous architectures.

#### 1.2 The Future of the High-Performance Semiconductor Industry and Design

Renée James, Founder, Chairman, & CEO, Ampere Computing, Santa Clara, CA

While the explosive growth of today's modern cloud was fueled by high performance, present-day efficient modern cloud services have moved to a new phase of compute that requires scalability and elasticity, while still achieving the highest performance levels to run a myriad of cloud services. The new breed of software underlying today's cloud services is initiating a third phase of compute unencumbered by architectural complexity designed for client- and server-enterprise applications. Initially, cloud computing was able to leverage traditional processor architectures to deliver value to the end customers. However, massive adoption of cloud-based services has amplified the limitations of the incumbent architectures that were designed for a very different software model in client-server enterprises. The requirement of high performance for cloud computing has fundamentally changed from one of peak performance at the CPU level to overall performance at the system level. This system-level performance refers to maximizing system-level throughput while staying within or further reducing power and cost envelopes, with much higher emphasis on predictable and consistent performance. This cloud-driven computing requires a fundamental shift in the processor, as well as in SoC architectures and designs, and demands continued innovation to stay ahead of cloud computing growth for the next decade. These innovations need to address the entire vertical stack, from software, architecture, and design to the packaging and manufacturing domains. The paper will discuss a new approach in architectural thinking and design based on cloud computing as the driving force for demand.

#### 1.3 Intelligent Sensing: Enabling the Next "Automation Age"

Marco Cassis, President, Sales, Marketing, Communications & Strategy Development, STMicroelectronics, Geneva, Switzerland / Tokyo, Japan

Sensors have undergone extraordinary proliferation since the beginning of the 21st Century. Thanks to IoT, connected smart sensors can now be found all around us. This makes it possible to collect a wealth of data autonomously and continuously without human intervention, automating routine activities while unlocking previously unattainable insights and functionality. As we enter the Automation Age, the information generated from these sensors can be processed and acted on locally to take action in the physical world.

Sensing, artificial intelligence, and actuation will enable autonomous end-to-end system solutions in existing and new application fields including automotive, digital health, agriculture, environmental control, and decarbonization. The semiconductor industry is driving this transformation, and sensors, smart embedded actuators, analog interfaces, connectivity, security, and embedded AI offer a perfect toolset for companies to continue to innovate. To fuel this innovation, we need to develop energy-efficient, high-accuracy, autonomous, ultra-compact, and trusted ICs. These chips need to feature state-of-the-art system and embedded security techniques to protect the gathered data, its processing, and the resulting actuation. New and super-efficient computational hardware technologies supporting AI and machine learning are already transforming at-the-edge data processing and are pushing the envelope on intelligent functionality and IoT network scalability.

Future advances will rely on these evolving IC technologies as well as associated packaging solutions. These will include super-integration, wafer-to-wafer bonding, and system-in-package, to enable the heterogeneous integration of multiple technologies.

#### 1.4 The Art of Scaling: Distributed and Connected to Sustain the Golden Age of Computation

Inyup Kang, President, Samsung Electronics, Hwaseong, Korea

The history of the computer has been nothing but miraculous! Thanks to the rapid innovations in semiconductor manufacturing, we have gone from gigantic machines that filled an entire room to low-cost tiny microchips that billions of people can afford and keep in their pockets (or should I say, hands?) all day long. Still, even with this level of progress, mobile devices are barely capable of replicating the "brain" of a jellyfish, and the trend shows that we are already hitting our limits in semiconductor scaling. In this paper, we define the Cost-Performance Ratio (CPR) metric that captures the trend in a single equation. We propose that we shall find solutions at each of the intra-chip, inter-chip, and inter-device levels, and highlight domain-specific computing, 3D packaging, and advanced communication as the main drivers to the next level of computing, satisfying our insatiable need.

# **ISSCC 2022** EVENING EVENTS



# **Evening Events (EE)**

ISSCC 2022 will continue the popular tradition of evening sessions where experts, often of opposing views, discuss topics which range from the lighthearted to the controversial (but always informative and entertaining!). This year's panels are *"Semiconductor Supply Chain"*, *"Bright and Dark Side of AI"*, *"Shifting Tides of Innovation"*, and *"Next Trillion Dollar Market"*. In addition, ISSCC 2022 will feature special events including *"Student Research Preview"* and *"Next-Generation Circuit Designer Workshop"*.

#### **SE1: Student Research Preview**

#### Sunday, February 20

The Student Research Preview (SRP) will highlight selected student research projects in progress. The SRP consists of 90-second presentations, followed by a Poster Session, by graduate students from around the world who have been selected on the basis of a short submission concerning their ongoing research.

The Student Research Preview will include a talk by a distinguished member of the solid-state circuits community.

#### SE2: The Next-Generation Circuit Designer Workshop

This is an educational workshop for EE undergraduates and early graduate students (Master's students or those in the first 1-2 years of a PhD) who are interested in a career in integrated circuit (IC) design. The workshop will include a "fireside chat" with an invited renowned speaker on diversity, bias in the tech industry, and how to overcome it over the course of a career. The selected participants will also hear from current rising stars in the industry, who will make the case for IC design and share their unique paths through circuit design. The participants will also have a chance to present a poster of their work to the ISSCC audience. There will be additional networking events organized throughout the conference. See the ISSCC website for the application requirements and submission link.

#### **EE3: Semiconductor Supply Chain**

#### Monday, February 21

The semiconductor supply chain is a highly specialized, complex global system stretching from design houses to manufacturing fabs, test and assembly units, and integration factories, which vary by the nature of the company, market, and product. The COVID-19 pandemic and its component shortages, geopolitical trade conflicts, and the threat of counterfeiting have highlighted the challenges and the need for a better-orchestrated and more diversified management chain. The speakers in this panel, drawn from leading manufacturers and governmental bodies, will address questions such as how to ensure continued supply of critical materials and products, and whether reaching out to diverse manufacturers is a necessity rather than a luxury. This event will not go into political controversies but will instead hold an open discussion about existing bottlenecks and identify changes that can be made in the semiconductor industry to both grow overall output and be more resilient to macroeconomic conditions.

#### EE4: Bright and Dark Side of AI

#### Monday, February 21

Hype aside, AI is clearly an emerging two-headed monster. This evening panel of AI experts will provide insights on both sides of the AI coin: the bright and the dark. The goal is to help stimulate a multi-stakeholder AI dialog, gather views from panelists, and reflect on transformative ideas with an eye on safety and risk management. What is the golden vision for futuristic AI platforms? What are compelling AI use cases and where are the risks? For instance, AI holds a lot of promise for cybersecurity and human-centric robots. However, these sectors also have some of the highest potential for fallout. AI's dark-side capability is that it can be made to mimic human behavior. What are the consequences? The panel will also share forward-looking views on policy development and on ethical, legal, and societal issues related to AI, including socio-economic challenges.

#### EE5: Shifting Tides of Innovation – Where is Cutting Edge Research Happening Today?

#### Tuesday, February 22

As research becomes more complex, multidisciplinary, and system-oriented, the focal point of innovation has begun to shift. Resource constraints such as people per project or the cost of working in the latest technology node also impact who can participate in cutting-edge research. Industry does not have strong incentives to publish its most innovative and competitive work, leaving many in the dark as to what the state of the art is within companies. Even within industry, innovation can come from research divisions, production, or startups. On the academic side, funding bodies and trends can also impact the innovation process. How can we close this gap, and who really has the edge? Is industry-guided academic research the way to get the best of both worlds? The traditional debate has been between academia and industry, but there are many more facets to the discussion. This panel explores all the possible sides of how innovation occurs across the industry.

#### **EE6: Next Trillion Dollar Market**

#### Tuesday, February 22

This panel assembles a collection of experts in various domains to give their view on what will be the next trillion-dollar market driver for chips. Will it be in IoT/IoE? Wireless/6G? Automotive? AI/machine learning? Quantum computing? Optics? Space exploration? The experts in the panel will discuss the challenges and opportunities for growth in multiple potentially large sectors for the semiconductor industry.

# ISSCC 2022 SESSION OVERVIEWS AND HIGHLIGHTS



#### PREAMBLE

The Session Overviews and Highlights to follow serve to capture the context, highlights, and potential impact, of the papers to be presented in each Session at ISSCC 2022 in February.

OBTAINING COPYRIGHT to ISSCC press material is EASY!

You may quote the Subcommittee Chair as the author of the text if authorship is required.

You are welcome to use this material, copyright- and royalty-free, with the following understanding:

- That you will maintain at least one reference to ISSCC 2022 in the body of your text, ideally retaining the date and location. For detail, see the FOOTNOTE below.
- That you will provide a courtesy PDF of your excerpted press piece and particulars of its placement to shahriar@ece.ubc.ca

#### FOOTNOTE

- From ISSCC's point of view, the phraseology included in the box below captures what we at ISSCC would like your readership to know about this, the 69th appearance of ISSCC, on February 20<sup>th</sup> to February 24<sup>th</sup>, 2022.

This and other related topics will be discussed at length at ISSCC 2022, the foremost global forum for new developments in the integrated-circuit industry. ISSCC, the International Solid-State Circuits Conference, will be held virtually on February 20 - February 24, 2022.


#### Session 2 Overview: Processors

#### **Digital Architectures and Systems Subcommittee**

Session Chair: Hugh Mair, MediaTek, Austin, TX

Session Co-Chair: Shidhartha Das, Arm Ltd, Cambridge, UK

Mainstream high-performance processors take center stage in this year's conference, with next-generation architectures being detailed for x86 and Power<sup>™</sup> processors by Intel, AMD, and IBM. In addition to mainstream compute, the session features groundbreaking work in parallel/array compute, leading off with the massive performance and integration of Intel's Ponte Vecchio, while a multi-die approach to reconfigurable compute from researchers at UCLA features an ultra-high-density die-to-die interface. Mobile processing also marks a milestone this year with the introduction of the Armv9 ISA into flagship smartphones.

- In Paper 2.1, Intel details the Ponte Vecchio platform for next-generation data center processing, integrating 47 tiles from 5 different process nodes into a single package, including 16 5nm compute tiles. 45TFLOPS of sustained FP32 vector processing is demonstrated alongside 5TB/s of memory fabric bandwidth and >2TB/s of aggregate memory and scale-out bandwidth.
- In Paper 2.2, Intel's next-generation Xeon Scaleable processor utilizing a quasi-monolithic approach to integration in 7nm is
  presented. The 2×2 die array features an ultra-high bandwidth multi-die fabric IO featuring 10TB/s total die-to-die bandwidth
  across the 20 interfaces, while maintaining a low 0.5pJ/b energy consumption.
- In Paper 2.3, IBM's Z series processor advances into 7nm technology, featuring many architectural improvements to best leverage this class of CMOS technology. The processor leverages a large 32MB L2 cache, creating large virtual L3 and L4 caches and a fully synchronous interface that connects two co-packaged 530mm<sup>2</sup> die operating at half the CPU clock.
- In Paper 2.4, IBM describes a 7nm 16-core Power10<sup>™</sup> processor, featuring a series of architectural, design and implementation improvements to ensure continued performance gains. The 602mm<sup>2</sup> die features an impressive 2TB/s bandwidth aggregated across chip-to-chip, DRAM and PCIE interfaces.
- In Paper 2.5, MediaTek unveils their first Armv9 CPUs for flagship mobile applications, featuring a 3.4GHz maximum clock rate and a tri-gear CPU subsystem with out-of-order CPUs for both mid and high-performance gears. Manufactured in 5nm, resource scaling and implementation methodologies of the high-performance gear achieve a 27% performance uplift vs. the mid-gear.
- In Paper 2.6, UCLA demonstrates a 2×2 multi-die reconfigurable processor with a multi-layer switch-box interconnect and ultra-high-density multi-die interfacing. Fabricated in 16nm and utilizing a silicon interposer, the die-to-die IO features a 10µm bump pitch with each IO circuit occupying 137µm<sup>2</sup> and consuming 0.38pJ/b of energy.
- In Paper 2.7, AMD discusses the micro-architectural features of "Zen 3", providing a unique 7nm-to-7nm (same-node) power and performance comparison to the prior generation. The 68mm<sup>2</sup> die achieves a 19% core IPC improvement through architectural enhancements, coupled with a 6% frequency improvement, yielding an increase in power efficiency by up to 20%.
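
As a rough cross-check of the interface figures above, an energy-per-bit number converts to power once a data rate is fixed. The sketch below applies this to the 10TB/s, 0.5pJ/b fabric of Paper 2.2. The function name is illustrative, and the calculation assumes 1TB = 10^12 bytes and fully utilized links, neither of which is stated in the press kit.

```python
def link_power_watts(bandwidth_bytes_per_s: float, energy_pj_per_bit: float) -> float:
    """Power drawn by a link running continuously at the given bandwidth."""
    bits_per_s = bandwidth_bytes_per_s * 8
    return bits_per_s * energy_pj_per_bit * 1e-12  # pJ -> J

# Paper 2.2 figures: 10TB/s total die-to-die bandwidth at 0.5pJ/b.
# Assumption: 1TB = 1e12 bytes and every interface is fully utilized.
xeon_fabric_w = link_power_watts(10e12, 0.5)
print(f"Die-to-die fabric power at full rate: {xeon_fabric_w:.1f} W")
```

Under these assumptions the fabric would draw on the order of 40W at peak, which illustrates why sub-pJ/b interfaces matter at multi-TB/s bandwidths.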

### **Session 2: Processors**

#### [2.1] Ponte Vecchio: A Multi-Tile 3D Stacked Processor for Exascale Computing

#### [2.6] A 16nm 785GMACs/J 784-Core Digital Signal Processor Array with a Multilayer Switch Box Interconnect, Assembled as a 2×2 Dielet with 10µm-pitch Inter-Dielet I/O, for Runtime Multi-Program Reconfiguration

#### [2.7] Zen3: The AMD 2<sup>nd</sup>-Generation 7nm x86-64 Microprocessor Core

**Paper 2.1 Authors:** W. Gomes<sup>1</sup>, A. Koker<sup>2</sup>, P. Stover<sup>3</sup>, S. Siers<sup>2</sup>, D. Ingerly<sup>1</sup>, S. Venkataraman<sup>4</sup>, C. Pelto<sup>1</sup>, T. Shah<sup>5</sup>, A. Rao<sup>2</sup>, F. O'Mahony<sup>1</sup>, E. Karl<sup>1</sup>, L. Cheney<sup>2</sup>, I. Rajwani<sup>2</sup>, R. Cortez<sup>2</sup>, H. Jain<sup>4</sup>, A. Chandrasekhar<sup>4</sup>, R. Koduri<sup>6</sup>

**Paper 2.1 Affiliation:** <sup>1</sup>Intel, Portland, OR, <sup>2</sup>Intel, Folsom, CA, <sup>3</sup>Intel, Chandler, AZ, <sup>4</sup>Intel, Bengaluru, India, <sup>5</sup>Intel, Austin, TX, <sup>6</sup>Intel, Santa Clara, CA

Paper 2.6 Authors: U. Rathore, S. S. Nagi, D. Markovic

Paper 2.6 Affiliation: University of California, Los Angeles, Los Angeles, CA

Paper 2.7 Authors: T. Burd<sup>1</sup>, W. Li<sup>1</sup>, J. Pistole<sup>1</sup>, S. Venkataraman<sup>1</sup>, M. McCabe<sup>1</sup>, T. Johnson<sup>1</sup>, J. Vinh<sup>1</sup>, T. Yiu<sup>1</sup>, M. Wasio<sup>1</sup>, H-H. Wong<sup>1</sup>, D. Lieu<sup>1</sup>, J. White<sup>2</sup>, B. Munger<sup>2</sup>, J. Lindner<sup>2</sup>, J. Olson<sup>2</sup>, S. Bakke<sup>2</sup>, J. Sniderman<sup>2</sup>, C. Henrion<sup>3</sup>, R. Schreiber<sup>4</sup>, E. Busta<sup>3</sup>, B. Johnson<sup>3</sup>, T. Jackson<sup>3</sup>, A. Miller<sup>3</sup>, R. Miller<sup>3</sup>, M. Pickett<sup>3</sup>, A. Horiuchi<sup>3</sup>, J. Dvorak<sup>3</sup>, S. Balagangadharan<sup>5</sup>, S. Ammikkallingal<sup>5</sup>, P. Kumar<sup>5</sup>

**Paper 2.7 Affiliation:** <sup>1</sup>AMD, Santa Clara, CA, <sup>2</sup>AMD, Boxborough, MA, <sup>3</sup>AMD, Fort Collins, CO, <sup>4</sup>AMD, Austin, TX, <sup>5</sup>AMD, Bangalore, India

#### Subcommittee Chair: Tom Burd, Digital Architectures and Systems Subcommittee

#### CONTEXT AND STATE OF THE ART

- Continued process-technology scaling, together with circuit-design and system-integration innovation, is necessary to deliver the continuing cadence of performance gains and energy efficiency for data-center compute demands.
- 3D integration delivers on the promise of continued transistor integration; however, power delivery, thermal management and heterogeneous integration remain key technical and engineering challenges.
- System-architecture scalability through efficient high-bandwidth connectivity between multiple cores continues to drive SoC compute capabilities.

#### TECHNICAL HIGHLIGHTS

- Intel presents "Ponte Vecchio", their next-generation datacenter processor for exascale computing, demonstrating 3D-stacked heterogeneous integration with 100B transistors in 47 tiles and 5 different process nodes.
  - The system delivers >45 TFLOPS FP32 performance, >5TB/s of sustained memory fabric bandwidth and >2TB/s of aggregate memory and scale-out bandwidth within a 600W system TDP.
- AMD highlights "Zen 3", which powers 2<sup>nd</sup>-generation 7nm microprocessors and features eight 4.9GHz cores and a 32MB planar cache, with an additional 2TB/s 64MB AMD 3D V-Cache attached via direct copper-to-copper bonding.
  - A 19% core IPC improvement is achieved through architectural enhancements, while a 6% frequency improvement at high voltage is achieved through physical design improvement, yielding an increase in power efficiency by up to 20%.

- The University of California, Los Angeles, presents a multi-dielet reconfigurable processor array in 16nm CMOS, assembled on a 2-layer silicon interconnect fabric with a 10µm-pitch inter-dielet interface and achieving a peak energy efficiency of 785GMACs/J.
  - Each 5.29mm<sup>2</sup> dielet consists of 196 DSP cores with a 3-layer rapidly reconfigurable interconnect that operates at 1.1GHz, enabling 70.4Gb/s bandwidth at a density of 297Gb/s/mm.
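
The Zen 3 and UCLA figures above can be sanity-checked with simple arithmetic. The following sketch assumes, purely for illustration, that single-thread performance scales as IPC × frequency and that the IPC and frequency gains apply at the same operating point; the press kit does not state a combined figure. It also checks that a 2×2 arrangement of 196-core dielets accounts for the 784 cores named in the title of Paper 2.6.

```python
def compound_gain(*fractional_gains: float) -> float:
    """Combine independent multiplicative gains, e.g. 0.19 for +19%."""
    total = 1.0
    for gain in fractional_gains:
        total *= 1.0 + gain
    return total - 1.0

# Zen 3: +19% IPC and +6% frequency.
# Illustrative assumption: performance is proportional to IPC x frequency.
zen3_uplift = compound_gain(0.19, 0.06)
print(f"Zen 3 combined uplift: {zen3_uplift:.1%}")

# UCLA array: 2x2 dielets with 196 DSP cores each.
total_cores = 2 * 2 * 196
print(f"Total DSP cores: {total_cores}")
```

Because the gains multiply rather than add, the combined uplift under this assumption is slightly above 25%.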

#### APPLICATIONS AND ECONOMIC IMPACT

- Intel and AMD continue to deliver innovations in system integration and circuit design, enabling the next wave of high-performance systems for meeting the performance and bandwidth demands of future cloud-native applications.
- 3D integration demonstrates a realizable roadmap of increasing SoC compute capability and ever-greater transistor integration through integration of heterogeneous process-technologies.
- Reconfigurable processor arrays taking advantage of ultra-high-density die-to-die connectivity in multi-die configurations deliver significant energy-efficiency and throughput gains, unlocking new applications in data-driven computing.

# Session 3 Overview: Analog Techniques & Sensor Interfaces

#### **Analog Subcommittee**

Session Chair: Viola Schäffer, Texas Instruments, Freising, Germany

Session Co-Chair: Jiawei Xu, Fudan University, Shanghai, China

Subcommittee Chair: Maurits Ortmanns, University of Ulm, Institute of Microelectronics, Ulm, Baden-Württemberg, Germany

Analog circuits and sensor interfaces continue to improve power efficiency without sacrificing speed and noise performance. Three presentations focus on improving the start-up time, phase noise and temperature stability of oscillators, and another improves the achievable sample rate by reducing comparator delay with a capacitive bootstrap technique, all while maintaining power efficiency. State-of-the-art performance is also demonstrated in a shunt-based current-measurement IC with improved temperature calibration, in a MEMS Coriolis sensor readout with the best published resolution, in a magnetoimpedance-sensor-based ultra-low-noise magnetic sensor readout, and in a temperature sensor that maintains its power efficiency up to 180°C.

- In Paper 3.1, Samsung Electronics presents a clock management IC with improved start-up time and start-up energy based on a single crystal oscillator. This is achieved by replacing the RTC crystal with an on-chip RC oscillator, which is calibrated by a duty-cycled machine-learning-based calibration scheme, and a precision injection technique to enhance the start-up performance.
- In Paper 3.2, Samsung Electronics presents a 52MHz temperature-compensated crystal oscillator (TCXO). A digitally controlled variable feedback resistor adjusts the oscillation slope across temperature to compensate for load-capacitance change and thereby improves phase noise by 6dB, achieving -158dBc/Hz phase noise at 100kHz offset.
- In Paper 3.3, University of Twente presents a 1GS/s comparator in 22nm FDSOI technology. The comparator has 174µVrms input noise, a sub-500ps CLK-OUT delay, and consumes 75fJ per comparison leading to an FoM of 0.6nJ·µV<sup>2</sup>·ns, which is an order of magnitude better than the state-of-the-art.
- In Paper 3.4, ETH Zürich and Postech introduce a novel R-RC oscillator scheme that achieves a low TC in a wide temperature range from -40°C to 125°C. The proposed oscillator generates a reference frequency of 2.3MHz with a temperature coefficient of 7.93ppm/°C and an energy efficiency of 3.3pJ/cycle.
- In Paper 3.5, TU Delft presents a versatile current sensor with tunable temperature compensation for different types of external shunt resistors, achieving state-of-the-art ±0.25% gain error with ±25A current range and 10kHz bandwidth.
- In Paper 3.6, TU Delft presents a MEMS Coriolis-based mass-flow-to-digital converter, achieving state-of-the-art resolution (100µg/h/√Hz) and zero stability (±0.35mg/h) with both liquids and gases.
- In Paper 3.7, Advanced Industrial Science and Technology discloses an energy-efficient magnetoimpedance-based magnetometer with DLL-based digital sensitivity calibration and CDS techniques, achieving 10pT/√Hz input-referred noise, 2.6mW power consumption and 33kHz bandwidth.
- In Paper 3.8, Hamad Bin Khalifa University presents a CMOS temperature sensor with ±0.45°C (3σ) inaccuracy using 1-point trim from -50°C to 180°C. By utilizing the sub-ranging, double-sampling, and constant biasing techniques, the sensor achieves a resolution-FoM of 7.2pJ·K<sup>2</sup> at 150°C.

## Session 3 Highlights: Analog Techniques & Sensor Interfaces

#### [3.1] A Single-Crystal-Oscillator-Based Clock-Management IC with 18× Start-Up Time Reduction and 0.68ppm/°C Duty-Cycled Machine-Learning-Based RCO Calibration

#### [3.3] A 174µV Input Noise, 1GS/s Comparator in 22nm FDSOI with a Dynamic-Bias Preamplifier Using Tail Charge Pump and Capacitive Neutralization Across the Latch

Paper 3.1 Authors: Jaehong Jung, Seunghyun Oh, Joomyoung Kim, Gihyeok Ha, Jinhyeon Lee, Seungjin Kim, Euiyoung Park, Jaehoon Lee, Yelim Yoon, Seungyong Bae, Wonkang Kim, Yong Lim, Jongwoo Lee, Thomas Byunghak Cho

Paper 3.1 Affiliation: Samsung Electronics, Hwaseong-si, Korea

Paper 3.3 Authors: Harijot Singh Bindra, Jeroen Ponte, Bram Nauta

Paper 3.3 Affiliation: University of Twente, Enschede, The Netherlands

Subcommittee Chair: Maurits Ortmanns, University of Ulm, Institute of Microelectronics, Ulm, Baden-Württemberg, Germany, Analog

#### CONTEXT AND STATE OF THE ART

- A stable frequency reference is critical for wireless applications. Conventional cellular devices require a high-frequency (tens of MHz) main crystal oscillator and a low-frequency (32.768kHz) real-time clock with stringent frequency-stability requirements. Replacing the low-frequency crystal with an on-chip oscillator, calibrated with the main high-frequency oscillator, eliminates the cost and board area of the low-frequency crystal.
- Crystal oscillators typically suffer from slow start-up times, increasing average power consumption in duty-cycled systems.
- Comparators are often at the core of analog-to-digital converters and their delay can limit system throughput. Dynamic comparators offer great power efficiency (down to 88pJ) but at the expense of longer delay (680ns) or higher noise (0.45mVrms).

#### TECHNICAL HIGHLIGHTS

- Samsung Electronics presents a 0.68ppm/°C clock-management IC with 18.2× improved start-up time and 6.4× lower start-up energy based on a single crystal oscillator.
  - The RTC crystal is replaced with an on-chip RC oscillator that is calibrated by a duty-cycled machine-learning-based calibration to achieve 0.68ppm/°C with a power consumption of 48.2µW. Furthermore, a precise injection technique is used to reduce the start-up time by 18.2× and energy by 6.4×.
- The University of Twente presents a 1GS/s comparator in 22nm FDSOI technology with 174µVrms input noise and sub-500ps CLK-OUT delay that consumes only 75fJ per comparison.
  - To increase comparator speed with high energy-efficiency, the comparator uses a dynamic bias preamplifier with tail charge pump and a capacitive neutralization technique across the latch input-output.
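
The comparator figures can be related to one another with a short calculation. The sketch below derives the average power at the full sample rate and, under an assumed figure-of-merit definition of energy × noise² × delay (suggested by the nJ·µV<sup>2</sup>·ns unit in the session overview but not defined in the press kit), the delay implied by the quoted 0.6nJ·µV<sup>2</sup>·ns. All constant names are illustrative.

```python
# Figures quoted for Paper 3.3.
ENERGY_PER_COMPARISON_J = 75e-15   # 75fJ per comparison
SAMPLE_RATE_HZ = 1e9               # 1GS/s
INPUT_NOISE_UV = 174.0             # 174uVrms input noise
QUOTED_FOM_NJ_UV2_NS = 0.6         # FoM from the session overview

# Average power when comparing at the full sample rate.
power_w = ENERGY_PER_COMPARISON_J * SAMPLE_RATE_HZ

# Assumed FoM definition (not stated in the press kit):
# FoM = energy [nJ] * noise^2 [uV^2] * delay [ns].
energy_nj = ENERGY_PER_COMPARISON_J * 1e9
implied_delay_ns = QUOTED_FOM_NJ_UV2_NS / (energy_nj * INPUT_NOISE_UV**2)

print(f"Power at 1GS/s: {power_w * 1e6:.0f} uW")
print(f"Implied CLK-OUT delay: {implied_delay_ns * 1e3:.0f} ps")
```

If the assumed definition holds, the implied delay is consistent with the sub-500ps CLK-OUT delay quoted above.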

#### APPLICATIONS AND ECONOMIC IMPACT

- Crystal oscillators typically suffer from slow start-up times, which increase average power consumption in duty-cycled systems. Reducing start-up time and energy enables lower-power system designs.
- Improving clock delay without increasing comparison energy or noise enables converters with higher resolution and sample rate.

# Session 4 Overview: mm-Wave and subTHz ICs for Communication and Sensing

#### **Wireless Subcommittee**

Session Chair: Yiwu Tang, Qualcomm Technologies, San Diego, CA

Session Co-Chair: Ho-Jin Song, Pohang University of Science and Technology, Pohang, Korea

Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR

This session presents several advances in the state of the art for mm-wave and subTHz communication and sensing systems. It features papers describing two D-band transceivers for high data-rate communication, a 140GHz monostatic radar with integrated duplexing and self-interference cancellation, a 23-to-29GHz receiver with an autonomous spatial-notch steering technique, a 98×98 antenna reflect array at 265GHz, a 430GHz concurrent transceiver pixel array, a 300GHz low-power heterodyne receiver with integrated clock generation, and a radiation-hardened 8-element phased-array receiver for satellite communication.

- In Paper 4.1, Nokia Bell Labs and L3 Harris present a D-band transceiver chipset that is scalable in two dimensions in 0.13µm SiGe BiCMOS for a phased-array-on-glass module supporting a data rate up to 30Gb/s with 64-QAM modulation. The transmitter and receiver consume 1.1W and 0.8W, respectively.
- In Paper 4.2, Intel Corporation presents a fully integrated D-band transmitter achieving a data rate of 160Gb/s and efficiency of 1.1pJ/b in 22nm FinFET technology. The IC integrates a wideband RF-DAC with embedded 4:1 multiplexer, a sub-sampling PLL, frequency triplers, LO buffers, a wideband two-stage PA, and on-chip SRAM/PRBS for high-speed data generation.
- In Paper 4.3, Massachusetts Institute of Technology, South China University of Technology, Tsinghua University, University of Technology Sydney and MIT Lincoln Laboratory present a 140GHz monostatic radar with on-chip antenna. The integrated inherent-low-loss duplexing and adaptive self-interference cancellation achieve 33.3dB of total TX/RX isolation over 14GHz bandwidth. The achieved transmitter output power and receiver noise figure are 11.2dBm and 12.9dB, respectively.
- In Paper 4.4, Delft University of Technology presents a 23-to-29GHz four-element receiver in 40nm CMOS with mm-wave N-Input-N-Output spatial notch filtering and autonomous notch-steering to achieve 20-to-40dB mm-wave spatial rejection, -14dBm in-notch IP<sub>1dB</sub> and 4.8dB minimum noise figure while consuming 56mW/element.
- In Paper 4.5, Massachusetts Institute of Technology and Intel Corporation present a 98×98 antenna reflect array at 265GHz and demonstrate electronically steered beams with 1 degree beamwidth. The array uses 1-bit phase shifters and in-pixel memory for sidelobe and beam squint mitigation during raster scanning.
- In Paper 4.6, University of Texas at Dallas and Oklahoma State University present a 430GHz 1×3 concurrent transceiver pixel array in 65nm CMOS. It achieves 3-meter reflection-mode active imaging with a peak EIRP of -4dBm and minimum DSB noise figure of 39dB at a 28.6mW DC power consumption per pixel.
- In Paper 4.7, University of California Los Angeles presents a 300GHz heterodyne receiver with integrated clock generation circuitry consisting of a 270GHz fundamental-mode sub-sampling PLL with offset mixing driven by a 108GHz PLL and a 54GHz PLL in 28nm CMOS technology. The receiver achieves 18dB gain and 20dB noise figure with 52mW power consumption.
- In Paper 4.8, Tokyo Institute of Technology presents a radiation-hardened Ka-band 8-element phased-array receiver for small satellite constellations in 65nm CMOS. The proposed receiver consumes 3.4mW/path while achieving 0.06dB/Mrad and 0.4°/Mrad total ionizing dose (TID) gain and phase radiation tolerance, respectively.

## Session 4 Highlights: mm-Wave and SubTHz ICs for Communication and Sensing

#### [4.2] A Fully Integrated 160Gb/s D-Band Transmitter with 1.1pJ/b Efficiency in 22nm FinFET Technology

#### [4.3] A 140GHz Transceiver with Integrated Antenna, Inherent-Low-Loss Duplexing and Adaptive Self-Interference Cancellation for FMCW Monostatic Radar

**Paper 4.2 Authors:** Steven Callender<sup>1</sup>, Amy Whitcombe<sup>1</sup>, Abhishek Agrawal<sup>1</sup>, Ritesh Bhat<sup>1</sup>, Mustafijur Rahman<sup>1,2</sup>, Chun C Lee<sup>1,3</sup>, Peter Sagazio<sup>1</sup>, Georgios Dogiamis<sup>4</sup>, Brent Carlton<sup>1</sup>, Mark Chakravorti<sup>1</sup>, Stefano Pellerano<sup>1</sup>, Christopher Hull<sup>1</sup>

**Paper 4.2 Affiliations:** <sup>1</sup>Intel, Hillsboro, OR, <sup>2</sup>now with IIT Delhi, New Delhi, India, <sup>3</sup>now with Nebula Microsystems, Richardson, TX, <sup>4</sup>Intel, Chandler, AZ

**Paper 4.3 Authors:** Xibi Chen<sup>1</sup>, Muhammad Ibrahim Wasiq Khan<sup>1</sup>, Xiang Yi<sup>1,2</sup>, Xingcun Li<sup>1,3</sup>, Wenhua Chen<sup>3</sup>, Jianfeng Zhu<sup>4</sup>, Yang Yang<sup>4</sup>, Kenneth Kolodziej<sup>5</sup>, Nathan M. Monroe<sup>1</sup>, Ruonan Han<sup>1</sup>

**Paper 4.3 Affiliations:** <sup>1</sup>Massachusetts Institute of Technology, Cambridge, MA, <sup>2</sup>South China University of Technology, Guangzhou, China, <sup>3</sup>Tsinghua University, Beijing, China, <sup>4</sup>University of Technology Sydney, Ultimo, Australia, <sup>5</sup>MIT Lincoln Laboratory, Cambridge, MA

Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR, Wireless

#### CONTEXT AND STATE OF THE ART

- Research interest continues to increase in mm-wave and sub-terahertz bands to improve performance in various application fields such as communications, radars, and sensing.
- Large available bandwidth at sub-terahertz frequencies provides great potential for extremely high throughputs in communication systems and cm-resolution in sensing applications. However, the design of radio transceivers operating at subTHz frequencies with low noise, low power and high level of integration is inherently challenging, especially in high-volume, low-cost process technologies like CMOS.

#### TECHNICAL HIGHLIGHTS

- Intel Corporation presents a 160Gb/s D-band transmitter in 22nm FinFET technology that can enable ultra-fast next-generation wireless communication systems.
  - The proposed chip achieves the highest published data rate of 160Gb/s with an efficiency of 1.1pJ/b at 10<sup>-3</sup> BER. This work demonstrates a high level of integration, including the subTHz frequency synthesizer, the baseband digital-to-analog converter, as well as the subTHz front-end components.
- MIT, South China University of Technology, Tsinghua University, University of Technology Sydney and MIT Lincoln Laboratory present a 140GHz monostatic radar based on an on-chip inherent-low-loss duplexer and adaptive self-interference cancellation
  - The 140GHz radar in CMOS achieves intrinsic alignment of transmitted and received beams by the monostatic operation enabled by the proposed on-chip turnstile antenna, dual-slot inherent-low-loss duplexer and adaptive self-interference cancellation. It achieves 33.3dB of total TX/RX isolation over 14GHz bandwidth.
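
For an FMCW radar, the textbook range resolution is c / (2B), where B is the swept bandwidth. The sketch below evaluates it under the assumption that the full 14GHz isolation bandwidth quoted above is available as chirp bandwidth. The press kit does not state the actual sweep, so the result is only indicative, and the function name is illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def fmcw_range_resolution_m(sweep_bandwidth_hz: float) -> float:
    """Textbook FMCW range resolution: c / (2 * B)."""
    return SPEED_OF_LIGHT_M_PER_S / (2.0 * sweep_bandwidth_hz)

# Assumption: the 14GHz TX/RX isolation bandwidth is used as the chirp sweep.
resolution_m = fmcw_range_resolution_m(14e9)
print(f"Range resolution: {resolution_m * 100:.2f} cm")
```

A resolution on the order of a centimeter matches the cm-level sensing resolution mentioned in the context section.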

#### APPLICATIONS AND ECONOMIC IMPACT

- By leveraging carrier frequencies above 100GHz and large chunks of available spectrum, sub-terahertz and ultra-broadband chipset solutions can enable 100+Gb/s wireless data rates for next-generation communication standards. Dense integration and co-design with packaging solutions that can enable low-cost phased-array systems are critical for commercial, high-volume implementations.
- Wireless sensing technologies are already having a strong impact on our society; for example, automotive radars are improving safety on the roads. Sub-terahertz sensing and radar in CMOS have the potential to achieve high-resolution vital-sign sensing and security imaging. This would allow the development of many more applications that can be implemented in everyday devices at low cost, thanks to the high level of integration and high volume offered by CMOS technology.

# Session 5 Overview: Imagers, Range Sensors and Displays

#### Imagers, Medical, MEMS and Displays Subcommittee

Session Chair: Mutsumi Hamaguchi, Sharp, Nara, Japan

Session Co-Chair: Seong-Jin Kim, Ulsan National Institute of Science and Technology, Ulsan, Korea

This session covers diverse imagers and range sensors for 2D and 3D imaging applications, and a driver IC for mobile displays. It begins with a paper describing a photon-counting 1Mpixel 143dB-DR image sensor based on charge-focusing SPAD. Two SPAD-based flash LiDAR papers present background light rejection capability: first a reconfigurable 64×64-pixel imager with inter-pixel coincidence detection and first-last event detection schemes, followed by an 80×60 in-pixel histogramming TDC with quaternary-searched time gating and Δ-intensity phase detection. An iToF range sensor approaching sub-mm precision is then presented. After the smallest all-directional autofocus (AF) pixels are described, a dual-mode imager on a single sensor with high-resolution viewing and low-power always-on motion detection is presented. An ADC-less energy-efficient image sensor is introduced, followed by an aggressive pixel scaling resulting in the smallest-to-date 0.56µm pixels. The session ends with a display driver IC with the smallest channel size of 2688µm<sup>2</sup>.

- In Paper 5.1, Canon presents a 3D-BSI 1Mpixel charge-focusing SPAD image sensor for single-photon-sensitive HDR imaging applications. The sensor enables scalable implementation of photon-counting pixels with <0.4W power consumption and 143dB dynamic range based on pixel-wise exposure control and adaptive clocked recharging techniques.
- In Paper 5.2, Fondazione Bruno Kessler describes a reconfigurable 64×64 SPAD imager developed for spacecraft navigation and landing. Inter-pixel coincidence detection, sensitivity control, and first-last event detection techniques provide high background rejection and multiple acquisition modes to improve SNR.
- In Paper 5.3, Ulsan National Institute of Science and Technology reports an in-pixel histogramming TDC architecture based on a quaternary searching algorithm that operates 2× faster than the binary searching method. The inherent time-gating and differential intensity phase detection suppresses 30klux background light, achieving 1.5mm depth resolution.
- In Paper 5.4, Shizuoka University presents a ToF range imager with charge-injection reference plane sampling (CI-RPS) for sub-mm range precision. The CI-RPS generates a charge-injected pseudo photocurrent to reduce jitter due to both the gating drivers and light trigger circuits, demonstrating a high precision of 38µm in a 10-frame average.
- In Paper 5.5, Samsung Electronics shows the world's smallest 1.0µm dual pixel based on inter-PD overflow and in-pixel DTI, achieving the same image quality as that of the single pixel with the full-well capacity of 10,000e<sup>-</sup> and high AF contrast ratio of 3.5. This pixel is rotated and placed horizontally and vertically to enable an all-directional AF.
- In Paper 5.6, Sony describes a single-chip solution with floating-diffusion binning to support high-resolution viewing images and lower resolution computer vision images. In-frame dynamic voltage and frequency scaling minimizes power consumption to 790µW for full-color VGA and 120µW for motion detection at 1fps.
- In Paper 5.7, Yonsei University introduces a fully digital time-mode CMOS image sensor based on the token readout and repeated-scanning scheme. The scheme removes the TDC and the high-speed interface, realizing energy efficiency of 22.9pJ/frame-pixel with 92dB dynamic range.
- In Paper 5.8, Samsung Electronics demonstrates a 64Mpixel CMOS image sensor with the smallest-to-date 0.56µm-pitch pixels separated by full-depth deep-trench isolation (FDTI) with a high-k dielectric layer, achieving comparable performance to the previous generation of 0.64-µm pixel. An air-embedded fence structure also improves optical crosstalk and sensitivity.
- In Paper 5.9, KAIST presents a display source-driver IC with LSB-stacked LV-to-HV-amplify 10b DAC. The mismatch-insensitive 4×-multiplier with low-voltage compact TRs and the 2b-LSB stack-up scheme occupy the smallest-to-date 1-channel area of 2688µm<sup>2</sup> and extend the DAC resolution from 8b to 10b with high-voltage output.

### Session 5 Highlights: Imagers, Range Sensors and Displays

#### [5.1] A 0.37W 143dB-Dynamic-Range 1Mpixel Backside-Illuminated Charge-Focusing SPAD Image Sensor with Pixel-Wise Exposure Control and Adaptive Clocked Recharging

#### [5.8] A 64Mpixel CMOS Image Sensor with 0.56µm Unit Pixels Separated by Front Deep-Trench Isolation

Paper 5.1 Authors: Y. Ota, K. Morimoto, T. Sasago, M. Shinohara, Y. Kuroda, W. Endo, Y. Maehashi, S. Maekawa, H. Tsuchiya, A. Abdelghafar, S. Hikosaka, M. Motoyama, K. Tojima, K. Uehira, J. Iwata, F. Inui, Y. Matsuno, K. Sakurai, T. Ichikawa

Paper 5.1 Affiliation: Canon, Kanagawa, Japan

Paper 5.8 Authors: S. Park, C. Lee, T. Lee, S. Park, H. Park, D. Park, M. Heo, I. Park, H. Yeo, Y. Lee, B. Lee, D-C. Lee, J. Kim, J. Lee, S-I. Kim, I-S. Joe, J. Park, T. Kim, C. K. Chang, J. Kim, J. Lee, H. Kim, C-R. Moon, H-S. Kim

Paper 5.8 Affiliation: Samsung Electronics, Hwaseong, Korea

Subcommittee Chair: Chris Van Hoof, IMEC, Leuven, Belgium

#### CONTEXT AND STATE OF THE ART

- SPAD devices are not only used for 3D imaging with time-of-flight sensors, but are also being incorporated into color image sensors with high dynamic range required in 2D imaging applications such as security and automotive.
- Innovations in process technologies and circuit techniques for CMOS image sensors push the limits of pixel scaling without degrading key features such as full-well capacity and noise performance.

#### TECHNICAL HIGHLIGHTS

- Canon presents a 3D-stacked SPAD image sensor based on pixel-wise exposure control and adaptive clocked recharging, achieving much lower power consumption than previous state-of-the-art photon counting SPAD sensors
  - A 1Mpixel charge-focusing SPAD imager achieves 20× dynamic range (DR) extension and 50× power reduction in a high illumination scenario compared with the prior art, showing 143dB DR and <0.4W power consumption.
- Samsung continues aggressive pixel scaling by virtue of advanced technologies such as full-depth deep-trench isolation with a high-k dielectric layer and an air-embedded fence structure, realizing the smallest pixel to date
  - The smallest-to-date 0.56µm pixel is demonstrated with advanced process technologies, maintaining the full-well capacity of 6000e- and YSNR of 30dB, while reducing dark current by 30%
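
To put the dynamic-range figures in linear terms, the sketch below converts between decibels and ratios. It assumes the 20·log10 convention commonly used for image-sensor dynamic range; the press kit does not state which convention applies, and the helper names are illustrative.

```python
import math

def db_to_ratio(db: float) -> float:
    """Convert dB to a linear ratio, assuming the 20*log10 convention."""
    return 10.0 ** (db / 20.0)

def ratio_to_db(ratio: float) -> float:
    """Convert a linear ratio to dB, assuming the 20*log10 convention."""
    return 20.0 * math.log10(ratio)

# 143dB dynamic range quoted for the Canon SPAD sensor.
dr_ratio = db_to_ratio(143.0)
# 20x dynamic-range extension quoted relative to the prior art.
extension_db = ratio_to_db(20.0)

print(f"143dB corresponds to a ratio of about {dr_ratio:.2e}")
print(f"A 20x extension corresponds to about {extension_db:.1f} dB")
```

Under this convention, 143dB spans more than seven orders of magnitude between the brightest and darkest resolvable signals.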

#### APPLICATIONS AND ECONOMIC IMPACT

- A photon-counting SPAD imager is one of the strongest candidates for high-dynamic-range imaging in security, automotive, and medical applications
- Ultra-high-resolution image sensors are in high demand for high-end mobile devices equipped with multiple camera modules

# Session 6 Overview: Ultra-High-Speed Wireline

#### **Wireline Subcommittee**

Session Chair: Thomas Toifl, Cisco Systems, Thalwil, Switzerland

Session Co-Chair: Amir Amirkhany, Samsung Display America Lab, San Jose, CA

#### Subcommittee Chair: Yohan Frans, Xilinx, San Jose, CA

With an insatiable demand for bandwidth in networking and computing, wireline links are pushing the limits of achievable data rates while consuming less energy and area. The first paper in the session describes the design of an ADC-based 224Gb/s PAM-4 receiver circuit, extending the data rate by 2× compared to prior art. The second paper demonstrates the current state of the art of 112Gb/s ADC-DSP-based transceivers, achieving transmission over a 50dB-loss channel while consuming only 4.5pJ/b. Papers 6.3-6.6 demonstrate how mixed (analog/digital) implementations can reduce energy consumption and chip area, and enable advanced equalization such as multi-tap DFEs. The session concludes with Paper 6.7, where a 2×50Gb/s link on a plastic waveguide is demonstrated with an integrated frequency tracking loop to enable coherent demodulation at low power consumption.

- In Paper 6.1, Intel demonstrates a 224Gb/s PAM-4 receiver. The receiver features a data path with 18dB gain at 53GHz consisting of an inductively peaked analog front end, a 64-way time-interleaved ADC, a DSP incorporating a 16-tap digital FFE and a CDR loop utilizing a DCO followed by inductive clock distribution. The receiver supports channels with 31dB insertion loss at Nyquist with BER < 1e-5. The power efficiency of the analog part is 1.41pJ/b.
- In Paper 6.2, Marvell demonstrates a DSP-based quad transceiver macro for 112Gb/s PAM-4 operation. The design features an SST-driver based DAC, a wideband CTLE, a 64× interleaved ADC and DSP, and achieves a bit error rate of 1e-5 over a 50dB-loss channel. Power efficiency of 4.5pJ/b for the combined analog and digital power is demonstrated in measurements.
- In Paper 6.3, Peking University describes a 112Gb/s mixed-signal transceiver design in 28nm CMOS. The TX uses a 4:1 MUX with a pre-charging state to increase speed, while the receiver features an analog 16× interleaved sampled 4-tap FFE to cancel ISI in mid-reach channel applications. The transceiver achieves a power efficiency of 2.29pJ/b and compensates for 20.8dB of channel loss at a bit error rate of < 1e-11.
- In Paper 6.4, Broadcom presents a 60Gb/s configurable PAM-4/NRZ transceiver, where a BER < 7.6e-6 is achieved over a 47.5dB loss channel at 3.03pJ/b power efficiency. The design incorporates a two-stage CTLE and a 14-tap analog DFE, where taps 2-14 are implemented as direct DFE, and a loop-unrolled first tap DFE is incorporated with a constant unity coefficient.
- In Paper 6.5, Marvell reveals an 8×113Gb/s duplex transceiver macro for XSR channels over an MCM. The TX, which is implemented as a tail-less CML driver, also incorporates 2 roving taps to cancel reflections. The TX-FFE coefficients are adjusted via a back-channel driven from the RX. The TRX operates up to 128Gb/s, and achieves BER < 1E-11 for an 80mm MZM channel while consuming 1.55pJ/b.
- In Paper 6.6, Intel demonstrates a 58.125Gb/s multi-protocol analog PAM-4 receiver with 16 DFE taps. The RX implements 16 DFE taps for the data samples and 3 DFE taps for edge samples in analog using direct feedback. The RX receives data at 58Gb/s with 19.5mV margin at a bit error rate of 1E-4 and power efficiency of 4.4pJ/b. The transceiver occupies only 0.14mm<sup>2</sup>.
- In Paper 6.7, Point2 Technology presents a 50Gb/s PAM-4 bi-directional link over a plastic waveguide. A 28nm CMOS 70GHz transceiver IC adopts the direct conversion architecture, where an LO phase synchronization loop continuously tracks the phase offset of the LO signals to maximize the output SNR. The link achieves a FoM of 2.8pJ/b/m for transmission of 25Gb/s NRZ data over 1m, and 4.2pJ/b/m for transmission of 50Gb/s PAM-4 data over 1m.

### Session 6 Highlights: Ultra-High-Speed Wireline

#### [6.1] A 1.41pJ/b 224Gb/s PAM-4 SerDes Receiver with 31dB Loss Compensation

**Paper Authors:** Yoav Segal<sup>1</sup>, Amir Laufer<sup>1</sup>, Ahmad Khairi<sup>1</sup>, Yoel Krupnik<sup>1</sup>, Marco Cusmai<sup>1</sup>, Itamar Levin<sup>1</sup>, Ari Gordon<sup>1</sup>, Yaniv Sabag<sup>1</sup>, Vitali Rahinski<sup>1</sup>, Gadi Ori<sup>1</sup>, Noam Familia<sup>1</sup>, Stas Litski<sup>1</sup>, Tali Warshavsky<sup>1</sup>, Udi Virobnik<sup>1</sup>, Yeshayahu Horwitz<sup>1</sup>, Ajay Balankutty<sup>2</sup>, Shiva Kiran<sup>2</sup>, Samuel Palermo<sup>3</sup>, Peng Mike Li<sup>4</sup>, Ariel Cohen<sup>1</sup>

Paper Affiliation: <sup>1</sup>Intel, Jerusalem, Israel, <sup>2</sup>Intel, Hillsboro, OR, <sup>3</sup>Texas A&M University, College Station, TX, <sup>4</sup>Intel, San Jose, CA

Subcommittee Chair: Yohan Frans, Xilinx, San Jose, CA, Wireline

#### CONTEXT AND STATE OF THE ART

- The emergence of cloud computing, machine learning, and artificial intelligence drives rapid growth in datacenter bandwidth, which approximately doubles every 3 to 4 years.
- This trend of bandwidth growth requires new electrical interfaces that deliver dramatic increases in SERDES transceiver speeds.

#### TECHNICAL HIGHLIGHTS

- This paper presents a power-efficient 224Gb/s PAM-4 ADC-based receiver in a 5nm CMOS process, doubling the data rate of previously published SerDes.
- It meets bandwidth, jitter, and equalization requirements with the best power efficiency and die area compared to all published material.
- It features a hybrid continuous-time linear equalizer (CTLE) incorporating both inductive peaking and source degeneration.
- It proposes heavy bandwidth extension topologies employing several types of inductive peaking.

#### APPLICATIONS AND ECONOMIC IMPACT

- It targets next-generation Ethernet for chip-to-module applications, envisioned to be the first use-case scenario at this data rate.
- It doubles the data rate from the current IEEE 802.3ck standard at 106Gb/s.

## Session 7 Overview: NAND Flash Memory

#### **Memory Subcommittee**

Session Chair: Violante Moschiano, Micron Semiconductor, Avezzano, Italy

Session Co-Chair: Seung-Jae Lee, Samsung Electronics, Hwaseong-si, Korea

3D NAND Flash memory continues to increase in bit density and performance for both mobile and storage applications. Word-line layer counts now exceed 220 layers. The 4b/cell architecture has matured and is widely used, as evidenced by papers in this session, confirming strong industry and market interest. 3D NAND has also been proposed for similar-vector matching to reduce search latency and energy.

- In Paper 7.1, Western Digital and KIOXIA describe a 162-layer 1Tb 4b/cell 3D Flash memory with a 15Gb/mm<sup>2</sup> storage density. It achieves a program throughput of 60MB/s and a t<sub>R</sub> of 65µs by employing an 8kB WL central stair structure and a contact-through-WL architecture. The design supports a 2.4Gb/s IO speed. Power management features are used to improve system parallelism and high-speed wafer-level testing.
- In Paper 7.2, Micron Technology describes a 176-layer 1Tb 4b/cell 3D Flash memory with quad-plane concurrent read. Program performance is improved by introducing a dual-verify technique. Signal noise and power are improved with a modular design architecture. Finally, peak power management and IO duty cycle calibration are added to improve system parallelism and the data transfer rate.
- In Paper 7.3, SK Hynix describes a 176-layer 1Tb 4b/cell 3D Flash memory using concurrent program sensing for a shorter program time. Ground noise compensation is used to minimize read bias voltage fluctuation during independent plane read operations and cache reads. The read-out bit-error rate is reduced by an algorithm that finds the optimal read level.
- In Paper 7.4, Samsung Electronics shows a 1Tb 3b/cell 3D NAND Flash memory with more than 220 WLs, an 11.55Gb/mm<sup>2</sup> bit density enabled by a CMOS-under-array technique, a 164MB/s write throughput, and a 2.4Gb/s interface. To achieve this performance, intrinsic-variation compensation schemes and algorithms are applied.
- In Paper 7.5, Macronix presents a 512Gb in-memory-computing 3D NAND Flash memory supporting similar vector matching
  operations for edge AI devices. Pooled sampling, 1-3-3 ordering and a dynamic-feedback summation scheme are introduced
  to increase read robustness and reduce power consumption used by SVM operations.

## Session 7 Highlights: NAND Flash Memory

# [7.1] A 1-Tb 4b/Cell 4-Plane 3D Flash with 162-Layer 68-mm<sup>2</sup> Chip Size and 2.4-Gbps IO Speed

**Paper 7.1 Authors**: Jong Hak Yuh<sup>1</sup>, Jason Li<sup>1</sup>, Heguang Li<sup>1</sup>, Yoshihiro Oyama<sup>1</sup>, Cynthia Hsu<sup>1</sup>, Pradeep Anantula<sup>2</sup>, Stanley Jeong<sup>1</sup>, Anirudh Amarnath<sup>1</sup>, Siddhesh Darne<sup>1</sup>, Sneha Bhatia<sup>2</sup>, Tianyu Tang<sup>1</sup>, Aditya Arya<sup>2</sup>, Naman Rastogi<sup>2</sup>, Naoki Ookuma<sup>3</sup>, Hiroyuki Mizukoshi<sup>3</sup>, Alex Yap<sup>1</sup>, Demin Wang<sup>1</sup>, Steve Kim<sup>1</sup>, Yonggang Wu<sup>1</sup>, Min Peng<sup>1</sup>, Jason Lu<sup>1</sup>, Tommy Ip<sup>1</sup>, Seema Malhotra<sup>2</sup>, David Han<sup>1</sup>, Masatoshi Okumura<sup>1</sup>, Jiwen Liu<sup>1</sup>, John Sohn<sup>1</sup>, Hardwell Chibvongodze<sup>3</sup>, Muralikrishna Balaga<sup>2</sup>, Aki Matsuda<sup>1</sup>, Chakshu Puri<sup>2</sup>, Chen Chen<sup>1</sup>, Indra K V<sup>2</sup>, Chaitanya G<sup>2</sup>, Venky Ramachandra<sup>1</sup>, Yosuke Kato<sup>3</sup>, Huijuan Wang<sup>1</sup>, Farookh Moogat<sup>1</sup>, In-Soo Yoon<sup>1</sup>, Kazushige Kanda<sup>4</sup>, Takahiro Shimizu<sup>4</sup>, Noboru Shibata<sup>4</sup>, Takashi Shigeoka<sup>4</sup>, Kosuke Yanagidaira<sup>4</sup>, Takuyo Kodama<sup>4</sup>, Ryo Fukuda<sup>4</sup>, Yasuhiro Hirashima<sup>4</sup>, Mitsuhiro Abe<sup>4</sup>

**Paper 7.1 Affiliation:** <sup>1</sup>Western Digital, Milpitas, CA, <sup>2</sup>Western Digital, Bangalore, India, <sup>3</sup>Western Digital, Sakae-ku Yokohama-shi, Japan, <sup>4</sup>KIOXIA, Tokyo, Japan

Subcommittee Chair: Meng-Fan Chang, National Tsing Hua University

#### CONTEXT AND STATE OF THE ART

- CMOS peripheral circuits placed under the array for a 4b/cell architecture and an increase in the number of word-line layers are sustaining the growth in 3D NAND bit density.
- Circuit techniques that improve device and system performance, together with IO speed increases, are presented to meet the market requirements for 3D NAND Flash memories.

#### TECHNICAL HIGHLIGHTS

- Western Digital and KIOXIA present a 1-Tb 4b/cell 3D NAND Flash memory with 162 word-line layers.
  - 15.0Gb/mm<sup>2</sup> bit density using 162 word-line layers, an 8kB WL central stair structure, and a contact-through-WL architecture.
  - 2.4Gb/s IO with an LTT/CTT combo driver and a time-division peak-power management feature.
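
The quoted capacity and bit density can be cross-checked against the chip size in the paper title. The following is a minimal sketch, assuming 1Tb is counted as 1024Gb and that the bit density is taken over the whole die; the helper name `die_area_mm2` is hypothetical.

```python
def die_area_mm2(capacity_gb: float, density_gb_per_mm2: float) -> float:
    """Estimate die area as capacity divided by areal bit density."""
    return capacity_gb / density_gb_per_mm2


# Paper 7.1 figures from the text: 1Tb capacity, 15.0Gb/mm^2 bit density.
# Assumption: 1Tb = 1024Gb (binary convention).
area = die_area_mm2(1024.0, 15.0)
print(f"Estimated die area: {area:.1f} mm^2")  # compare with the 68-mm^2 chip size in the title
```

Under these assumptions the estimate lands close to the 68-mm<sup>2</sup> chip size given in the title.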

#### APPLICATIONS AND ECONOMIC IMPACT

- NAND Flash performance and density continue to improve despite technology scaling.
- 4b/cell technology is maturing to fulfill SSD and mobile Flash memory system requirements.

## Session 8 Overview: Advanced RF Building Blocks

#### **RF Subcommittee**

Session Chair: Hua Wang, ETH Zürich, Switzerland

Session Co-Chair: Masoud Babaie, Delft University of Technology, The Netherlands

The three papers in this session showcase recent advances in RF-building-block designs. The first paper discusses a wideband LNA that is based on a positive-feedback-based noise-cancelling technique and achieves low NF and low DC power consumption. This is followed by a fractional-N type-I sampling PLL that is based on a voltage interpolator and also accelerates the startup of a crystal oscillator. The session concludes with the third paper presenting an FMCW modulator using an RTWO-based ADPLL.

- In Paper 8.1, Nanyang Technological University introduces a wideband LNA in 28nm CMOS with a positive-feedback-based noise-cancelling technique achieving 2.7dB NF with 3.4mW DC power consumption.
- In Paper 8.2, Intel Corporation shows a fractional-N type-I sampling PLL in 22nm FinFET that is based on a voltage interpolator and is reconfigurable to facilitate the startup of a crystal oscillator within 45µs.
- In Paper 8.3, Analog Devices presents an FMCW modulator using an RTWO-based ADPLL that achieves sawtooth chirps with slopes up to 65MHz/µs and a 37kHz rms FM error.

# Session 8 Highlights: Advanced RF Building Blocks

# [8.2] A 2-to-2.48GHz Voltage-Interpolator-Based Fractional-N Type-I Sampling PLL in 22nm FinFET Assisting Fast Crystal Startup

**Paper Authors:** Somnath Kundu, Timo Huusari, Hao Luo, Abhishek Agrawal, Eduardo Alban, Sarah Shahraini, Thao Xiong, Dan Lake, Stefano Pellerano, Jason Mix, Nasser Kurd, Mohamed Abdel-Moneum, Brent Carlton

Paper Affiliation: Intel Corporation, Hillsboro, OR

Subcommittee Chair: Jan Craninckx, imec, Belgium

#### CONTEXT AND STATE OF THE ART

- A high-performance clock generator with extremely low jitter, area, and power consumption is the key building block in the emerging Internet-of-Things (IoT) applications to connect billions of devices.
- Fast shutdown and restart of the crystal-based clocking subsystems are necessary for edge-computing platforms to transition among different power states efficiently.

#### TECHNICAL HIGHLIGHTS

- This work presents a voltage-interpolation technique based on capacitor charge sharing in a type-I sampling PLL to achieve fractional-N operation and, therefore, eliminate the use of a digital-to-time converter or a voltage reference and the associated complex calibration logic. The PLL can also be reconfigured to inject energy into the crystal for a quick and robust startup.
  - A 22nm FinFET prototype of a PLL with a crystal oscillator achieves 45µs startup time and 0.9ps jitter at 2.3GHz in a fractional-N mode.
  - The PLL active area is at least 33% smaller than for reported digital LC PLLs with comparable performance.

#### APPLICATIONS AND ECONOMIC IMPACT

- This design is potentially an enabling technology for ultra-low-power IoT applications.
- Interconnection of billions of IoT devices in the next decade will open the door to numerous exciting applications.

## Session 9 Overview: High-Quality GHz-to-THz Frequency Generation and Radiation

#### **RF Subcommittee**

Session Chair: Jing-Hong Conan Zhan, MediaTek, Taiwan

Session Co-Chair: Swaminathan Sankaran, Texas Instruments, TX

High-quality frequency references are essential building blocks for enabling enhanced fidelity and throughput in low-GHz/5G/emerging xG communications and mmWave/THz radar/imaging. This session presents the latest advancements in high-quality frequency generation and efficient radiation on silicon-based processes with a potential impact on maturing 5G, emerging xG, connectivity, and sensing applications, while allowing for high levels of integration and low-cost adoption.

- In Paper 9.1, the University of Pavia presents a VCO operating in series resonance and achieving -138dBc/Hz phase noise at 1MHz offset at 10GHz in a 55nm BiCMOS process. The benefits of series resonance over parallel resonance for a given supply limit are analyzed. Among Si-based oscillators operating in the same frequency range, the paper reports a >10dB improvement in phase noise.
- In Paper 9.2, Delft University of Technology demonstrates a dual-core triple-mode VCO in a 22nm FinFET process. The tank-quality-factor degradation is circumvented with the innovative construction and optimal switching of the LC tank, allowing the VCO to achieve ~200dB area-normalized FoM.
- In Paper 9.3, Tsinghua University reports uniquely scalable design topologies to freely scale oscillator cores for millimeter-wave applications enabling a truly flexible power-for-performance trade-off. A 53.6-to-60.2GHz VCO adopting this technique achieves -136dBc/Hz phase noise at 10MHz offset using 65nm CMOS.
- In Paper 9.4, the University of California, Los Angeles presents a 6-element THz radiator operating at 425GHz in a 90nm BiCMOS process. With its synchronization ability at the fundamental frequency and leveraging the nonlinearity of a PIN-diode to enhance the 5<sup>th</sup>-harmonic generation, the inter-coupled array radiates 18.1dBm with high conversion efficiency, low phase noise, and low area.

# Session 9 Highlights: High-Quality GHz-to-THz Frequency Generation and Radiation

# [9.1] Series-Resonance BiCMOS VCO with Phase Noise of -138dBc/Hz at 1MHz offset from 10GHz and -190dBc/Hz FoM

# [9.2] A 0.049mm<sup>2</sup> 7.1-to-16.8GHz Dual-Core Triple-Mode VCO Achieving 200dB FoM<sub>A</sub> in 22nm FinFET

Paper 9.1 Authors: Alessandro Franceschin, Domenico Riccardi, Andrea Mazzanti

Paper 9.1 Affiliation: University of Pavia, Pavia, Italy

Paper 9.2 Authors: Jiang Gong<sup>1,2</sup>, Bishnu Patra<sup>3</sup>, Luc Enthoven<sup>1,2</sup>, Job van Staveren<sup>1,2</sup>, Fabio Sebastiano<sup>1,2</sup>, Masoud Babaie<sup>1</sup>

Paper 9.2 Affiliation: <sup>1</sup>Delft University of Technology, Delft, The Netherlands, <sup>2</sup>QuTech, Delft, The Netherlands, <sup>3</sup>Intel, Hillsboro, OR

Subcommittee Chair: Jan Craninckx, imec, Belgium

#### CONTEXT AND STATE OF THE ART

- High-quality VCOs are fundamental building blocks for high-throughput-communication and high-fidelity sensing systems.
- Supply voltage traditionally limits the maximum VCO swing, making phase-noise improvement challenging.
- A wide tuning-range VCO can save silicon area; however, such VCOs typically need to compromise on phase-noise performance and/or power consumption.

#### **TECHNICAL HIGHLIGHTS**

- The University of Pavia presents a series-resonance VCO that improves silicon-based oscillator phase noise by more than 10dB while maintaining comparable figure of merit.
  - The series-resonant VCO in 55nm BiCMOS achieves -138dBc/Hz phase noise @ 1MHz offset at 10GHz.
- Delft University of Technology reports a wide-tuning-range VCO that achieves state-of-the-art performance while occupying a compact area.
  - Quality-factor degradation is circumvented with the innovative construction and optimal switching of the LC tank allowing the 22nm FinFET VCO to achieve ~200dB area-normalized FoM.

#### APPLICATIONS AND ECONOMIC IMPACT

- The reported VCOs achieve excellent phase-noise performance while occupying a compact area. These techniques are desirable and could be used in communication systems, such as 5G-mm-wave and WiFi7, and sensing applications, such as mm-wave radar and THz imaging.

## Session 10 Overview: Nyquist and Incremental ADCs

#### **Data Converters Subcommittee**

Session Chair: Jan Westra, Broadcom, Bunnik, The Netherlands

Session Co-Chair: Ping Gui, Southern Methodist University, Dallas, Texas, USA

Nyquist analog-to-digital converters continue to set the trend in the speed vs. efficiency corner of the Schreier figure-of-merit plot. The first four papers in this session present pipelined-SAR architectures, each showing specific strengths in efficiency or speed. The fifth paper presents a very high-resolution SAR converter, pushing the limits of accuracy and dynamic range. The last two papers in this session present incremental ADCs with the low-power and area requirements needed for IoT applications.

- In Paper 10.1, the University of Southern California presents a 10GS/s two-step time-domain ADC, using a delay-tracking pipelined-SAR TDC. The 8b converter reaches a Walden FoM of 25fJ/c, in a very small area of only 2850µm<sup>2</sup> in 14nm CMOS.
- In Paper 10.2, National Cheng Kung University describes a 14b, 130MS/s pipelined-SAR ADC with a distributed averaging correlated level shifting ring amplifier. Consuming only 0.82mW at 130MS/s, the ADC achieves 72.5dB SNDR, yielding a FoM<sub>w</sub> and FoM<sub>s</sub> of 1.8fJ/c and 181.5dB, respectively.
- In Paper 10.3, Tsinghua University and Peking University show a 200MS/s pipelined-SAR ADC using an architecture that is
  optimized to combine kT/C noise cancellation with the use of PVT robust ring amplifiers. The 13b converter reaches an SNDR
  of 67dB, using an input capacitance of only 128fF.
- In Paper 10.4, Auburn University describes a 260MS/s, 12b pipelined-SAR ADC. The design uses a ring-TDC-based fine quantizer to minimize nonlinearity, save power and achieve PVT-robust automatic scale alignment between voltage and time domains. The converter reaches an efficiency of 3.04fJ/c FoM<sub>W</sub> and 174.8dB FoM<sub>S</sub>.
- In Paper 10.5, Analog Devices presents a high-precision SAR ADC with zero-order mismatch shaping, that makes linearity independent of capacitor matching. Made in a 180nm process, the 24b, 2MS/s precision SAR ADC reaches 0.03ppm INL and 106dB dynamic range.
- In Paper 10.6, Zhejiang University and Peking University describe a very efficient 15b self-timed incremental zoom ADC in 55nm CMOS, for IoT applications. The design, which consumes only 5µW from a 1V supply, reaches 50.2fJ/c FoM<sub>w</sub> and 177.3dB FoM<sub>s</sub>.
- In Paper 10.7, Tsinghua University and Peking University present a zoom-incremental-counting ADC with a 10kHz bandwidth. Made in a 28nm CMOS process, the design achieves 103dB SNDR and 100dB full-scale CMRR in only 0.014mm<sup>2</sup>.
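
The Walden and Schreier figures of merit quoted above can be reproduced from power, sampling rate, and SNDR. The following is a minimal sketch using the commonly used definitions (Walden: P / (2^ENOB · f<sub>s</sub>); Schreier: SNDR + 10·log10((f<sub>s</sub>/2) / P)), assuming a Nyquist bandwidth of f<sub>s</sub>/2 and that SNDR is the quantity used in both metrics; the function names are hypothetical.

```python
import math


def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_fj(power_w: float, fs_hz: float, sndr_db: float) -> float:
    """Walden FoM in fJ per conversion step."""
    return power_w / (2.0 ** enob(sndr_db) * fs_hz) * 1e15


def schreier_fom_db(power_w: float, fs_hz: float, sndr_db: float) -> float:
    """Schreier FoM in dB, assuming the signal bandwidth is fs/2."""
    return sndr_db + 10.0 * math.log10((fs_hz / 2.0) / power_w)


# Paper 10.2 figures from the text: 0.82mW, 130MS/s, 72.5dB SNDR.
p, fs, sndr = 0.82e-3, 130e6, 72.5
print(f"FoM_W = {walden_fom_fj(p, fs, sndr):.1f} fJ/c")
print(f"FoM_S = {schreier_fom_db(p, fs, sndr):.1f} dB")
```

With the Paper 10.2 numbers, both results round to the figures quoted in the overview (1.8fJ/c and 181.5dB).
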

## **Session 10 Highlights: Nyquist and Incremental ADCs**

## [10.1] A 10GS/s 8b 25fJ/c-s 2850µm<sup>2</sup> Two-Step Time-domain ADC Using Delay-Tracking Pipelined-SAR TDC with 500fs Time Step in 14nm CMOS Technology

## [10.5] A 24b 2MS/s SAR ADC with 0.03ppm INL, 106.3dB DR and FoM=187dB in 180nm CMOS

Paper 10.1 Authors: Juzheng Liu, Mohsen Hassanpourghadi, Mike Shuo-Wei Chen

Paper 10.1 Affiliation: University of Southern California, Los Angeles, CA

Paper 10.5 Authors: Jesper Steensgaard<sup>1</sup>, Richard Reay<sup>2</sup>, Raymond Perry<sup>2</sup>, Dave Thomas<sup>2</sup>, Geoffrey Tu<sup>2</sup>, George Reitsma<sup>2</sup>

Paper 10.5 Affiliation: <sup>1</sup>Analog Devices, Sequim, WA, <sup>2</sup>Analog Devices, Santa Clara, CA

Subcommittee Chair: Michael Flynn, University of Michigan, Ann Arbor, MI

#### CONTEXT AND STATE OF THE ART

- High-speed low-power and area-efficient ADCs with high linearity and dynamic range enable more integration, which is
  required in many emerging applications such as wideband communications, smart sensors, biomedical imaging, and portable
  instrumentation.
- Circuit non-idealities and variability with technology process, supply and temperature have been a fundamental limitation to achieving high performance with robustness.
- Innovations in hybrid architectures, time-domain processing, and circuit techniques for ADCs help break the performance barrier with robustness.

#### TECHNICAL HIGHLIGHTS

- University of Southern California introduces a new delay-tracking pipelined-SAR TDC architecture in 14nm CMOS, achieving high speed and low cost with the best FoM and smallest reported area among published ADCs with ≥ 10GS/s.
  - A 2× interleaved 10GS/s two-step time-domain ADC with a delay-tracking pipelining technique to enhance the speed of the SAR TDC reaches a Walden FoM of 16.6fJ/c in a very small area of only 1425µm<sup>2</sup> in 14nm CMOS.
- Analog Devices presents a high-precision SAR ADC with zero-order mismatch shaping to make linearity independent
  of capacitor matching and an integrating residue amplifier with minimum noise bandwidth and near-noiseless autozeroing to achieve low noise at low power.
  - The 24b ADC operating at 2MS/s demonstrates robust operation and performance from -40°C to +125°C without background calibration and achieves 0.03ppm INL.

#### APPLICATIONS AND ECONOMIC IMPACT

- Advanced circuit techniques for high-accuracy ADCs with robustness make many new applications possible and efficient.
- High single-channel sampling rate reduces the number of channels needed in high sample-rate scenarios and overall implementation cost.
- Lower power consumption and small silicon area while preserving performance enables the integration of more channels with improved robustness.

## Session 11 Overview: Compute-in-Memory & SRAM

#### **Memory Subcommittee**

Session Chair: Eric Karl, Intel, Hillsboro, Oregon, USA

Session Co-Chair: Yasuhiko Taito, Renesas, Tokyo, Japan

Subcommittee Chair: Meng-Fan Chang, National Tsing Hua University, Hsinchu, Taiwan

Compute-in-Memory (CIM) continues to diversify to leverage characteristics of different memory technologies to perform energy-efficient computations in different signal domains. This session covers CIM designs implemented using DRAM, ReRAM, MRAM, PCM and SRAM technologies spanning applications from high-performance computing to edge-AI devices. The first paper introduces a GDDR6-based CIM solution achieving 1TFLOPS with an 8Gb capacity. The next three papers introduce advancements in non-volatile CIM targeted for low-power edge applications, demonstrating encrypted MAC operations, a hybrid SLC/MLC scheme for improved compute accuracy, and a time-domain MAC operation. The fifth paper outlines an ultra-low-power SRAM macro for edge applications achieving fW/b in off-state leakage. The final three papers outline high-precision digital and SRAM CIM schemes reaching up to 250TOPS/W.

- In Paper 11.1, SK Hynix describes a 1ynm, GDDR6-based accelerator-in-memory with a command set for deep-learning operation. The 8Gb design achieves 1GHz MAC operations with a peak throughput of 1TFLOPS and supports major activation functions to improve accuracy.
- In Paper 11.2, National Tsing Hua University presents a 22nm 4Mb STT-MRAM near-memory-computation macro with data-encrypted MAC operation. A 4Mb design with 2304 bus-width achieves a 192GB/s read-and-decryption bandwidth and 25.1-55.1TOPS/W on an 8b-input, 8b-weight, and 26b-output computation with an operating voltage range of 0.65-0.8V.
- In Paper 11.3, TSMC demonstrates a 40nm PCM compute-in-memory macro with a hybrid SLC and MLC configuration. Combining an input-reordering scheme to enhance sparsity, the 2Mb-cell design achieves 20.5 and 65TOPS/W for an 8b-input, 8b-weight, and 19b-output computation with hybrid 2SLC-3MLC and 4MLC configurations.
- In Paper 11.4, National Tsing Hua University presents a 22nm 8Mb ReRAM compute-in-memory macro with a DC-current-free time-domain MAC operation. The design achieves 61.8TOPS/W on an 8b-input, 8b-weight and 19b-output MAC computation.
- In Paper 11.5, Peking University presents an ultra-low leakage SRAM macro targeted for AIoT sensing platforms. The macro implements adjustable power supply (V<sub>DD</sub> and V<sub>SS</sub>) rails with keeper loading free circuits to maximize leakage reduction using the body-bias effect. The 256kb design achieves 2.53fW/bit leakage and up to 14.1MHz performance in a 180nm technology, and 55.2fW/bit and up to 53.9MHz performance in a 55nm technology.
- In Paper 11.6, TSMC demonstrates a 5nm, fully digital compute-in-memory macro supporting simultaneous MAC and write operations with an operating voltage range of 0.5-0.9V. The design utilizes a fully interruptible 12T latch-based storage element and has no accuracy degradation in digital-to-analog or analog-to-digital conversion. The macro achieves 254TOPS/W on a 4b-input, 4b-weight, and 14b-output MAC computation.
- In Paper 11.7, Peking University presents a 32kb full-precision ADC-less CIM macro in a 28nm technology. The design features dynamic logic compute circuitry, reconfigurable local processing units supporting AND, OR and XOR logic functions, and a post-summation circuit enabling >98% DNN utilization. The macro achieves 27.4TOPS/W on an 8b-input, 8b-weight, and 8-21b-output computation.
- In Paper 11.8, National Tsing Hua University describes a 1Mb SRAM-CIM macro in a 28nm technology achieving 37TOPS/W
  on an 8b-input, 8b-weight, and 22b-output precision MAC computation. This macro introduces a time-domain accumulation
  scheme and a time-to-digital converter that enable a fast access time of 0.3ns/b output precision.
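
The TOPS/W figures quoted above can be read as energy per operation, since 1TOPS/W corresponds to 10<sup>12</sup> operations per joule. The following is a minimal sketch of that conversion; the function name `energy_per_op_fj` is hypothetical, and the results are not directly comparable across papers because each figure is quoted at a different input, weight, and output precision.

```python
def energy_per_op_fj(tops_per_watt: float) -> float:
    """Convert TOPS/W to femtojoules per operation.

    1 TOPS/W = 1e12 ops per joule, so energy per op = 1e-12 / (TOPS/W) joules,
    which is 1000 / (TOPS/W) femtojoules.
    """
    return 1000.0 / tops_per_watt


# Efficiency figures from the Session 11 overview text above.
macros = {
    "11.4 (ReRAM, 8b/8b)": 61.8,
    "11.6 (digital, 4b/4b)": 254.0,
    "11.8 (SRAM, 8b/8b)": 37.0,
}

for name, eff in macros.items():
    print(f"Paper {name}: about {energy_per_op_fj(eff):.1f} fJ per operation")
```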

## Session 11 Highlights: Compute-in-Memory & SRAM

#### [11.1] A 1ynm 1.25V 8Gb, 16Gb/s/pin GDDR6-based Accelerator-In-Memory supporting 1TFLOPS MAC Operation and Various Activation Functions for Deep Learning Applications

#### [11.3] A 40nm 2M-cell 8b-Precision Hybrid SLC-MLC PCM Computing-in-Memory Macro with 20.5-65.0 TOPS/W for Tiny AI Edge Devices

**Paper 11.1 Authors:** Seong Ju Lee, Dong Yoon Ka, Kyeong Pil Kang, Kyu Young Kim, Jungyeon Kim, Jeong Je Park, Joon Hong Park, Sang Hoon Oh, Jun Yeol Jeon, Gi Moon Hong, Kyu Dong Hwang, Kornijcuk Vladimir, Yong Kee Kwon, Nah Sung Kim, Woo Jae Shin, Jongsoon Won, Min Kyu Lee, Hyun Ha Joo, Hae Rang Choi, Jae Wook Lee, Dong Uc Ko, Dae Han Kwon, Chun Seok Jeong, Ji Eun Jang, II Park, Joo Hwan Cho

Paper 11.1 Affiliation: SK Hynix, Icheon-si, Gyeonggi-do, Korea

**Paper 11.3 Authors:** Win-San Khwa<sup>1</sup>, Yen-Cheng Chiu<sup>2</sup>, Chuan-Jia Jhang<sup>2</sup>, Sheng-Po Huang<sup>2</sup>, Chun-Ying Lee<sup>2</sup>, Tai-Hao Wen<sup>2</sup>, Fu-Chun Chang<sup>2</sup>, Shao-Ming Yu<sup>1</sup>, Tung-Yin Lee<sup>1</sup>, Meng-Fan Chang<sup>1,2</sup>

Paper 11.3 Affiliation: <sup>1</sup>TSMC Corporate Research, Hsinchu, Taiwan; <sup>2</sup>National Tsing Hua University, Hsinchu, Taiwan

Subcommittee Chair: Meng-Fan Chang, National Tsing Hua University, Hsinchu, Taiwan

#### CONTEXT AND STATE OF THE ART

- With continued advances in deep neural network (DNN) applications, the energy spent in the movement of data through the memory hierarchy is rapidly becoming a dominant factor in system design.
- The concept of processing-in-memory (PIM) in DRAM has been introduced in recent years, integrating bank parallel processing without needing to move data through memory channels.
- For edge computing applications focused on enabling the imminent "internet of everything" era, nonvolatile compute-in-memory processing has great potential to reduce data transfer energy and to enable extremely low off-state power consumption.

#### TECHNICAL HIGHLIGHTS

- SK Hynix outlines a processor-in-memory system based upon GDDR6 DRAM designed to accelerate deep learning applications. Compared to previous commercial offerings based upon HBM technology, this GDDR6 solution enables relatively low-cost applications with a peak throughput of 1TFLOPS and an I/O bandwidth of 16Gb/s per pin. This solution enables up to a 10× speedup relative to emulated GPU systems using HBM2 bandwidth.
- TSMC describes a 2Mb phase-change memory (PCM) compute-in-memory macro enabling 8-bit input and weight MAC functions to support DNN applications. The macro implements single level cells (SLC) for most significant bits and multi-level cells (MLC) for least significant bits within the macro to optimize accuracy, density, and energy efficiency. The measured 8-bit input, 8-bit weight and 19-bit output computation delivers 20.5TOPS/W with a CIFAR-10 benchmark accuracy of 91.89%.

#### APPLICATIONS AND ECONOMIC IMPACT

- Compute-in-memory solutions aim to impact next-generation data center and cloud inference and training applications with the goal of substantially reducing energy consumption and the total cost of ownership.
- Edge device inference and training is rapidly increasing with a variety of applications spanning autonomous driving, driver assistance, image recognition, voice recognition and surveillance. Compute-in-memory solutions aim to enable increased performance and capacity on edge devices, enabling new applications and improved performance to end users.

## Session 12 Overview: Monolithic System for Robot and Bio Applications

## **Technology Directions**

Session Chair: Milin Zhang, Tsinghua University, China

Session Co-Chair: Daniel H. Morris, Meta, CA

Subcommittee Chair: Makoto Nagata, Kobe University, Kobe, Japan

The session describes circuits and systems that enable new applications in robotics and biomedicine. The session begins with a compact IC with all of the electronic functions necessary for a micro-robot. The next three papers discuss sensors and sensor interfaces, including an integrated image sensor and TMD photo-FET array, a self-powered wireless electrochemical sensor and a µECoG implant system. The session concludes with three biomedical-focused papers, including a paper describing a CMOS cellular interface array and two papers demonstrating molecular biosensors.

- In Paper 12.1, University of Michigan shows a 210 × 340 × 50µm CMOS die in 55nm triple-well process that includes all the electronic functions necessary for a micro-robot, including energy harvesting, sensing, processing, communication, and actuation.
- In Paper 12.2, Harvard University presents a 200 × 256 image sensor that integrates a TMD photo-FET array on a CMOS time-to-digital converter (TDC) IC as the first integration of 2D semiconducting materials with CMOS electronics, expanding the standalone TMD integration scale by 50 times.
- In Paper 12.3, National Yang Ming Chiao Tung University shows an energy-autonomous wireless soil pH and electrical conductance measurement IC using soil-microbial and photovoltaic (PV) energy harvesting featuring a peak efficiency of 81.3% over a 0.05-to-14mW power range.
- In Paper 12.4, imec presents a µECoG implant system composed of a flexible, actively multiplexed 256-electrode µECoG array (3µm IGZO TFT) and an incremental-ΔΣ readout IC (22nm FDSOI), achieving an effective area and power per channel of 0.001mm<sup>2</sup> and 1.61µW, respectively, and an input-referred noise of 1.55µV<sub>rms</sub>.
- In Paper 12.5, Georgia Institute of Technology demonstrates a CMOS cellular interface array capable of multi-modal imaging with 1,568 concurrent pixel readout and the smallest multi-modal pixel (13µm×13µm), enabling a sampling rate of 45.8kHz.
- In Paper 12.6, UCSD reports the first CMOS molecular electronic chip for single-molecule biosensing, with integrated current readout circuits arrayed at a 20µm pitch as a 16k sensor array on a CMOS chip, providing a 1kHz frame rate.
- In Paper 12.7, CEA-Léti reveals a co-integrated readout matrix allowing a sequential scan of 1024 NEMS-based mass sensors enabling 375µm<sup>2</sup> 0.5mW pixel-level readout IC selects and 0.3ppm frequency variation in 100ms.

# Session 12 Highlights: Monolithic System for Robot and Bio Applications

## [12.1] A 210 × 340 × 50µm Integrated CMOS System for Micro-Robots with Energy Harvesting, Sensing, Processing, Communication and Actuation

**Paper 12.1 Authors:** Li Xu<sup>1</sup>, Maya Lassiter<sup>2</sup>, Xiao Wu<sup>1</sup>, Yejoong Kim<sup>1</sup>, Jungho Lee<sup>1</sup>, Makoto Yasuda<sup>3</sup>, Masaru Kawaminami<sup>4</sup>, Marc Miskin<sup>2</sup>, David Blaauw<sup>1</sup>, Dennis Sylvester<sup>1</sup>

**Paper 12.1 Affiliation:** <sup>1</sup>University of Michigan, Ann Arbor, MI, <sup>2</sup>University of Pennsylvania, Philadelphia, PA, <sup>3</sup>United Semiconductor Japan Co., Ltd., Kuwana, Japan, <sup>4</sup>United Semiconductor Japan Co., Ltd., Yokohama, Japan

Subcommittee Chair: Makoto Nagata, Kobe University, Japan

#### CONTEXT AND STATE OF THE ART

- Existing state-of-the-art ultra-low power sensor systems integrate energy harvesting, sensing, processing, and communication, but are missing a key component of actuation.
- Micro-robots require ultra-small volume and high efficiency computation and wireless communication.

#### TECHNICAL HIGHLIGHTS

- University of Michigan demonstrates a 210 × 340 × 50µm integrated circuit in 55nm triple-well process with integrated energy harvesting, sensing, processing, communication, and actuation for a micro-robot.
  - The IC demonstrates movement of a micro-robot using two legs when activated by a 60k lux light source. The power consumption of the entire system is only 75nW.
  - This is the first sub-mm<sup>3</sup> (210 × 340 × 50µm) CMOS die in 55nm triple-well process that includes all the electronic functions necessary for a micro-robot, including energy harvesting, sensing, processing, communication, and actuation.

#### APPLICATIONS AND ECONOMIC IMPACT

- The CMOS-compatible, lithographically fabricated actuators offer an unprecedented opportunity to create micron-scale, monolithic systems with moving capability.
- The monolithic system with sensing, processing, communication, energy harvesting and actuation brings new potential for wireless sensor systems in the area of micro-implants and micro-robots.

## Session 13 Overview: Digital Techniques for Clocking, Variation Tolerance and Power Management

#### **Digital Circuits Subcommittee**

Session Chair: Tanay Karnik, Intel, Hillsboro, OR

Session Co-Chair: Ping-Hsuan Hsieh, National Tsing Hua University, Hsinchu, Taiwan

Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

In this session, eight papers highlight developments in digital clocking, variation tolerance and power management. The session opens with a clock generator incorporating various safety mechanisms to achieve ISO26262 ASIL-D requirements. For clocking circuits, two papers on injection-locked clock multipliers demonstrate: 1) a power-gating injection-locking architecture with a background multi-functional digital calibration, and 2) an injection-pulse-shaping technique. The next paper describes a fully automated hardware-driven clock-gating architecture to reduce dynamic clocking power in a 5nm mobile SoC. For variation tolerance, two papers demonstrate: 1) a microprocessor with digital voltage-droop sensors and a voltage-control loop to offset loadline uplift plus noise effects, enabling a runtime algorithm to boost the core clock frequency, while not exceeding the maximum voltage requirement (V<sub>DD,MAX</sub>) for reliability, and 2) a fully synthesizable digital temperature sensor using only logic transistors and metal wires. The last two papers, on power management, describe: 1) an architecture to recycle the energy stored in the decoupling capacitor before a processor enters sleep mode, and 2) an ultra-low-power SoC for IoT applications, featuring energy-performance scaling, event-driven fast DVFS and minimum energy-point tracking.

- In Paper 13.1, Samsung presents a 14nm clock generator with various safety mechanisms to achieve ISO26262 ASIL-D requirements. Numerous IP-level monitors and controllers are implemented to achieve the overall diagnostic coverage above 99%. Circuit operation is qualified over 5 corners and for temperatures in the -40°C to 150°C range.
- In Paper 13.2, KAIST presents an ultra-low-jitter ring-oscillator-based injection-locked clock multiplier using power-gating injection-locking and a background multi-functional digital calibrator in 65nm CMOS. The measured rms jitter and FoM of the output signal at 8.16GHz are 97fs and -248.7dB, respectively.
- In Paper 13.3, Fudan University proposes an injection-pulse-shaping technique that can eliminate all phase errors from the injection-locked clock multiplier output. The proposed design in 65nm technology achieves -79dBc reference spur at 2.5GHz output and 0.496mW/GHz power efficiency, occupying 0.021mm<sup>2</sup> active area.
- In Paper 13.4, Samsung presents a fully automated hardware-driven clock-gating architecture for their 5nm mobile SoC. The clock components are linked by a handshake interface and the clocks are automatically switched off if there is no activity. This scheme achieves 10-to-40% power reduction, while only occupying 0.03% of the die area.
- In Paper 13.5, IBM presents a microprocessor with digital voltage-droop sensors with adaptive throttling to mitigate voltage droops and enable a voltage-control loop (i.e., undervolting) to offset loadline uplift plus noise effects to prevent voltage from exceeding the maximum voltage requirement (V<sub>DD,MAX</sub>) for reliability. Workload optimized frequency (WOF) deterministically maximizes the core clock frequency. The combined effect demonstrates a 15% frequency boost with a 10% reduction in core voltage across a range of workloads.
- In Paper 13.6, Samsung presents a fully synthesizable digital temperature sensor with only logic transistors and metal wires in 5nm CMOS. The sensor eliminates PVT variation via a ratio-metric method and a differential structure to achieve a 3σ accuracy error of ±1.8°C from -40°C to 150°C. At 0.65V, the power consumption is 15µW and the conversion time is 11.7µs.
- In Paper 13.7, the University of Washington presents a power-management architecture to recycle the energy stored in the decoupling capacitor before a processor enters sleep mode, with optimal sleep voltage tracking. Measurement results applied to a 65nm ARM Cortex-M0 processor demonstrate a 57.3% reduction in sleep/resume energy losses, for only 0.02mm<sup>2</sup> area overhead.
- In Paper 13.8, the University of Virginia presents a power-management unit (PMU) in an ultra-low-power SoC for IoT applications, featuring energy-performance scaling, event-driven fast DVFS and minimum energy-point tracking. The 0.026mm<sup>2</sup> 65nm design enables a PMU peak efficiency of 92.6% and consumes 194nW of total system power applied to a RISC-V Core regulated between 0.4 and 1V.
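
Two of the figures above can be combined into derived quantities that are easier to compare across designs. The sketch below is illustrative only: it assumes the 15µW of Paper 13.6 is drawn for the full 11.7µs conversion, and that the 0.496mW/GHz figure of Paper 13.3 applies at the quoted 2.5GHz output. Neither derived value is stated in the press kit, and the function names are hypothetical.

```python
# Derived figures from the Session 13 overview numbers (illustrative only).


def energy_per_conversion_pj(power_uw, conversion_time_us):
    """Energy per conversion in pJ; microwatts times microseconds gives picojoules."""
    return power_uw * conversion_time_us


def power_mw(efficiency_mw_per_ghz, frequency_ghz):
    """Power in mW from a mW/GHz efficiency figure and an output frequency in GHz."""
    return efficiency_mw_per_ghz * frequency_ghz


# Paper 13.6: assumes constant 15 uW over one 11.7 us conversion.
sensor_energy_pj = energy_per_conversion_pj(15.0, 11.7)

# Paper 13.3: assumes 0.496 mW/GHz is quoted at the 2.5 GHz output.
multiplier_power_mw = power_mw(0.496, 2.5)

print(f"Temperature sensor: {sensor_energy_pj:.1f} pJ per conversion")
print(f"Clock multiplier:   {multiplier_power_mw:.2f} mW at 2.5 GHz")
```

Under these assumptions the temperature sensor spends about 175.5pJ per conversion and the clock multiplier draws about 1.24mW.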

## Session 13 Highlights: Digital Techniques for Clocking, Variation Tolerance and Power Management

#### [13.5] Deterministic Frequency Boost and Voltage Enhancements on the POWER10™ Processor

#### [13.1] Clock Generator with ISO26262 ASIL-D Grade Safety Mechanism for SoC Clocking Application

**Paper 13.5 Authors:** Brian T. Vanderpool<sup>1</sup>, Phillip J. Restle<sup>2</sup>, Eric J. Fluhr<sup>3</sup>, Gregory S. Still<sup>4</sup>, Frank Campisano<sup>3</sup>, Ian Carmichael<sup>3</sup>, Eric Marz<sup>5</sup>, Rahul Batra<sup>3</sup>, Richard Willaman<sup>3</sup>

**Paper 13.5 Affiliation:** <sup>1</sup>IBM, Rochester, MN, <sup>2</sup>IBM, Yorktown Heights, NY, <sup>3</sup>IBM, Austin, TX, <sup>4</sup>IBM, Research Triangle Park, NC, <sup>5</sup>IBM, Essex Junction, VT

**Paper 13.1 Authors:** Dokyung Lim, Sounghun Shin, Seungmin Lee, Kihyun Kwon, Jeongmin An, Wonsik Yu, Chanyoung Jeong, WooSeok Kim, Michael Choi, Jongshin Shin

**Paper 13.1 Affiliation:** Samsung Electronics, Hwaseong-si, Gyeonggi-do, Korea

Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

#### CONTEXT AND STATE OF THE ART

- Reliability has become a key issue for digital IC products, defining safe operation parameters such as voltage and frequency margins.
- Innovations in variation-tolerant and frequency-boosting techniques improve processor performance and energy efficiency, while maintaining reliability by preventing voltage from exceeding the maximum voltage requirement (V<sub>DD,MAX</sub>) for scaled technologies.
- The highest automotive-safety-integrity level (ASIL) risk classification, as defined by the road vehicle functional safety standard (ISO26262), is ASIL-D for systems operating fully autonomous vehicles, requiring 99% diagnostic coverage of the single-point fault metric.

#### **TECHNICAL HIGHLIGHTS**

- IBM presents a microprocessor with digital voltage-droop sensors with adaptive throttling to mitigate voltage droops and maximize core performance.
  - A voltage-control loop (undervolting) to offset loadline uplift plus noise effects prevents voltage from exceeding V<sub>DD,MAX</sub> for reliability.
  - Workload optimized frequency deterministically maximizes the core clock frequency.
  - Measurements demonstrate a 15% frequency boost with a 10% reduction in core voltage across a range of workloads.
- Samsung presents a clock generator with various safety mechanisms to satisfy the highly restrictive ISO26262 ASIL-D requirements.
  - The design integrates numerous IP-level monitors and controllers to achieve an overall diagnostic coverage above 99% for extreme reliability.
  - PLL lock, control voltage, frequency and jitter are all monitored live to detect any functional failure of clocking IP.

#### APPLICATIONS AND ECONOMIC IMPACT

- Advancements in processor performance, energy efficiency, and reliability via variation-tolerant and frequency-boosting techniques enhance the robustness and lower the operating costs of data compute centers for many business applications, including health and financial industries.
- IP-level live monitoring of safe operation for advanced driver assist systems (ADAS) is made possible with various integrated sensors, to satisfy the highly restrictive diagnostic coverage for the ASIL-D requirements expected in fully autonomous vehicles.

## Session 14 Overview: GaN, High-Voltage and Wireless Power

#### **Power Management Subcommittee**

Session Chair: Bernhard Wicht, University of Hannover, Hannover, Germany

Session Co-Chair: Patrik Arno, STMicroelectronics, Grenoble, France

Subcommittee Chair: Yogesh Ramadass, Texas Instruments, Santa Clara, CA

Power conversion is ubiquitous, and a variety of power-converter topologies have found their way into both established and emerging applications. This session aims at showcasing some of the best work in the application-specific power-conversion field, such as monolithically integrated GaN-on-Si gate drivers and converters, a chip-scale offline power supply for IoT nodes, sensors, RF transceivers, 48-to-1V hybrid converters for data center applications, a hybrid switching supply modulator for mobile communications as well as wireless power transfer, and isolated DC-DC converters for industrial and medical applications and for mobile device-to-device charging.

- In Paper 14.1, National Chiao Tung University and Realtek Semiconductor present a monolithically integrated 400V GaN-on-Si half-bridge. The proposed diode-emulated GaN technique using a meta-stable fast (MSF) comparator reduces reverse conduction loss by sub-0.2ns deadtime. A dual dv/dt control together with an active bootstrap controller support 50MHz operation and 120V/ns slew rate.
- In Paper 14.2, Leibniz University of Hannover presents an offline power supply with fully integrated power stage in a 0.18µm SOI-CMOS process. This converter supports both AC-DC and DC-DC conversion from 15-to-325V to 3.3-to-10V with an output power of up to 300mW. It achieves a superior power density of 458mW/cm<sup>3</sup> and 73.7% peak efficiency.
- In Paper 14.3, the University of Texas at Dallas presents a monolithic GaN IC for direct 48-to-1V conversion delivering 5A output current at 500kHz switching frequency. The IC includes four on-die power switches and gate drivers, a dead-time controller and temperature sensors. The gate driver reports the shortest propagation delays of 11.6ns and 14ns for monolithic GaN gate drivers.
- In Paper 14.4, the University of Texas at Dallas presents a hybrid 48-to-1V DC-DC converter. The 3:1 ladder-based capacitor-assisted dual-inductor filtering topology uses fully on-chip power switches and achieves 87% peak efficiency at 2.5MHz and 3W peak power. The flying capacitors neither require precharging during start-up nor voltage balancing in the steady state.
- In Paper 14.5, Samsung Electronics presents a digital envelope-tracking supply modulator for RF power amplifiers. A combined structure of a six-level switched-capacitor voltage divider (SCVD) and an interleaved hybrid buck-boost converter achieves new-radio (NR) 200MHz channel BW and 93.6% efficiency for 2G/3G/LTE/NR RF power amplifiers.
- In Paper 14.6, Hong Kong University of Science and Technology and the University of Macau present a wireless power-transfer system that delivers 27W to the load and reaches 80% maximum DC-DC efficiency. The single-stage regulated power amplifier with off-chip GaN transistor merges DC-DC and Class-E circuits to save one inductor and one capacitor.
- In Paper 14.7, the University of Science and Technology of China presents an isolated DC-DC converter with a cross-coupled shoot-through-free Class-D oscillator. The proposed architecture achieves 51% peak efficiency and 1.2W maximum output power while meeting CISPR-32 Class-B standard on a two-layer PCB without using any stitching capacitor.
- In Paper 14.8, Iowa State University presents a capacitive isolated DC-DC converter that achieves 68.3% peak efficiency, reconfigurability of 1-/2-phase operations, through-power-link voltage regulation without using extra feedback transformers/capacitors, and ±8kV/µs common-mode transient immunity with closed-loop voltage regulation.

## Session 14 Highlights: GaN, High-Voltage and Wireless Power

## [14.1] A Monolithic GaN-based Driver and GaN Power HEMT with Diode-emulated GaN Technique for 50MHz Operation and Sub-0.2ns Deadtime Control

**Paper Authors:** Y.-Y. Kao<sup>1</sup>, S.-H. Hung<sup>1</sup>, Y.-H. Wen<sup>1</sup>, T.-H. Yang<sup>1</sup>, S.-Y. Li<sup>1</sup>, T.-W. Wang<sup>1</sup>, K.-H. Chen<sup>1</sup>, Y.-H. Lin<sup>2</sup>, S.-R. Lin<sup>2</sup>, T.-Y. Tsai<sup>2</sup>

**Paper Affiliation:** <sup>1</sup>National Chiao Tung University, Hsinchu, Taiwan, <sup>2</sup>Realtek Semiconductor, Hsinchu, Taiwan

Subcommittee Chair: Yogesh Ramadass, Texas Instruments, Santa Clara, CA, Power Management

#### CONTEXT AND STATE OF THE ART

- Size reduction and efficiency improvement are key performance drivers for power conversion. GaN technology is enabling an increased switching frequency and a corresponding size reduction of power converters for a wide variety of applications.
- The integration of drivers and control circuitry on the same die as the GaN power transistor can yield both performance as well as cost improvements but is hampered by the lack of PMOS devices.

#### TECHNICAL HIGHLIGHTS

- National Chiao Tung University and Realtek Semiconductor present a monolithically integrated 400V GaN-on-Si half-bridge with gate driver, power switch, and internal supporting circuits.
  - Diode-emulated GaN technique using a meta-stable fast (MSF) comparator reduces reverse conduction loss by sub-0.2ns deadtime.
  - Dual dv/dt control together with an active bootstrap controller support 50MHz operation and 120V/ns slew rate.

#### APPLICATIONS AND ECONOMIC IMPACT

- 400V GaN-on-Si half-bridge, enabling small size and highly efficient power conversion for grid-connected loads.
- Fully integrated in a GaN technology, enabling high performance, cost reductions, and mass production.

## Session 15 Overview: ML Processors

#### **Machine Learning Subcommittee**

Session Chair: SukHwan Lim, Samsung Electronics, Hwaseong-si, Korea

Session Co-Chair: Sophia Shao, University of California, Berkeley, CA

Subcommittee Chair: Marian Verhelst, KU Leuven, Belgium, ML Subcommittee Chair

Machine learning processors continue their rapid evolution, offering more flexible acceleration of inference and training and implementation in the most advanced CMOS technology nodes such as 4nm. Compute-in-memory (CIM) architectures are continuing to show further improvements and thus gain traction. Both digital and analog-circuit-based CIM have been used to enhance area and energy efficiency, while at the same time improving in flexibility to address various types of neural networks. ML processors are also being adopted in a wider variety of domains including ultra-low power and in-sensor computing applications. This session comprises nine papers, covering a diverse range of both architectural and circuit innovations, as well as application areas ranging from the cloud to in-sensor processors.

- In Paper 15.1, Samsung presents a multi-mode 8k-MAC neural processing unit (NPU) for mobile SoCs. The NPU features unified multi-precision MAC support from INT4/8/16 to FP16, with a unified FP/INT datapath to boost area efficiency and datapath reconfiguration to enhance utilization. Manufactured in a leading edge 4nm process, the chip achieves 11.59TOPS/W and 1.72TOPS/mm<sup>2</sup> for MobileNetEdgeTPU with multi-mode support for low-latency mode or always-on mode.
- In Paper 15.2, Northwestern University describes a systolic neural CPU Processor with 95% PE utilization for combined deep learning and general-purpose computing. The 65nm chip introduces an architectural approach that allows multiple processing elements (PE) to be reconfigured to implement a RISC-V CPU. By reconfiguring the PEs to CPU or NN accelerator according to the given task, the chip can increase area efficiency, while exhibiting high utilization.
- In Paper 15.3, Fudan University and Alibaba DAMO Academy present a Computing on Memory Boundary (COMB) NN
  Processor, which combines features of in-memory and near-memory processing designs with high energy efficiency, while
  exploiting bipolar bitwise sparsity that can save energy for both zero bits and multiple one bits. Two chips are demonstrated,
  with the 65nm implementation achieving a system energy efficiency of 32.9TOPS/W, and the 28nm implementation achieving
  a COMB macro efficiency of 45.7TOPS/W.
- In Paper 15.4, Tokyo Institute of Technology describes Hiddenite, a 4K-PE Hidden Network Inference 4D-Tensor Engine. The 40nm neural network inference chip achieves state-of-the-art efficiency, implementing a weight generation unit given a known seed together with a super mask decompression unit for on-chip model construction to drastically reduce the need to access external memory.
- In Paper 15.5, Tsinghua University and University of California, Santa Barbara describe a reconfigurable digital CIM processor for large-scale deep learning engines. To accelerate both cloud inference and training, their chip supports both FP and INT computation in a unified pipeline, and features an in-memory Booth multiplication scheme, in-memory accumulation and exponent pre-alignment. The 28nm chip demonstrates an efficiency of 29.2TFLOPS/W for BF16 and 36.5TOPS/W for INT8.
- In Paper 15.6, KU Leuven and imec present DIANA, a DIgital and ANAlog hybrid neural network SoC featuring a RISC-V host coupled with a digital ML accelerator and an analog CIM core. The hybrid architecture enables concurrent execution of NNs across the digital and analog ML cores using a scheduler that considers required precision and efficiency. Their 22nm chip combines a top efficiency of 790TOPS/W (1.5b/8b) in its analog-in-memory portion with 4TOPS/W (8b/8b) in its digital portion.
- In Paper 15.7, KAIST and Columbia University present a fully analog CNN processor featuring a convolution, pooling, and nonlinearity (ReLU) datapath fully (end-to-end) in the analog domain, with no analog-to-digital conversion between layers. The processor adopts a variation-tolerant analog design approach, including analog memory with a write-with-feedback scheme that allows the fully analog processor to be robust to PVT variations. The 28nm chip achieves a peak efficiency of 332.7TOPS/W for 5b equivalent precision.
- In Paper 15.8, Mythic describes an analog CIM processor based on Flash memory for edge AI real-time video analysis. The full-scale commercial system (40nm process) is tile-based with 76 ACiM tiles, each containing 1024×2048 NOR Flash arrays, plus 5 sector control tiles, 1 PCIe tile, and 1 system control tile. Peak performance (8b to 2b) is 16.6-59.3TOPS and 3.3-10.9TOPS/W for the full system, and 5.2-18.5TOPS/W for the ACiM array.
- In Paper 15.9, National Tsing Hua University describes a processing-in-sensor chip for low-power intelligent vision sensors. The 180nm chip includes a 128×128 pixel array and a mixed-mode 3-layer configurable neural network engine to boost efficiency and reduce digital logic area. A "human face or not" detection task achieves an accuracy of 93.6% and power consumption of 122.6µW at 250fps.
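
Some of the figures in this list translate directly into quantities that are easier to picture. The sketch below is illustrative only: it assumes each of the 76 ACiM tiles in Paper 15.8 holds a single 1024×2048 array of Flash cells, and that the 122.6µW of Paper 15.9 is the average power while running at 250fps. The derived values are not stated in the press kit, and all names are hypothetical.

```python
# Derived figures from the Session 15 overview numbers (illustrative only).

# Paper 15.8: assumes one 1024 x 2048 NOR Flash array per ACiM tile.
ACIM_TILES = 76
ARRAY_ROWS, ARRAY_COLS = 1024, 2048
total_flash_cells = ACIM_TILES * ARRAY_ROWS * ARRAY_COLS


def energy_per_frame_uj(power_uw, frames_per_second):
    """Energy per frame in uJ from average power in uW and the frame rate."""
    return power_uw / frames_per_second


# Paper 15.9: assumes 122.6 uW is the average power at 250 fps.
frame_energy_uj = energy_per_frame_uj(122.6, 250)

print(f"Flash cells across all ACiM tiles: {total_flash_cells:,}")
print(f"Vision sensor energy per frame:   {frame_energy_uj:.4f} uJ")
```

Under these assumptions the ACiM tiles hold about 159 million Flash cells in total, and the vision sensor spends about 0.49µJ per frame.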

## Session 15 Highlights: ML Processors

#### [15.1] A Multi-Mode 8k-MAC HW-Utilization-Aware Neural Processing Unit with a Unified Multi-Precision Datapath in 4nm Flagship Mobile SoC

#### [15.5] A 28nm 29.2TFLOPS/W BF16 and 36.5TOPS/W INT8 Reconfigurable Digital CIM Processor with Unified FP/INT Pipeline and Bitwise In-Memory Booth Multiplication for Cloud Deep Learning Acceleration

**Paper 15.1 Authors:** Jun-Seok Park<sup>1</sup>, Chansoo Park<sup>1</sup>, Suknam Kwon<sup>1</sup>, Hyeong-Seok Kim<sup>1</sup>, Taeho Jeon<sup>1</sup>, Yesung Kang<sup>1</sup>, Heonsoo Lee<sup>1</sup>, Jongwoo Lee<sup>1</sup>, James Kim<sup>1</sup>, YoungJong Lee<sup>1</sup>, Sangkyu Park<sup>1</sup>, Jun-Woo Jang<sup>2</sup>, SangHyuck Ha<sup>1</sup>, MinSeong Kim<sup>1</sup>, Jihoon Bang<sup>1</sup>, Sukhwan Lim<sup>1</sup>, Inyup Kang<sup>1</sup>

**Paper 15.1 Affiliation:** <sup>1</sup>Samsung Electronics, Hwaseong-si, Gyeonggi-do, Korea, <sup>2</sup>Samsung Advanced Institute of Technology, Suwon-si, Gyeonggi-do, Korea

**Paper 15.5 Authors:** Fengbin Tu<sup>1,2</sup>, Yiqi Wang<sup>1</sup>, Zihan Wu<sup>1</sup>, Ling Liang<sup>2</sup>, Yufei Ding<sup>2</sup>, Bongjin Kim<sup>2</sup>, Leibo Liu<sup>1</sup>, Shaojun Wei<sup>1</sup>, Yuan Xie<sup>2</sup>, Shouyi Yin<sup>1</sup>

**Paper 15.5 Affiliation:** <sup>1</sup>Tsinghua University, Beijing, China, <sup>2</sup>University of California, Santa Barbara, Goleta, CA

Subcommittee Chair: Marian Verhelst, KU Leuven, Belgium, ML Subcommittee Chair

#### CONTEXT AND STATE OF THE ART

- Neural network accelerators have long focused on high energy-efficiency to help extend battery life in edge devices, and to maximize the compute capability delivered within fixed power and thermal envelopes in datacenters.
- Recently, flexibility in terms of support for a wide variety of neural networks, compute precisions and NN operations is becoming an increasingly important feature for ML processors, across a wide range of applications and product domains.

#### TECHNICAL HIGHLIGHTS

- Samsung describes a multi-mode neural processing unit applicable for both high compute low-latency mobile applications, as well as always-on scenarios with ultra-low-power requirements.
  - A unified FP/INT datapath supports a wide variety of neural networks, precisions and operations, while a reconfigurable computational path improves utilization on challenging layers.
  - Their 4nm chip achieves 4.26TFLOPS/W for DeepLab-V3 and 11.59TOPS/W for MobileNetEdgeTPU.
- Tsinghua University and the University of California, Santa Barbara describe a reconfigurable digital CIM Processor, supporting both FP and INT computation in a unified datapath.
  - With a novel in-memory Booth multiplication scheme, in-memory accumulation and exponent pre-alignment, their 28nm chip achieves 29.2TFLOPS/W for BF16 and 36.5TOPS/W for INT8.

#### APPLICATIONS AND ECONOMIC IMPACT

- With a unified FP/INT datapath that offers sufficient flexibility and precision for datacenter deployment of either NN inference
  or training, the enhanced area and energy efficiency of the reconfigurable digital CIM processor can help reduce total cost of
  ownership, including cooling and deployment costs.
- The multi-mode NPU can run AI applications on smartphones with higher energy efficiency and can enrich the user experience by accelerating workloads, such as computational photography and computer vision.

## Session 16 Overview: Emerging Domain-Specific Digital Circuits and Systems

#### **Digital Circuits Subcommittee**

Session Chair: Huichu Liu, Meta, Menlo Park, CA

Session Co-Chair: Mijung Noh, Samsung Electronics, Hwasung, Korea

Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

Emerging applications continue to drive advancements in digital circuits and systems for high integration, low area and high energy efficiency. The first three papers in this session demonstrate memory-centric accelerators, including: 1) a digital in-memory computing SRAM macro with approximate arithmetic hardware, 2) a high-density compute-in-memory RRAM macro, and 3) an SoC integrated with a large-capacity RRAM for edge applications. Next, a metal-oxide thin-film transistor (TFT)-based microprocessor demonstrates high speed for 800nm TFT technology, followed by a flexible and scalable Ising machine for solving combinatorial optimization problems. The final two papers showcase 1) a low-power transceiver for voltage-mode human-body communication, and 2) a baseband processor for run-time optimal beamforming.

- In Paper 16.1, Columbia University and Intel present a digital in-memory computing (DIMC) SRAM in 28nm CMOS to accelerate convolutional neural networks using approximate arithmetic hardware. The macro achieves an area of 0.033mm<sup>2</sup> (2,569F<sup>2</sup>/b), ~2.5× better than prior work, while retaining up to 2219TOPS/W (1b/1b activation/weight) energy efficiency at 0.5V and 280MHz with state-of-the-art neural network accuracy.
- In Paper 16.2, Georgia Institute of Technology and TSMC present a 40nm RRAM compute-in-memory macro of 64kb in 0.027mm<sup>2</sup> (2.37Mb/mm<sup>2</sup>) with 4.23× improvement in density. Measurements demonstrate an energy efficiency of 26.56TOPS/W at 0.83V and 64MHz.
- In Paper 16.3, Georgia Institute of Technology and TSMC present a 5mm<sup>2</sup> 40nm SoC for edge applications, featuring 2.25MB of RRAM, 768KB of SRAM, and an embedded Cortex M3 processor. Measurements show an overall energy efficiency of 60.64TOPS/W at 0.9V and 192MHz.
- In Paper 16.4, imec introduces a flexible 8b microprocessor in an 800nm metal-oxide TFT technology implemented with a complete digital design flow using a pseudo-CMOS standard-cell library. The chip occupies an area of 24.91mm<sup>2</sup> and achieves 71.4kHz, while consuming 134.9mW of power at a supply voltage of 3V and a bias voltage of 6V.
- In Paper 16.5, Nanyang Technological University presents a scalable accelerator in 65nm CMOS for solving combinatorial optimization problems using the Ising model and annealing process. This accelerator integrates 1024 spins per chip with up to 4 spins per processing element and 28 flexible interconnects per spin, supporting 8b coefficients and consuming 1.14µW/spin at 1.2V and 64MHz with an area of 0.338mm<sup>2</sup> for core PE array.
- In Paper 16.6, Purdue University presents a 65nm digital-based transceiver with a switched-capacitor adiabatic voltage-mode driver and energy-efficient combinatorial pulse-position modulation (CPPM) for body-worn video-sensing augmented reality (AR). The 0.144mm<sup>2</sup> transceiver consumes 63.3µW (~4.4pJ/b) with CPPM (6b/symbol) at 0.75V and 20MHz.
- In Paper 16.7, the University of Washington describes a 28nm digital beamformer for mm-wave phased arrays that adapts to the presence of multiple interferers using run-time adaptation. The 0.53mm<sup>2</sup> chip operates at 1.95GHz to support an instantaneous bandwidth of 650MHz, while consuming 712mW at 1.1V.
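
The area figures quoted for the two memory macros above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it treats 1Mb as 1000kb for Paper 16.2, and for Paper 16.1 it assumes the feature size F equals the 28nm node name. The implied macro capacity for Paper 16.1 is a back-of-the-envelope estimate, not a figure from the press kit, and the function names are hypothetical.

```python
# Cross-checks on the Session 16 macro area figures (illustrative only).


def density_mb_per_mm2(capacity_kb, area_mm2):
    """Bit density in Mb/mm^2, treating 1 Mb as 1000 kb."""
    return capacity_kb / area_mm2 / 1000.0


def implied_bits(area_mm2, f2_per_bit, feature_nm):
    """Number of bits implied by a macro area and a normalized F^2/b figure."""
    area_nm2 = area_mm2 * 1e12            # 1 mm = 1e6 nm, so 1 mm^2 = 1e12 nm^2
    bit_area_nm2 = f2_per_bit * feature_nm ** 2
    return area_nm2 / bit_area_nm2


# Paper 16.2: 64 kb in 0.027 mm^2.
rram_density = density_mb_per_mm2(64, 0.027)

# Paper 16.1: 0.033 mm^2 at 2,569 F^2/b, assuming F = 28 nm.
dimc_bits = implied_bits(0.033, 2569, 28)

print(f"RRAM macro density: {rram_density:.2f} Mb/mm^2")
print(f"DIMC macro capacity implied by area: about {dimc_bits:,.0f} bits")
```

The first result reproduces the quoted 2.37Mb/mm<sup>2</sup>. The second comes out close to 16,384 bits, which suggests the two area figures for Paper 16.1 are mutually consistent under the F = 28nm assumption.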

## Session 16 Highlights: Emerging Domain-Specific Digital Circuits and Systems

#### [16.3] A 40nm 60.64TOPS/W ECC-Capable Compute-in-Memory/Digital 2.25MB/768KB RRAM/SRAM System with Embedded Cortex M3 Microprocessor for Edge Recommendation Systems

**Paper 16.3 Authors:** Muya Chang<sup>1</sup>, Samuel D Spetalnick<sup>1</sup>, Brian Crafton<sup>1</sup>, Win-San Khwa<sup>2</sup>, Yu-Der Chih<sup>3</sup>, Meng-Fan Chang<sup>2</sup>, Arijit Raychowdhury<sup>1</sup>

**Paper 16.3 Affiliation:** <sup>1</sup>Georgia Institute of Technology, Atlanta, GA, <sup>2</sup>TSMC Corporate Research, Hsinchu, Taiwan, <sup>3</sup>TSMC Design Technology, Hsinchu, Taiwan

Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

#### CONTEXT AND STATE OF THE ART

- High-density, non-volatile resistive RAM (RRAM) with compute-in-memory capability is a compelling candidate for embedded systems with high-memory-capacity integration to enable complex real-world applications at the edge.
- Circuit innovations in error-correction codes (ECC) and power-gating capabilities in a hybrid RRAM/SRAM system improve
  processor accuracy, reliability and energy efficiency to satisfy application requirements.

#### TECHNICAL HIGHLIGHTS

- Georgia Institute of Technology and TSMC present an SoC for edge applications, featuring 2.25MB of RRAM, 768KB of SRAM and an embedded Cortex M3 processor.
  - A hybrid compute-in-memory/digital RRAM/SRAM system with ECC and power-gating capability demonstrates an energy efficiency of 60.64TOPS/W at 0.9V and 192MHz.
  - Event-based power gating on the RRAMs achieves a 10× system power reduction from 25mW with all RRAMs on to 2.6mW with all RRAMs off.
  - ECC activation in the RRAM reduces the bit error rate by 5 orders of magnitude from 1e-3 to 1e-8.
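
The two headline ratios above follow directly from the quoted endpoints. The short sketch below recomputes them; it uses only the numbers given in the bullets, and the variable names are illustrative.

```python
import math

# Power gating: 25 mW with all RRAMs on, 2.6 mW with all RRAMs off.
power_reduction = 25.0 / 2.6

# ECC: bit error rate improves from 1e-3 to 1e-8.
ber_orders_of_magnitude = math.log10(1e-3 / 1e-8)

print(f"System power reduction: {power_reduction:.1f}x")
print(f"BER improvement: {ber_orders_of_magnitude:.0f} orders of magnitude")
```

The power ratio works out to about 9.6×, which the highlights round to 10×, and the error-rate improvement is 5 orders of magnitude as stated.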

#### APPLICATIONS AND ECONOMIC IMPACT

- High-capacity RRAM integrated with SRAMs in hybrid compute-in-memory/digital SoCs enables edge recommendation systems, improving energy efficiency for event-driven and memory-constrained applications.


#### [16.4] Flex6502: A Flexible 8b Microprocessor in 0.8µm Metal-Oxide Thin-Film Transistor Technology Implemented with a Complete Digital Design Flow Running Complex Assembly Code

Paper 16.4 Authors: Hikmet Çeliker<sup>1,2</sup>, Antony Sou<sup>3</sup>, Brian Cobb<sup>3</sup>, Wim Dehaene<sup>1,2</sup>, Kris Myny<sup>1,2</sup>

**Paper 16.4 Affiliation:** <sup>1</sup>imec, Leuven, Belgium, <sup>2</sup>KU Leuven ESAT, Leuven, Belgium, <sup>3</sup>PragmatIC Semiconductor Ltd, Cambridge, United Kingdom

Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

#### CONTEXT AND STATE OF THE ART

- Although emerging metal-oxide thin-film-transistor (TFT) circuits with flexible substrates are attractive for IoT applications, the circuit speed and integration scale of these designs limit the opportunity for processing complex computational tasks.

#### TECHNICAL HIGHLIGHTS

- imec introduces a flexible 8b microprocessor in an 800nm metal-oxide TFT technology implemented with a complete digital design flow using a pseudo-CMOS standard-cell library.
  - The chip implements the famous MOS 6502 microprocessor in an indium-gallium-zinc-oxide (IGZO) process technology, achieving 71.4kHz, while consuming 134.9mW of power.
  - Execution of assembly code for the well-known Snake game showcases the real-time operation.
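
To put the speed and power figures in perspective, they can be combined into an energy per clock cycle. This is a rough sketch that assumes the 134.9mW is drawn while the chip runs at 71.4kHz; the derived value is not stated in the press kit, and the function name is hypothetical.

```python
def energy_per_cycle_uj(power_mw, clock_khz):
    """Energy per clock cycle in uJ; mW divided by kHz gives microjoules."""
    return power_mw / clock_khz


# Assumes the quoted power is measured at the quoted clock frequency.
flex6502_energy_uj = energy_per_cycle_uj(134.9, 71.4)

print(f"Energy per cycle: {flex6502_energy_uj:.3f} uJ")
```

Under that assumption each clock cycle costs roughly 1.89µJ.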

#### APPLICATIONS AND ECONOMIC IMPACT

- Advanced circuit techniques improve the performance of the flexible TFT-based microprocessor and showcase this technology for realizing complex computational functionality for embedded IoT applications.

## Session 17 Overview: Advanced Wireline Links and Techniques

#### **Wireline Subcommittee**

Session Chair: Bo Zhang, Broadcom, Irvine, CA

Session Co-Chair: Wei-Zen Chen, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

Subcommittee Chair: Yohan Frans, Xilinx, San Jose, CA

Wireline links are critical in enabling data-traffic growth in data centers, AI, automotive, and next-generation telecommunication infrastructure. These applications demand high-performance optical and electrical transceivers with low jitter, high bandwidth, and excellent energy efficiency. This session introduces advanced wireline links and techniques over optical and electrical channels.

The first two papers of the session describe optical transceiver components. The first is a 200Gb/s short-reach coherent receiver with an analog-based DSP solution, and the second is a 100Gb/s PAM-4 transmitter with SiP MOSCAP modulators and CMOS drivers. The third paper describes a 10Gb/s isolator-based transceiver with strong surge protection for reliable industrial and medical applications. The remaining four papers of the session focus on clock generation for serial-link transceivers. The first is a low-power 56GHz fractional-N PLL for 224Gb/s PAM-4 applications, followed by a 14GHz sub-sampling PLL (SSPLL) with a high multiplication factor. The last two papers describe highly linear and low-jitter phase interpolators for timing recovery.

- In Paper 17.1, Peking University proposes a 200Gb/s analog DP-QPSK coherent receiver in 28nm CMOS. It features 12MHz carrier recovery, equalization for 10km optical fiber dispersion, 1.2Mrad/s SOP and 9dB electrical channel loss. The receiver achieves an energy efficiency of 4.6pJ/b with active area of 0.06mm<sup>2</sup>.
- In Paper 17.2, California Institute of Technology demonstrates an optical 100Gb/s PAM-4 transmitter system in a SiP-CMOS
  platform including a push-pull segmented MZM structure using MOSCAP phase modulators. Two pairs of modulators are driven
  at 50Gbaud by a dual-channel CMOS driver, which is flip-chip bonded to the photonics chip, with 2.4pJ/b power efficiency.
- In Paper 17.3, Analog Devices shows a fully integrated galvanic isolated 10Gb/s transceiver using a split-ring resonator in 40nm CMOS process. It achieves the state-of-the-art isolation capability of 100kV/µs common mode transient immunity and >24kV surge rating.
- In Paper 17.4, University of California, Los Angeles, describes a 56GHz fractional-N PLL targeting 224Gb/s PAM-4 transmitters featuring a current-mode FIR filter to minimize DSM quantization noise folding. Fabricated in 28nm CMOS technology, the PLL consumes 23mW and occupies an area of 0.1mm<sup>2</sup>.
- In Paper 17.5, University of California, Berkeley, proposes a new SSPLL design with a proportionally divided charge pump for a multiplication factor of 480. It operates at 14.3GHz, and achieves 6.6mW including the FLL, 153.4fs jitter and -248.1dB FoM.
- In Paper 17.6, Columbia University presents a 65nm 7b twin phase interpolator prototype with the Delta QDLL. It achieves an INL<sub>pp</sub> less than 1.45LSB from 3.5GHz to 11GHz, an integrated fractional spur level of -41.7dBc at 7GHz with 1429ppm modulation, and an integrated jitter of 58.5fs<sub>rms</sub>.
- In Paper 17.7, University of British Columbia proposes a 14GHz integrating-mode phase interpolator in a 5nm FinFET process. It achieves 9b linearity, 0.43mW/GHz power, 295fs/410fs DNL/INL at 13.3GHz, -42.6dBc integrated rotation spur, and 71fs<sub>rms</sub> integrated jitter, all in a compact area.

## Session 17 Highlights: Advanced Wireline Links and Techniques

#### [17.1] A 4.6pJ/b 200Gb/s Analog DP-QPSK Coherent Optical Receiver in 28nm CMOS

**Paper Authors:** Kai Sheng<sup>1</sup>, Haowei Niu<sup>1</sup>, Boyang Zhang<sup>1</sup>, Weixin Gai<sup>1</sup>, Bingyi Ye<sup>1</sup>, Tianjian Zuo<sup>2</sup>, Lei Liu<sup>2</sup>, Sen Zhang<sup>2</sup>, Tingting Zhang<sup>2</sup>, Hang Zhou<sup>1</sup>, Congcong Chen<sup>1</sup>

**Paper Affiliation:** <sup>1</sup>Peking University, Beijing, China, <sup>2</sup>Huawei Technologies, Shenzhen, China

Subcommittee Chair: Yohan Frans, Xilinx, San Jose, CA, Wireline

#### CONTEXT AND STATE OF THE ART

- Emerging applications such as artificial intelligence and cloud computing are driving demand for ever-higher transmission data rates.
- Polarization-diversity coherent detection is an indispensable technique for high-capacity transmission owing to its excellent spectral efficiency. Existing DSP-based coherent receivers consume high power, while prior analog coherent receivers are limited in data rate.

#### TECHNICAL HIGHLIGHTS

- A 200Gb/s analog dual-polarization quadrature phase-shift keying (DP-QPSK) coherent optical receiver featuring a 2-stage equalizer to relax the speed limit, while consuming 10 times less power than ADC-DSP-based ones.
- The active area is only 0.06mm<sup>2</sup>, 90% smaller than that of ADC-DSP-based receivers.
- It demonstrates 12MHz carrier recovery, equalization for 10km optical fiber dispersion, 1.2Mrad/s SOP and 9dB electrical channel loss.
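As a quick sanity check on the headline figures, energy per bit multiplied by data rate gives the implied receiver power. The sketch below assumes the 4.6pJ/b figure applies at the full 200Gb/s rate; the function name is illustrative and does not come from the paper.

```python
def link_power_watts(energy_per_bit_pj: float, data_rate_gbps: float) -> float:
    """Power (W) = energy per bit (J/b) x bit rate (b/s)."""
    return (energy_per_bit_pj * 1e-12) * (data_rate_gbps * 1e9)


# Figures quoted for Paper 17.1: 4.6pJ/b at 200Gb/s (assumed to hold together).
rx_power_w = link_power_watts(4.6, 200.0)
print(f"Implied receiver power: {rx_power_w:.2f} W")
```

Under that assumption the receiver would draw a little under 1W.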

#### APPLICATIONS AND ECONOMIC IMPACT

- Significantly lowers the receiver's power/area consumption, which would reduce the cost of commercial long-haul optical systems.
- The proposed 2-stage equalizer can be generally applied to any analog-based receiver.

## Session 18 Overview: DC-DC Converters

#### **Power Management Subcommittee**

Session Chair: Harish Krishnamurthy, Intel, Hillsboro, OR

Session Co-Chair: Xun Liu, Chinese University of Hong Kong, Shenzhen, China

Subcommittee Chair: Yogesh Ramadass, Texas Instruments, Santa Clara, CA

Power density, efficiency, and transient response remain the key performance metrics for modern DC-DC converters, which power applications such as multi-core microprocessors, energy-harvesting and direct battery-attached systems, automotive electronics, and LED drivers. This session introduces novel topologies that achieve high voltage-conversion ratios while minimizing inductor current, maximizing inductor slew rate for faster transient response, and maintaining high power-conversion efficiency. Fully integrated power converters for maximum power density and alternative techniques for sensing inductor current in traditional buck converters are also presented, showcasing the latest DC-DC converter designs at both the system and circuit levels.

- In Paper 18.1, KAIST and Samsung Electronics present a high-power-density (1.23 W/mm<sup>2</sup>), high-efficiency (~83%), 400MHz 6-phase fully integrated buck converter in 28nm CMOS with on-chip capacitor dynamic reallocation for inductor current sensing and current balancing, achieving fast DVS of 75mV/ns.
- In Paper 18.2, the University of Science and Technology of China presents a double step-down (DSD) DC-DC converter with dual phase charging that improves the inductor slew rate by 2× to minimize droop (56mV) for a 3A/20ns load transient.
- In Paper 18.3, the University of Macau and the University of Lisboa overcome the limitation of D<0.5 in DSD DC-DC converters and reliability issues when D>0.5 by proposing a capacitor cross-connected DSD which in conjunction with a hysteretic controller improves the droop to 0.73× of the theoretical minimum while improving the DVS rate by 1.3×.
- In Paper 18.4, Dartmouth College presents a fully integrated 3:1 step-down resonant Dickson converter which achieves 78.3% efficiency for 4.2-to-1.2V step down at 140mW. The design uses a merged electromagnetic LC resonator to reduce high-frequency winding loss effects and improve utilization of spiral magnetics.
- In Paper 18.5, Intel presents a 4-phase fully integrated voltage regulator (FIVR) in 7nm CMOS delivering up to 12A load current with a current density of 47 A/mm<sup>2</sup> and featuring a digitally assisted control loop that allows autonomous phase shedding (APS) while retaining fast transient response.
- In Paper 18.6, Zhejiang University presents a reconfigurable capacitive-sigma converter with 98.4% peak efficiency, a 5V input, and a 0.4-to-1.2V output. This work connects an unregulated 2:1 switched-capacitor converter in an input-series and output-shunt configuration with a reconfigurable hybrid Dickson buck converter to interface with higher voltages while minimizing current through the output inductor and improving the overall efficiency.
- In Paper 18.7, the University of Texas at Dallas presents a hybrid boost converter topology with a scalable N-stage conversion ratio (CR) boosting scheme to enlarge the minimum switch on-time, provide N DC outputs, and reduce switch V<sub>DS</sub> stress as well as capacitor voltage. The 4-stage converter produces 4 outputs and achieves peak efficiencies of 91% at CR = 12 and f<sub>SW</sub> = 2MHz and 87% at CR = 10 and f<sub>SW</sub> = 5MHz.
- In Paper 18.8, the University of Macau and the University of Lisboa present an SC-parallel-inductor buck (SCPL-Buck) topology that can significantly reduce inductor current to <0.5× of I<sub>load</sub>, reducing the conduction losses by over 75% for the same DCR, facilitating a peak efficiency of 92.9% for a 4.2V input voltage system.
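The link between the inductor current and the conduction-loss figures quoted for Paper 18.8 follows from the I²R relationship. The sketch below assumes the loss in question is purely resistive loss in the inductor DCR and ignores current ripple; the function name is illustrative.

```python
def conduction_loss_ratio(current_fraction: float) -> float:
    """Ratio of I^2 * DCR loss at a reduced inductor current to the loss
    when the inductor carries the full load current (same DCR)."""
    return current_fraction ** 2


# Inductor current reduced to 0.5x of I_load, as quoted for the SCPL-Buck.
ratio = conduction_loss_ratio(0.5)
reduction_pct = (1.0 - ratio) * 100.0
print(f"Loss reduction at 0.5x I_load: {reduction_pct:.0f}%")
```

Because the quoted current is below 0.5× of the load current, the reduction exceeds 75%, consistent with the "over 75%" claim.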

## **Session 18 Highlights: DC-DC Converters**

#### [18.6] A 5V Input 98.4% Peak Efficiency Reconfigurable Capacitive-Sigma Converter with Greater than 90% Peak Efficiency for the Entire 0.4~1.2V Output Range

Paper Authors: Xu Yang, Linhu Zhao, Menglian Zhao, Zhichao Tan, Lenian He, Yong Ding, Wuhua Li, Wanyuan Qu

Paper Affiliation: Zhejiang University, Hangzhou, China

Subcommittee Chair: Yogesh Ramadass, Texas Instruments, Santa Clara, CA, Power Management

#### CONTEXT AND STATE OF THE ART

- Aggressive power management in USB and battery powered devices requires efficient high-conversion-ratio power delivery in small volume, area, and z-height with output power headroom to support peak power workloads.
- Hybrid multi-level converter-based solutions pass the full inductor current through all the switches, which limits power levels and lowers current density.

#### **TECHNICAL HIGHLIGHTS**

- A highly efficient unregulated switched-capacitor (SC) converter is combined with a reconfigurable Dickson hybrid buck stage in an input-series and output-shunt configuration to support high input voltages and deliver higher output current.
  - The converter achieves 98.4% peak efficiency and greater than 90% peak efficiency across the entire 0.4-to-1.2V output range from a 5V input (VCR of up to 12.5).
  - As compared to the state-of-the-art designs, the peak efficiency is improved by 1.5% at VCR=4.2 and 4.9% at VCR=12.5.
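The voltage conversion ratios (VCR) quoted above follow directly from the input and output voltages. A minimal sketch, assuming VCR is defined as input voltage divided by output voltage:

```python
def voltage_conversion_ratio(v_in: float, v_out: float) -> float:
    """VCR, taken here as input voltage divided by output voltage."""
    return v_in / v_out


V_IN = 5.0  # 5V input quoted for Paper 18.6
for v_out in (1.2, 0.4):  # ends of the 0.4-to-1.2V output range
    vcr = voltage_conversion_ratio(V_IN, v_out)
    print(f"Vout = {v_out:.1f} V -> VCR = {vcr:.1f}")
```

The two ends of the output range correspond to the VCR values of about 4.2 and 12.5 used in the comparison.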

#### APPLICATIONS AND ECONOMIC IMPACT

Power delivery technologies are a key enabler of small form factor for battery-powered mobile and handheld devices. With its
novel scheme of connecting two dissimilar topologies in input series and output shunt configuration, the contributions of this
paper will change the paradigm of high-voltage-conversion-ratio architectures in future mobile power delivery technologies.

## Session 19 Overview: Power Amplifiers and Building Blocks

#### **RF Subcommittee**

Session Chair: Hongtao Xu, Fudan University, China

Session Co-Chair: Yves Baeyens, Nokia Bell Labs, NJ

Subcommittee Chair: Jan Craninckx, imec, Belgium

This session presents the latest advances in power amplifiers and highly linear circuits operating from WiFi frequencies (2.4 and 5 to 7GHz) up to D-Band (110 to 130GHz). The first three papers discuss PA design strategies that lead to PAs operating in D-Band, W-Band, and at 28GHz. In the fourth paper, a bi-directional PA/LNA front-end is described. This is followed by a WiFi 6E digital polar transmitter, a 27-to-41GHz VSWR-resilient power/impedance sensor, and a 1-to-18GHz passive mixer with an integrated LO driver.

- In Paper 19.1, Tsinghua University presents a compact D-Band SiGe BiCMOS Doherty PA using an 8-way slotline-based power combiner achieving a saturated output power of over 22dBm from 110 to 130GHz. The power added efficiency (PAE) at 110/120/130GHz reaches 18.7/17.2/16.1% at peak and 12.1/11.7/9.8% at 6dB power back-off (PBO), respectively.
- In Paper 19.2, Tsinghua University reveals a W-Band 65nm-CMOS PA, which uses a scalable 128-to-1 power combiner and delivers 32.1dBm peak P<sub>out</sub> with 15% peak PAE at 98GHz under a 1V supply.
- In Paper 19.3, Tianjin University proposes a 28GHz compact 3-way transformer-based parallel-series Doherty PA achieving 25.5dBm P<sub>SAT</sub> and 20.4%/14.2% PAE at 6/12dB PBO while occupying 0.54mm<sup>2</sup> in a 55nm bulk CMOS process.
- In Paper 19.4, the Georgia Institute of Technology describes a broadband ultra-compact high-linearity bi-directional PA/LNA front-end. It reduces chip area by sharing matching networks and improves PA performance by using a hybrid N/PMOS.
- In Paper 19.5, Intel introduces a digital polar transmitter in 16nm FinFET that delivers a maximum output of 28dBm supporting wide bandwidths (up to 160MHz) on both LB (2.4 to 2.5GHz) and HB/UHB (5 to 7GHz) for dual band WiFi 6E applications.
- In Paper 19.6, the Georgia Institute of Technology shows an on-chip VSWR-resilient joint true-power/impedance sensor operating with a >21dB dynamic range over a 27-to-41GHz BW.
- In Paper 19.7, the University of California, Santa Barbara demonstrates a 1-to-18GHz passive mixer with an integrated LO driver in 45nm SOI CMOS. By using differential distributed-stacked-complementary (DiSCo) switches, a P<sub>1dB</sub> >20dBm and an IIP3 >33dBm are achieved.

## Session 19 Highlights: Power Amplifiers and Building Blocks

#### [19.1] A 110-to-130GHz SiGe BiCMOS Doherty Power Amplifier with Slotline-Based Power-Combining Technique Achieving >22dBm Saturated Output Power and >10% Power Back-Off Efficiency

#### [19.2] A 1V 32.1dBm 92-to-102GHz Power Amplifier with a Scalable 128-to-1 Power Combiner Achieving 15% Peak PAE in a 65nm Bulk CMOS Process

Paper 19.1 Authors: Xingcun Li<sup>1</sup>, Wenhua Chen<sup>1</sup>, Shuyang Li<sup>1</sup>, Huibo Wu<sup>1</sup>, Xiang Yi<sup>2</sup>, Ruonan Han<sup>3</sup>, Zhenghe Feng<sup>1</sup>

**Paper 19.1 Affiliation:** <sup>1</sup>Tsinghua University, Beijing, China, <sup>2</sup>South China University of Technology, Guangzhou, China, <sup>3</sup>Massachusetts Institute of Technology, Cambridge, MA

Paper 19.2 Authors: Wei Zhu, Jiawen Wang, Ruitao Wang, Yan Wang

Paper 19.2 Affiliation: Tsinghua University, Beijing, China

Subcommittee Chair: Jan Craninckx, imec, Belgium

#### CONTEXT AND STATE OF THE ART

- The demand for 100+ Gbps data rates in wireless communications has driven the rapid development of silicon-based transceivers for the mm-wave and sub-THz bands. The broad available spectrum is attracting interest for both short-range and backhaul high-speed communications. However, to overcome the high path loss, high-power transmitters (TXs) or power amplifiers (PAs) are essential in such systems.
- Silicon-based mm-wave PAs encounter several challenges, such as the limited f<sub>T</sub>/f<sub>max</sub> of transistors, low breakdown voltage in CMOS/SiGe transistors, and considerable loss in the passive combining networks. As a result, reported output power at 100GHz and beyond is limited to ~20dBm and efficiency is typically poor (<10%).

#### TECHNICAL HIGHLIGHTS

- Tsinghua University introduces the first high-efficiency Doherty amplifier operating beyond 100GHz and providing both higher output power and better efficiency than previous amplifiers operating in that frequency range.
  - Using an 8-way slotline-based power combiner and a SiGe technology, the PA achieves a saturated output power of over 22dBm from 110 to 130GHz, while the power added efficiency (PAE) at 110/120/130GHz reaches 18.7/17.2/16.1% at peak and 12.1/11.7/9.8% at 6dB power back-off (PBO), respectively.
- Tsinghua University reveals the first Si CMOS PA providing over 1W of power at W-band (92 to 102GHz). By efficiently
  combining 128 power cores in 65nm CMOS, the presented PA achieves 10 times more output power than previously
  reported W-band CMOS PAs. Its performance rivals that of more exotic technologies, such as GaN HEMT.
  - Using a scalable center- and side-fed coupled-line-based power-combining scheme, the 65nm bulk CMOS PA delivers 32.1dBm peak P<sub>out</sub> with a 15% peak PAE at 98GHz under a 1V supply.
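The output powers above are quoted in dBm; converting them to watts makes the "over 1W" claim easy to check. A minimal sketch using the standard dBm definition (power relative to 1mW); the function name is illustrative.

```python
def dbm_to_watts(p_dbm: float) -> float:
    """Convert power in dBm (decibels relative to 1 mW) to watts."""
    return 10 ** (p_dbm / 10.0) / 1000.0


# Figures quoted for Papers 19.2 (32.1dBm) and 19.1 (22dBm).
for label, p_dbm in (("19.2 peak Pout", 32.1), ("19.1 saturated Pout", 22.0)):
    print(f"{label}: {p_dbm} dBm = {dbm_to_watts(p_dbm):.3f} W")
```

On this scale 32.1dBm is roughly 1.6W and 22dBm is roughly 0.16W.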

#### APPLICATIONS AND ECONOMIC IMPACT

- The much larger output power and better efficiency, especially for modulated signals, of the presented PAs enable new
  applications in sub-millimeter wave communications for both wireless backhaul and 6G, environmental sensing, and other
  applications.
- The ability to implement amplifiers with a Watt-level output power in W-band and beyond in high-volume manufacturing processes, such as 65nm bulk CMOS, will significantly reduce the cost of sub-mm-wave transceivers and will spawn new commercial opportunities.

### **Session 20 Overview: Body and Brain Interfaces**

#### Imagers, Medical, MEMS and Display (IMMD) Subcommittee

Session Chair: Sohmyung Ha, New York University Abu Dhabi, Abu Dhabi, UAE

Session Co-Chair: Rikky Muller, University of California, Berkeley, CA

Subcommittee Chair: Chris Van Hoof, IMEC, Leuven, Belgium

This session covers systems that interface with the body and the brain in wearable, implantable, and *in vitro* applications. The papers demonstrate innovations that traverse both circuit- and system-level designs with validations in biomedical environments. This session features five wearable biointerface technologies. The first paper describes a wearable dry-electrode bioimpedance interface circuit with a novel chopping scheme that minimally degrades the input impedance. Three papers describe neural interface applications. The fourth paper describes a 256-channel closed-loop neuromodulation IC with an integrated low-power neural network brain-state classifier.

- In Paper 20.1, Fudan University presents a dry-electrode impedance-measurement circuit that boosts the input impedance by using quiet chopping and pre-charging techniques. This IC achieves 100MΩ input impedance at 50kHz, 0.5mΩ/√Hz sensitivity, and 106dB SNR.
- In Paper 20.2, Pohang University of Science and Technology presents an 8-channel ECG-monitoring IC with time-division-multiplexing multi-channel CMI cancellation and a digital-assisted DC-servo loop using one shared AFE. This IC achieves 18.6µW/channel, and a total CMRR against CMI of 100dB up to V<sub>CMI</sub> of 20V<sub>PP</sub>.
- In Paper 20.3, Daegu Gyeongbuk Institute of Science and Technology presents a continuous-time capacitor-coupled 2<sup>nd</sup>-order ΔΣ biopotential recording IC with parasitic-insensitive input impedance boosting. It achieves 421MΩ input impedance and 178.1dB FOM<sub>SNDR</sub> with 10kHz bandwidth and 560mV<sub>PP</sub> input range.
- In Paper 20.4, EPFL and Cornell University present a 256-channel time-domain-multiplexed analog front-end with 16 high-voltage stimulators, and a tree-structured hierarchical neural network for closed-loop neuromodulation. The high-channel-count SoC occupies only 0.014mm<sup>2</sup> while dissipating 1.51µW/channel and 0.227µJ/classification.
- In Paper 20.5, Yonsei University, Kangwon National University, Incheon National University, and gBrain demonstrate an implantable neural recording system using body-coupled data/power transfer for freely behaving animals, removing the need for a headstage. It achieves a data rate of 20.48Mb/s at an energy efficiency of 32pJ/b.
- In Paper 20.6, Ulsan National Institute of Science and Technology introduces a sub-array multiplexing microelectrode array to multiplex 24,320 electrodes with 17.7µm pitch through 380 readout channels, enabling programmable channel selection while consuming 30.7mW.
- In Paper 20.7, the University of California at Berkeley presents a low-power sensor interface IC for continuous monitoring of blood oxygen (SpO<sub>2</sub>) and heart rate. The IC uses a 3.75µW transimpedance front-end (TFE) with a reconstruction-free sparse sampling technique that reduces the total system power by 70%.
- In Paper 20.8, Ulsan National Institute of Science and Technology presents a wide-DR impedance plethysmogram (IPG) IC for blood pressure and cardiovascular disease measurements. A mixed-mode baseline cancellation scheme and an artifact-detecting CT ADC are proposed to minimize vulnerability to artifacts, achieving a high SNR of 103.2dB and a wide DR of 145.2dB.
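Two of the figures in this list can be related with simple arithmetic: the electrode-to-channel multiplexing ratio in Paper 20.6 and the implied link power in Paper 20.5. The sketch below assumes an even split of electrodes across readout channels and that the quoted energy per bit applies at the quoted data rate; variable names are illustrative.

```python
# Paper 20.6: 24,320 electrodes shared by 380 readout channels.
ELECTRODES = 24_320
READOUT_CHANNELS = 380
electrodes_per_channel, remainder = divmod(ELECTRODES, READOUT_CHANNELS)
print(f"Electrodes per readout channel: {electrodes_per_channel} (remainder {remainder})")

# Paper 20.5: 20.48Mb/s at 32pJ/b (assumed to hold together).
data_rate_bps = 20.48e6
energy_per_bit_j = 32e-12
link_power_uw = data_rate_bps * energy_per_bit_j * 1e6
print(f"Implied data-link power: {link_power_uw:.2f} uW")
```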

### **Session 20 Highlights: Body and Brain Interfaces**

#### [20.1] A 0.5mΩ/√Hz 106dB SNR 0.45cm<sup>2</sup> Dry-Electrode Bioimpedance Interface with Current Mismatch Cancellation and Boosted Input Impedance of 100MΩ at 50kHz

#### [20.4] A 256-Channel 0.227µJ/class Versatile Brain-Activity Classification and Closed-Loop Neuromodulation SoC with 0.004mm<sup>2</sup>-1.51µW/channel Fast-Settling Highly Multiplexed Mixed-Signal Front-End

Paper 20.1 Authors: Q. Pan, T. Qu, F. Shan, B. Tang, Z. Hong, and J. Xu

Paper 20.1 Affiliation: Fudan University, Shanghai, China

Paper 20.4 Authors: U. Shin<sup>1,2</sup>, L. Somappa<sup>1</sup>, C. Ding<sup>1</sup>, B. Zhu<sup>1,2</sup>, Y. Vyza<sup>1</sup>, A. Trouillet<sup>1</sup>, S. Lacour<sup>1,3</sup>, and M. Shoaran<sup>1,3</sup>

**Paper 20.4 Affiliation:** <sup>1</sup>EPFL, Geneva, Switzerland, <sup>2</sup>Cornell University, Ithaca, NY, <sup>3</sup>Center for Neuroprosthetics, Geneva, Switzerland

Subcommittee Chair: Chris Van Hoof, IMEC, Leuven, Belgium

#### CONTEXT AND STATE OF THE ART

- Dry-electrode power-efficient wearable bioimpedance sensors offer noninvasive long-term health monitoring
- There is a trade-off between the input impedance and noise when a chopping scheme is applied at the analog front-end
- A new chopping scheme that minimizes the input impedance degradation allows a high input impedance and low noise simultaneously
- Innovations in on-chip classification of neurological signals perform inference-based machine learning with low power, low memory utilization, and low area
- Integration of high channel counts for neural interfacing and closed-loop neuromodulation with low power, low noise and low area

#### TECHNICAL HIGHLIGHTS

- Fudan University presents a dry-electrode impedance-measurement circuit that boosts the input impedance and lowers noise by using quiet chopping and pre-charging techniques
  - This IC achieves 100MΩ input impedance at 50kHz, 0.5mΩ/√Hz sensitivity, and 106dB SNR
- EPFL and Cornell University present a 256-channel time-domain multiplexed analog front-end with 16 high-voltage stimulators, and a tree-structured hierarchical neural network for closed-loop neuromodulation
  - The high-channel-count SoC occupies only 0.014mm<sup>2</sup> while dissipating 1.51µW/channel and 0.227µJ/classification

#### APPLICATIONS AND ECONOMIC IMPACT

- Innovations in health technologies improve performance in sensing biological signals
- Innovative analog front-end enables long-term dry-contact impedance measurement
- Incorporation of low-energy machine learning at the edge enables neural interfaces to treat on demand and in closed loop
- Lower power consumption while preserving performance enables the integration of more channels in lower area

## Session 21 Overview: Highlighted Chip Releases: Machine Learning and Digital Processing

#### **Invited Industry Session**

Session Chair: Dennis Sylvester, University of Michigan, Ann Arbor, MI

Session Co-Chair: Thomas Burd, AMD, Santa Clara, CA

Subcommittee Chair: Piet Wambacq, imec, Belgium

This session highlights four recent digital systems, spanning several application areas. In addition to machine learning accelerators (including network training), the session includes papers describing the design of CPUs that form the heart of the world's fastest supercomputer, as well as a highly efficient accelerator for Bitcoin mining. The papers describe a range of techniques used in modern commercial digital systems, including testability and defect tolerance issues, and also discuss the important role that software plays in hardware design. Thermal and power considerations are ubiquitous and discussed in the varied settings that these chips will be deployed in.

- In Invited Paper 21.1, SambaNova describes their dataflow architecture, designed to optimize execution of modern software and targeting large machine learning models. The 7nm chip has >40B transistors and can train a 1-trillion parameter natural language processing model in a quarter rack machine, with record out-of-the-box CNN training accuracy due to its dataflow unit architecture features.
- In Invited Paper 21.2, Fujitsu presents the CPU design used in the world's fastest supercomputer, Fugaku. The paper touches on interconnect strategies, energy efficiency techniques, and reliability concerns dealt with during the design of the A64FX 52-core processor at the heart of the 442 PetaFLOPS supercomputer.
- In Invited Paper 21.3, Intel presents a 7nm Bitcoin mining ASIC that consumes 55J per terahash. The chip incorporates ultralow voltage design, die voltage stacking, specialized clocking strategies, and other circuit and micro-architectural optimizations to achieve its high energy efficiency.
- In Invited Paper 21.4, Tenstorrent describes their Wormhole neural network training processor, fabricated in 12nm CMOS, occupying 771mm<sup>2</sup> and providing 430TOPS. The paper describes their approach to scaling out Wormhole, as well as design for testability, reliability in the face of defects, and global clocking strategies.

## Session 21 Highlights: Highlighted Chip Releases: Machine Learning and Digital Processing

#### [21.3] Bonanza Mine: An Ultra-Low Voltage Energy-Efficient Bitcoin Mining ASIC

**Paper Authors:** Chandra Katta<sup>1</sup>, Vikram B. Suresh<sup>2</sup>, Srinivasan Rajagopalan<sup>1</sup>, Tao Z. Zhou<sup>3</sup>, AmitKumar Patel<sup>1</sup>, Raju Rakha<sup>1</sup>, Nikhil Krishna Gopalakrishna<sup>1</sup>, Sanu Mathew<sup>2</sup>, Ajat Hukkoo<sup>1</sup>

Paper Affiliation: <sup>1</sup>Intel Corporation, Santa Clara, CA, <sup>2</sup>Intel Corporation, Hillsboro, OR, <sup>3</sup>Intel Corporation, San Diego, CA

Subcommittee Chair: Piet Wambacq, imec, Belgium

#### CONTEXT AND STATE OF THE ART

- Bitcoin mining continues to grow in popularity and is very power-hungry, currently estimated to use 91TWh annually, which is more than Finland's annual electricity usage.
- Accelerators can optimize energy efficiency for this compute-intensive task; system integration issues such as cooling should also be considered.

#### TECHNICAL HIGHLIGHTS

- Intel presents a Bitcoin mining ASIC that employs a wide range of micro-architectural and circuit techniques to achieve excellent energy efficiency.
  - The 7nm ASIC operates at 355mV and is deployed in a 25-deep board voltage stacked configuration (8.875V main supply) that takes advantage of the consistent current requirements during mining.
  - The 14.16mm<sup>2</sup> chip operates at 1.6GHz, generating 137B hashes per second at 7.5W, consistent with the 55J per terahash efficiency. The chip supports high-performance, balanced, and power-saving modes by identifying naturally slower and faster compute engines on a die and modulating their use.
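The main supply figure follows from the voltage-stacking arrangement: chips connected in series share the same current, and their supply voltages add. A minimal sketch, assuming each level of the stack sits at the same 355mV; the function name is illustrative.

```python
def stack_supply_mv(per_chip_mv: int, chips_in_series: int) -> int:
    """Total supply voltage (mV) for identical chips stacked in series."""
    return per_chip_mv * chips_in_series


# Figures quoted for Paper 21.3: 355mV per chip, 25-deep stack.
main_supply_mv = stack_supply_mv(355, 25)
print(f"Main supply: {main_supply_mv / 1000} V")
```

This reproduces the 8.875V main supply. Stacking relies on every level drawing nearly the same current, which is why the consistent current demand of mining matters.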

#### APPLICATIONS AND ECONOMIC IMPACT

- Bitcoin mining has migrated from CPUs to GPUs and now to ASICs. This design represents the most technically advanced Bitcoin mining ASIC to date.
- The cryptocurrency mining hardware market is forecast to grow by \$2.8B during 2021-2025.

## Session 22 Overview: Cryo-Circuits and Ultra-Low-Power Intelligent IoT

#### **Technology Directions Subcommittee**

Session Chair: Sudip Shekhar, University of British Columbia, Vancouver, BC

Session Co-Chair: Radu Berdan, Kioxia, Kawasaki, Japan

Subcommittee Chair: Makoto Nagata, Kobe University, Kobe, Japan

This session captures some of the trends in the emerging areas of quantum computing and intelligent IoTs. The first three papers describe qubit controlling systems and circuits operating at 3.5K. Next, a chip leveraging backscattering communication is described for low power IoTs. The last three papers leverage mixed-signal circuits to significantly improve the energy efficiency of AIoT features - voice activity detection, keyword spotting and wake-up.

- In Paper 22.1, IBM presents a cryogenic 14nm FinFET qubit state controller including a domain-specific processor and a 4.5-to-5.5GHz RF AWG operating at 3.5K for superconducting transmons.
- In Paper 22.2, the authors at Pohang University of Science and Technology describe a cryo-CMOS controller IC in 40nm CMOS operating at 3.5K. The chip includes 2 PLLs, 4 pulse modulation channels and 2 receiver channels, with each channel capable of accessing 16 qubits using frequency-division multiplexing.
- In Paper 22.3, the authors at the Southern University of Science and Technology and EPFL describe a hybrid Class B/C mode-switching VCO in a 130nm SiGe BiCMOS process, operating from 13.9 to 18.1GHz at 3.5K.
- In Paper 22.4, the authors at UCSD present a WiFi and Bluetooth backscattering combo chip in 65nm CMOS featuring beam steering via a fully-reflective phased-controlled multi-antenna termination technique. Operation over 56 meters is demonstrated.
- In Paper 22.5, the University of Macau introduces a 108nW, 0.8mm<sup>2</sup> analog voice activity detector (VAD) fabricated in 28nm CMOS which achieves a hit rate of 90% for speech and 94% for non-speech at a classification rate of 100Hz.
- In Paper 22.6, the authors at ETH Zurich and KAIST, Korea describe a 72µW keyword spotter (KWS) ASIC in 65nm CMOS that uses time-domain feature extraction, enabling an 86% detection accuracy on the 12-class Google Speech Command Dataset (GSCD). The 6.48mm<sup>2</sup> chip processes 61 frames/s and is powered by an on-chip solar-energy harvester.
- In Paper 22.7, the authors at Peking University present a clock-free spiking neural network for AIoT wake-up functions which leverages computing-in-memory to achieve long-term power consumption of 82nW whilst classifying the MIT-BIH arrhythmia dataset. The chip consumes 0.53pJ/SOP and has 40µs latency.
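To compare always-on detectors that run at different rates, power can be turned into energy per decision by dividing by the decision rate. The sketch below assumes the quoted average power is spent entirely on producing decisions at the quoted rate, which the papers may define differently; the function name is illustrative.

```python
def energy_per_decision_j(avg_power_w: float, decisions_per_s: float) -> float:
    """Average energy (J) spent per decision at a fixed decision rate."""
    return avg_power_w / decisions_per_s


# Paper 22.5: 108nW VAD at a 100Hz classification rate.
vad_nj = energy_per_decision_j(108e-9, 100.0) * 1e9
# Paper 22.6: 72uW KWS processing 61 frames/s.
kws_uj = energy_per_decision_j(72e-6, 61.0) * 1e6

print(f"VAD: {vad_nj:.2f} nJ per classification")
print(f"KWS: {kws_uj:.2f} uJ per frame")
```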

## Session 22 Highlights: Cryo-Circuits and Ultra-Low-Power Intelligent IoT

#### [22.1] A Cryo-CMOS low-power semi-autonomous qubit state controller in 14nm FinFET technology

**Paper Authors:** David J Frank<sup>1</sup>, Sudipto Chakraborty<sup>1</sup>, Kevin Tien<sup>1</sup>, Pat Rosno<sup>2</sup>, Thomas Fox<sup>1</sup>, Mark Yeck<sup>1</sup>, Joseph A Glick<sup>1</sup>, Raphael Robertazzi<sup>1</sup>, Ray Richetta<sup>2</sup>, John F Bulzacchelli<sup>1</sup>, Daniel Ramirez<sup>2</sup>, Dereje Yilma<sup>2</sup>, Andrew Davies<sup>2</sup>, Rajiv V Joshi<sup>1</sup>, Shawn D Chambers<sup>2</sup>, Scott Lekuch<sup>1</sup>, Ken Inoue<sup>1</sup>, Devin Underwood<sup>1</sup>, Dorothy Wisnieff<sup>1</sup>, Chris Baks<sup>1</sup>, Donald Bethune<sup>3</sup>, John Timmerwilke<sup>1</sup>, Blake R Johnson<sup>1</sup>, Brian P Gaucher<sup>1</sup>, Daniel J Friedman<sup>1</sup>

**Paper Affiliation:** <sup>1</sup>IBM T. J. Watson Research Center, Yorktown Heights, NY, <sup>2</sup>IBM Systems, Rochester, MN, <sup>3</sup>IBM Almaden Research Center, San Jose, CA

Subcommittee Chair: Makoto Nagata, Kobe University, Kobe, Japan, Technology Directions

#### CONTEXT AND STATE OF THE ART

- Scaling quantum computers to above 10<sup>5</sup> qubits requires individual and accurate qubit control in a 10mK environment, which is intractable if performed by control lines from room temperature due to mechanical connector congestion and RF loss.
- Among candidate solutions, time-division multiplexing of control signals is challenging since qubits need simultaneous activation, and frequency-division multiplexing can be undesirable because it creates extra tones that cause unwanted qubit excitations.

#### TECHNICAL HIGHLIGHTS

- A scalable, 3.5K cryogenic 14nm FinFET qubit state controller is implemented, including a domain-specific processor with both general and special purpose instructions and a 4.5-to-5.5GHz RF AWG. Measured transmon coherence times are comparable to conventional room temperature control results.
  - This cryo-CMOS qubit state controller achieves lower power/qubit under active control while providing AWG functionality, general purpose programming and semi-autonomous operation.
  - It is the first such controller built in 14nm FinFET technology.

#### APPLICATIONS AND ECONOMIC IMPACT

- Fulfilling the promise of quantum computers to solve currently intractable problems requires scaling up the number of qubits whilst maintaining accurate control and flexibility in a <4K environment.
- Moving qubit state controllers on chip, close to the qubit, is an important stepping stone in creating large-scale, commercially viable quantum computers.

## Session 23 Overview: Frequency Synthesizers

#### **RF Subcommittee**

Session Chair: Jun Yin, University of Macau, Macau, China

Session Co-Chair: Jeremy Dunworth, Qualcomm Technologies, San Diego, CA

Subcommittee Chair: Jan Craninckx, imec, Belgium

This session presents the latest advances in digital and analog phase-locked loops (PLLs) from 2.4 to 28GHz for high-performance wireless applications.

- In Paper 23.1, TSMC demonstrates a 4-to-17.5GHz cascade PLL employing LC and ring oscillators in the 1<sup>st</sup> and 2<sup>nd</sup> PLLs with
  programmable reference clock realignments to achieve a wide frequency-tuning range and low jitter. Implemented in 5nm
  FinFET CMOS, it achieves 204fs<sub>rms</sub> jitter and -241dB FoM.
- In Paper 23.2, KAIST presents an ultra-low-jitter fractional-N ring-oscillator (RO) digital PLL that dynamically selects the most suitable phase among the 8 phases of the RO to reduce the DTC range and thermal noise. The jitter and FoM near 5.2GHz are 188fs<sub>rms</sub> and –243dB, respectively.
- In Paper 23.3, Delft University of Technology introduces a fractional-N digital PLL employing a time-mode arithmetic unit to perform the DTC and time-amplifier tasks simultaneously. It achieves a -59dBc worst-case fractional spur and 182fs<sub>rms</sub> jitter at 2.68GHz, while consuming 3.48mW, resulting in -249.4dB FoM.
- In Paper 23.4, Seoul National University presents a ring-oscillator-based injection-locked digital PLL incorporating a reference octupler with a probability-based adaptive calibration and a variable-step-size LMS algorithm. It achieves 177/223fs<sub>rms</sub> jitter at 8/16GHz.
- In Paper 23.5, Broadcom demonstrates a harmonic-mixing-based fractional-N PLL with a coupled mm-wave VCO to extend the loop bandwidth and suppress VCO noise. Consuming 12.9mW, the PLL with a sub-100MHz reference clock achieves 88fs<sub>rms</sub> jitter and -250dB FoM.
- In Paper 23.6, Politecnico di Milano presents a 28nm fractional-N bang-bang PLL leveraging a proportional-integral gear shift of the loop-filter gains and an adaptive frequency switching technique to overcome the jitter-versus-settling-time trade-off. It achieves 68.6fs<sub>rms</sub> jitter at near-integer 8.75GHz channels and 1.56µs locking time across the 8.5-to-10GHz tuning range.
- In Paper 23.7, the University of Electronic Science and Technology of China reports a 25.8GHz integer-N PLL with a time-amplifying phase-frequency detector to suppress charge-pump noise without an extra frequency-locked loop. The PLL achieves 60fs<sub>rms</sub> jitter and -252.8dB FoM.
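
The jitter figures of merit quoted in this list can be cross-checked against the definition commonly used for PLLs, FoM = 10·log10[(σ/1s)² · (P/1mW)]. The press kit does not spell out the definition, so the sketch below assumes it; the helper name `jitter_fom_db` is illustrative. Under that assumption, the jitter and power numbers given for Papers 23.3 and 23.5 reproduce the reported FoM values.

```python
import math

def jitter_fom_db(jitter_rms_s: float, power_w: float) -> float:
    """Jitter FoM in dB: 10*log10((sigma / 1 s)^2 * (P / 1 mW)).

    Assumed definition; the helper name is illustrative.
    """
    return 20.0 * math.log10(jitter_rms_s) + 10.0 * math.log10(power_w / 1e-3)

# Paper 23.3: 182 fs rms jitter, 3.48 mW
fom_23_3 = jitter_fom_db(182e-15, 3.48e-3)
# Paper 23.5: 88 fs rms jitter, 12.9 mW
fom_23_5 = jitter_fom_db(88e-15, 12.9e-3)

print(f"23.3: {fom_23_3:.1f} dB")  # -249.4 dB
print(f"23.5: {fom_23_5:.1f} dB")  # -250.0 dB
```

Because the FoM combines jitter and power on a logarithmic scale, halving the jitter at constant power improves it by about 6dB, while halving the power at constant jitter improves it by about 3dB.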

### **Session 23 Highlights: Frequency Synthesizers**

[23.1] A Cascaded PLL (LC-PLL + RO-PLL) with a Programmable Double Realignment Achieving 204fs Integrated Jitter (100kHz to 100MHz) and -72dB Reference Spur

[23.2] A 188fs<sub>rms</sub>-Jitter and –243dB-FoM<sub>jitter</sub> 5.2GHz-Ring-DCO-Based Fractional-N Digital PLL with a 1/8 DTC-Range-Reduction Technique Using a Quadruple-Timing-Margin Phase Selector

**Paper 23.1 Authors:** Tsung-Hsien Tsai<sup>1</sup>, Ruey-Bin Sheen<sup>1</sup>, Sheng-Yun Hsu<sup>1</sup>, Ya-Tin Chang<sup>1</sup>, Chih-Hsien Chang<sup>1</sup>, Robert Bogdan Staszewski<sup>2</sup>

Paper 23.1 Affiliation: <sup>1</sup>TSMC, Hsinchu, Taiwan, <sup>2</sup>University College Dublin, Dublin, Ireland

Paper 23.2 Authors: Chanwoong Hwang\*, Hangi Park\*, Taeho Seong, Jaehyouk Choi

Paper 23.2 Affiliation: KAIST, Daejeon, Korea

Subcommittee Chair: Jan Craninckx, imec, Belgium

#### CONTEXT AND STATE OF THE ART

- Area-efficient low-jitter PLLs are required in modern SoCs for advanced wireless and wired applications that integrate an increasing number of PLLs.
- The inherent high phase noise of the ring oscillator (RO) limits the jitter performance of the PLL, especially at a large feedback division ratio.
- Innovations in architectures and calibration techniques for PLLs employing ROs help improve the jitter performance.

#### TECHNICAL HIGHLIGHTS

- TSMC demonstrates a cascade PLL in 5nm FinFET CMOS achieving low jitter and wide frequency tuning range.
  - A 4-to-17.5GHz cascade PLL employs LC and ring oscillators in the 1<sup>st</sup> and 2<sup>nd</sup> PLLs with programmable reference clock realignments. It achieves 204fs<sub>rms</sub> jitter and -241dB FoM.
- KAIST presents a ring-DCO-based fractional-N digital PLL in 65nm CMOS achieving the lowest jitter among prior-art ring-oscillator-based fractional-N frequency synthesizers.
  - A 5.2GHz fractional-N digital PLL dynamically selects the most suitable of the ring DCO's 8 phases to reduce the DTC range and thermal noise. It achieves 188fs<sub>rms</sub> jitter and –243dB FoM near 5.2GHz.

- Phase noise (jitter), silicon area, and frequency tuning range are essential design metrics that are traded off in the design of CMOS frequency synthesizers for modern wireless/wired systems.
- These PLLs present dedicated solutions to overcome the foregoing trade-offs in PLL designs.

# Session 24 Overview: Low-Power and UWB Radios for Communication and Ranging

#### Wireless Subcommittee

Session Chair: Maryam Tabesh, Google Inc., Mountain View, CA

Session Co-Chair: Jan Prummel, Renesas Electronics, 's-Hertogenbosch, Netherlands

Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR

This session presents energy-efficient wireless systems for communication and ranging applications. The first three papers focus on ranging and ultra-wideband systems, whereas the next four papers present various techniques to achieve low power, small area, high sensitivity, and superior signal-to-interference rejection.

- In Paper 24.1, The University of Michigan and United Semiconductor present a crystal-less frequency hopping receiver that achieves long-range (up to 483m), non-line-of-sight RF localization with decimeter-level accuracy while consuming only 3.9mW.
- In Paper 24.2, imec-Netherlands, TU Delft, Eindhoven University of Technology and imec utilize an IR-UWB transmitter with a 3-D hybrid impulse modulation to enable 1.66Gb/s data rate, 5.8pJ/b energy efficiency, and 15cm of transmission range from 15mm below skin for intracortical brain-computer interfaces.
- In Paper 24.3, Newradio Technology Co. presents a highly integrated 1T3R transceiver for the IEEE 802.15.4/4z standard. The highly reconfigurable transmitter uses arbitrary pulse engineering to achieve precise spectrum control while supporting multiple data rates up to 31.2Mb/s.
- In Paper 24.4, Renesas Electronics presents a BLE RF transceiver with on-chip antenna tuning that utilizes self IQ-phase correction to achieve a high average receiver image rejection of 39dB across PVT variations without any calibration, while occupying a small area of 0.84mm<sup>2</sup>.
- In Paper 24.5, University of Macau, Sun Yat-Sen University, and Instituto Superior Tecnico/University of Lisboa present a 266µW BLE receiver featuring an N-path passive balun-LNA and a pipeline down-mixing baseband signal extraction scheme achieving 77dB SFDR and -3dBm OOB-B<sub>-1dB</sub>.
- In Paper 24.6, The University of Michigan demonstrates a dual-chirp OOK modulation for interference mitigation in a ULP receiver that provides -41dB SIR while achieving a high sensitivity and low power consumption of -103dBm and 110µW, respectively.
- In Paper 24.7, KAIST, PHYCHIPS, and Hanbat National University present a 900MHz LPWAN radio with a -124dBm sensitivity reconfigurable data/wake-up receiver, which achieves -76dB of SIR and 114µW duty-cycled power consumption in wake-up receiver mode.

## Session 24 Highlights: Low-Power and UWB Radios for Communication and Ranging

[24.2] A 1.66Gb/s and 5.8pJ/b Transcutaneous IR-UWB Telemetry System with Hybrid Impulse Modulation for Intracortical Brain Computer Interfaces

[24.5] A 266µW Bluetooth Low-Energy (BLE) Receiver Featuring an N-Path Passive Balun-LNA and a Pipeline Down-Mixing BB-Extraction Scheme Achieving 77dB SFDR and -3dBm OOB-B<sub>-1dB</sub>

**Paper 24.2 Authors:** Minyoung Song<sup>1</sup>, Yu Huang<sup>1,2</sup>, Yiyu Shen<sup>1</sup>, Chengyao Shi<sup>1,3</sup>, Arjan Breeschoten<sup>1</sup>, Mario Konijnenburg<sup>1</sup>, Huib Visser<sup>1</sup>, Jac Romme<sup>1</sup>, Barundeb Dutta<sup>4</sup>, Morteza S. Alavi<sup>2</sup>, Christian Bachmann<sup>1</sup>, Yao-Hong Liu<sup>1</sup>

**Paper 24.2 Affiliation:** <sup>1</sup>imec-Netherlands, Eindhoven, The Netherlands, <sup>2</sup>TU Delft, Delft, The Netherlands, <sup>3</sup>Eindhoven University of Technology, Eindhoven, The Netherlands, <sup>4</sup>imec, Leuven, Belgium

Paper 24.5 Authors: Haijun Shao<sup>1</sup>, Pui-In Mak<sup>1</sup>, Gengzhen Qi<sup>2</sup>, Rui P. Martins<sup>1,3</sup>

Paper 24.5 Affiliation: <sup>1</sup>University of Macau, Macau, China, <sup>2</sup>Sun Yat-Sen University, Zhuhai, China, <sup>3</sup>Instituto Superior Tecnico/University of Lisboa, Lisbon, Portugal

Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR, Wireless

#### CONTEXT AND STATE OF THE ART

- Intra-cortical extracellular neural sensing is being rapidly and widely adopted in clinical research and brain-computer interfaces. As the number of sensing channels continues to double every 6 years, wireless telemetry modules need to achieve data transfer rates of up to Gb/s. Moreover, to enable transcutaneous implants, the module should be miniaturized to ~3cm<sup>2</sup>, with a transmission range of up to 10cm and power consumption limited to ~10mW to avoid excessive tissue heating.
- Low power consumption is critical in Bluetooth Low-Energy (BLE) receivers to extend the battery life of many portable devices. At the same time, high sensitivity and linearity are critical to ensure sufficient wireless range and resilience to interferers, which results in an improved user experience. Typical active BLE receivers that use low-voltage low-noise amplifiers to achieve sub-mW power consumption often struggle to maintain performance.
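
The ~10mW budget for implanted telemetry can be related to the headline numbers of Paper 24.2 with a one-line calculation: average power is data rate multiplied by energy per bit. The sketch below assumes the 5.8pJ/b figure covers the transmitter at the full 1.66Gb/s rate, which the press kit does not state explicitly; the function name is illustrative.

```python
def tx_power_mw(data_rate_bps: float, energy_per_bit_j: float) -> float:
    """Average power in mW implied by a data rate and an energy-per-bit figure."""
    return data_rate_bps * energy_per_bit_j * 1e3

# Paper 24.2 figures: 1.66 Gb/s at 5.8 pJ/b
power_mw = tx_power_mw(1.66e9, 5.8e-12)
budget_mw = 10.0  # approximate tissue-heating limit quoted in the text

print(f"implied power: {power_mw:.2f} mW (budget ~{budget_mw:.0f} mW)")
```

Under that assumption the implied power is roughly 9.6mW, just inside the quoted limit.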

#### TECHNICAL HIGHLIGHTS

- imec-Netherlands, TU Delft, Eindhoven University of Technology and imec introduce an IR-UWB transmitter with 1.66Gb/s data rate, 5.8pJ/b energy efficiency, and 15cm of transmission range from 15mm below skin for intracortical brain-computer interfaces.
  - A hybrid impulse modulation employing a digital polar-based IR-UWB transmitter is used to increase data-rate while significantly reducing the overall pulse signal-to-noise ratio requirements.
- University of Macau, Sun Yat-Sen University, and Instituto Superior Tecnico/University of Lisboa present a 266µW BLE receiver with 77dB SFDR and -3dBm OOB-B<sub>-1dB</sub>.
  - An N-path passive balun-LNA front-end is used to improve both SFDR and OOB-B<sub>-1dB</sub> by 19dB compared to recent publications, while consuming 28% less power.

- UWB radio technology promises high data rate and precise positioning over a small distance, which can be useful in many
  consumer electronic and brain-computer interface applications. In small devices and even more in transcutaneous implants, low
  volume and low power consumption are key enablers for UWB technology.
- Ultra-low-power radios are the main building blocks of Internet-of-Everything connectivity. Bluetooth Low-Energy (BLE) radios are extensively used for wireless connectivity in many small portable devices, where longer battery life dramatically improves user experience.

#### **Data Converter Subcommittee**

Session Chair: Dominique Morche, LETI, Grenoble, France

Session Co-Chair: Yan Zhu, University of Macau, Macau, China

Subcommittee Chair: Michael Flynn, University of Michigan, Ann Arbor, MI

Noise shaping continues to drive high-DR ADC performance in many applications with kHz bandwidth, including sensor and audio interfaces. The first three ΔΣ ADC papers in this session reach SNDRs higher than 92dB and demonstrate significant progress in SFDR (123.2dB), DR (96.9dB), or compactness and DR (108.8dB). Noise shaping is also exploited in the following two papers to extend the bandwidth of high-DR ADCs to 360MHz with 68dB DR and to 120MHz with 114dB SFDR. The session wraps up with two noise-shaping SAR converters: a new architecture exploited to reduce the OSR to 5 while achieving close to 14 ENOB, and a TI-Pipe-SAR with a bandwidth tunable from 125MHz to 1GHz.

- In Paper 25.1, UCSD describes an ADC which consumes 4.4µW from a 0.8V supply achieving a 92.1dB SNDR and 123.2dB SFDR over a 2.5kHz bandwidth. Pseudo virtual ground is exploited in a capacitively coupled VCO-based ADC to linearize and stabilize the system.
- In Paper 25.2, Zhejiang University presents a fully dynamic MASH ADC that consumes 2.87µW while achieving 96.9dB DR and 94.0dB SNDR in 1kHz BW at an OSR of 125×. A dynamic-body-biasing-assisted correlated level shifting (CLS) technique is used to boost the DC gain of a single-stage FIA with only one reservoir capacitor.
- In Paper 25.3, Oregon State University shows a compact (0.0375mm<sup>2</sup>), low-power (203.5µW), high-resolution (108.8dB DR) 180nm DT DSM audio ADC. The design uses the pseudo-pseudo-differential switched-capacitor technique, which utilizes single-ended circuits.
- In Paper 25.4, Eindhoven University of Technology presents a 5GHz CT MASH ADC achieving 68dB DR in 360MHz BW. Compared to state-of-the-art broadband CT ADCs, it achieves 5dB higher FoM<sub>S</sub> and 2× lower FoM<sub>W</sub> while occupying a smaller area.
- In Paper 25.5, NXP Semiconductors introduces a CTΔΣ ADC achieving -101dBc THD and -105dBc IM3 with 120MHz bandwidth. The offset calibration and digital DAC error correction circuitries are implemented on-chip and do not require any external test/calibration signal.
- In Paper 25.6, Georgia Institute of Technology describes a 4<sup>th</sup>-order NS-SAR featuring a low-OSR design and built-in buffering. The prototype in 65nm achieves 84.1dB SNDR with 500kHz BW (OSR=5) and 133.8µW (with input buffer).
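
The converter metrics in this list are related by a few standard formulas. The sketch below applies them to the Paper 25.6 numbers, assuming the usual textbook definitions (ENOB = (SNDR − 1.76)/6.02, Schreier FoM = SNDR + 10·log10(BW/P), and f<sub>s</sub> = 2·BW·OSR); the function names are illustrative, and the derived FoM and sampling rate are computed here, not reported in the press kit.

```python
import math

def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR: (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02

def schreier_fom_db(sndr_db: float, bandwidth_hz: float, power_w: float) -> float:
    """Schreier figure of merit: SNDR + 10*log10(BW / P)."""
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)

def sample_rate_hz(bandwidth_hz: float, osr: float) -> float:
    """Sampling rate implied by an oversampling ratio: fs = 2 * BW * OSR."""
    return 2.0 * bandwidth_hz * osr

# Paper 25.6: 84.1 dB SNDR, 500 kHz BW, OSR = 5, 133.8 uW
print(f"ENOB: {enob(84.1):.2f} bits")
print(f"FoM:  {schreier_fom_db(84.1, 500e3, 133.8e-6):.1f} dB")
print(f"fs:   {sample_rate_hz(500e3, 5) / 1e6:.1f} MS/s")
```

The ENOB comes out at about 13.7 bits, which agrees with the "close to 14 ENOB" description in the session overview.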

## Session 25 Highlights: Noise-Shaping ADCs

## [25.4] A 28nm 6GHz 2b Continuous-Time $\Delta\Sigma$ ADC with -101dBc THD and 120MHz Bandwidth Using Digital DAC Error Correction

**Paper Authors:** Muhammed Bolatkale<sup>1</sup>, Robert Rutten<sup>1</sup>, Hans Brekelmans<sup>1</sup>, Shagun Bajoria<sup>1</sup>, Yihan Gao<sup>1</sup>, Bernard Burdiek<sup>2</sup>, Lucien Breems<sup>1</sup>

Paper Affiliation: <sup>1</sup>NXP Semiconductors, Eindhoven, The Netherlands, <sup>2</sup>NXP Semiconductors, Hamburg, Germany

Subcommittee Chair: Michael Flynn, University of Michigan, Ann Arbor, MI, Data Conversion

#### CONTEXT AND STATE OF THE ART

- High-dynamic-range ADCs have always been required for bandwidths ranging from tens of kHz for sensors and audio to tens of MHz for broadband AM/FM radio. Today, the bandwidth of such ADCs is limited to a few tens of MHz, which limits the application range.
- State-of-the-art ADCs with bandwidths exceeding 100MHz have been reported, with excellent linearity numbers up to -83dBc THD [4-5], but this is far off from the <-100dBc THD achieved by the ultra-highly linear 1b ADCs.

#### TECHNICAL HIGHLIGHTS

- A continuous-time ΔΣ ADC achieves -101dBc/-105dBc THD/IM3 and 71.5dB SNDR over a 120MHz bandwidth in 28nm CMOS, with a modulator core consumption of 108.8mW.
  - The DAC and quantizer employ digital background calibration enabling wideband linearity performance.
  - The inverter-based amplifiers are supplied by an LDO designed to deliver an optimal supply voltage to keep the transconductance (gm) of the amplifier stages constant over temperature and process variations. Measurements of multiple samples did not show noticeable temperature sensitivity of the linearity between -40 and +135 degrees Celsius and over global process corners (TT, FF, SS).

- Advanced circuit techniques for higher linearity and wider BW ADCs improve application performance (digital radio, radar).
- The offset calibration and digital DAC error correction circuitries are implemented on-chip and do not require any external test/calibration signal, and operate in the background, reducing manufacturing cost significantly while preserving performance.

# Session 26 Overview: Highlighted Chip Releases: Systems and Quantum Computing

#### **Invited Industry Session**

Session Chair: Fabio Sebastiano, Delft University of Technology, The Netherlands

Session Co-Chair: Alice Wang, Everactive, Plano, TX

Subcommittee Chair: Piet Wambacq, imec, Belgium

This session highlights innovations in Systems and Quantum Computing announced within the last year. As Moore's law is slowing down, products are moving towards complex system-level innovations to keep up with power and performance requirements. Quantum computing, which was once a dream, is becoming a reality with products on the horizon. These papers delve into practical system-related topics, mass-production related challenges and solutions (e.g., cost, reliability, thermal/voltage issues, packaging, etc.) in addition to circuit content and silicon measurement results.

- In invited Paper 26.1, Google presents their architecture of the Sycamore quantum computer, beginning with the quantum processor level and continuing through the system implementation. Google's Quantum AI Roadmap charts the path from today's ~50 qubit systems to future fault-tolerant quantum computers.
- In invited Paper 26.2, IBM Quantum Systems uses software and hardware approaches such as direct RF control and measurement, partially centralized control of distributed systems, and compiler infrastructure that mimics single-threaded execution of OpenQASM circuits across distributed hardware. These solutions are presented in the context of a co-design strategy to enable flexibility amid a continually changing quantum hardware landscape.
- In invited Paper 26.3, Meta presents a digital pixel sensor (DPS) with an optimal sensor architecture based on a distributed compute architecture for Augmented Reality (AR) applications, whose requirements of lowest power, best performance, and minimal form factor make AR sensors the new frontier.
- In invited Paper 26.4, AMD's V-Cache marks the industry's first 3D stacked product that attaches additional cache onto a high-performance processor through hybrid bonding, a technology that offers significant bandwidth and power benefits over state-of-the-art µbump-based approaches. V-Cache expands Zen3's on-die L3 cache from 32MB to 96MB, providing up to 2TB/s of bandwidth and a 15% average gaming performance uplift.

## Session 26 Highlights: Quantum Computing Invited Papers

## [26.1] Beyond-Classical Computing Using Superconducting Quantum Processors

#### [26.2] Design Considerations for Superconducting Quantum Systems

Paper 26.1 Author: J. Bardin<sup>1,2</sup>

Paper 26.1 Affiliation: <sup>1</sup>Google Quantum AI, Goleta, CA, <sup>2</sup>University of Massachusetts, Amherst, MA

Paper 26.2 Authors: G. Zettles<sup>1</sup>, S. Willenborg<sup>1</sup>, B. Johnson<sup>2</sup>, A. Wack<sup>2</sup>, B. Allison<sup>1</sup>

Paper 26.2 Affiliation: <sup>1</sup>IBM, Rochester, MN, <sup>2</sup>IBM, Yorktown Heights, NY

Subcommittee Chair: Piet Wambacq, imec, Belgium

#### CONTEXT AND STATE OF THE ART

- Quantum computing concepts emerged in the 1970s and 1980s as a basis for efficient simulation engines. While many applications were envisioned for quantum computing, until recently they have remained hypothetical.
- The state of the art in quantum computing has demonstrated tens of qubits and shown superiority over classical machines on a given task. The open question is how to scale up quantum computers to solve large problems such as quantum simulations.

#### TECHNICAL HIGHLIGHTS

- Google presents their architecture of the Sycamore quantum computer, beginning with the quantum processor level and continuing through the system implementation
  - Google's Quantum AI Roadmap charts the path from today's ~50 qubit systems to future fault-tolerant quantum computers
- IBM Quantum Systems have a co-design strategy to enable flexibility amid a continually changing quantum hardware landscape
  - IBM's solution uses software and hardware approaches such as direct RF control and measurement, partially centralized control of distributed systems, and compiler infrastructure that mimics single-threaded execution of OpenQASM circuits across distributed hardware

- Applications envisioned for quantum computing include cybersecurity, drug development, financial modeling, traffic optimization, weather forecasting and climate change, and artificial intelligence.
- The Quantum Computing market size is expected to reach USD 1.8 billion by 2026, from USD 472 million in 2021, at a CAGR of 30.2% during the forecast period
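
The quoted growth rate can be sanity-checked with the standard compound-annual-growth-rate formula, CAGR = (end/start)^(1/years) − 1. The sketch below uses the rounded endpoints given in the bullet above; the function name is illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start)**(1 / years) - 1."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Rounded endpoints from the text: USD 472 million (2021) -> USD 1.8 billion (2026)
rate = cagr(472e6, 1.8e9, years=5)
print(f"implied CAGR: {rate * 100:.1f}%")
```

With these rounded inputs the result is about 30.7%, close to the quoted 30.2%. The small gap would be consistent with the 2026 figure having been rounded up to USD 1.8 billion, although the press kit does not say so.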

### **Session 26 Highlights: Augmented Reality Invited Paper**

#### [26.3] Augmented Reality – the Next Frontier of Image Sensors and Compute Systems

Paper Authors: C. Liu, S. Chen, T.-Hsun T., B. De Salvo, and J. Gomez

Paper Affiliation: Meta, Redmond, WA

Subcommittee Chair: Piet Wambacq, imec, Belgium

#### CONTEXT AND STATE OF THE ART

- Augmented Reality (AR) will be the next great wave of human-oriented computing, dominating our relationship with the digital world for the next 50 years
- The combined requirements of lowest power, best performance, and minimal form factor make AR sensors the new frontier

#### TECHNICAL HIGHLIGHTS

- Meta presents a digital pixel sensor (DPS) with an optimal sensor architecture based on a distributed compute architecture.
  - The 4.6µm DPS pixel has an in-pixel ADC and 10b SRAM
  - The proposed triple quantization (3Q) scheme combines a time-to-saturation (TTS) quantization mode and two linear ADC modes within a single exposure to achieve 127dB intra-scene DR
  - The sensor has a 512×512 effective resolution and consumes only 5.75mW in 3Q operation at 30fps

- As augmented reality becomes more sophisticated, new applications emerge such as surgical training, virtual retail experience, design and modeling, advancing the tourism industry, and enriching classroom learning
- The augmented reality market was valued at USD 14.7 billion in 2020 and is projected to reach USD 88.4 billion by 2026 with a CAGR of 31.5% from 2021 to 2026

## Session 27 Overview: mm-Wave & Sub-6GHz Transmitters & Receivers for 5G Radios

#### **Wireless Subcommittee**

Session Chair: Shahriar Shahramian, Bell Laboratories - Nokia, New Providence, USA

Session Co-Chair: Byung-Wook Min, Yonsei University, Seoul, Korea

Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR

The session is driven by key advances in 5G radio integrated circuits at mm-wave & sub-6GHz. The first three papers present mm-wave transceivers for dual-band, broadband, and fast beamforming. The final two papers demonstrate sub-6GHz dual-mode, noise cancelling and digital-IF receivers.

- In Paper 27.1, Samsung Semiconductor and Samsung Electronics present a 16-channel, 28/39GHz dual-polarized 5G FR2 phased-array transceiver IC with a quad-stream IF transceiver supporting non-contiguous carrier aggregation up to 1.6GHz BW. The presented chipset offers a phased-array beamforming IC with 32 antenna ports and concurrent dual polarization for RX/TX, and its corresponding intermediate frequency IC.
- In Paper 27.2, Tokyo Institute of Technology presents a power-efficient 24-to-71GHz CMOS phased-array receiver utilizing a harmonic-selection technique that offers 36dB inter-band blocker rejection for 5G NR. The receiver mixer is driven by a tri-phase LO generator, and the desired harmonic mixing component is enhanced while the undesired harmonics are rejected. 256-QAM EVMs of -33.3dB, -30.9dB, -31.6dB, and -28.5dB are achieved at 28GHz, 39GHz, 47.2GHz, and 60.1GHz, respectively.
- In Paper 27.3, IBM T.J. Watson Research Center and Fujikura demonstrate a 24-to-30GHz 256-Element dual-polarized 5G phased array with fast beam-switching support for >30,000 beams enabled by an on-chip beam calculator with 200ns beam-setup and 8ns OTA-switching time. This 22-to-30GHz BF-IC provides coverage for n257/n258/n261 5G NR bands with 64- and 256-element peak PAE of 20% and 4dB receiver NF.
- In Paper 27.4, Nanyang Technological University presents a hybrid coupler-first 5GHz noise-cancelling dual-mode receiver with +10dBm in-band IIP3 in current-mode and 1.7dB NF in voltage-mode. The presented IC offers a choice between high in-band IIP3 or low NF based on the mode of operation with better than -80dBm LO leakage in both modes of operation.
- In Paper 27.5, Samsung Electronics presents a digital-IF receiver with hybrid-interference rejection and LO-sharing for inter/intra CA in a 14nm FinFET CMOS occupying an area of 4.84mm<sup>2</sup>. The receiver achieves an EVM of 2.5% by intra-4CA for B1 LTE10M and LTE20M in one single RX path, as well as inter-3CA for B3, B5, and B7 LTE 20M with shared LO-PLL. The estimated power saving by sharing a single LO-PLL is 17% compared to a 3-CA receiver with 3 separate PLLs.

## Session 27 Highlights: mm-Wave and sub-6GHz Transmitters & Receivers for 5G Radios

### [27.2] A Power-Efficient 24-to-71GHz CMOS Phased-Array Receiver Utilizing Harmonic-Selection Technique Supporting 36dB Inter-Band Blocker Rejection for 5G NR

### [27.5] A Single-Path Digital-IF Receiver Supporting Inter/Intra 5-CA with a Single Integer LO-PLL in 14nm CMOS FinFET

Paper 27.2 Authors: Jian Pang, Yi Zhang, Li Zheng, Minzhe Tang, Yijing Liao, Ashbir Aviat Fadila, Atsushi Shirane, Kenichi Okada

Paper 27.2 Affiliation: Tokyo Institute of Technology, Meguro-ku, Japan

**Paper 27.5 Authors:** Barosaim Sung, Hyun-Gi Seok, Jaekwon Kim, Jaehoon Lee, Taejin Jang, Ilhoon Jang, Youngmin Kim, Anna Yu, Jong-Hyun Jang, Jiyoung Lee, Jeongyeol Bae, Euiyoung Park, Sungjun Lee, Seokwon Lee, Joohan Kim, Beomkon Kim, Yong Lim, Seunghyun Oh, Jongwoo Lee, Byunghak Cho, Inyup Kang

Paper 27.5 Affiliation: Samsung Electronics, Hwaseong-si, Korea

Subcommittee Chair: Stefano Pellerano, Intel, Hillsboro, OR, Wireless

### CONTEXT AND STATE OF THE ART

- 5G radios need to operate over a multitude of closely spaced bands at both sub-6GHz and mm-wave frequencies to enable global spectrum compatibility.
- These radios must operate over a wide frequency range while rejecting strong blockers and interferers both within and outside the operating band.

#### **TECHNICAL HIGHLIGHTS**

- Tokyo Institute of Technology introduces a 24-to-71GHz CMOS phased-array receiver chip covering the majority of 5G FR2 bands.
  - The 65nm IC proposes a harmonic-selection technique to extend the operating bandwidth with low power consumption and attain an inter-band blocker tolerance in excess of 36dB.
- Samsung Electronics introduces a digital-IF receiver with hybrid-interference rejection and LO-sharing for inter/intra carrier aggregation.
  - The 14nm IC employs a single analog path and ADC for band group inter/intra-5CA with channel selection in the digital domain. By sharing a single LO-PLL, an estimated power saving of 17% is achieved when compared to a 3-CA receiver with 3 separate PLLs.

- 5G radio technology promises tens of Gb/s in data rate with a 10× reduction of latency. This will enable applications in enhanced mobile broadband, massive internet of things and mission-critical services.
- The availability of a single-chip receiver covering all available global mm-wave spectrum enables further cost reduction and agile 5G communication systems.
- With the expansion of 5G wireless networks the need for radios able to deliver Gb/s data rates in the presence of strong interferers becomes more and more significant.

### **Memory Subcommittee**

Session Chair: Bor-Doou Rong, Etron, Hsinchu, Taiwan

Session Co-Chair: Hye-Ran Kim, Samsung Electronics, Hwaseong, Korea

Subcommittee Chair: Meng-Feng Chang, National Tsing Hua University

DRAM memories continue to have a significant impact on a wide range of applications, including high-performance graphics, smartphones, server applications, and machine learning. A 192Gb 12-high 896GB/s HBM3 DRAM is presented for high-performance memories. A 16Gb GDDR6 DRAM for next-generation graphics applications shows a maximum pin speed of 27Gb/s/pin, while a 9.5Gb/s/pin 16Gb LPDDR5X DRAM is introduced for low-power mobile applications. The interface between memory and system chip is crucial; the remaining five papers propose different schemes to improve the energy efficiency, performance, and signal integrity of memory I/O.

- In Paper 28.1, SK hynix presents a 12-high stack, 192Gb HBM3 with a 7Gb/s/pin I/O speed to reach an 896GB/s bandwidth. New design features include an in-DRAM ECC, internal NN-DFE, TSV auto-calibration, and machine-learning-driven layout optimization. These features enable efficient data transfer among stacked dies, while maintaining low power consumption.
- In Paper 28.2, Samsung presents a 16Gb 27Gb/s/pin GDDR6 DRAM, the first to use a T-coil in a DRAM process to extend the maximum I/O frequency. A merged-MUX TX increases the operating frequency and decreases power and area consumption. Quad skew training adjusts the quadrature clock skew to maximize the sampling margin, covering a wide frequency range.
- In Paper 28.3, Samsung presents a 16Gb LPDDR5X SDRAM that uses a non-stacking bank architecture, an offset-calibrated sense amplifier for the global data line, extended DVFSC, a stacked transmitter, and a low-power data aligner. It achieves 9.5Gb/s/pin performance using a 4<sup>th</sup>-generation 10nm DRAM fabrication technology.
- In Paper 28.4, an inverter-based 4-tap addition-only feed-forward equalization TX is shown. Compared to conventional source-series termination drivers, the proposed TX achieves a more compact area and improves power efficiency by forgoing the linear resistors. The proposed scheme overcomes the non-linearity of a conventional inverter-based FFE TX and its sensitivity to variations, resulting in an energy efficiency of 1.18pJ/b at 20Gb/s/pin.
- In Paper 28.5, a 10Gb/s di-code transceiver for future HBM interfaces in a 28nm CMOS process is presented. A transimpedance-amplifier-based transceiver for di-code signaling supports three types of equalization to extend the sampling margin, and a mismatch calibration scheme is used to minimize the static current. It achieves a sampling margin of 0.467UI and an energy efficiency of 0.385pJ/b.
- In Paper 28.6, a transceiver based on a capacitively driven on-chip link, which offers low power consumption for the DRAM global bus, is presented. A detailed analysis of on-chip wire characteristics is included. The proposed FFE-combined ground-forcing biasing technique provides a well-defined DC level for capacitive driving and achieves a 78.8fJ/bit/mm energy efficiency at 12Gb/s/wire.
- In Paper 28.7, a single-ended data-embedded clock signaling (DECS) transceiver is presented. The proposed receiver directly recovers (self-slicing) and automatically de-serializes the data from the front-end DECS input without a CDR or CDA. In addition, the transceiver can tolerate 40-60% clock duty-cycle distortion. The transceiver achieves an eye width of 0.99UI and an energy efficiency of 1.24pJ/b at 20Gb/s/pin.
- In Paper 28.8, a supply-noise-induced jitter (SIJ) cancellation technique, based on a 2<sup>nd</sup>-order adaptive filter, and the clock distribution network (CDN) for an LPDDR5 mobile DRAM are presented. The adaptive filter uses a least-mean-square (LMS) algorithm to cancel the jitter along the CDN and TX/RX path. The RDQS RMS jitter reduction is more than 80% and results in a 4 times larger eye opening at 6.4Gb/s.
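
The LMS adaptation mentioned for Paper 28.8 can be illustrated with a generic textbook sketch. This is not the paper's circuit: the two-tap filter length (chosen only to echo the "2<sup>nd</sup>-order" wording), the step size, the signal names, and the synthetic data are all assumptions made for illustration. The filter learns how a reference signal couples into an observed signal so that the coupled part can be subtracted.

```python
import random

def lms_identify(x, d, num_taps=2, mu=0.05):
    """Adapt FIR weights w so that sum(w[k] * x[n-k]) tracks d[n] (plain LMS)."""
    w = [0.0] * num_taps
    for n in range(num_taps - 1, len(x)):
        taps = [x[n - k] for k in range(num_taps)]          # most recent sample first
        y = sum(wk * xk for wk, xk in zip(w, taps))          # filter output
        e = d[n] - y                                         # residual error
        w = [wk + mu * e * xk for wk, xk in zip(w, taps)]    # LMS weight update
    return w

# Synthetic stand-in data: "supply noise" x couples into "jitter" d through
# an unknown 2-tap response (values are made up for illustration).
random.seed(0)
true_w = [0.5, -0.3]
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = [0.0] + [true_w[0] * x[n] + true_w[1] * x[n - 1] for n in range(1, len(x))]

w = lms_identify(x, d)
print([round(wk, 3) for wk in w])
```

In this noiseless toy case the learned weights approach the made-up coupling coefficients 0.5 and -0.3, so the residual error, which stands in for the remaining jitter, shrinks toward zero.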

## **Session 28 Highlights: DRAM and Interface**

[28.1] A 192Gb 12-High 896GB/s HBM3 DRAM with a TSV Auto-Calibration Scheme and Machine-Learning Based Layout Optimization

[28.2] A 16Gb 27Gb/s/pin T-coil based GDDR6 DRAM with Merged-MUX TX, Optimized WCK Operation, and Alternative-Data-Bus

**Paper 28.1 Authors:** Myeong-Jae Park, Ho Sung Cho, Tae-Sik Yun, Sangjin Byeon, Young Jun Koo, Sangsic Yoon, Dong Uk Lee, Seokwoo Choi, Jihwan Park, Jinhyung Lee, Kyungjun Cho, Junil Moon, Byung-Kuk Yoon, Young-Jun Park, Sang-muk Oh, Chang Kwon Lee, Tae-Kyun Kim, Seong-Hee Lee, Hyun-Woo Kim, Yucheon Ju, Seung-Kyun Lim, Seung Geun Baek, Kyo Yun Lee, Sang Hun Lee, Woo Sung We, Seungchan Kim, Yongseok Choi, Seong-Hak Lee, Seung Min Yang, Gunho Lee, In-Keun Kim, Younghyun Jeon, Jae-Hyung Park, Jong Chan Yun, Chanhee Park, Sun-Yeol Kim, Sungjin Kim, Dong-Yeol Lee, Su-Hyun Oh, Taejin Hwang, Junghyun Shin, Yunho Lee, Hyunsik Kim, Jaeseung Lee, Youngdo Hur, Sangkwon Lee, Jieun Jang, Junhyun Chun, Joohwan Cho

#### Paper 28.1 Affiliation: SK hynix

**Paper 28.2 Authors:** Daewoong Lee, Hye-Jung Kwon, Daehyun Kwon, Jaehyeok Baek, Chulhee Cho, Sanghoon Kim, Donggun An, Chulsoon Chang, Unhak Lim, Jiyeon Im, Wonju Sung, Hye-Ran Kim, Sun-Young Park, HyoungJoo Kim, Hoseok Seol, Juhwan Kim, Jungbum Shin, Kil-Young Kang, Yong-Hun Kim, Sooyoung Kim, Wansoo Park, Seok-Jung Kim, Chanyong Lee, Seungseob Lee, TaeHoon Park, ChiSung Oh, Hyodong Ban, Hyungjong Ko, Hoyoung Song, Tae-Young Oh, SangJoon Hwang, Kyung Suk Oh, JungHwan Choi, Jooyoung Lee

#### Paper 28.2 Affiliation: Samsung Electronics

Subcommittee Chair: Meng-Feng Chang, National Tsing Hua University

#### CONTEXT AND STATE OF THE ART

- HBM3 is a JEDEC DRAM standard based on discrete components for in-package-memory.
- GDDR6 is a JEDEC DRAM standard based on discrete components for high-speed graphics DRAM.

#### TECHNICAL HIGHLIGHTS

- SK hynix introduces 896GB/s HBM3 DRAM, the highest bandwidth DRAM reported thus far.
  - New design features include in-DRAM ECC, internal NN-DFE, TSV auto-calibration, and machine-learning-based layout optimization.
- Samsung Electronics introduces a 27Gb/s/pin GDDR6 DRAM, the highest pin rate DRAM reported thus far.
  - A merged-MUX TX increases the operating frequency, decreases power consumption, and reduces area utilization.
     Quad skew training adjusts the quadrature clock skew to maximize the sampling margin.
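
The 896GB/s headline figure for the HBM3 stack follows from the per-pin rate once the interface width is known. The sketch below assumes 1024 data I/Os per stack, a value that is not stated in this press kit; the function and constant names are illustrative.

```python
def stack_bandwidth_gbytes_per_s(data_pins: int, pin_rate_gbps: float) -> float:
    """Aggregate bandwidth in GB/s: pins * per-pin rate (Gb/s) / 8 bits per byte."""
    return data_pins * pin_rate_gbps / 8.0

# Assumption: 1024 data I/Os per HBM3 stack (not stated in the press kit).
HBM3_DATA_PINS = 1024

# 7 Gb/s/pin is the I/O speed quoted for Paper 28.1.
print(stack_bandwidth_gbytes_per_s(HBM3_DATA_PINS, 7.0))  # 896.0
```

The same relation shows why GDDR6 reaches high bandwidth differently: it relies on a much higher per-pin rate (27Gb/s/pin in Paper 28.2) over a far narrower interface.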

- HBM3 is intended for use in a large variety of high-bandwidth applications such as high-performance computing, artificial intelligence, graphics, virtual reality and autonomous driving. HBM3 promises a major boost in memory bandwidth, along with a reduction in circuit board area cost, as well as significantly lower power consumption per bit.
- GDDR6 is intended for use in a large variety of high-speed applications such as graphics and artificial intelligence. GDDR6 offers a higher per-pin bandwidth than any other DRAM. It promises a major boost in memory bandwidth at a lower cost.
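
The headline bandwidth figures in this session can be related to per-pin rates with simple arithmetic. The sketch below is illustrative only: the 1024-bit HBM stack interface and the 32-bit GDDR6 device interface are assumptions not stated in the text above, and the function names are hypothetical.

```python
# Back-of-the-envelope DRAM bandwidth arithmetic.
# Assumptions (not stated in the press text): a 1024-bit data interface per
# HBM stack and a 32-bit data interface per GDDR6 device.

HBM_IO_WIDTH_BITS = 1024   # assumed data pins per HBM stack
GDDR6_IO_WIDTH_BITS = 32   # assumed data pins per GDDR6 device


def bandwidth_gbytes_per_s(pin_rate_gbps: float, io_width_bits: int) -> float:
    """Aggregate bandwidth in GB/s from a per-pin rate in Gb/s and a bus width."""
    return pin_rate_gbps * io_width_bits / 8.0


def pin_rate_gbps(bandwidth_gbytes: float, io_width_bits: int) -> float:
    """Per-pin rate in Gb/s implied by an aggregate bandwidth in GB/s."""
    return bandwidth_gbytes * 8.0 / io_width_bits


if __name__ == "__main__":
    # Per-pin rate implied by 896GB/s over the assumed 1024-bit interface.
    print(pin_rate_gbps(896.0, HBM_IO_WIDTH_BITS))
    # Per-device bandwidth implied by 27Gb/s/pin over the assumed 32-bit interface.
    print(bandwidth_gbytes_per_s(27.0, GDDR6_IO_WIDTH_BITS))
```

Under these assumptions, the wide, slower HBM interface and the narrow, faster GDDR6 interface reach their bandwidth in opposite ways, which is why the two papers quote different headline metrics (GB/s per stack versus Gb/s per pin).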

## **Machine Learning Subcommittee**

Session Chair: Jun Deguchi, Kioxia Corporation, Kawasaki, Japan

Session Co-Chair: Jae-sun Seo, Arizona State University, Tempe, AZ

#### Subcommittee Chair: Marian Verhelst, KU Leuven, Belgium

Beyond conventional ML models, emerging ML algorithms and applications evolve at a fast pace, such as recommender systems, transformers, and spiking neural networks. Such emerging ML models exhibit different model architectures and computation/storage requirements, necessitating innovations in custom hardware design. This session includes four papers, each representing advances in ML chips for emerging applications with new techniques, such as 3D hybrid bonding, approximate/sparse computing, reconfigurable digital computing-in-memory, and spike-based on-chip learning.

- In Paper 29.1, Alibaba and UniIC present a 184QPS/W (at INT8 precision) 64Mb/mm<sup>2</sup> AI accelerator targeting recommendation systems, where 3D hybrid bonding of a 25nm 602mm<sup>2</sup> logic die with 300MHz operation at 1.2V, and a 55nm 602mm<sup>2</sup> DRAM die with 150MHz operation at 1.1V is demonstrated with a process-near-memory (PNM) engine. This work shows >200× better energy efficiency than off-chip memory and >2× better energy efficiency than state-of-the-art PNM solutions.
- In Paper 29.2, Tsinghua University and Tsing Micro demonstrate a 28nm 6.82mm<sup>2</sup> approximate-computing-based transformer processor with big-exact/small-approximate processing elements, bidirectional asymptotic sparsity speculation, and an out-of-order computing scheduler, towards energy-efficient computation of transformers with global attention. Peak energy-efficiency of 27.5TOPS/W is achieved with INT12 precision at 0.56V and 50MHz frequency.
- In Paper 29.3, University of California Santa Barbara and Tsinghua University describe a 28nm 6.83mm<sup>2</sup> sparse transformer accelerator chip that consumes 15.59µJ/token at 0.65V and 125MHz frequency for the BERT-base model with INT8/INT16 mixed precision, where the bitline-transpose digital computing-in-memory (CIM) engines can be dynamically reconfigured in pipeline/parallel modes for attention/fully-connected layers.
- In Paper 29.4, University of Zurich and ETH Zurich present ReckOn, a 28nm 0.86mm<sup>2</sup> task-agnostic spiking recurrent neural network processor with INT8 weight precision enabling on-chip learning over second-long timescales with <0.8% memory overhead and <150µW training power budget at 0.5V and 13MHz frequency, while targeting navigation, gesture recognition and keyword spotting tasks on edge devices.
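
The efficiency figures quoted above use different units (TOPS/W, µJ/token, µW at a given clock). As a rough aid for comparing them, the following sketch converts each into an energy quantity. The function names are hypothetical, and how each paper counts an "operation" is not stated here, so the results are only indicative.

```python
def energy_per_op_fj(tops_per_watt: float) -> float:
    """Energy per operation in femtojoules for a given TOPS/W figure.

    1 TOPS/W equals 1e12 operations per joule, so the energy per operation
    is 1 / (TOPS/W * 1e12) joules.
    """
    return 1e15 / (tops_per_watt * 1e12)


def energy_per_cycle_pj(power_w: float, clock_hz: float) -> float:
    """Average energy per clock cycle in picojoules."""
    return power_w / clock_hz * 1e12


def tokens_per_joule(energy_per_token_uj: float) -> float:
    """Number of tokens processed per joule for a given µJ/token figure."""
    return 1e6 / energy_per_token_uj


if __name__ == "__main__":
    print(energy_per_op_fj(27.5))             # Paper 29.2 peak efficiency
    print(tokens_per_joule(15.59))            # Paper 29.3 energy per token
    print(energy_per_cycle_pj(150e-6, 13e6))  # Paper 29.4 upper bound (<150µW)
```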

## **Session 29 Highlights: ML Chips for Emerging Applications**

#### [29.1] 184QPS/W 64Mb/mm<sup>2</sup> 3D Logic-to-DRAM Hybrid Bonding with Process-Near-Memory Engine for Recommendation System

**Authors:** Dimin Niu<sup>1</sup>, Shuangchen Li<sup>1</sup>, Yuhao Wang<sup>1</sup>, Wei Han<sup>1</sup>, Zhe Zhang<sup>2</sup>, Yijin Guan<sup>2</sup>, Tianchan Guan<sup>3</sup>, Fei Sun<sup>1</sup>, Fei Xue<sup>1</sup>, Lide Duan<sup>1</sup>, Yuanwei Fang<sup>1</sup>, Hongzhong Zheng<sup>1</sup>, Yuan Xie<sup>1</sup>, Xiping Jiang<sup>4</sup>, Song Wang<sup>4</sup>, Qiwei Ren<sup>4</sup>

Affiliation: <sup>1</sup>Alibaba Group, Sunnyvale, CA, <sup>2</sup>Alibaba Group, Beijing, China, <sup>3</sup>Alibaba Group, Shanghai, China, <sup>4</sup>UniIC, Xian, China

Subcommittee Chair: Marian Verhelst, KU Leuven, Belgium, ML Subcommittee Chair

### CONTEXT AND STATE OF THE ART

- While AI model size is rapidly increasing, the rate of improvement in both memory capacity and bandwidth is very slow. This represents a significant challenge for memory-bound AI applications.
- Although on-chip memory solutions (e.g. SRAM) offer lower energy than off-chip memory solutions (e.g. HBM and DRAM), on-chip memory solutions cannot handle large AI models owing to limited memory capacity. Off-chip memory solutions provide high memory capacity, but the energy consumption for data movement from/to off-chip memory is orders of magnitude larger than on-chip memory solutions. Hardware implementations of AI thereby exhibit the well-known "memory wall" problem.

#### **TECHNICAL HIGHLIGHTS**

Alibaba Group introduces an AI accelerator with 3D logic-to-DRAM hybrid bonding, which is a high-density, high-energy-efficiency process-near-memory (PNM) solution for memory-bound AI applications. The solution shows >200× better energy efficiency than off-chip memory and >2× better energy efficiency than state-of-the-art PNM solutions. The performance is 317.43× better than that of a CPU system, and area efficiency (queries per second per mm<sup>2</sup>) is improved by 660× for a recommendation system application.

### APPLICATIONS AND ECONOMIC IMPACT

 Many memory-bound applications, such as natural language processing, recommendation systems, graph analytics, graph neural networks, and multi-task online inference, are dominating AI applications in modern cloud datacenters. Solutions incorporating tight logic-to-memory connectivity, such as 3D logic-to-DRAM hybrid bonding, can enable the deployment of such applications with good throughput and energy efficiency.

### **Power Management Subcommittee**

Session Chair: Min Chen, Analog Devices, Santa Clara, CA

Session Co-Chair: Li Geng, Xi'an Jiaotong University, Xi'an, China

Subcommittee Chair: Yogesh Ramadass, Texas Instruments, Santa Clara, CA

Power-management topologies and techniques have been adopted to improve the system performance of emerging power-converter applications such as energy harvesting, battery charging, and LDO regulation. Multi-input single-inductor multi-output topology, scalable multi-chip stackable bias-flip technique, reconfigurable series-parallel structure, and active feedforward ripple shaping technique are presented in this session to demonstrate the state-of-the-art performance of these power management systems.

- In Paper 30.1, the University of Virginia presents a multi-input single-inductor multi-output energy-harvesting and power-management platform in a 65nm CMOS process to extract energy from TEG/PV/PEH simultaneously while providing four voltage rails. The proposed chip achieves a quiescent current of 32nA, 1.2×10<sup>5</sup> dynamic range, 3.2× energy-extraction gain for the MSVR-SECE rectifier, and 80% efficiency at 1µA.
- In Paper 30.2, KAIST presents a 130V triboelectric nanogenerator energy-harvesting interface in a 0.18µm BCD process using scalable multi-chip stackable bias-flip and daisy-chained synchronous signaling techniques. The proposed chips achieve 3.14× power extraction gain, 96.4µW/cm<sup>2</sup> extracted power per TENG size, and 70.7% peak end-to-end efficiency.
- In Paper 30.3, Samsung Electronics presents a reconfigurable series-parallel charger in a 0.18µm BCD process for dual-battery applications. The proposed chip enables efficient battery charging by connecting batteries in series or parallel and achieves 97.7% efficiency in direct charging mode.
- In Paper 30.4, Intel presents a triode region 4A analog LDO in a 4nm CMOS process with active feedforward ripple shaping and on-chip power noise analyzer. The proposed chip achieves 60mV low-dropout voltage, 92% power-conversion efficiency, and 58.4dB power-supply rejection ratio.

## Session 30 Highlights: Power Management Techniques

#### [30.3] A 89W 97.7%-Efficient Reconfigurable Series-Parallel Charger for Dual-Battery Applications

#### Author List:

Paper Authors: Sungwoo Lee, Taejin Jung, Yonghwan Cho, Seonggyu Cho, Minkyu Kwon, Daeung Cho, Sanghui Kang, Jeonguk Heo, Hyeongseok Oh, Sungung Kwak

Paper Affiliation: Samsung Electronics, Hwaseong-si, Korea

Subcommittee Chair: Yogesh Ramadass, Texas Instruments, Santa Clara, CA, Power Management

### CONTEXT AND STATE OF THE ART

- With the advent of ultra-fast charging and foldable phones, the number of smartphones that use two batteries is gradually increasing. A series-parallel battery charger must combine high-speed charging with stable balancing to limit excessive in-rush currents.
- The single-chip integrated reconfigurable series-parallel battery charger for dual-battery applications has the advantages of high power, high efficiency, and size reduction as well as cost improvements.

### TECHNICAL HIGHLIGHTS

- Samsung Electronics presents a reconfigurable series-parallel charger with 89W and 97.7% efficiency for dual-battery applications.
  - The reconfigurable series-parallel charger enables efficient dual-battery charging by connecting the batteries in series or in parallel, and can supply more than 3× the charging power of existing chargers with a single chip.
  - 89W and 97.7% efficiency are achieved and stable battery voltage balancing is ensured by adding the current limit function to the serial-to-parallel conversion circuit.
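
To put the 89W and 97.7% figures in perspective, the sketch below estimates the heat dissipated in the charger and shows why a series connection lowers the charging current. It assumes that 89W is the delivered output power and uses a hypothetical 4.4V cell voltage; neither is stated in the text above.

```python
def dissipated_power_w(output_power_w: float, efficiency: float) -> float:
    """Power lost as heat when `output_power_w` is delivered at `efficiency`."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    # P_in = P_out / efficiency, so P_loss = P_out * (1/efficiency - 1).
    return output_power_w * (1.0 / efficiency - 1.0)


def pack_voltage_v(cell_voltage_v: float, series: bool) -> float:
    """Voltage of a two-cell pack connected in series or in parallel."""
    return 2.0 * cell_voltage_v if series else cell_voltage_v


def charge_current_a(power_w: float, voltage_v: float) -> float:
    """Current needed to deliver `power_w` at `voltage_v`."""
    return power_w / voltage_v


if __name__ == "__main__":
    cell_v = 4.4  # hypothetical cell voltage, for illustration only
    print(dissipated_power_w(89.0, 0.977))
    print(charge_current_a(89.0, pack_voltage_v(cell_v, series=True)))
    print(charge_current_a(89.0, pack_voltage_v(cell_v, series=False)))
```

For the same charging power, the series configuration needs half the current of the parallel one, which is the usual motivation for charging two cells in series.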

### APPLICATIONS AND ECONOMIC IMPACT

- The proposed single-chip series-parallel battery charger achieves high-speed charging through battery series connection, stable system power supply through battery parallel connection, and stable balancing through a per-battery current limit function without additional balancing circuit, enabling small size, high performance, and cost reductions.

## **Session 31 Overview: Audio Amplifiers**

## **Analog Subcommittee**

Session Chair: Qinwen Fan, Delft University of Technology, Delft, The Netherlands

Session Co-Chair: Mahdi Kashmiri, Broadcom, San Jose, CA

Subcommittee Chair: Maurits Ortmanns, University of Ulm, Institute of Microelectronics, Ulm, Baden-Württemberg, Germany

Audio amplifiers have made significant advancements in terms of wider dynamic range, power reduction, and improved linearity. Better sound quality as well as emerging applications such as 3D audio applications and piezoelectric speakers require continuous circuit advancements. The first paper describes a high-dynamic-range and high-linearity audio decoder, followed by a first capacitive-gain low-noise Class-D amplifier. A digital-input Class-D amplifier with supply-voltage-scaling volume control and a series-connected ΔΣ modulator is then followed by a resistorless piezoelectric speaker driver with an LC damping technique.

- In Paper 31.1, MediaTek introduces an audio decoder with a code-change-insensitive Real-Time-DEM algorithm, a poly-resistor linearization technique, and a loop-gain enhancement method, achieving -117dBc THD and 126dB dynamic range while delivering 62mW into a 16Ω load.
- In Paper 31.2, Delft University of Technology presents a first capacitive-gain chopper Class-D audio amplifier, achieving 121.4dB dynamic range, -109.8dB THD+N and 8µV<sub>rms</sub> output noise, while delivering 15/26W into an 8/4Ω load.
- In Paper 31.3, National Cheng Kung University shows a digital-input Class-D audio amplifier with supply-voltage-scaling volume control and a series-connected ΔΣ modulator, achieving 8× jitter reduction and increasing the maximum output power to 1.5W, a 40% increase over the state-of-the-art.
- In Paper 31.4, Delft University of Technology describes a resistorless Class-D piezoelectric speaker driver with dual voltage/current feedback for LC resonance damping, achieving -91dB THD+N while driving a 4µF capacitive load with a 4.4A peak current.
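
The decibel figures above can be easier to appreciate as plain ratios. The following sketch converts a dB value to an amplitude ratio and a power-into-load figure to an RMS voltage. It is a generic illustration with hypothetical function names; measurement conditions such as weighting filters are not given in the text and are ignored here.

```python
import math


def db_to_amplitude_ratio(db: float) -> float:
    """Amplitude (voltage) ratio corresponding to a value in dB."""
    return 10.0 ** (db / 20.0)


def rms_voltage_v(power_w: float, load_ohm: float) -> float:
    """RMS voltage across a resistive load dissipating `power_w`."""
    return math.sqrt(power_w * load_ohm)


if __name__ == "__main__":
    # Paper 31.1: total harmonic amplitude relative to the fundamental.
    print(db_to_amplitude_ratio(-117.0))
    # Paper 31.1: ratio between the largest and smallest resolvable signal.
    print(db_to_amplitude_ratio(126.0))
    # Paper 31.1: output voltage for 62mW into a 16 ohm load.
    print(rms_voltage_v(0.062, 16.0))
```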

## Session 31 Highlights: High-Performance Audio Amplifiers

#### [31.1] A -117dBc THD (-132dBc HD3) and 126dB DR Audio Decoder with Code-Change-Insensitive RT-DEM Algorithm and Circuit Technique for Relaxing Velocity Saturation Effect of Poly Resistors

#### [31.2] A 121.4dB DR -109.8dB THD+N Capacitively-Coupled Chopper Class-D Audio Amplifier

Paper 31.1 Authors: Shon-Hang Wen, Chuan-Hung Hsiao, Shih-Hsiung Chien, Ya-Chi Chen, Kuan-Hung Chen, Kuan-Dar Chen

Paper 31.1 Affiliation: MediaTek, Hsinchu, Taiwan

Paper 31.2 Authors: Huajun Zhang, Marco Berkhout, Kofi A. A. Makinwa, Qinwen Fan

Paper 31.2 Affiliation: Delft University of Technology, Delft, The Netherlands

Subcommittee Chair: Maurits Ortmanns, University of Ulm, Institute of Microelectronics, Ulm, Baden-Württemberg, Germany, Analog

### CONTEXT AND STATE OF THE ART

- In a digital-to-analog audio interface, Real-Time DEM (RT-DEM) is often used in DAC cells to alleviate ISI. However, the constraint of 1-LSB change between samples demands a high OSR, leading to high power consumption.
- Poly resistors are widely used to realize the feedback network in an audio amplifier. However, their nonlinearity can contribute significant distortion and their noise can limit the dynamic range.
- Cross-over distortion degrades the audio signal linearity with limited amplifier gain in the audio band.

#### **TECHNICAL HIGHLIGHTS**

- MediaTek introduces an audio decoder, achieving the best reported -117dBc THD and 126dB dynamic range.
  - A code-change-insensitive RT-DEM algorithm dynamically selects the new start index of rotated DAC cells for the next code allowing for more than 1-LSB change.
  - A poly resistor linearization scheme suppresses HD3 by mitigating the velocity saturation effect.
  - A 2nd-order damping-factor-control frequency compensation allows a high gain in the audio band.
- Delft University of Technology presents a first capacitive-gain chopper Class-D audio amplifier, achieving 121.4dB dynamic range, -109.8dB THD+N and 8µV<sub>rms</sub> output noise.
  - A capacitive feedback architecture is employed to achieve low noise and high dynamic range.
  - Timing and impedance matching techniques are proposed to protect the input devices from high-voltage switching transients.

### APPLICATIONS AND ECONOMIC IMPACT

- Higher fidelity results in a better user experience, not only in conventional applications such as smartphones and automobiles but also in emerging 3D audio applications such as virtual reality.
- The use of a capacitive feedback network overcomes the noise limitation of the conventional resistive feedback topology, achieving low noise while avoiding excessive driving power.

# Session 32 Overview: Ultrasound and Beamforming Applications

### Imagers, Medical, MEMS and Display Subcommittee

Session Chair: Jun-Chau Chien, National Taiwan University, Taipei, Taiwan

Session Co-Chair: Jerald Yoo, National University of Singapore, Singapore

Subcommittee Chair: Chris van Hoof, imec, The Netherlands

This session showcases some of the best works in ultrasound ASICs for imaging, communication, and beamforming. The first paper describes a real-time ultrasound image sensor for drone vision at a 7m range. Next, two pitch-matched ultrasound transceivers for medical imaging are presented, demonstrating the highest level in per-element integration and highest spatial resolution to date using 100×100µm<sup>2</sup> transducers. The fourth paper proposes a multi-frequency CMUT-array RX for the water-air acoustic link, and the final paper showcases a low-power adaptive beamformer for speech recognition.

- In Paper 32.1, National University of Singapore describes an Ultrasound Imaging System with on-chip per-voxel RX beamfocusing for 7m-range all-light-condition drone-vision applications. The SoC shows 7.76µs image reconstruction latency at 9.83M focal-points/s and 24fps throughput while consuming 142.3mW.
- In Paper 32.2, TU Delft presents an ASIC for catheter-based high-frame-rate 3D ultrasound probes. The ASIC integrates an analog front-end, ADC, μ-beamformer and a datalink that processes 1000vol/s, while consuming 1.23mW RX power.
- In Paper 32.3, TU Delft shows the smallest-to-date 100×100µm<sup>2</sup> pitch-matched ultrasound transceiver with a boxcar-integration-based RX µ-beamformer for high-resolution 3D imaging. The system operates at 10MHz center frequency with 0.04mm<sup>2</sup>/channel RX area, while consuming 1.2mW RX power.
- In Paper 32.4, Stanford University presents an electronically tunable CMUT receiver for wireless uplink across a water-air interface. The receiver achieves 28kb/s with a 59.7µPa minimum detectable pressure and a programmable gain, and is successfully demonstrated across the water-air interface.
- In Paper 32.5, University of Michigan at Ann Arbor introduces a 4-channel speech-recognition front-end with a self-Direction-of-Arrival correction adaptive beamformer that mitigates distortion. The chip consumes 157µW while achieving 80 dBA SNDR.

## Session 32 Highlights: Ultrasound and Beamforming Applications

#### [32.1] BatDrone: A 9.83M-focal-points/s 7.76µs-Latency Ultrasound Imaging System with On-Chip Per-Voxel RX Beamfocusing for 7m-Range Drone Applications

#### [32.2] A Pitch-Matched ASIC with Integrated 65V TX and Shared Hybrid Beamforming ADC for Catheter-Based High-Frame-Rate 3D Ultrasound Probes

**Paper 32.1 Authors:** Liuhao Wu<sup>1</sup>, Jiaqi Guo<sup>1</sup>, Rucheng Jiang<sup>1</sup>, Yande Peng<sup>2</sup>, Han Wu<sup>1</sup>, Jiamin Li<sup>1</sup>, Yilong Dong<sup>1</sup>, Miaolin Zhang<sup>1</sup>, Zhuoyue Li<sup>1</sup>, Kian Ann Ng<sup>3</sup>, Chne-Wuen Tsai<sup>1</sup>, Lian Zhang<sup>1</sup>, Longyang Lin<sup>4</sup>, Liwei Lin<sup>2</sup>, Jerald Yoo<sup>1,5</sup>

**Paper 32.1 Affiliation:** <sup>1</sup>National University of Singapore, Singapore, Singapore, <sup>2</sup>University of California at Berkeley, Berkeley, CA, <sup>3</sup>Digipen Institute of Technology, Singapore, Singapore, <sup>4</sup>Southern University of Science and Technology, Shenzhen, China, <sup>5</sup>The N.1 Institute for Health, Singapore, Singapore

Paper 32.2 Authors: Yannick Hopf<sup>1</sup>, Boudewine Ossenkoppele<sup>1</sup>, Mehdi Soozande<sup>2</sup>, Emile Noothout<sup>1</sup>, Zu-Yao Chang<sup>1</sup>, Chao Chen<sup>1</sup>, Hendrik Vos<sup>1,2</sup>, Hans Bosch<sup>2</sup>, Martin Verweij<sup>1,2</sup>, Nico de Jong<sup>1,2</sup>, Michiel Pertijs<sup>1</sup>

Paper 32.2 Affiliation: <sup>1</sup>TU Delft, Delft, The Netherlands, <sup>2</sup>Erasmus MC, Rotterdam, The Netherlands

Subcommittee Chair: Chris van Hoof, imec, the Netherlands

#### CONTEXT AND STATE OF THE ART

- A 3D Ultrasound Imaging System (UIS) is robust to lighting conditions and inherently non-invasive, which is essential in many emerging applications such as all-condition-vision systems, biomedical imaging, and mobile/portable applications.
- To date, 3D UISs have suffered from low frame rates and high image-reconstruction latency, a fundamental limitation for latency-sensitive vision systems and for applications that require a high frame rate.
- Innovations in on-chip per-voxel RX beam focusing, micro-beamforming, energy-efficient digitization, and energy-recycling TX break the fundamental performance barrier in 3D UIS.

#### TECHNICAL HIGHLIGHTS

- National University of Singapore introduces an Ultrasound Imaging System SoC with fully on-chip per-voxel RX beamfocusing for 7m-range all-light-condition drone-vision applications
  - The SoC achieves the best throughput reported to date for on-chip ultrasound image reconstruction: 9.83M focal-points/s at 24fps with 100 planes per frame and only 7.76µs latency, while consuming 142.3mW, making it suitable for all-light-condition drone-vision applications.
- TU Delft presents an ASIC for catheter-based high-frame-rate 3D ultrasound probes that integrates analog front-end, ADC, µ-beamformer and a datalink
  - The ASIC is the first work to combine high-voltage transmitters, analog front-end, micro-beamforming, digitization, and transducers all on-chip, enabling high-frame-rate catheter-based 3D ultrasound imaging while consuming 1.5× less power than the prior art
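
The throughput, frame-rate, and range figures quoted for the drone-vision SoC can be related with a few lines of arithmetic. The sketch below is a rough illustration only: it assumes a speed of sound of 343m/s in air and, for the frame-rate bound, that each frame waits for one echo from the farthest range. Neither assumption comes from the text above, and the function names are hypothetical.

```python
SPEED_OF_SOUND_AIR_M_S = 343.0  # assumed: dry air at about 20 degrees C


def round_trip_time_s(range_m: float,
                      speed_m_s: float = SPEED_OF_SOUND_AIR_M_S) -> float:
    """Time for a pulse to reach a target at `range_m` and return."""
    return 2.0 * range_m / speed_m_s


def max_frame_rate_fps(range_m: float,
                       speed_m_s: float = SPEED_OF_SOUND_AIR_M_S) -> float:
    """Frame-rate ceiling if every frame waits for one farthest-range echo."""
    return 1.0 / round_trip_time_s(range_m, speed_m_s)


def focal_points_per_plane(points_per_s: float, fps: float,
                           planes_per_frame: int) -> float:
    """Average number of focal points reconstructed in each image plane."""
    return points_per_s / (fps * planes_per_frame)


if __name__ == "__main__":
    print(round_trip_time_s(7.0))
    print(max_frame_rate_fps(7.0))
    print(focal_points_per_plane(9.83e6, 24.0, 100))
```

Under these assumptions, the acoustic round trip at 7m takes about 41ms, so the quoted 24fps sits close to the ceiling set by the speed of sound, and each plane contains roughly 4,100 focal points.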

#### APPLICATIONS AND ECONOMIC IMPACT

- A breakthrough 3D Ultrasound Imaging System with high frame rate and low latency opens the door to new ultrasound applications in the domain of all-light-condition vision systems.
- Low-power ultrasound systems enable new applications at the edge, such as in drones.
- Energy-efficient on-chip micro-beamforming with power- and area-efficient ADCs enables integration of pixel-matched 3D UIS, thereby allowing for many new high-channel-count ultrasound applications.

### **Digital Architectures and Systems Subcommittee**

Session Chair: Sanu Mathew, Intel, Hillsboro, OR

Session Co-Chair: Chia-Hsiang Yang, National Taiwan University, Taiwan

Subcommittee Chair: Tom Burd, AMD, Santa Clara, CA

This session showcases performance and efficiency advancements in domain-specific processors for a diverse range of applications. The first paper describes a smart card SoC, followed by a neural signal processor for seizure prediction. A light-field factorization processor for 3D displays comes next, followed by a depth signal processing SoC for a fusion-based 3D system.

- In Paper 33.1, Samsung describes a 45nm, 24.27mm<sup>2</sup> fully integrated biometric smart card SoC that enables fingerprint authentication with anti-spoofing, achieving 1.05A/m minimum magnetic field strength and 1014.7ms transaction time.
- In Paper 33.2, National Taiwan University presents a neural signal processor for seizure prediction with a sensitivity of 92.0%, specificity of 99.1%, and false alarm rate of 0.57/hr, consuming 96.2nJ/class from 0.49V at 6MHz in 40nm CMOS.
- In Paper 33.3, National Tsing Hua University shows a 7×7-view, rank-1/-2/-4 factorization processor for naked-eye full-parallax 3D displays, achieving HD 31-34fps at 180-200MHz in 40nm CMOS.
- In Paper 33.4, KAIST presents a 28nm depth signal processing SoC for a dense RGB-D and 3D bound-box acquisition system, running a point-cloud-based neural network for 3D object detection at 31.9fps while consuming 281.6mW.

## Session 33 Highlights: Domain-Specific Processors

#### [33.1] A 1.05A/m Minimum Magnetic Field Strength Single-Chip, Fully Integrated Biometric Smart Card SoC Achieving 1014.7ms Transaction Time with Anti-Spoofing Fingerprint Authentication

#### [33.2] A 96.2nJ/class Neural Signal Processor with Adaptable Intelligence for Seizure Prediction

**Paper 33.1 Authors:** Ji-Soo Chang, Eunsang Jang, Youngkil Choi, Moonkyu Song, Sanghyo Lee, Gi-Jin Kang, Kyeong-Do Kim, Junho Kim, Jongseon Shin, Shin-Wuk Kang, Iktae Chung, Uijong Song, Chang-Yeon Cho, Han-Ju Je, Ho Kang, Junseo Lee, Hansol Lee, Ji-Eun Jang, Kihwan Kim, Yong-Wook Kim, Kyungduck Seo, Seongwook Song, Sung-Ung Kwak

Paper 33.1 Affiliation: Samsung Electronics, Hwaseong, Korea

Paper 33.2 Authors: Yi-Yen Hsieh, Yu-Cheng Lin, Chia-Hsiang Yang

Paper 33.2 Affiliation: National Taiwan University, Taipei, Taiwan

Subcommittee Chair: Tom Burd, AMD, Santa Clara, CA

#### CONTEXT AND STATE OF THE ART

- Domain-specific processors are increasingly being employed to achieve the high energy efficiencies and performance requirements for operating in battery-constrained smart cards and closed-loop neuromodulation devices.
- High-efficiency wireless power transfer along with secure biometric authentication are critical features for enabling secure payment using contactless proximity smart cards.
- Innovations in neural signal processing are transforming the treatment of neurodegenerative disorders using closed-loop neuromodulation by enabling real-time detection and prediction of epileptic seizures.

#### TECHNICAL HIGHLIGHTS

- Samsung introduces a fully integrated biometric smart card SoC in 45nm CMOS enabling accurate fingerprint authentication with anti-spoofing capabilities to enable secure payment processing.
  - Cascaded DC-DC converter power delivery and a CNN-based anti-spoofing detector enable reliable contactless operation with a minimum magnetic field strength of 1.05A/m, meeting the ISO/IEC 14443 standard with a transaction time of 1014.7ms.
- National Taiwan University presents an energy-efficient neural signal processor in 40nm CMOS that achieves accurate real-time seizure prediction.
  - An optimized feature extractor and a reconfigurable SVM kernel enable seizure prediction with a sensitivity of 92.0%, specificity of 99.1%, and false alarm rate of 0.57/hr, consuming 96.2nJ/class at 0.49V.
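
For readers less familiar with the classification metrics quoted for the seizure-prediction processor, the sketch below shows the standard definitions of sensitivity, specificity, and false-alarm rate. The counts in the usage example are hypothetical, chosen only to reproduce the reported percentages; the text does not say whether the paper evaluates these per event or per signal segment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts of a binary classifier (hypothetical container)."""
    true_pos: int
    false_neg: int
    true_neg: int
    false_pos: int


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of actual positives that were detected."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of actual negatives that were correctly rejected."""
    return c.true_neg / (c.true_neg + c.false_pos)


def false_alarms_per_hour(false_alarms: int, hours: float) -> float:
    """Average number of false alarms per hour of monitoring."""
    return false_alarms / hours


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    counts = ConfusionCounts(true_pos=46, false_neg=4, true_neg=991, false_pos=9)
    print(sensitivity(counts))
    print(specificity(counts))
    print(false_alarms_per_hour(57, 100.0))
```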

#### APPLICATIONS AND ECONOMIC IMPACT

- Secure, reliable contactless smart-card operations are critical to enabling frictionless payment processing at point-of-sale terminals.
- Closed-loop neuromodulation is an emerging solution to epileptic seizure control. Energy-efficient signal processing for accurate real-time seizure prediction is critical for enabling this promising technology.

### **Digital Architectures and Systems Subcommittee**

Session Chair: Chiraag Juvekar, Analog Devices, Boston, MA

Session Co-Chair: Ingrid Verbauwhede, KU Leuven, Belgium

#### Subcommittee Chair: Thomas Burd, AMD, Santa Clara, CA

This session focuses on two important aspects of hardware security: efficient implementation of cryptography and protection against side-channel attacks. The first paper describes a domain-specific processor supporting a multitude of 3<sup>rd</sup>-round finalists from NIST's post-quantum cryptography (PQC) competition. The second paper presents a novel machine learning-based side-channel countermeasure that can be patched after manufacturing. The last two papers present two masking-based countermeasures. The third paper presents a threshold implementation to protect neural network accelerators against side-channel attacks. The last paper presents a reconfigurable masking countermeasure for AES in a 4nm process.

- In Paper 34.1, Tsinghua University describes a processor supporting multiple post-quantum cryptographic algorithms, and a diverse set of computing patterns and memory requirements. The processor runs at 500MHz at 0.9V and performs 48KOPS at 3.4µJ/Op energy efficiency and consumes an area of 3.6mm<sup>2</sup> in 28nm.
- In Paper 34.2, NUS presents a 40nm implementation of a machine learning-based approach to counteract power/EM side-channel attacks. Their approach allows for security upgradeability via retraining and post-silicon hardware patching for newly discovered vulnerabilities. They demonstrate protection for PRESENT and AES with an MTD of 1.2B traces.
- In Paper 34.3, MIT introduces a threshold masking-based neural network accelerator that secures model parameters and inputs against power side-channel attacks. Their 1.59mm<sup>2</sup> demonstration in 28nm runs at 125MHz at 0.95V and limits the area and energy overhead to 64% and 6×, respectively.
- In Paper 34.4, Intel presents a 4nm reconfigurable side-channel-protected AES accelerator that can switch between SCA-resistant and dual-core modes. The accelerator runs at 647MHz at 0.75V and consumes 0.007mm<sup>2</sup>. Its reconfigurability allows up to 2.2× throughput boost in dual-core mode, while allowing an MTD greater than 1B power/EM traces in SCA mode.

## **Session 34 Highlights: Hardware Security**

#### [34.1] A 28nm 48KOPS 3.4µJ/Op Agile Crypto-Processor for Post-Quantum Cryptography on Multi-Mathematical Problems

#### [34.3] ShieldNN: A Threshold-Implementation-Based Neural Network Accelerator Securing Model Parameters and Inputs Against Power Side-Channel Attacks

**Paper 34.1 Authors:** Yihong Zhu<sup>1</sup>, Wenping Zhu<sup>1</sup>, Min Zhu<sup>2</sup>, Chongyang Li<sup>1</sup>, Chenchen Deng<sup>3</sup>, Chen Chen<sup>1</sup>, Shuying Yin<sup>1</sup>, Shouyi Yin<sup>1,3</sup>, Shaojun Wei<sup>1,3</sup>, Leibo Liu<sup>1,3</sup>

**Paper 34.1 Affiliation:** <sup>1</sup>School of Integrated Circuits, Tsinghua University, Beijing, China, <sup>2</sup>Micro Innovation Integrated Circuit Design Co.,Ltd, Wuxi, China, <sup>3</sup>Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China

Paper 34.3 Authors: Saurav Maji<sup>1</sup>, Utsav Banerjee<sup>2</sup>, Samuel H Fuller<sup>1,3</sup>, Anantha P Chandrakasan<sup>1</sup>

Paper 34.3 Affiliation: <sup>1</sup>Massachusetts Institute of Technology, Cambridge, MA, <sup>2</sup> Indian Institute of Science, Bengaluru, India, <sup>3</sup>Analog Devices, Wilmington, MA

Subcommittee Chair: Thomas Burd, AMD, Santa Clara, CA

#### CONTEXT AND STATE OF THE ART

- With the proliferation of electronics in mobile devices, cyber-physical systems, IoT and other devices, the need for hardware security only grows. This requires highly efficient implementations of current and next-generation cryptographic algorithms, i.e. symmetric and public-key algorithms and also post-quantum algorithms.
- At the same time, these devices need protection against physical manipulation, especially passive side-channel and active fault attacks. In the context of side-channel attacks, as the attacker capabilities increase, the countermeasures need to become more robust. Countermeasures exist at the circuit level, e.g. intelligent circuits for current attenuation, as well as at the mathematical level, e.g. masking countermeasures.
- True Random Number Generators (TRNGs) and Physically Unclonable Functions (PUFs) are foundational root-of-trust hardware primitives in secure platforms. Energy-efficient TRNGs and stable low bit-error-rate PUFs are critical to providing high-entropy keys and secure IDs for cryptographic workloads.
- As neural networks are deployed on the edge, the confidentiality of the proprietary models and inputs becomes increasingly important. Dedicated countermeasures are required to ensure that this sensitive data does not leak from the compute node.
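
As background for the masking countermeasures mentioned above, the sketch below shows the textbook three-share threshold implementation of a single AND gate. It is a generic software illustration, not the design of any paper in this session: each secret bit is split into three random shares, and each output share is computed without ever touching one of the input share indices, so no single computation sees the whole secret. Keeping the output shares uniformly distributed usually requires extra randomness or more shares, which is omitted here.

```python
import secrets
from typing import Tuple

Shares = Tuple[int, int, int]


def share3(bit: int) -> Shares:
    """Split one bit into three Boolean shares whose XOR equals the bit."""
    s1 = secrets.randbits(1)
    s2 = secrets.randbits(1)
    return (s1, s2, bit ^ s1 ^ s2)


def unshare(shares: Shares) -> int:
    """Recombine three Boolean shares into the original bit."""
    return shares[0] ^ shares[1] ^ shares[2]


def ti_and(x: Shares, y: Shares) -> Shares:
    """Three-share threshold implementation of z = x AND y."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    z1 = (x2 & y2) ^ (x2 & y3) ^ (x3 & y2)  # never uses share index 1
    z2 = (x3 & y3) ^ (x1 & y3) ^ (x3 & y1)  # never uses share index 2
    z3 = (x1 & y1) ^ (x1 & y2) ^ (x2 & y1)  # never uses share index 3
    return (z1, z2, z3)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            z = unshare(ti_and(share3(a), share3(b)))
            print(a, b, z)
```

The nine cross terms x_i AND y_j are distributed over the three output shares, so their XOR equals (x1^x2^x3) AND (y1^y2^y3), which is the unmasked result.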

### TECHNICAL HIGHLIGHTS

- Researchers at Tsinghua University present a crypto-processor for post-quantum algorithms with a diverse set of computing patterns and large memory requirements.
  - A 28nm processor supporting post-quantum cryptographic algorithms achieves 48KOPS throughput and 3.4µJ/Op energy efficiency at 0.9V and 500MHz, at an area of 3.6mm<sup>2</sup>.
- A team of researchers from MIT, IISc Bangalore and Analog Devices presents a threshold-masking-based neural-network accelerator that secures model parameters and inputs against power side-channel attacks.
  - This work protects the externally stored model parameters using lightweight Trivium encryption.
  - A 1.59mm<sup>2</sup> 28nm accelerator is presented that implements a threshold-based masking scheme with 64% area overhead and 6× energy overhead.
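
The throughput and energy figures quoted for the post-quantum crypto-processor imply an average power and a cycle budget per operation. The sketch below works these out, assuming that all figures refer to the same operating point (0.9V, 500MHz) and that "Op" denotes one complete cryptographic operation. The function names are hypothetical.

```python
def average_power_w(ops_per_s: float, energy_per_op_j: float) -> float:
    """Average power when running `ops_per_s` at `energy_per_op_j` each."""
    return ops_per_s * energy_per_op_j


def cycles_per_op(clock_hz: float, ops_per_s: float) -> float:
    """Average number of clock cycles available for each operation."""
    return clock_hz / ops_per_s


if __name__ == "__main__":
    print(average_power_w(48e3, 3.4e-6))  # watts
    print(cycles_per_op(500e6, 48e3))     # cycles per operation
```

Under these assumptions, the processor draws on the order of 160mW and spends roughly ten thousand clock cycles on each operation.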

### APPLICATIONS AND ECONOMIC IMPACT

- Hardware security has become a key aspect of system-on-chip design to avoid exploits of security vulnerabilities in compute and connected systems, protect data and preserve safety.
- Resilience to physical, machine learning and cryptanalytic attacks has become a major driver in hardware security applications, and is being pursued through the adoption of specific techniques that increase the attack effort by several orders of magnitude.
- The need for high-quality keys for data encryption and device authentication at low silicon area and cost is driving innovation in true random number generators, as well as physically unclonable functions with low bit error rate.
- Processors based on post-quantum cryptographic algorithms are gaining interest, in view of the threat posed by attacks by quantum computers.

# ISSCC 2022 TRENDS



#### PREAMBLE

The Session Overviews and Highlights to follow serve to capture the context, highlights, and potential impact, of the papers to be presented in each Session at ISSCC 2022 in February.

OBTAINING COPYRIGHT to ISSCC press material is EASY!

You may quote the Subcommittee Chair as the author of the text if authorship is required.

You are welcome to use this material, copyright- and royalty-free, with the following understanding:

- That you will maintain at least one reference to ISSCC 2022 in the body of your text, ideally retaining the date and location. For detail, see the FOOTNOTE below.
- That you will provide a courtesy PDF of your excerpted press piece and particulars of its placement to shahriar@ece.ubc.ca

#### FOOTNOTE

From ISSCC's point of view, the phraseology included in the box below captures what we at ISSCC would like your readership to know about this, the 69th appearance of ISSCC, on February 20<sup>th</sup> to February 24<sup>th</sup>, 2022.

This and other related topics will be discussed at length at ISSCC 2022, the foremost global forum for new developments in the integrated-circuit industry. ISSCC, the International Solid-State Circuits Conference, will be held virtually on February 20 - February 24, 2022.


# HISTORICAL TRENDS IN TECHNICAL THEMES: ANALOG SYSTEMS

Analog Subcommittee, Power Management Subcommittee, Data Converters Subcommittee



## Analog – 2022 Trends

Subcommittee Chair: Maurits Ortmanns, University of Ulm, Institute of Microelectronics, Ulm, Baden-Württemberg, Germany

In ISSCC 2022, the drive remains strong in developing new analog circuit techniques for pushing the performance of frequency references, sensor interfaces (temperature, electric current, magnetic field, and mass flow), high-speed comparators, and audio amplifiers.

The temperature stability of RC frequency references continues to improve, as shown in Figure 1. By intermittently referencing the XO in a single XO-based clock management system, an on-chip RC oscillator in ISSCC 2022 attains the best stability reported to date (0.68ppm/°C) using a duty-cycled machine learning-based calibration scheme.



Figure 1: Trends in stability over time for non-crystal oscillators.

In other applications, a new BJT-based CMOS temperature sensor achieves a $\pm 0.45$°C (3$\sigma$) inaccuracy using 1-point trim over a wide temperature range from -50°C to 180°C, without sacrificing the energy efficiency at high temperature (9.7pJ·K<sup>2</sup> at room temperature and 7.2pJ·K<sup>2</sup> at 150°C). A high-performance comparator now features a 174µV<sub>rms</sub> input-referred noise and a sub-500ps clock-out delay while consuming only 75fJ per comparison, supporting the development of efficient Gb/s serial communication as well as 5G/6G baseband applications.

Significant advancements are shown in audio amplifiers regarding wider dynamic range, power reduction, and improved linearity, as shown in Figure 2. The design of Class-D amplifiers has matured so much that the fundamental noise limits are coming within reach, and new creative solutions are needed to go beyond them. This is addressed in this year's papers using techniques such as supply-voltage-scaling volume control, dual voltage/current feedback, and the first capacitive-gain chopper Class-D audio amplifier. These audio amplifiers now approach high-end audio DACs in terms of DR and THD+N.



Figure 2: Trends in Class-D amplifier performance (axis: A-weighted SNR in dB).

## **Power Management – 2022 Trends**

#### Subcommittee Chair: Yogesh Ramadass, Texas Instruments, Santa Clara, CA

The diversity and wide scope of power-management applications and techniques are evident in this year's ISSCC submissions. Papers in the power management sessions include a broad range of applications from high-voltage line- and bus-interface converters to fast, low-voltage power delivery, to wireless/isolated power delivery, and to envelope-tracking, energy-harvesting, and other concepts. The continued progression and diversification of process technologies used for power-management integrated circuits are also highlighted. Papers showcasing high-level gallium-nitride (GaN) integration, high-voltage SOI-CMOS and deep submicron (4nm) CMOS represent efforts at opposite ends of the voltage and conversion ratio spectrum. New topologies and architectures continue to be a theme at ISSCC 2022. Hybrid and resonant switched-capacitor converters continue to expand in support of new applications while also addressing key industry-driven challenges such as transient performance, electromagnetic interference (EMI), and system robustness.

Among the many new design concepts and paradigms presented at ISSCC 2022, a number of key trends stand out:

- Power converters continue to push towards higher switching frequencies in the 10s to 100s of MHz range to reduce passive component sizes and achieve faster transient responses.
- Such high-frequency operation is enabled by advanced process technologies such as GaN and deep-submicron CMOS (down to 4nm) as well as new topologies, multi-phase operation, and gate-driving techniques.
- Hybrid and resonant switched-capacitor topologies continue to diversify while addressing key application spaces such as 48V-to-1V point-of-load conversion and challenges related to regulation and transient response.
- Isolated and wireless power delivery continue to be important drivers for applications that demand good efficiency, high common-mode transient immunity (CMTI) and EMI performance.
- A range of emerging applications including novel energy harvesting devices, battery system management, and low-power IoT and wearables creates opportunities for new circuit architectures and control strategies to address application-specific challenges.

## Data Converters – 2022 Trends

#### Subcommittee Chair: Michael Flynn, University of Michigan, Ann Arbor, MI

Data converters are a critical link between the analog physical world and the world of digital computing and signal processing prevalent in modern electronics. The need to faithfully preserve the signal across domains continues to pressure data converters to deliver more bandwidth and linearity while continuing to increase power efficiency. This year, ISSCC not only continues the trend of reporting highly energy-efficient analog-to-digital converters (ADCs), but also showcases new and exciting converter architectures, which open new possibilities for data conversion.

Pipelined-SAR architectures are pushing the speed limits of the current state-of-the-art in Nyquist converter design, while incremental converters are reaching new levels of efficiency. In noise-shaping converter design, delta-sigma and noise-shaping-SAR converters are continuing their prominent role and show their strengths in both high-efficiency, as well as high-speed data conversion.

The three figures below represent traditional metrics that capture the innovative progress in ADC design. The first figure plots power dissipated relative to the Nyquist sampling rate ( $P/f_{snyq}$ ), as a function of signal-to-noise and distortion ratio (SNDR), to give a measure of ADC power efficiency. Note that a lower  $P/f_{snyq}$  metric represents a more efficient circuit on this chart. For low-to-medium-resolution converters, energy is primarily expended to quantize the signal; thus the overall efficiency of this operation is typically measured by the energy consumed per conversion and quantization step. The dashed trend-line represents a benchmark of 1fJ/conversion-step. Circuit noise becomes more significant with higher-resolution converters, necessitating a different benchmark proportional to the square of the signal-to-noise ratio, represented by the solid line. Designs published from 1997 to 2021 are shown in circles. ISSCC 2022 designs are shown in black dots.
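The two benchmarks described above can be sketched numerically. The snippet below is a minimal illustration that assumes the commonly used Walden definition (energy per conversion-step, $P / (f_{snyq} \cdot 2^{ENOB})$) and Schreier definition (SNDR plus $10\log_{10}$ of bandwidth over power); the press kit does not spell out either formula, and the example converter values are hypothetical.

```python
import math


def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits for a full-scale sine wave: (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_fj(power_w: float, f_snyq_hz: float, sndr_db: float) -> float:
    """Energy per conversion-step in fJ, assuming the Walden definition."""
    steps = 2.0 ** enob_from_sndr(sndr_db)
    return power_w / (f_snyq_hz * steps) * 1e15


def schreier_fom_db(power_w: float, f_snyq_hz: float, sndr_db: float) -> float:
    """Noise-limited benchmark in dB, assuming the Schreier definition
    with the bandwidth taken as half the Nyquist sampling rate."""
    return sndr_db + 10.0 * math.log10((f_snyq_hz / 2.0) / power_w)


# Hypothetical converter (not taken from the press kit): 1 mW, 100 MS/s, 60 dB SNDR.
example = {"power_w": 1e-3, "f_snyq_hz": 100e6, "sndr_db": 60.0}
print(f"Walden FoM:   {walden_fom_fj(**example):.2f} fJ/conversion-step")
print(f"Schreier FoM: {schreier_fom_db(**example):.2f} dB")
```

A lower Walden number and a higher Schreier number both indicate a more efficient design, which is why the two are usually quoted for different resolution ranges.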

The second figure plots signal fidelity vs. the Nyquist sampling rate normalized to power consumption. At low sampling rates, converters tend to be limited by thermal noise, independent of the sample rate. Higher speeds of operation present additional challenges in maintaining accuracy in an energy-efficient manner, indicated by the roll-off vs. frequency in the dashed line. The last ten years have resulted in an improvement of over 10dB in power-normalized signal fidelity, or a 10× improvement in speed for the same normalized signal fidelity. In this year's ISSCC, audio delta-sigma converters reach new levels of efficiency, while no fewer than four pipelined-SAR converters continue the trends in the speed vs. efficiency corner of the graph. Noise-shaping SAR converters show their strengths in high-efficiency as well as high-speed converter design.

The final figure plots ADC bandwidth as a function of SNDR. Sampling jitter or aperture errors coupled with an increased noise bandwidth make achieving both high resolution and high bandwidth a particularly difficult task. While ten years ago, a state-of-the-art data converter showed an aperture error of approximately 1ps<sub>rms</sub>, in recent years, designs with aperture errors below 100fs<sub>rms</sub> have been published, many of which have been published at ISSCC.
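To see why aperture error bounds the achievable SNDR at high input frequencies, the sketch below uses the standard jitter-limited SNR expression for a full-scale sine wave, $-20\log_{10}(2\pi f_{in}\sigma_j)$. This formula is an assumption brought in for illustration (the press kit does not state it), and the 1GHz input frequency is a hypothetical value; only the 1ps and 100fs jitter figures come from the text.

```python
import math


def jitter_limited_snr_db(f_in_hz: float, sigma_j_s: float) -> float:
    """Upper bound on SNR (dB) set by rms sampling jitter for a sine input."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * sigma_j_s)


f_in = 1e9  # hypothetical 1 GHz input tone
for label, sigma in (("1 ps rms", 1e-12), ("100 fs rms", 100e-15)):
    print(f"{label:>10}: {jitter_limited_snr_db(f_in, sigma):.1f} dB")
```

Under this model, each tenfold reduction in aperture error raises the jitter-limited SNR by 20dB at a fixed input frequency.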

Finally, this year's ISSCC presents two converters advancing the state-of-the-art in incremental converter design, showing the strength of this architecture for extremely efficient and very small converters required for IoT applications.



Figure 1: ADC power efficiency ( $P/f_{snyq}$ ) as a function of SNDR.





Figure 3: Bandwidth (f<sub>in,hf</sub>, in Hz) vs. SNDR.

# HISTORICAL TRENDS IN TECHNICAL THEMES: COMMUNICATION SYSTEMS

RF Subcommittee, Wireless Subcommittee, Wireline Subcommittee



## RF – 2022 Trends

#### Subcommittee Chair: Jan Craninckx, imec, Belgium

ISSCC 2022 features record-setting advancements in phase-locked loops (PLLs), voltage-controlled oscillators (VCOs), power amplifiers (PAs), RF front-end circuits, and Terahertz prototypes driven by emerging requirements in 5G and 6G communications, radars, and imaging. While 5G applications show advancements in linear operation, high average efficiency, and duplex operation at 28 and 39GHz, remarkable achievements are also demonstrated in frequency bands above 100GHz to improve output power and efficiency. Additionally, circuit improvements based on advanced CMOS processes benefit existing systems, such as WiFi or low-power IoT devices.

**Frequency Generation**: ISSCC 2022 highlights the improved performance of VCOs, and this year's conference features several approaches that demonstrate a -190dBc/Hz FoM. First, a 10GHz series-resonance oscillator achieves a phase noise of -138dBc/Hz at 1MHz offset, and, second, a 60GHz multicore oscillator achieves a phase noise of -136dBc/Hz at 10MHz offset. A triple-core VCO further advances the FoM through a broadened tuning range that leverages multiple resonances. New advances in terahertz bands for imaging, based on coupled oscillator arrays with PIN diodes for harmonic generation, increase tuning range and RF-to-DC efficiency over prior art.

ISSCC 2022 also introduces PLL concepts for generating RF, microwave, and mm-wave frequency carriers, with several low-jitter, low-power prototypes that ultimately push the jitter-power figure of merit (FoM) below -250dB, as shown in Fig. 1. PLL advancements include a variety of sub-sampling and digital approaches in fractional-N architectures. A fractional-N type-I sampling PLL based on a voltage interpolator exhibits accelerated startup with a crystal oscillator. All-digital PLLs for radar achieve a phase noise of -120dBc/Hz at 1MHz and low frequency-modulation errors. A 5nm FinFET CMOS cascaded PLL achieves 204fs<sub>rms</sub> jitter and -241dB FoM with wideband operation. An ultra-low-jitter fractional-N ring-oscillator (RO) digital PLL dynamically selects a phase sector to reduce the digital-to-time-converter (DTC) range and thermal noise, achieving 188fs<sub>rms</sub> jitter and -243dB FoM. A fractional-N digital PLL employs a time-mode arithmetic unit to perform the time estimation and amplification tasks with a -249.4dB FoM. An RO-based injection-locked digital PLL incorporates a reference octupler with a probability-based adaptive calibration to achieve 177fs<sub>rms</sub> jitter. A harmonic-mixing-based fractional-N PLL with a coupled mm-wave VCO achieves 88fs<sub>rms</sub> jitter and -250dB FoM. A 28nm fractional-N bang-bang PLL achieves 68.6fs<sub>rms</sub> jitter. Finally, a 25.8GHz integer-N PLL with a time-amplifying phase-frequency detector suppresses charge-pump noise to reach 60fs<sub>rms</sub> jitter and -252.8dB FoM. These integer-N and fractional-N PLLs continue to improve power consumption and integrated jitter to keep pace with advances in communications and sensing applications.
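The jitter-power FoM quoted above combines integrated jitter and power in a single number. The sketch below assumes the widely used definition $\text{FoM} = 10\log_{10}\left[(\sigma_{rms}/1\text{s})^2 \cdot (P/1\text{mW})\right]$, which the press kit does not state explicitly, and inverts it to estimate the power implied by each jitter/FoM pair listed in the paragraph. The resulting power values are derived estimates under that assumption, not figures reported by the authors.

```python
import math


def jitter_power_fom_db(sigma_rms_s: float, power_mw: float) -> float:
    """Jitter-power FoM in dB (assumed definition; lower is better)."""
    return 10.0 * math.log10(sigma_rms_s ** 2 * power_mw)


def implied_power_mw(sigma_rms_s: float, fom_db: float) -> float:
    """Power in mW that reproduces the given FoM at the given rms jitter."""
    return 10.0 ** (fom_db / 10.0) / sigma_rms_s ** 2


# (rms jitter in seconds, FoM in dB) pairs quoted in the text above.
quoted = {
    "5nm cascaded PLL": (204e-15, -241.0),
    "RO digital PLL": (188e-15, -243.0),
    "harmonic-mixing PLL": (88e-15, -250.0),
    "25.8GHz integer-N PLL": (60e-15, -252.8),
}
for name, (sigma, fom) in quoted.items():
    print(f"{name:>22}: about {implied_power_mw(sigma, fom):.1f} mW implied")
```

Because jitter enters squared, halving the rms jitter at constant power improves this FoM by about 6dB, whereas halving the power improves it by only about 3dB.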



Fig. 1. PLL trends.

**RF and mm-Wave PAs**: ISSCC 2022 will introduce several exciting PA innovations spanning from RF bands below 7GHz to terahertz bands extending to 425GHz. These innovations improve efficiency, output power, linearity, and bandwidth of PAs and front-end components. ISSCC 2022 will showcase record-setting work in frequency bands above 100GHz, including 1) a silicon-germanium prototype of a Doherty PA demonstrating average efficiency of 12% and 2) a CMOS PA with 1.6W output power through massively scaled power-combining networks. As shown in Fig. 2, these demonstrations set new records for silicon-based technologies above 100GHz.



Fig. 2. CMOS and SiGe power amplifier trends in RF and millimeter-wave bands.

Other exciting developments for PAs include a multiband digital PA, realized in a 16-nm FinFET process, that achieves a power of 28dBm for emerging 6-to-7GHz applications, a bulk CMOS PA using a 3-way Doherty architecture to achieve high average efficiency for high back-off signals that reaches 25.5dBm at 28GHz, and a compact wideband bidirectional front-end that achieves high efficiency and low noise while operating without a transmit/receive switch. Other remarkable concepts are presented in RF front-end components, including an accurate power and impedance detector for load impedance sensing at the antenna in a phased array and a broadband (1-to-18GHz) silicon mixer and an integrated LO driver with P<sub>1dB</sub> exceeding 20dBm.

## Wireless – 2022 Trends

#### Subcommittee Chair:

The continuing demand for higher wireless data rates, in the context of mobile battery limitations, drives the development of high-throughput, power-efficient transceivers that support more carrier aggregation (CA) and wider bandwidth (BW) per path. This year, at ISSCC 2022, a sub-6GHz (FR1) 5G radio demonstrates a single-path wide-BW digital-IF receiver architecture that allows each downlink path to support up to 5 CA simultaneously while maintaining backward compatibility with existing 2G/3G/4G standards. Furthermore, a single-chip 5G mmW (FR2) 16-channel, dual-polarized, phased-array beam-forming IC (BFIC), with its corresponding quad-stream intermediate-frequency IC (IFIC), is demonstrated to support N257/N258/N261 and N260 with 32 antenna ports.

Figure 1 shows the trend in the number of CA for recent 5G cellular development, as well as the increase of bandwidth per downlink path. It indicates an increasing CA number and wider bandwidth per downlink path for faster data-rate requirement. This year, the sub-6GHz FR1 cellular receiver supports up to 15 inter/intra CA by 3 downlink paths with the capability of achieving 300MHz BW per path in 14nm FinFET CMOS. The mmW (FR2) 5G BFIC is designed in 28nm SOI CMOS and paired with a 14nm FinFET IFIC to support up to 2 non-contiguous intra-band CA and achieve 800MHz BW per downlink path.



Figure 1: Trends in the number of CA from downlink path and max. BW per path for recent cellular SoC implementations

Ultra-low-power (ULP) receivers have continued to make dramatic improvements that facilitate widespread adoption. Two ULP receivers are presented at ISSCC this year that push the limits of sensitivity for receivers operating under 1mW. This includes a record -124dBm sensitivity for a 781µW receiver that leverages a coherent FSK demodulator and integrated 4192-bit correlator. Clever techniques that leverage duty-cycling to reduce average power without compromising latency are also a recent trend.

Selectivity, or interference rejection, is a critical metric for all receivers that operate in the presence of other incumbent transmitters. As all RF spectrum becomes more crowded, adequate SIR (signal to interference ratio) or IRR (image-rejection ratio) is essential for scaling the number of users that can concurrently occupy a band. As Figure 2 highlights, recently published ULP receivers have steadily advanced this metric, even at the expense of sensitivity and power.



Figure 2: Plot of signal-to-interference ratio (SIR) or image-rejection ratio (-IRR) vs. figure-of-merit ($\text{FoM} = -\text{Sensitivity} + 10\log(\text{data rate [b/s]}) - 10\log(\text{power [W]})$) as reported by ULP receivers published at ISSCC.
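The FoM in the caption above can be evaluated directly. The sketch below is a minimal illustration that assumes the sensitivity is expressed in dBm and the logarithms are base 10; the -124dBm sensitivity and 781µW power are quoted in the text, while the 1kb/s data rate is a hypothetical placeholder because the press kit does not give one.

```python
import math


def ulp_receiver_fom_db(sensitivity_dbm: float, data_rate_bps: float, power_w: float) -> float:
    """FoM = -Sensitivity + 10*log10(data rate [b/s]) - 10*log10(power [W])."""
    return (
        -sensitivity_dbm
        + 10.0 * math.log10(data_rate_bps)
        - 10.0 * math.log10(power_w)
    )


fom = ulp_receiver_fom_db(
    sensitivity_dbm=-124.0,  # from the text
    data_rate_bps=1e3,       # hypothetical data rate, for illustration only
    power_w=781e-6,          # from the text
)
print(f"FoM = {fom:.1f} dB")
```

A higher value is better: the metric rewards lower (more negative) sensitivity, higher data rate, and lower power.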

## Wireline – 2022 Trends

#### Subcommittee Chair: Yohan Frans, Xilinx, San Jose, USA

Over the past few decades, electrical and optical interconnects have been the key components bridging the gap between the exponentially growing demand for data bandwidth across electronic components/systems and the relatively gradual increase in pin/cable density. Ranging from handheld electronics to supercomputers, wireline data communication bandwidth must also grow exponentially to avoid limiting the performance scaling of these systems. By increasing the data per pin or cable of various electronic devices and systems, such as memory, graphics, chip-to-chip fabric, backplane, rack-to-rack, and LAN, wireline I/O has fueled incredible technological innovation in electronic devices and systems over the past few decades. Figure 1 shows that data-rate per pin has approximately doubled every four years across various I/O standards ranging from DDR, to graphics, to high-speed Ethernet. Figure 2 shows that the data rates for published transceivers have kept pace with these standards while taking advantage of CMOS scaling. Figure 3 shows published transceiver energy efficiency vs. channel losses at the Nyquist frequency in the 40-to-50dB range. In part, this incredible improvement is enabled by the power-performance benefits of process technology scaling. However, sustaining this exponential trend for I/O bandwidth requires more than just transistor scaling. Significant advances in energy efficiency, channel equalization and clocking must be made to enable the next generation of low-power and high-performance computing systems.

Papers at ISSCC this year include an example of a PAM-4 receiver at >200Gb/s, PAM-4 long-reach copper interconnect transceivers operating up to 112Gb/s, PAM-4 medium-reach electrical interconnects operating up to 112Gb/s, PAM-4 electrical transmissions operating up to >60Gb/s over a high-loss channel, a short-reach optical coherent receiver operating up to 200Gb/s, a PAM-4 optical transmitter operating up to 100Gb/s and a PAM-4 bi-directional link operating up to 50Gb/s on a plastic waveguide. New techniques for extending data rate, power reduction, channel equalization, and clock recovery are reported. These transceivers and transceiver building blocks are implemented in CMOS technology.

#### Scaling Electrical Interconnects to 100Gb/s and Reaching out to >200Gb/s

Bandwidth requirements in data centers and telecommunication infrastructure continue to drive the demand for ultra-high-speed wireline communication. Over the past few years, complete transceivers operating up to 112Gb/s were demonstrated across a long-reach copper channel with >45dB loss. Two notable trends in these transceivers, especially for long-reach channels, are the adoption of PAM-4 modulation and a transition to DAC/ADC architectures with DSP-based equalization. Although PAM-4 provides twice the data rate at the same baud rate as conventional NRZ to relax channel loss requirements for bandwidth doubling, it also comes with more stringent requirements for linearity, noise and multi-level signaling. This trend has motivated development of low-power data converters, digital equalization and clock recovery along with linear, high-bandwidth TX and RX analog front ends. Over the past two years, the first transceiver components were demonstrated to extend these transceivers to 112Gb/s. This year, ISSCC includes an implementation of a 112Gb/s PAM-4 long-reach transceiver in 5nm CMOS with relatively low power consumption. In paper 6.2, Marvell describes a 112Gb/s PAM-4 transceiver for long-reach copper interconnects consuming only 4.5pJ/b while operating over a channel with 50dB loss. In addition, Peking University (Paper 6.3) presents a 112Gb/s PAM-4 mixed-signal transceiver design in 28nm CMOS, which achieves 1E-11 BER over a 20.8dB-loss channel, while consuming 2.29pJ/b. While 112Gb/s links are maturing, several papers at ISSCC are directed to doubling the data rate to 224Gb/s. In paper 6.1, Intel demonstrates a first 224Gb/s PAM-4 ADC-based RX consisting of an inductively peaked analog front-end, a 64-way interleaved ADC and DSP incorporating a 16-tap digital FFE. The power efficiency of the analog part is 1.41pJ/b while operating over a channel with 31dB loss. In paper 17.4, UCLA also demonstrates a 56GHz fractional-N PLL in 28nm CMOS for 224Gb/s PAM-4 transmitters.
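The relationship between bit rate, symbol (baud) rate, and Nyquist frequency for the modulation formats discussed above can be sketched as follows. The helper names are hypothetical; the only facts used are that an M-level PAM symbol carries $\log_2 M$ bits and that the Nyquist frequency is half the symbol rate.

```python
import math


def symbol_rate_baud(bit_rate_bps: float, pam_levels: int) -> float:
    """Symbol rate for an M-level PAM link: bit rate / log2(M)."""
    return bit_rate_bps / math.log2(pam_levels)


def nyquist_frequency_hz(bit_rate_bps: float, pam_levels: int) -> float:
    """Nyquist frequency is half the symbol rate."""
    return symbol_rate_baud(bit_rate_bps, pam_levels) / 2.0


for rate_gbps in (112, 224):
    for name, levels in (("NRZ", 2), ("PAM-4", 4)):
        baud = symbol_rate_baud(rate_gbps * 1e9, levels) / 1e9
        nyq = nyquist_frequency_hz(rate_gbps * 1e9, levels) / 1e9
        print(f"{rate_gbps}Gb/s {name:>5}: {baud:.0f} GBd, Nyquist {nyq:.0f} GHz")
```

For 224Gb/s PAM-4 this gives a 112GBd symbol rate and a 56GHz Nyquist frequency, which is numerically consistent with the 56GHz PLL mentioned above; how that PLL is actually used in the transmitter clocking is not stated here.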

#### Short-Reach Links for Intra-Package Communications

As a consequence of the increasing demand for bandwidth in high-throughput systems used in AI, HPC and switch applications, multiple devices are integrated in the same package, and data is sent between chips on the same package or between a central chip and co-packaged optics (CPO). For these applications, relatively short distances (<50mm) have to be bridged with minimum power and at the highest throughput per mm chip-edge (Gb/s/mm). Since the channel attenuation and discontinuity in these links are small, low-power analog-oriented architectures are preferred over the heavy DSP-based solutions for longer channels. In paper 6.5, Marvell presents an 8×113Gb/s PAM-4 XSR transceiver in 5nm CMOS, which achieves 1E-11 BER over an 80mm MCM channel while consuming only 1.55pJ/b.
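Energy-per-bit figures translate into link power by multiplying with the data rate, since 1pJ/b at 1Gb/s equals 1mW. The sketch below applies this to the efficiency and rate pairs quoted in this section. The resulting power numbers are derived estimates rather than reported values, and treating the 1.55pJ/b figure as applying to the full 8-lane aggregate is an assumption.

```python
def link_power_mw(energy_pj_per_bit: float, data_rate_gbps: float) -> float:
    """Power in mW: (pJ/b) * (Gb/s) = 1e-12 J/b * 1e9 b/s = 1e-3 W."""
    return energy_pj_per_bit * data_rate_gbps


# (description, energy in pJ/b, aggregate data rate in Gb/s), from the text.
links = [
    ("112Gb/s LR transceiver (paper 6.2)", 4.5, 112),
    ("112Gb/s mixed-signal transceiver (paper 6.3)", 2.29, 112),
    ("224Gb/s RX, analog part (paper 6.1)", 1.41, 224),
    ("8x113Gb/s XSR transceiver (paper 6.5)", 1.55, 8 * 113),
]
for name, pj_per_bit, gbps in links:
    print(f"{name}: about {link_power_mw(pj_per_bit, gbps):.0f} mW")
```

This makes the trade-off in the text concrete: the analog-oriented short-reach link moves roughly eight times the data of a single long-reach lane at around three times the power.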

#### **Optical Links for Upcoming 400G Data Center Interconnects**

The explosive growth of data and data-centric computing places stringent demands on the bandwidth and energy efficiency of data center interconnects, spurring the development of several 200-to-400G Ethernet standards. Low-power data converters and optical integration are the two key components for the development of high-performance optical pluggable modules using coherent detection. In Paper 17.1, Peking University presents a first 200Gb/s analog DP-QPSK coherent optical RX in 28nm CMOS which achieves 4.6pJ/b power efficiency with an active area of 0.06mm<sup>2</sup>. The RX is based on 12MHz carrier recovery, equalization for 10km optical transmission, 1.2Mrad/s SOP and 9dB electrical channel loss. While coherent optical communication covers long distances, demand is also surging for shorter, non-coherent (<2km) optical links. Several 400G Ethernet standards (e.g. 400G-DR4/FR4) target optical line rates of 100Gb/s. Since these links are envisioned to be employed in data centers in high volume, low cost and low power are key requirements. Solutions so far typically employed an analog PAM-4 optical transmitter utilizing a micro-ring or travelling-wave Mach-Zehnder modulator, which resulted in high power dissipation, large area, and high cost. In paper 17.2, the California Institute of Technology demonstrates an optical 100Gb/s PAM-4 transmitter in a SiP-CMOS platform including a push-pull segmented MZM structure using MOSCAP phase modulators. The driver IC is implemented in 28nm CMOS and consumes only 2.4pJ/b.

#### **Concluding Remarks:**

Continuing to aggressively scale I/O bandwidth is essential for the industry, but the tradeoffs between bandwidth, power, area, cost and reliability are extremely challenging. Advances in circuit architecture, interconnect topologies, transistor scaling and integrated silicon photonics are changing how I/O will be done over the next decade. The most exciting and promising of these emerging technologies for electrical and optical interconnects will be highlighted at ISSCC 2022.



Figure 1: Per-lane data-rate vs. year for a variety of common I/O standards.



Figure 2: Data-rate vs. process node and year.



# HISTORICAL TRENDS IN TECHNICAL THEMES: DIGITAL SYSTEMS

Digital Architectures & Systems Subcommittee, Digital Circuits Subcommittee, Machine Learning & AI Subcommittee, Memory Subcommittee


### **Digital Architectures & Systems (DAS) – 2022 Trends**

#### Subcommittee Chair: Thomas Burd, Advanced Micro Devices, Santa Clara, CA

This year's selection of processor papers highlights the industry trend toward multi-die integration in a package with increased numbers of transistors per system. Innovative packaging technologies, including 3D stacking and direct bonding are being productized, which supports easy integration of multiple process nodes into a single socket. This has also fueled an exponential increase in on-system memory that drives increased performance. The drive to higher clock frequencies has plateaued, being replaced by a drive to increased core counts. Bump and through-silicon-via pitches continue to scale down at a rapid rate, enabling tremendous increase in bandwidth across multiple dies. The mobile CPU continues to increase in both frequency and performance, while providing a wide range of performance and energy efficiency. The system scalability trend is also evident in a configurable array of processing elements across multiple dielets interacting with low bit-error rate channels.



Figure 1: Core-count trends (red diamond designates multi-chip module).



Figure 2: Trends in die count per system.



Figure 3: Chip-complexity scaling trends (red diamond designates multi-chip module).



Figure 4: On-die cache-size trends (red diamond designates multi-chip module).

We see a continued push on application-processor performance and efficiency, along with video, display, and camera capabilities to fully leverage the ubiquitous connectivity smartphones provide. There are also major innovation efforts in 5G, artificial intelligence (AI) and gaming. 5G cellular technology is becoming more mature and the post-5G era is gaining more focus. The 6G era will feature more antennas, more intelligence, and more use cases. The range of applications of neural network processing units (NPUs) is gradually expanding beyond image and speech recognition to cellular performance improvement and SoC power and performance optimization. As a result, NPU performance is increasing, and tiny NPUs for low-power operation are also being deployed widely. For better user experience and game quality, the main concern surrounding the display is moving from resolution to frame rate.

| Graphics           | OpenGL         OpenGL/VG/MAX         AR         VR (Virtual Reality)           (ES1.1)         (ES2.0)         (Augmented Reality)         Vulkan                                                                                               |
|--------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Display            | VGA WVGA SXGA WQXGA/WQXGA+ WQXGA/WQXGA+ WQXGA/WQXGA+ 4K QHD+<br>@60fps @60fps @60fps @60fpsX2 (VR) @120fps 240fps QHD+                                                                                                                          |
| Camera             | 5-8M 10M 16M 20M 24M 12MxDual 12MxDual 48M 200M<br>360°VR 360°VR Triple Quad                                                                                                                                                                    |
| Image/Video        | H.264/AVC         H.264/AVC         H.265/MVC         H.265/VP9         AV1         8K           (VGA)         (D1)         (Full HD)         H.264/SVC         H.265/VP9         HDR         HDR10+         @30fps                             |
| Audio              | AAC AAC Plus WMA Dolby DSD TWS<br>Dolby 5.1 TrueHD/Digital+ Dolby Atmos Truly Wireless                                                                                                                                                          |
| Accelerator        | SIMD         Multicore         Heterogeneous        Neural Net         5 TOPS         15 TOPS           Multicore (2~4)         (4~8)         Multiprocessing         Processor         5 TOPS         15 TOPS                                  |
| Downlink<br>[Mbps] | UMTS         HSPA         HSPA+         LTE         LTE-A         LTE-A         LTE-A         5G         5G           0.4 ~ 2         1.8 ~ 7         7 ~ 42         100         150 ~ 750         1600         2000         5000         10000 |
| CPU [MIPS]         | 300         500         800         2400         6K         12K         13K         19K         22K         26K           500         800         2400         6K         12K         10K         112K         162K         180K         208K   |
|                    | 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 <u>2022</u>                                                                                                                                                          |

Figure 5: Application-processor trends in smartphones.

The bandwidth of wired and wireless links continues to increase at a rate of approximately 10× higher data rates every five years. Compared to previous years, the changes this year are modest. Massive MIMO and mm-Wave technologies are being actively studied to realize full 5G communication. 5G mobile devices have become mainstream in 2021/2022, with maximum data rates reaching 10Gbps. The explosion of IoE devices will require the evolution of narrow-band wide-area networks.
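Growth statements such as "10× every five years" are easier to compare once converted to an annual rate or a doubling time. The sketch below does this conversion, assuming steady exponential growth, for the 10×-per-five-years trend above and for the "doubled every four years" per-pin trend quoted in the Wireline section. The function names are illustrative.

```python
import math


def annual_growth_factor(total_factor: float, years: float) -> float:
    """Per-year multiplier for steady exponential growth."""
    return total_factor ** (1.0 / years)


def doubling_time_years(total_factor: float, years: float) -> float:
    """Years needed to double at the same steady rate."""
    return years * math.log(2.0) / math.log(total_factor)


trends = {
    "link bandwidth (10x per 5 years)": (10.0, 5.0),
    "data rate per pin (2x per 4 years)": (2.0, 4.0),
}
for name, (factor, years) in trends.items():
    growth = (annual_growth_factor(factor, years) - 1.0) * 100.0
    print(f"{name}: {growth:.0f}% per year, "
          f"doubling every {doubling_time_years(factor, years):.1f} years")
```

Under this model, link bandwidth grows by roughly 58% per year and per-pin data rate by roughly 19% per year.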



Figure 6: Data-rate trends in wired, wireless and cellular.

**Circuits for Hardware Security:** With the increasing risk and cost of information theft and safety hazards, hardware security has become a common requirement in intelligent and connected systems. Though focus on cryptographic implementation continues, cost-effective and low bit-error-rate PUFs (physically unclonable functions) are increasingly adopted in smart cards, sensor nodes, consumer devices, and automotive. TRNGs (true random-number generators) are also commonly required to strengthen secret-key generation in cryptographic applications. Techniques to counteract side-channel attacks are enabling higher levels of security at lower design cost, thanks to the higher degree of design reuse and more digital circuit techniques. Quantum computers allow dramatic speed-ups in attacks on existing public-key algorithms. Standardization bodies, such as NIST, have started competitions to identify potential post-quantum cryptographic (PQC) schemes. Novel PQC accelerators are now being designed to efficiently and securely implement these schemes in hardware.

Figure 7 illustrates trends in area scaling in PUFs (area/bit) and TRNGs published recently at ISSCC, showing relentless area and cost reductions. Figure 8 shows the energy/bit scaling in PUFs published at ISSCC, and the relatively higher energy/bit of TRNGs. The sudden PUF energy increase of three years ago is attributable to the stronger emphasis on low bit-error rate. Figure 9 illustrates the native and post-processing bit-error rate, the latter of which has recently seen drastic reductions thanks to new techniques mitigating bit instability for ECC elimination. With regard to techniques counteracting side-channel attacks (EM and power), Figure 10 shows the progressive improvement in the measurements-to-disclosure of cryptographic keys, as determined by the ratio of the power trace count necessary for a successful attack under protected and unprotected designs.



Figure 7. Area/bit trends for physically unclonable functions (PUFs) and area trends for true random number generators (TRNGs) published recently at ISSCC.



Figure 8. Energy/bit trends for PUFs and TRNGs published recently at ISSCC.



Figure 9. Native and post-processing bit error rate for PUFs, showing the dramatic improvements offered by PUF post-processing.



Figure 10. Improvement in measurements-to-disclosure (MTD) of cryptographic keys of side-channel counteraction techniques (normalized to unprotected design).

### **Digital Circuits – 2022 Trends**

#### Subcommittee Chair: Keith Bowman, Qualcomm, Raleigh, NC

The demand for higher-performance and more energy-efficient platforms, ranging from Internet of Everything (IoE) devices to cloud data-centers and telecommunication infrastructure, continues to drive innovations in CMOS digital-circuit building blocks with goals of improving performance and energy efficiency, while lowering cost and design effort. Variation-tolerant design continues as a major trend in digital circuits for robustness across process, voltage, temperature and aging variations in computing architectures and systems. System-on-Chip architectures present an increasingly close interaction between circuits and micro-architecture, including hardware-driven clock-gating control. Digital circuit innovations also enable new applications leveraging emerging technologies, such as non-volatile memories, thin-film transistors or human-body communication.

A continued trend towards application-specific accelerators is leading to the development of new circuit techniques that benefit a range of emerging applications, such as optimization problems, artificial intelligence at the edge, or massive 5G telecommunications. Some of these accelerators leverage approximate computation, RRAM/SRAM compute-in-memory strategies, or adaptive digital signal processing for wireless communication.

In addition, continued improvements in traditional digital circuit blocks for integrated power management and clocking are permitting new usage scenarios.

**Integrated Voltage Regulators**: Energy reduction remains a top priority for integrated voltage regulators. Block-level regulation with digital low-dropout (LDO) linear regulators is maturing for integration into scaled process nodes to allow multiple processor cores on the same input voltage rail to operate at unique voltages according to the core workload. Concurrently, the need for high-efficiency voltage down-conversion has driven inductor-based voltage regulators (LCVR) and switched-capacitor voltage regulators (SCVR) to finer granularities for dynamic voltage and frequency scaling (DVFS) of individual functional blocks. Compared to traditional analog buck converters, mixed-design implementations now rely on digital regulation loops to optimize the power consumption with minimum-energy-point tracking (MEPT) at near-threshold operation or during idle phases. Figure 1 describes the power conversion efficiency of these integrated voltage regulators across calendar years, indicating a continuous improvement in the converter efficiency, while offering optimized block-level energy consumption.



Figure 1. Integrated voltage regulator trends in power efficiency.
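To illustrate why switching regulators (LCVR, SCVR) are preferred over LDOs for large voltage down-conversion, the sketch below evaluates the textbook efficiency bound of a linear regulator, which cannot exceed Vout/Vin. The function name, parameters and voltages are assumptions chosen for illustration and do not come from the figure data.

```python
def ldo_efficiency(v_in: float, v_out: float, i_load: float,
                   i_quiescent: float = 0.0) -> float:
    """Power conversion efficiency of a linear (LDO) regulator.

    The load current flows from v_in to v_out through the pass device, so
    efficiency = (v_out * i_load) / (v_in * (i_load + i_quiescent)).
    With zero quiescent current this reduces to v_out / v_in.
    """
    if not 0 < v_out <= v_in:
        raise ValueError("require 0 < v_out <= v_in")
    if i_load <= 0 or i_quiescent < 0:
        raise ValueError("require i_load > 0 and i_quiescent >= 0")
    return (v_out * i_load) / (v_in * (i_load + i_quiescent))


# Hypothetical cores sharing a 1.0 V rail: one runs at 0.8 V, one at 0.5 V.
for v_core in (0.8, 0.5):
    eff = ldo_efficiency(v_in=1.0, v_out=v_core, i_load=0.1)
    print(f"Vout = {v_core:.1f} V -> ideal LDO efficiency {eff:.0%}")
```

Under these assumptions the deeper the voltage drop, the more power the LDO dissipates, which is the regime where inductor-based or switched-capacitor conversion becomes attractive.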

**Digital PLLs for Low-Jitter Applications**: PLLs continue their analog-to-digital migration to provide more functionality, variability management and lower design complexity at advanced nodes. Demand for compact low-jitter PLLs continues to increase, with integer or fractional multiplication ratios with respect to a low-frequency reference. The use of more automated digital design flows, such as synthesis and automated placement and routing, dramatically reduces development costs, but requires new techniques to reduce jitter. Moreover, increasing reliability constraints in growing markets, such as advanced driver assistance systems, create a need for safety mechanisms to enforce safe operation. In addition, power and area reductions achieved by digital and mixed-signal PLLs now allow their usage as analog functional block drivers, but these new usages come with additional signal integrity constraints, leading to the development of digital circuit techniques for spurious-tone (Spur) cancellation due to frequency mixing with a reference or a fractional multiplier. Figure 2 describes a key figure of merit (FoM) combining jitter, power and reference frequency for digital clock generators across calendar years, highlighting a continued trend in FoM reduction, while recent works outline an increasing effort on Spur cancellation with mixed-signal and synthesized all-digital implementations.



Figure 2. Top: digital clock generators key figure of merit (FoMr) across recent years, defined as: FoMr =  $10 \times \log_{10} \{ (\text{Jitter}_{\text{RMS}}/1\text{s})^2 \times (\text{Power}/1\text{mW}) \times (\text{Fref}/50\text{MHz}) \}$ ; bottom: spurious tone reduction trends.
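The figure of merit in the caption of Figure 2 can be evaluated directly. The following is a minimal sketch that implements the formula exactly as written in the caption; the function name and the example operating point are hypothetical.

```python
import math


def fom_r_db(jitter_rms_s: float, power_w: float, f_ref_hz: float) -> float:
    """FoMr = 10*log10{(Jitter_RMS/1s)^2 * (Power/1mW) * (Fref/50MHz)} in dB."""
    if jitter_rms_s <= 0 or power_w <= 0 or f_ref_hz <= 0:
        raise ValueError("all inputs must be positive")
    jitter_term = (jitter_rms_s / 1.0) ** 2   # normalized to 1 s
    power_term = power_w / 1e-3               # normalized to 1 mW
    ref_term = f_ref_hz / 50e6                # normalized to 50 MHz
    return 10.0 * math.log10(jitter_term * power_term * ref_term)


# Hypothetical PLL: 100 fs RMS jitter, 10 mW, 50 MHz reference.
print(f"FoMr = {fom_r_db(100e-15, 10e-3, 50e6):.1f} dB")  # about -250 dB
```

Lower (more negative) values are better, so a tenfold jitter reduction at constant power and reference frequency improves the figure of merit by 20 dB.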

### Machine Learning (ML) & AI – 2022 Trends

#### Subcommittee Chair: Marian Verhelst, KU Leuven - MICAS, Leuven, Belgium

In response to the worldwide surge of interest in deep learning in recent years, ISSCC established a subcommittee dedicated to machine learning and AI, starting two years ago with the 2020 edition. As deep neural networks achieve higher accuracy on a wide variety of tasks, the size and computational complexity of these models also keep rising. Across datacenter, mobile, and IoT workloads, this results in continued demand for more energy-efficient and higher-throughput neural-network computing, for image processing as well as other workloads. This year's submissions are divided into two sessions. First, there is a full session on novel chip solutions in support of the popular workload of image processing using convolutional neural networks, introducing a broad variety of circuit techniques ranging from analog circuitry to compute-in-memory to hidden networks. Then, we have a smaller session on chip solutions focused on emerging workloads, ranging from transformer networks for natural language processing to large-memory recommendation systems to spike-based neuromorphic processing.

It is important to note that the metrics that matter at the system level are energy-per-inference (or -per-training-example), and inferences/second (or training-examples/second) on a specific task at a given inference (or final trained) accuracy. This year's submissions significantly push the state-of-the-art of these efficiency and throughput numbers yet again, often by combining multiple enhancement techniques within a single chip (or multiple chiplets), implemented across a broad range of technology nodes (Figure 1):



Figure 1: Various parameters impacting low-level and system-level benchmarking metrics.

- 1) Compute-in-memory (CIM) architectures continue to be popular, making use of SRAM, dynamic-capacitive, and even Flash-based memories. Innovations here continue to shift towards support of more complicated and larger network models, and towards increased flexibility through the inclusion of digital blocks at the edge of CIM macros.
- 2) Exploitation of sparsity in various forms has long been an important focus of both inference and training acceleration for imaging applications, given the large number of zero activations and/or zero weights within deep convolutional neural networks. The advent of transformer networks has introduced the need for sparsity within attention-based compute, since the number of operations can scale as rapidly as the square of the sequence length.
- 3) The incorporation of analog circuitry, either at the front end of the network, or to perform multiply-accumulate operations in the current or charge domain using Kirchhoff's current law, was featured by several papers, leading to better energy efficiency, while still maintaining reasonable neural network accuracy.
- 4) In contrast, several papers explicitly chose to avoid analog accumulation, preferring the precision of a pure digital solution for compute-in-memory, spike accumulation, and other computational tasks, exploiting sparsity and data reuse to achieve improvements in energy efficiency. Figure 2 helps illustrate this taxonomy between digital, analog, OFF-memory and IN-memory compute.

|                    | Digital compute                           | Analog compute                                                                      |
|--------------------|-------------------------------------------|-------------------------------------------------------------------------------------|
| OFF-memory compute | • Digital processors<br>• Systolic arrays | • In-sensor processing<br>• Analog convolutions<br>• Analog spiking implementations |
| IN-memory compute  | • Digital in-memory compute               | • Analog in-memory compute                                                          |

Figure 2: Taxonomy of digital vs. analog, and OFF-memory vs. IN-memory compute.

- 5) While most submissions focus on inference of imaging workloads using convolutional neural networks, particularly for low-power edge-based computing, several papers focused on non-imaging workloads. For Transformer networks, papers focused on efficient computation of the attention-based compute; for recommendation systems, hybrid-bonding was introduced to reduce the latency and energy costs of accessing DRAM. A novel spiking processor enabled task-agnostic online learning over seconds using a recurrent neural network, and applied this to navigation, gesture recognition, and keyword-spotting tasks.
- 6) ISSCC 2022 shows machine learning processors implemented across a wide variety of technology nodes, ranging from test chips where novel ideas are carefully evaluated using older technology nodes (65nm, 180nm), to industry chips ranging from 40nm Flash chips to 22nm recommender chips to a 4nm NPU supporting a variety of precision formats.
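The quadratic scaling of attention compute mentioned in item 2 can be sketched with a simple operation count. The code below counts only the multiply-accumulates needed to form the query-key score matrix of a single attention head; the function names, the head dimension and the kept fraction are illustrative assumptions rather than parameters of any presented chip.

```python
def attention_score_macs(seq_len: int, head_dim: int) -> int:
    """MACs to form the seq_len x seq_len score matrix Q @ K^T for one head."""
    return seq_len * seq_len * head_dim


def sparse_attention_score_macs(seq_len: int, head_dim: int,
                                keep_fraction: float) -> int:
    """MACs when only keep_fraction of the query-key pairs are evaluated."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in [0, 1]")
    kept_pairs = round(seq_len * seq_len * keep_fraction)
    return kept_pairs * head_dim


# Doubling the sequence length quadruples the dense score computation.
for n in (128, 256, 512):
    dense = attention_score_macs(n, head_dim=64)
    sparse = sparse_attention_score_macs(n, head_dim=64, keep_fraction=0.25)
    print(f"seq_len={n:4d}  dense={dense:>10,d} MACs  25%-kept={sparse:>10,d} MACs")
```

This is why skipping unimportant query-key pairs pays off more and more as sequences grow longer.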

As different chipsets are often characterized on a different set of tasks, network topologies, and accuracy levels, a direct comparison of the true system-level benchmarking metrics – such as energy/inference and inference/s – is not always straightforward. It is therefore instructive to look at the reported low-level metrics of operations/s and energy/operation within the neural network. Figure 3 displays the energy-efficiency-vs-throughput operating points demonstrated by the accelerators presented at ISSCC 2022 (yellow), compared to the state-of-the-art in 2016-2018 (blue), 2019 (black), 2020 (green), and 2021 (red). Figure 4 plots the evolution of both energy efficiency and area efficiency (throughput-per-unit-area) over the past few years.

From these graphs, the improvement in terms of the low-level metrics of operations/s and energy/operation is not particularly apparent. Yet, ISSCC attendees should keep in mind that these TOPS/W and TOPS specifications depend strongly on the level of integration of the chip, and on the particular neural network topologies being used. At ISSCC, we see a clear trend towards more complete SoC integration, in which the highly efficient MAC compute arrays introduced over the past years are now integrated into full processing systems. A second trend is the support for larger and more complex networks. Going forward, as the field matures, we believe that a common benchmarking methodology must be established which can properly account for the application context and provide proper translation between low-level and system-level performance metrics [1]. In the meantime, the clever combination of sparsity, variable precision, and in-memory computing technologies is continuing to enhance deep-learning processor efficiency and throughput. With the increase of system-level integration of machine learning engines together with important sub-systems (imaging chips, image pre-processing, audio filtering and pre-processing, etc.), these performance improvements will continue to open up new AI applications.
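As a rough illustration of how low-level metrics relate to system-level ones, the sketch below converts TOPS/W and TOPS into energy-per-inference and inferences/second. It is an idealized first-order estimate: it assumes every operation costs the reported average energy and ignores data movement, precision and sparsity effects, which is precisely why the text cautions against direct comparison. The function names, the utilization parameter and the example numbers are hypothetical.

```python
def energy_per_inference_joules(ops_per_inference: float,
                                tops_per_watt: float) -> float:
    """Idealized energy per inference: 1 TOPS/W equals 1e12 operations per joule."""
    if ops_per_inference <= 0 or tops_per_watt <= 0:
        raise ValueError("inputs must be positive")
    return ops_per_inference / (tops_per_watt * 1e12)


def inferences_per_second(ops_per_inference: float, peak_tops: float,
                          utilization: float = 1.0) -> float:
    """Idealized throughput, scaled by the fraction of peak compute in use."""
    if ops_per_inference <= 0 or peak_tops <= 0:
        raise ValueError("inputs must be positive")
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return peak_tops * 1e12 * utilization / ops_per_inference


# Hypothetical network needing 1e9 operations per inference on a
# hypothetical accelerator rated at 10 TOPS/W and 1 TOPS peak.
energy = energy_per_inference_joules(1e9, tops_per_watt=10.0)
rate = inferences_per_second(1e9, peak_tops=1.0, utilization=0.5)
print(f"{energy * 1e6:.0f} uJ/inference, {rate:.0f} inferences/s")
```

Real chips deviate from this estimate depending on integration level and network topology, so the numbers are best read as upper bounds on what the low-level metrics imply.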



Figure 3: Deep-learning processor energy-efficiency (TOPS/W) and throughput (GOPS).



Figure 4: Evolution in energy efficiency (TOPS/W) and throughput per unit area (TOPS/mm<sup>2</sup>) of ML inferencing processors.

[1] G. W. Burr, S. Lim, B. Murmann, R. Venkatesan and M. Verhelst, "Fair and comprehensive benchmarking of machine learning processing chips," in *IEEE Design & Test*, doi: 10.1109/MDAT.2021.3063366.

© COPYRIGHT 2022 ISSCC-Do Not Reproduce Without Permission

### Memory – 2022 Trends

#### Subcommittee Chair: Meng-Feng Chang, National Tsing Hua University

The demand for high-density, high-bandwidth, and low-energy memory systems continues to grow everywhere: from high-performance computing to SoC, wearables and IoT.

This year the conference presents a high-bandwidth 16Gb 27Gb/s/pin GDDR6 DRAM and a 192Gb 12-high 896GB/s HBM3 DRAM. Computing-in-memory (CiM) performance and energy efficiency are improved by an SLC-MLC PCM CiM macro that achieves 20.5-65.0TOPS/W. The highest densities per area are reported for 3D-NAND in TLC and QLC NANDs, both adopting a circuit-under-cell-array structure while boosting programming throughput as well.

#### TOP PAPERS FROM ISSCC 2022 INCLUDE:

- A 1ynm 1.25V 8Gb, 16Gb/s/pin GDDR6-based Accelerator-In-Memory supporting 1TFLOPS MAC Operation and Various Activation Functions for Deep Learning
- A 40nm 2M-cell 8b-Precision Hybrid SLC-MLC PCM Computing-in-Memory Macro with 20.5-65.0 TOPS/W for Tiny AI Edge Devices
- A 192Gb 12-High 896GB/s HBM3 DRAM with a TSV Auto-Calibration Scheme and Machine-Learning Based Layout Optimization
- A 16Gb 27Gb/s/pin T-coil based GDDR6 DRAM with Merged-MUX TX, Optimized WCK Operation, and Alternative-Data-Bus
- A 1-Tb 4b/Cell 4-Plane 3D Flash with 162-Layer 68-mm2 Chip Size and 2.4-Gbps IO Speed

#### **COMPUTE IN MEMORY**

Compute-in-memory solutions target next-generation data-center and cloud inference and training applications, with the goal of substantially reducing energy consumption and the total cost of ownership. Edge-device inference and training are growing rapidly, with applications spanning autonomous driving, driver assistance, image recognition, voice recognition and surveillance. Compute-in-memory solutions also aim to increase performance and capacity on edge devices, enabling new applications and improved performance for end users.

#### HIGH-BANDWIDTH AND LOW-POWER DRAM

To keep pace with the ever-increasing performance requirements of various applications, from graphics/mobile to supercomputing, DRAM continues to scale density, form factor, and bandwidth. ISSCC 2022 includes benchmarks for the latest interface standards, such as 896GB/s HBM3 for highest-performance applications (HPC & AI), and a 27Gb/s/pin GDDR6 for graphics applications. Figure 1 shows DRAM bandwidth growth over the last 14 years.



Figure 1 - DRAM data bandwidth growth
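The per-pin and per-device bandwidth figures quoted above are related by the interface width. The sketch below shows the conversion, assuming a 32-bit data interface for a GDDR6 device and a 1024-bit data interface for an HBM3 stack; these widths are assumptions based on the respective interface standards and are not stated in this Press Kit.

```python
def device_bandwidth_gbytes_per_s(gbits_per_s_per_pin: float,
                                  data_pins: int) -> float:
    """Aggregate bandwidth in GB/s from a per-pin data rate in Gb/s."""
    return gbits_per_s_per_pin * data_pins / 8.0


def per_pin_rate_gbits_per_s(gbytes_per_s: float, data_pins: int) -> float:
    """Per-pin data rate in Gb/s from an aggregate bandwidth in GB/s."""
    return gbytes_per_s * 8.0 / data_pins


GDDR6_DATA_PINS = 32    # assumed interface width per device
HBM3_DATA_PINS = 1024   # assumed interface width per stack

print(device_bandwidth_gbytes_per_s(27.0, GDDR6_DATA_PINS))  # 108.0 GB/s per device
print(per_pin_rate_gbits_per_s(896.0, HBM3_DATA_PINS))       # 7.0 Gb/s per pin
```

Under these assumptions, HBM3 reaches its much higher aggregate bandwidth through a very wide interface at a moderate per-pin rate, whereas GDDR6 relies on a narrow interface running at a very high per-pin rate.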

#### NON-VOLATILE MEMORY (NVM)

In the past decade, significant investment has gone into emerging memories to find an alternative to floating-gate-based non-volatile memory. Emerging NVMs, such as phase-change memory (PCM), ferroelectric RAM (FeRAM), spin-transfer-torque magnetic RAM (STT-MRAM), and resistive memory (ReRAM), show the potential to achieve high cycling capability and lower power per bit in read/write operations. However, conventional flash memories continue to improve, confirming them as the mainstream today and into the near future.

This year's papers report improvements in write performance for conventional 3D flash memory TLC (164MB/s) and QLC (60MB/s) and the widespread acceptance of asynchronous page read for read performance.

This year's papers also report improvements in bit density for TLC (11.55Gb/mm<sup>2</sup>) and QLC (15.0Gb/mm<sup>2</sup>). These high densities are achieved through advances in 3-dimensional architectures with more than 220 stacked word lines (WL). Figure 2 shows non-volatile memory capacity trends.



Figure 2 - Non-volatile memory capacity trend.

#### NAND FLASH MEMORY

NAND flash memory continues to advance towards higher density, lower power and higher performance, resulting in low-cost storage solutions in which solid-state disks (SSDs) are replacing traditional magnetic hard-disk storage. 3D memory technology is the mainstream for NAND flash memory in mass production across the semiconductor industry. Periphery-under-the-array is currently the reference architecture for TLC and QLC: it enables higher bit density and multiple planes for throughput improvement.

The state-of-the-art for TLC uses more than 220 stacked WL. This year, emphasis is on a CSL noise-canceling scheme and on IO speed improvement (up to 2.4Gb/s) through a CTLE-based receiver and an internal reference voltage generator. One paper, related to QLC, reports the highest bit density of 15.0Gb/mm<sup>2</sup> with 162 stacked WL and a program throughput of 60MB/s.
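Bit density is simply capacity divided by die area. As a quick sanity check, the sketch below applies this to the 1-Tb, 68-mm2 QLC chip listed among the top papers, assuming 1 Tb = 1024 Gb; pairing those title figures with the quoted density is our reading, and the function name is hypothetical.

```python
def bit_density_gb_per_mm2(capacity_gbit: float, die_area_mm2: float) -> float:
    """Bit density in Gb/mm^2 from die capacity (Gb) and die area (mm^2)."""
    if capacity_gbit <= 0 or die_area_mm2 <= 0:
        raise ValueError("capacity and area must be positive")
    return capacity_gbit / die_area_mm2


# 1 Tb taken as 1024 Gb (assumption), 68 mm^2 from the paper title.
density = bit_density_gb_per_mm2(capacity_gbit=1024, die_area_mm2=68)
print(f"{density:.2f} Gb/mm^2")  # prints "15.06 Gb/mm^2"
```

The result is consistent with the 15.0Gb/mm2 reported for QLC in this section.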

Figure 3 shows the observed trend in NAND Flash capacities at ISSCC over the past 20 years.



Figure 3 - NAND flash memory capacity trend.

## HISTORICAL TRENDS IN TECHNICAL THEMES: INNOVATIVE TOPICS

Imagers/MEMS/Medical/Displays Subcommittee and Technology Directions Subcommittee



### IMMD – 2022 Trends (Medical)

#### Subcommittee Chair: Chris Van Hoof, IMEC, Leuven, Belgium

As illustrated at ISSCC 2022, biomedical systems that interface with the body and the brain in wearable, implantable, and *in vitro* applications continue to evolve toward more robust, functionally complex, and energy-efficient solutions, as well as closed-loop operation. Wearable and implantable SoCs record weak biological signals in the presence of real-life interference, and under stringent power and size constraints. These new SoCs and corresponding wireless power/data transfer techniques pave the way toward robust microdevices that enable innovations in health technologies with long-term measurement to treat on demand.

State-of-the-art biomedical integrated circuits and systems have further advanced this year at ISSCC 2022 with miniaturization, higher sensitivity, higher dynamic range, and interference mitigation as significant trends in implantable and wearable devices, while improving power efficiency with high-input-impedance AFE. High-dynamic-range (>140dB) sensing systems improve tolerance to large-amplitude interference and motion artifacts. At the same time, new techniques are introduced to sense physiological signals such as PPG, IPG, BIOZ, and ExG. Miniaturization combined with a high level of integration and wireless functionality enables wireless neural implants without tissue damage and infection for interfacing with the brain.

Innovations in the on-chip classification of neurological signals that perform inference-based machine learning have the potential to offer improved neural interfaces to treat on-demand and in closed-loop. Incorporating low-energy machine learning at the edge with a high channel count of recording and stimulation will enable emerging therapeutic techniques to restore a healthy condition from chronic pain or other disorders without the side effects of drugs. These advances offer tremendous market potential in the medical market space.

### IMMD – 2022 Trends (Imagers)

#### Subcommittee Chair: Chris Van Hoof, IMEC, Leuven, Belgium

The CMOS image sensor market is mainly driven by consumer applications, with mobile devices and smartphones embedding an increasing number of cameras; manufacturers are investing heavily to reduce image-sensor footprint and incorporate more functionality. Moreover, adoption of multiple cameras for automotive applications and use of medical imaging solutions are steadily increasing. These are key factors contributing to the growth of this segment, which is projected to grow from an estimated USD 18.5 billion in 2020 to USD 28.0 billion by 2025. Beyond conventional photography, depth sensing is gaining strong momentum for user identification, augmented reality, and gaming, and vision sensors are carving out a niche in low-power applications.

The progress in image sensors is more and more intertwined with the specialization of the fabrication process, such as integration of reliable single-photon detectors, full-depth trench isolation, backside illumination, efficient transfer gates, and maximized collection volume. In parallel, novel architectures pushed by technological advances are allowing advanced features to be implemented on-chip and on-pixel with high efficiency.

At ISSCC 2022, the interest and progress in depth and SPAD sensors are well represented. SPAD-based imagers show their evolution and confirm the recently started trend with applications in intensity imaging. A 1 Megapixel colour 3D-stacked BSI SPAD array showing 143dB dynamic range by Canon demonstrates that these sensors are rapidly exploiting all major innovations available for conventional image sensors. Depth sensors using SPAD arrays are integrating smarter and more efficient techniques to cope with harsh ambient light conditions and with high data rates, by performing neighbor's detection validation and in-pixel distance extraction. Also, indirect-ToF is continuously evolving, reaching a record precision of 38µm.

For high-resolution intensity imaging, stacked BSI technology has become an established standard, enabling excellent sensitivity and on-chip processing, while the pixel race has gained new speed: in just one year the smallest pixel pitch reported at ISSCC decreased by 14%, while keeping intact all other key quality parameters. This boost in shrinking is witnessed by two Samsung works on a 64Mpix image sensor exhibiting a world-first 0.56µm pitch, and a 50Mpix image sensor with 1µm-sized dual pixels with full-depth isolation trenches for all-direction autofocus.

Novel approaches to vision sensors are aiming at always-on sensors with extremely low power consumption. This is accomplished through several techniques that include charge-domain extensive pixel binning for multiple resolution operation and time-based pixel processing with enhanced dynamic range.

### **Technology Directions – 2022 Trends**

#### Subcommittee Chair: Makoto Nagata, Kobe University, Kobe, Japan

Technology innovations bring the promise of enabling new system functionalities or substantially increasing the efficiency of existing ones. Harnessing such innovations for solving tangible real-world problems requires novel system-level solutions. With a focus on envisioning the future, emerging trends in Technology Directions this year at ISSCC 2022 cover a wide range of topics including quantum engineering, low-power secure circuits for IoT, and biomedical sensing, stimulation, and harvesting. ISSCC 2022 features two sessions representing the latest technological innovations in the following areas:

**Quantum engineering:** By exploiting quantum phenomena, such as superposition and entanglement, quantum technologies promise disruptive improvements in several fields, including computing, communication, and sensing. In today's quantum systems, the quantum devices must often be operated at cryogenic temperatures but are mostly controlled by a room-temperature electronic interface. While a few wires can bridge such a temperature gap for today's small-scale systems, full scalability can only be ensured by adopting a *cryogenic* control interface, enabling compact and reliable implementations. At ISSCC 2022, the trend continues in integrating larger cryogenic SoCs in advanced nanometer CMOS processes for the control of multiple qubits, with two papers reporting a qubit driver in 14nm FinFET and a combined driver/receiver in 40nm CMOS, respectively, both for superconducting qubits. In addition, researchers are designing more circuit building blocks aimed at cryogenic operation, and are exploring other technologies and comparing them to cryo-CMOS, as in the SiGe BiCMOS VCO presented this year.

**Low-Power Circuits for IoT:** The papers from ISSCC 2022 in this area push the frontiers of low-power communications and AI for applications in the Internet of Things. One paper presents the design of a low-power WiFi/Bluetooth combo backscatter chip that enables longer-distance operation via a beam-scattering approach. Three AI-focused papers either shift processing to the time domain, demonstrating voice activity detection or keyword spotting using time-domain convolutional neural networks and time-domain feature extraction, or move AI processing from synchronous digital logic to asynchronous event-driven circuitry with mixed-signal compute-in-memory techniques.

**Biomedical Devices, Circuits, and Systems:** ISSCC 2022 includes innovative and emerging biomedical systems that traverse device, circuits, and system-level design. This year, the developing trends encompass advancements in implantable electrocorticography recording, multi-electrode arrays for cell/tissue interfacing, bio-molecular sensing and bio-energy harvesting. The demonstrated technologies hold the promise to advance the fields of diagnosis of neurological disorders, prosthetic devices, single-cell resolution *in vitro* analytics and diagnosis, label-free single-molecule sensors for diagnostics, drug discovery, DNA sequencing and proteomics, and soil-pH monitoring.





A paper number mentioned in this section follows the convention S.P, where S is the session number and P is the paper number. For example 23.2 will be the second paper in the twenty-third session. You can refer back to the TECHNICAL SESSION OVERVIEWS in this Press Kit for additional details on any given paper. Some of the papers will also be available in the "not-so-technical" SESSION HIGHLIGHTS part of this Press Kit. All sessions and papers are in ascending order in both the Session Overviews and the Session Highlights sections of the Press Kit.
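The S.P convention can be handled mechanically, for example when sorting or filtering the paper lists in the tables that follow. The sketch below is one possible way to parse such identifiers; the type and function names are hypothetical and not part of the Press Kit.

```python
from typing import List, NamedTuple


class PaperNumber(NamedTuple):
    session: int  # S: session number
    paper: int    # P: paper number within the session


def parse_paper_number(text: str) -> PaperNumber:
    """Parse an identifier such as '23.2' into (session=23, paper=2)."""
    session_str, sep, paper_str = text.strip().partition(".")
    if not sep or not session_str.isdigit() or not paper_str.isdigit():
        raise ValueError(f"not a valid S.P paper number: {text!r}")
    return PaperNumber(int(session_str), int(paper_str))


def parse_paper_list(cell: str) -> List[PaperNumber]:
    """Parse a comma-separated table cell and return it in ascending order."""
    return sorted(parse_paper_number(item)
                  for item in cell.split(",") if item.strip())


print(parse_paper_number("23.2"))
print(parse_paper_list("34.3, 8.3, 17.3, 10.5"))
```

Sorting on the parsed pair rather than on the raw string keeps session 8 ahead of session 10, matching the ascending order used in the Session Overviews and Session Highlights.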

#### **Technical Topics Mapped to Papers**

| Technical Topic                                                                                                                          | All papers in the following Sessions |
|------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------|
| <b>Communication Systems</b>                                                                                                             | 4, 6, 8, 9, 17, 19, 23, 24, 27       |
| <b>Analog Systems</b><br>includes Analog, Power Management and Data Converter Subcommittees                                              | 3, 10, 14, 18, 25, 30, 31            |
| <b>Digital Systems</b><br>includes Memory, Digital Circuits, Machine Learning and AI, Digital Architectures and Systems Subcommittees    | 2, 7, 11, 13, 15, 16, 28, 29, 33, 34 |
| <b>Innovative Topics</b><br>includes Imagers/MEMS/Medical Devices/Displays and Technology Directions Subcommittees                       | 5, 12, 20, 22, 32                    |

#### Selected Presenting Companies/Institution Mapped to Papers

#### Chart 4.1

| Affiliation                                                       | Paper Numbers          |
|-------------------------------------------------------------------|------------------------|
| Academia Sinica                                                   | 7.5                    |
| Advanced Industrial Science and Technology (AIST)                 | 3.7                    |
| Advanced Institute of Information Technology of Peking University | 11.5, 22.7             |
| Aichi Steel                                                       | 3.7                    |
| Alibaba DAMO Academy                                              | 15.3, 29.1             |
| AMD                                                               | 2.7, 26.4              |
| Analog Devices                                                    | 8.3, 10.5, 17.3, 34.3  |
| Auburn University                                                 | 10.4                   |
| Bosch                                                             | 8.3                    |
| Broadcom                                                          | 6.4, 23.5              |
| California Institute of Technology                                | 17.2                   |
| Canon                                                             | 5.1                    |
| CEA-Léti                                                          | 12.7                   |
| Center for Neuroprosthetics                                       | 20.4                   |
| Chan Zuckerberg Biohub                                            | 20.7                   |
| City University of Hong Kong                                      | 7.5                    |
| Columbia University                                               | 15.7, 16.1, 17.6, 23.4 |

| Cornell University                                  | 20.4                                        |  |  |
|-----------------------------------------------------|---------------------------------------------|--|--|
| Daegu Gyeongbuk Institute of Science and Technology | 20.3                                        |  |  |
| Dartmouth College                                   | 18.4                                        |  |  |
|                                                     | 3.5, 3.6, 4.4, 9.2, 23.3, 24.2, 25.4, 31.2, |  |  |
| Delft University of Technology                      | 31.4, 32.2, 32.3                            |  |  |
| Digipen Institute of Technology                     | 32.1                                        |  |  |
| Duke University                                     | 11.7                                        |  |  |
| Ecole Polytechnique Fédérale de Lausanne            | 22.3                                        |  |  |
| Eindhoven University of Technology                  | 24.2, 25.4                                  |  |  |
| EPFL                                                | 20.4                                        |  |  |
| Erasmus MC                                          | 32.2, 32.3                                  |  |  |
| ETH Zurich                                          | 19.4, 19.6                                  |  |  |
| ETH Zürich                                          | 3.4, 12.5                                   |  |  |
| Meta Reality Labs                                   | 26.3                                        |  |  |
| Fondazione Bruno Kessler                            | 5.2                                         |  |  |
| Fudan University                                    | 13.3, 15.3, 20.1                            |  |  |
| Fujikura                                            | 27.3                                        |  |  |
| Fujitsu                                             | 21.2                                        |  |  |
| gBrain                                              | 20.5                                        |  |  |
| Georgia Institute of Technology                     | 12.5, 16.2, 16.3, 19.4, 19.6, 25.6          |  |  |
| Goodix Technology                                   | 31.2, 31.4                                  |  |  |
| Google Quantum Al                                   | 26.1                                        |  |  |
| Hamad Bin Khalifa University                        | 3.8                                         |  |  |
| Hanbat National University                          | 24.7                                        |  |  |
| Hanvang University                                  | 20.8                                        |  |  |
| Harvard University                                  | 12.2. 12.5                                  |  |  |
| Hong Kong University of Science and Technology      | 14.6                                        |  |  |
| IBM                                                 | 2.4. 13.5. 26.2                             |  |  |
| IBM Almaden Research Center                         | 22.1                                        |  |  |
| IBM Research                                        | 2.3                                         |  |  |
| IBM Systems                                         | 22.1                                        |  |  |
| IBM Systems and Technology                          | 2.3                                         |  |  |
| IBM T. J. Watson Research Center                    | 22.1, 27.3                                  |  |  |
| imec                                                | 12.4, 15.6, 16.4, 24.2                      |  |  |
| imec-Netherlands                                    | 24.2                                        |  |  |
| Incheon National University                         | 20.5, 28.5                                  |  |  |
| Indian Institute of Science                         | 34.3                                        |  |  |
| Industrial Technology Research Institute            | 11.8                                        |  |  |
| Infineon Technologies                               | 23.6                                        |  |  |
| Intel                                               | 2.1, 2.2, 4.2, 6.1, 6.6, 8.2, 9.2, 12.5, 16.1, 18.5, 19.5, 21.3, 30.4, 32.5, 34.4 |  |  |
| Intel Corporation                                   | 4.5                                         |  |  |
| Iowa State University                               | 14.8                                        |  |  |
| Jožef Stefan Institute                              | 5.2                                         |  |  |
| KAIST                                               | 5.9, 6.7, 13.2, 15.7, 18.1, 20.3, 22.6, 23.2, 24.7, 28.8, 30.2, 33.4 |  |  |
| Kangwon National University                         | 20.5                                        |  |  |
| KIOXIA                                              | 7.1                                         |  |  |
| Korea Aerospace Research Institute                  | 28.8                                        |  |  |
| Korea Institute of Science and Technology           | 20.6                                        |  |  |
| Korea University                                    | 28.5                                        |  |  |
| KU Leuven                                           | 12.4, 15.6                                  |  |  |
| KU Leuven ESAT                                      | 16.4                                        |  |  |
| L3 Harris                                    | 4.1                                            |  |
| Leibniz University Hannover                  | 14.2                                           |  |
| LintrinsIC Semiconductors                    | 19.7                                           |  |
| Macronix                                     | 7.5                                            |  |
| Marvell                                      | 6.2, 6.5                                       |  |
| Massachusetts Institute of Technology        | 4.3, 4.5, 19.1, 34.3                           |  |
| MaxLinear                                    | 17.7                                           |  |
| MediaTek                                     | 2.5, 31.1                                      |  |
| Micro Innovation Integrated Circuit Design   | 34.1                                           |  |
| Micron Technology                            | 7.2                                            |  |
| MIRISE Technologies                          | 3.5                                            |  |
| MIT Lincoln Laboratory                       | 4.3                                            |  |
| Mythic                                       | 15.8                                           |  |
| Nano Core Chip Electronic Technology         | 11.5, 22.7                                     |  |
| Nanyang Technological University             | 8.1, 16.5, 27.4                                |  |
| National Cheng Kung University               | 10.2, 31.3                                     |  |
| National Taiwan University                   | 7.5, 12.3, 33.2                                |  |
| National Tsing Hua University                | 7.5, 11.2, 11.3, 11.4, 11.8, 15.9, 33.3        |  |
| National University of Singapore             | 32.1, 34.2                                     |  |
| National Yang Ming Chiao Tung University     | 12.3, 14.1                                     |  |
| NeoNexus                                     | 11.7                                           |  |
| Neuroelectronics Research Flanders           | 12.4                                           |  |
| Newradio Technology                          | 24.3                                           |  |
| Nokia Bell Labs                              | 4.1                                            |  |
| Northwestern University                      | 15.2                                           |  |
| now with IIT Delhi                           | 4.2                                            |  |
| now with Nebula Microsystems                 | 4.2                                            |  |
| NXP Semiconductors                           | 13.7, 25.4, 25.5                               |  |
| Oklahoma State University                    | 4.6                                            |  |
| Oregon State University                      | 25.3                                           |  |
| Peking University                            | 6.3, 10.3, 10.6, 10.7, 11.5, 11.7, 17.1, 22.7, 23.2 |  |
|                                              | 24.7                                           |  |
| Pohang University of Science and Technology  |                                                |  |
| Point2 Technology                            | 5.4, 20.2, 22.2, 20.4, 20.7                    |  |
| Politecnico di Milano                        | 23.6                                           |  |
| Pragmatic Semiconductor Ltd                  | 16.4                                           |  |
|                                              | 16.6                                           |  |
| Qualcomm                                     | 12.5, 13.2                                     |  |
| QuTech                                       | 9.2                                            |  |
| Realtek Semiconductor                        | 14.1                                           |  |
| Renesas Electronics                          | 24.4                                           |  |
| Roswell Biotechnology                        | 12.6                                           |  |
| SambaNova Systems                            | 21.0                                           |  |
| Samsung Advanced Institute of Technology     | 12.2, 15.1                                     |  |
| Samsung Display                              | 5.9                                            |  |
| Samsung Electronics                          | 3.1, 3.2, 5.5, 5.8, 7.4, 13.1, 13.6, 14.5, 15.1, 18.1, 27.1, 27.5, 28.2, 28.3, 28.7, 28.8, 30.3, 33.1 |  |
| Samsung Semiconductor                        | 13.4, 27.1                                     |  |
| Seoul National University                    | 23.4, 28.6                                     |  |
| Sharif University of Technology                          | 3.4                                            |  |  |
| Shizuoka University                                      | 5.4                                            |  |  |
| SK hynix                                                 | 7.3, 11.1, 28.1, 28.8                          |  |  |
| SolidVue                                                 | 5.3                                            |  |  |
| Sony Electronics                                         | 5.6                                            |  |  |
| Sony Europe                                              | 23.3                                           |  |  |
| Sony LSI Design                                          | 5.6                                            |  |  |
| Sony Semiconductor Solutions                             | 5.6, 23.3                                      |  |  |
| SOSO H&C                                                 | 20.8                                           |  |  |
| South China University of Technology                     | 4.3, 19.1                                      |  |  |
| Southern University of Science and Technology            | 22.3, 32.1, 34.2                               |  |  |
| Stanford University                                      | 32.4                                           |  |  |
| Sun Yat-Sen University                                   | 24.5                                           |  |  |
| Sungkyunkwan University                                  | 5.3                                            |  |  |
| Suzhou Novosense Microelectronics                        | 14.7                                           |  |  |
|                                                          | 12 /                                           |  |  |
| Tenstorrent                                              | 21 /                                           |  |  |
|                                                          | 6.1                                            |  |  |
| The N.1 Institute for Health                             | 22.1                                           |  |  |
|                                                          | 10.2                                           |  |  |
| Tianjin University                                       |                                                |  |  |
|                                                          | 4.8, 15.4, 27.2                                |  |  |
| Tsinghua University                                      | 4.3, 9.3, 10.3, 10.7, 15.5, 19.1, 19.2, 29.2, 29.3, 34.1 |  |  |
|                                                          | 11.2, 11.4, 11.6, 23.1                         |  |  |
|                                                          | 11.3, 16.2, 16.3                               |  |  |
| TSMC Design Technology                                   | 16.2, 16.3                                     |  |  |
| Ulsan National Institute of Science and Technology       | 5.3, 20.6, 20.8                                |  |  |
|                                                          | 29.1                                           |  |  |
| United Semiconductor Japan                               | 12.1, 24.1                                     |  |  |
| University of Texas                                      | 14.4, 18.7                                     |  |  |
| University College Dublin                                | 23.1, 23.3                                     |  |  |
| University of British Columbia                           | 17.7                                           |  |  |
| University of California                                 | 2.6, 4.7, 9.4, 12.6, 15.5, 16.5, 17.4, 17.5, 19.7, 20.7, 22.4, 23.4, 23.5, 25.1, 29.3, 32.1 |  |  |
| University of Electronic Science and Technology of China | 19.3, 23.7                                     |  |  |
| University of Lisboa                                     | 18.3, 18.8, 22.5, 24.5                         |  |  |
| University of Macau                                      | 3.8, 14.6, 18.3, 18.8, 22.5, 24.5              |  |  |
| University of Massachusetts                              | 26.1                                           |  |  |
| University of Michigan                                   | 12.1, 24.1, 24.6, 32.5                         |  |  |
| University of Padova                                     | 23.6                                           |  |  |
| University of Pavia                                      | 9.1                                            |  |  |
| University of Pennsylvania                               | 12.1                                           |  |  |
| University of Science and Technology of China            | 14.7, 18.2                                     |  |  |
| University of Southampton                                | 17.2                                           |  |  |
| University of Southern California                        | 10.1                                           |  |  |
| University of Technology Sydney                          | 4.3                                            |  |  |
| University of Texas                                      | 4.6, 14.3                                      |  |  |
| University of Twente                                     | 3.3                                            |  |  |
| University of Virginia                                   | 13.8, 30.1                                     |  |  |
| University of Washington                                 | 13.3, 13.7, 16.7                               |  |  |
| University of Zurich and ETH Zurich                      | 22.6, 29.4                                     |  |  |
| Vishay                               | 8.3              |  |
| Vlaams Instituut voor Biotechnologie | 12.4             |  |
| Western Digital                      | 7.1              |  |
| Xilinx                               | 8.3              |  |
| Yonsei University                    | 5.7, 20.5        |  |
| Zhejiang University                  | 10.6, 18.6, 25.2 |  |

# **CONTACT INFORMATION**



| ANALOG                              |                                     | MEMORY               |                               |
|-------------------------------------|-------------------------------------|----------------------|-------------------------------|
| Subcommittee Chair:                 | Maurits Ortmanns                    | Subcommittee Chair:  | Meng-Fan Chang                |
|                                     | University of Ulm                   |                      | National Tsing Hua University |
| Work Phone:                         | +49-731-50-26200                    | Work Phone:          | +886-3-516-2181               |
| Email:                              | maurits.ortmanns@uni-ulm.de         | Email:               | mfchang@ee.nthu.edu.tw        |
| Press Designates:                   | NA/EU: Drew Hall                    | Press Designates:    | NA/EU: Violante Moschiano     |
|                                     | FE: Man-Kay Law                     |                      | FE: Hye-Ran Kim               |
|                                     |                                     |                      |                               |
| DATA CONVERTERS                     |                                     | POWER MANAGEMENT     |                               |
| Subcommittee Chair:                 | Michael Flynn                       | Subcommittee Chair:  | Yogesh Ramadass               |
|                                     | University of Michigan at Ann Arbor |                      | Texas Instruments             |
| Work Phone:                         | 734-936-2966                        | Work Phone:          | 669-721-6737                  |
| Email:                              | mpflynn@umich.edu                   | Email:               | yogesh.ramadass@ti.com        |
| Press Designates:                   | NA/EU: Jan Westra                   | Press Designates:    | NA/EU: Johan Janssens         |
|                                     | FE: Chih-Cheng Hsieh                |                      | FE: Chan-Hong Chern           |
|                                     |                                     |                      |                               |
| DIGITAL ARCHITECTURES & SYSTEMS     |                                     | RF                   |                               |
| Subcommittee Chair:                 | Thomas Burd                         | Subcommittee Chair:  | Jan Craninckx                 |
|                                     | AMD                                 |                      | imec                          |
| Work Phone:                         | 408-749-2805                        | Work Phone:          | +32-16-28-87-56               |
| Email:                              | tom.burd@amd.com                    | Email:               | jan.craninckx@imec.be         |
| Press Designates:                   | NA/EU: Hugh Mair                    | Press Designates:    | NA/EU: Jim Buckwalter         |
|                                     | FE: Sugako Otani                    |                      | FE: Wei Deng                  |
|                                     |                                     |                      |                               |
| DIGITAL CIRCUITS                    |                                     | TECHNOLOGY DIRECTIONS |                              |
| Subcommittee Chair:                 | Keith Bowman                        | Subcommittee Chair:  | Makoto Nagata                 |
|                                     | Qualcomm                            |                      | Kobe University               |
| Work Phone:                         | 919-237-4750                        | Work Phone:          | +81-78-803-6569               |
| Email:                              | kbowman@qti.qualcomm.com            | Email:               | nagata@cs.kobe-u.ac.jp        |
| Press Designates:                   | NA: Alicia Klinefelter              | Press Designates:    | NA/EU: Denis Daly             |
|                                     | EU: Yvain Thonnart                  |                      | FE: Radu Berdan               |
|                                     | FE: Eric Fang                       |                      |                               |
|                                     |                                     |                      |                               |
| IMAGERS, MEMS, MEDICAL & DISPLAYS   |                                     | WIRELESS             |                               |
| Subcommittee Chair:                 | Chris Van Hoof                      | Subcommittee Chair:  | Stefano Pellerano             |
|                                     | imec                                |                      | Intel                         |
| Work Phone:                         | +32-16-281815                       | Work Phone:          | 503-712-4576                  |
| Email:                              | chris.vanhoof@imec.be               | Email:               | stefano.pellerano@intel.com   |
| Press Designates:                   | NA/EU: Johan Vanderhaegan           | Press Designates:    | NA/EU: David Wentzloff        |
|                                     | FE: Joonsung Bae                    |                      | FE: Yuan-Hung Chung           |
|                                     |                                     |                      |                               |
| MACHINE LEARNING                    |                                     | WIRELINE             |                               |
| Subcommittee Chair:                 | Marian Verhelst                     | Subcommittee Chair:  | Yohan Frans                   |
|                                     | KU Leuven                           |                      | Xilinx                        |
| Work Phone:                         | +32-16-328617                       | Work Phone:          | 408-879-2707                  |
| Email:                              | marian.verhelst@kuleuven.be         | Email:               | yohanf@xilinx.com             |
| Press Designates:                   | NA/EU: Geoff Burr                   | Press Designates:    | NA/EU: Mike Shuo-Wei Chen     |
|                                     | FE: Kea-Tiong Tang                  |                      | FE: Takashi Takemoto          |

© COPYRIGHT 2022 ISSCC-DO NOT REPRODUCE WITHOUT PERMISSION

#### Program Chair, ISSCC 2022

Edith Beigné, Meta

Work Phone: 650-709-8127

Email: edith.beigne@gmail.com

#### Program Vice-Chair, ISSCC 2022

Piet Wambacq, imec

Work Phone: +32-16-281-218

Email: wambacq@imec.be

#### **Press Coordinator**

Shahriar Mirabbasi, University of British Columbia

Email: shahriar@ece.ubc.ca

#### **Press-Relations Liaison**

Kenneth C. Smith, University of Toronto

Work Phone: 416-418-3034

Email: lcfujino@aol.com



Editor-in-Chief: Shahriar Mirabbasi

Editor-at-Large: Kenneth C. (KC) Smith

Publisher: Laura Chizuko Fujino

