Wednesday, July 22, 2020

DDR5 Memory Spec Meets Extreme Speeds

JEDEC, a semiconductor engineering organization that sets standards for dynamic random-access memory (DRAM), published the finalized JESD79-5 DDR5 specification last week. The new type of memory doubles the per-pin data transfer rate of its predecessor, quadruples the maximum capacity of memory devices, lowers the operating voltage, and introduces several methods to improve DRAM reliability at advanced nodes, all while reducing power consumption. JEDEC says that DDR5 SDRAM will be used for upcoming client PCs and servers starting next year.

The evolution of DRAM typically follows the same direction as the development of CPUs. Nowadays, microprocessors strive for higher core counts, improved per-core performance, and greater power efficiency, which sets requirements for system memory. Advanced cores need sufficient memory bandwidth, while large numbers of such cores require high DRAM capacity (and high bandwidth), but there are limitations when it comes to the capabilities of memory controllers, signal integrity, and power draw. To that end, DDR5 addresses the requirements of client and server systems for years to come by evolving DRAM in multiple directions at the same time.

Source: Micron

“With several new performance, reliability and power saving modes implemented in its design, DDR5 is ready to support and enable next-generation technologies,” said Desi Rhoden, chairman of the JC-42 memory committee at JEDEC.

DDR5 brings a vast number of developments, and given all the options for scalability and improvement that the new standard offers, it is possible that the new type of memory will live longer than its predecessors, owing to the industry's generally slower technology transitions, which is a good thing for CPU and SoC designers. The formal publication indicates that the development is complete, all parameters are set, and no new features are to be added.

“High-performance computing requires memory that can keep pace with the ever-increasing demands of today’s processors,” said Joe Macri, AMD Compute and Graphics CTO. “With the publication of the DDR5 standard, AMD can better design its products to meet the future demands of our customers and end users.”

Capacity Improvements
Although client and server machines are very different when it comes to components these days, there is one thing they have in common: they are starved for high-density DRAMs. Modern servers have a growing need for memory capacity as the number of cores inside server CPUs increases, but it is physically impossible to install more than two modules per memory channel today without using complex memory subsystems with buffers (e.g., LRDIMMs or other buffered solutions). DRAM requirements of PCs grow much more slowly, but because we want sleeker notebooks, PC makers use more expensive LPDDR memory. Coincidentally, the answer for both servers and clients is the same: higher-capacity, general-purpose DRAM devices.

“DDR5 is primarily about density, making it particularly well-suited for enterprise, cloud and big data applications,” said Rishi Chugh, vice president of product marketing at Cadence’s IP group.

Source: IMEC

DDR5 will start with 8 Gb and 16 Gb DRAMs, exactly where DDR4 is today. But the standard allows manufacturers to design devices featuring a 24 Gb, a 32 Gb, or a 64 Gb capacity. While DDR4 increased maximum per-device capacity from 8 Gb to 16 Gb, DDR5 quadruples it from 16 Gb all the way to 64 Gb. Furthermore, DDR5 allows stacking of up to 16 memory devices (up from four in the case of DDR4). There is a limitation, though: the standard enables stacking of up to 16 DRAMs featuring a 16 Gb, 24 Gb, or 32 Gb capacity, but only up to eight 64 Gb devices.

“There is a limitation with the upper address bit, whether it is a chip ID or upper address bit, so with 64 Gb devices we can only support 8-Hi stacking,” said a spokesperson for JEDEC.

8-Hi stacks of 64 Gb devices as well as 16-Hi stacks of 32 Gb devices will enable makers to produce 2 TB server-grade modules several years down the road, so a server CPU supporting eight memory channels and two modules per channel could be equipped with 32 TB of DDR5 memory (up from 4 TB today).
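
As a sanity check on those numbers, here is a back-of-the-envelope sketch in Python. The module geometry is an assumption (a two-rank module built from x4 devices, counting only the 64 data bits per module and excluding the ECC devices), not something the spec mandates:

```python
# Rough DDR5 capacity math. The module geometry below (2 ranks, x4 devices,
# 64 data bits per module, ECC devices excluded) is an assumption.
def module_capacity_gb(die_gb, stack_height, ranks=2, data_bits=64, dev_width=4):
    data_device_sites = ranks * data_bits // dev_width     # 32 sites carry data
    return data_device_sites * stack_height * die_gb / 8   # gigabits -> GB

print(module_capacity_gb(64, 8))    # 8-Hi stacks of 64 Gb dies  -> 2048.0 GB
print(module_capacity_gb(32, 16))   # 16-Hi stacks of 32 Gb dies -> 2048.0 GB

# A CPU with eight channels and two modules per channel:
print(8 * 2 * module_capacity_gb(64, 8) / 1024, "TB")      # -> 32.0 TB
```

Either stack option lands on the same 2 TB per module, which is why the spec's 8-Hi cap on 64 Gb devices does not limit module capacity.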

“Because DDR5 actually embeds rank information (Chip ID) in the Command/Address instead of using separate signals as did previous generations, the DDR5 implementations can actually support up to 32 ranks (devices) of memory because we can have 16 high stacks of 3D memory,” the spokesperson explained. “Modules are inherently 1 or 2 rank and then each DRAM location on the module can support 16-Hi stack, so the net is 32.”

Tightly-packed 16-Hi DDR5 stacks will certainly not be available in the first wave of DDR5 chips because of manufacturing challenges.

“We will not likely see 16-Hi stacks in the industry with the first roll out of DDR5 because of the additional manufacturing work necessary,” the spokesperson said. “In fact, 8-Hi stacks will likely come quite late. 2-Hi and 4-Hi stacks of memory will be fairly common, so that means 2, 4 and 8 rank modules will be pretty common from the start. We will not really refer to them as ranks anymore, but rather refer to the fact that we are using 3D stacked memory.”

Source: SK Hynix

To increase per-device capacity to 64 Gb, DDR5 developers had to expand the number of banks accordingly. To that end, DDR5 doubles the number of bank groups from four to eight while leaving the number of banks per bank group the same (four banks), taking the total from 16 banks to 32. This allows more pages to be opened at any given time to improve the page hit ratio, which not only offsets the penalties of higher capacities but can actually improve system performance, as Micron measured using its 16 Gb DDR5 devices.
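
The benefit is easiest to see with a toy address decoder: consecutive cache lines land in different bank groups, so back-to-back accesses can use the shorter cross-group timing (tCCD_S) instead of the longer same-group timing (tCCD_L). A minimal sketch, with illustrative bit positions rather than the actual JEDEC address map:

```python
# Toy DDR5 address-to-bank mapping (bit positions are illustrative).
BANK_GROUPS, BANKS_PER_GROUP = 8, 4          # 32 banks total on a x8 device

def bank_of(address):
    bg = (address >> 6) % BANK_GROUPS        # low bits just above the line offset
    bank = (address >> 9) % BANKS_PER_GROUP
    return bg, bank

# Four consecutive 64-byte lines hit four different bank groups:
for line in range(4):
    print(hex(line * 64), bank_of(line * 64))
```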

Performance Improvements: Speed and Efficiency
Traditionally, every new DRAM standard increases I/O speed to improve memory bandwidth, and DDR5 is no exception. The new memory will start at 4.8 Gbps at 1.1 Volts and will scale to at least 6.4 Gbps at the same voltage. DDR5 implements numerous methods to enable long-term I/O scalability, including a decision feedback equalizer (DFE) that counters reflective noise at high frequencies, new and improved training modes, and on-die termination, just to name a few. In fact, Micron and SK Hynix have already indicated plans to increase I/O speed to 8.4 Gbps at 1.1 Volts over time, a testament to DDR5’s ability to scale well beyond its official limit.

“DDR5, developed with significant effort across the industry, marks a great leap forward in memory capability, for the first time delivering a 50% bandwidth jump at the onset of a new technology to meet the demands of AI and high-performance compute,” said Carolyn Duran, vice president of data platforms group and general manager of memory and IO technologies at Intel.
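
Those headline figures are easy to check with simple arithmetic. A minimal sketch, assuming a DDR4-3200 baseline (which is what makes the 50% jump work out) and the two 32-bit data channels per DDR5 module described later in this article:

```python
# Peak theoretical bandwidth per module: data rate x channel width x channels.
def peak_gb_per_s(data_rate_gtps, channel_bits, channels):
    return data_rate_gtps * channel_bits / 8 * channels

print(peak_gb_per_s(3.2, 64, 1))   # DDR4-3200: 25.6 GB/s
print(peak_gb_per_s(4.8, 32, 2))   # DDR5-4800: 38.4 GB/s (+50% vs. DDR4-3200)
print(peak_gb_per_s(6.4, 32, 2))   # DDR5-6400: 51.2 GB/s
print(peak_gb_per_s(8.4, 32, 2))   # DDR5-8400: 67.2 GB/s (vendors' stretch goal)
```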

Meanwhile, although high I/O speeds increase peak theoretical bandwidth, real-world bandwidth does not always scale linearly with clocks, so engineers have to introduce ways to increase real-world performance. The developers of DDR5 have incorporated numerous methods to boost the efficiency of the new memory type, and they expect DDR5 to be up to 28%–36% more efficient than DDR4 even at the same I/O speeds, depending on the number of banks and bank groups.

Source: Micron

On the architectural level, DDR5 extends the burst length from 8 to 16 transfers (BL16), which not only doubles the amount of data fetched from each device per operation, but also improves the efficiency of memory reads for systems with a 64-byte cache line: a single BL16 burst on a 32-bit channel delivers exactly 64 bytes. Furthermore, for systems that use a 128-byte cache line, there is an option to extend the burst length to 32 transfers (BL32). In addition, because DDR5 memory devices feature eight bank groups with four banks per group (32 banks in total), actual read/write efficiency increases, since interleaved accesses that hop from bank group to bank group can be issued with shorter timing than back-to-back accesses to banks within a single bank group.
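
The cache-line math works out as follows (a small sketch; the channel widths come from the module-level changes described below):

```python
# Bytes delivered per burst = (channel width / 8) x burst length.
def burst_bytes(channel_bits, burst_length):
    return channel_bits // 8 * burst_length

print(burst_bytes(64, 8))    # DDR4: 64-bit channel, BL8  ->  64 B cache line
print(burst_bytes(32, 16))   # DDR5: 32-bit channel, BL16 ->  64 B cache line
print(burst_bytes(32, 32))   # DDR5 BL32 option           -> 128 B cache line
```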

DDR5 also brings a “same-bank refresh” (REFsb) command in addition to the standard “all-bank refresh” (REFab) command. REFsb can be issued only in the fine granularity refresh mode, in which each bank must receive a refresh command every 1.95 μs on average, but in this mode REFsb cuts the refresh impact on average idle latency from 11.2 ns to 5.0 ns. According to Micron, simulations show that use of REFsb can improve system performance by 6% to 9%, depending on the read/write command ratio.
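
Micron's latency figures are consistent with a simple expected-value model: a refresh lasting tRFC arrives every tREFI, and a request that lands inside the refresh window waits half of tRFC on average. The sketch below is that model with assumed timings (tRFC of roughly 295 ns for an all-bank refresh at a 3.9 μs normal-mode tREFI, and roughly 140 ns for a same-bank refresh at the 1.95 μs fine-granularity interval mentioned above) chosen to reproduce the cited numbers; it is an approximation, not anything from the spec:

```python
# Expected refresh penalty on an idle random access (a simplified model).
def avg_refresh_penalty_ns(t_rfc_ns, t_refi_ns):
    return (t_rfc_ns / t_refi_ns) * (t_rfc_ns / 2)

print(round(avg_refresh_penalty_ns(295, 3900), 1))  # all-bank (REFab): ~11.2 ns
print(round(avg_refresh_penalty_ns(140, 1950), 1))  # same-bank (REFsb): ~5.0 ns
```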

Source: Micron

On the module level, DDR5 moves to two fully independent 32/40-bit (non-ECC/ECC) I/O channels, versus the single 64/72-bit I/O channel of a DDR4 module, which promises to improve overall channel utilization, according to JEDEC. Each channel has its own 7-bit Command/Address (CA) bus that operates in DDR mode (which compares to the single 24-bit SDR bus used by DDR4) and supports on-die termination to ensure clean signals and stability at high clocks. The independent 32-bit channels also come in handy with the increased burst length, as they let a single burst map neatly onto a 64-byte cache line.

Source: Micron

“We have done a lot to increase the performance,” said Rhoden. “But the channel in DDR5 is substantially improved over previous generations, and it’s allowing us to get even more performance out of the channel itself as we scale.”

JEDEC admits that while narrower independent DRAM channels will simplify memory controllers on a per-channel basis, given all the peculiarities of the new memory standard, DDR5 memory controllers will end up more complex than DDR4 controllers. Yet they are not expected to become prohibitively complex or expensive to implement.

“The training modes that are required to enable the higher speed operation […] add some additional complexity, but we also removed a lot of complexity as well,” said Frank Ross, a lead architect at Micron. “Getting rid of the CKE and adding non-target ODT as part of the command, making a combined, encoded command/address bus; with the longer burst length of 16, there is a lot more availability on the CA bus. You know, there are tradeoffs. We add some complexity to make it all work. But, of course, we want to make it work efficiently as well. So just having the additional bank groups and things like that adds a lot of capabilities for the controller, but only incremental complexity. So, I don’t think we see hugely more complex controllers.”

Because most SoC designers these days tend to use silicon-proven DDR IP and PHY, it is unlikely that the increased complexity of DRAM controllers will affect the time-to-market of DDR5-based solutions. Cadence and Synopsys will be among the first to offer silicon-proven DDR5 controller IP and PHY.

“Synopsys’ silicon-proven DesignWare DDR5 IP with differentiated features such as firmware-based training is validated using Micron’s DDR5 DRAM, offering a low-risk solution,” said John Koeter, senior vice president of marketing and strategy for IP at Synopsys. “Designers can confidently integrate the DDR controller, PHY, and verification IP solution into their high-performance computing SoCs to meet their data rate, bandwidth efficiency, capacity, and RAS requirements.”

According to Cadence, it has 15 customers that have already adopted its DDR5 IP.

“Cadence has been collaborating with Micron on DDR5 DRAM for more than two years, enabling Micron DDR5 DRAM market adoption with over 15 licensed customers for Cadence’s DDR5 IP,” said Chugh.

RAS Improvements: On-Die ECC, System ECC, ECS, and Enhanced PPR
Increasing frequencies, I/O speeds, and capacity all open the door to a long-term evolution across many production nodes in the years to come. For example, Samsung plans to produce advanced DRAMs using its D1a process technology, which uses extreme ultraviolet (EUV) lithography to improve per-wafer productivity and potentially cut costs. As DDR5 DRAM cells get smaller and faster over time, they become more prone to errors, which lowers production yields and long-term reliability. Meanwhile, system designers need to ensure that those high-capacity DRAMs work correctly and, if they do produce an error, that there are ways to mitigate it. This is a reason why DDR5 introduces a host of reliability features not supported by previous-generation DDR specifications.

All DDR5 memory devices feature on-die single error correction (SEC) ECC. The capability pairs 128 data bits with 8 parity bits to form a 136-bit codeword stored in the DRAM during a write command. Subsequent read commands to that address calculate a syndrome over the 136 bits and correct single-bit errors, if they occur, before outputting the data to the controller. The on-die ECC promises to help memory makers improve or maintain yields of high-capacity DDR5 dies produced on upcoming process technologies at commercially acceptable levels, which is important for their business. Furthermore, since single-bit errors may also occur after a memory device has been in service for years, on-die ECC is expected to fix such issues too, improving the long-term reliability of systems.
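
For intuition, here is a minimal single-error-correcting Hamming code with the same (136, 128) geometry: 8 parity bits suffice because they can encode 2^8 = 256 possible syndromes, more than the 137 outcomes needed (136 bit positions plus "no error"). This is an illustrative construction, not the actual on-die ECC used by any DRAM vendor:

```python
import random

PARITY_POSITIONS = (1, 2, 4, 8, 16, 32, 64, 128)   # powers of two, 1-indexed

def encode(data_bits):                 # 128 data bits -> 136-bit codeword
    word = [0] * 137                   # index 0 unused; positions 1..136
    data = iter(data_bits)
    for pos in range(1, 137):
        if pos not in PARITY_POSITIONS:
            word[pos] = next(data)
    for p in PARITY_POSITIONS:         # parity bit p covers positions with bit p set
        word[p] = sum(word[i] for i in range(1, 137) if i & p) % 2
    return word

def correct(word):                     # syndrome = XOR of set-bit positions
    syndrome = 0
    for pos in range(1, 137):
        if word[pos]:
            syndrome ^= pos
    if syndrome:                       # a nonzero syndrome names the flipped bit
        word[syndrome] ^= 1
    return word

cw = encode([random.randint(0, 1) for _ in range(128)])
bad = cw.copy()
bad[random.randrange(1, 137)] ^= 1     # inject a single-bit error
assert correct(bad) == cw              # the read path repairs it
```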

“It is a scaling enablement,” said Ross. “And it is also an improvement for the overall system reliability as well. So, you get more out of it than just the scaling enablement too.”

On-die single-bit ECC does not eliminate the need for error correction across the whole memory subsystem. On the system level, the DDR5 architecture offers Hamming-code-based ECC across the memory channel, which can detect double-bit failures (and, in more elaborate implementations, correct them).

On top of on-die and system-level ECC, DDR5 introduces the error check and scrub (ECS) function, which performs a full scrub of the data bits in the array within a set time period, corrects errors where needed, and reports the number of errors corrected during the scrub, subject to a minimum threshold, to the host. The ECS capability is particularly important for servers, as it greatly enhances reliability and serviceability within a machine and helps to detect potential failures early and avoid downtime, a welcome feature for those running fleets of servers whose business depends on their wellbeing.
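
A toy model of the scrub loop helps to show the mechanism. The sketch below stores each value in triplicate and repairs by majority vote, which is deliberately simpler than the real on-die SEC code; the point is the walk-correct-count-report cycle, and the threshold and reporting path are illustrative assumptions:

```python
def majority(a, b, c):
    return (a & b) | (b & c) | (a & c)        # bitwise majority vote

def scrub(memory, threshold=1):
    corrected = 0
    for addr, (a, b, c) in enumerate(memory):
        good = majority(a, b, c)
        if (a, b, c) != (good, good, good):   # error found: fix it in place
            memory[addr] = (good, good, good)
            corrected += 1
    if corrected >= threshold:                # report only above the threshold
        print(f"ECS report: {corrected} error(s) corrected")

mem = [(v, v, v) for v in range(16)]
mem[5] = (5, 5 ^ 0b100, 5)                    # corrupt one copy of one entry
scrub(mem)                                    # -> ECS report: 1 error(s) corrected
```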

Modern DRAMs are complex semiconductor devices with billions of cells, any of which can fail at some point. To that end, newer memory standards support Post-Package Repair (PPR) capabilities that enable an appropriate DRAM controller to remap accesses from a defective row of a memory device to a healthy spare row. The DDR5 specification features two types of PPR: non-volatile permanent hard repair (hPPR) and temporary soft repair (sPPR).

The sPPR feature backs up only the one target row in the bank where the repair has to be performed, which takes only around 2 μs per row of data, according to Micron. As a result, sPPR minimizes the system time needed to back up and store large chunks of data, something that will become increasingly important for systems featuring terabytes of RAM.

DDR5’s PPR can also dynamically track the availability of PPR resources in each bank, both in single-die and multi-die 3DS stacked packages, which provides added visibility into the RAS capabilities of a given machine.

Energy Efficiency: Taking VRM and PMIC on the Module
Power consumption of DRAM represents a fraction of what a client PC consumes, but in tightly packed servers, memory modules can consume hundreds of Watts (Micron’s 8 Gb DDR4 device consumes about 0.4 W). Therefore, any saving is important for companies employing tens of thousands of machines.
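
"Hundreds of Watts" is easy to reproduce with rough arithmetic. The configuration below is an assumption (a dual-rank x4 RDIMM with ECC carries 36 DRAM devices), anchored only to the ~0.4 W per-device figure cited above:

```python
watts_per_device = 0.4      # Micron's figure for an 8 Gb DDR4 device
devices_per_module = 36     # assumed: dual-rank x4 RDIMM including ECC devices
modules_per_socket = 16
sockets = 2

total_w = watts_per_device * devices_per_module * modules_per_socket * sockets
print(f"{total_w:.0f} W of DRAM in one two-socket server")   # -> 461 W
```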

DDR5 lowers the Vdd supply from DDR4’s 1.2 Volts to 1.1 Volts, which clearly helps to reduce power consumption, and tightens the allowable fluctuation to about 3% (i.e., ±0.033 V). The fluctuation range was reduced in a bid to maintain stability at high data rates, but to ensure that everything works as designed, DDR5 memory modules have to carry their own voltage regulator modules (VRMs) and power management ICs (PMICs).

DRAM makers claim that dedicated VRMs and PMICs promise better voltage tolerance for improved DRAM yields, as well as the potential to further reduce power consumption or increase performance. Memory modules with their own VRMs and PMICs will naturally be slightly more expensive than today’s DIMMs for client PCs, but they will simplify the design of motherboards. In the server world, design simplification may be important for machines that have 12 to 16 memory slots per CPU socket. Furthermore, each module will carry only the specific VRM and PMIC that it needs.

“We will move much of that off of the motherboard, making that easier to manufacture,” said Rhoden. “Certainly [the motherboards will] cost less. Then, when we add [VRM and PMIC on memory modules], we will add specific components for each of the memory modules as we are moving forward. That is exactly what this is all about.”

At this point neither JEDEC nor memory makers reveal their expectations about the added costs of onboard VRM and PMIC and only say that the market will decide.

Source: JEDEC

 

DDR5 continues to use a 288-pin arrangement, but with a different pinout. Meanwhile, DDR5 modules will generally be slightly more complex than DDR4 modules because of the onboard VRM, PMIC, and temperature sensors. It is also noteworthy that the VRM, PMIC, and SPD components for client-grade DIMMs will differ from those for server-grade RDIMMs and LRDIMMs. In the meantime, enthusiast-class modules for overclockers will probably boast overengineered VRMs to push data transfer rates to the max.

Availability
JEDEC usually publishes new specifications well before actual hardware availability. Development of DDR5-supporting server CPUs and SoCs is well underway at companies like AMD, IBM, and Intel, though it is not up to a standard-setting body to preannounce such projects or their launch timeframes.

JEDEC says that historically it has taken the industry about a year and a half to introduce platforms using a given specification. Meanwhile, the transition from DDR3 to DDR4 was led by the server market, and specialists from JEDEC believe that the conversion to DDR5 will also start with servers, with client systems following later.

“The transition from DDR3 to DDR4 was led by the server market and then the client followed,” said Rhoden. “Since 2014 when that transition started for DDR4, we know that the data center and the cloud market has grown substantially. So, we expect that’s going to be even more so with DDR5, [with servers] driving adoption [of the new standard] and clients following.”

Given everything that has been said so far, the first DDR5-supporting server platforms are set to hit the market in late 2021 at the earliest, with substantial volumes arriving in 2022. Considering AMD’s and Intel’s “servers first” approach to the latest technologies, it is likely that client DDR5-powered machines will not ship until the latter half of 2022 at the earliest.

Samsung now says that it is going to make DDR5 available in volume when the market needs the new type of memory, without disclosing any actual timeframes.

“We are delighted to see the timely release of this standard, and anticipate bringing our DDR5-standardized solutions to volume production in a timeframe that coincides with market demands,” said Sangjoon Hwang, senior vice president of memory product planning at Samsung Electronics.

SK Hynix, which described its first 16 Gb DDR5 chip with a 6.4 Gbps per-pin data rate at ISSCC 2019, says it is ready to mass produce the new type of memory as soon as the second half of this year. But even if it commences production, all the modules made early will have to sit in warehouses and wait for DDR5 platforms to arrive.

“SK Hynix is opening up a new sector in the market through the development of the industry’s first DDR5 that meets JEDEC standards,” said Uksong Kang, vice president of DRAM product planning at SK Hynix. “We have been working with many partners to verify the DDR5 ecosystem through development of test chips and modules since 2018, and doing our best to secure mass-production levels in the second half of this year.”

Micron’s Technology Enablement Program
To help other chip designers accelerate the design, development, and qualification of DDR5-based platforms, Micron has started its DDR5 Technology Enablement Program in collaboration with Cadence, Montage, Rambus, Renesas, and Synopsys. Qualified participants of the program will be able to obtain DDR5 devices and modules as well as various components (DDR5 RCD, DDR5 DB, DDR5 SPD-Hub, DDR5 PMIC, DDR5 temperature sensor) for RDIMMs and LRDIMMs.

“We are pleased to offer a comprehensive portfolio of low-power DDR5 logic devices — including DDR5 RCD, DDR5 DB, DDR5 SPD-Hub, DDR5 PMIC and DDR5 Temperature Sensor — and to help drive quick industry adoption of DDR5 memory,” said Geof Findley, vice president of sales and business development at Montage Technology.

In addition, participants will have access to data sheets, electrical and thermal models, as well as consultation on signal integrity and other kinds of technical support required during the development process.

“Micron has been driving deep technical engagements with the world’s largest server and hyperscale companies to help them better understand how DDR5 can benefit their unique workloads,” said Tom Eby, senior vice president and general manager of the compute & networking business unit at Micron. “We are proud to now extend that collaboration across the entire ecosystem through our Technology Enablement Program to help accelerate time to market for this critical next generation technology.”

Summary
JEDEC’s DDR5 SDRAM standard addresses the capacity, performance, efficiency, power, and RAS needs of upcoming client and server systems. While the key innovations of new DRAM types have always been performance and capacity, DDR5 supports features aimed at improving manufacturing yields, reliability, and resiliency, which are crucially important for memory makers and datacenters.

Among the interesting peculiarities of the upcoming DDR5 memory modules for clients and servers will be onboard VRMs and PMICs designed to ensure a stable Vdd supply at 1.1 Volts with an allowable fluctuation of about 3%. Moving VRMs and PMICs from motherboards to modules will simplify the design of the former, but will increase the cost of the latter.

The DDR5 technology will first be adopted by server platforms starting in late 2021 or sometime in 2022, with client PCs following a bit later. JEDEC expects DDR5 to have an active life of at least seven years. At this point, DRAM makers expect the first wave of DDR5 memory chips to feature an 8 Gb or a 16 Gb capacity as well as a 4.8 Gbps data transfer rate or higher. Eventually, capacities are set to increase to 64 Gb; the standard defines data rates up to 6.4 Gbps, but DRAM makers are already mulling 8.4 Gbps.

Since DRAM producers have had working DDR5 ICs for many quarters now, they appear ready to start production of DDR5 chips at any time, but without platforms the new DRAMs are hardly needed by the market. Meanwhile, with the additional time memory makers now have to refine their DDR5 IC designs, it is possible that even the first wave of DDR5 platforms will be able to support higher performance than expected today.

The DDR5 specification describes memory devices featuring a 64 Gb capacity as well as stacks of up to 16 memory chips, which will eventually enable makers to build 2 TB modules. Considering all the capacity and performance-related innovations that DDR5 brings, it is clear the standard is here for the long haul, and DDR5 memory chips will be produced using multiple process technologies, including those that use extreme ultraviolet lithography.

 
