Intel’s 2021 Architecture Day was full of in-depth disclosures, and this year’s event focused on the architectural details of the company’s upcoming Alder Lake CPUs. These chips combine two types of cores in a hybrid design, a first for x86 desktop PC chips, and will span everything from desktops to ultra-mobile devices. However, unlike the Arm hybrid designs we have seen, which are tuned for power efficiency, Intel has tuned Alder Lake for the highest possible performance. Intel claims that Alder Lake’s high-performance cores deliver an average 19% IPC improvement over Rocket Lake chips, making them the fastest high-performance cores the company has built, and that its new efficiency cores provide up to five times the power efficiency of Skylake. Alder Lake also supports PCIe 5.0 and DDR5, putting it ahead of AMD and Apple in connectivity technology, and its mobile designs exceed Ryzen’s core counts. This may be a much-needed victory when Alder Lake goes on sale in the fall of 2021.
Intel also outlined its Sapphire Rapids and IPU processors for the data center, which bring plenty of new advancements of their own, and shared details about its new Xe Arc Alchemist discrete gaming GPUs for desktops, along with the data center-bound Ponte Vecchio Xe-HPC GPUs.
Intel shared a lot of new information about its latest CPU architectures, but we are focusing on Alder Lake in this article. Here is a short list of the disclosures; we cover each topic in more depth in the sections below and on the following pages:
- Alder Lake SoCs will span everything from desktops to ultra-mobile devices, with TDP ratings ranging from 9W to 125W, all built on the Intel 7 process. Desktop chips come with up to eight Performance (P) cores and eight Efficiency (E) cores, for a total of 16 cores and 24 threads, and up to 30MB of L3 cache on a single chip.
- Alder Lake supports DDR4 or DDR5 (LP4x/LP5 is also supported). Desktop chips support x16 PCIe Gen 5 and x4 PCIe Gen 4, while mobile chips support x12 PCIe Gen 4 and x16 PCIe Gen 3, Thunderbolt 4, and Wi-Fi 6E.
- Intel’s new hyper-threaded Performance (P) core uses the Golden Cove microarchitecture, which is designed for low-latency single-threaded performance. It delivers an average of 19% more IPC than the Cypress Cove architecture in Rocket Lake. The data center variants also support AVX-512 and AMX (a new AI-focused matrix multiplication ISA); both are disabled on consumer chips.
- Intel’s new single-threaded Efficiency (E) core uses the Gracemont microarchitecture, which is designed to improve multi-threaded performance while providing excellent area efficiency (a small footprint) and performance per watt. Four of the small cores fit in the same area as a single Skylake core and provide 80% more performance (at the same power) in threaded work. In single-threaded work, a single E core also delivers 40% more performance than a single Skylake core (at the same power). Caveats apply to both claims.
- Intel’s Thread Director is a hardware-based technology that provides enhanced telemetry data to the Windows 11 scheduler to ensure that threads are assigned to the P or E cores in an optimized manner, which may alleviate one of the main pain points of hybrid architectures in a standard desktop environment. This is the sleeper technology that makes the hybrid architecture work.
- Alder Lake does not support AVX-512 under any circumstances (the hardware is present in the P cores but disabled, and the E cores do not support it).
- Intel’s Sapphire Rapids brings many new technologies, including four dies connected by high-bandwidth EMIB interconnects, which purportedly allows the chip to operate much like a monolithic design. The chip is built on the Intel 7 process and uses an enhanced version of the Golden Cove microarchitecture. Each core has a 3MB L2 cache. It also supports AVX-512, DSA, and the new AMX matrix multiplication extensions, which can improve AI performance. Some SKUs also come with on-package HBM2E memory.
- Intel will host its first Intel Innovation event from October 27th to 28th, including keynotes, demonstrations, and technical sessions. The event will be both in-person (location not yet announced) and remote, and it is widely expected to be the official unveiling of the Alder Lake processor stack.
Alder Lake configuration and SoC
Quick review: The design of Intel’s Alder Lake architecture is reminiscent of Arm’s big.LITTLE. The larger cores are mainly used for high-priority single-threaded work, while the smaller cores handle multi-threaded workloads and less intensive background tasks. Intel uses a combination of “big” Performance (P) Golden Cove cores and “small” Efficiency (E) Atom Gracemont cores to accomplish the task. We will dig deeper into the core architectures on the next page.
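To make the division of labor concrete, here is a toy placement sketch in Python. It is not Intel’s Thread Director algorithm or the Windows scheduler: the `Task` class, the `foreground` flag, and `assign_cores` are all hypothetical names, and the policy is simply that latency-sensitive tasks get first pick of the P cores while background tasks prefer the E cores.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    foreground: bool  # hypothetical flag marking latency-sensitive work


def assign_cores(tasks, p_cores, e_cores):
    """Toy policy: foreground tasks prefer P cores, background tasks prefer E cores."""
    free_p = list(range(p_cores))
    free_e = list(range(e_cores))
    placement = {}
    # Stable sort: foreground tasks come first, so they get first pick of P cores.
    for task in sorted(tasks, key=lambda t: not t.foreground):
        if task.foreground:
            pools = (("P", free_p), ("E", free_e))
        else:
            pools = (("E", free_e), ("P", free_p))
        for kind, pool in pools:
            if pool:
                placement[task.name] = f"{kind}{pool.pop(0)}"
                break
        else:
            # No free core left; a real scheduler would time-slice instead.
            placement[task.name] = None
    return placement


if __name__ == "__main__":
    demo = [Task("game", True), Task("indexer", False), Task("render", True)]
    print(assign_cores(demo, p_cores=1, e_cores=2))
```

A real hybrid scheduler works from live hardware telemetry rather than a static flag, so this only illustrates the general idea of steering work by priority.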
Intel’s goal with Alder Lake is to create a small number of IP blocks that it can mix and match into designs that address the broad consumer market, spanning 7W to 125W TDP.
As you can see above, Intel etches the P-Cores and E-Cores onto the same CPU die, and a cluster of four of the smaller E-Cores (we marked one E-Core cluster in red) consumes roughly the same amount of die area as a single high-performance P-Core (dark blue). The diagram may not be drawn exactly to scale, but Intel tells us that it can fit four E-Cores in the same space as a single Skylake core.
Alder Lake chips use the Intel 7 process, which was called “10nm Enhanced SuperFin” before Intel renamed its process nodes during its recent process and packaging roadmap update. The Golden Cove cores support Hyper-Threading, which allows two threads to run on a single core, while the smaller Gracemont cores are single-threaded. Both types of cores come as part of an IP block that also includes portions of the cache hierarchy (such as the L1, L2, and part of the LLC). This means that some models may have seemingly strange distributions of cores and threads.
Intel ties the cores, L3 cache (LLC), memory, and other IP blocks together with a ring bus, as we have seen with its previous CPU architectures for mainstream desktops.
The integrated graphics engine uses the same Gen12 Xe LP architecture found in Tiger Lake, but ported to the Intel 7 process. It comes in two variants: a GT1 variant with 32 EUs for desktop PCs (because they tend to use discrete GPUs), and a GT2 variant with 96 EUs for mobile chips. Intel says the Xe LP engine supports 1080p gaming and features a 12-bit end-to-end video pipeline. You will notice that the desktop PC model does not have Thunderbolt 4 connectivity or an image processing unit (IPU); these features are only available in the mobile versions.
Alder Lake desktop PC chips will be equipped with up to 8 performance cores and 8 efficiency cores for a total of 24 threads (two threads per P-Core and one thread per E-Core). These chips will also be equipped with up to 30MB of L3 cache.
Alder Lake’s new memory controller supports four different memory types: DDR5-4800 and LP5-5200, plus DDR4-3200 and LP4x-4266. Supporting all four in a single design allows different memory configurations for different use cases. Intel appears to be splitting its memory support, with DDR4 for low-end motherboards (B- and H-series) and mobile systems, and DDR5 reserved for high-end configurations (Z-series motherboards). That makes sense given the high pricing expected for DDR5 early in its adoption, but it is worth noting that Intel has not confirmed this approach.
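For a sense of scale, the theoretical peak bandwidth of each supported memory type can be estimated from its transfer rate. The sketch below assumes a 128-bit total bus width (for example, two 64-bit DDR4 channels); Intel did not state bus widths in this disclosure, so both that figure and the function name are assumptions, and real-world throughput will be lower than these peaks.

```python
# Transfer rates in MT/s, taken from the memory types named above.
MEMORY_TYPES = {
    "DDR5-4800": 4800,
    "LP5-5200": 5200,
    "DDR4-3200": 3200,
    "LP4x-4266": 4266,
}


def peak_bandwidth_gb_s(mt_per_s, bus_width_bits=128):
    """Theoretical peak in GB/s: transfers per second times bytes per transfer.

    bus_width_bits=128 is an assumed total bus width, not a confirmed spec.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mt_per_s * 1e6 * bytes_per_transfer / 1e9


if __name__ == "__main__":
    for name, rate in MEMORY_TYPES.items():
        print(f"{name}: {peak_bandwidth_gb_s(rate):.1f} GB/s peak (assumed 128-bit bus)")
```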
Alder Lake also supports up to PCIe 5.0, with 64 GB/s of throughput over an x16 connection. Desktop PC chips support an x16 PCIe Gen 5 connection and an additional x4 PCIe Gen 4 connection (it’s not clear whether this x4 connection is used for the chipset or exposed to users), while the low-power versions pair an x12 PCIe Gen 4 connection with an x16 PCIe Gen 3 connection.
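As a rough check on the 64 GB/s figure, the link arithmetic can be sketched in Python. The per-lane rates (8, 16, and 32 GT/s for Gen 3, 4, and 5) and the 128b/130b line encoding come from the PCIe specifications rather than from Intel’s disclosure, and the function name is hypothetical. Under those assumptions, 64 GB/s matches the raw one-direction rate of an x16 Gen 5 link, and the usable figure is slightly lower once encoding overhead is subtracted.

```python
# Per-lane signalling rate in GT/s for each PCIe generation (from the PCIe specs).
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}

# Gen 3 and later use 128b/130b encoding: 128 payload bits per 130 bits on the wire.
ENCODING_EFFICIENCY = 128 / 130


def pcie_gb_per_s(gen, lanes, encoded=True):
    """One-direction link bandwidth in GB/s, before protocol overhead."""
    raw = GT_PER_S[gen] * lanes / 8  # each transfer carries one bit per lane
    return raw * ENCODING_EFFICIENCY if encoded else raw


if __name__ == "__main__":
    # Link configurations mentioned in the article.
    for gen, lanes in [(5, 16), (4, 4), (4, 12), (3, 16)]:
        raw = pcie_gb_per_s(gen, lanes, encoded=False)
        usable = pcie_gb_per_s(gen, lanes)
        print(f"Gen {gen} x{lanes}: {raw:.1f} GB/s raw, {usable:.2f} GB/s after encoding")
```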
The collection of P and E cores, caches, and the higher-throughput 64 GB/s PCIe 5.0 and DDR5 subsystems requires a powerful fabric to ensure low-latency, high-throughput connections between the various elements. Alder Lake’s compute fabric ties these elements together with up to 1,000 GB/s of throughput across the cores and core clusters. Intel said that the bus has a dynamic bandwidth/latency optimization scheme based on fabric utilization, but it is not clear how this differs from a standard ring bus with a traffic routing mechanism. The system can also switch the L3 cache between an inclusive and a non-inclusive policy based on utilization.
In addition, the memory fabric supports up to 204 GB/s of throughput and can adjust its bus width and frequency on the fly. This means that Alder Lake’s memory subsystem can move dynamically between high-frequency and low-frequency operating states based on heuristic analysis of real-time demand, with the goal of optimizing for either power or performance depending on the workload at hand.
The first chips based on this design come in three different packages, each aimed at a different market segment: desktop PC chips that drop into new motherboards with the LGA 1700 socket (yes, LGA 115x coolers are compatible with an adapter), a high-performance BGA Type3 package for mobile applications (this may be the 12-28W UP3 package, although Intel has not confirmed it yet), and a high-density BGA Type4 HDI package for ultra-mobile applications (likely equivalent to the ultra-thin 7-15W UP4).
We have gathered a lot of information from official Linux Coreboot patches that outline the various combinations of P and E cores, and we have also listed Intel’s three product categories from the image above:
- Alder Lake-S: Desktop computer
- Alder Lake-P: High-performance notebook
- Alder Lake-M: Low-power devices
|Big cores + small cores|Cores/threads|Graphics processor|
|---|---|---|
|8 + 8|16/24|GT1-Gen12 32EU|
|8 + 6|14/22|GT1-Gen12 32EU|
|8 + 4|12/20|GT1-Gen12 32EU|
|8 + 2|10/18|GT1-Gen12 32EU|
|8 + 0|8/16|GT1-Gen12 32EU|
|6 + 8|14/20|GT1-Gen12 32EU|
|6 + 6|12/18|GT1-Gen12 32EU|
|6 + 4|10/16|GT1-Gen12 32EU|
|6 + 2|8/14|GT1-Gen12 32EU|
|6 + 0|6/12|GT1-Gen12 32EU|
|4 + 0|4/8|GT1-Gen12 32EU|
|2 + 0|2/4|GT1-Gen12 32EU|
*Intel has not officially confirmed these configurations, so not all models may come to market. The list also assumes that all models have Hyper-Threading enabled on the large cores.
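The core and thread counts in the table follow from simple arithmetic: each big core contributes two threads with Hyper-Threading enabled, and each small core contributes one. A minimal Python sketch (the function name is hypothetical, and the configurations are the unconfirmed ones listed above) reproduces the column:

```python
def core_thread_count(p_cores, e_cores, threads_per_p_core=2):
    """Return (cores, threads), assuming Hyper-Threading on every P core."""
    cores = p_cores + e_cores
    threads = p_cores * threads_per_p_core + e_cores
    return cores, threads


# Unconfirmed (big, small) core combinations from the table above.
CONFIGS = [
    (8, 8), (8, 6), (8, 4), (8, 2), (8, 0),
    (6, 8), (6, 6), (6, 4), (6, 2), (6, 0),
    (4, 0), (2, 0),
]

if __name__ == "__main__":
    for p, e in CONFIGS:
        cores, threads = core_thread_count(p, e)
        print(f"{p} + {e}: {cores}/{threads}")
```

Passing `threads_per_p_core=1` models a hypothetical SKU with Hyper-Threading disabled, which would change the thread counts but not the core counts.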
As we have seen above, the flagship desktop model will come with eight “large” cores that support Hyper-Threading and eight single-threaded “small” cores, for a total of 24 threads. Logically, we can expect the 8 + 8 configuration to fall into the Core i9 family, while 8 + 4 may land in the Core i7 family, and 6 + 8 and 4 + 0 may land in the Core i5 and i3 families, respectively. However, given the new paradigm of hybrid x86 designs, it is impossible to know exactly how Intel will carve up its product stack.
Now that we have a better understanding of how the chips are designed at the SoC level, let’s see how Intel ensures that applications land on the correct cores, and then take a deeper look at the core microarchitectures.