At this year’s Hot Chips 33, IBM, a pioneer of many technologies in use today, presented its Telum processor, which will power the next generation of the famous “Z” mainframe computers that form the backbone of the world’s technology infrastructure.
The IBM Z lineup of mainframe computers uses custom-made processors based on the z/Architecture, and Telum is the latest of them.
IBM has fabricated the processor, codenamed Telum, on Samsung’s 7 nm node, packing 22.5 billion transistors into a 530 mm² die. It has 8 CPU cores and runs at clock speeds above 5 GHz. Compared to the prior generation, IBM moved from the 14 nm process used for the z15 chip to 7 nm for this new chip. The 8 cores are paired with a 256 MB “semi-private” cache structure, split into eight 32 MB L2 caches. These L2 caches are then combined into a shared virtual L3 cache with 256 MB of capacity that connects all the cores. The resulting “ring” provides more than 320 GB/s of bi-directional bandwidth.
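As a quick sanity check on those figures, the short Python sketch below recomputes the cache total and the average transistor density from the quoted numbers. The constant and function names are made up for illustration, and the density is a simple die-wide average derived here, not a figure IBM has quoted.

```python
# Figures quoted in the article.
CORES = 8
L2_PER_CORE_MB = 32
TRANSISTORS = 22.5e9
DIE_AREA_MM2 = 530


def virtual_l3_mb(cores: int = CORES, l2_mb: int = L2_PER_CORE_MB) -> int:
    """Total capacity when every per-core L2 cache is pooled into the virtual L3."""
    return cores * l2_mb


def density_mtr_per_mm2(transistors: float = TRANSISTORS,
                        area_mm2: float = DIE_AREA_MM2) -> float:
    """Average transistor density in millions of transistors per mm^2."""
    return transistors / area_mm2 / 1e6


print(f"Virtual L3 capacity: {virtual_l3_mb()} MB")
print(f"Average density: {density_mtr_per_mm2():.1f} MTr/mm^2")
```

The eight 32 MB slices add up to exactly the 256 MB quoted for the virtual L3, which is consistent with the L3 being built out of the L2 caches and not being a separate block of memory.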
The CPU cores themselves use a custom out-of-order execution pipeline with SMT2 (simultaneous multithreading with two hardware threads per core) and an operating frequency of over 5 GHz. The updated z/Architecture adds instructions for neural network processing, which is where the dedicated artificial intelligence accelerator comes into play.
The AI accelerator is connected to the CPU cores, which invoke it through a specific set of instructions. It delivers over 6 TFLOPS of compute throughput on FP16/FP32 workloads. It supports a wide array of machine learning libraries, and the available AI throughput grows with every additional chip in a mainframe system.
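To put that scaling in rough numbers, here is a minimal sketch that assumes the ideal case in which every added chip contributes its full 6 TFLOPS, meaning linear scaling with no interconnect or software overhead. That assumption and the helper name are ours, not IBM's. Since IBM quotes "over 6" TFLOPS per chip, the results are best read as a floor on peak throughput.

```python
# "Over 6 TFLOPS" per chip, so this value is treated as a lower bound.
PER_CHIP_TFLOPS = 6.0


def aggregate_tflops(chips: int, per_chip: float = PER_CHIP_TFLOPS) -> float:
    """Peak AI throughput under an assumed ideal linear scaling model."""
    if chips < 1:
        raise ValueError("chips must be at least 1")
    return chips * per_chip


for chips in (1, 2, 8, 32):
    print(f"{chips:>2} chips: at least {aggregate_tflops(chips):.0f} TFLOPS peak")
```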
The next-generation IBM Z processor will use two of the 8-core Telum dies to form a 16-core dual-chip module. Four of these modules fill a 4-socket drawer, and a full system scales to four drawers, totaling 32 chips and 256 cores.
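The chip and core totals follow directly from multiplying out the hierarchy, as the small Python sketch below shows. The class and field names are illustrative only, and the thread count additionally assumes that both SMT2 threads are enabled on every core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    """Illustrative model of the Telum system hierarchy described in the article."""
    cores_per_chip: int = 8
    chips_per_module: int = 2    # dual-chip module
    modules_per_drawer: int = 4  # 4-socket drawer
    drawers: int = 4
    threads_per_core: int = 2    # SMT2, assumed enabled on every core

    @property
    def chips(self) -> int:
        return self.chips_per_module * self.modules_per_drawer * self.drawers

    @property
    def cores(self) -> int:
        return self.chips * self.cores_per_chip

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core


full_system = Topology()
print(f"{full_system.chips} chips, {full_system.cores} cores, "
      f"{full_system.threads} hardware threads")
```

Changing `drawers` to 1 gives the figures for a single drawer: 8 chips and 64 cores.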
The most interesting part of the system is perhaps the AI accelerator, which shows that AI workloads have reached a point where even mainframe computers need dedicated hardware for them. For some business-critical fraud detection workloads, classical algorithmic detection is no longer sufficient, so AI-enhanced software is increasingly needed, and IBM wants that software to run much faster. For more information, please take a look at the IBM Z website.