Earlier this year, in February, a leaked Task Manager screenshot suggested Intel was looking to kill hyper-threading (HT), the company's name for Simultaneous Multi-Threading (SMT), on Lunar Lake processors. Today at Computex 2024, with the launch of Lunar Lake, Intel has confirmed it is indeed doing so and has also explained why.
If you recall, Intel introduced its Performance Hybrid, or "Big-Bigger", architecture with 12th Gen Alder Lake processors, in which "Bigger" Performance cores (P-cores) were combined with "Big" Efficiency cores (E-cores) so that heavier tasks would be handled by the P-cores and lighter workloads by the E-cores.
However, despite the introduction of Thread Director, Intel noticed that there was room for improvement: the OS scheduler would generally send a task to a hyper-thread last, since a physical core is always prioritized.
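The scheduling preference described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `LogicalCPU` type, the `pick_cpu` helper, and the exact priority table are hypothetical and are not the Windows scheduler or any Thread Director interface. The article only states that the hyper-thread is chosen last; ranking P-cores ahead of E-cores here is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogicalCPU:
    cpu_id: int
    kind: str          # "P" = physical P-core, "E" = E-core, "HT" = second thread of a P-core
    busy: bool = False


# Assumed ordering for this sketch: lower number = preferred.
# Only "HT last" comes from the article; "P before E" is an assumption.
PRIORITY = {"P": 0, "E": 1, "HT": 2}


def pick_cpu(cpus: List[LogicalCPU]) -> Optional[LogicalCPU]:
    """Return the most preferred idle logical CPU, or None if all are busy."""
    idle = [c for c in cpus if not c.busy]
    if not idle:
        return None
    # Sort key: preference class first, then CPU id as a tie-breaker.
    return min(idle, key=lambda c: (PRIORITY[c.kind], c.cpu_id))


if __name__ == "__main__":
    cpus = [
        LogicalCPU(0, "P"), LogicalCPU(1, "HT"),
        LogicalCPU(2, "E"), LogicalCPU(3, "E"),
    ]
    for task in range(4):
        cpu = pick_cpu(cpus)
        cpu.busy = True
        print(f"task {task} -> CPU {cpu.cpu_id} ({cpu.kind})")
```

In this model the hyper-thread sibling only receives work once every physical core is occupied, which is the behavior Intel says made HT less valuable on a hybrid design.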
With Lunar Lake mobile CPUs, Intel claims it is seeing significant improvements in single-threaded performance and efficiency with its new optimized P-core with no HT. Intel says hyper-threading is better reserved for scenarios where multi-threading performance matters more.
The slides below detail the kind of performance and power efficiency improvements Intel observed with Lunar Lake's P-core after removing HT:
Intel adds that this is part of its broader effort to streamline the Lunar Lake architecture such that anything that does not contribute to the desired performance or power efficiency is cut out. The company explained what it set out to achieve with these architectures in the slides below. If you are wondering, Lion Cove is the Lunar Lake P-core architecture while Skymont is the one for E-cores.
Another change in Lunar Lake is the introduction of a new L0 D-cache (level 0 data cache). Lunar Lake P-cores (Lion Cove) have 2.5MB of L2 cache per core and up to 12MB of shared L3 cache. Meanwhile, the E-cores (Skymont) have 4MB of shared L2 cache.
The cores form clusters of four P-cores and four E-cores, and this 8-core hybrid design constitutes a Lunar Lake compute tile. It also has up to 32GB of on-package memory, which helps speed up data access and reduce latency.
Intel has also made changes to the Intel Thread Director (ITD). Unlike in previous generations, ITD now schedules tasks to the E-cores first, provided that they can handle the workload. Multi-threaded workloads are also handled this way first and, according to the company, this approach gives Microsoft Teams a 35% reduction in power consumption.
Microsoft's Tapan Ansel, Senior Software Engineer, Windows Core OS, and Bret Barkelew, Principal Software Engineering Lead (Energy Efficiency), Windows Core OS, say:
With Intel Thread Director technology, which identifies the most power efficient CPUs on Lunar Lake platforms, the Windows OS is able to create a ‘containment zone’ to constrain work to only those CPUs and keep the other performant CPUs parked/idle for use only when needed. This is delivering significant power savings for Teams Video Conferencing scenarios that fit well within the containment zone on Lunar Lake.
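The "E-cores first" policy and the containment zone described above can be sketched as a toy placement routine. Everything in it is an assumption made for illustration: the `schedule` and `unparked_p_cores` helpers, the `E_CORE_LIMIT` threshold, and the one-task-per-core model are hypothetical and do not reflect how Windows or Thread Director actually classify or place work.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical threshold: tasks with demand at or below this value are assumed
# to "fit" inside the E-core containment zone. Units are arbitrary.
E_CORE_LIMIT = 0.5


def schedule(demands: List[float], e_cores: int = 4, p_cores: int = 4
             ) -> List[Tuple[float, Optional[str]]]:
    """Place one task per core, preferring E-cores for tasks that fit the zone."""
    free_e = [f"E{i}" for i in range(e_cores)]
    free_p = [f"P{i}" for i in range(p_cores)]
    placement = []
    for demand in demands:
        fits_zone = demand <= E_CORE_LIMIT
        if fits_zone and free_e:
            core = free_e.pop(0)      # stay inside the containment zone
        elif free_p:
            core = free_p.pop(0)      # heavy task or zone full: unpark a P-core
        elif free_e:
            core = free_e.pop(0)      # heavy task, but only E-cores remain
        else:
            core = None               # no free core; a real scheduler would queue it
        placement.append((demand, core))
    return placement


def unparked_p_cores(placement: List[Tuple[float, Optional[str]]]) -> Set[str]:
    """P-cores that had to be woken up for this set of tasks."""
    return {core for _, core in placement if core and core.startswith("P")}


if __name__ == "__main__":
    light_call = schedule([0.2, 0.3, 0.1])        # e.g. a light, conferencing-style load
    print(light_call, unparked_p_cores(light_call))
    mixed = schedule([0.2, 0.9, 0.3])
    print(mixed, unparked_p_cores(mixed))
```

When every task fits the zone, no P-core is unparked in this model, which mirrors the power-saving argument in the quote; a single heavy task is enough to wake one.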
All these enhancements and more lead to a 14% IPC improvement (AMD claims 16% with its new Zen 5) for Lunar Lake P-cores (Lion Cove) compared to Meteor Lake P-cores (Redwood Cove):
On the E-core side, Intel claims Lunar Lake's Skymont is even faster than the P-cores on Raptor Lake (13th Gen). Compared to Meteor Lake's LP E-cores, Skymont is 68% faster, with bigger gains seen in floating-point (FP) throughput than in integer.
Finally, we have the NPU or Neural Processing Unit. The company claims huge improvements with its new NPU 4 design. We already knew from an earlier announcement that Intel had managed to meet the 40 TOPS necessary for Copilot+ PCs.
As you can see in the image above, at 48 peak TOPS (pTOPS), it is 20% ahead of the 40 TOPS necessary and is slightly behind AMD's 50 TOPS on the new Ryzen AI 300 series announced yesterday. However, Intel claims a total platform performance (CPU + GPU + NPU) of 120 TOPS. AMD on the other hand claims a "Total Processor Performance" of 80 TOPS.
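The percentages above follow from simple ratios, which a short sketch makes explicit. The TOPS figures are the vendor claims quoted in this article; the `percent_over` helper and the dictionary labels are hypothetical names used only for this example, and the two vendors' "total" figures may not be defined in the same way.

```python
COPILOT_PLUS_MIN_TOPS = 40  # Copilot+ PC requirement cited in the article

# Vendor-claimed peak NPU throughput (TOPS), as quoted above.
npu_tops = {"Intel Lunar Lake (NPU 4)": 48, "AMD Ryzen AI 300": 50}

# Vendor-claimed totals (TOPS); the definitions may differ between vendors.
total_tops = {"Intel Lunar Lake": 120, "AMD Ryzen AI 300": 80}


def percent_over(value: float, baseline: float) -> float:
    """How far `value` exceeds `baseline`, as a percentage of `baseline`."""
    return (value - baseline) * 100 / baseline


if __name__ == "__main__":
    for name, tops in npu_tops.items():
        margin = percent_over(tops, COPILOT_PLUS_MIN_TOPS)
        print(f"{name}: {tops} TOPS, {margin:.0f}% above the Copilot+ minimum")

    lead = percent_over(total_tops["Intel Lunar Lake"], total_tops["AMD Ryzen AI 300"])
    print(f"Intel's claimed total is {lead:.0f}% higher than AMD's claimed total")
```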
Thanks to the huge bump in AI processing on Lunar Lake compared to Meteor Lake, Intel says Stable Diffusion sees a massive improvement in power efficiency on the former.
We have covered the integrated graphics powered by the new Xe2 architecture separately here. Other key features include support for Bluetooth version 5.4, Wi-Fi 7, and more.
You can read about the rest of our Computex 2024 coverage in these articles here.