A 20-Year-Old Chipset Workaround Has Been Hurting Modern AMD Linux Systems (phoronix.com)
AMD engineer K Prateek Nayak recently uncovered that a 20-year-old chipset workaround in the Linux kernel, still being applied to modern AMD systems, is responsible in some cases for hurting performance on modern Zen hardware. Fortunately, a fix is on the way to limit that workaround to old systems and in turn help performance on modern ones. Phoronix reports: Last week a patch was posted for the ACPI processor idle code to avoid an old chipset workaround on modern AMD Zen systems. Since ACPI support was added to the Linux kernel in 2002, there has been a "dummy wait op" to deal with some chipsets where STPCLK# doesn't get asserted in time. The dummy I/O read delays further instruction processing until the CPU is fully stopped. This was a problem with at least some AMD Athlon era systems with a VIA chipset... But not a problem with newer chipsets of roughly the past two decades.
With this workaround still being applied to even modern AMD systems, K Prateek Nayak discovered: "Sampling certain workloads with IBS on AMD Zen3 system shows that a significant amount of time is spent in the dummy op, which incorrectly gets accounted as C-State residency. A large C-State residency value can prime the cpuidle governor to recommend a deeper C-State during the subsequent idle instances, starting a vicious cycle, leading to performance degradation on workloads that rapidly switch between busy and idle phases. One such workload is tbench where a massive performance degradation can be observed during certain runs."
At least for Tbench, this long-time, unconditional workaround in the Linux kernel has been hurting AMD Ryzen / Threadripper / EPYC performance in select workloads. The workaround hasn't affected modern Intel systems, since those newer Intel platforms use the alternative MWAIT-based intel_idle driver code path instead. The AMD patch evolved into this patch by Intel Linux engineer Dave Hansen. That patch to limit the "dummy wait" workaround to old systems is already queued into TIP's x86/urgent branch. With it going the route of "x86/urgent" and fixing an overzealous workaround that isn't needed on modern hardware, it's likely this patch will still be submitted this week for the Linux 6.0 kernel rather than waiting for the next (v6.1) merge window.
Re:This won't affect me. (Score:5, Funny)
Thanks for clearing that up. Can I get your name, email and address so we can add you to the list of people unaffected?
Re: (Score:2)
Athlon started at 500MHz, P3 at 400.
150 MHz sounds like an overclocked K5, 166 MHz would be an original Pentium or a Pentium MMX.
Re: (Score:2)
No wait - these were the days of clocking down to make things work at all...oh wait maybe it was in the 90s...can't remember anymore ;-)
Re: (Score:2)
My first system back in the day was a K7 Slot A Athlon (the first Athlons were all slot-format processors rather than socketed; you slotted them in like you did expansion cards, except vertically rather than horizontally). I am quite certain that there was no way to arrange jumpers to get significantly below 500MHz on my motherboard back in the day, and I would wager that others wouldn't let you do it either. Back then frequency was set via two motherboard jumpers. First would determine the multiplier and second
Re: This won't affect me. (Score:2)
Ahh, you've just reminded me of the 486/Pentium era beige box cases that displayed the clock freq in LEDs on the front of the box.
And overclocking was as simple as toggling a switch or three on your mobo. No locked CPUs. 32MB RAM was beefy and there was no way known (until MPEG audio and video came along) of filling a 1.6GB HDD.
Re: (Score:2)
"and there was no way known (until MPEG audio and video came along) of filling a 1.6GB HDD." I'm sorry, what? There was a truism already back in the early 90's: when you work with graphics, music or video, there are three facts of life: you never have enough RAM, you'll always need more storage, and there's no such thing as "CPU is fast enough". And that is fairly true today too. As soon as you start going beyond simple editing, your system requirements increase rapidly.
Re: This won't affect me. (Score:2)
I can remember a colleague saying at the time, "How you gonna fill that drive?"
Yes, obviously we could have been running an NNTP server with 50,000 groups including all the binaries but I wasn't talking production or even workstation level computing. I gave the specs and use case and even mentioned a generic beige bo
Re: (Score:2)
Re: This won't affect me. (Score:1)
Sounds like the very beginning of the speculative execution nonsense, when Intel and AMD were letting developers access processor instructions like reads/writes to L1, L2 for their own hardware, software, proposed subprocessors.
Re: (Score:3, Interesting)
un-VIA-ble (Score:4, Insightful)
Re: (Score:2)
VIA had a very well earned HORRIBLE reputation with their motherboard chipsets back in the era between the Pentium II and Pentium III.
Re: (Score:2)
They were horrible well into the AthlonXP era, until they eventually fizzled out.
Re: (Score:2)
I think many people here made that mistake at some point in their life. I remember VIA made my following purchase an offer priced first party Intel reference motherboard, that's how much it soured my experience.
Re: (Score:2)
*over priced. Damn autocorrect.
Re: (Score:2)
Re: (Score:2)
Swipe typing on a phone most likely. Those two words would have a similar pattern.
Re: (Score:2)
A lot of VIA's problems (and AMD's by extension, since VIA made most of the chipsets back then, before AMD started making their own) were that they would get their clock rates by overclocking the PCI bus, which caused all kinds of shit to go sideways. Usually the fix for these systems was to get the PCI bus back into spec by clocking it at 33MHz, which got your AGP back to 66MHz since it was a hard 2x clock over PCI, but it meant underclocking the CPU from whatever the marketing guys sold you.
Now motherboards h
Re: (Score:2)
Intel PCI buses were also more reliable. If you owned a Creative Sound Blaster with a PCI interface, plugging it into an AMD or VIA chipset computer was a great way to c
Only Zen chips? (Score:3, Insightful)
There's been a lot of chipset releases between Zen and those old VIA chipsets... this is probably affecting a lot of other hardware in the wild unnecessarily too, isn't it?
yes and no (Score:2)
It's my understanding it does affect the in-between.
Zen is 5 years old. Are there a LOT of AMD CPUs running heavy workloads that are more than 5 years old? I guess it depends on your definition of "a lot".
Backport to older kernels? (Score:5, Interesting)
I hope this gets backported to older, but still supported, 4 and 5 kernels.
Also, someone please fix the SATA NCQ brokenness with EPYC and Samsung SSDs...
Re:Backport to older kernels? (Score:4, Interesting)
It'd likely be up to the maintainers of older kernels to decide if they want to port it or not. If the performance impact is significant, there's a good chance Red Hat etc. will pull in the patch anyway.
Also, upgrade your kernels :)
re the SATA thing, have you, or someone else, got a bug report into the SATA maintainer, whoever that is?
Re: (Score:2)
Having seen the drama from people upgrading from 4.x to 5.x kernels and finding older hardware no longer supported or suddenly broken, I'll stick with the 4.x kernel for as long as I can on my older hardware.
The EPYC/Samsung thing has been in limbo since about 2018, and neither AMD nor Samsung wants to take responsibility/fix it. Turning off NCQ altogether is the only thing that makes it go away, but it makes the drive much slower.
Re:Backport to older kernels? (Score:4, Interesting)
I know from experience the SATA NCQ Samsung SSD issue is not limited to EPYC; it occurs even on old hardware such as an AMD Athlon(tm) II X2 250 processor on a motherboard with a Marvell 9215 SATA controller.
Which in my opinion rules out the host hardware and points to a problem with the SATA driver and/or Samsung SSD controllers.
Re: (Score:2)
Gosh, I might even do this by hand if Debian doesn't. It's literally just:
processor_idle.c:
- if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
+ if (boot_cpu_has(X86_FEATURE_HYPERVISOR) || boot_cpu_has(X86_FEATURE_ZEN))
The energy reduction across every Zen Debian machine could be huge.
Re:another worthless contribution to global warmin (Score:5, Funny)
And then you decided to make it worse by typing out this comment and hitting send. Have you no shame?
News For Nerds, Indeed (Score:2, Funny)
Can I say this is a return to core values?
Re: (Score:2)
An anecdote is not data. I wouldn't celebrate yet.
not a programmer (Score:2)
but why don't they just not call it when it's not needed? Can't they detect what processor and chipset the computer is running, and if it doesn't need the dummy call, not call it? And if it does, then call it. Is this somehow possible without compiling
Re: (Score:1)
Yeah, it's probably a simple check they didn't bother making specific enough, but I suspect it's the type of thing that would be a one-line patch that requires a full kernel rebuild.
observation (Score:2)
"Sampling certain workloads with IBS on AMD Zen3 system shows that a significant amount of time is spent in the dummy op, which incorrectly gets accounted as C-State residency. A large C-State residency value can prime the cpuidle governor to recommend a deeper C-State during the subsequent idle instances, starting a vicious cycle, leading to performance degradation on workloads that rapidly switch between busy and idle phases."
Doesn't the idle system seem overly complex?
Re: (Score:3)
You're saying that the idle system is particularly busy?
Kudos (Score:5, Informative)
to K Prateek Nayak, the AMD engineer who discovered the problem. This type of issue is incredibly hard to troubleshoot, particularly since it can hardly be replicated by simulation; you need to test it live on the actual hardware.
Re: (Score:1)
The (mis)behavior isn't caused by hardware, it's caused 100% by software.
Re: (Score:3)
This particular issue is pretty much impossible to replicate by simulation, since the behavior is a combination of software response and hardware response to the faulty software.
Re: (Score:2)
There is no hardware response.
That's why the issue is so easily fixable.
The C-state idle heuristics are chosen by the idle driver in use on AMD CPUs.
That is *not* automatic behavior by the CPU.
The C state heuristics in the idle driver were being misled by the imposed wait.
In short, you have no idea what you're talking about.
Re: (Score:2)
On ACPI systems, the idle state is controlled thusly:
The kernel asks the firmware (ACPI) to put the processor in a particular idle state, based on its polling of the time in each idle state.
The processor_idle driver (the basic generic x86 ACPI idle driver) had a wasted ACPI call designed to was
Re: (Score:2)
So sure, from a certain point of view, you're right- it required hardware.
Which was exactly what I claimed. It was not possible to emulate ones way to this issue (by which I do not mean running a VM, but an emulator). And boy, did that cause your cheerios to get a pissy taste. Glad I could affect your day to that extent.
Re: (Score:2)
This particular issue is pretty much impossible to replicate by simulation
That is patently false.
I have explained how.
If you disagree with that, you are wrong.
It was not possible to emulate ones way to this issue (by which I do not mean running a VM, but an emulator).
Flat ass backwards.
This behavior exists in any ACPI firmware you pair with qemu (because they behave correctly).
This issue is not dependent on hardware. It is a misbehavior of the processor_idle driver paired with the ACPI firmware.
This behavior does not exist in a VM, but only because the kernel purposefully skips the extraneous call if it's running with a supported hypervisor.
And boy, did that cause your cheerios to get a pissy taste. Glad I could affect your day to that extent.
Being wrong sucks. I get it. But
Re: (Score:2)
And yes, it is pretty much impossible to replicate. It took careful troubleshooting of specific benchmarks to even notice it was happening, and then a lot of deep diving with profiling to locate it.
Sure, once you know exactly what is happening, all you need is the code and running it through your head to understand what is going on. But that's not even remotely a feasible way to find out a problem like this even exists, much less pinpoint it.
Your argument is basically "when you know what is happening, the f
Re: (Score:2)
Your argument is basically "when you know what is happening, the fault is trivial to replicate", which is trivially correct, I'll give you that.
Correct.
I argued merely that it's easy to simulate, not that it was easy to run into.
Finding the culprit is also not that difficult.
Querying the C state residency immediately points you in the right direction.
The numbers are nonsense in "vicious cycle mode".
This points very directly at the idle driver, since it's responsible for selecting the C states.
Figuring out what is *wrong* with the idle driver is standard debugging, and that's where simulation comes in. This can be easily simulated with qemu,
Improvement (Score:5, Informative)
Fixed by an Intel engineer? (Score:3)
Well now we know the name of an Intel engineer that got yelled at today. ;)