AMD Hardware Linux

A 20 Year Old Chipset Workaround Has Been Hurting Modern AMD Linux Systems (phoronix.com) 53

AMD engineer K Prateek Nayak recently uncovered that a 20-year-old chipset workaround in the Linux kernel, still being applied to modern AMD systems, is responsible in some cases for hurting performance on modern Zen hardware. Fortunately, a fix is on the way for limiting that workaround to old systems and in turn helping with performance for modern systems. Phoronix reports: Last week a patch was posted for the ACPI processor idle code to avoid an old chipset workaround on modern AMD Zen systems. Since ACPI support was added to the Linux kernel in 2002, there has been a "dummy wait op" to deal with some chipsets where STPCLK# doesn't get asserted in time. The dummy I/O read delays further instruction processing until the CPU is fully stopped. This was a problem with at least some AMD Athlon-era systems with a VIA chipset, but not a problem with newer chipsets of roughly the past two decades.

With this workaround still being applied to even modern AMD systems, K Prateek Nayak discovered: "Sampling certain workloads with IBS on AMD Zen3 system shows that a significant amount of time is spent in the dummy op, which incorrectly gets accounted as C-State residency. A large C-State residency value can prime the cpuidle governor to recommend a deeper C-State during the subsequent idle instances, starting a vicious cycle, leading to performance degradation on workloads that rapidly switch between busy and idle phases. One such workload is tbench where a massive performance degradation can be observed during certain runs."
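The feedback loop described in that quote can be illustrated with a toy model. The sketch below is not the kernel's cpuidle governor: the state table, the averaging rule, and every number in it are hypothetical, chosen only to show how time mis-accounted as residency can push a governor into a deeper state and keep it there.

```python
from collections import deque
from statistics import mean

# Hypothetical idle states: (name, target_residency_us, exit_latency_us).
IDLE_STATES = [("C1", 2, 1), ("C2", 30, 18)]


def pick_state(predicted_idle_us):
    """Return the deepest state whose target residency fits the prediction."""
    chosen = IDLE_STATES[0]
    for state in IDLE_STATES:
        if state[1] <= predicted_idle_us:
            chosen = state
    return chosen


def simulate(true_idle_us, dummy_op_us, rounds=50, history=8):
    """Run the toy governor and return the list of chosen state names."""
    # The prediction is simply the mean of the most recent measurements.
    recent = deque([true_idle_us], maxlen=history)
    picks = []
    for _ in range(rounds):
        name, _target, exit_latency = pick_state(mean(recent))
        picks.append(name)
        # Time burned in the dummy op and in waking up is (mis)counted
        # as idle residency, and that figure feeds the next prediction.
        recent.append(true_idle_us + dummy_op_us + exit_latency)
    return picks


if __name__ == "__main__":
    print("without dummy op:", simulate(20, 0)[-1])
    print("with dummy op:   ", simulate(20, 15)[-1])
```

With these made-up numbers the real idle period never changes (20 us), yet the run with the dummy op ends up parked in the deeper C2 state while the run without it stays in C1. Once C2 is chosen, its longer exit latency inflates the measured residency further, which is the "vicious cycle" part.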

At least for Tbench, this long-time, unconditional workaround in the Linux kernel has been hurting AMD Ryzen / Threadripper / EPYC performance in select workloads. This workaround hasn't affected modern Intel systems since those newer Intel platforms use the alternative MWAIT-based intel_idle driver code path instead. The AMD patch evolved into this patch by Intel Linux engineer Dave Hansen. That patch to limit the "dummy wait" workaround to old systems is already queued into TIP's x86/urgent branch. With it going the route of "x86/urgent" and fixing an overzealous workaround that isn't needed on modern hardware, it's likely this patch will still be submitted this week for the Linux 6.0 kernel rather than needing to wait until the next (v6.1) merge window.


  • un-VIA-ble (Score:4, Insightful)

    by Anonymouse Cowtard ( 6211666 ) on Monday September 26, 2022 @08:27PM (#62916609) Homepage
    I got a mobo with a VIA chipset for a, IIRC, Pentium II 333MHz CPU. Probably around 1999. Worst board I've ever had. Saving a few dollars on a once-every-5-years purchase is stupid and I learnt my lesson well.
    • by Luckyo ( 1726890 )

      VIA had a very well-earned HORRIBLE reputation with their motherboard chipsets back in the era between Pentium II and Pentium III.

    • I think many people here made that mistake at some point in their life. I remember VIA made my following purchase an overpriced first-party Intel reference motherboard, that's how much it soured my experience.

    • A lot of VIA's problems (and AMD's by extension, since VIA made most of the chipsets back then before AMD started making their own) came from getting their clock rates by overclocking the PCI bus, which caused all kinds of shit to go sideways. Usually the fix for these systems was to get the PCI bus back into spec by clocking it at 33 MHz, which got your AGP back to 66 MHz since it was a hard 2x clock over PCI, but it meant underclocking the CPU from whatever the marketing guys sold you.

      Now motherboards h

      • by tlhIngan ( 30335 )

        As far as a VIA chipset for an Intel CPU - that just wasn't a good idea back then. Intel was already on their lock-in warpath and doing things like actively making their AGP video cards purposefully not work with non-Intel chipsets and CPUs. The good news is that their video chipsets largely sucked in comparison to Nvidia, 3Dfx, and even S3.

        Intel PCI buses were also more reliable. If you owned a Creative Sound Blaster with a PCI interface, plugging it into an AMD or VIA chipset computer was a great way to c

  • Only Zen chips? (Score:3, Insightful)

    by Narcocide ( 102829 ) on Monday September 26, 2022 @08:45PM (#62916651) Homepage

    There's been a lot of chipset releases between Zen and those old VIA chipsets... this is probably affecting a lot of other hardware in the wild unnecessarily too, isn't it?

    • It's my understanding it does affect the in-between.

      Zen is 5 years old. Are there a LOT of AMD CPUs running heavy workloads that are more than 5 years old? I guess it depends on your definition of "a lot".

  • by suss ( 158993 ) on Monday September 26, 2022 @09:24PM (#62916713)

    I hope this gets backported to older, but still supported, 4 and 5 kernels.

    Also, someone please fix the SATA NCQ brokenness with EPYC and Samsung SSDs...

    • by sg_oneill ( 159032 ) on Tuesday September 27, 2022 @12:41AM (#62916951)

      It'd likely be up to maintainers of older kernels to decide if they want to port it or not. If the performance difference is significant, there's a good chance Red Hat etc. will pull the patch anyway.

      Also, upgrade your kernels :)

      re the SATA thing, have you, or someone else, got a bug report into the SATA maintainer, whoever that is?

      • by suss ( 158993 )

        Having seen the drama from people upgrading from 4 to 5 kernels and finding older hardware no longer supported or suddenly broken, I'll stick with the 4 kernel for as long as I can on my older hardware.

        The EPYC/Samsung thing has been in limbo since about 2018 and neither AMD nor Samsung wants to take responsibility or fix it. Turning off NCQ altogether is the only thing that makes it go away, but it makes the drive much slower.

    • by TypoNAM ( 695420 ) on Tuesday September 27, 2022 @08:17AM (#62917643)

      I know from experience that the SATA NCQ issue with Samsung SSDs is not limited to EPYC; it also shows up on old hardware such as an AMD Athlon(tm) II X2 250 Processor and a motherboard with a Marvell 9215 SATA controller.

      In my opinion that rules out the host hardware; more likely it is a problem with the SATA driver and/or the Samsung SSD controllers.

    • Gosh, I might even do this by hand if Debian doesn't. It's literally just:


      processor_idle.c:

      - if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
      + if (boot_cpu_has(X86_FEATURE_HYPERVISOR) || boot_cpu_has(X86_FEATURE_ZEN))

      The energy reduction across every Zen Debian machine could be huge.

  • Can I say this was back to core values?

  • But why don't they just not call it when it's not needed? Can't they detect what processor and chipset the computer is running, and if it doesn't need the dummy call, don't call it? And if it does, then call it. Is this somehow possible without compiling

    • Yea, it's probably a simple check they didn't bother making specific enough, but I suspect it's the type of thing that would be a 1-line patch that requires a full kernel rebuild.

  • "Sampling certain workloads with IBS on AMD Zen3 system shows that a significant amount of time is spent in the dummy op, which incorrectly gets accounted as C-State residency. A large C-State residency value can prime the cpuidle governor to recommend a deeper C-State during the subsequent idle instances, starting a vicious cycle, leading to performance degradation on workloads that rapidly switch between busy and idle phases."

    Doesn't the idle system seem overly complex?

    • by necro81 ( 917438 )

      Doesn't the idle system seem overly complex?

      You're saying that the idle system is particularly busy?

  • Kudos (Score:5, Informative)

    by enriquevagu ( 1026480 ) on Tuesday September 27, 2022 @03:39AM (#62917201)

    to K Prateek Nayak, the AMD engineer who discovered the problem. These types of issues are incredibly hard to troubleshoot, particularly since they can hardly be replicated by simulation; you need to test them live on the actual hardware.

    • This particular issue is easily replicated by simulation.
      The (mis)behavior isn't caused by hardware, it's caused 100% by software.
      • This particular issue is pretty much impossible to replicate by simulation, since the behavior is a combination of software response and hardware response to the faulty software.

        • You couldn't be more wrong.
          There is no hardware response.

          That's why the issue is so easily fixable.
          The C state idle heuristics are chosen by the idle driver in use by AMD cpus.
          That is *not* automatic behavior by the CPU.
          The C state heuristics in the idle driver were being misled by the imposed wait.

          In short, you have no idea what you're talking about.
        • I'll go ahead and elaborate, since 1) you're keen to speak on matters you know nothing about, and 2) apparently some dumbfuck has seen fit to moderate you up, and me down, undoubtedly out of similar ignorance.

          On ACPI systems, the idle state is controlled thusly:
          The kernel asks the firmware (ACPI) to put the processor in a particular idle state, based on its polling of the time in each idle state.

          The processor_idle driver (the basic generic x86 ACPI idle driver) had a wasted ACPI call designed to was
          • So sure, from a certain point of view, you're right- it required hardware.

            Which was exactly what I claimed. It was not possible to emulate one's way to this issue (by which I do not mean running a VM, but an emulator). And boy, did that cause your cheerios to get a pissy taste. Glad I could affect your day to that extent.

            • You said:

              This particular issue is pretty much impossible to replicate by simulation

              That is patently false.
              I have explained how.
              If you disagree with that, you are wrong.

              It was not possible to emulate one's way to this issue (by which I do not mean running a VM, but an emulator).

              Flat ass backwards.
              This behavior exists in any ACPI firmware you pair with qemu (because they behave correctly).
              This issue is not dependent on hardware. It is a misbehavior of the processor_idle driver paired with the ACPI firmware.
              This behavior does not exist in a VM, but only because the kernel purposefully skips the extraneous call if it's running with a supported hypervisor.

              And boy, did that cause your cheerios to get a pissy taste. Glad I could affect your day to that extent.

              Being wrong sucks. I get it. But

              • And yes, it is pretty much impossible to replicate. It took careful troubleshooting of specific benchmarks to even notice it was happening, and then a lot of deep diving with profiling to locate it.

                Sure, once you know exactly what is happening, all you need is the code and running it through your head to understand what is going on. But that's not even remotely a feasible way to find out a problem like this even exists, much less pinpoint it.

                Your argument is basically "when you know what is happening, the f

                • Your argument is basically "when you know what is happening, the fault is trivial to replicate", which is trivially correct, I'll give you that.

                  Correct.
                  I argued merely that it's easy to simulate, not that it was easy to run into.
                  Finding the culprit is also not that difficult.

                  Querying the C state residency immediately points you in the right direction.
                  The numbers are nonsense in "vicious cycle mode".
                  This points very directly at the idle driver, since it's responsible for selecting the C states.

                  Figuring out what is *wrong* with the idle driver is standard debugging, and that's where simulation comes in. This can be easily simulated with qemu,
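For anyone wanting to try the "query the C state residency" step from the comment above, here is a minimal sketch. It assumes the standard Linux cpuidle sysfs layout (per-state `name`, `usage` and `time` files, with `time` in microseconds); the helper name `read_cpuidle_states` is made up for this example and the directory is passed in so it can be pointed at any CPU.

```python
from pathlib import Path


def read_cpuidle_states(cpuidle_dir):
    """Return [(name, usage_count, total_time_us), ...] for one CPU."""
    state_dirs = sorted(
        Path(cpuidle_dir).glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),  # numeric order: state2 < state10
    )
    states = []
    for state_dir in state_dirs:
        name = (state_dir / "name").read_text().strip()
        usage = int((state_dir / "usage").read_text())
        time_us = int((state_dir / "time").read_text())
        states.append((name, usage, time_us))
    return states


if __name__ == "__main__":
    # Usual sysfs location for CPU 0; prints nothing if cpuidle is unavailable.
    cpu0 = "/sys/devices/system/cpu/cpu0/cpuidle"
    for name, usage, time_us in read_cpuidle_states(cpu0):
        avg = time_us / usage if usage else 0.0
        print(f"{name}: entered {usage} times, {time_us} us total, {avg:.1f} us avg")
```

Sampling this before and after a benchmark run and comparing the per-state deltas is one way to see whether the average residency looks plausible for a workload that switches rapidly between busy and idle.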

  • Improvement (Score:5, Informative)

    by necro81 ( 917438 ) on Tuesday September 27, 2022 @07:12AM (#62917521) Journal
    The benchmarks attached to the article indicate that the median throughput (using the TBench benchmark) increases from something like 32.2 Gbps to 33.8 Gbps, which isn't a particularly big increase. However, the greatest improvement was in the minimum throughput, which increased from 2.2 Gbps to 33.0 Gbps, which is huge. If you can eliminate those relatively rare cases where performance drops like a rock, that's fantastic.
  • by Gravis Zero ( 934156 ) on Tuesday September 27, 2022 @07:57AM (#62917607)

    Well now we know the name of an Intel engineer that got yelled at today. ;)
