AMD's Statements on Temperature, Windows Scheduler, & SMT

Published March 14, 2017 at 2:31 pm

AMD yesterday released a community update with interesting assertions regarding thread scheduling, temperature reporting, Windows power plan issues, and SMT challenges.

According to AMD’s Robert Hallock, the company has found no indication that Windows 10 thread scheduling is operating improperly for Zen. This should be the final word in any argument that Microsoft thread scheduling issues are sabotaging Ryzen: they aren’t, as stated by AMD below:

“We have investigated reports alleging incorrect thread scheduling on the AMD Ryzen processor. Based on our findings, AMD believes that the Windows 10 thread scheduler is operating properly for ‘Zen,’ and we do not presently believe there is an issue with the scheduler adversely utilizing the logical and physical configurations of the architecture.

“As an extension of this investigation, we have also reviewed topology logs generated by the Sysinternals Coreinfo utility. We have determined that an outdated version of the application was responsible for originating the incorrect topology data that has been widely reported in the media. Coreinfo v3.31 (or later) will produce the correct results.”

Secondly, an official statement has been issued on the hellish confusion that is AMD temperature monitoring. As we explained in our initial reviews, the R7 1700X and 1800X both intentionally report “tCTL” temperatures that are 20 degrees Celsius higher than the junction temperature. This is, quote, in order to maintain “a consistent fan policy” across all R7 chips. We’re reasonably confident that HardwareMonitor has been recording true junction temperatures, and we know that AI Suite takes the TSI temperature bus, combines it with a thermistor in the board, then uses an algorithm to get back to junction temperature. We’ll be validating (again) with thermocouples in the near future, which can be helpful for determining deltas between different CPU states and the software used to measure them.

Our tests will remain accurate in terms of performance regardless, but the issue is this: we were previously informed that TjMax of all R7 chips is 75C. This post seems to imply a tCTL (offset temperature, i.e. not the real value) limit of 75C, above which throttling occurs. Our 1800X got hot enough during thermal testing to do some minor throttling, whereas the 1700 at a slightly higher frequency didn’t throttle at all. If that’s because the 1800X is lying to itself and reporting temperatures 20 degrees too high, that’s a problem, and it makes the 1700 a far better purchase. That said, if TjMax is actually the maximum temperature of the junction diode, then the tCTL value at that same thermal trigger point would be ~95C. We did see throttling occur at 75C on our ASUS board, relying on metrics reported by AI Suite (TSI bus + thermistor) and HWM. Again, we’ll be doing our best to straighten this out, and it does not affect results thus far. There may be an impact on the analysis of the thermal benchmark for the 1800X.
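To make the two readings of that 75C figure concrete, here is a minimal sketch of the offset arithmetic. It assumes the 20C offset applies only to the 1700X and 1800X, as stated above, and that the 1700 reports no offset (our assumption, implied by the comparison rather than confirmed here). The table and function names are hypothetical and only for illustration.

```python
# Assumed offsets (degrees C) between reported tCTL and junction temperature.
# The 20C values come from the statement discussed above; the 0C entry for
# the 1700 is an assumption, not a confirmed figure.
TCTL_OFFSET_C = {"1700": 0.0, "1700X": 20.0, "1800X": 20.0}


def junction_from_tctl(model, tctl_c):
    """Convert a reported tCTL reading back to junction temperature."""
    return tctl_c - TCTL_OFFSET_C[model]


def tctl_from_junction(model, junction_c):
    """Convert a junction temperature to the tCTL value the chip would report."""
    return junction_c + TCTL_OFFSET_C[model]


if __name__ == "__main__":
    limit_c = 75.0
    # Reading 1: 75C is a junction limit, so the 1800X would report this tCTL.
    print("tCTL at a 75C junction limit:", tctl_from_junction("1800X", limit_c))
    # Reading 2: 75C is a tCTL limit, so the 1800X junction would sit here.
    print("Junction at a 75C tCTL limit:", junction_from_tctl("1800X", limit_c))
```

Under the first reading the 1800X would not throttle until it reports roughly 95C tCTL; under the second, it would throttle with the junction 20C below the 75C figure, which is the scenario that would favor the 1700.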

AMD also continues to encourage the use of the High Performance power plan, offering the welcome news that they’re working on optimization of the Balanced plan for Ryzen CPUs. We saw only minor performance changes between power plans in some cases, but have used High Performance mode for all our benchmarks to be on the safe side. A few particular benchmarks did post bigger swings.

A list of major titles with “neutral/positive benefit[s] from SMT” was included, among them Battlefield 1 and Watch_Dogs 2, which we’ve tested ourselves. We have seen WD2 posting zero noteworthy change with SMT0/1 toggles and stock settings, and BF1 with Dx11 posting slightly negative scaling with SMT1. SMT should be beneficial, but it hasn’t been so far in our gaming tests (even with those specific games), although hopefully that can change. The solution offered for “remaining outliers” is patching the software to support Ryzen, which is to be expected, and we hope to see developers accounting for SMT in the future. As far as our current benchmarking lineup goes, though, we aren’t holding our breath. Game-level optimizations always fall on the weakest link in the chain, which tends to be the development house that’s pressed for time and engineering ability. We’ll continue following as things develop.

Read more here: https://community.amd.com/community/gaming/blog/2017/03/13/amd-ryzen-community-update

- Patrick Lathan.
