First, a few of the sample photos we've been sent have shown blown FETs in a few specific spots on the board. Each seems to correspond with hotspots that we located with thermal imaging: the second MOSFET up and MOSFETs #7/#8. Thermal imaging doesn't show us that side of the PCB, though, and it also has a few flaws regarding emissivity of the surface of a thermal pad or backplate and regarding insulation of the hotspot by the thermal pad. These challenges are not present when just measuring the fan curve adjustment -- which we also did, and which we also showed -- but they have been in the back of our minds since first exploring EVGA's VRM fracas. There's another problem that we'd already discussed, too: the PCB acts as another layer of material that can sink heat, and easily causes a delta of ~4C from one side to the other.
It wasn't easy to solve, though. We've already got thermocouple readers and ample experience working with them, but didn't have a good set of probes to use for a VRM. The most sensible thing to do would be to attach probes to the VRM components (power stages, MOSFETs), wire them directly to the reader, log those temperatures, and run the card through a number of test cycles to determine:
Card's performance stock, with no special VBIOS or thermal pad applied.
Card's performance with new VBIOS.
Card's performance with thermal pads.
But that wasn't easy to plan. We wanted to probe MOSFET #2 and #7 directly, and needed a good way to stick a thermocouple to the FETs. The first challenge seems obvious: a flat thermocouple was needed so that heat transfer from the MOSFET to the thermal pad (the one already applied, not the aftermarket ones) would not be impeded. We considered running a thermocouple between two FETs, or sandwiched against the side of one, as this would allow for uninterrupted thermal transfer to the pads and then the baseplate. Unfortunately, that approach brings challenges of its own: now we'd have to worry about electrical conductivity, and potentially bridging some of the small electrical components between the MOSFETs.
So what do we do?
The immediate thought was to buy Kapton tape -- which we did -- and apply it to the end of the thermocouple, thus eliminating the concern of electrical conductivity while also tolerating temperatures beyond anything in the test environment (no melting tape). That idea was tossed after consulting thermal engineers in the industry, like Corsair's Bobby Kinstle, simply because we had found a better solution. The Kapton tape would work, but it wouldn't be ideal, and we'd have to avoid placing the taped probe between a heatsink and the heat source. Kapton tape also insulates the probe, and we don't know just how much that would impact measurements.
The final solution was to grab some self-adhesive, flat thermocouples, which resolve almost all of our problems. The new probes can be mounted atop the MOSFET itself without concern of completely destroying heat transfer (we discovered this after consulting thermal engineers, again, like Bobby Kinstle), thus allowing for a more direct measurement of the MOSFET in its problem areas. Heat will transfer just fine through the thermocouple's enclosure, and our ability to read temperature will remain accurate within reason. We have found the flat, self-adhesive thermocouples to read within about +/-1C of our trusted, exposed thermocouples.
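For anyone wanting to repeat that probe-to-probe check, here is a minimal sketch of the comparison in Python. It assumes you've logged the flat probe and an exposed reference probe on the same heat source at the same moments; the function names and the 1.0C default tolerance are our own illustrative choices, not part of any reader's software.

```python
from statistics import mean


def probe_offset(flat_readings, reference_readings):
    """Return (mean difference, worst absolute difference) in degrees C.

    Differences are computed as flat probe minus reference probe,
    sample by sample. Both names are hypothetical, for illustration.
    """
    if len(flat_readings) != len(reference_readings) or not flat_readings:
        raise ValueError("need two equal-length, non-empty series")
    diffs = [f - r for f, r in zip(flat_readings, reference_readings)]
    return mean(diffs), max(abs(d) for d in diffs)


def within_tolerance(flat_readings, reference_readings, tolerance_c=1.0):
    """True if every paired sample agrees within tolerance_c degrees C."""
    _, worst = probe_offset(flat_readings, reference_readings)
    return worst <= tolerance_c
```

Checking the worst-case difference rather than only the average keeps a probe that swings high and low by several degrees from passing just because its errors cancel out.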
We're routing these probes away from inductors to avoid EMI, and will run them out the bottom of the card. Traces will be crossed at 90 degrees where possible, but that's not always going to be feasible with a power plane PCB. We'll check for EMI and relocate probes as necessary, until any cross-talk is eliminated.
The new K-type probes were manually attached to some housing for our thermocouple reader, and are now ready to be deployed in the field. We will be running four thermocouples at a time, validating our results, and re-running tests as necessary. The plan is to test real-world scenarios, torture scenarios (FurMark, as Tom's did), and maybe look into SLI scenarios after that.
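To give a rough idea of how a four-channel log might be boiled down to one number per probe, here is a short Python sketch. It assumes the reader exports CSV with a `time_s` column plus one column per thermocouple; that layout, the column names, and the averaging window are all assumptions for illustration, not the format of our actual reader.

```python
import csv
import io
from statistics import mean


def steady_state_means(csv_text, window=3):
    """Average the last `window` samples of each thermocouple channel.

    Assumes a hypothetical CSV layout: a 'time_s' column followed by
    one temperature column (degrees C) per channel.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if len(rows) < window:
        raise ValueError("log is shorter than the averaging window")
    channels = [name for name in rows[0] if name != "time_s"]
    tail = rows[-window:]
    return {ch: mean(float(row[ch]) for row in tail) for ch in channels}
```

Averaging only the tail of the run is meant to capture temperatures after the card has warmed up; a run that is still climbing at the end would need a longer test before its numbers mean much.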
We'll see. There's a lot to be done, but this was a good start. We've been speaking with thermal engineers on methodology, so the testing has taken a little while to develop. We expect it'll be about another week to conduct all the testing, at which time we'll regroup with 'Buildzoid' to analyze the data.
Editorial: Steve "Lelldorianx" Burke
Video: Andrew "ColossalCake" Coleman