Unexpectedly, I spent quite some time getting the damn thing working, so here is a short howto/note to self. The post is in (poor) English for obvious reasons.
As those who keep an eye on modern CPUs already know, Nehalem has Turbo Mode (Turbo Boost): if it is enabled and cooling/power conditions are OK, the processor overclocks itself while staying within its specified TDP envelope. Separate multipliers are available for overclocking 1, 2 or 4 cores. While this may help certain workloads, the performance of others may degrade. Supposedly, among the latter are workloads that depend heavily on memory bandwidth: raising the core frequency only increases contention in the load/store pipeline, since the uncore and memory controller frequencies stay the same. There may also be a complex interdependency with SMT (aka Hyper-Threading): my test program got a boost with Turbo on and SMT off, but performance degraded when both Turbo and SMT were on.
- Obviously, Turbo Mode and SpeedStep must be enabled in the BIOS. The idle power mode of the processor must be set to "Low Power" (or similar); otherwise, the required ACPI kernel modules won't load. Actually, the BIOS defaults on the box I experimented with had all these options set right. Upd: not really: it was crucial to disable the C-states and C1E.
- A recent kernel is required. I had no luck with the stock kernels from RHEL 5.2 and RHEL 5.3 alpha, but YMMV. In the end, I built 2.6.28 myself. I also had to add noapic to the kernel boot line, as otherwise the kernel panicked while going from one P-state to another. This effect may, however, be due to the beta-quality hardware I used (there were quite a few complaints about the APIC in dmesg).
- Turbo Mode is enabled via ACPI, so the acpi-cpufreq module must be loaded. Under RHEL, this module is loaded when the cpuspeed system service is started. The scaling governor must be set to performance. Therefore, to disable Turbo Mode (especially handy in a cluster setting), one only needs to set the governor to e.g. userspace and set the current frequency to the one just below the maximal available frequency (it is only a bit lower). Upd: I originally wrote that the maximal available frequency is the nominal CPU frequency and that turbo-overclocked frequencies are not exposed via ACPI. That was wrong: the maximal one corresponds to Turbo Mode enabled.
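The last step can be scripted. Below is a minimal Python sketch of it, not something I ran on the original box. It assumes the standard cpufreq sysfs files (scaling_driver, scaling_available_frequencies, scaling_governor, scaling_setspeed) under /sys/devices/system/cpu/cpuN/cpufreq, that the userspace governor is available in the kernel, and that it is run as root. The function names are made up for this illustration.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")


def pick_nominal_khz(available: str) -> int:
    """Return the second-highest frequency (kHz) from the text of
    scaling_available_frequencies. The highest entry is assumed to be
    the Turbo P-state, as described in the post."""
    freqs = sorted({int(tok) for tok in available.split()}, reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two frequencies to skip the turbo one")
    return freqs[1]


def disable_turbo(cpufreq_dir: Path) -> int:
    """Switch one CPU to the userspace governor and pin it to the
    frequency just below the maximum. Returns the chosen frequency."""
    available = (cpufreq_dir / "scaling_available_frequencies").read_text()
    khz = pick_nominal_khz(available)
    # scaling_setspeed only accepts writes under the userspace governor,
    # so the governor has to be switched first.
    (cpufreq_dir / "scaling_governor").write_text("userspace\n")
    (cpufreq_dir / "scaling_setspeed").write_text(f"{khz}\n")
    return khz


def disable_turbo_all() -> None:
    """Apply disable_turbo() to every CPU driven by acpi-cpufreq."""
    for cpufreq_dir in sorted(CPU_ROOT.glob("cpu[0-9]*/cpufreq")):
        driver = (cpufreq_dir / "scaling_driver").read_text().strip()
        if driver != "acpi-cpufreq":
            print(f"{cpufreq_dir.parent.name}: driver is {driver}, skipping")
            continue
        khz = disable_turbo(cpufreq_dir)
        print(f"{cpufreq_dir.parent.name}: pinned to {khz} kHz")
```

Calling disable_turbo_all() as root applies this to all CPUs. To turn Turbo Mode back on, write performance to scaling_governor again. The parsing is kept in a separate function so it can be checked against a frequency list without touching sysfs.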