Available performance monitors for the Intel® Nehalem EX microarchitecture
Counters available for each hardware thread
Fixed-purpose counters
Since the Core2 microarchitecture, Intel® provides a set of fixed-purpose counters. Each can measure only one specific event.
Counter and events
Counter name | Event name |
FIXC0 | INSTR_RETIRED_ANY |
FIXC1 | CPU_CLK_UNHALTED_CORE |
FIXC2 | CPU_CLK_UNHALTED_REF |
Available Options
Option | Argument | Description | Comment |
anythread | N | Set bit 2+(index*4) in config register | |
kernel | N | Set bit (index*4) in config register | |
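As an illustration of the bit positions in the table above, the following minimal sketch computes the option bits for one fixed counter. The function name `fixed_option_bits` is hypothetical and not part of LIKWID. The sketch only evaluates the formulas from the table and assumes `index` is the number in the counter name (0 for FIXC0, and so on).

```python
def fixed_option_bits(index: int, kernel: bool = False, anythread: bool = False) -> int:
    """Return the option bits for FIXC<index> as listed in the table above.

    Hypothetical helper for illustration; it does not touch any register.
    """
    if index not in (0, 1, 2):
        raise ValueError("this page lists only FIXC0, FIXC1 and FIXC2")
    bits = 0
    if kernel:
        bits |= 1 << (index * 4)        # kernel: bit (index*4)
    if anythread:
        bits |= 1 << (2 + index * 4)    # anythread: bit 2+(index*4)
    return bits


if __name__ == "__main__":
    # FIXC1 with both options: bit 4 and bit 6
    print(hex(fixed_option_bits(1, kernel=True, anythread=True)))
```

Each fixed counter therefore owns a 4-bit slice of the shared config register, which is why the bit positions scale with the counter index.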
General-purpose counters
The Intel® Nehalem EX microarchitecture provides 4 general-purpose counters, each consisting of a config register and a counter register.
Counter and events
Counter name | Event name |
PMC0 | * |
PMC1 | * |
PMC2 | * |
PMC3 | * |
Available Options
Option | Argument | Description | Comment |
edgedetect | N | Set bit 18 in config register | |
kernel | N | Set bit 17 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register | |
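To make the option table above concrete, here is a minimal sketch that combines the four options into the bits they set in a PMC config register. The helper name `pmc_option_bits` is an assumption made for this example. It covers only the option bits from the table, not the event and umask fields.

```python
def pmc_option_bits(edgedetect: bool = False, kernel: bool = False,
                    invert: bool = False, threshold: int = 0) -> int:
    """Return the PMC config-register bits set by the options in the table above.

    Hypothetical helper for illustration; event and umask encoding is left out.
    """
    if not 0 <= threshold <= 0xFF:
        raise ValueError("threshold must fit into 8 bits (bits 24-31)")
    bits = threshold << 24      # threshold: bits 24-31
    if kernel:
        bits |= 1 << 17         # kernel: bit 17
    if edgedetect:
        bits |= 1 << 18         # edgedetect: bit 18
    if invert:
        bits |= 1 << 23         # invert: bit 23
    return bits


if __name__ == "__main__":
    print(hex(pmc_option_bits(edgedetect=True, threshold=0x10)))
```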
Special handling for events
The Intel® Nehalem EX microarchitecture can measure offcore events in the PMC counters. To do so, the stream of offcore events must be filtered using the OFFCORE_RESPONSE registers. The Intel® Nehalem EX microarchitecture has two of those registers. Custom filtering can be applied with the OFFCORE_RESPONSE_0_OPTIONS event. Only for those events two more counter options are available:
Counters available for one hardware thread per socket
MBOX counters
The Intel® Nehalem EX microarchitecture provides measurements of the memory controllers in the Uncore. The description from Intel®:
The memory controller interfaces to the Intel® 7500 Scalable Memory Buffers and translates read and write commands into specific Intel® Scalable Memory Interconnect (Intel® SMI) operations. Intel SMI is based on the FB-DIMM architecture, but the Intel 7500 Scalable Memory Buffer is not an AMB2 device and has significant exceptions to the FB-DIMM2 architecture. The memory controller also provides a variety of RAS features, such as ECC, memory scrubbing, thermal throttling, mirroring, and DIMM sparing. Each socket has two independent memory controllers, and each memory controller has two Intel SMI channels that operate in lockstep.
The Intel® Nehalem EX microarchitecture has 2 memory controllers, each with 6 general-purpose counters. They are exposed through the MSR interface to the operating system kernel. The MBOX and RBOX setup routines are taken from Likwid 3. They are not as flexible as the newer setup routines, but programming the MBOXes and RBOXes is tedious for Westmere EX. It is not possible to specify an FVID (Fill Victim Index) for the MBOX or the IPERF option for RBOXes.
Counter and events
Counter name | Event name |
MBOX<0,1>C0 | * |
MBOX<0,1>C1 | * |
MBOX<0,1>C2 | * |
MBOX<0,1>C3 | * |
MBOX<0,1>C4 | * |
MBOX<0,1>C5 | * |
Special handling for events
For the events DRAM_CMD_ALL and DRAM_CMD_ILLEGAL two counter options are available:
Option | Argument | Description | Comment |
match0 | 34 bit address | Set bits 0-33 in MSR_M<0,1>_PMON_ADDR_MATCH register | |
mask0 | 60 bit hex value | Extract bits 6-33 from address and set bits 0-27 in MSR_M<0,1>_PMON_ADDR_MASK register | |
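The bit ranges in the table above can be illustrated with a short sketch. The function names are hypothetical, and the code only reproduces the arithmetic stated in the table: `match0` keeps bits 0-33 of the address, and `mask0` moves bits 6-33 of the supplied value down to bits 0-27.

```python
def addr_match_value(address: int) -> int:
    """Bits 0-33 of the address, as written for the match0 option (per the table)."""
    return address & ((1 << 34) - 1)


def addr_mask_value(mask: int) -> int:
    """Bits 6-33 of the supplied value, shifted down to bits 0-27 (mask0 option)."""
    return (mask >> 6) & ((1 << 28) - 1)


if __name__ == "__main__":
    print(hex(addr_match_value(0x7FFFFFFFF)))   # upper bit 34 is dropped
    print(hex(addr_mask_value(0x3FFFFFFC0)))    # bits 6-33 all set
```

Bits 0-5 of the mask value do not reach the register in this reading of the table, so masks that differ only in those bits produce the same register content.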
For the events THERM_TRP_DN and THERM_TRP_UP, you cannot measure all DIMMs and one specific DIMM simultaneously, because both variants program the same filter register MSR_M<0,1>_PMON_MSC_THR with contradictory configurations.
Although the events FVC_EV<0-3> can measure multiple memory events, some of them overlap and cannot be measured simultaneously, because they program the same filter register MSR_M<0,1>_PMON_ZDP with contradictory configurations. One example is the FVC_EV<0-3>_BBOX_CMDS_READS and FVC_EV<0-3>_BBOX_CMDS_WRITES events, which measure memory reads or writes but cannot be measured at the same time.
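A small check of this kind can catch such a combination before a measurement is started. The sketch below is an assumption-laden illustration, not LIKWID code: the helper `fvc_conflicts` and the table `CONFLICTING` are hypothetical, and the only conflicting pair listed is the reads/writes pair named in the text.

```python
import re

# Event suffixes that cannot be combined; only the pair named in the text is listed.
CONFLICTING = [("BBOX_CMDS_READS", "BBOX_CMDS_WRITES")]

_FVC_EVENT = re.compile(r"^FVC_EV[0-3]_(.+)$")


def fvc_conflicts(events):
    """Return the conflicting suffix pairs found in a list of event names."""
    suffixes = set()
    for name in events:
        match = _FVC_EVENT.match(name)
        if match:
            suffixes.add(match.group(1))
    return [pair for pair in CONFLICTING
            if pair[0] in suffixes and pair[1] in suffixes]


if __name__ == "__main__":
    wanted = ["FVC_EV0_BBOX_CMDS_READS", "FVC_EV1_BBOX_CMDS_WRITES"]
    print(fvc_conflicts(wanted))
```

The check ignores which of the four FVC_EV slots an event uses, because the conflict comes from the shared filter register and not from the slot.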
BBOX counters
The Intel® Nehalem EX microarchitecture provides measurements of the Home Agent in the Uncore. The description from Intel®:
The B-Box is responsible for the protocol side of memory interactions, including coherent and non-coherent home agent protocols (as defined in the Intel® QuickPath Interconnect Specification). Additionally, the B-Box is responsible for ordering memory reads/writes to a given address such that the M-Box does not have to perform this conflict checking. All requests for memory attached to the coupled M-Box must first be ordered through the B-Box.
The memory traffic in an Intel® Nehalem EX system is controlled by the Home Agents. Each MBOX has a corresponding BBOX. Each BBOX offers 4 general-purpose counters. They are exposed through the MSR interface to the operating system kernel.
Counter and events
Counter name | Event name |
BBOX<0,1>C0 | * |
BBOX<0,1>C1 | * |
BBOX<0,1>C2 | * |
BBOX<0,1>C3 | * |
Special handling for events
For the matching events MSG_IN_MATCH, MSG_ADDR_IN_MATCH, MSG_OPCODE_ADDR_IN_MATCH, MSG_OPCODE_IN_MATCH, MSG_OPCODE_OUT_MATCH, MSG_OUT_MATCH, OPCODE_ADDR_IN_MATCH, OPCODE_IN_MATCH, OPCODE_OUT_MATCH and ADDR_IN_MATCH two counter options are available:
RBOX counters
The Intel® Nehalem EX microarchitecture provides measurements of the crossbar router in the Uncore. The description from Intel®:
The Crossbar Router (R-Box) is a 8 port switch/router implementing the Intel® QuickPath Interconnect Link and Routing layers. The R-Box is responsible for routing and transmitting all intra- and inter-processor communication.
The Intel® Nehalem EX microarchitecture has two interfaces to the RBOX although each socket contains only one crossbar router. Each RBOX offers 8 general-purpose counters. They are exposed through the MSR interface to the operating system kernel. The RBOX setup routine is taken from Likwid 3.
Counter and events
Counter name | Event name |
RBOX<0,1>C0 | * |
RBOX<0,1>C1 | * |
RBOX<0,1>C2 | * |
RBOX<0,1>C3 | * |
RBOX<0,1>C4 | * |
RBOX<0,1>C5 | * |
RBOX<0,1>C6 | * |
RBOX<0,1>C7 | * |
CBOX counters
The Intel® Nehalem EX microarchitecture provides measurements of the LLC coherency engine in the Uncore. The description from Intel®:
For the Intel Xeon Processor 7500 Series, the LLC coherence engine (C-Box) manages the interface between the core and the last level cache (LLC). All core transactions that access the LLC are directed from the core to a C-Box via the ring interconnect. The C-Box is responsible for managing data delivery from the LLC to the requesting core. It is also responsible for maintaining coherence between the cores within the socket that share the LLC; generating snoops and collecting snoop responses to the local cores when the MESI protocol requires it.
The C-Box is also the gate keeper for all Intel® QuickPath Interconnect (Intel® QPI) messages that originate in the core and is responsible for ensuring that all Intel QuickPath Interconnect messages that pass through the socket’s LLC remain coherent.
The Intel® Nehalem EX microarchitecture has 8 CBOX instances. Each CBOX offers 6 general-purpose counters. They are exposed through the MSR interface to the operating system kernel.
Counter and events
Counter name | Event name |
CBOX<0-7>C0 | * |
CBOX<0-7>C1 | * |
CBOX<0-7>C2 | * |
CBOX<0-7>C3 | * |
CBOX<0-7>C4 | * |
CBOX<0-7>C5 | * |
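The counter tables on this page use a shorthand such as `CBOX<0-7>C0` or `MBOX<0,1>C5` for several box instances at once. The following sketch expands that shorthand into individual names, assuming the instance number simply replaces the bracketed part. The helper `expand_counter` is hypothetical.

```python
import re

_SHORTHAND = re.compile(r"^([A-Z]+)<([0-9,\-]+)>(.*)$")


def expand_counter(pattern: str) -> list:
    """Expand 'BOX<a-b>Cn' or 'BOX<a,b>Cn' into one name per box instance."""
    match = _SHORTHAND.match(pattern)
    if match is None:
        return [pattern]                 # no shorthand, e.g. 'WBOXFIX'
    box, spec, suffix = match.groups()
    if "-" in spec:
        first, last = (int(part) for part in spec.split("-"))
        instances = range(first, last + 1)
    else:
        instances = [int(part) for part in spec.split(",")]
    return [f"{box}{i}{suffix}" for i in instances]


if __name__ == "__main__":
    print(expand_counter("CBOX<0-7>C0"))
    print(expand_counter("MBOX<0,1>C5"))
```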
Available Options
Option | Argument | Description | Comment |
edgedetect | N | Set bit 18 in config register | |
threshold | 5 bit hex value | Set bits 24-28 in config register | |
invert | N | Set bit 23 in config register | |
SBOX counters
The Intel® Nehalem EX microarchitecture provides measurements of the LLC-to-QPI interface in the Uncore. The description from Intel®:
The S-Box represents the interface between the last level cache and the system interface. It manages flow control between the C and R & B-Boxes. The S-Box is broken into system bound (ring to Intel® QPI) and ring bound (Intel® QPI to ring) connections.
As such, it shares responsibility with the C-Box(es) as the Intel® QPI caching agent(s). It is responsible for converting C-box requests to Intel® QPI messages (i.e. snoop generation and data response messages from the snoop response) as well as converting/forwarding ring messages to Intel® QPI packets and vice versa.
The Intel® Nehalem EX microarchitecture has 2 SBOX instances. Each SBOX offers 4 general-purpose counters. They are exposed through the MSR interface to the operating system kernel.
Counter and events
Counter name | Event name |
SBOX<0,1>C0 | * |
SBOX<0,1>C1 | * |
SBOX<0,1>C2 | * |
SBOX<0,1>C3 | * |
Available Options
Option | Argument | Description | Comment |
edgedetect | N | Set bit 18 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register | |
Special handling for events
Only for the TO_R_PROG_EV events two counter options are available:
WBOX counters
The Intel® Nehalem EX microarchitecture provides measurements of the power controller in the Uncore. The description from Intel®:
The W-Box is the primary Power Controller for the Intel® Xeon® Processor 7500 Series.
The Intel® Nehalem EX microarchitecture has one WBOX and it offers 4 general-purpose counters and one fixed counter. They are exposed through the MSR interface to the operating system kernel.
Counter and events
Counter name | Event name |
WBOX0 | * |
WBOX1 | * |
WBOX2 | * |
WBOX3 | * |
WBOXFIX | UNCORE_CLOCKTICKS |
Available Options (Only for WBOX<0-3> counters)
Option | Argument | Description | Comment |
edgedetect | N | Set bit 18 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register | |
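The restriction that options apply only to WBOX<0-3> can be sketched as a small helper that assembles an event specification. Everything here is an assumption for illustration: the helper `event_string` is hypothetical, `SOME_WBOX_EVENT` is a placeholder and not a real event name, and the colon-separated `EVENT:COUNTER:OPTION=VALUE` layout with upper-case option names is assumed to match what the measurement tool expects.

```python
def event_string(event: str, counter: str, **options) -> str:
    """Build 'EVENT:COUNTER[:OPTION[=VALUE]]...' (assumed layout, see lead-in)."""
    if counter == "WBOXFIX" and options:
        raise ValueError("options are available only for WBOX0 to WBOX3")
    parts = [event, counter]
    for name, value in options.items():
        if value is False:
            continue                                    # option switched off
        if value is True:
            parts.append(name.upper())                  # flag option, argument 'N'
        else:
            parts.append(f"{name.upper()}={value:#x}")  # option with hex value
    return ":".join(parts)


if __name__ == "__main__":
    print(event_string("UNCORE_CLOCKTICKS", "WBOXFIX"))
    # SOME_WBOX_EVENT is a placeholder event name
    print(event_string("SOME_WBOX_EVENT", "WBOX0", edgedetect=True, threshold=0x10))
```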
UBOX counters
The Intel® Nehalem EX microarchitecture provides measurements of the system configuration controller in the Uncore. The description from Intel®:
The U-Box serves as the system configuration controller for the Intel® Xeon® Processor E7 Family.
The Intel® Nehalem EX microarchitecture has one UBOX and it offers a single general-purpose counter. It is exposed through the MSR interface to the operating system kernel.
Counter and events
Counter name | Event name |
UBOX0 | * |
Available Options
Option | Argument | Description | Comment |
edgedetect | N | Set bit 18 in config register | |
*/