This page is valid for Broadwell D.
Since the Core2 microarchitecture, Intel® provides a set of fixed-purpose counters. Each can measure only one specific event.
Counter name | Event name |
---|---|
FIXC0 | INSTR_RETIRED_ANY |
FIXC1 | CPU_CLK_UNHALTED_CORE |
FIXC2 | CPU_CLK_UNHALTED_REF |
Option | Argument | Description | Comment |
---|---|---|---|
anythread | N | Set bit 2+(index*4) in config register | |
kernel | N | Set bit (index*4) in config register | |
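
The bit arithmetic in the table above can be sketched as follows. This is only an illustration: the function name is hypothetical and not part of LIKWID, and the result covers just the two option bits, not any other bits that are written to the config register.

```python
def fixed_counter_option_bits(index, kernel=False, anythread=False):
    """Return the option bits for fixed counter FIXC<index> as an integer mask.

    Hypothetical helper for illustration; it only mirrors the table above.
    """
    if index not in (0, 1, 2):
        raise ValueError("only FIXC0, FIXC1 and FIXC2 are listed for Broadwell D")
    mask = 0
    if kernel:
        mask |= 1 << (index * 4)       # 'kernel' sets bit (index*4)
    if anythread:
        mask |= 1 << (2 + index * 4)   # 'anythread' sets bit 2+(index*4)
    return mask


if __name__ == "__main__":
    # FIXC1 with both options: bit 4 and bit 6
    print(hex(fixed_counter_option_bits(1, kernel=True, anythread=True)))
```

Each fixed counter owns a 4-bit slice of the shared config register, which is why the bit positions depend on the counter index.
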
The Intel® Broadwell D microarchitecture commonly provides 4 general-purpose counters, each consisting of a config register and a counter register.
Counter name | Event name |
---|---|
PMC0 | * |
PMC1 | * |
PMC2 | * |
PMC3 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
kernel | N | Set bit 17 in config register | |
anythread | N | Set bit 21 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register | |
in_transaction | N | Set bit 32 in config register | Only available if Intel® Transactional Synchronization Extensions are available |
in_transaction_aborted | N | Set bit 33 in config register | Only counter PMC2 and only if Intel® Transactional Synchronization Extensions are available |
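
To make the bit positions in the table above concrete, here is a minimal sketch that combines the options into one integer. The function name and its keyword arguments are hypothetical, chosen to match the option names. The sketch covers only the option bits and leaves out the event selection and any other fields of the config register.

```python
def pmc_option_bits(edgedetect=False, kernel=False, anythread=False,
                    invert=False, threshold=0,
                    in_transaction=False, in_transaction_aborted=False):
    """Combine the PMC counter options from the table into one integer mask.

    Hypothetical helper for illustration, not LIKWID code.
    """
    if not 0 <= threshold <= 0xFF:
        raise ValueError("threshold must fit into 8 bits")
    bits = 0
    if kernel:
        bits |= 1 << 17                # kernel: bit 17
    if edgedetect:
        bits |= 1 << 18                # edgedetect: bit 18
    if anythread:
        bits |= 1 << 21                # anythread: bit 21
    if invert:
        bits |= 1 << 23                # invert: bit 23
    bits |= threshold << 24            # threshold: bits 24-31
    if in_transaction:
        bits |= 1 << 32                # in_transaction: bit 32 (needs TSX)
    if in_transaction_aborted:
        bits |= 1 << 33                # in_transaction_aborted: bit 33 (PMC2 only, needs TSX)
    return bits


if __name__ == "__main__":
    print(hex(pmc_option_bits(edgedetect=True, threshold=0x2)))
```
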
The Intel® Broadwell D microarchitecture supports measuring offcore events with the PMC counters. To do so, the stream of offcore events must be filtered using the OFFCORE_RESPONSE registers, of which the Intel® Broadwell D microarchitecture has two. LIKWID defines some events that perform the filtering according to the event name. Although many bitmasks are possible, LIKWID natively provides only the ones with response type ANY. Custom filtering can be applied with the OFFCORE_RESPONSE_0_OPTIONS and OFFCORE_RESPONSE_1_OPTIONS events. Two more counter options are available for these events only:
Option | Argument | Description | Comment |
---|---|---|---|
match0 | 16 bit hex value | Input value masked with 0x8FFF and written to bits 0-15 in the OFFCORE_RESPONSE register | Check the Intel® Software Developer System Programming Manual, Vol. 3, Chapter Performance Monitoring and https://download.01.org/perfmon/BDW-DE. |
match1 | 22 bit hex value | Input value is written to bits 16-37 in the OFFCORE_RESPONSE register | Check the Intel® Software Developer System Programming Manual, Vol. 3, Chapter Performance Monitoring and https://download.01.org/perfmon/BDW-DE. |
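
As a rough sketch of how the two options in the table above map onto one OFFCORE_RESPONSE register value: the function name is hypothetical, and truncating `match1` to 22 bits is an assumption derived from the stated argument width, not something confirmed for LIKWID's implementation.

```python
def offcore_response_value(match0=0, match1=0):
    """Build an OFFCORE_RESPONSE register value from match0 and match1.

    Hypothetical helper for illustration, not LIKWID code.
    """
    low = match0 & 0x8FFF                  # match0: masked with 0x8FFF, bits 0-15
    high = (match1 & 0x3FFFFF) << 16       # match1: assumed 22-bit truncation, bits 16-37
    return high | low


if __name__ == "__main__":
    print(hex(offcore_response_value(match0=0x0001, match1=0x1)))
```

Note that the 0x8FFF mask clears bits 12-14 of `match0`, so those bits never reach the register even if they are set in the input.
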
The event MEM_TRANS_RETIRED_LOAD_LATENCY is not available because it needs programming of PEBS registers. PEBS is a kernel-level measurement facility for performance monitoring. Although we can program it from user-space, the results are always 0.
The Intel® Broadwell microarchitecture provides one register for the current core temperature.
Counter name | Event name |
---|---|
TMP0 | TEMP_CORE |
The Intel® Broadwell microarchitecture provides measurements of the current power consumption through the RAPL interface.
Counter name | Event name |
---|---|
PWR0 | PWR_PKG_ENERGY |
PWR1 | PWR_PP0_ENERGY |
PWR2 | PWR_PP1_ENERGY |
PWR3 | PWR_DRAM_ENERGY |
The Intel® Broadwell D microarchitecture provides measurements of the management box in the Uncore. The description from Intel®:
The UBox serves as the system configuration controller for the Intel® Xeon® Processor D-1500 Product Family. In this capacity, the UBox acts as the central unit for a variety of functions:
The Uncore management performance counters are exposed to the operating system through the MSR interface. The name UBOX originates from the Nehalem EX Uncore monitoring where those functional units are called UBOX.
Counter name | Event name |
---|---|
UBOX0 | * |
UBOX1 | * |
UBOXFIX | UBOX_CLOCKTICKS |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
threshold | 5 bit hex value | Set bits 24-28 in config register | |
invert | N | Set bit 23 in config register | |
The Intel® Broadwell microarchitecture provides measurements for the last level cache segments. The description from Intel®:
The LLC coherence engine (CBo) manages the interface between the core and the last level cache (LLC). All core transactions that access the LLC are directed from the core to a CBo via the ring interconnect. The CBo is responsible for managing data delivery from the LLC to the requesting core. It is also responsible for maintaining coherence between the cores within the socket that share the LLC; generating snoops and collecting snoop responses from the local cores when the MESIF protocol requires it.
Counter name | Event name |
---|---|
CBOX<0-15>C0 | * |
CBOX<0-15>C1 | * |
CBOX<0-15>C2 | * |
CBOX<0-15>C3 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register | |
tid | 6 bit hex value | Set bits 0-5 in MSR_UNC_C<0-15>_PMON_BOX_FILTER register | |
state | 7 bit hex value | Set bits 17-23 in MSR_UNC_C<0-15>_PMON_BOX_FILTER register | M': 0x40, D: 0x20, F: 0x10, M: 0x08, E: 0x04, S: 0x02, I: 0x01 |
nid | 16 bit hex value | Set bits 0-15 in MSR_UNC_C<0-15>_PMON_BOX_FILTER1 register | Note: Node 0 has value 0x0001 |
opcode | 9 bit hex value | Set bits 20-28 in MSR_UNC_C<0-15>_PMON_BOX_FILTER1 register | A list of valid opcodes can be found in the Intel® Xeon D-1500 Uncore Manual. |
match0 | 2 bit hex address | Set bits 30-31 in MSR_UNC_C<0-15>_PMON_BOX_FILTER1 register | See the Intel® Xeon D-1500 Uncore Manual for more information. |
The Intel® Broadwell D microarchitecture provides an event LLC_LOOKUP which can be filtered with the 'state' option. If no 'state' is set, LIKWID sets the state to 0x1F, the default value to measure all lookups.
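
The 'state' values from the table and the default described above can be combined as in the following sketch. The dictionary and function names are hypothetical. The shift by 17 follows the "bits 17-23" placement given for the MSR_UNC_C<0-15>_PMON_BOX_FILTER register.

```python
# State encodings as listed in the 'state' row of the option table
LLC_STATES = {"M'": 0x40, "D": 0x20, "F": 0x10,
              "M": 0x08, "E": 0x04, "S": 0x02, "I": 0x01}

# Default used when no 'state' option is given (F, M, E, S and I)
DEFAULT_STATE = 0x1F


def state_filter_bits(states=None):
    """Return the 'state' field positioned at bits 17-23 of the filter register.

    Hypothetical helper for illustration, not LIKWID code.
    """
    if states is None:
        value = DEFAULT_STATE
    else:
        value = 0
        for name in states:
            value |= LLC_STATES[name]      # OR together the requested states
    return value << 17


if __name__ == "__main__":
    print(hex(state_filter_bits()))            # default: all of F, M, E, S, I
    print(hex(state_filter_bits(["M", "E"])))  # only modified and exclusive lines
```
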
The Intel® Broadwell D microarchitecture provides measurements of the Home Agent (HA) in the Uncore. The description from Intel®:
Each HA is responsible for the protocol side of memory interactions, including coherent and non-coherent home agent protocols (as defined in the Intel® QuickPath Interconnect Specification). Additionally, the HA is responsible for ordering memory reads/writes, coming in from the modular Ring, to a given address such that the IMC (memory controller).
The Home Agent performance counters are exposed to the operating system through PCI interfaces. There are two of those interfaces for the HA. Both HAs are available on systems where each socket has 12 or more cores. The name BBOX originates from the Nehalem EX Uncore monitoring where this functional unit is called BBOX.
Counter name | Event name |
---|---|
BBOX<0,1>C0 | * |
BBOX<0,1>C1 | * |
BBOX<0,1>C2 | * |
BBOX<0,1>C3 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register | |
opcode | 6 bit hex value | Set bits 0-5 in PCI_UNC_HA_PMON_OPCODEMATCH register of PCI device | |
match0 | 46 bit hex address | Extract bits 6-31 and set bits 6-31 in PCI_UNC_HA_PMON_ADDRMATCH0 register of PCI device; extract bits 32-45 and set bits 0-13 in PCI_UNC_HA_PMON_ADDRMATCH1 register of PCI device | |
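
The split of the 46-bit `match0` address across the two address match registers can be sketched like this. The function name is hypothetical, and rejecting addresses wider than 46 bits is an assumption made for the example.

```python
def ha_addr_match(address):
    """Split a 46-bit address into (ADDRMATCH0, ADDRMATCH1) register values.

    Hypothetical helper for illustration, not LIKWID code.
    """
    if not 0 <= address < (1 << 46):
        raise ValueError("address must fit into 46 bits")
    addrmatch0 = address & 0xFFFFFFC0      # bits 6-31 stay at bits 6-31
    addrmatch1 = (address >> 32) & 0x3FFF  # bits 32-45 move to bits 0-13
    return addrmatch0, addrmatch1


if __name__ == "__main__":
    low, high = ha_addr_match(0x123456789ABC)
    print(hex(low), hex(high))
```

The lowest 6 address bits are dropped in this split, so matching works at a granularity of 64 bytes.
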
The Intel® Broadwell D microarchitecture provides measurements of the power control unit (PCU) in the Uncore. The description from Intel®:
The PCU is the primary Power Controller for the Intel® Xeon® Processor D-1500 Product Family.
The uncore implements a power control unit acting as a core/uncore power and thermal manager. It runs its firmware on an internal microcontroller and coordinates the socket’s power states.
The PCU performance counters are exposed to the operating system through the MSR interface. The name WBOX originates from the Nehalem EX Uncore monitoring where those functional units are called WBOX.
Counter name | Event name |
---|---|
WBOX0 | * |
WBOX1 | * |
WBOX2 | * |
WBOX3 | * |
WBOX0FIX | CORES_IN_C3 |
WBOX1FIX | CORES_IN_C6 |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 5 bit hex value | Set bits 24-28 in config register | |
occupancy | 2 bit hex value | Set bits 14-15 in config register | Cores in C0: 0x1, in C3: 0x2, in C6: 0x3 |
occupancy_filter | 32 bit hex value | Set bits 0-31 in MSR_UNC_PCU_PMON_BOX_FILTER register | Band0: bits 0-7, Band1: bits 8-15, Band2: bits 16-23, Band3: bits 24-31 |
occupancy_edgedetect | N | Set bit 31 in config register | |
occupancy_invert | N | Set bit 30 in config register | |
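
The 32-bit argument of the `occupancy_filter` option holds four 8-bit bands. A minimal sketch of packing them, with a hypothetical function name and with the range check added as an assumption for the example:

```python
def pack_occupancy_filter(band0=0, band1=0, band2=0, band3=0):
    """Pack four 8-bit band values into the 32-bit occupancy_filter argument.

    Hypothetical helper for illustration, not LIKWID code.
    """
    value = 0
    for position, band in enumerate((band0, band1, band2, band3)):
        if not 0 <= band <= 0xFF:
            raise ValueError("each band must fit into 8 bits")
        value |= band << (8 * position)    # Band<n> occupies bits 8n to 8n+7
    return value


if __name__ == "__main__":
    print(hex(pack_occupancy_filter(0x11, 0x22, 0x33, 0x44)))
```
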
The Intel® Broadwell D microarchitecture provides measurements of the IRP box in the Uncore. The description from Intel®:
IRP is responsible for maintaining coherency for IIO traffic that needs to be coherent (e.g. cross-socket P2P).
The IRP box counters are exposed to the operating system through the PCI interface. The IBOX was introduced with the Intel® IvyBridge EP/EN/EX microarchitecture.
Counter name | Event name |
---|---|
IBOX<0,1>C0 | * |
IBOX<0,1>C1 | * |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
The Intel® Broadwell D microarchitecture provides measurements of the integrated Memory Controllers (iMC) in the Uncore. The description from Intel®:
The Intel® Xeon® Processor D-1500 Product Family integrated Memory Controller provides the interface to DRAM and communicates to the rest of the Uncore through the Home Agent (i.e. the IMC does not connect to the Ring).
In conjunction with the HA, the memory controller also provides a variety of RAS features.
The performance counters of the integrated Memory Controllers are exposed to the operating system through PCI interfaces. There may be two memory controllers in the system. There are 4 different PCI devices per memory controller, but only 2 channels. Each channel has 4 different general-purpose counters and one fixed counter for the DRAM clock. The channels of the first memory controller are named MBOX0-3; the channels of the second memory controller (if available) are named MBOX4-7. The name MBOX originates from the Nehalem EX Uncore monitoring where those functional units are called MBOX.
Counter name | Event name |
---|---|
MBOX<0-7>C0 | * |
MBOX<0-7>C1 | * |
MBOX<0-7>C2 | * |
MBOX<0-7>C3 | * |
MBOX<0-7>FIX | DRAM_CLOCKTICKS |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
The Intel® Broadwell D microarchitecture provides measurements of the Ring-to-PCIe (R2PCIe) interface in the Uncore. The description from Intel®:
R2PCIe represents the interface between the Ring and IIO traffic to/from PCIe.
The Ring-to-PCIe performance counters are exposed to the operating system through a PCI interface. Independent of the system's configuration, there is only one Ring-to-PCIe interface per CPU socket.
Counter name | Event name |
---|---|
PBOX0 | * |
PBOX1 | * |
PBOX2 | * |
PBOX3 | * |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |