AllScale Monitoring Service
The AllScale Monitoring Service provides the infrastructure to collect performance data describing the state of the execution environment and of the applications running in it. These measurements include execution times of selected code sections, hardware events, energy usage, and runtime internals such as task queue lengths.
The Monitoring Service also provides means for performance introspection, that is, the ability to access the collected data online while the application runs. This gives the AllScale Scheduler data that it can use for scheduling decisions.
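To illustrate what online introspection means in practice, here is a minimal sketch of an in-process monitor whose samples can be queried while the workload is still running. The `Monitor` class and its `record`, `timed`, `latest`, and `mean` methods are hypothetical names chosen for this example; they are not the AllScale API.

```python
import threading
import time
from collections import defaultdict, deque
from contextlib import contextmanager


class Monitor:
    """Hypothetical in-process monitor (illustration only, not the AllScale API)."""

    def __init__(self, history=1024):
        self._lock = threading.Lock()
        # Bounded history per metric so long runs do not grow without limit.
        self._samples = defaultdict(lambda: deque(maxlen=history))

    def record(self, metric, value):
        """Store one sample for the given metric."""
        with self._lock:
            self._samples[metric].append(value)

    @contextmanager
    def timed(self, metric):
        """Record the wall-clock duration of the enclosed code section."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(metric, time.perf_counter() - start)

    def latest(self, metric):
        """Most recent sample, or None if nothing has been recorded yet."""
        with self._lock:
            samples = self._samples.get(metric)
            return samples[-1] if samples else None

    def mean(self, metric):
        """Mean over the retained history, or None if it is empty."""
        with self._lock:
            samples = self._samples.get(metric)
            return sum(samples) / len(samples) if samples else None
```

A worker would wrap a code section in `with monitor.timed("task"):`, while a scheduler thread calls `monitor.latest("queue_length")` or `monitor.mean("task")` at any time. The lock keeps both sides consistent without stopping the application.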
In addition, the Monitoring Service generates performance profiles that give AllScale developers and users insight into the performance behaviour of their applications.
Currently, the AllScale monitoring framework can measure power on IBM POWER8 machines. POWER8 chips have a few on-chip programmable components (sensors) for power/thermal management, such as the On-Chip Controller (OCC), the General Purpose Engine (used by the OCC), the Sleep-Winkle Engine (used to restore core logic after a power-management idle instruction), and the Self Boot Engine (used for chip initialization and to load and start the Hostboot firmware).
System and application users on POWER8 can access power and energy usage via different interfaces:
- Out-of-band via IPMI (Time to read sensors: 120 ms)
- In-band via IPMI (Time to read sensors: 80 ms)
- In-band via XSCOM (Time to read sensors: 1100 ns)
  - XSCOM is a specialized scan communication interface in the chip pervasive logic to access specific latches from the processor cores.
  - Core and memory temperatures are read from Digital Thermal Sensors via XSCOM.
  - Exported as HWMON sensors
- OCC In-band Sensors (Time to read sensors: 90 ns)
  - Involves programming the on-chip component OCC to enable instrumentation in the host Linux kernel
  - OCC In-band Sensors provide a mechanism to expose the platform sensor data in-band through standard Linux interfaces, such as Perf, HWMON/lm-sensors, and Sysfs.
  - OCC provides power management at per-chip granularity:
    - Total System Power is read every 250 µs
    - Core temperature is read every 2 ms
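Since the list above mentions HWMON and Sysfs as access paths, the following sketch shows how sensors exported through the generic Linux hwmon sysfs layout could be read. It relies only on the standard hwmon conventions (`power*_input` in microwatts, `temp*_input` in millidegrees Celsius). The function name `read_hwmon_sensors` is hypothetical, and which hwmon device the OCC sensors appear under depends on the kernel and firmware, so no device name is assumed here.

```python
from pathlib import Path

# Standard hwmon units: power*_input is in microwatts,
# temp*_input is in millidegrees Celsius.
DIVISORS = {"power": 1_000_000, "temp": 1_000}


def read_hwmon_sensors(root="/sys/class/hwmon"):
    """Return {(device_name, label): value} in watts or degrees Celsius."""
    readings = {}
    for device in sorted(Path(root).glob("hwmon*")):
        name_file = device / "name"
        device_name = (
            name_file.read_text().strip() if name_file.is_file() else device.name
        )
        for kind, divisor in DIVISORS.items():
            for input_file in sorted(device.glob(f"{kind}*_input")):
                channel = input_file.name[: -len("_input")]  # e.g. "power1"
                label_file = device / f"{channel}_label"
                label = (
                    label_file.read_text().strip()
                    if label_file.is_file()
                    else channel
                )
                try:
                    raw = int(input_file.read_text().strip())
                except (OSError, ValueError):
                    continue  # sensor present but not readable right now
                readings[(device_name, label)] = raw / divisor
    return readings
```

Calling `read_hwmon_sensors()` returns whatever power and temperature sensors the running kernel exposes. The `root` parameter exists so the function can be pointed at a test directory.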
The main difference among these four interfaces, especially within the context of AllScale, is their read latency. AllScale therefore uses the OCC In-band Sensors: at 90 ns per read they have the lowest latency of the four, which makes them the best fit for fine-grained task scheduling.
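To make the latency argument concrete, the sketch below relates each interface's read latency (taken from the list above) to the duration of a single task. The task duration of 100 µs is an assumption chosen for illustration, and the helper names are hypothetical.

```python
# Read latencies quoted in the interface list, in seconds.
READ_LATENCY_S = {
    "IPMI out-of-band": 120e-3,
    "IPMI in-band": 80e-3,
    "XSCOM in-band": 1100e-9,
    "OCC in-band sensors": 90e-9,
}


def overhead_fraction(latency_s, task_duration_s):
    """Time of one sensor read relative to the duration of one task."""
    return latency_s / task_duration_s


def fastest_interface(latencies):
    """Name of the interface with the lowest read latency."""
    return min(latencies, key=latencies.get)


if __name__ == "__main__":
    task_duration_s = 100e-6  # assumed task duration: 100 microseconds
    for name, latency in READ_LATENCY_S.items():
        ratio = overhead_fraction(latency, task_duration_s)
        print(f"{name:22s} {ratio:.4%} of one task")
    print("lowest latency:", fastest_interface(READ_LATENCY_S))
```

Under this assumption, a single out-of-band IPMI read takes 1200 times as long as the task itself, an XSCOM read costs about 1.1% of it, and an OCC in-band read about 0.09%. Only the last two are cheap enough to sample once per task.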