Intel® VTune™ Amplifier 2018 Help
Use the TSX Exploration analysis to understand Intel® Transactional Synchronization Extensions (Intel TSX) behavior and causes of transactional aborts on Intel processors.
For the transactional success analysis, the VTune Amplifier provides the following metrics:
Clockticks to measure the total number of collected unhalted cycles.
Transactional Cycles to measure the number of cycles spent during transactions. If it is near zero, the application either does not use lock-based synchronization or does not use a synchronization library enabled for lock elision through the Intel TSX instructions.
Abort Cycles to measure the number of cycles spent in transactions that were eventually aborted. If it is small relative to Transactional Cycles, the transactional success rate is high and additional tuning is not required. If it is almost the same as Transactional Cycles (and is not itself very small), most transactional regions are aborting and lock elision is not going to be beneficial. To identify the causes of transactional aborts and reduce them, enable the Aborts analysis.
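The interpretation rules above can be sketched as a small helper. This is only an illustration: the function name `assess_tsx`, the verdict strings, and the three threshold values are hypothetical and are not defined by VTune Amplifier, which reports the raw metrics and leaves the judgment to you.

```python
def assess_tsx(clockticks, transactional_cycles, abort_cycles,
               near_zero=0.01, low_abort=0.10, high_abort=0.90):
    """Classify a run from the three transactional success metrics.

    The thresholds are illustrative assumptions, not VTune Amplifier values.
    """
    if clockticks <= 0:
        raise ValueError("clockticks must be positive")

    # Transactional Cycles near zero: lock elision is effectively not in use.
    if transactional_cycles <= 0 or transactional_cycles / clockticks < near_zero:
        return "no-transactions"

    # Share of transactional cycles spent in transactions that later aborted.
    abort_ratio = abort_cycles / transactional_cycles

    if abort_ratio <= low_abort:
        return "high-success"      # no additional tuning required
    if abort_ratio >= high_abort:
        return "mostly-aborting"   # run the Aborts analysis step next
    return "mixed"                 # worth checking abort causes


if __name__ == "__main__":
    # Made-up event counts, for illustration only.
    print(assess_tsx(1_000_000, 500_000, 480_000))
```

Checking Transactional Cycles against Clockticks first mirrors the "but not very small" caveat: an abort ratio computed from a negligible number of transactional cycles says little about the application.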
VTune Amplifier classifies aborts by the following reasons:
Instruction: Some instructions, such as CPUID and I/O instructions, may cause transactional execution to abort, depending on the implementation.
Data Conflict: A conflicting data access occurs if another logical processor either reads a location that is part of the transactional region's write-set or writes a location that is a part of either the read- or write-set of the transactional region. Since Intel TSX detects data conflicts at the granularity of a cache line, unrelated data locations placed in the same cache line will be detected as conflicts.
Capacity: Transactional aborts may occur due to limited transactional resources. For example, the amount of data accessed in the region may exceed an implementation-specific capacity.
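To make the cache-line granularity of the Data Conflict reason concrete, the following sketch checks whether two addresses fall in the same cache line and computes padded offsets that give each item its own line. The 64-byte line size is an assumption (it is typical of current Intel processors but should be confirmed for your target), and all function names are hypothetical.

```python
CACHE_LINE_BYTES = 64  # assumption: typical cache line size, verify for your CPU


def cache_line_index(address, line_bytes=CACHE_LINE_BYTES):
    """Return the index of the cache line that contains the address."""
    return address // line_bytes


def may_conflict(addr_a, addr_b, line_bytes=CACHE_LINE_BYTES):
    """True if the two addresses share a cache line, so accesses to them
    can be detected as a data conflict even if the data is unrelated."""
    return cache_line_index(addr_a, line_bytes) == cache_line_index(addr_b, line_bytes)


def padded_offsets(count, item_bytes, line_bytes=CACHE_LINE_BYTES):
    """Offsets, relative to a cache-line-aligned base, that place each
    item at the start of its own cache line."""
    # Round the item size up to a whole number of cache lines.
    stride = -(-item_bytes // line_bytes) * line_bytes
    return [i * stride for i in range(count)]


if __name__ == "__main__":
    base = 0x1000  # hypothetical cache-line-aligned base address
    # Two adjacent 8-byte counters share a line.
    print(may_conflict(base, base + 8))
    # After padding, each counter sits in its own line.
    first, second = padded_offsets(2, 8)
    print(may_conflict(base + first, base + second))
```

The padded layout trades memory for fewer spurious conflicts, and it only helps when the base address is itself aligned to a cache line boundary.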
To use the TSX Exploration analysis type, explore:
Configuration options (knobs)
To configure options for the TSX Exploration analysis:
Prerequisites: Create a project and specify an analysis target.
Click the New Analysis toolbar button (in the standalone GUI or the Visual Studio IDE).
The Analysis Type window opens.
From the left pane, select Microarchitecture Analysis > TSX Exploration.
The TSX Exploration configuration pane opens on the right.
Configure the following options:
Analyze user tasks, events, and counters check box | Analyze the tasks, events, and counters specified in your code via the ITT API. This option causes a higher overhead and increases the result size. The default value is false.
TSX Analysis Step options | Select a step for analyzing Intel TSX behavior. Start with measuring transactional success and then, if the abort rate is high, analyze for aborts. The default value is "1. Transactional success".
Details button | Expand/collapse a section listing the default non-editable settings used for this analysis type. If you want to modify or enable additional settings for the analysis, you need to create a custom configuration by copying an existing predefined configuration. VTune Amplifier creates an editable copy of this analysis type configuration and places it under the Custom Analysis section on the left pane.
Click Start to run the analysis.
For analysis, use the TSX Exploration viewpoint that includes the following windows:
Summary window displays statistics on the overall application execution.
Bottom-up window displays performance data per CPU metrics (event ratio/event count/sample count) for each program unit.
Top-down Tree window displays hotspot functions in the call tree, with performance metrics for a function only (Self value) and for a function and its children together (Total value).
Platform window provides details on tasks specified in your code with the Task API, Ftrace*/Systrace* event tasks, OpenCL™ API tasks, and so on. If corresponding platform metrics are collected, the Platform window displays data over time, such as GPU usage on a software queue, CPU time usage, OpenCL™ kernels data, GPU performance per the Overview group of GPU hardware metrics, Memory Bandwidth, and CPU Frequency.