Class ov#

ov : private pass::low_precision::BaseMatcherPass public ov::pass::MatcherPass , private pass::low_precision::LowPrecision public ov::pass::ModelPass

Public Types

enum Direction#

Enumerate directions.

Values:

enumerator FORWARD#

enumerator BACKWARD#

enum ColumnOfProcessorTypeTable#

This enum contains definition of each columns in processor type table which bases on cpu core types. Will extend to support other CPU core type like ARM.

The following are two example of processor type table.

Processor table of 4 numa nodes and 2 socket server

Processor table of 1 numa node desktop

ALL_PROC | MAIN_CORE_PROC | EFFICIENT_CORE_PROC | HYPER_THREADING_PROC | PROC_NUMA_NODE_ID | PROC_SOCKET_ID 32 8 16 8 -1 -1

Values:

enumerator ALL_PROC#: All processors, regardless of backend cpu.

enumerator MAIN_CORE_PROC#: Processor based on physical core of Intel Performance-cores.

enumerator EFFICIENT_CORE_PROC#: Processor based on Intel Efficient-cores.

enumerator HYPER_THREADING_PROC#: Processor based on logical core of Intel Performance-cores.

enumerator PROC_NUMA_NODE_ID#: Numa node id of processors in this row.

enumerator PROC_SOCKET_ID#: Socket id of processors in this row.

enumerator PROC_TYPE_TABLE_SIZE#: Size of processor type table.

enum ProcessorUseStatus#

Definition of CPU_MAP_USED_FLAG column in CPU mapping table.

Values:

enumerator CPU_BLOCKED#: Processor is blocked to use.

enumerator NOT_USED#: Processor is not bound to thread.

enumerator CPU_USED#: CPU is in using.

enum ColumnOfCPUMappingTable#

This enum contains definition of each columns in CPU mapping table which use processor id as index.

GROUP_ID is generated according to the following rules.

If one MAIN_CORE_PROC and one HYPER_THREADING_PROC are based on same Performance-cores, they are in one group.
If some EFFICIENT_CORE_PROC share one L2 cachle, they are in one group.
There are no duplicate group IDs in the system

The following is the example of CPU mapping table.

Four processors of two Pcore
Four processors of four Ecores shared L2 cache

Values:

enumerator CPU_MAP_PROCESSOR_ID#: column for processor id of the processor

enumerator CPU_MAP_NUMA_NODE_ID#: column for node id of the processor

enumerator CPU_MAP_SOCKET_ID#: column for socket id of the processor

enumerator CPU_MAP_CORE_ID#: column for hardware core id of the processor

enumerator CPU_MAP_CORE_TYPE#: column for CPU core type corresponding to the processor

enumerator CPU_MAP_GROUP_ID#: column for group id to the processor. Processors in one group have dependency.

enumerator CPU_MAP_USED_FLAG#: column for resource management of the processor

enumerator CPU_MAP_TABLE_SIZE#: Size of CPU mapping table.

enum ColumnOfCpuStreamsInfoTable#

This enum contains definition of each columns in cpu streams information table.

The following are two example of processor type table.

8 streams on hybrid platform which has 4 threads per stream (TPS). 1.1 2 streams (4 TPS) on physical core of Intel Performance-cores 1.2 4 streams (4 TPS) on Intel Efficient-cores 1.3 2 streams (4 TPS) on logic core of Intel Performance-cores

NUMBER_OF_STREAMS | PROC_TYPE | THREADS_PER_STREAM | STREAM_NUMA_NODE_ID | STREAM_SOCKET_ID 2 1 4 0 0 4 2 4 0 0 2 3 4 0 0

1 stream (10 TPS) on hybrid platform which has 2 threads on physical core and 8 threads on Ecore. 2.1 1 streams (10 TPS) on multiple types of processors 2.2 2 threads on physical core of Intel Performance-cores 2.3 8 threads on Intel Efficient-cores

NUMBER_OF_STREAMS | PROC_TYPE | THREADS_PER_STREAM | STREAM_NUMA_NODE_ID | STREAM_SOCKET_ID 1 0 10 0 0 0 1 2 0 0 0 2 8 0 0

Values:

enumerator NUMBER_OF_STREAMS#: Number of streams on specific CPU core tpye.

enumerator PROC_TYPE#: Core type of current streams.

enumerator THREADS_PER_STREAM#: Number of threads per stream of current streams.

enumerator STREAM_NUMA_NODE_ID#: Numa node id of processors in this row.

enumerator STREAM_SOCKET_ID#: Socket id of processors in this row.

enumerator CPU_STREAMS_TABLE_SIZE#: Size of streams info table.

enum class PropertyMutability#

Enum to define property value mutability.

Values:

enumerator RO#: Read-only property values can not be passed as input parameter.

enumerator RW#: Read/Write property key may change readability in runtime.

enumerator WO#: Write-only property can not be read.

enum class WorkloadType#

Enum to define possible workload types.

Workload type represents the execution priority for an inference.

Values:

enumerator DEFAULT#

enumerator EFFICIENT#

enum class CacheMode#

Enum to define possible cache mode.

Values:

enumerator OPTIMIZE_SIZE#: smaller cache size

enumerator OPTIMIZE_SPEED#: faster loading time

enum class Affinity#

Enum to define possible affinity patterns.

Values:

enumerator NONE#: Disable threads affinity pinning.

enumerator CORE#: Pin threads to cores, best for static benchmarks.

enumerator NUMA#: Pin threads to NUMA nodes, best for real-life, contented cases. On the Windows and MacOS* this option behaves as CORE

enumerator HYBRID_AWARE#: Let the runtime to do pinning to the cores types, e.g. prefer the “big” cores for latency tasks. On the hybrid CPUs this option is default

using TensorSymbol = std::vector<std::shared_ptr<Symbol>>#: Alias for symbol tensor.

using TensorSymbolVector = std::vector<TensorSymbol>#: Alias for vector of symbol tensors.

using EvaluationContext = ov::RTMap#: EvaluationContext stores and manages a context (additional parameters, values and environment) for evaluating ov::Model.

using Rank = Dimension#: Alias for Dimension, used when the value represents the number of axes in a shape, rather than the size of one dimension in a shape.

using TensorVector = std::vector<Tensor>#: A vector of Tensor’s.

using SupportedOpsMap = std::map<std::string, std::string>#

This type of map is used for result of Core::query_model.

key means operation name
value means device name supporting this operation

Public Static Attributes

static constexpr Property<std::vector<PropertyName>, PropertyMutability::RO> supported_properties{"SUPPORTED_PROPERTIES"}#: Read-only property to get a std::vector<PropertyName> of supported read-only properties. This can be used as a compiled model property as well.

static constexpr Property<std::vector<std::string>, PropertyMutability::RO> available_devices = {"AVAILABLE_DEVICES"}#: Read-only property to get a std::vector<std::string> of available device IDs.

static constexpr Property<std::string, PropertyMutability::RO> model_name = {"NETWORK_NAME"}#: Read-only property to get a name of name of a model.

static constexpr Property<uint32_t, PropertyMutability::RO> optimal_number_of_infer_requests{"OPTIMAL_NUMBER_OF_INFER_REQUESTS"}#: Read-only property to get an unsigned integer value of optimal number of compiled model infer requests.

static constexpr Property<bool> enable_profiling = {"PERF_COUNT"}#: The name for setting performance counters option.

static constexpr Property<std::string> cache_dir = {"CACHE_DIR"}#

This property defines the directory which will be used to store any data cached by plugins.

The underlying cache structure is not defined and might differ between OpenVINO releases Cached data might be platform / device specific and might be invalid after OpenVINO version change If this property is not specified or value is empty string, then caching is disabled. The property might enable caching for the plugin using the following code:

ie.set_property("GPU", ov::cache_dir("cache/")); // enables cache for GPU plugin

The following code enables caching of compiled network blobs for devices where import/export is supported

ie.set_property(ov::cache_dir("cache/")); // enables models cache

static constexpr Property<bool, PropertyMutability::RO> loaded_from_cache = {"LOADED_FROM_CACHE"}#: Read-only property to notify user that compiled model was loaded from the cache.

static constexpr Property<WorkloadType, PropertyMutability::RW> workload_type = {"WORKLOAD_TYPE"}#: Read-write property to select in which mode the workload will be executed This is only supported by NPU.

static constexpr Property<CacheMode, PropertyMutability::RW> cache_mode = {"CACHE_MODE"}#: Read-write property to select the cache mode between optimize_size and optimize_speed. If optimize_speed is selected(default), loading time will decrease but the cache file size will increase. If optimize_size is selected, smaller cache files will be created. This is only supported from GPU.

static constexpr Property<std::tuple<unsigned int, unsigned int>, PropertyMutability::RO> range_for_streams{"RANGE_FOR_STREAMS"}#

Read-only property to provide information about a range for streams on platforms where streams are supported.

Property returns a value of std::tuple<unsigned int, unsigned int> type, where:

First value is bottom bound.
Second value is upper bound.

static constexpr Property<unsigned int, PropertyMutability::RO> optimal_batch_size = {"OPTIMAL_BATCH_SIZE"}#

Read-only property to query information optimal batch size for the given device and the network.

Property returns a value of unsigned int type, Returns optimal batch size for a given network on the given device. The returned value is aligned to power of 2. Also, ov::hint::model is the required option for this metric since the optimal batch size depends on the model, so if the ov::hint::model is not given, the result of the metric is always 1. For the GPU the metric is queried automatically whenever the OpenVINO performance hint for the throughput is used, so that the result (>1) governs the automatic batching (transparently to the application). The automatic batching can be disabled with ALLOW_AUTO_BATCHING set to NO

static constexpr Property<uint32_t, PropertyMutability::RO> max_batch_size = {"MAX_BATCH_SIZE"}#: Read-only property to get maximum batch size which does not cause performance degradation due to memory swap impact.

static constexpr Property<uint32_t, PropertyMutability::RW> auto_batch_timeout = {"AUTO_BATCH_TIMEOUT"}#: Read-write property to set the timeout used to collect the inputs for the auto-batching impact.

static constexpr Property<std::tuple<unsigned int, unsigned int, unsigned int>, PropertyMutability::RO> range_for_async_infer_requests = {"RANGE_FOR_ASYNC_INFER_REQUESTS"}#

Read-only property to provide a hint for a range for number of async infer requests. If device supports streams, the metric provides range for number of IRs per stream.

Property returns a value of std::tuple<unsigned int, unsigned int, unsigned int> type, where:

First value is bottom bound.
Second value is upper bound.
Third value is step inside this range.

static constexpr Property<bool, PropertyMutability::RW> force_tbb_terminate = {"FORCE_TBB_TERMINATE"}#

Read-write property to set whether force terminate tbb when ov core destruction value type: boolean.

True explicitly terminate tbb when ov core destruction
False will not involve additional tbb operations when core destruction

static constexpr Property<bool, PropertyMutability::RW> enable_mmap = {"ENABLE_MMAP"}#

Read-write property to configure mmap() use for model read. Enabled by default. For the moment only IR Frontend supports the property.

value type: boolean

True enable mmap() use and map model
False disable mmap() use and read model

static constexpr Property<streams::Num, PropertyMutability::RW> num_streams = {"NUM_STREAMS"}#: The number of executor logical partitions.

static constexpr Property<int32_t, PropertyMutability::RW> inference_num_threads = {"INFERENCE_NUM_THREADS"}#: Maximum number of threads that can be used for inference tasks.

static constexpr Property<int32_t, PropertyMutability::RW> compilation_num_threads = {"COMPILATION_NUM_THREADS"}#: Maximum number of threads that can be used for compilation tasks.

static constexpr Property<Affinity> affinity = {"AFFINITY"}#

The name for setting CPU affinity per thread option.

Deprecated:: Use ov::hint::enable_cpu_pinning

Note

The setting is ignored, if the OpenVINO compiled with OpenMP and any affinity-related OpenMP’s environment variable is set (as affinity is configured explicitly)

static constexpr Property<std::vector<std::string>, PropertyMutability::RO> execution_devices = {"EXECUTION_DEVICES"}#: The devices that the inference task been executed.