Social Distance C++ Demo#

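This demo showcases a retail social distance application that detects people and measures the distance between them. If that distance is smaller than a previously specified value, an alert is triggered.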
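The distance check described above can be sketched as follows. This is a minimal illustration, not the demo's actual code: the `Box` struct, the `centerDistance` and `findViolations` helpers, and the choice to measure distance in pixels between bounding-box centers are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical axis-aligned bounding box of one detected person, in pixels.
struct Box {
    float x, y, width, height;
};

// Euclidean distance between the centers of two boxes, in pixels.
inline float centerDistance(const Box& a, const Box& b) {
    const float dx = (a.x + a.width / 2.0f) - (b.x + b.width / 2.0f);
    const float dy = (a.y + a.height / 2.0f) - (b.y + b.height / 2.0f);
    return std::hypot(dx, dy);
}

// Returns index pairs (i, j), i < j, of people closer than minDistance.
// Assumption: the threshold is given in the same pixel units as the boxes.
inline std::vector<std::pair<std::size_t, std::size_t>>
findViolations(const std::vector<Box>& people, float minDistance) {
    assert(minDistance >= 0.0f);
    std::vector<std::pair<std::size_t, std::size_t>> violations;
    for (std::size_t i = 0; i < people.size(); ++i) {
        for (std::size_t j = i + 1; j < people.size(); ++j) {
            if (centerDistance(people[i], people[j]) < minDistance) {
                violations.emplace_back(i, j);
            }
        }
    }
    return violations;
}
```

Each returned pair identifies two detections that an application could highlight as an alert. A real deployment would likely need a camera-dependent mapping from pixels to physical distance, which this sketch leaves out.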

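The demo has the following objectives:

Video/camera as input, via OpenCV*
An example of complex asynchronous network pipelining: the Person Re-Identification network is executed on top of the Person Detection results
Visualization of violations of the minimum social distancing threshold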

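How It Works#

On startup, the application reads the command-line parameters and loads the specified networks. Both the Person Detection and the Re-Identification networks are required.

The core component of the application pipeline is the Worker class, which executes incoming instances of the Task class. Task is an abstract class that describes the data to process and how to process it. For example, a Task can read a frame or retrieve detection results. A pool holds the Task instances that are waiting to be executed. When a Task from the pool is executed, it may create another Task and/or submit one to the pool. Each Task stores a smart pointer to a VideoFrame instance, which represents the image the Task works with. When a sequence of Tasks has completed and no Task needs the VideoFrame instance any longer, the VideoFrame is destroyed. This triggers the creation of a new sequence of Tasks. The pipeline of this demo executes the following sequence of Tasks:

Reader, which reads a new frame
InferTask, which starts detection inference
DetectionsProcessor, which waits for detection inference to complete and runs the Re-Identification model
ResAggregator, which draws the inference results on the frame
Drawer, which shows the frame with the inference results

At the end of the sequence, the VideoFrame is destroyed and the sequence starts again for the next frame.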
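The lifetime rule described above, where the last Task releasing a VideoFrame triggers the next sequence, can be sketched with `std::shared_ptr`. This is a simplified, single-threaded illustration under stated assumptions: the class names mirror the text, but `StageTask`, `runPipeline`, the `onDestroy` callback, and the log are hypothetical, and the real demo runs Tasks on multiple threads and performs actual inference.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for VideoFrame: runs a callback when it is destroyed.
struct VideoFrame {
    int id;
    std::function<void(int)> onDestroy;
    VideoFrame(int frameId, std::function<void(int)> cb)
        : id(frameId), onDestroy(std::move(cb)) {}
    ~VideoFrame() { if (onDestroy) onDestroy(id); }
};

class Worker;

// Abstract unit of work; keeps its frame alive through a shared pointer.
class Task {
public:
    explicit Task(std::shared_ptr<VideoFrame> frame) : frame_(std::move(frame)) {}
    virtual ~Task() = default;
    virtual void process(Worker& worker) = 0;
protected:
    std::shared_ptr<VideoFrame> frame_;
};

// Single-threaded worker that drains a pool of pending Tasks.
class Worker {
public:
    void push(std::unique_ptr<Task> task) { pool_.push_back(std::move(task)); }
    void run() {
        while (!pool_.empty()) {
            std::unique_ptr<Task> task = std::move(pool_.front());
            pool_.pop_front();
            task->process(*this);
        }  // each Task is destroyed at the end of its iteration, releasing its frame
    }
    std::vector<std::string> log;  // records what happened, for illustration only
private:
    std::deque<std::unique_ptr<Task>> pool_;
};

constexpr std::size_t kStageCount = 5;
constexpr const char* kStageNames[kStageCount] = {
    "Reader", "InferTask", "DetectionsProcessor", "ResAggregator", "Drawer"};

// One stage of the chain; after running, it submits the next stage, if any.
class StageTask : public Task {
public:
    StageTask(std::shared_ptr<VideoFrame> frame, std::size_t stage)
        : Task(std::move(frame)), stage_(stage) {}
    void process(Worker& worker) override {
        worker.log.push_back(std::string(kStageNames[stage_]) + ":" +
                             std::to_string(frame_->id));
        if (stage_ + 1 < kStageCount) {
            worker.push(std::make_unique<StageTask>(frame_, stage_ + 1));
        }
    }
private:
    std::size_t stage_;
};

// Runs the chain for frameCount frames; destroying a frame starts the next one.
inline std::vector<std::string> runPipeline(int frameCount) {
    Worker worker;
    std::function<void(int)> startNext = [&](int finishedId) {
        worker.log.push_back("destroyed:" + std::to_string(finishedId));
        const int nextId = finishedId + 1;
        if (nextId < frameCount) {
            worker.push(std::make_unique<StageTask>(
                std::make_shared<VideoFrame>(nextId, startNext), 0));
        }
    };
    if (frameCount > 0) {
        worker.push(std::make_unique<StageTask>(
            std::make_shared<VideoFrame>(0, startNext), 0));
    }
    worker.run();
    return worker.log;
}
```

Calling `runPipeline(2)` logs the five stages for frame 0, then the destruction of frame 0, then the same for frame 1. Reference counting decides when a frame is finished, so no stage has to know that it is the last one in the chain.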

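NOTE: By default, Open Model Zoo demos expect input in BGR channel order. If you trained your model to work with RGB order, you need to either manually rearrange the default channel order in the sample or demo application, or reconvert your model using the Model Optimizer tool with the --reverse_input_channels argument specified. For more information about the argument, refer to the When to Reverse Input Channels section of [Embedding Preprocessing Computation](@ref openvino_docs_MO_DG_Additional_Optimization_Use_Cases).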
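Rearranging the channel order manually amounts to swapping the first and third channel of every pixel. The sketch below assumes an interleaved 8-bit, 3-channel buffer; `reverseChannels` is a hypothetical helper, not part of the demo. In code that already holds a `cv::Mat`, OpenCV's `cv::cvtColor` with `cv::COLOR_BGR2RGB` performs the same conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Swaps the first and third channel of every pixel in an interleaved
// 3-channel 8-bit buffer, converting BGR to RGB (or RGB back to BGR).
inline void reverseChannels(std::vector<std::uint8_t>& pixels) {
    assert(pixels.size() % 3 == 0);  // the buffer must hold whole 3-byte pixels
    for (std::size_t i = 0; i + 2 < pixels.size(); i += 3) {
        std::swap(pixels[i], pixels[i + 2]);
    }
}
```

Applying the function twice restores the original order, so the same helper works in both directions.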

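Preparing to Run#

For demo input image or video files, refer to the Media Files Available for Demos section of the Open Model Zoo Demos Overview. The list of models supported by the demo is in the <omz_dir>/demos/social_distance_demo/cpp/models.lst file. This file can be passed as a parameter to the Model Downloader and Converter to download the models and, if necessary, convert them to the OpenVINO IR format (*.xml + *.bin).

An example of using the Model Downloader: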

omz_downloader --list models.lst

An example of using the Model Converter:

omz_converter --list models.lst

Supported Models#

person-detection-0200
person-detection-0201
person-detection-0202
person-detection-retail-0013
person-reidentification-retail-0277
person-reidentification-retail-0286
person-reidentification-retail-0287
person-reidentification-retail-0288

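NOTE: For details on model inference support on different devices, refer to the tables Intel's Pre-Trained Models Device Support and Public Pre-Trained Models Device Support.

Running#

Running the application with the -h option prints a usage message: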

social_distance_demo [OPTION] 
Options: 
    -h                         Print a usage message.
    -i "<path1>" "<path2>"     Required for video or image files input. Path to video or image files.
    -m_det "<path>"            Required. Path to the Person Detection model .xml file.
    -m_reid "<path>"           Optional. Path to the Person Re-Identification model .xml file.
    -d_det "<device>"          Optional. Specify the target device for Person Detection (the list of available devices is shown below). Default value is CPU. Use "-d HETERO:<comma-separated_devices_list>" format to specify HETERO plugin. The application looks for a suitable plugin for the specified device.
    -d_reid "<device>"         Optional. Specify the target device for Person Re-Identification (the list of available devices is shown below). Default value is CPU. Use "-d_reid HETERO:<comma-separated_devices_list>" format to specify HETERO plugin. The application looks for a suitable plugin for the specified device.
    -r                         Optional. Output inference results as mask histogram.
    -t                         Optional. Probability threshold for person detections.
    -no_show                   Optional. Do not show processed video.
    -auto_resize               Optional. Enable resizable input with support of ROI crop and auto resize.
    -nireq                     Optional. Number of infer requests. 0 sets the number of infer requests equal to the number of inputs.
    -nc                        Required for web camera input. Maximum number of processed camera inputs (web cameras).
    -loop_video                Optional. Enable playing video on a loop.
    -n_iqs                     Optional. Number of allocated frames. It is a multiplier of the number of inputs.
    -ni                        Optional. Specify the number of channels generated from provided inputs (with -i and -nc keys). For example, if only one camera is provided, but -ni is set to 2, the demo will process frames as if they are captured from two cameras. 0 sets the number of input channels equal to the number of provided inputs.
    -fps                       Optional. Set the playback speed not faster than the specified FPS. 0 removes the upper bound.
    -n_wt                      Optional. Set the number of threads including the main thread a Worker class will use.
    -display_resolution        Optional. Specify the maximum output window resolution.
    -nstreams "<integer>"      Optional. Number of streams to use for inference on the CPU or/and GPU in throughput mode (for HETERO and MULTI device cases use format <device1>:<nstreams1>,<device2>:<nstreams2> or just <nstreams>) 
    -nthreads "<integer>"      Optional. Number of threads to use for inference on the CPU (including HETERO and MULTI cases).
    -u                         Optional. List of monitors to show initially.

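Running the application with an empty list of options yields an error message.

For example, to run person detection on a GPU with the OpenVINO toolkit pre-trained models (re-identification stays on the default CPU device because -d_reid is not set), run the following command: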
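The usage message states that a value of 0 for -ni or -nireq falls back to the number of provided inputs, and that -ni can generate more channels than there are inputs. The sketch below illustrates one way such values could be resolved. Both helpers are hypothetical, and the round-robin mapping from channels to inputs is an assumption; the usage message does not say how the demo assigns inputs to extra channels.

```cpp
#include <cassert>
#include <cstddef>

// Resolves an option where 0 means "same as the number of provided inputs".
inline std::size_t resolveCount(std::size_t requested, std::size_t inputCount) {
    return requested == 0 ? inputCount : requested;
}

// Assumed mapping: channel k reads from input k modulo the number of inputs,
// so one camera with two channels feeds both channels from the same source.
inline std::size_t sourceForChannel(std::size_t channel, std::size_t inputCount) {
    assert(inputCount > 0);
    return channel % inputCount;
}
```

With one camera and -ni set to 2, `sourceForChannel` returns input 0 for both channels, which matches the "as if they are captured from two cameras" behavior described in the option text.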

./social_distance_demo -i <path_to_video>/inputVideo.mp4 -m_det <path_to_model>/person-detection-retail-0013.xml -m_reid <path_to_model>/person-reidentification-retail-0277.xml -d_det GPU

To run inference on two video inputs with two asynchronous infer requests on the CPU, using the OpenVINO toolkit pre-trained models, run the following command:

./social_distance_demo -i <path_to_video>/inputVideo_0.mp4 <path_to_video>/inputVideo_1.mp4 -m_det <path_to_model>/person-detection-retail-0013.xml -m_reid <path_to_model>/person-reidentification-retail-0277.xml -d_det CPU -d_reid CPU -nireq 2

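Demo Output#

The demo uses OpenCV to display the resulting frames, with detections rendered as bounding boxes and text. The demo reports:

FPS: the average rate of video frame processing (frames per second).
Latency: the average time required to process one frame (from reading the frame to displaying the results).

You can use these metrics to measure application-level performance.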
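The two metrics can be computed from per-frame timestamps, as in the sketch below. `PerformanceMetrics` here is a hypothetical class written for this example and is not the demo's own implementation. It assumes latency is measured from the moment a frame is read to the moment it is shown, and FPS is the frame count divided by the elapsed time between the first read and the last display.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Accumulates per-frame timings and reports average latency and FPS.
class PerformanceMetrics {
public:
    using Clock = std::chrono::steady_clock;
    using Ms = std::chrono::duration<double, std::milli>;

    // Records one frame: readTime is when it was read, shownTime when displayed.
    void update(Clock::time_point readTime, Clock::time_point shownTime) {
        assert(shownTime >= readTime);
        if (frames_ == 0) firstRead_ = readTime;
        lastShown_ = shownTime;
        totalLatency_ += shownTime - readTime;
        ++frames_;
    }

    // Average time from reading a frame to displaying it, in milliseconds.
    double averageLatencyMs() const {
        return frames_ == 0 ? 0.0 : totalLatency_.count() / frames_;
    }

    // Frames processed per second of elapsed time since the first read.
    double averageFps() const {
        const double seconds =
            std::chrono::duration<double>(lastShown_ - firstRead_).count();
        return seconds > 0.0 ? frames_ / seconds : 0.0;
    }

private:
    std::size_t frames_ = 0;
    Ms totalLatency_{0.0};
    Clock::time_point firstRead_{};
    Clock::time_point lastShown_{};
};
```

In a pipelined application several frames are in flight at once, so the average latency can be much larger than the inverse of the FPS. Reporting both values makes that distinction visible.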