Convert a PyTorch Model to OpenVINO™ IR¶


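This tutorial demonstrates step by step how to run inference on a PyTorch classification model using OpenVINO Runtime. Starting with the OpenVINO 2023.0 release, OpenVINO supports direct conversion of PyTorch models, without an intermediate step of converting them to ONNX format. If you use an earlier version of OpenVINO or prefer to use ONNX, check the separate tutorial that covers the ONNX path.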

In this tutorial, we use the RegNetY_800MF model from torchvision to demonstrate how to convert PyTorch models to OpenVINO Intermediate Representation.

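The RegNet model was proposed in Designing Network Design Spaces by Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, and Piotr Dollár. The authors design search spaces for neural architecture search (NAS). They start from a high-dimensional search space and iteratively reduce it by empirically applying constraints based on the best-performing models sampled from the current search space. Instead of focusing on the design of individual network instances, the authors design network design spaces that parametrize populations of networks. The overall process is analogous to the classic manual design of networks, but elevated to the design-space level. The RegNet design space provides simple and fast networks that work well across a wide range of flop regimes.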

Prerequisites¶

Install the notebook dependencies.

%pip install -q "openvino>=2023.1.0" scipy

                                        Note: you may need to restart the kernel to use updated packages.

                                    

Download the input data and label map.

import requests
from pathlib import Path
from PIL import Image

MODEL_DIR = Path("model")
DATA_DIR = Path("data")

MODEL_DIR.mkdir(exist_ok=True)
DATA_DIR.mkdir(exist_ok=True)
MODEL_NAME = "regnet_y_800mf"

image = Image.open(requests.get("https://farm9.staticflickr.com/8225/8511402100_fea15da1c5_z.jpg", stream=True).raw)

labels_file = DATA_DIR / "imagenet_2012.txt"

if not labels_file.exists():
    resp = requests.get("https://raw.githubusercontent.com/openvinotoolkit/open_model_zoo/master/data/dataset_classes/imagenet_2012.txt")
    with labels_file.open("wb") as f:
        f.write(resp.content)

# read_text() closes the file after reading
imagenet_classes = labels_file.read_text().splitlines()
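Later cells split each label line at the first space and keep only the second part. The following is a minimal sketch of that parsing, assuming each line has the form `<id> <label>`. The sample lines and the `parse_label` helper are made up for illustration and are not taken from the downloaded file.

```python
# Hypothetical sample lines in the assumed "<id> <label>" layout.
sample_lines = [
    "n00000001 tiger cat",
    "n00000002 computer keyboard, keypad",
]


def parse_label(line: str) -> str:
    """Drop the leading identifier and return the human-readable label."""
    _, label = line.split(" ", 1)  # split only at the first space
    return label


parsed = [parse_label(line) for line in sample_lines]
print(parsed)
```

Limiting the split to one occurrence keeps labels that themselves contain spaces or commas intact.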

                                    

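Load PyTorch Model¶

Generally, a PyTorch model is an instance of the torch.nn.Module class, initialized by a state dictionary containing the model weights.

Typical steps for obtaining a pretrained model:

Create an instance of the model class.
Load a checkpoint state dictionary that contains the pretrained model weights.
Switch the model to evaluation mode so that some operations change to inference behavior.

The torchvision module provides a ready-to-use set of functions for initializing model classes. Here we use torchvision.models.regnet_y_800mf. The pretrained weights can be passed directly to the model initialization function using the weights enum RegNet_Y_800MF_Weights.DEFAULT.

import torchvision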

# get default weights using available weights Enum for model
weights = torchvision.models.RegNet_Y_800MF_Weights.DEFAULT

# create model topology and load weights
model = torchvision.models.regnet_y_800mf(weights=weights)

# switch model to inference mode
model.eval();

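Prepare Input Data¶

The code below shows how to preprocess the input data using the model-specific transforms module from torchvision. After transformation, images are normally concatenated into a batched tensor. Here the model runs with batch size 1, so we only unsqueeze the input along the first dimension.

import torch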

# Initialize the Weight Transforms
preprocess = weights.transforms()

# Apply it to the input image
img_transformed = preprocess(image)

# Add batch dimension to image tensor
input_tensor = img_transformed.unsqueeze(0)
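To make the effect of unsqueeze(0) concrete, here is a plain-Python sketch that uses nested lists instead of tensors. The `shape_of` helper and the tiny 3x2x2 "image" are hypothetical and exist only for this illustration.

```python
def shape_of(nested):
    """Return the shape of a regularly nested, non-empty list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


# A tiny stand-in for a preprocessed image: 3 channels, 2 rows, 2 columns.
image_chw = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]

# Wrapping the image in one more list adds a leading batch dimension of size 1,
# which mirrors what unsqueeze(0) does for a tensor.
batch_nchw = [image_chw]

print(shape_of(image_chw))   # (3, 2, 2)
print(shape_of(batch_nchw))  # (1, 3, 2, 2)
```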

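Run PyTorch Model Inference¶

The model returns a vector of raw logits, one per class. Applying softmax normalizes them to probabilities in the [0, 1] range. To show that the outputs of the original model and the converted OpenVINO model are the same, we define a common postprocessing function that is reused later.

import numpy as np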
from scipy.special import softmax

# Perform model inference on input tensor
result = model(input_tensor)

# Postprocessing function for getting results in the same way for both PyTorch model inference and OpenVINO
def postprocess_result(output_tensor: np.ndarray, top_k: int = 5):
    """
    Postprocess model results. This function applies softmax to the output tensor and returns the top_k labels with the highest probability.
    Parameters:
      output_tensor (np.ndarray): model output tensor with raw scores (logits)
      top_k (int, *optional*, default 5): number of labels with highest probability to return
    Returns:
      topk_labels: label ids for selected top_k scores
      topk_scores: selected top_k highest scores predicted by model
    """
    softmaxed_scores = softmax(output_tensor, -1)[0]
    topk_labels = np.argsort(softmaxed_scores)[-top_k:][::-1]
    topk_scores = softmaxed_scores[topk_labels]
    return topk_labels, topk_scores

# Postprocess results
top_labels, top_scores = postprocess_result(result.detach().numpy())

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100 :.2f}%")

                                        

../_images/102-pytorch-to-openvino-with-output_11_0.png

1: tiger cat - 25.91%
2: Egyptian cat - 10.26%
3: computer keyboard, keypad - 9.22%
4: tabby, tabby cat - 9.09%
5: hamper - 2.35%
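The postprocessing step can be followed by hand on a small example. The sketch below reimplements softmax and top-k selection with the standard library only, on made-up logits for four classes. It illustrates the idea and is not a replacement for the scipy/numpy version used in this notebook.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, k):
    """Return (index, probability) pairs for the k largest probabilities."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


logits = [2.0, 1.0, 0.1, -1.0]  # made-up values for four classes
probs = softmax(logits)
best = top_k(probs, 2)
print(best)
```

Softmax preserves the ordering of the logits, so the top-ranked classes are the same before and after normalization. Only the scale of the scores changes.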

                                        

Benchmark PyTorch Model Inference¶

%%timeit

# Run model inference
model(input_tensor)

17.6 ms ± 52.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
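%%timeit is an IPython cell magic, so it is not available in a plain Python script. A rough equivalent can be sketched with the standard timeit module. The `run_inference` function below is a placeholder workload and an assumption of this example; in practice its body would call the model.

```python
import timeit


def run_inference():
    """Placeholder workload; replace the body with model(input_tensor)."""
    return sum(i * i for i in range(1000))


# 7 runs of 100 loops each, matching the layout of the report shown above.
runs = timeit.repeat(run_inference, repeat=7, number=100)
per_loop_ms = [run / 100 * 1000 for run in runs]
mean_ms = sum(per_loop_ms) / len(per_loop_ms)
print(f"{mean_ms:.3f} ms per loop (mean of {len(runs)} runs, 100 loops each)")
```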

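Convert PyTorch Model to OpenVINO Intermediate Representation¶

Starting with the 2023.0 release, OpenVINO can convert PyTorch models directly to the OpenVINO Intermediate Representation (IR) format. The OpenVINO model conversion API is used for this purpose. For details on PyTorch model conversion, see the OpenVINO documentation.

The convert_model function accepts a PyTorch model object and returns an openvino.Model instance that is ready to be loaded on a device with core.compile_model or saved to disk for later use with ov.save_model. Optionally, you can provide additional parameters such as:

compress_to_fp16 - flag that compresses model weights to the FP16 data format. This reduces the space required to store the model on disk and may speed up inference on devices that support FP16 computation. With the openvino>=2023.1 package installed above, this flag is passed to ov.save_model.
example_input - input data sample that can be used for model tracing.
input - the shape of the input tensor for conversion (input_shape in the legacy conversion API).

Other advanced options supported by the model conversion Python API are described in the OpenVINO documentation.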
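The trade-off behind FP16 weight compression can be illustrated without OpenVINO. The sketch below packs a few made-up weight values as IEEE 754 single precision and half precision using the standard struct module. It shows only the storage and rounding effect of half precision and makes no claim about how OpenVINO implements compression internally.

```python
import struct

weights = [0.1, -1.5, 3.14159, 1000.123]  # made-up weight values

# "<f" packs IEEE 754 single precision (4 bytes), "<e" half precision (2 bytes).
fp32_bytes = b"".join(struct.pack("<f", w) for w in weights)
fp16_bytes = b"".join(struct.pack("<e", w) for w in weights)

# Decode the half-precision bytes to see the rounding that compression introduces.
restored = [
    struct.unpack("<e", fp16_bytes[i:i + 2])[0]
    for i in range(0, len(fp16_bytes), 2)
]

print(len(fp32_bytes), len(fp16_bytes))
for original, approx in zip(weights, restored):
    print(f"{original} -> {approx}")
```

Storage is halved, and each value is reproduced to roughly three significant decimal digits, which is often acceptable for inference.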

import openvino as ov

# Create OpenVINO Core object instance
core = ov.Core()

# Convert model to openvino.runtime.Model object
ov_model = ov.convert_model(model)

# Save openvino.runtime.Model object on disk
ov.save_model(ov_model, MODEL_DIR / f"{MODEL_NAME}_dynamic.xml")

ov_model

                                        <Model: 'Model30'
inputs[
<ConstOutput: names[x] shape[?,3,?,?] type: f32>
]
outputs[
<ConstOutput: names[x.21] shape[?,1000] type: f32>
]>
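ov.save_model writes the IR as a pair of files: the .xml file named in the call, which describes the topology, and a .bin file with the same base name, which holds the weights. The sketch below derives the companion path with pathlib only. The directory and model name repeat the values defined earlier in this notebook.

```python
from pathlib import Path

MODEL_DIR = Path("model")
MODEL_NAME = "regnet_y_800mf"

xml_path = MODEL_DIR / f"{MODEL_NAME}_dynamic.xml"
bin_path = xml_path.with_suffix(".bin")  # companion weights file

print(xml_path.name, bin_path.name)
```

Both files need to stay together when the saved model is copied elsewhere and later loaded again, for example with core.read_model.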

                                    

Select Inference Device¶

Select a device from the dropdown list to run inference using OpenVINO.

import ipywidgets as widgets

device = widgets.Dropdown(
    options=core.available_devices + ["AUTO"],
    value='AUTO',
    description='Device:',
    disabled=False,
)

device

                                        

                                            Dropdown(description='Device:', index=1, options=('CPU', 'AUTO'), value='AUTO')

                                        

                                            # Load OpenVINO model on device
compiled_model = core.compile_model(ov_model, device.value)
compiled_model

                                        

                                            <CompiledModel:
inputs[
<ConstOutput: names[x] shape[?,3,?,?] type: f32>
]
outputs[
<ConstOutput: names[x.21] shape[?,1000] type: f32>
]>

                                        

Run OpenVINO Model Inference¶

# Run model inference
result = compiled_model(input_tensor)[0]

# Postprocess results
top_labels, top_scores = postprocess_result(result)

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100 :.2f}%")

                                        

../_images/102-pytorch-to-openvino-with-output_20_0.png

1: tiger cat - 25.91%
2: Egyptian cat - 10.26%
3: computer keyboard, keypad - 9.22%
4: tabby, tabby cat - 9.09%
5: hamper - 2.35%

                                        

Benchmark OpenVINO Model Inference¶

%%timeit

compiled_model(input_tensor)

3.42 ms ± 7.33 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

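Convert PyTorch Model with Static Input Shape¶

The default conversion path preserves dynamic input shapes. To convert the model with a static shape, either specify the shape explicitly during conversion with the input parameter, or reshape the model to the desired shape after conversion. For an example of model reshaping, check the separate tutorial on that topic.

# Convert model to openvino.runtime.Model object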
ov_model = ov.convert_model(model, input=[[1,3,224,224]])
# Save openvino.runtime.Model object on disk
ov.save_model(ov_model, MODEL_DIR / f"{MODEL_NAME}_static.xml")
ov_model

                                    

                                        <Model: 'Model66'
inputs[
<ConstOutput: names[x] shape[1,3,224,224] type: f32>
]
outputs[
<ConstOutput: names[x.21] shape[1,1000] type: f32>
]>

                                    

Select Inference Device¶

Select a device from the dropdown list to run inference using OpenVINO.

device

                                            Dropdown(description='Device:', index=1, options=('CPU', 'AUTO'), value='AUTO')

                                        

                                            # Load OpenVINO model on device
compiled_model = core.compile_model(ov_model, device.value)
compiled_model

                                        

                                            <CompiledModel:
inputs[
<ConstOutput: names[x] shape[1,3,224,224] type: f32>
]
outputs[
<ConstOutput: names[x.21] shape[1,1000] type: f32>
]>

                                        

Now we can see that the input of the converted model is a tensor of shape [1, 3, 224, 224] instead of the [?, 3, ?, ?] reported by the previously converted model.
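The practical difference between the two shapes is which inputs each model accepts. The sketch below models that rule in plain Python, using None for a dynamic dimension (printed as ? above). The `is_compatible` helper is hypothetical and is not part of the OpenVINO API.

```python
def is_compatible(partial_shape, concrete_shape):
    """Check a concrete shape against a shape in which None marks a dynamic dimension."""
    if len(partial_shape) != len(concrete_shape):
        return False
    return all(p is None or p == c for p, c in zip(partial_shape, concrete_shape))


dynamic_shape = [None, 3, None, None]  # reported as [?,3,?,?]
static_shape = [1, 3, 224, 224]

print(is_compatible(dynamic_shape, [1, 3, 224, 224]))  # True
print(is_compatible(dynamic_shape, [4, 3, 320, 320]))  # True
print(is_compatible(static_shape, [4, 3, 320, 320]))   # False
```

A static shape gives up this flexibility in exchange for giving the runtime full shape information at compile time.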

Run OpenVINO Model Inference with Static Input Shape¶

# Run model inference
result = compiled_model(input_tensor)[0]

# Postprocess results
top_labels, top_scores = postprocess_result(result)

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100 :.2f}%")

                                        

../_images/102-pytorch-to-openvino-with-output_31_0.png

1: tiger cat - 25.91%
2: Egyptian cat - 10.26%
3: computer keyboard, keypad - 9.22%
4: tabby, tabby cat - 9.09%
5: hamper - 2.35%

                                        

Benchmark OpenVINO Model Inference with Static Input Shape¶

%%timeit

compiled_model(input_tensor)

2.9 ms ± 17.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
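The three timings reported so far can be put side by side with a little arithmetic. The sketch below uses only the mean latencies printed by the %%timeit cells above. The figures come from the machine the notebook was run on and will differ on other hardware.

```python
# Mean latencies in milliseconds, copied from the %%timeit outputs above.
latency_ms = {
    "PyTorch": 17.6,
    "OpenVINO (dynamic shape)": 3.42,
    "OpenVINO (static shape)": 2.9,
}

baseline = latency_ms["PyTorch"]
speedup = {name: baseline / value for name, value in latency_ms.items()}

for name, value in latency_ms.items():
    print(f"{name}: {value} ms, {speedup[name]:.2f}x vs PyTorch, {1000 / value:.0f} inferences/s")
```

On these numbers, the converted model is roughly 5 times faster than the PyTorch model with a dynamic shape and roughly 6 times faster with a static shape.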

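Convert TorchScript Model to OpenVINO Intermediate Representation¶

TorchScript is a way to create serializable and optimizable models from PyTorch code. Any TorchScript program can be saved from a Python process and loaded in a process that has no Python dependency. For more details about TorchScript, see the PyTorch documentation.

There are two ways to convert a PyTorch model to TorchScript:

torch.jit.script - scripting a function or nn.Module inspects the source code, compiles it as TorchScript code using the TorchScript compiler, and returns a ScriptModule or ScriptFunction.
torch.jit.trace - traces a function and returns an executable or ScriptFunction that is optimized using just-in-time compilation.

Let us consider both approaches and their conversion to OpenVINO IR.

Scripted Model¶

torch.jit.script inspects the model source code and compiles it to a ScriptModule. After compilation, the model can be used for inference, or saved to disk with torch.jit.save and later restored with torch.jit.load in another environment without the original PyTorch model code definition.

TorchScript itself is a subset of the Python language, so not all Python features work, but TorchScript provides enough functionality to compute on tensors and perform control-dependent operations. For a complete guide, see the TorchScript language reference.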

                                            # Get model path
scripted_model_path = MODEL_DIR / f"{MODEL_NAME}_scripted.pth"

# Compile and save model if it has not been compiled before or load compiled model
if not scripted_model_path.exists():
    scripted_model = torch.jit.script(model)
    torch.jit.save(scripted_model, scripted_model_path)
else:
    scripted_model = torch.jit.load(scripted_model_path)

# Run scripted model inference
result = scripted_model(input_tensor)

# Postprocess results
top_labels, top_scores = postprocess_result(result.detach().numpy())

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100 :.2f}%")

                                        

../_images/102-pytorch-to-openvino-with-output_35_0.png

1: tiger cat - 25.91%
2: Egyptian cat - 10.26%
3: computer keyboard, keypad - 9.22%
4: tabby, tabby cat - 9.09%
5: hamper - 2.35%

                                        

Benchmark Scripted Model Inference¶

%%timeit

scripted_model(input_tensor)

13 ms ± 50.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Convert PyTorch Scripted Model to OpenVINO Intermediate Representation¶

The conversion step for a scripted model to OpenVINO IR is similar to that for the original PyTorch model.

                                            # Convert model to openvino.runtime.Model object
ov_model = ov.convert_model(scripted_model)

# Load OpenVINO model on device
compiled_model = core.compile_model(ov_model, device.value)

# Run OpenVINO model inference
result = compiled_model(input_tensor)[0]

# Postprocess results
top_labels, top_scores = postprocess_result(result)

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100 :.2f}%")

                                        

../_images/102-pytorch-to-openvino-with-output_39_0.png

1: tiger cat - 25.91%
2: Egyptian cat - 10.26%
3: computer keyboard, keypad - 9.22%
4: tabby, tabby cat - 9.09%
5: hamper - 2.35%

                                        

Benchmark OpenVINO Model Inference Converted From Scripted Model¶

%%timeit

compiled_model(input_tensor)

3.42 ms ± 6.53 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Traced Model¶

torch.jit.trace converts an existing module or Python function to a TorchScript ScriptFunction or ScriptModule. You must provide example inputs; the model is then run, and the operations performed on all tensors are recorded.

Tracing a standalone function produces a ScriptFunction.
Tracing nn.Module.forward or an nn.Module produces a ScriptModule.
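Because tracing records only the operations that actually ran on the example input, data-dependent control flow is frozen into the recorded path. The toy sketch below shows this effect in plain Python. It is a conceptual illustration with hypothetical `Traced`, `trace`, and `model` names, not the torch.jit implementation.

```python
class Traced:
    """Wraps a number and records every arithmetic operation applied to it."""

    def __init__(self, value, tape):
        self.value = value
        self.tape = tape

    def __mul__(self, other):
        self.tape.append(("mul", other))
        return Traced(self.value * other, self.tape)

    def __add__(self, other):
        self.tape.append(("add", other))
        return Traced(self.value + other, self.tape)


def model(x):
    """Toy function with data-dependent control flow."""
    value = x.value if isinstance(x, Traced) else x
    if value > 0:
        return x * 2
    return x + 100


def trace(fn, example_input):
    """Run fn once on the example and return a function that replays the recorded ops."""
    tape = []
    fn(Traced(example_input, tape))

    def replay(x):
        for op, operand in tape:
            x = x * operand if op == "mul" else x + operand
        return x

    return replay


traced = trace(model, example_input=3)  # only the "value > 0" branch is recorded

print(model(3), traced(3))    # 6 6
print(model(-1), traced(-1))  # 99 -2
```

The replayed function disagrees with the original for inputs that would take the other branch. Scripting avoids this by compiling the source code, including its control flow, rather than recording one execution.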

Like scripted models, traced models can be used for inference, or saved to disk with torch.jit.save and later restored with torch.jit.load in another environment without the original PyTorch model code definition.

                                            # Get model path
traced_model_path = MODEL_DIR / f"{MODEL_NAME}_traced.pth"

# Trace and save model if it has not been traced before or load traced model
if not traced_model_path.exists():
    traced_model = torch.jit.trace(model, example_inputs=input_tensor)
    torch.jit.save(traced_model, traced_model_path)
else:
    traced_model = torch.jit.load(traced_model_path)

# Run traced model inference
result = traced_model(input_tensor)

# Postprocess results
top_labels, top_scores = postprocess_result(result.detach().numpy())

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100 :.2f}%")

                                        

../_images/102-pytorch-to-openvino-with-output_43_0.png

1: tiger cat - 25.91%
2: Egyptian cat - 10.26%
3: computer keyboard, keypad - 9.22%
4: tabby, tabby cat - 9.09%
5: hamper - 2.35%

                                        

Benchmark Traced Model Inference¶

%%timeit

traced_model(input_tensor)

13.4 ms ± 4.67 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Convert PyTorch Traced Model to OpenVINO Intermediate Representation¶

The conversion step for a traced model to OpenVINO IR is similar to that for the original PyTorch model.

                                            # Convert model to openvino.runtime.Model object
ov_model = ov.convert_model(traced_model)

# Load OpenVINO model on device
compiled_model = core.compile_model(ov_model, device.value)

# Run OpenVINO model inference
result = compiled_model(input_tensor)[0]

# Postprocess results
top_labels, top_scores = postprocess_result(result)

# Show results
display(image)
for idx, (label, score) in enumerate(zip(top_labels, top_scores)):
    _, predicted_label = imagenet_classes[label].split(" ", 1)
    print(f"{idx + 1}: {predicted_label} - {score * 100 :.2f}%")

                                        

../_images/102-pytorch-to-openvino-with-output_47_0.png

1: tiger cat - 25.91%
2: Egyptian cat - 10.26%
3: computer keyboard, keypad - 9.22%
4: tabby, tabby cat - 9.09%
5: hamper - 2.35%

                                        

Benchmark OpenVINO Model Inference Converted From Traced Model¶

%%timeit

compiled_model(input_tensor)[0]

3.47 ms ± 10.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)