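Programming Language Classification with OpenVINO

This Jupyter notebook can be launched online, opening an interactive environment in a browser window. You can also install it locally. Choose one of the following options.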

Binder GitHub

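Overview

This tutorial is divided into two parts:

  1. Create a simple inference pipeline with a pre-trained model using the OpenVINO™ IR format.
  2. Conduct post-training quantization on the pre-trained model using Hugging Face Optimum and benchmark its performance.

Feel free to use the notebook outline in Jupyter or your IDE for easy navigation.

Introduction

Programming language classification is the task of identifying which programming language is used in an arbitrary code snippet. This is useful for labeling new data to include in a dataset, and it can serve as an intermediate step when input snippets need to be processed according to their programming language.

Because each programming language has its own formal symbols, syntax, and grammar, this is a relatively easy machine learning task. However, there are a few potential edge cases: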

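  • Ambiguous short snippets: For example, TypeScript is a superset of JavaScript, meaning it can do everything JavaScript can and more. For a short input snippet, it may be impossible to distinguish between the two. Given that we know TypeScript is a superset of JavaScript and the model does not, the input should default to JavaScript in a post-processing step.
  • Nested programming languages: Some languages are typically used in tandem. For example, most HTML contains CSS and JavaScript, and it is not uncommon to see SQL nested in other scripting languages. For such input, it is unclear what the expected output class should be.
  • Evolving programming languages: Even though programming languages are formal, their symbols, syntax, and grammar can be revised and updated. For example, the walrus operator (:=) was a symbol distinctively used in Golang, but was later introduced in Python 3.8.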
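The post-processing step mentioned for the TypeScript/JavaScript case could look roughly like the sketch below. It assumes a hypothetical classifier that can emit a "typescript" label (the model used in this notebook does not), and the names SUPERSET_FALLBACKS and resolve_label are illustrative only.

```python
from typing import Dict, Optional

# Hypothetical mapping: a superset language falls back to the language it extends.
SUPERSET_FALLBACKS: Dict[str, str] = {"typescript": "javascript"}


def resolve_label(predicted_label: str, fallbacks: Optional[Dict[str, str]] = None) -> str:
    """Return the fallback language for a predicted label, or the label itself."""
    fallbacks = SUPERSET_FALLBACKS if fallbacks is None else fallbacks
    label = predicted_label.lower()
    return fallbacks.get(label, label)


print(resolve_label("TypeScript"))
print(resolve_label("python"))
```

Keeping the rule in a plain dictionary makes it easy to review and extend without touching the model.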

The classification model used in this notebook is CodeBERTa-language-id by Hugging Face. It was fine-tuned from the masked language model CodeBERTa-small-v1, which was trained on the CodeSearchNet dataset (Husain, 2019).

It supports 6 programming languages:

  • Go
  • Java
  • JavaScript
  • PHP
  • Python
  • Ruby

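Part 1: Inference pipeline with OpenVINO

This section uses the Hugging Face Optimum library, which aims to optimize inference on specific hardware and integrates with the OpenVINO toolkit. The code is very similar to Hugging Face Transformers, but it can automatically convert models to the OpenVINO™ IR format.

First, complete the repository installation steps.

The following cell installs:

  • Hugging Face Optimum with OpenVINO support
  • Hugging Face Evaluate to benchmark results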
%pip install -q "diffusers>=0.17.1" "openvino>=2023.1.0" "nncf>=2.5.0" "gradio" "onnx>=1.11.0" "transformers>=4.33.0" "evaluate" --extra-index-url https://download.pytorch.org/whl/cpu
%pip install -q "git+https://github.com/huggingface/optimum-intel.git"

Importing OVModelForSequenceClassification from Optimum is equivalent to importing AutoModelForSequenceClassification from Transformers.

from functools import partial
from pathlib import Path

import pandas as pd
from datasets import load_dataset, Dataset
import evaluate
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
from optimum.intel import OVModelForSequenceClassification
from optimum.intel.openvino import OVConfig, OVQuantizer
from huggingface_hub.utils import RepositoryNotFoundError

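Hugging Face resources are downloaded to a local folder, ./model (next to this notebook), rather than to the device's global cache, so that they are easy to clean up. See the Hugging Face documentation for details.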

MODEL_NAME = "CodeBERTa-language-id"
MODEL_ID = f"huggingface/{MODEL_NAME}"
MODEL_LOCAL_PATH = Path("./model").joinpath(MODEL_NAME)

Select a device from the dropdown list to run inference with OpenVINO.

import ipywidgets as widgets
import openvino as ov

core = ov.Core()

device = widgets.Dropdown(
    options=core.available_devices + ["AUTO"],
    value='AUTO',
    description='Device:',
    disabled=False,
)

device
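When the notebook is run as a plain script, the dropdown widget is not available. The sketch below shows one way to pick a device name without ipywidgets. The helper pick_device is hypothetical; in the notebook the list of devices would come from ov.Core().available_devices, and the resulting string would be passed wherever device.value is used.

```python
from typing import Sequence


def pick_device(available: Sequence[str], preferred: str = "AUTO") -> str:
    """Return `preferred` if it is a valid choice, otherwise fall back to "AUTO"."""
    # Mirror the dropdown: every detected device plus the "AUTO" option.
    choices = list(available) + ["AUTO"]
    return preferred if preferred in choices else "AUTO"


# Hard-coded device lists stand in for `ov.Core().available_devices`.
print(pick_device(["CPU"], preferred="GPU"))
print(pick_device(["CPU"], preferred="CPU"))
```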
# try to load resources locally
try:
    model = OVModelForSequenceClassification.from_pretrained(MODEL_LOCAL_PATH, device=device.value)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_LOCAL_PATH)
    print(f"Loaded resources from local path: {MODEL_LOCAL_PATH.absolute()}")

# if not found, download from HuggingFace Hub then save locally
except (RepositoryNotFoundError, OSError):
    print("Downloading resources from HuggingFace Hub")
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    tokenizer.save_pretrained(MODEL_LOCAL_PATH)

    # export=True is needed to convert the PyTorch model to OpenVINO
    model = OVModelForSequenceClassification.from_pretrained(MODEL_ID, export=True, device=device.value)
    model.save_pretrained(MODEL_LOCAL_PATH)
    print(f"Resources cached locally at: {MODEL_LOCAL_PATH.absolute()}")
code_classification_pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
# change input snippet to test model
input_snippet = "df['speed'] = df.distance / df.time"
output = code_classification_pipe(input_snippet)

print(f"Input snippet:\n  {input_snippet}\n")
print(f"Predicted label: {output[0]['label']}")
print(f"Predicted score: {output[0]['score']:.2}")
Input snippet:
  df['speed'] = df.distance / df.time

Predicted label: python
Predicted score: 0.81
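The score printed above can be read as a class probability. The sketch below assumes the pipeline applies a softmax over the six class logits and reports the largest value, which is the usual behaviour for single-label text classification but is treated here as an assumption. The logits and the label order are made up for illustration.

```python
import math
from typing import Dict, List, Union

# Assumed label order; the real order is defined by the model configuration.
LABELS = ["go", "java", "javascript", "php", "python", "ruby"]


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits: List[float]) -> Dict[str, Union[str, float]]:
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": LABELS[best], "score": probs[best]}


example_logits = [0.1, 0.3, 0.2, 0.0, 2.5, 0.4]  # made-up values
prediction = top_prediction(example_logits)
print(prediction["label"], round(prediction["score"], 2))
```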

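Part 2: OpenVINO post-training quantization with Hugging Face Optimum

In this section, we quantize a trained model. At a high level, quantization uses lower-precision numbers in the model, which reduces the model size and speeds up inference, at the cost of a potential marginal degradation in accuracy.

The Hugging Face Optimum library supports post-training quantization for OpenVINO.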
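To make "lower-precision numbers" concrete, here is a minimal sketch of symmetric 8-bit quantization of a list of floats. It illustrates the general idea only and is not the scheme that NNCF or OpenVINO actually applies (which may be asymmetric, per-channel, and so on). The function names and sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to the range [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.02, -1.27, 0.5, 0.8]  # made-up values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

Each value is stored in one byte instead of four, and the round-trip error stays within half a quantization step. This is the trade-off between size, speed, and accuracy described above.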

QUANTIZED_MODEL_LOCAL_PATH = MODEL_LOCAL_PATH.with_name(f"{MODEL_NAME}-quantized")
DATASET_NAME = "code_search_net"
LABEL_MAPPING = {"go": 0, "java": 1, "javascript": 2, "php": 3, "python": 4, "ruby": 5}


def preprocess_function(examples: dict, tokenizer):
    """Preprocess inputs by tokenizing the `func_code_string` column"""
    return tokenizer(
        examples["func_code_string"],
        padding="max_length",
        max_length=tokenizer.model_max_length,
        truncation=True,
    )


def map_labels(example: dict) -> dict:
    """Convert string labels to integers"""
    # Reuse the module-level LABEL_MAPPING instead of redefining it here
    example["language"] = LABEL_MAPPING[example["language"]]
    return example


def get_dataset_sample(dataset_split: str, num_samples: int) -> Dataset:
    """Create a sample with equal representation of each class without downloading the entire data"""
    labels = ["go", "java", "javascript", "php", "python", "ruby"]
    example_per_label = num_samples // len(labels)

    examples = []
    for label in labels:
        subset = load_dataset("code_search_net", split=dataset_split, name=label, streaming=True)
        subset = subset.map(map_labels)
        examples.extend([example for example in subset.shuffle().take(example_per_label)])

    return Dataset.from_list(examples)
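The preprocess_function above asks the tokenizer for padding="max_length" and truncation=True, so every example ends up with the same length. The sketch below shows that effect on a plain list of token IDs. The helper name and the pad ID of 0 are placeholders, and real tokenizers also handle special tokens, which this sketch ignores.

```python
from typing import Dict, List


def pad_or_truncate(token_ids: List[int], max_length: int, pad_id: int = 0) -> Dict[str, List[int]]:
    """Cut a sequence to `max_length`, or right-pad it with `pad_id`."""
    ids = token_ids[:max_length]
    padding = max_length - len(ids)
    return {
        "input_ids": ids + [pad_id] * padding,
        # 1 marks real tokens, 0 marks padding the model should ignore
        "attention_mask": [1] * len(ids) + [0] * padding,
    }


print(pad_or_truncate([5, 6, 7], max_length=5))
print(pad_or_truncate([1, 2, 3, 4, 5, 6], max_length=4))
```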

Note: The base model is loaded with AutoModelForSequenceClassification from Transformers.

tokenizer = AutoTokenizer.from_pretrained(MODEL_LOCAL_PATH)
base_model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)

quantizer = OVQuantizer.from_pretrained(base_model)
quantization_config = OVConfig()

The get_dataset_sample() function samples up to num_samples examples, with an equal number of examples for each of the six programming languages.

Note: To download and use the full dataset (5+ GB), uncomment the code below.
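The sampling arithmetic can be illustrated without downloading anything. In the sketch below, fake_stream is a hypothetical stand-in for a streamed dataset split, and balanced_sample takes the first num_samples // number_of_labels items from each stream. It does not shuffle, unlike the notebook function.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def balanced_sample(streams: Dict[str, Iterable[dict]], num_samples: int) -> List[dict]:
    """Take the same number of examples from each label's stream."""
    per_label = num_samples // len(streams)
    sample: List[dict] = []
    for stream in streams.values():
        # islice stops early, so an endless stream is never fully consumed
        sample.extend(islice(stream, per_label))
    return sample


def fake_stream(label: str) -> Iterator[dict]:
    """Hypothetical endless stream of examples for one language."""
    index = 0
    while True:
        yield {"language": label, "func_code_string": f"snippet {index}"}
        index += 1


labels = ["go", "java", "javascript", "php", "python", "ruby"]
sample = balanced_sample({label: fake_stream(label) for label in labels}, num_samples=120)
print(len(sample))
```

Because of the integer division, a num_samples value that is not a multiple of six yields slightly fewer examples, which is why the result is "up to" num_samples.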

calibration_sample = get_dataset_sample(dataset_split="train", num_samples=120)
calibration_sample = calibration_sample.map(partial(preprocess_function, tokenizer=tokenizer))

# calibration_sample = quantizer.get_calibration_dataset(
#     DATASET_NAME,
#     preprocess_function=partial(preprocess_function, tokenizer=tokenizer),
#     num_samples=120,
#     dataset_split="train",
#     preprocess_batch=True,
# )

Calling quantizer.quantize(...) iterates through the calibration dataset to quantize the model, then saves it.
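As a rough intuition for why a calibration dataset is needed, the sketch below tracks the largest absolute activation seen across calibration batches and derives an 8-bit scale from it. This min-max approach is an assumption chosen for illustration. The statistics NNCF actually collects are richer, and the batch values here are invented.

```python
from typing import Iterable, List


def calibrate_scale(batches: Iterable[List[float]]) -> float:
    """Derive a symmetric int8 scale from the largest absolute value observed."""
    max_abs = 0.0
    for batch in batches:
        max_abs = max(max_abs, max(abs(v) for v in batch))
    return max_abs / 127 if max_abs > 0 else 1.0


# Invented activations recorded while running three calibration batches.
calibration_batches = [[0.1, -0.4], [2.54, 0.3], [-1.0, 0.9]]
scale = calibrate_scale(calibration_batches)
print(scale)
```

A calibration set that covers the value ranges seen in practice gives a scale that neither clips typical activations nor wastes resolution.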

quantizer.quantize(
    ov_config=quantization_config,
    calibration_dataset=calibration_sample,
    save_directory=QUANTIZED_MODEL_LOCAL_PATH,
)

Note: The quantized model is already in OpenVINO format, so the export=True argument is not needed.

quantized_model = OVModelForSequenceClassification.from_pretrained(QUANTIZED_MODEL_LOCAL_PATH, device=device.value)
quantized_code_classification_pipe = pipeline("text-classification", model=quantized_model, tokenizer=tokenizer)
input_snippet = "df['speed'] = df.distance / df.time"
output = quantized_code_classification_pipe(input_snippet)

print(f"Input snippet:\n  {input_snippet}\n")
print(f"Predicted label: {output[0]['label']}")
print(f"Predicted score: {output[0]['score']:.2}")
Input snippet:
  df['speed'] = df.distance / df.time

Predicted label: python
Predicted score: 0.82

Note: To download and use the full dataset (5+ GB), uncomment the code below.

validation_sample = get_dataset_sample(dataset_split="validation", num_samples=120)

# validation_sample = load_dataset(DATASET_NAME, split="validation")
# This class is needed due to a current limitation of the Evaluate library with multiclass metrics
# ref: https://discuss.huggingface.co/t/combining-metrics-for-multiclass-predictions-evaluations/21792/16
class ConfiguredMetric:
    def __init__(self, metric, *metric_args, **metric_kwargs):
        self.metric = metric
        self.metric_args = metric_args
        self.metric_kwargs = metric_kwargs

    def add(self, *args, **kwargs):
        return self.metric.add(*args, **kwargs)

    def add_batch(self, *args, **kwargs):
        return self.metric.add_batch(*args, **kwargs)

    def compute(self, *args, **kwargs):
        return self.metric.compute(*args, *self.metric_args, **kwargs, **self.metric_kwargs)

    @property
    def name(self):
        return self.metric.name

    def _feature_names(self):
        return self.metric._feature_names()
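The wrapper above exists so that an argument such as average='macro' can be bound to the F1 metric. As a reminder of what macro averaging computes, here is a small pure-Python sketch: F1 is calculated per class and then averaged with equal weight. The function name and sample labels are illustrative, and this is not the Evaluate library's implementation.

```python
from typing import List


def macro_f1(references: List[int], predictions: List[int]) -> float:
    """Average the per-class F1 scores, giving every class equal weight."""
    labels = sorted(set(references) | set(predictions))
    scores = []
    for label in labels:
        pairs = list(zip(references, predictions))
        tp = sum(1 for r, p in pairs if r == label and p == label)
        fp = sum(1 for r, p in pairs if r != label and p == label)
        fn = sum(1 for r, p in pairs if r == label and p != label)
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


refs = [0, 0, 1, 1, 2, 2]
preds = [0, 0, 1, 2, 2, 2]  # one class-1 example is misclassified as class 2
print(round(macro_f1(refs, preds), 4))
```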

First, an Evaluator object for text classification and a set of EvaluationModule objects are instantiated. Then, the evaluator's .compute() method is called on both the base code_classification_pipe and the quantized quantized_code_classification_pipe. Finally, the results are displayed.

code_classification_evaluator = evaluate.evaluator("text-classification")
# instantiate an object that can contain multiple `evaluate` metrics
metrics = evaluate.combine([
    ConfiguredMetric(evaluate.load('f1'), average='macro'),
])

base_results = code_classification_evaluator.compute(
    model_or_pipeline=code_classification_pipe,
    data=validation_sample,
    input_column="func_code_string",
    label_column="language",
    label_mapping=LABEL_MAPPING,
    metric=metrics,
)

quantized_results = code_classification_evaluator.compute(
    model_or_pipeline=quantized_code_classification_pipe,
    data=validation_sample,
    input_column="func_code_string",
    label_column="language",
    label_mapping=LABEL_MAPPING,
    metric=metrics,
)

results_df = pd.DataFrame.from_records([base_results, quantized_results], index=["base", "quantized"])
results_df
            f1  total_time_in_seconds  samples_per_second  latency_in_seconds
base       1.0               2.038191           58.875750            0.016985
quantized  1.0               2.669737           44.948249            0.022248
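The timing columns are related by simple arithmetic. The values in the table are consistent with 120 evaluated samples, the size of the validation sample used above. The sketch below assumes throughput is samples divided by total time and latency is total time divided by samples. The helper name is hypothetical.

```python
from typing import Dict


def throughput_stats(total_time_in_seconds: float, num_samples: int) -> Dict[str, float]:
    """Derive throughput and mean per-sample latency from a total run time."""
    return {
        "samples_per_second": num_samples / total_time_in_seconds,
        "latency_in_seconds": total_time_in_seconds / num_samples,
    }


base = throughput_stats(2.038191, num_samples=120)
quantized = throughput_stats(2.669737, num_samples=120)
print(base)
print(quantized)
```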

Cleanup

Uncomment and run the cell below to delete all resources cached locally in ./model.

# import os
# import shutil

# try:
#     shutil.rmtree(path=QUANTIZED_MODEL_LOCAL_PATH)
#     shutil.rmtree(path=MODEL_LOCAL_PATH)
#     os.remove(path="./compressed_graph.dot")
#     os.remove(path="./original_graph.dot")
# except FileNotFoundError:
#     print("Directory was already deleted")
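In the cell above, a single missing path raises FileNotFoundError and skips the remaining deletions. A slightly more forgiving variant is sketched below, under the assumption that silently skipping missing paths is acceptable. The helper remove_paths is hypothetical, and the demo runs in a temporary directory rather than on the notebook's real folders.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, Union


def remove_paths(paths: Iterable[Union[str, Path]]) -> None:
    """Delete each directory or file, ignoring paths that do not exist."""
    for path in map(Path, paths):
        if path.is_dir():
            shutil.rmtree(path, ignore_errors=True)
        else:
            path.unlink(missing_ok=True)  # requires Python 3.8+


# Demo in a throwaway directory
workdir = Path(tempfile.mkdtemp())
model_dir = workdir / "model"
model_dir.mkdir()
(model_dir / "weights.bin").write_text("placeholder")
graph_file = workdir / "original_graph.dot"
graph_file.write_text("digraph {}")
missing = workdir / "does_not_exist"

remove_paths([model_dir, graph_file, missing])
print(model_dir.exists(), graph_file.exists())
shutil.rmtree(workdir)
```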