Named Entity Recognition with OpenVINO™

This Jupyter notebook can be launched online, opening an interactive environment in a browser window. It can also be installed locally. Select one of the following options:

Google Colab GitHub

Named entity recognition (NER) is a natural language processing method that detects key information in unstructured text and classifies it into predefined categories. These categories, or named entities, refer to the key subjects of the text, such as names, locations, and companies.

NER is a good method for getting an overview of large volumes of text. It is useful for analyzing key information in unstructured text and for automating information extraction from large amounts of data.

This tutorial shows how to perform named entity recognition with OpenVINO. We use the pre-trained model elastic/distilbert-base-cased-finetuned-conll03-english, a DistilBERT-based model fine-tuned on the CoNLL-2003 English dataset. The model can recognize four types of named entities: persons, locations, organizations, and names of miscellaneous entities that do not belong to the previous three groups. The model is case-sensitive.

To simplify the user experience, the Hugging Face Optimum library is used to convert the model to OpenVINO™ IR format and to quantize it.

Prerequisites

%pip install -q "diffusers>=0.17.1" "openvino>=2023.1.0" "nncf>=2.5.0" "gradio" "onnx>=1.11.0" "transformers>=4.33.0" --extra-index-url https://download.pytorch.org/whl/cpu
%pip install -q "git+https://github.com/huggingface/optimum-intel.git"

Download the NER model

We load the distilbert-base-cased-finetuned-conll03-english model from the Hugging Face Hub with the Hugging Face Transformers library.

Model class initialization starts with calling the from_pretrained method. To easily save the model, you can use the save_pretrained() method.

from transformers import AutoTokenizer, AutoModelForTokenClassification

model_id = "elastic/distilbert-base-cased-finetuned-conll03-english"
model = AutoModelForTokenClassification.from_pretrained(model_id)

# Save the original PyTorch model locally so it can be compared with the quantized model later
original_ner_model_dir = 'original_ner_model'
model.save_pretrained(original_ner_model_dir)

tokenizer = AutoTokenizer.from_pretrained(model_id)
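
As an optional sanity check (not part of the original notebook), you can inspect the label set the model was fine-tuned with and run the original PyTorch model once before quantization. The labels follow the CoNLL-2003 BIO scheme for persons, organizations, locations, and miscellaneous entities.

from transformers import pipeline

# Inspect the entity labels the model predicts (CoNLL-2003 scheme: PER, ORG, LOC, MISC)
print(model.config.id2label)

# Quick test of the original (unquantized) PyTorch model
pt_ner_pipeline = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
print(pt_ner_pipeline("My name is Wolfgang and I live in Berlin."))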

Quantize the model using the Hugging Face Optimum API

Post-training static quantization introduces an additional calibration step, where data is fed through the network in order to compute the activation quantization parameters. The Hugging Face Optimum Intel API is used for quantization.

The OVQuantizer class is used for the NNCF quantization process. Quantization with the Hugging Face Optimum Intel API involves the following steps:

  • Model class initialization starts with calling the from_pretrained() method.
  • Next, get_calibration_dataset() is used to create a calibration dataset for the post-training static quantization calibration step.
  • After that, the model is quantized with the quantize() method and the resulting model is saved in the OpenVINO IR format to save_directory.
  • Then, the quantized model is loaded. Optimum Inference models are API compatible with Hugging Face Transformers models, so we only need to replace the AutoModelForXxx class with the corresponding OVModelForXxx class. Therefore, OVModelForTokenClassification is used to load the model.

from functools import partial
from optimum.intel import OVQuantizer

from optimum.intel import OVModelForTokenClassification

def preprocess_fn(data, tokenizer):
    examples = []
    for data_chunk in data["tokens"]:
        examples.append(' '.join(data_chunk))

    return tokenizer(
        examples, padding=True, truncation=True, max_length=128
    )

quantizer = OVQuantizer.from_pretrained(model)
calibration_dataset = quantizer.get_calibration_dataset(
    "conll2003",
    preprocess_function=partial(preprocess_fn, tokenizer=tokenizer),
    num_samples=100,
    dataset_split="train",
    preprocess_batch=True,
)

# The directory where the quantized model will be saved
quantized_ner_model_dir = "quantized_ner_model"

# Apply static quantization and save the resulting model in the OpenVINO IR format
quantizer.quantize(calibration_dataset=calibration_dataset, save_directory=quantized_ner_model_dir)

# Load the quantized model
optimized_model = OVModelForTokenClassification.from_pretrained(quantized_ner_model_dir)
INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, onnx, openvino
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
/home/ea/work/openvino_notebooks/test_env/lib/python3.8/site-packages/datasets/load.py:2089: FutureWarning: 'use_auth_token' was deprecated in favor of 'token' in version 2.14.0 and will be removed in 3.0.0.
You can remove this warning by passing 'token=False' instead.
  warnings.warn(
No configuration describing the quantization process was provided, a default OVConfig will be generated.
INFO:nncf:Not adding activation input quantizer for operation: 3 DistilBertForTokenClassification/DistilBertModel[distilbert]/Embeddings[embeddings]/NNCFEmbedding[position_embeddings]/embedding_0
INFO:nncf:Not adding activation input quantizer for operation: 2 DistilBertForTokenClassification/DistilBertModel[distilbert]/Embeddings[embeddings]/NNCFEmbedding[word_embeddings]/embedding_0
INFO:nncf:Not adding activation input quantizer for operation: 4 DistilBertForTokenClassification/DistilBertModel[distilbert]/Embeddings[embeddings]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 5 DistilBertForTokenClassification/DistilBertModel[distilbert]/Embeddings[embeddings]/NNCFLayerNorm[LayerNorm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 6 DistilBertForTokenClassification/DistilBertModel[distilbert]/Embeddings[embeddings]/Dropout[dropout]/dropout_0
INFO:nncf:Not adding activation input quantizer for operation: 16 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[0]/MultiHeadSelfAttention[attention]/__truediv___0
INFO:nncf:Not adding activation input quantizer for operation: 25 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[0]/MultiHeadSelfAttention[attention]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 30 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[0]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 31 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[0]/NNCFLayerNorm[sa_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 35 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[0]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 36 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[0]/NNCFLayerNorm[output_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 46 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[1]/MultiHeadSelfAttention[attention]/__truediv___0
INFO:nncf:Not adding activation input quantizer for operation: 55 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[1]/MultiHeadSelfAttention[attention]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 60 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[1]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 61 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[1]/NNCFLayerNorm[sa_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 65 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[1]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 66 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[1]/NNCFLayerNorm[output_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 76 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[2]/MultiHeadSelfAttention[attention]/__truediv___0
INFO:nncf:Not adding activation input quantizer for operation: 85 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[2]/MultiHeadSelfAttention[attention]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 90 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[2]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 91 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[2]/NNCFLayerNorm[sa_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 95 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[2]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 96 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[2]/NNCFLayerNorm[output_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 106 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[3]/MultiHeadSelfAttention[attention]/__truediv___0
INFO:nncf:Not adding activation input quantizer for operation: 115 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[3]/MultiHeadSelfAttention[attention]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 120 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[3]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 121 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[3]/NNCFLayerNorm[sa_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 125 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[3]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 126 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[3]/NNCFLayerNorm[output_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 136 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[4]/MultiHeadSelfAttention[attention]/__truediv___0
INFO:nncf:Not adding activation input quantizer for operation: 145 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[4]/MultiHeadSelfAttention[attention]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 150 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[4]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 151 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[4]/NNCFLayerNorm[sa_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 155 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[4]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 156 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[4]/NNCFLayerNorm[output_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 166 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[5]/MultiHeadSelfAttention[attention]/__truediv___0
INFO:nncf:Not adding activation input quantizer for operation: 175 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[5]/MultiHeadSelfAttention[attention]/matmul_1
INFO:nncf:Not adding activation input quantizer for operation: 180 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[5]/__add___0
INFO:nncf:Not adding activation input quantizer for operation: 181 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[5]/NNCFLayerNorm[sa_layer_norm]/layer_norm_0
INFO:nncf:Not adding activation input quantizer for operation: 185 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[5]/__add___1
INFO:nncf:Not adding activation input quantizer for operation: 186 DistilBertForTokenClassification/DistilBertModel[distilbert]/Transformer[transformer]/ModuleList[layer]/TransformerBlock[5]/NNCFLayerNorm[output_layer_norm]/layer_norm_0
INFO:nncf:Collecting tensor statistics |█               | 33 / 300
INFO:nncf:Collecting tensor statistics |███             | 66 / 300
INFO:nncf:Collecting tensor statistics |█████           | 99 / 300
INFO:nncf:Compiling and loading torch extension: quantized_functions_cpu...
INFO:nncf:Finished loading torch extension: quantized_functions_cpu
Using framework PyTorch: 2.1.0+cpu
/home/ea/work/openvino_notebooks/test_env/lib/python3.8/site-packages/nncf/torch/dynamic_graph/wrappers.py:82: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  result = operator(*args, **kwargs)
Configuration saved in quantized_ner_model/openvino_config.json
Compiling the model to CPU ...
Setting OpenVINO CACHE_DIR to quantized_ner_model/model_cache
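
Because the quantized model is stored on disk, the calibration and quantization steps do not need to be repeated in later sessions. Below is a minimal sketch (an addition, not part of the original notebook) of reloading the quantized model and tokenizer from scratch; the directory and model id match the ones used above.

from optimum.intel import OVModelForTokenClassification
from transformers import AutoTokenizer

# Reload the quantized OpenVINO model that was saved by quantizer.quantize()
optimized_model = OVModelForTokenClassification.from_pretrained("quantized_ner_model")

# The tokenizer is not saved next to the quantized model, so load it from the Hub again
tokenizer = AutoTokenizer.from_pretrained("elastic/distilbert-base-cased-finetuned-conll03-english")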

Compare the original and quantized models

Compare the original distilbert-base-cased-finetuned-conll03-english model with the model that was quantized and converted to OpenVINO IR format, and see the difference.

Compare performance

As the Optimum Inference models are API compatible with Hugging Face Transformers models, we can just use pipeline() from the Hugging Face Transformers API for inference.

from transformers import pipeline

ner_pipeline_optimized = pipeline("token-classification", model=optimized_model, tokenizer=tokenizer)

ner_pipeline_original = pipeline("token-classification", model=model, tokenizer=tokenizer)

import time
import numpy as np

def calc_perf(ner_pipeline):
    inference_times = []

    for data in calibration_dataset:
        text = ' '.join(data['tokens'])
        start = time.perf_counter()
        ner_pipeline(text)
        end = time.perf_counter()
        inference_times.append(end - start)

    return np.median(inference_times)


print(
    f"Median inference time of quantized model: {calc_perf(ner_pipeline_optimized)} "
)

print(
    f"Median inference time of original model: {calc_perf(ner_pipeline_original)} "
)
Median inference time of quantized model: 0.008135671014315449
Median inference time of original model: 0.108725632991991
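
To make the comparison easier to read, the speedup factor can also be reported directly. The short snippet below is an addition to the original notebook; it reruns the measurement and prints the ratio.

# Optional: compute the speedup of the quantized model over the original
quantized_time = calc_perf(ner_pipeline_optimized)
original_time = calc_perf(ner_pipeline_original)
print(f"Speedup of the quantized model: {original_time / quantized_time:.1f}x")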

Compare size of the models

from pathlib import Path

pytorch_model_file = Path(original_ner_model_dir) / "pytorch_model.bin"
if not pytorch_model_file.exists():
    pytorch_model_file = pytorch_model_file.parent / "model.safetensors"
print(f'Size of original model in Bytes is {pytorch_model_file.stat().st_size}')
print(f'Size of quantized model in Bytes is {Path(quantized_ner_model_dir, "openvino_model.bin").stat().st_size}')
Size of original model in Bytes is 260803668
Size of quantized model in Bytes is 133539000
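
Similarly, the compression ratio can be printed explicitly. The snippet below is an optional addition that reuses the paths defined above.

# Optional: compute the compression ratio achieved by quantization
original_size = pytorch_model_file.stat().st_size
quantized_size = Path(quantized_ner_model_dir, "openvino_model.bin").stat().st_size
print(f"Compression ratio: {original_size / quantized_size:.2f}x")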

Prepare a demo for named entity recognition with OpenVINO Runtime

Now you can try the NER model on your own text. Put a sentence into the text box and click the Submit button, and the model will label the recognized entities in the text.

import gradio as gr

examples = [
    "My name is Wolfgang and I live in Berlin.",
]

def run_ner(text):
    output = ner_pipeline_optimized(text)
    return {"text": text, "entities": output}

demo = gr.Interface(run_ner,
                    gr.Textbox(placeholder="Enter sentence here...", label="Input Text"),
                    gr.HighlightedText(label="Output Text"),
                    examples=examples,
                    allow_flagging="never")

if __name__ == "__main__":
    try:
        demo.launch(debug=False)
    except Exception:
        demo.launch(share=True, debug=False)
# if you are launching remotely, specify server_name and server_port
# demo.launch(server_name='your server name', server_port='server port in int')
# Read more in the docs: https://gradio.app/docs/