OpenVINO による文法エラー修正
この Jupyter ノートブックは、ローカルへのインストール 後にのみ起動できます。
AI ベースの自動修正製品は、使いやすさ、編集速度、手頃な価格のため、ますます人気が高まっています。これらの製品は、電子メール、ブログ、チャットのテキストの品質を向上させます。
文法エラー修正 (GEC) は、スペル、句読点、文法、単語選択のエラーなど、テキスト内のさまざまな種類のエラーを修正するタスクです。GEC は通常、文の修正タスクとして策定されます。GEC システムは、誤りの可能性がある文を入力として受け取り、それをより正しいバージョンに変換することが期待されます。以下の例を参照してください:
入力 (エラーあり)
出力 (修正済み)
I like to rides my bicycle.
I like to ride my bicycle.
下の画像に示すように、書き言葉におけるさまざまな種類のエラーを修正できます。
error_types
このチュートリアルでは、OpenVINO を使用して文法エラー修正を実行する方法を説明します。Hugging Face トランスフォーマー ・ライブラリーの事前トレーニング済みモデルを使用します。ユーザー・エクスペリエンスを簡素化するために、Hugging Face の Optimum ライブラリーを使用してモデルを OpenVINO™ IR 形式に変換します。
これは次の手順で構成されます:
前提条件のインストール
OpenVINO と Hugging Face Optimum の統合を使用して、パブリックソースからモデルをダウンロードして変換します。
文法エラーチェックのための推論パイプラインを作成します
NNCF 量子化による文法修正パイプラインの最適化
パフォーマンスと精度の観点から、元のパイプラインと最適化されたパイプラインを比較します。
目次:
どのような仕組みか ?
文法エラー修正タスクは、文法的に誤った文を入力として受け取り、正しい文を出力として返すようにモデルをトレーニングするシーケンスツーシーケンス・タスクと考えることができます。JFLEG データセットの拡張バージョンで微調整された FLAN-T5 モデルを使用します。
スケーリング命令 - 微調整された言語モデル 論文とともにリリースされた FLAN-T5 のバージョンは、タスクの組み合わせに基づいて微調整された T5 の拡張バージョンです。この論文では、タスク数のスケーリング、モデルサイズのスケーリング、思考連鎖データの微調整に特に焦点を当てて、命令の微調整について説明します。この論文では、全体の命令の微調整が、事前トレーニング済み言語モデルのパフォーマンスと使いやすさを向上させる一般的な方法であることが分かりました。
flan-t5_training
モデルの詳細については、論文 、オリジナルのリポジトリー 、Hugging Face モデルカード をご覧ください
さらに、処理する文の数を減らすため、文法の正確さチェックを実行することもできます。このタスクは、モデルが入力テキストを取得し、テキストに文法エラーが含まれている場合はラベル 1 を予測し、含まれていない場合はラベル 0 を予測する、単純なバイナリーテキスト分類として考える必要があります。CoLA データセットで微調整された RoBERTa ベースモデルである roberta-base-CoLA モデルを使用します。RoBERTa モデルは、RoBERTa: A Robustly Optimized BERT Pretraining Approach (英語) 論文で提案されました。これは BERT を基盤として主要なハイパーパラメーターを変更し、次の文の事前トレーニング目標を削除して、はるかに大きなミニバッチと学習率でトレーニングします。モデルに関する詳細は、Meta Ai のブログ投稿 と Hugging Face のドキュメント で確認できます。
FLAN-T5 と RoBERTa について詳しく理解できたので、始めましょう。🚀
必要条件
最初に、OpenVINO 統合によって高速化された Hugging Face
Optimum ライブラリーをインストールする必要があります。Hugging Face Optimum API は、Hugging Face Transformers ライブラリーのモデルを OpenVINO™ IR 形式に変換および量子化できる高レベル API です。詳細については、Hugging Face Optimum のドキュメント を参照してください。
% pip install -q "torch>=2.1.0" "git+https://github.com/huggingface/optimum-intel.git" "openvino>=2024.0.0" onnx tqdm "gradio>=4.19" "transformers>=4.33.0" --extra-index-url https://download.pytorch.org/whl/cpu
% pip install -q "nncf>=2.9.0" datasets jiwer
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Note : you may need to restart the kernel to use updated packages .
Note : you may need to restart the kernel to use updated packages .
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
モデルのダウンロードと変換
Optimum Intel を使用すると、Hugging Face ハブ から最適化されたモデルをロードし、Hugging Face API を使用して OpenVINO ランタイムで推論を実行するパイプラインを作成できます。Optimum 推論モデルは、Hugging Face Transformers モデルと API の互換性があります。つまり、AutoModelForXxx
クラスを対応する OVModelForXxx
クラスに置き換えるだけで済みます。
以下は RoBERTa テキスト分類モデルの例です。
-from transformers import AutoModelForSequenceClassification
+from optimum.intel.openvino import OVModelForSequenceClassification
from transformers import AutoTokenizer, pipeline
model_id = "textattack/roberta-base-CoLA"
-model = AutoModelForSequenceClassification.from_pretrained(model_id)
+model = OVModelForSequenceClassification.from_pretrained(model_id, from_transformers=True)
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
モデルクラスの初期化は、from_pretrained
メソッドの呼び出しから始まります。トランスフォーマー・モデルをダウンロードして変換する場合は、パラメーター from_transformers=True
を追加する必要があります。save_pretrained
メソッドを使用して、変換されたモデルを次回に使用するため保存できます。トークナイザー・クラスとパイプライン API は Optimum モデルと互換性があります。
from pathlib import Path
from transformers import pipeline , AutoTokenizer
from optimum.intel.openvino import OVModelForSeq2SeqLM , OVModelForSequenceClassification
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
2024-03-25 11:56:04.043628: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on.You may see slightly different numerical results due to floating-point round-off errors from different computation orders.To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0 .
2024-03-25 11:56:04.045940: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-03-25 11:56:04.079112: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-25 11:56:04.079147: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-25 11:56:04.079167: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-03-25 11:56:04.085243: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-03-25 11:56:04.085971: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-25 11:56:05.314633: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
INFO : nncf : NNCF initialized successfully . Supported frameworks detected : torch , tensorflow , onnx , openvino
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
推論デバイスの選択
OpenVINO を使用して推論を実行するためにドロップダウン・リストからデバイスを選択します
import ipywidgets as widgets
import openvino as ov
core = ov . Core ()
device = widgets . Dropdown (
options = core . available_devices + [ "AUTO" ],
value = "AUTO" ,
description = "Device:" ,
disabled = False ,
)
device
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Dropdown ( description = 'Device:' , index = 3 , options = ( 'CPU' , 'GPU.0' , 'GPU.1' , 'AUTO' ), value = 'AUTO' )
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
文法チェッカー
grammar_checker_model_id = "textattack/roberta-base-CoLA"
grammar_checker_dir = Path ( "roberta-base-cola" )
grammar_checker_tokenizer = AutoTokenizer . from_pretrained ( grammar_checker_model_id )
if grammar_checker_dir . exists ():
grammar_checker_model = OVModelForSequenceClassification . from_pretrained ( grammar_checker_dir , device = device . value )
else :
grammar_checker_model = OVModelForSequenceClassification . from_pretrained ( grammar_checker_model_id , export = True , device = device . value , load_in_8bit = False )
grammar_checker_model . save_pretrained ( grammar_checker_dir )
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Framework not specified.Using pt to export the model. Some weights of the model checkpoint at textattack/roberta-base-CoLA were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).Using the export variant default.Available variants are: - default: The default ONNX variant.Using framework PyTorch: 2.2.1+cpu
Overriding 1 configuration item(s)
- use_cache -> False
/home/ea/miniconda3/lib/python3.11/site-packages/transformers/modeling_utils.py:4225: FutureWarning: _is_quantized_training_enabled is going to be deprecated in transformers 4.39.0.Please use model.hf_quantizer.is_trainable instead
warnings.warn(
Compiling the model to AUTO ...
text-classification
タスクの推論パイプラインを使用して、モデルの動作を確認しましょう。Hugging Face 推論パイプラインの使用法の詳細については、チュートリアル をご覧ください
input_text = "They are moved by salar energy"
grammar_checker_pipe = pipeline (
"text-classification" ,
model = grammar_checker_model ,
tokenizer = grammar_checker_tokenizer ,
)
result = grammar_checker_pipe ( input_text )[ 0 ]
print ( f "input text: { input_text } " )
print ( f 'predicted label: { "contains_errors" if result [ "label" ] == "LABEL_1" else "no errors" } ' )
print ( f 'predicted score: { result [ "score" ] : .2 } ' )
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
input text : They are moved by salar energy
predicted label : contains_errors
predicted score : 0.88
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
これで完了です! モデルはサンプル内のエラーを検出できるようです。
文法コレクター
文法コレクターモデルをロードする手順は、使用されるモデルクラスを除いて、非常に似ています。FLAN-T5 はシーケンス間テキスト生成モデルであるため、これを実行するには OVModelForSeq2SeqLM
クラスと text2text-generation
生成パイプラインを使用する必要があります。
grammar_corrector_model_id = "pszemraj/flan-t5-large-grammar-synthesis"
grammar_corrector_dir = Path ( "flan-t5-large-grammar-synthesis" )
grammar_corrector_tokenizer = AutoTokenizer . from_pretrained ( grammar_corrector_model_id )
if grammar_corrector_dir . exists ():
grammar_corrector_model = OVModelForSeq2SeqLM . from_pretrained ( grammar_corrector_dir , device = device . value )
else :
grammar_corrector_model = OVModelForSeq2SeqLM . from_pretrained ( grammar_corrector_model_id , export = True , device = device . value )
grammar_corrector_model . save_pretrained ( grammar_corrector_dir )
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Framework not specified.Using pt to export the model.Using the export variant default.Available variants are:
- default: The default ONNX variant. Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model ) instead. This warning will be raised to an exception in v4.41. Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4} Using framework PyTorch: 2.2.1+cpu
Overriding 1 configuration item(s)
- use_cache -> False
/home/ea/miniconda3/lib/python3.11/site-packages/transformers/modeling_utils.py:4225: FutureWarning: _is_quantized_training_enabled is going to be deprecated in transformers 4.39.0.Please use model.hf_quantizer.is_trainable instead
warnings.warn(
Using framework PyTorch: 2.2.1+cpu /home/ea/miniconda3/lib/python3.11/site-packages/transformers/modeling_utils.py:943: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect.We can't record the data flow of Python values, so this value will be treated as a constant in the future.This means that the trace might not generalize to other inputs!
If causal_mask.shape[1] < attention_mask.shape[1]: Using framework PyTorch: 2.2.1+cpu
Overriding 1 configuration item(s)
- use_cache -> True
/home/ea/miniconda3/lib/python3.11/site-packages/transformers/models/t5/modeling_t5.py:509: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect.We can't record the data flow of Python values, so this value will be treated as a constant in the future.This means that the trace might not generalize to other inputs!
elif past_key_value.shape[2] != key_value_states.shape[1]: Some non-default generation parameters are set in the model config.These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model ) instead.This warning will be raised to an exception in v4.41.Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4}
Compiling the encoder to AUTO ...
Compiling the decoder to AUTO ...
Compiling the decoder to AUTO ...
Some non-default generation parameters are set in the model config.These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model ) instead.This warning will be raised to an exception in v4.41.Non-default generation parameters: {'max_length': 512, 'min_length': 8, 'num_beams': 2, 'no_repeat_ngram_size': 4}
grammar_corrector_pipe = pipeline (
"text2text-generation" ,
model = grammar_corrector_model ,
tokenizer = grammar_corrector_tokenizer ,
)
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
result = grammar_corrector_pipe ( input_text )[ 0 ]
print ( f "input text: { input_text } " )
print ( f 'generated text: { result [ "generated_text" ] } ' )
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
input text : They are moved by salar energy
generated text : They are powered by solar energy .
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
結果はかなり良さそうです!
デモ・パイプラインの準備
それでは、すべてをまとめて文法修正のパイプラインを作成しましょう。パイプラインは入力テキストを受け入れ、その正確性を検証し、必要に応じて正しいバージョンを生成します。こはいくつかのステップから構成されます:
文章ごとにテキストを分割します。
文法チェッカーを使用して各文の文法の正確さを確認します。
必要に応じて、文の改善版を生成します。
import re
import transformers
from tqdm.notebook import tqdm
def split_text ( text : str ) -> list :
"""
Split a string of text into a list of sentence batches.
Parameters:
text (str): The text to be split into sentence batches.
Returns:
list: A list of sentence batches. Each sentence batch is a list of sentences.
"""
# 正規表現を使用してテキストを文に分割
sentences = re . split ( r "(?<=[^A-Z].[.?])+(?=[A-Z])" , text )
# 文のバッチを保存するリストを初期化
sentence_batches = []
# 現在の文のバッチを保存する一時リストを初期化
temp_batch = []
# 文章を繰り返し
for sentence in sentences :
# 一時バッチに文を追加
temp_batch . append ( sentence )
# 一時バッチの長さが 2 から 3 文の間、または最後のバッチである場合は、それを文バッチのリストに追加
if len ( temp_batch ) >= 2 and len ( temp_batch ) <= 3 or sentence == sentences [ - 1 ]:
sentence_batches . append ( temp_batch )
temp_batch = []
return sentence_batches
def correct_text (
text : str ,
checker : transformers . pipelines . Pipeline ,
corrector : transformers . pipelines . Pipeline ,
separator : str = " " ,
) -> str :
"""
Correct the grammar in a string of text using a text-classification and text-generation pipeline.
Parameters:
text (str): The inpur text to be corrected.
checker (transformers.pipelines.Pipeline): The text-classification pipeline to use for checking the grammar quality of the text.
corrector (transformers.pipelines.Pipeline): The text-generation pipeline to use for correcting the text.
separator (str, optional): The separator to use when joining the corrected text into a single string. Default is a space character.
Returns:
str: The corrected text.
"""
# テキストを文のバッチに分割
sentence_batches = split_text ( text )
# 修正したテキストを保存するリストを初期化
corrected_text = []
# 文のバッチを反復処理
for batch in tqdm ( sentence_batches , total = len ( sentence_batches ), desc = "correcting text.." ):
# バッチ内の文を 1 つの文字列に結合
raw_text = " " . join ( batch )
# テキスト分類パイプラインを使用してテキストの文法品質をチェック
results = checker ( raw_text )
# テキスト分類の結果が LABEL_1 でないか、スコアが 0.9 未満の LABEL_1 である場合にのみテキストを修正
if results [ 0 ][ "label" ] != "LABEL_1" or ( results [ 0 ][ "label" ] == "LABEL_1" and results [ 0 ][ "score" ] < 0.9 ):
# テキスト生成パイプラインを使用してテキストを修正
corrected_batch = corrector ( raw_text )
corrected_text . append ( corrected_batch [ 0 ][ "generated_text" ])
else :
corrected_text . append ( raw_text )
# 修正したテキストを 1 つの文字列に結合
corrected_text = separator . join ( corrected_text )
return corrected_text
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
実際に動作を見てみましょう。
default_text = (
"Most of the course is about semantic or content of language but there are also interesting"
" topics to be learned from the servicefeatures except statistics in characters in documents.At"
" this point, He introduces herself as his native English speaker and goes on to say that if"
" you contine to work on social scnce"
)
corrected_text = correct_text ( default_text , grammar_checker_pipe , grammar_corrector_pipe )
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
correcting text..: 0%| | 0/1 [00:00<?, ?it/s]
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
print ( f "input text: { default_text } \n " )
print ( f "generated text: { corrected_text } " )
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
input text : Most of the course is about semantic or content of language but there are also interesting topics to be learned from the servicefeatures except statistics in characters in documents . At this point , He introduces herself as his native English speaker and goes on to say that if you contine to work on social scnce
generated text : Most of the course is about the semantic content of language but there are also interesting topics to be learned from the service features except statistics in characters in documents . At this point , she introduces herself as a native English speaker and goes on to say that if you continue to work on social science , you will continue to be successful .
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
量子化
NNCF は、量子化レイヤーをモデルグラフに追加し、トレーニング・データセットのサブセットを使用してこれらの追加の量子化レイヤーのパラメーターを初期化することで、トレーニング後の量子化を可能にします。量子化操作は FP32
/FP16
ではなく INT8
で実行されるため、モデル推論が高速化されます。
文法チェッカーモデルはテキスト修正パイプライン全体のごく一部を占めるため、文法コレクターモデルのみを最適化します。文法コレクター自体は、エンコーダー、ファースト・コール・デコーダー、履歴付きデコーダーの 3 つのモデルで構成されています。最後のモデルの推論のシェアが他のモデルを圧倒します。このため、それだけを量子化します。
最適化プロセスには次の手順が含まれます:
量子化用のキャリブレーション・データセットを作成します。
nncf.quantize()
を実行して、量子化されたモデルを取得します。
openvino.save_model()
関数を使用して INT8
モデルをシリアル化します。
モデルの推論速度を向上させるため量子化を実行するかどうかを以下で選択してください。
to_quantize = widgets . Checkbox (
value = True ,
description = "Quantization" ,
disabled = False ,
)
to_quantize
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Checkbox ( value = True , description = 'Quantization' )
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
量子化の実行
以下では量子化されたモデルを取得します。ソースコードについては utils.py
を参照してください。量子化には比較的時間がかかり、完了するまでに時間がかかります。
from utils import get_quantized_pipeline , CALIBRATION_DATASET_SIZE
grammar_corrector_pipe_fp32 = grammar_corrector_pipe
grammar_corrector_pipe_int8 = None
if to_quantize . value :
quantized_model_path = Path ( "quantized_decoder_with_past" ) / "openvino_model.xml"
grammar_corrector_pipe_int8 = get_quantized_pipeline (
grammar_corrector_pipe_fp32 ,
grammar_corrector_tokenizer ,
core ,
grammar_corrector_dir ,
quantized_model_path ,
device . value ,
calibration_dataset_size = CALIBRATION_DATASET_SIZE ,
)
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Downloading readme: 0%| | 0.00/5.94k [00:00<?, ?B/s]
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Downloading data: 100%|
████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 148k/148k [00:01<00:00, 79.1kB/s]
Downloading data: 100%|
█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 141k/141k [00:01<00:00, 131kB/s]
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Generating validation split: 0%| | 0/755 [00:00<?, ?例/s]
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Generating test split: 0%| | 0/748 [00:00<?, ?例/s]
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Collecting calibration data: 0%| | 0/10 [00:00<?, ?it/s]
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Output ()
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Output ()
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
INFO : nncf : 72 ignored nodes were found by name in the NNCFGraph
INFO : nncf : 145 ignored nodes were found by name in the NNCFGraph
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Output ()
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Compiling the encoder to AUTO ...
Compiling the decoder to AUTO ...
Compiling the decoder to AUTO ...
Compiling the decoder to AUTO ...
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
修正結果を見てみましょう。量子化された INT8 モデルと元の FP32 モデルに対して生成されたテキストはほぼ同じになるはずです。
if to_quantize . value :
corrected_text_int8 = correct_text ( default_text , grammar_checker_pipe , grammar_corrector_pipe_int8 )
print ( f "Input text: { default_text } \n " )
print ( f "Generated text by INT8 model: { corrected_text_int8 } " )
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
correcting text..: 0%| | 0/1 [00:00<?, ?it/s]
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
Input text : Most of the course is about semantic or content of language but there are also interesting topics to be learned from the servicefeatures except statistics in characters in documents . At this point , He introduces herself as his native English speaker and goes on to say that if you contine to work on social scnce
Generated text by INT8 model : Most of the course is about semantics or content of language but there are also interesting topics to be learned from the service features except statistics in characters in documents . At this point , she introduces himself as a native English speaker and goes on to say that if you continue to work on social science , you will continue to do so .
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)
インタラクティブなデモ
import gradio as gr
import time
def correct ( text , quantized , progress = gr . Progress ( track_tqdm = True )):
grammar_corrector = grammar_corrector_pipe_int8 if quantized else grammar_corrector_pipe
start_time = time . perf_counter ()
corrected_text = correct_text ( text , grammar_checker_pipe , grammar_corrector )
end_time = time . perf_counter ()
return corrected_text , f " { end_time - start_time : .2f } "
def create_demo_block ( quantized : bool , show_model_type : bool ):
model_type = ( " optimized" if quantized else " original" ) if show_model_type else ""
with gr . Row ():
gr . Markdown ( f "## Run { model_type } grammar correction pipeline" )
with gr . Row ():
with gr . Column ():
input_text = gr . Textbox ( label = "Text" )
with gr . Column ():
output_text = gr . Textbox ( label = "Correction" )
correction_time = gr . Textbox ( label = "Time (seconds)" )
with gr . Row ():
gr . Examples ( examples = [ default_text ], inputs = [ input_text ])
with gr . Row ():
button = gr . Button ( f "Run { model_type } " )
button . click (
correct ,
inputs = [ input_text , gr . Number ( quantized , visible = False )],
outputs = [ output_text , correction_time ],
)
with gr . Blocks () as demo :
gr . Markdown ( "# Interactive demo" )
quantization_is_present = grammar_corrector_pipe_int8 is not None
create_demo_block ( quantized = False , show_model_type = quantization_is_present )
if quantization_is_present :
create_demo_block ( quantized = True , show_model_type = True )
# リモートで起動する場合は、server_name と server_port を指定
# demo.launch(server_name='your server name', server_port='server port in int')
# 詳細はドキュメントをご覧ください: https://gradio.app/docs/
try :
demo . queue () . launch ( debug = False )
except Exception :
demo . queue () . launch ( share = True , debug = False )
OpenVINO による文法エラー修正 — OpenVINO™ ドキュメントCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboardCopy to clipboard — Version(wp-content)