Commentary on the oneAPI 1.3 Provisional Specification Rev. 1 (16)

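This article is based on the "oneAPI 1.3 Provisional Specification Rev. 1" (HTML, PDF), published at https://www.oneapi.io/spec/ on September 14, 2023. Because the original runs to nearly 2,000 pages and translation time and resources are limited, we provide the specification in article-sized parts with commentary rather than as a full translation.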

This part covers the "Post-ops" section of "oneDNN" in the "oneAPI 1.3 Provisional Specification Rev. 1".

Post-ops

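Post-ops are operations that are appended after a primitive. They are implemented through the attributes mechanism. When there are multiple post-ops, they are executed in the order in which they were appended.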

Note: Post-ops do not preserve intermediate data during computation. For that reason they are typically suitable for inference only.

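Post-ops are represented by dnnl::post_ops, which is copied when it is attached to the attributes with the dnnl::primitive_attr::set_post_ops() function. The attributes must then be passed to a primitive descriptor creation function to take effect. A skeleton is shown below.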

dnnl::post_ops po; // default empty post-ops
assert(po.len() == 0); // no post-ops attached

po.append_SOMETHING(params); // append a specific post-op
po.append_SOMETHING_ELSE(other_params); // append one more post-op

// (!) Note that the order in which post-ops are appended matters!
assert(po.len() == 2);

dnnl::primitive_attr attr; // default attributes
attr.set_post_ops(po); // attach the post-ops to the attributes
// any changes to po after this point do not affect the value stored in attr

primitive::primitive_desc op_pd(params, attr); // create a pd with the attr

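Note: Post-op support differs from primitive to primitive and may also depend on the actual implementation of a primitive. Robust code should nevertheless be able to handle the resulting errors accordingly. See "Attribute Related Error Handling".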

Note: Post-ops do not change the memory format of the memory object they operate on.

Post-op objects can be inspected with the dnnl::post_ops::kind() function. It takes the index of the post-op to inspect (which must be less than the value returned by dnnl::post_ops::len()) and returns its kind.

Supported Post-ops

Eltwise Post-op

The eltwise (element-wise) post-op is appended with the dnnl::post_ops::append_eltwise() function. dnnl::post_ops::kind() returns dnnl::primitive::kind::eltwise for such a post-op.

The eltwise post-op replaces

dst[:] = Op(...)

with

dst[:] = scale ⋅ eltwise(Op(...))

The intermediate result of Op(...) is not preserved.

The scale factor is supported only for int8 inference. In all other cases scale must be 1.0 (the default value). The scale parameter can be set through the dnnl::primitive_attr::set_scales_mask() attribute for the argument DNNL_ARG_ATTR_MULTIPLE_POST_OP.
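To make the formula above concrete, here is a minimal sketch in plain C++ of what the eltwise post-op computes. It is not oneDNN code: apply_eltwise_post_op and relu are hypothetical helper names introduced only for illustration, and the vector stands in for the output of Op(...).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not a oneDNN function): ReLU with zero negative slope.
inline float relu(float x) {
    return x > 0.f ? x : 0.f;
}

// Hypothetical reference model of the eltwise post-op:
//   dst[i] = scale * eltwise(Op(...)[i])
// On entry dst holds the primitive's result Op(...). It is overwritten in
// place, which mirrors the fact that the intermediate result is not kept.
inline void apply_eltwise_post_op(std::vector<float> &dst,
        float (*eltwise)(float), float scale = 1.0f) {
    assert(eltwise != nullptr);
    for (float &v : dst)
        v = scale * eltwise(v);
}
```

For example, calling apply_eltwise_post_op(v, relu) on a vector holding -2, 0, 3 leaves 0, 0, 3 in v. The scale argument models the int8-only scale factor and defaults to 1.0, as the text requires for all other cases.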

Sum Post-op

The sum (accumulation) post-op accumulates the result of a primitive with the existing data. It is appended with the dnnl::post_ops::append_sum() function. dnnl::post_ops::kind() returns dnnl::primitive::kind::sum for such a post-op.

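Before the result is accumulated, the existing value is multiplied by scale. The scale factor is supported only for int8 inference and should be used only when the result and the existing data have different magnitudes. In all other cases scale must be 1.0 (the default value). The scale parameter can be set through the dnnl::primitive_attr::set_scales_mask() attribute for the argument DNNL_ARG_ATTR_MULTIPLE_POST_OP.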

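In addition, the sum post-op can reinterpret the destination values as a different data type of the same size. For example, it can reinterpret 8-bit signed data as unsigned or vice versa (the values must lie within the range common to both types).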

The sum (accumulation) post-op replaces

dst[:] = Op(...)

with

dst[:] = scale ⋅ as_data_type(dst[:]) + Op(...)
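The following is a minimal plain C++ sketch of the sum post-op formula and of the signed-to-unsigned reinterpretation described above. It is not oneDNN code: apply_sum_post_op and as_u8 are hypothetical names, op_result stands in for Op(...), and float is used for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference model of the sum post-op:
//   dst[i] = scale * dst[i] + Op(...)[i]
// dst holds the existing data on entry and the accumulated result on exit.
inline void apply_sum_post_op(std::vector<float> &dst,
        const std::vector<float> &op_result, float scale = 1.0f) {
    assert(dst.size() == op_result.size());
    for (std::size_t i = 0; i < dst.size(); ++i)
        dst[i] = scale * dst[i] + op_result[i];
}

// Hypothetical model of as_data_type for the s8 -> u8 case: the same 8 bits
// are read as an unsigned value. The numeric value is preserved only in the
// range shared by both types, [0, 127].
inline std::uint8_t as_u8(std::int8_t v) {
    return static_cast<std::uint8_t>(v);
}
```

With existing data 1, 2, a primitive result 10, 20, and scale 0.5, the model yields 10.5, 21. The as_u8 helper shows why the common range matters: 100 stays 100, while -1 would be read as 255.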

Binary Post-ops

The binary post-op replaces

dst[:] = Op(...)

with

dst[:] = binary(Op(...), scale[:] ⋅ Source_1[:])

The binary post-op supports the same algorithms and broadcast semantics as the binary primitive.

In addition, the scale parameter of the binary post-op is set to 1.0 by default and can be set through the dnnl::primitive_attr::set_scales_mask() attribute for the argument DNNL_ARG_ATTR_MULTIPLE_POST_OP | DNNL_ARG_SRC_1.
Example:

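primitive_attr attr;
post_ops p_ops;
// Append a binary add post-op; summand_md is the memory descriptor of Source_1
p_ops.append_binary(algorithm::binary_add, summand_md);

attr.set_post_ops(p_ops);
// Set the scales mask for Source_1 of the post-op at index 0
attr.set_scales_mask(DNNL_ARG_ATTR_MULTIPLE_POST_OP(0) | DNNL_ARG_SRC_1,
        /* mask */ 0);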
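What a binary_add post-op with a single common scale computes can be sketched in plain C++ as follows. This is a simplified model, not oneDNN code: apply_binary_add_post_op is a hypothetical name, and broadcasting is reduced to the assumption that Source_1 either matches dst element for element or has exactly one element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference model of a binary_add post-op:
//   dst[i] = Op(...)[i] + scale * src1[i]
// dst holds Op(...) on entry. A one-element src1 is broadcast over all of dst
// (a simplification of the binary primitive's broadcast semantics).
inline void apply_binary_add_post_op(std::vector<float> &dst,
        const std::vector<float> &src1, float scale = 1.0f) {
    assert(src1.size() == dst.size() || src1.size() == 1);
    const bool broadcast = src1.size() == 1;
    for (std::size_t i = 0; i < dst.size(); ++i)
        dst[i] += scale * src1[broadcast ? 0 : i];
}
```

For instance, adding the one-element operand 10 to 1, 2, 3 gives 11, 12, 13, and adding 4, 8 with scale 0.5 to 1, 2 gives 3, 6.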

Examples of Chained Post-ops

Post-ops can be chained by appending them one after another. Note that the order matters: post-ops are executed in the order in which they were appended.

Sum -> ReLU

This is a common pattern in CNN topologies of the ResNet family.

dnnl::post_ops po;
po.append_sum();
po.append_eltwise(
        /* algorithm = */ dnnl::algorithm::eltwise_relu,
        /* neg slope = */ 0.f,
        /* unused for ReLU */ 0.f);

dnnl::primitive_attr attr;
attr.set_post_ops(po);

convolution_forward::primitive_desc(conv_d, attr, engine);

This results in the following computation:

dst[:] = ReLU(dst[:] + conv(src[:], weights[:]))
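As a rough illustration of why the order of appended post-ops matters, the plain C++ sketch below models the Sum -> ReLU chain next to the reversed ReLU -> Sum chain. It is not oneDNN code: both function names are hypothetical, and conv_result stands in for conv(src[:], weights[:]).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of Sum -> ReLU: dst = ReLU(dst + conv_result).
inline std::vector<float> sum_then_relu(
        std::vector<float> dst, const std::vector<float> &conv_result) {
    assert(dst.size() == conv_result.size());
    for (std::size_t i = 0; i < dst.size(); ++i) {
        const float acc = dst[i] + conv_result[i]; // sum post-op
        dst[i] = acc > 0.f ? acc : 0.f; // eltwise ReLU post-op
    }
    return dst;
}

// Hypothetical model of the reversed chain ReLU -> Sum:
// dst = dst + ReLU(conv_result).
inline std::vector<float> relu_then_sum(
        std::vector<float> dst, const std::vector<float> &conv_result) {
    assert(dst.size() == conv_result.size());
    for (std::size_t i = 0; i < dst.size(); ++i) {
        const float act = conv_result[i] > 0.f ? conv_result[i] : 0.f;
        dst[i] += act; // sum applied after the eltwise step
    }
    return dst;
}
```

With existing data 1, -5 and a convolution result -3, 2, the first chain produces 0, 0 while the reversed chain produces 1, -3, so the two orders are not interchangeable.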

API

For details of the API, see the corresponding section of the oneAPI specification.

Legal Notice

The content of this oneAPI Specification is licensed under the Creative Commons Attribution 4.0 International License. Unless stated otherwise, the sample code examples in this document are released to you under the MIT license.

This specification is a continuation of Intel’s decades-long history of working with standards groups and industry/academia initiatives such as The Khronos Group*, to create and define specifications in an open and fair process to achieve interoperability and interchangeability. oneAPI is intended to be an open specification and we encourage you to help us make it better. Your feedback is optional, but to enable Intel to incorporate any feedback you may provide to this specification, and to further upstream your feedback to other standards bodies, including The Khronos Group SYCL* specification, please submit your feedback under the terms and conditions below. Any contribution of your feedback to the oneAPI Specification does not prohibit you from also contributing your feedback directly to other standard bodies, including The Khronos Group under their respective submission policies.

By opening an issue, providing feedback, or otherwise contributing to the specification, you agree that Intel will be free to use, disclose, reproduce, modify, license, or otherwise distribute your feedback at its sole discretion without any obligations or restrictions of any kind, including without limitation, intellectual property rights or licensing obligations.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice.

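* Other company names, product names, and the like are generally the designations, trademarks, or registered trademarks of their respective owners.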
