DPCT1115

Message

The sycl::ext::oneapi::group_local_memory_for_overwrite is used to allocate group-local memory at the none kernel functor scope of a work-group data parallel kernel. You may need to adjust the code.

Detailed Description

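The migrated code calls sycl::ext::oneapi::group_local_memory_for_overwrite to allocate group-local memory outside the kernel functor scope of a work-group data-parallel kernel. The extension currently requires group-local variables to be defined at kernel functor scope; this restriction may be lifted in a future version of the extension.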

For more information, see sycl_ext_oneapi_local_memory.asciidoc.

Suggestions to Fix

For example, consider the following original CUDA* code:

  template <int S> __device__ void devfun() {
    __shared__ int slm1[32 * S];
    ...
  }

  template <int S> __global__ void kernel() {
    __shared__ int slm2[S];
    devfun<S>();
  }

  void hostfun() { kernel<256><<<1, 1>>>(); }

This code is migrated to the following SYCL* code:

  template <int S> inline void devfun(const sycl::nd_item<3> &item_ct1) {
    /*
    DPCT1115:0: The sycl::ext::oneapi::group_local_memory_for_overwrite is used to allocate
    group-local memory at the none kernel functor scope of a work-group data
    parallel kernel. You may need to adjust the code.
    */
    auto &slm1 =
        *sycl::ext::oneapi::group_local_memory_for_overwrite<int[32 * S]>(item_ct1.get_group());
    ...
  }

  template <int S> __dpct_inline__ void kernel(const sycl::nd_item<3> &item_ct1) {
    auto &slm2 =
        *sycl::ext::oneapi::group_local_memory_for_overwrite<int[S]>(item_ct1.get_group());
    devfun<S>(item_ct1);
  }

  void hostfun() {
    dpct::get_default_queue().parallel_for(
        sycl::nd_range<3>(sycl::range<3>(1, 1, 1), sycl::range<3>(1, 1, 1)),
        [=](sycl::nd_item<3> item_ct1) {
          kernel<256>(item_ct1);
        });
  }

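This code can be rewritten as follows, allocating the local memory with sycl::local_accessor in the command group and passing pointers to the device functions: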

  template <int S> inline void devfun(int *slm1) {
    ...
  }

  template <int S> __dpct_inline__ void kernel(int *slm1, int *slm2) {
    devfun<S>(slm1);
  }

  void hostfun() {
    dpct::get_default_queue().submit(
        [&](sycl::handler &cgh) {
          // Sizes must match the template argument S = 256 used below.
          sycl::local_accessor<int, 1> slm1_acc_ct1(sycl::range<1>(32 * 256), cgh);
          sycl::local_accessor<int, 1> slm2_acc_ct1(sycl::range<1>(256), cgh);

          cgh.parallel_for(
              sycl::nd_range<3>(sycl::range<3>(1, 1, 1), sycl::range<3>(1, 1, 1)),
              [=](sycl::nd_item<3> item_ct1) {
                kernel<256>(slm1_acc_ct1.get_pointer(), slm2_acc_ct1.get_pointer());
              });
        });
  }
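
In the rewritten code, the local accessor sizes (32 * 256 and 256) are written out by hand and must stay consistent with the template argument passed to kernel. One way to keep them in sync is to derive the sizes from the template argument. The following is a minimal sketch in plain C++ with no SYCL dependency; the helper names slm1_elems, slm2_elems, and local_bytes are hypothetical and are not part of dpct or SYCL.

```cpp
#include <cstddef>

// Hypothetical helpers (assumed for illustration, not provided by dpct or SYCL).
// Element count of slm1, declared in the CUDA example as int slm1[32 * S].
template <int S> constexpr std::size_t slm1_elems() {
  return 32 * static_cast<std::size_t>(S);
}

// Element count of slm2, declared in the CUDA example as int slm2[S].
template <int S> constexpr std::size_t slm2_elems() {
  return static_cast<std::size_t>(S);
}

// Total local memory, in bytes, that one work-group needs for both arrays.
template <int S> constexpr std::size_t local_bytes() {
  return (slm1_elems<S>() + slm2_elems<S>()) * sizeof(int);
}

// Compile-time check against the sizes hard-coded in the rewritten example.
static_assert(slm1_elems<256>() == 32 * 256, "slm1 size must match 32 * S");
static_assert(slm2_elems<256>() == 256, "slm2 size must match S");
```

Under these assumptions, the accessor ranges in hostfun could be written as sycl::range<1>(slm1_elems<256>()) and sycl::range<1>(slm2_elems<256>()), and local_bytes<256>() could be compared with the device's local memory size before the kernel is submitted.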

Intel® DPC++ Compatibility Tool Developer Guide and Reference
