RPP Tensor Support (#70) · ROCm/rpp@ac1907f

RPP Tensor Support (#70)

* add definitions for rpp tensor api

* Initial commit

* Initial commit - pln1/pln3 tensor testsuite

* Mods for tensor test suite

* Mods for brightness tensor host

* arrangementParams to layoutParams

* Rename to tensor_augmentations

* Fix tensor host test suites

* Modify host tensor support for brightness

* Initial commit for tensor hip test suite

* Multiple of 8 stride option

* Add initial tensor support for hip

* Tensor test suite support for hip pln

* Fixes for GPU tensor support

* Add host ROI null check

* Initial commit for perf tests

* Perf tests for RPP tensor support

* Add gpu support for ltrb to xywh, remove roiType, fix pln3 brightness methods

* Remove method1 for pln3 gpu, keep method2

* Fix hip tensor unittests

* Add support for fused layout conversion on host

* Add tensor unittest suite support for layout toggle

* Add tensor perf tests for host - initial commit

* Add tensor host test suite for perf tests

* Add support for NHWC-NCHW toggle in HIP

* Add test suite support for layout toggle

* Reset hip unittests script

* Unroll pln3 kernel

* Add initial multi-bitDepth host support, remove templates

* Move SSE code to macros in rpp_cpu_simd

* Add support for f32 in brightness

* Macro changes, Add support for f16 brightness

* Add support for tensor i8

* Enable multi-bitDepth support in host perf tests

* Add initial multi-bitDepth support for HIP

* Add support for load24s in hip common, toggle layouts

* Enable perf tests for multi-bitdepth in hip test suite

* Fix bug in perf tests for tensor hip suite

* Add mods to use d_float8, d_float24 and d_uint6

* Add f16 support in hip

* Add f16 support in perf tests

* Reduce loads and stores

* Typecast to float4 mod

* Modify RPPMAX2/MIN2 to std::max/min

* Pass all arguments to sse macros

* Reduce scope of time vars

* Add omp_time_used

* Change host to hip in folder name and help

* Change error enums to negative

* Avoid pointer or index increment by collating loads

* Use variadic functions and pack templating to handle loads/stores

* Fix i8 blank image issue in hip

* Combine loads in f16/f32 and organize rpp_hip_common file

* Fix I8 store issue - trials

* Fix I8 store issue

* Add manual typecast to float4

* Use int4 to read roiTensorPtrSrc

* rppi_validate cleanup

* Test suite build fix

Co-authored-by: rrawther <Rajy.MeeyakhanRawther@amd.com>
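The "ltrb to xywh" entry in the list above refers to converting a region of interest from corner form (left, top, right, bottom) to origin-plus-size form (x, y, width, height). A minimal sketch of that conversion follows. The struct and function names (`RoiLtrb`, `RoiXywh`, `ltrbToXywh`) are hypothetical and are not taken from the RPP headers, and the sketch assumes the right/bottom coordinates are inclusive; the commit message confirms neither.

```cpp
#include <cstdint>

// Hypothetical ROI types for illustration only; not the actual RPP structs.
struct RoiLtrb { std::int32_t left, top, right, bottom; };
struct RoiXywh { std::int32_t x, y, width, height; };

// Convert a corner-form ROI to origin-plus-size form.
// Assumption: right and bottom are inclusive pixel coordinates,
// so the extent along each axis is (end - start + 1).
inline RoiXywh ltrbToXywh(const RoiLtrb &roi)
{
    return RoiXywh{
        roi.left,                      // x origin is the left edge
        roi.top,                       // y origin is the top edge
        roi.right - roi.left + 1,      // inclusive width
        roi.bottom - roi.top + 1       // inclusive height
    };
}
```

If the library instead treated right/bottom as exclusive, the `+ 1` terms would be dropped; the rest of the conversion is unchanged.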

r-abishek and rrawther authored Sep 23, 2021

1 parent 1b0d22b commit ac1907f

include/rpp.h

@@ -49,6 +49,7 @@ extern "C" {
     #include "rppcore.h"
     #include "rppdefs.h"
     #include "rppi.h"
+    #include "rppt.h"
     #include "rppversion.h"