RPP Tensor Support (#70)
* Add definitions for RPP tensor API

* Initial commit

* Initial commit - pln1/pln3 tensor testsuite

* Mods for tensor test suite

* Mods for brightness tensor host

* arrangementParams to layoutParams

* Rename to tensor_augmentations

* Fix tensor host test suites

* Modify host tensor support for brightness

* Initial commit for tensor hip test suite

* Multiple of 8 stride option

* Add initial tensor support for hip

* Tensor test suite support for hip pln

* Fixes for GPU tensor support

* Add host ROI null check

* Initial commit for perf tests

* Perf tests for RPP tensor support

* Add gpu support for ltrb to xywh, remove roiType, fix pln3 brightness methods

* Remove method1 for pln3 gpu, keep method2

* Fix hip tensor unittests

* Add support for fused layout conversion on host

* Add tensor unittest suite support for layout toggle

* Add tensor perf tests for host - initial commit

* Add tensor host test suite for perf tests

* Add support for NHWC-NCHW toggle in HIP

* Add test suite support for layout toggle

* Reset hip unittests script

* Unroll pln3 kernel

* Add initial multi-bitDepth host support, remove templates

* Move SSE code to macros in rpp_cpu_simd

* Add support for f32 in brightness

* Macro changes, Add support for f16 brightness

* Add support for tensor i8

* Enable multi-bitDepth support in host perf tests

* Add initial multi-bitDepth support for HIP

* Add support for load24s in hip common, toggle layouts

* Enable perf tests for multi-bitDepth in hip test suite

* Fix bug in perf tests for tensor hip suite

* Add mods to use d_float8, d_float24 and d_uint6

* Add f16 support in hip

* Add f16 support in perf tests

* Reduce loads and stores

* Typecast to float4 mod

* Modify RPPMAX2/MIN2 to std::max/min

* Pass all arguments to sse macros

* Reduce scope of time vars

* Add omp_time_used

* Change host to hip in folder name and help

* Change error enums to negative

* Avoid pointer or index increment by collating loads

* Use variadic functions and pack templating to handle loads/stores

* Fix i8 blank image issue in hip

* Combine loads in f16/f32 and organize rpp_hip_common file

* Fix I8 store issue - trials

* Fix I8 store issue

* Add manual typecast to float4

* Use int4 to read roiTensorPtrSrc

* rppi_validate cleanup

* Test suite build fix
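
Two of the changes listed above, the ltrb-to-xywh ROI conversion and the multiple-of-8 stride option, can be illustrated with a minimal sketch. The struct names, the helper names `ltrbToXywh` and `alignTo8`, and the inclusive right/bottom convention are assumptions made for illustration only; they are not taken from the RPP headers in this commit.

```cpp
#include <cassert>

// Hypothetical ROI types for illustration; not the definitions from rppdefs.h.
struct RoiLtrb { int left, top, right, bottom; };  // corner form
struct RoiXywh { int x, y, width, height; };       // origin-plus-size form

// Convert a left-top-right-bottom box to x-y-width-height.
// Assumption: right and bottom are inclusive pixel coordinates, hence the +1.
inline RoiXywh ltrbToXywh(const RoiLtrb &r) {
    assert(r.right >= r.left && r.bottom >= r.top);
    return { r.left, r.top, r.right - r.left + 1, r.bottom - r.top + 1 };
}

// Round a row width (in elements) up to the next multiple of 8,
// so that each row starts on an 8-element boundary.
inline int alignTo8(int width) {
    assert(width >= 0);
    return (width + 7) & ~7;
}
```

Under these assumptions, a box with corners (10, 20) and (19, 39) maps to x = 10, y = 20, width = 10, height = 20, and a 13-element row is padded to a stride of 16.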

Co-authored-by: rrawther <Rajy.MeeyakhanRawther@amd.com>
r-abishek and rrawther authored Sep 23, 2021
1 parent 1b0d22b commit ac1907f
Showing 38 changed files with 11,886 additions and 678 deletions.
1 change: 1 addition & 0 deletions include/rpp.h
@@ -49,6 +49,7 @@ extern "C" {
 #include "rppcore.h"
 #include "rppdefs.h"
 #include "rppi.h"
+#include "rppt.h"
 #include "rppversion.h"

