Commit
* add definitions for rpp tensor api
* Initial commit
* Initial commit - pln1/pln3 tensor testsuite
* Mods for tensor test suite
* Mods for brightness tensor host
* arrangementParams to layoutParams
* Rename to tensor_augmentations
* Fix tensor host test suites
* Modify host tensor support for brightness
* Initial commit for tensor hip test suite
* Multiple of 8 stride option
* Add initial tensor support for hip
* Tensor test suite support for hip pln
* Fixes for GPU tensor support
* Add host ROI null check
* Initial commit for perf tests
* Perf tests for RPP tensor support
* Add gpu support for ltrb to xywh, remove roiType, fix pln3 brightness methods
* Remove method1 for pln3 gpu, keep method2
* Fix hip tensor unittests
* Add support for fused layout conversion on host
* Add tensor unittest suite support for layout toggle
* Add tensor perf tests for host - initial commit
* Add tensor host test suite for perf tests
* Add support for NHWC-NCHW toggle in HIP
* Add test suite support for layout toggle
* Reset hip unittests script
* Unroll pln3 kernel
* Add initial multi-bitDepth host support, remove templates
* Move SSE code to macros in rpp_cpu_simd
* Add support for f32 in brightness
* Macro changes, Add support for f16 brightness
* Add support for tensor i8
* Enable multi-bitDepth support in host perf tests
* Add initial multi-bitDepth support for HIP
* Add support for load24s in hip common, toggle layouts
* Enable perf tests for multi-bitdepth in hip test suite
* Fix bug in perf tests for tensor hip suite
* Add mods to use d_float8, d_float24 and d_uint6
* Add f16 support in hip
* Add f16 support in perf tests
* Reduce loads and stores
* Typecast to float4 mod
* Modify RPPMAX2/MIN2 to std::max/min
* Pass all arguments to sse macros
* Reduce scope of time vars
* Add omp_time_used
* Change host to hip in folder name and help
* Change error enums to negative
* Avoid pointer or index increment by collating loads
* Use variadic functions and pack templating to handle loads/stores
* Fix i8 blank image issue in hip
* Combine loads in f16/f32 and organize rpp_hip_common file
* Fix I8 store issue - trials
* Fix I8 store issue
* Add manual typecast to float4
* Use int4 to read roiTensorPtrSrc
* rppi_validate cleanup
* Test suite build fix

Co-authored-by: rrawther <Rajy.MeeyakhanRawther@amd.com>