From ca9b3354348f9642bfa5ca3807e7aee9fd85d8db Mon Sep 17 00:00:00 2001
From: Abishek <52214183+r-abishek@users.noreply.github.com>
Date: Tue, 23 Jul 2024 23:48:59 -0700
Subject: [PATCH 1/7] RPP Jitter on HOST and HIP (#384)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Jitter Tensor Kernel
* Jitter HIP Kernel
* Jitter Tensor Kernel
* Jitter PKD3 to PLN3 version
* Fix Jitter variations of HIP and HOST u8
* Fix Jitter variations of HIP and HOST u8
* Jitter Tensor HOST variations
* Fix Jitter HOST f16 variations Includes Cleanup
* Cleanup and Optimize Jitter HOST AVX
* Fix boundary pixels in Jitter HOST Kernel
* Fix bound compute
* Fix merge conflicts
* Cleanup Jitter Implementation
* Additional cleanup
* Cleanup Includes variable renaming
* License - updates to 2024 and consistency changes (#298)
* Match all CMakeLists.txt license as per RPP's outermost LICENSE file
* Match all python files' license as per RPP's outermost LICENSE file
* Match all .hpp files' license as per RPP's outermost LICENSE file
* Match all .cpp files' license as per RPP's outermost LICENSE file
* Match all .h files' license as per RPP's outermost LICENSE file
* Remove all rights reserved as per LICENSE file
* Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc."
* Match all .cmake files' license as per RPP's outermost LICENSE file
* Match all .cpp.in files' license as per RPP's outermost LICENSE file
* Replace 283 occurrences in 282 files - 2023 to 2024
* Add "MIT License" title to 281 instances
* Add missing license
* Test - Update README.md for test_suite (#299)
* Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301)
Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite 
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copywright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post incrmeent operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * 
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * 
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Add HOST test suite support * fix output corruption * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Update notes * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Voxel Flip on HIP and HOST (#285) * added support for flip voxel * added test suite support * added golden outputs for flip voxel made changes in test suite to run QA tests for flip * updated golden outputs with correct values * minor bug fix in the hip test suite * made changes to variable names for better readability fixed comments in test suite minor cleanup * combined the flip axis factor as ternary operator in HIP kernel added new enum for error handling when source and destination layouts are not matching * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted flip voxel golden outputs to bin files * changed copyright from 2023 to 2024 * Update flip_voxel.hpp license * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." 
* Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color 
temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update 
subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copywright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post incrmeent operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar 
multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = 
TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 
load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for 
Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * 
changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * Cmake fix to prevent warning * Fix paths in new python scripts * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * Test suite fixes after tensor_min / tensor_max HOST merge * Fix max case * QA tests fix for hip and host * naming convention changes as per new std * Substitute imagePartial with partial * Substitute imageMin/imageMax with min/max * Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize * Use variable instead of batchCount*4 * Use post increment effectivly * Resolve codacy warnings * Additional cleanup * remove unused variable * Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Remove auto merge boost * Spaces formatting * Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add support for mi300 (#269) * Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Cleanup by removing oneliner functions as inline * RPP Tensor Audio Support - To Decibels (#258) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Replace vectors with arrays * Cleanup * Replace Rpp64s with Rpp32s * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup 
* Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register 
--------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * Fix build errors and qa tests in Audio Test suite * Remove auto-merge repeated funcs * Improve clarity on header docs * made changes based on review comments * stored golden outputs of to_decibels in binary file removed golden output text files for non silent region * removed unused parameter in verify_output function * updated list of cases supported in python script * added error handling for opening golden output file * Codacy fix and tests warning fix * Codacy fix * Codacy fix trial * codacy fix for checking boundaries of fstream --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Documentation - Bump rocm-docs-core[api_reference] from 0.30.2 to 0.30.3 in /docs/sphinx (#274) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.2 to 0.30.3. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.2...v0.30.3) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Adding issue template (#270) * Add files via upload * added ROCm v6, MI300, default component * Fix cast used in testsuite Includes minor fixes * Fix displaying f16 outputs * Optimize HOST min/max reduce function further * Fix spacing in HIP kernels * Fix PLN1 outputs for u8 and i8 datatypes of HOST backend * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ...
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Store reference outputs via map for min and max kernels * Update tensor_max.hpp license * Update tensor_min.hpp license * Fix output comparison check * Merge branch 'ar/opt_tensor_min_tensor_max' of https://github.com/r-abishek/rpp into sn/tensor_min_max * Modify exit condition used in outer most kernel * Modify srcIdx for HIP Tensor min * Using maximum as 255 for HIP Tensor min * Modify srcIdx for HIP Tensor max kernel Also fixes build error in testsuite * Fix corrupted outputs displayed for Tensor sum * Fix corruption issue seen with tensor sum kernel * Fix minimum for I8 Tensor max kernel * Modified HIP buffer initialization with a common function * Fix redefinition * Remove additional variables xAlignedLength * Remove unwanted xAlignedLength and xDiff * Remove redefinition of TensorSumReferenceOutputs * Fix for CI issue * Add parenthesis --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: fiona-gladwin Co-authored-by: Kiriti Gowda Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * CI - Update precheckin.groovy * added separate kernels for doing flip when horizontal flip is not set * fixed build issue * Add supported case * reverted incorrect changes happened with merge --------- Signed-off-by: dependabot[bot] Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sam Wu Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> 
Co-authored-by: Lisa Co-authored-by: Sundarrajan98 Co-authored-by: Pavel Tcherniaev Co-authored-by: fiona-gladwin Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * RPP Vignette Tensor on HOST and HIP (#311) * Add Vignette Tensor HOST and HIP Implementation * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite 
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post incrmeent operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * 
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * 
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Add Vignette Tensor HOST and HIP Implementation * Address review comments * Update rpp_hip_common.hpp * Update vignette.hpp to add rpp_hip_math_nearbyintf8() * Update Tensor_hip.cpp to add hipHostFree * Fix init * Repeated initialization bugfix * Add host case 46 --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sam Wu Co-authored-by: Kiriti Gowda Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: Sundarrajan98 Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.37.1 to 0.38.0 in /docs/sphinx (#333) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.1 to 0.38.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.1...v0.38.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Resample (#310) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Intial commit - slice_audio * Intial commit - mel_filter_bank * Intial commit - spectrogram * Intial commit - resample * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Remove unused variables in header file * Add axes parameter * Replace Rpp64s with Rpp32s * Replace vectors with arrays Includes optimization * Cleanup * Cleanup * Cleanup and optimize * Move malloc outside openMP loop * Optimize and precompute cutOff * Cleanup * Fix buffer used * Fix buffer used * Additional Cleanup * Fix buffer allocation Includes minor optimization * Optimize post incrmeent operation * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - 
Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * 
fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * move Tensor_host_audio.cpp to host folder * fix qa mismatches * move Tensor_host_audio.cpp to host folder * fix qa mismatches * move Tensor_host_audio.cpp to host folder * Add spectrogram case in Tensor_host_audio.cpp * move Tensor_host_audio.cpp to host folder * fix qa mismatches * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * Add Doxygen comments * Add Doxygen comments * minor change * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python 
files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * removed unnecessary files * removed debugging print statement * updated copyright * updated description for resample based on latest changes * converted golden outputs for resample to binary files * Passed resampling window as a parameter to resampling function * RPP Magnitude on HOST and HIP (#278) * 
Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * removed unnecessary files; removed unnecessary validation checks in test suite * modified sinc to use ONE_OVER_6 macro * combined srcLength and channels into single tensor; removed the usage of quality parameter since it is not used in the kernel * minor change * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post incrmeent operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar 
multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = 
TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * used std functions for floor and ceil use static_cast instead of floor in the resample kernel * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * 
experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum 
Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * 
Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests 
for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 
function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * Cmake fix to prevent warning * Fix paths in new python scripts * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * Test suite fixes after tensor_min / tensor_max HOST merge * Fix max case * QA tests fix for hip and host * naming convention changes as per new std * Substitute imagePartial with partial * Substitute imageMin/imageMax with min/max * Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize * Use variable instead of batchCount*4 * Use post increment effectively * Resolve codacy warnings * Additional cleanup * remove unused variable * Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Remove auto merge boost * Spaces formatting * Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add support for mi300 (#269) * Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Cleanup by removing oneliner functions as inline * RPP Tensor Audio Support - To Decibels (#258) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Replace vectors with arrays * Cleanup * Replace Rpp64s with Rpp32s * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup 
* Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register 
--------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * Fix build errors and qa tests in Audio Test suite * Remove auto-merge repeated funcs * Improve clarity on header docs * made changes based on review comments * stored golden outputs of to_decibels in binary file removed golden output text files for non silent region * removed unused parameter in verify_output function * updated list of cases supported in python script * added error handling for opening golden output file * Codacy fix and tests warning fix * Codacy fix * Codacy fix trial * codacy fix for checking boundaries of fstream --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Documentation - Bump rocm-docs-core[api_reference] from 0.30.2 to 0.30.3 in /docs/sphinx (#274) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.2 to 0.30.3. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.2...v0.30.3) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Adding issue template (#270) * Add files via upload * added ROCm v6, MI300, default component * Fix cast used in testsuite Includes minor fixes * Fix displaying f16 outputs * Optimize HOST min/max reduce function further * Fix spacing in HIP kernels * Fix PLN1 outputs for u8 and i8 datatypes of HOST backend * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Store reference outputs via map for min and max kernels * Update tensor_max.hpp license * Update tensor_min.hpp license * Fix output comparison check * Merge branch 'ar/opt_tensor_min_tensor_max' of https://github.com/r-abishek/rpp into sn/tensor_min_max * Modify exit condition used in outer most kernel * Modify srcIdx for HIP Tensor min * Using maximum as 255 for HIP Tensor min * Modify srcIdx for HIP Tensor max kernel Also fixes build error in testsuite * Fix corrupted outputs displayed for Tensor sum * Fix corruption issue seen with tensor sum kernel * Fix minimum for I8 Tensor max kernel * Modified HIP buffer initialization with a common function * Fix redefinition * Remove additional variables xAlignedLength * Remove unwanted xAlignedLength and xDiff * Remove redefinition of TensorSumReferenceOutputs * Fix for CI issue * Add parenthesis --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: fiona-gladwin Co-authored-by: Kiriti Gowda Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * CI - Update precheckin.groovy * Png update (#316) * PNG file conversion * reference .png files * remove JPG files * edit IMAGE_PATH * RPP Test Suite Upgrade 6 - Restructure common HIP/HOST code (#315) * moved the common functions used in a python test suites to to a common python script created helper function for displaying QA test summary * reversed the order of performance runs loop and decode loop in all test suites * modified remaining python scripts to use print qa helper function for displaying QA results * added new helper function 
for print the performance test results as a summary * added caseMax, caseMin variables in image test suite made changes to run only necessary bitdepths needed incase of qa mode --------- Co-authored-by: sampath1117 * Fix build error * removed outBegin variable * remove duplicate line in readme --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 Co-authored-by: Sam Wu Co-authored-by: Pavel Tcherniaev Co-authored-by: fiona-gladwin Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com> * Docs - Missing input and output images for Doxygen (#331) * added missing outputs for image augmentations modified the description with correct output names * added gif for voxel input and outputs * modified the output images for water, resize_crop_mirror and resize_mirror_normalize --------- Co-authored-by: sampath1117 * Scratch buffers rename for HOST and HIP (#324) * Change all maskArr to scratchBufferHip * Change all tempFloatmem to scratchBufferHost * Update CMakeLists.txt Version updates * RPP BitwiseAND and BitwiseOR Tensor on HOST and HIP (#318) * HOST test suite update for voxel processing * Initial commit - Implements PLN1 Fmadd Kernel * Add dependencies for fmadd kernel * Implement NDHWC variant for Fmadd Includes testsuite changes to support 3 channels * Implement Slice HOST Kernel Includes testsuite changes * Fix NCDHW variant for Slice * Cleanup * Fix NDHWC variant * Fix stride used for NDHWC * Fix NDHWC layout handling in testsuite Temporarily converts pln3 inputs into pkd3 inputs later stores them as pln3 after processing * Add sample 
input .nii file Also fixes build error in testsuite * Fix NDHWC layout for fmadd and slice Also includes fixes in voxel testsuite * Initial commit - Bitwise AND HOST Tensor * Match u8 and i8 outputs with BatchPD variant * Fix i8 PKD3 -> PLN3 * Initial commit - Bitwise AND HIP Tensor Also includes fixing f16 and f32 datatype of HOST * Add reference outputs * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Modify reference outputs Update Copywrite * Combine templated functions to support all datatypes * Initial commit - Bitwise OR HOST * Fix GPU kernel details * Fix case number for HOST testsuite * Initial commit - Bitwise OR HIP Includes reference output * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Address review comments * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite 
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copywright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post incrmeent operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * 
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * 
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Update rppt_tensor_arithmetic_operations.h * Update rppt_tensor_arithmetic_operations.h * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all 
variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed 
cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for 
HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water 
augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor 
update * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * Cmake fix to prevent warning * Fix paths in new python scripts * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * Test suite fixes after tensor_min / tensor_max HOST merge * Fix max case * QA tests fix for hip and host * naming convention changes as per new std * Substitute imagePartial with partial * Substitute imageMin/imageMax with min/max * Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize * Use variable instead of batchCount*4 * Use post increment effectively * Resolve codacy warnings * Additional cleanup * remove unused variable * Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Remove auto merge boost * Spaces formatting * Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add support for mi300 (#269) * Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Cleanup by removing oneliner functions as inline * RPP Tensor Audio Support - To Decibels (#258) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Replace vectors with arrays * Cleanup * Replace Rpp64s with Rpp32s * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup 
* Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register 
--------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * Fix build errors and qa tests in Audio Test suite * Remove auto-merge repeated funcs * Improve clarity on header docs * made changes based on review comments * stored golden outputs of to_decibels in binary file removed golden output text files for non silent region * removed unused parameter in verify_output function * updated list of cases supported in python script * added error handling for opening golden output file * Codacy fix and tests warning fix * Codacy fix * Codacy fix trial * codacy fix for checking boundaries of fstream --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Documentation - Bump rocm-docs-core[api_reference] from 0.30.2 to 0.30.3 in /docs/sphinx (#274) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.2 to 0.30.3. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.2...v0.30.3) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Adding issue template (#270) * Add files via upload * added ROCm v6, MI300, default component * Fix cast used in testsuite Includes minor fixes * Fix displaying f16 outputs * Optimize HOST min/max reduce function further * Fix spacing in HIP kernels * Fix PLN1 outputs for u8 and i8 datatypes of HOST backend * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Store reference outputs via map for min and max kernels * Update tensor_max.hpp license * Update tensor_min.hpp license * Fix output comparison check * Merge branch 'ar/opt_tensor_min_tensor_max' of https://github.com/r-abishek/rpp into sn/tensor_min_max * Modify exit condition used in outer most kernel * Modify srcIdx for HIP Tensor min * Using maximum as 255 for HIP Tensor min * Modify srcIdx for HIP Tensor max kernel Also fixes build error in testsuite * Fix corrupted outputs displayed for Tensor sum * Fix corruption issue seen with tensor sum kernel * Fix minimum for I8 Tensor max kernel * Modified HIP buffer initialization with a common function * Fix redefinition * Remove additional variables xAlignedLength * Remove unwanted xAlignedLength and xDiff * Remove redefinition of TensorSumReferenceOutputs * Fix for CI issue * Add parenthesis --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: fiona-gladwin Co-authored-by: Kiriti Gowda Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * CI - Update precheckin.groovy * Move bitwise operations into under logical ops * Fix doxygen comments * Merge with master * Cleanup * Revert change in CMakeLists.txt * Add docs outputs * Cleanup --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sam Wu Co-authored-by: Kiriti Gowda Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> 
Co-authored-by: Lisa Co-authored-by: Sundarrajan98 Co-authored-by: Pavel Tcherniaev Co-authored-by: fiona-gladwin Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * Minor common-fixes for HIP (#345) * Use scratchBufferHip * minor fix * remove additional variable use * Add CHECK_RETURN_STATUS to hip API * handle fix * Readme Updates: --usecase=rocm (#349) * RPP Tensor Audio Support - Spectrogram (#312) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post incrmeent operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with 
file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments 
for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * Initial commit - Spectrogram * Add QA .bin reference file * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost 
LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Address internal review comments * Modify cmakelist * Fix QA mismatch * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t. QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold; add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files; fixed bug in ROI update for audio test suite; resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes *
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel; added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t. QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold; add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix *
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Fix build errors on OCL backend * Merge remote-tracking branch 'origin' into sn/audio_spectrogram_master_merge * Fix build error in tensor testsuite * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Update notes * Address review comments * Revert change in runTests.py * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Voxel Flip on HIP and HOST (#285) * added support for flip voxel * added test suite support * added golden outputs for flip voxel made changes in test suite to run QA tests for flip * updated golden outputs with correct values * minor bug fix in the hip test suite * made changes to variable names for better readability fixed comments in test suite minor cleanup * combined the flip axis factor as ternary operator in HIP kernel added new enum for error handling when source and destination layouts are not matching * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted flip voxel golden outputs to bin files * changed copyright from 2023 to 2024 * Update flip_voxel.hpp license * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro 
Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t. QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold; add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color
temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update
subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ...
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ...
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar 
multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = 
TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 
load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels separately * split kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for 
Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * 
changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp… * Update CHANGELOG.md (#352) * RPP Tensor Audio Support - Slice (#325) * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite 
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * 
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * 
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all 
variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed 
cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for 
HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water 
augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor 
update * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * Cmake fix to prevent warning * Fix paths in new python scripts * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * Test suite fixes after tensor_min / tensor_max HOST merge * Fix max case * QA tests fix for hip and host * naming convention changes as per new std * Substitute imagePartial with partial * Substitute imageMin/imageMax with min/max * Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize * Use variable instead of batchCount*4 * Use post increment effectivly * Resolve codacy warnings * Additional cleanup * remove unused variable * Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Remove auto merge boost * Spaces formatting * Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add support for mi300 (#269) * Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Cleanup by removing oneliner functions as inline * RPP Tensor Audio Support - To Decibels (#258) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Replace vectors with arrays * Cleanup * Replace Rpp64s with Rpp32s * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup 
* Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register 
--------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * Fix build errors and qa tests in Audio Test suite * Remove auto-merge repeated funcs * Improve clarity on header docs * made changes based on review comments * stored golden outputs of to_decibels in binary file removed golden output text files for non silent region * removed unused parameter in verify_output function * updated list of cases supported in python script * added error handling for opening golden output file * Codacy fix and tests warning fix * Codacy fix * Codacy fix trial * codacy fix for checking boundaries of fstream --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Documentation - Bump rocm-docs-core[api_reference] from 0.30.2 to 0.30.3 in /docs/sphinx (#274) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.2 to 0.30.3. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.2...v0.30.3) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Adding issue template (#270) * Add files via upload * added ROCm v6, MI300, default component * Fix cast used in testsuite Includes minor fixes * Fix displaying f16 outputs * Optimize HOST min/max reduce function further * Fix spacing in HIP kernels * Fix PLN1 outputs for u8 and i8 datatypes of HOST backend * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Store reference outputs via map for min and max kernels * Update tensor_max.hpp license * Update tensor_min.hpp license * Fix output comparison check * Merge branch 'ar/opt_tensor_min_tensor_max' of https://github.com/r-abishek/rpp into sn/tensor_min_max * Modify exit condition used in outer most kernel * Modify srcIdx for HIP Tensor min * Using maximum as 255 for HIP Tensor min * Modify srcIdx for HIP Tensor max kernel Also fixes build error in testsuite * Fix corrupted outputs displayed for Tensor sum * Fix corruption issue seen with tensor sum kernel * Fix minimum for I8 Tensor max kernel * Modified HIP buffer initialization with a common function * Fix redefinition * Remove additional variables xAlignedLength * Remove unwanted xAlignedLength and xDiff * Remove redefinition of TensorSumReferenceOutputs * Fix for CI issue * Add parenthesis --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: fiona-gladwin Co-authored-by: Kiriti Gowda Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * CI - Update precheckin.groovy * modified the slice kernel and api as per the latest changes * added test case of 1D slice in audio test suite * reverted unwanted changes * updated the slice voxel testing configuration to validate the kernel correctly * updated the description for slice voxel gpu kernel * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * revert incorrect changes happened with merge * fix build issue in test suite * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * added missed validation checks for slice api removed unnecessary param in HIP kernel * removed redundant variable * moved the initializations required for slice in test suite to a separate helper function * reorganized code for better reusability * add comment for init_slice_voxel() function * modify NSR kernel output types to make it compatible with latest slice * code cleanup added error code for layout mismatch * added slice test case in HOST Image test suite * added test case for slice in image HIP test suite * fixed layout condition check for NHWC slice kernel * minor change * added golden output for slice 2d and 3d cases * freed memory for buffers allocated for slice in test suite * updated the validation check for slice in voxel test suite * Update rpp_test_suite_common.h to add set_generic_descriptor_slice * Update Tensor_host.cpp * Update Tensor_hip.cpp * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Update notes * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Voxel Flip on HIP and HOST (#285) * added support for flip voxel * added test suite support * added golden outputs for flip voxel made changes in test suite to run QA tests for flip * updated golden outputs with correct values * minor bug fix in the hip test suite * made changes to variable names for better readability fixed comments in test suite minor cleanup * combined the flip axis factor as ternary operator in HIP kernel added new enum for error handling when source and destination layouts are not matching * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted flip voxel golden outputs to bin files * changed copyright from 2023 to 2024 * Update flip_voxel.hpp license * License - updates to 2024 and consistency changes (#298) * Match all 
CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copywright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2)… * RPP Tensor Audio Support - MelFilterBank (#332) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Intial commit - slice_audio * Intial commit - mel_filter_bank * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Remove unused variables in header file * Add axes parameter * Replace Rpp64s with Rpp32s * Replace vectors with arrays Includes optimization * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Fix buffer allocation Includes minor optimization * Optimize post incrmeent operation * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor 
change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor 
changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * move Tensor_host_audio.cpp to host folder * fix qa mismatches * move Tensor_host_audio.cpp to host folder * fix qa mismatches * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * Add Doxygen comments * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * Initial commit - Spectrogram * Add QA .bin reference file * License - 
updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Address internal review comments * Modify cmakelist * Fix QA mismatch * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copywright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post incrmeent operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * 
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * 
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Fix build errors on OCL backend * Fix spectrogram Removes slice kernel * Cleanup Modify reference outputs * Merge remote-tracking branch 'origin' into sn/audio_spectrogram_master_merge * Fix build error in tensor testsuite * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Change to camelCase for variable naming Also includes cleanup * Cleanup testsuite for MFB * Update notes * Address review comments * Revert change in runTests.py * Modified codes to use handle memory Also fixes reference output file * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Voxel Flip on HIP and HOST (#285) * added support for flip voxel * added test suite support * added golden outputs for flip voxel made changes in test suite to run QA tests for flip * updated golden outputs with correct values * minor bug fix in the hip test suite * made changes to variable names for better readability fixed comments in test suite minor cleanup * combined the flip axis factor as ternary operator in HIP kernel added new enum for error handling when source and destination layouts are not matching * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted flip voxel golden outputs to bin files * changed copyright from 2023 to 2024 * Update flip_voxel.hpp license * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro 
Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color 
temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update 
subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar 
multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = 
TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 
load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for 
Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * 
changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HI… * RPP Tensor Normalize ND on HOST and HIP (#335) * Change enum name * Support Batch processing Includes few fixes * Fix testsuite * Add Voxel unittest change testSuite CMakeLists * Add Doxygen Voxel augmentations * minor change * Add readme for Voxel test suite * Cleanup Includes modification in function naming for fmadd operation * Modify HIP testsuite * Optimize AVX Includes testsuite name change for normalize * Fix output dump issue in HIP and profiler logs * Move __AVX2__ flag * Changes to remove localThreads definitions, add _hip to kernel names * Fix QA reference inputs Also includes reverting to 16 pixel load for AVX * Fix codacy warnings * Fix toggle variant HWC -> CHW * Fix conflicting ROI types in API between HIP and HOST Also includes U8 support for slice * Use ROI Tensor instead of roi pointer * Add support for ND channel normalize * Add support for ND channel normalize * Fix usage of begin values Includes fixing of function names as per axis_mask * Add support for audio kernel * resolved issue with QA mode after U8 addition * made changes to display the exact variant being run in QA mode and performance test mode * minor change * resolved issue with unit test mode changed few variables from snake_case to camel case * reset DEBUG_MODE flag * resolved issue with HIP profiler tests * Add testsuite support for audio * Fix audio normalize testsuite Also adds QA reference outputs for normalize audio * Cleanup * Improve readability for normalize ND QA mode * Support ND axes normalize * Add templated C version for u8->f32 and i8->f32 * Update docs Also adds error code for invalid datatype for Slice kernel * Fix i8->f32 datatype * Update docs * Modify normalize testsuite to supporting any ND kernel Fix merge issues Also removes other voxel kernels * Fix audio testsuite and runMiscTests script * Disable QA tests when toggle is set in runMiscTests script * Support internal mean 
and stddev computation for 3D * Fix Axis mask for 3D Includes cleanup and testsuite changes * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Update rppdefs.h for comments on2D/ 3D types * Rename to fused_multiply_add_scalar * Implement collapse axis functionality for ND * Implement mean and stddev internal compute for ND normalize * Fix paramStride after collapse axis for ND * Fix build error * Fix mean and stddev compute in ND Cleanup * Cleanup * Additional cleanup * Fix strides for 2D and 3D Also includes fix for normalize ND kernel after collapse axis * minor changes * added QA inputs for 3D data * fixed issue with idx used for mean and std dev in case of ND Normalize * resolved the segfault issue with collapse axis for batch size > 1 * Fix 3d mean and stddev compute for axismask 5 Includes cleanup * Cleanup 2d audio kernel and fix audio testsuite Also handled striding for mean and stddev tensors when input dimensions within batch differs * Fix maxSize compute in normalize ND kernel * fixed normalization function for 3D * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water 
case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * 
Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * Change names of ref outputs * Fix host test suite cmake * Add Voxel tests for ctest and CI * Remove boost deps and change name fmadd to fused_multiply_add_scalar * Add project name to remove warning * Add scriptPath variable usage to make paths generic for CI * Move CHECK to header * Add C++17 warning fix * Add clarity in final QA result display - match voxel tests with other tensor tests * Build fixes * Fix merge issue of double call to set_max_dimensions * Add clarity on QA test final result * Add references for sample nii image usage * Remove tensor voxel slice augmentation output sample from main ReadMe * Codacy fix * resolved output mismatch issue with axismask5 * Fix index of roiTensor used in maxSize compute Includes cleanup Adds QA inputs and outputs for 3d axis 0,1 with mean and stddev input * Add QA for 4d with internal mean and stddev compute for axis 0,1,3 * Add extra QA tests to support code coverage * Add comments * Update doxygen for normalize ND Includes minor fix in audio testsuite * added normalize hip codes * reverted unwanted changes that happened with merge * remove ricap mods * removed unwanted file changes * minor bug fix * reverted back to 1 pixel load and store for 2D kernel for better performance * experimental change * removed experimental change made the compute mod function as inline * avoided the reuse of power inside for loop * allocated pinned memory in handle and used same buffers in normalize kernel * restructured code in ND kernel * made mean and stddev buffers as gpu memory instead of pinned memory * reverted back a few changes in test suite for supporting qa mode with axismask 3 * added condition to compute param index only when max param volume is not 1 * fixed the issue with numDims in normalize HOST * added initial version for mean compute of 2D inputs for axisMask1 axisMask2 * added executor for mean kernel launch for 2D
inputs * added kernels for mean compute for 2D inputs * added mean compute support for 2 axes cases for 3d inputs * added mean compute for axisMask 4 and axisMask 5 cases * added mean compute for axisMask 3 and axisMask 6 for 3d inputs * added support for axisMask 7 for 3D inputs * restructured kernel launch for mean compute for 2D and 3D inputs * combined all reduction kernels to single kernel * moved common reduction to a helper function so that it can be reused * added initial support for stddev 2d inputs * added stddev compute support for 2d and 3d inputs * bug fix on boundary condition && mean index calculation for 3D inputs * bug fix for axisMask 7 for 3D inputs * added initial support for nd mean and stddev compute * added final kernel for computing mean and std values for ND * optimized nd mean and stddev compute if number of means/stddev computations is less than max shared memory size * removed redundant code * nwc - fixed the performance issue with axismask 7 * resolved the performance issue with axisMask == 3 and axisMask == 4 * bug fix for axisMask == 4 * fixed the performance issues with axisMask 6 * removed the usage of mod calculation for normalize 2d kernel * removed the usage of mod calculation for normalize 3d kernels removed the usage of paramShape and paramStrides buffers from 2d and 3d kernels since not needed anymore * minor change for axisMask 6 * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files'
license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * modified the axisMask order in kernel for better categorization * categorized kernels into multiple sections and added info * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove
tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use
post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * moved normalize from geometric to statistical * removed commented lines in test suite * renamed normalize_generic.hpp to normalize.hpp updated copyright * moved common helper in misc HOST and HIP test suites to a separate header file * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ...
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode parameter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ...
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar
multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput =
TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * modified fill_roi_values function * made the changes w.r.t scriptPath * moved rpp_rsqrt_avx under rpp math helpers reverted unwanted file changes * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * 
added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and 
f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to 
tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST 
Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with 
latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * Cmake fix to prevent warning * Fix paths in new python scripts * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * Test suite fixes after tensor_min / tensor_max HOST merge * Fix max case * QA tests fix for hip and host * naming convention changes as per new std * Substitute imagePartial with partial * Substitute imageMin/imageMax with min/max * Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize * Use variable instead of batchCount*4 * Use post increment effectivly * Resolve codacy warnings * Additional cleanup * remove unused variable * Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Remove auto merge boost * Spaces formatting * Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add support for mi300 (#269) * Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Cleanup by removing oneliner functions as inline * RPP Tensor Audio Support - To Decibels (#258) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Replace vectors with arrays * Cleanup * Replace Rpp64s with Rpp32s * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup 
* Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register 
--------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * Fix build errors and qa tests in Audio Test suite * Remove auto-merge repeated funcs * Improve clarity on header docs * made changes based on review comments * stored golden outputs of to_decibels in binary file removed golden output text files for non silent region * removed unused parameter in verify_output function * updated list of cases supported in python script * added error handling for opening golden output file * Codacy fix and tests warning fix * Codacy fix * Codacy fix trial * codacy fix for checking boundaries of fstream --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Documentation - Bump rocm-docs-core[api_reference] from 0.30.2 to 0.30.3 in /docs/sphinx (#274) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.2 to 0.30.3. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.2...v0.30.3) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Adding issue template (#270) * Add files via upload * added ROCm v6, MI300, default component * Fix cast used in testsuite Includes minor fixes * Fix displaying f16 outputs * Optimize HOST min/max reduce function further * Fix spacing in HIP kernels * Fix PLN1 outputs for u8 and i8 datatypes of HOST backend * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Store reference outputs via map for min and max kernels * Update tensor_max.hpp license * Update tensor_min.hpp license * Fix output comparison check * Merge branch 'ar/opt_tensor_min_tensor_max' of https://github.com/r-abishek/rpp into sn/tensor_min_max * Modify exit condition used in outer most kernel * Modify srcIdx for HIP Tensor min * Using maximum as 255 for HIP Tensor min * Modify srcIdx for HIP Tensor max kernel Also fixes build error in testsuite * Fix corrupted outputs displayed for Tensor sum * Fix corruption issue seen with tensor sum kernel * Fix minimum for I8 Tensor max kernel * Modified HIP buffer initialization with a common function * Fix redefinition * Remove additional variables xAlignedLength * Remove unwanted xAlignedLength and xDiff * Remove redefinition of TensorSumReferenceOutputs * Fix for CI issue * Add parenthesis --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: fiona-gladwin Co-authored-by: Kiriti Gowda Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * CI - Update precheckin.groovy * added bin golden input and output for 2d data made changes in test suite to support the reading and output comparision from bin files removed the olde golden input and output .txt files * added golden inputs for 2d mean and std added golden output for 2d when mean and std is passed from user modified the helper functions to calculate the strides for 2 modes of normalize * added golden input and output for 3D data * fix for output mean and stddev outputs compute for axisMask 3 * fixed the precision issue with 3d 
normalization kernel when mean and std is passed from user further cleanup in test suite * use static_cast instead of c style casting * added template argument to kernels for supporting multiple bitdepths * Revert rpp_load24_f32pkd3_to_f32pln3_avx() Cleanup comments in HOST normalize * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Fixed output mismatch seen with 3d HOST normalize kernel when mean and stddev are passed from user * Fix outputs with 2d normalize HOST * Fix HOST 2d outputs when AxisMask is set to 1 with mean and stddev computed internally * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Change all maskArr to scratchBufferHip * Change all tempFloatmem to scratchBufferHost * Cleanup * combined multiple params as a single param wherever possible in kernel launch made the descriptor pointer as pinned memory * Removed the unnecessary memcpy for ND normalize * added axisMask as additional param from test suite added caseMin, caseMax changes and qaMode parameter to python test suite used helper function for displaying qa mode results * remove unncessary variable in test suite added roi start co-ordinates in index calculation * updated source index calculation with roi begin values for 2d and nd mean, stddev compute kernels * change variable from snake case to camel case updated source index calculation with roi begin values for 3d mean, stddev compute kernels * Modify HOST testsuite to process AxisMask * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Update notes * fix the logic for ND ROI based index calculation * added helper function for setting the description pointer in misc test suite * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCom… * SWDEV-459739 - Remove the package obsolete setting (#353) The package was obsoleting itself and was causing upgrade issues. Removed the same. * Audio support merge commit fixes (#354) * add NFT and NTF layouts * Set layout for spectrogram and melfilterbank directly in testsuite * Remove extra blank line in testsuite * minor changes in test suite * minor change in MFB description --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 * Revert unnecessary merge changes * minor change * Address review comments * Address review comments * convert SSE functions to AVX * Resolve Review comments Add Vectorized f16 code fix the output mismatch between HOST and HIP * Resolve review comments * Add jitter in non QA case list and random Outputs list change indexing for f32 and f16 loads * Add Warp Affine test case in test suite --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: fiona-gladwin Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sam Wu Co-authored-by: Kiriti Gowda Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan 
<118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: Sundarrajan98 Co-authored-by: Pavel Tcherniaev Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com> Co-authored-by: raramakr <91213141+raramakr@users.noreply.github.com> --- .../effects_augmentations_jitter_150x150.png | Bin 0 -> 10736 bytes include/rppt_tensor_effects_augmentations.h | 44 + src/include/cpu/rpp_cpu_common.hpp | 19 + src/include/cpu/rpp_cpu_simd.hpp | 76 ++ src/include/hip/rpp_hip_common.hpp | 3 +- .../cpu/host_tensor_effects_augmentations.hpp | 1 + src/modules/cpu/kernel/jitter.hpp | 929 ++++++++++++++++++ .../hip/hip_tensor_effects_augmentations.hpp | 1 + src/modules/hip/kernel/jitter.hpp | 314 ++++++ .../rppt_tensor_effects_augmentations.cpp | 157 +++ utilities/test_suite/HIP/Tensor_hip.cpp | 59 +- utilities/test_suite/HIP/runTests.py | 4 +- utilities/test_suite/HOST/Tensor_host.cpp | 56 +- utilities/test_suite/HOST/runTests.py | 4 +- utilities/test_suite/rpp_test_suite_common.h | 2 + 15 files changed, 1658 insertions(+), 11 deletions(-) create mode 100644 docs/data/doxygenOutputs/effects_augmentations_jitter_150x150.png create mode 100644 src/modules/cpu/kernel/jitter.hpp create mode 100644 src/modules/hip/kernel/jitter.hpp diff --git a/docs/data/doxygenOutputs/effects_augmentations_jitter_150x150.png b/docs/data/doxygenOutputs/effects_augmentations_jitter_150x150.png new file mode 100644 index 0000000000000000000000000000000000000000..8aef1cbe6c3edf3c8b0139441fa8303c8ef0f1d1 GIT binary patch literal 10736 zcmbWcWl$VX)c?D!QDN9;1=A2%i``5JUF`qx8M-m-Q6`1+#xuNW${11=c&5) z&3$q2JyU&Zrsnk2_tRa|eY*Sg&+8TdS3yQz1^@>K0KmOnfY()k6aXIXKl{(a|7VB@ z|9K=tLD$S7zi$jGP|sA%Z_+1ni^20G?{5C47Sf2+bHAs`^3qadUF zPssl#_u2!%Lj|M&QV`&10q}Tm2zYR>0|4qbok;&9?@jFg8XPqI0`h@&W%@l>!g@8L4oQDdOtnDRKpS`5xF>?(;Lnk68 
zAtj^#z`)4F%*)3wASfg(^;ud*R!&|)LsLszM^{hZ+``hz+6H9n=I-I?{nfVLt|5OOKV$S|G?nT@W|-c-2B4t#ic*XE89D} zd;156e~*r#SJyYUcmH7b5C3t&0TBKd>&^c^a^byk!T;9>DF1Q6!F#I27Qga}y4R{hrNe z%F_s5Fef=tSOmoqQnmi8`EBU zkt-$g$D+>?#R3>k=6^s9)q_gNQ2X29H&SGT;zw4a2ruA6|C#=~rk`g;>1WPa^ zC$r$iT=Z#29I%?f3uakOc?lCZ3u;y@f32R=QqGRiD4C9ITerOdkBgCiMuZT$cU9si z&en(*vb;*hh^?&uBu{H3O)Icy>O!26T~-M+>WI0D=o8Jmouuy3qt1wZ#ypPgOwTx- ziJ`Z5#pL{=H9(Z$6Wz|?Npl*#xSoUw>>H@n$0@X7F&4Gccnrd6v1!%I!bgrVa+DH7 zt(dHFep7EoPGL7Su{4x;PM+IWfM8`rJ>r=bDOI-us9s}g4bx=r3rhj@sq_zIfc1<< zcm0phZyz80v}^0>Fd3&+e&kvV8)jjFJUK=KE_Xf(JiVFew-gSiUS@BEykT$|&>dLpGfcMZd6Hy)|EG zB94Stv{v16aC7h`g`2*emCx28LANCX0*FM^MP-700%}!)7;9d}yKQA~K*=@ppdGCb z?^=6%%5-F-3vMPkjAk9M3J4|)1~{S=Nt|~5;#mb|%o)%c(97S#fp%5tU61Y&Ui{|w ziMhE02LQcacxGnxcY5ztgVdzOHQ6ofvblUqcPV8T3tlEjkZP9}DBO$)GJKzYG9cA; zg_#qQgL+fX;kmBBqX(^XHRtgGH|r(Xp$|d1PPXQMxI z3es@^_TK4(mqR$c5pPN=Q%PACE>ZMnbyJp;?kJv4J7pcSTZ@#%T+t-mmJ+Tn^d~J_ zzhNx>1XjZLyaK#PDKOoC$-lcG>wbw!F2<94sQ8@q^UU&W^c8@<^ho3zywv!6UdB{a)# zA-@M#vvl|ii@be5K;(guP?X37m!HCGm>2zYOj)F@w3l-D{+y$VY|jW9w-E%01lJ31rksn=?|EG&)Cpv_VFI1cZNg1J z2tOUbvot^6g6im)iupPHh1NSry=_h6{hKYsoZixv^AjQfRHZRkwDB&Sh8s0AzW6lB}W$wQtNdUptH+ zBbJjU0=0Z53tpD*;1xjd3b0hghV2(cTn0pA#FS@dD_n8bP2GoKCh&?RZ#MiYg9G%a zuab(hpbadPSgHjXFYV^t2*yHO7Sr#$`F{kb14%6@gQn+7yV5>Qr?^bh1h!Te1Oi`% zbJ#D#e?NIz`K7s!^SF5giM2^1{6aYBaHNclA!kH_M-ho2M)B}o*dSP#0QnH5a<7iO z0@?$@4&e+O{j!7uywD$P1%Huzzt(_QM`|bQ<2THXj+-`sV;{C$)QZwwKT--O3cUhO zmeklo8NvBib%ZtE6jUnWS!5AnBm?0om_)~LC!4|XA}?AqmQ%4v^G2v7ds*fS3sJ=) zTilc_Tk&>3m1#e&_93){z-`680L#*1Lw+MH|++Z(!wkyNb>}&Y0ZfE-7BR_2a4wore4cPB- zF}9f~(Vt>o2rogy1m|I^OheJD!llux$^G8XTfJ_Q`W20IHEMjFwSzx)M z%2US51AhlgW=!%+4g7&9xJ5^5>lN^Q6T6E}Z?yDD1bUEKvD#0ZF|WIhtU;W~Dybz# z#t%W8wIG$@9Dx%drw);+YR@fibE>iZNf=JULH8C0o?L7q6D=~{gHNS%fDQx4Y*P!zwT zX@PhDcsE_`Yxh@#`#74b-#p8_*yZHNSocZ!8GJKxR_Wl6Q0#-*5@FZgK1z4Z{gc#^ z993Gq-E&BYC*?gpEl%RGF{-Fmp)AX$?kWH2xVM&G3S|M@inLKI+GbxpUTqyR%c-+Z z6)`DBpD}yM#11D(7V8<~(c0B0uZ&4|8?&B(quP8|&@@al_Ye0dB-+Qm?#EURiumuf 
z>aeO9Ts2y}CW~wGJrpNNKdO(wCTEwa_W{Uqq>!D3kit-Mcv3kA-~!%@FZ76MFpDFd zLdrW(US~fp;w{^~-(8?e)^g{_q@~f4ofyp#d$5S_&S9CcocSDYbV;xq;_DlfTdC=U zEkiGI_UF=>#2rZS_%_W8mAex2v<@og%XT2-c0vk+l9Y|mW$c;X)9lM36th&f&LmHd zzr$ltDpv75e^&jZvHF(@h-g!v70KB36qI`uZ8*a20{vT9b5Fwm^6^IK;>A~M+yN!A z9X_dNFQIzj6`+3tyxff&Tj%+H&+y>(3RvWG9hc} zZZ@0jy_Jk;5s-tGL%0>^%eXXdS01$$O%KAiX<`;mZ2#0NB6%N*SjvDBg5-g_X|rjg z(>1UE3iy{}Z(nBtPuP^Wc22XEMdd?rBQZS}$yzbJs#QdZv!)7TkbNTZ>oabY2KZigQslbkYKQN_ zyGrux9vqrFx7YRO9)_+KMPfw8$-^ONc{aaOu@KH_NHoAYP0t`pDgQ%pc zU_qxk)K)>C^*e^SoAMTIMd})57y7YLCrt3#>T$iI=dO8Eqel{=OFpHKIZ6Si z`+f49gwuKTtc6W;thwjub!b))@cuW8n2M3YfdRQ$ya}e2?yC6LKTmzw!jvt77vK|P z4w%LZy1-@3jzxxY_F)`b2SbHz5?coq^-`lEiDUR-MhMqF+d!_J&N@LYTgRUqQlLpM zyBXo`D}XR)tN0aw7zhvR8{ZAA-4zjY9eUA@YPdl5jwfx*1Sp+3aV)ReaE?0wi6HWc z-)eph9rwtn;+Gu}mf`h}sTw78ywKXNpoHGIm3t={*F{g}$8H!ZyO$A2s}r90Fxycr-7uTXVI|LzRgRu>Cjg|gK8uTK-7BE1avmcmTfqC7H z%y!|!1mrWfZ2yvNt~y#SoO9%d2s&&%ii&?8+F4{GkpU|Dd4jcViVVinX8Ul~4c)hS zgn!Uw1p+B#jL>UAr+(A1;)oz)>@MV!;HHz-ZS`|qw1_ul&_*2e7BeTIYrIGjr(PX@moWOE<3)* zikX+S{7=^)kw%Gxi$@WE!!Ae=xoul`a!$4fV9veLN`9xVC78pcq1K1aITg)jRS7Kr zy)TML+U7n;11)El#cizcR?%tV8cNEJpJkRuUUhifu!-7$y4}2Z)6T&?r<<~EnQ)N81Vl3xAl{^}pB-0(teaHp@#t#lYWsBe_v1+q zjO`ExU>d<4#84tKiE4`XP$2w7i3B%ca~ugqc@?$M&Q=(xNT-`+s3E+y))~?( z0Eyx``3cUh&|9RZoAQaWjK2seCvS&Zn0#R2=lr>DgOX^wo(r#IVs4EgR~8y#e5vWO zD%P+!NcP{@^!Y15V$wgVy7$lb@qenSC-b)=645y#mpZ*15x&&}lYtAPCEkHUcB`n-b6ifaNKDN#NQ7_S$dpmmtuU<)lTNYu=7duUU8aGyc=G>jQ*?4g_xm&%$F!az=;qbLzYQcU5i$}S!BWb$n z<8>FE+2ld(yLCoPQalbiO+0wJe)m^E^uut-g@d~u@2VihI0aS!YCOn#W6a`OskJZn z1If_hmtR*xn5*zUge`Z?i~9Mg96`CNTANNiH#JX`Dw-M<&le{_va2s>kH4YhyN5ka z;Z2@8fo#x&_{|kOx`%+y;ygr%suS(LBca~V-?XxUm}DVCDx2ydG2kxWMlNRRgK|@< zzzPYI-!u2vt+@K+)%{G(D0ZYX02~VqQP6rnTNLrUcZjF{v z9h;0lL=0%kOta$uh;+IN4mFkSp4fG=EAn6-+1Ax1C*ysLZ5~en;KPIex_&~*(hf}8 zMr>IKkJV3ZyQd)?2+<8w1$TTS{Bd?4WrF>2WO}{wQ6lVx4oX$%FIEIvZf&{J?VI5> zP##*0qs-PJWhDj>vTDhkjo}Qaa#rkn0Z+{%!y9CD!a7V?*~zt<+#XA(SFS$d4b3K- zjZ7dUS#uX*#ka~-@;Awn{R@a 
zQ+KEnp6u){PqsIrygUinG7QS92x2^4@Y737L+vY030=dJdt+0mQum3Hh$$)pw;~xpP1pgVvgybIGDO=3{#Jc-dRz!!ssNoyVoIRq$4-dnpz@#U*mB> zW!pBCi}_G_(=6{mPM=+o+0kOo-I1xz$wlS_Ui)R@WJP02mVv*R7Bxc3y9XYdvL*3O$c>&Yds@gMc zU5jDmlECRMh5)$Rb~eccRO>Xh4(DDAuozM+?_DPL^-8)(&Mf#qrIMKXda!Lo z1~4{qS7P`(UkjfAlxU`?h#*tmy=CCDX2b7<4@PRcIlfmat)VqK2S3D73X3K3C z@%{nT$Iz3nOwg*zApewn)QelYdq?_5{=P1GAHC585Y6yEIWtt6jGJkMC#K&X7^}%~ z%N2a-r1b4gu|=I3L&v@UMCGQ*6QP)lw7MhXED=V234g8}tQE;lSN0nxm!tO&8yYMd zq6&>6Je|7^6R0Z$1MRWlvFUKe46zDcPU*T{$3OgcQ`k3Q+hET|rMP;edsUF*tV^IrJEv~{%z7xCbBn7W%G^N?g=Xf>x0K8R!^iMh+P>T)FFq%g}}yH`Mb26iq=gi}?1eQW%f3sRCQW#{^;ypvOx6lU$Rq0i5j zM}==8oZDKG7s>jWPALY&WIX!3Ca{7534Fdy@{Qv_7N|^R)yYakC4cRQJX6#gShJ`SAX%5w z68SCA^?@O}Q37Do^p(D?Suy>zyQWdMY=d%fBs#zR)+M5G)?Jrc%6 zJt{ZeM&547@!#*`)rm9rGQ1|S@`$HXm)rtyvTwe3ugqLzCIhwS=%RwiRAbEU*=!y3G0fsS@-H;YD!Ru%qOTR6-0 zJCpwgS=uXIgK%|^RvP2>J2&!QxQUA&R_u_~^(Zh}44cyisYou_Uu>4fyg%GJQFu!X zX6WlR;3-3>!sac=F0xC{9 zewgzLxj2iV&e`@N3Zt-8O_S3`)r#RkT@h^-4MaF5zVO*zchk!0o08j&euuT;2$83| z-;xlGIb^Pqw;bZ$^vHDtPf^_Z(NY;JkTH~9(=S$6iFKu#7Q=?->~c^4R1%?C358pK zX2$IGtuKZT(bGgWWY=Rmk-!l(@Cm?97mHh8P%KmEyMq#XqFPvb7t(}jUgNLGtVkuw z(3+9R-Uhu*tCq_Aa{z8l$Cy?Rp65t3$bnu(eiXl9E506`prj|UULh7; z7|Nf{HF3n}EOCqt@nZH$-ZXoMuLiA+M$#p5pS+t4g?=G7 zcjDo*8zn?NHot9>*^3%k2ULT+h-=6%?op1)IqvFo4a03V41gF4JG^EW_$bMA8!FQr zZtmg)&uLw}un0ELsUvO7m&=8(X7ou4Z2nDAsGr+BS*SVVfEbE*%A9PWbKDX;i@RQtzn9{S_t8CR~u1ZTNKoJw?2%^iQmy_Jda7TcZbPdHf_jxj z{dQgfcrrDxKo0EXAUdy8=D^(Xuw<6D*MLO7uGU>f-!U=C~R~)6l*HAx1y-xj7e@T;+3y z6jGK>{g29LpCDs!*2(@&gWOHF+ctLS<;9LbG;d5ol)i2M_Mo5B#$UwsKPLhX#g*ah z`ZR7FnRj6tX;lsbN>6Bka@Nq|=F5T}qKxE}bn9Z<8Cc-Kwo`}!!Y_?cy<;Y3>06bJr4;=H^FqIX@*rC>?~|2=h(1yghif7+ zr?YFxnGGi_h+0=1gTGVBIV98COlW(~_K}Fj03*15U1%_F;MJ9?5HJF8wBywTD)rX3 zb4*c>R2aICyBnszZO;Q+>gP<`ugp8)kS5;&*S}4BwwMk@(b}y>?d${?nLj=;y47iPqzH*xNBRrw zI1w9mWo`$+zXCMcqG`_@O7uo~e(hlmFw$Yx9Ip8;PpoM98%YI6{jC7)u(L$cdwq<5fe zCch66G{+mVnjYcSEfo^^!t*Y8oH}hQ3(J@scC8AvJ9_Yp**2mbT2;~e@)fkgSl8OE 
z(TwAoFH0Ul4&W#D3+^3sTgqI`BcW8ie+7sxso*EomuYy-^i@hIhr-g;uwz1a}1ad6w)HYo!)qY(q=d!c>Y}%4KJqc9e=VUnef}1@TpDGV^ z>W}=bzx(LeixlRBG+7}4FF#}5!L~z}wKa}h6bNNXas+^uN z>sVDupE=RRXhMv$S7!DV4l{TcVirBaTHV5HN_U-r6u8{ouzyfi2|MZ0Fw|cer)e}s zQx2fGC^6EL@^N3R8}$UbN9evRi?E_zE2udX*rW|N|BjH3DkGlE8yD@%GIqQ5GKPUK zPvoP~{H1o+h5N=x^bU!U1#=cD6dBTnzEBbZ=qQLxGGXd_Fyn!Q)x)iPZ@)8YY^T*P z2G7oaRXxN*7Uu>7D7>odiw#nnkPl29$MP)6^$pCK8I#GnP{86`u#F|9#V%IR!(G-& zAj7*o>g%FdQ{7m8M`ixtDy@FPP-XUl8G)bC8azKShp4)$vLB~VphSB+1usTpD06eP0 z<>Jdibp(~V>w|4WIr==QfG~{hXX$|^H{D|{P#_uPN;iD3n=W3opI3)y?8;GB{lq_p zc_fCsKwee-`DdW`lVnX3H21rxVxFYG;`VAM)kQBck&c}mU83PQqNf+9vL$akqChj2 zXw6WF3nCXBI?{DVy$)U zsXCm3&kY@-&+ltJNSM$gPBQl-BJsTvxKP!QMsb-rl2!pO1lj)k8qil^ zLIg3}b>-oGI;e#jD|KX7o$_Y(cp%{)zXV*rW35YcbUEPdbNChIG!CT|Dsl9^wPgR{0gShv^sKx0`YEUNTFqpSeEev{STjb=ZlF_7gsm z9IYBz8L3EgLaLA1`Q-ZgYB|0c2VU$59#@t*3h8cqmgigQW-bCzU`a$l@sttz%PBD& zos7@6+5^D%44!@kR5i@!rTx)Vk|Y=z5-OL@Qt80z5~rr+80f za^!YK)K_Iq6H#a^g25S~D$PyZosWc5^-d-cOeBVeQHXxYq+vnr+9Fw{ALUk4)hEhZ zBmd;!!!$1Ew;952jo6LhyyGu)7u_XTROa}?;=T^Xbu@Dlrp#?<2{_PsxemH`5I7<7 z!@v&7fTWR$YkhAL#!8DjJ+mS1XeaDL>?8IhK*D zIR|t=d&?Ov_59C>%*C4AA%Y|TPsCEvsgLkAiDLP22N1B?N5VdJz| zZ^0x)e0IBEj7arXV1AmM{@BU$5~MmtW^Rxx>rb^9hz`4>l-jXAGlpV(ZH73==vo9} zjhZ8^uMb3FRLmh9Djj>Owp9L%CUy=6^Bi~-?>yBKZ z?!SeNZ&h+{Z%RYqnp?;k9tWIPCJP5G(<1cmH74~Cc_#3g@Gw8{N@4zZGKLx`!+f?E zU}{^Fc18vlSW?ePV!v-YvliXR%|-DDf@jk$<>vSlyLqPK(7q!yPq;>>`OPhC6c49( zmk;`u#QRJ7%a$7GF1ObXHYJ%-7UIUHVo<0d$@MUZ!a)=%YFTY1OFt&kAjU#B<6j!o zj1;9Hs_G({y9k7qbR23qowh&QTiep#%Pt?v{W5?elqfKk8=%(YQLVcBwHk(luTZBX zgfm8!z$Ufik%D4DQ;hkMHAGAHipi7@yOq+^nOKz!n@6f8__Jxalag@ z24(d)G|nZKSLO|1vfIkQ{x0e4iX4eKi;hcRM$%!N_9G+x7fWGC)e}_dg$(Pi3ksl4a<$lceUDza~nx z_FFOQaNh4QR!_OwRLFP + * - srcPtr depth ranges - Rpp8u (0 to 255), Rpp16f (0 to 1), Rpp32f (0 to 1), Rpp8s (-128 to 127). + * - dstPtr depth ranges - Will be same depth as srcPtr. 
+ * \image html img150x150.png Sample Input + * \image html effects_augmentations_jitter_img150x150.png Sample Output + * \param [in] srcPtr source tensor in HOST memory + * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) + * \param [out] dstPtr destination tensor in HOST memory + * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) + * \param [in] kernelSizeTensor kernelsize value for jitter calculation (kernelSize = 3/5/7 for optimal use) + * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) + * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() + * \return A \ref RppStatus enumeration. + * \retval RPP_SUCCESS Successful completion. + * \retval RPP_ERROR* Unsuccessful completion. + */ +RppStatus rppt_jitter_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32u *kernelSizeTensor, Rpp32u seed, RpptROIPtr roiTensorPtrSrc, RpptRoiType roiType, rppHandle_t rppHandle); + +#ifdef GPU_SUPPORT +/*! \brief Jitter augmentation on HIP backend for a NCHW/NHWC layout tensor + * \details The jitter augmentation adds a jitter effect for a batch of RGB(3 channel) / greyscale(1 channel) images with an NHWC/NCHW tensor layout.
+ * - srcPtr depth ranges - Rpp8u (0 to 255), Rpp16f (0 to 1), Rpp32f (0 to 1), Rpp8s (-128 to 127). + * - dstPtr depth ranges - Will be same depth as srcPtr. + * \image html img150x150.png Sample Input + * \image html effects_augmentations_jitter_img150x150.png Sample Output + * \param [in] srcPtr source tensor in HIP memory + * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) + * \param [out] dstPtr destination tensor in HIP memory + * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) + * \param [in] kernelSizeTensor kernelsize value for jitter calculation (kernelSize = 3/5/7 for optimal use) + * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) + * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() + * \return A \ref RppStatus enumeration. + * \retval RPP_SUCCESS Successful completion. + * \retval RPP_ERROR* Unsuccessful completion. + */ +RppStatus rppt_jitter_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32u *kernelSizeTensor, Rpp32u seed, RpptROIPtr roiTensorPtrSrc, RpptRoiType roiType, rppHandle_t rppHandle); +#endif // GPU_SUPPORT + /*! \brief Gaussian noise augmentation on HOST backend * \details This function adds gaussian noise to a batch of 4D tensors. * Support added for u8 -> u8, f32 -> f32 datatypes. 
diff --git a/src/include/cpu/rpp_cpu_common.hpp b/src/include/cpu/rpp_cpu_common.hpp index 779f6f2d1..be8eaeeaa 100644 --- a/src/include/cpu/rpp_cpu_common.hpp +++ b/src/include/cpu/rpp_cpu_common.hpp @@ -6111,6 +6111,25 @@ inline void compute_separable_horizontal_resample(Rpp32f *inputPtr, T *outputPtr } } +inline void compute_jitter_src_loc_avx(__m256i *pxXorwowStateX, __m256i *pxXorwowStateCounter, __m256 &pRow, __m256 &pCol, __m256 &pKernelSize, __m256 &pBound, __m256 &pHeightLimit, __m256 &pWidthLimit, __m256 &pStride, __m256 &pChannel, Rpp32s *srcLoc) +{ + __m256 pRngX = rpp_host_rng_xorwow_8_f32_avx(pxXorwowStateX, pxXorwowStateCounter); + __m256 pRngY = rpp_host_rng_xorwow_8_f32_avx(pxXorwowStateX, pxXorwowStateCounter); + __m256 pX = _mm256_mul_ps(pRngX, pKernelSize); + __m256 pY = _mm256_mul_ps(pRngY, pKernelSize); + pX = _mm256_max_ps(_mm256_min_ps(_mm256_floor_ps(_mm256_add_ps(pRow, _mm256_sub_ps(pX, pBound))), pHeightLimit), avx_p0); + pY = _mm256_max_ps(_mm256_min_ps(_mm256_floor_ps(_mm256_add_ps(pCol, _mm256_sub_ps(pY, pBound))), pWidthLimit), avx_p0); + __m256i pxSrcLoc = _mm256_cvtps_epi32(_mm256_fmadd_ps(pX, pStride, _mm256_mul_ps(pY, pChannel))); + _mm256_storeu_si256((__m256i*) srcLoc, pxSrcLoc); +} + +inline void compute_jitter_src_loc(RpptXorwowStateBoxMuller *xorwowState, Rpp32s row, Rpp32s col, Rpp32s kSize, Rpp32s heightLimit, Rpp32s widthLimit, Rpp32s stride, Rpp32s bound, Rpp32s channels, Rpp32s &loc) +{ + Rpp32u heightIncrement = rpp_host_rng_xorwow_f32(xorwowState) * kSize; + Rpp32u widthIncrement = rpp_host_rng_xorwow_f32(xorwowState) * kSize; + loc = std::max(std::min(static_cast<int>(row + heightIncrement - bound), heightLimit), 0) * stride; + loc += std::max(std::min(static_cast<int>(col + widthIncrement - bound), (widthLimit - 1)), 0) * channels; +} inline void compute_sum_16_host(__m256i *p, __m256i *pSum) { pSum[0] = _mm256_add_epi32(_mm256_add_epi32(p[0], p[1]), pSum[0]); //add 16 values to 8 diff --git a/src/include/cpu/rpp_cpu_simd.hpp 
b/src/include/cpu/rpp_cpu_simd.hpp index bd7da2a5d..b9e79c146 100644 --- a/src/include/cpu/rpp_cpu_simd.hpp +++ b/src/include/cpu/rpp_cpu_simd.hpp @@ -3859,6 +3859,20 @@ inline void rpp_resize_nn_load_u8pkd3(Rpp8u *srcRowPtrsForInterp, Rpp32s *loc, _ p = _mm_shuffle_epi8(px[0], xmm_pkd_mask); // Shuffle to obtain 4 RGB [R01|G01|B01|R11|G11|B11|R21|G21|B21|R31|G31|B31|00|00|00|00] } +template <typename T> +inline void rpp_resize_nn_extract_pkd3_avx(T *srcRowPtrsForInterp, Rpp32s *loc, __m256i &p) +{ + p = _mm256_setr_epi8(*(srcRowPtrsForInterp + loc[0]), *(srcRowPtrsForInterp + loc[0] + 1), *(srcRowPtrsForInterp + loc[0] + 2), + *(srcRowPtrsForInterp + loc[1]), *(srcRowPtrsForInterp + loc[1] + 1), *(srcRowPtrsForInterp + loc[1] + 2), + *(srcRowPtrsForInterp + loc[2]), *(srcRowPtrsForInterp + loc[2] + 1), *(srcRowPtrsForInterp + loc[2] + 2), + *(srcRowPtrsForInterp + loc[3]), *(srcRowPtrsForInterp + loc[3] + 1), *(srcRowPtrsForInterp + loc[3] + 2), + *(srcRowPtrsForInterp + loc[4]), *(srcRowPtrsForInterp + loc[4] + 1), *(srcRowPtrsForInterp + loc[4] + 2), + *(srcRowPtrsForInterp + loc[5]), *(srcRowPtrsForInterp + loc[5] + 1), *(srcRowPtrsForInterp + loc[5] + 2), + *(srcRowPtrsForInterp + loc[6]), *(srcRowPtrsForInterp + loc[6] + 1), *(srcRowPtrsForInterp + loc[6] + 2), + *(srcRowPtrsForInterp + loc[7]), *(srcRowPtrsForInterp + loc[7] + 1), *(srcRowPtrsForInterp + loc[7] + 2), + 0, 0, 0, 0, 0, 0, 0, 0); +} + inline void rpp_resize_nn_load_u8pln1(Rpp8u *srcRowPtrsForInterp, Rpp32s *loc, __m128i &p) { __m128i px[4]; @@ -3871,6 +3885,16 @@ inline void rpp_resize_nn_load_u8pln1(Rpp8u *srcRowPtrsForInterp, Rpp32s *loc, _ p = _mm_unpacklo_epi8(px[0], px[1]); // unpack to obtain [R01|R11|R21|R31|00|00|00|00|00|00|00|00|00|00|00|00] } +template <typename T> +inline void rpp_resize_nn_extract_pln1_avx(T *srcRowPtrsForInterp, Rpp32s *loc, __m256i &p) +{ + p = _mm256_setr_epi8(*(srcRowPtrsForInterp + loc[0]), *(srcRowPtrsForInterp + loc[1]), + *(srcRowPtrsForInterp + loc[2]), *(srcRowPtrsForInterp + 
loc[3]), + *(srcRowPtrsForInterp + loc[4]), *(srcRowPtrsForInterp + loc[5]), + *(srcRowPtrsForInterp + loc[6]), *(srcRowPtrsForInterp + loc[7]), + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0); +} + inline void rpp_resize_nn_load_f32pkd3_to_f32pln3(Rpp32f *srcRowPtrsForInterp, Rpp32s *loc, __m128 *p) { p[0] = _mm_loadu_ps(srcRowPtrsForInterp + loc[0]); // LOC0 load [R01|G01|B01|R02] - Need RGB 01 @@ -3880,6 +3904,42 @@ inline void rpp_resize_nn_load_f32pkd3_to_f32pln3(Rpp32f *srcRowPtrsForInterp, R _MM_TRANSPOSE4_PS(p[0], p[1], p[2], pTemp); // Transpose to obtain RGB in each vector } +inline void rpp_resize_nn_load_f32pkd3_to_f32pln3_avx(Rpp32f *srcRowPtrsForInterp, Rpp32s *loc, __m256 *p) +{ + __m128 p128[8]; + p128[0] = _mm_loadu_ps(srcRowPtrsForInterp + loc[0]); + p128[1] = _mm_loadu_ps(srcRowPtrsForInterp + loc[1]); + p128[2] = _mm_loadu_ps(srcRowPtrsForInterp + loc[2]); + p128[3] = _mm_loadu_ps(srcRowPtrsForInterp + loc[3]); + _MM_TRANSPOSE4_PS(p128[0], p128[1], p128[2], p128[3]); + p128[4] = _mm_loadu_ps(srcRowPtrsForInterp + loc[4]); + p128[5] = _mm_loadu_ps(srcRowPtrsForInterp + loc[5]); + p128[6] = _mm_loadu_ps(srcRowPtrsForInterp + loc[6]); + p128[7] = _mm_loadu_ps(srcRowPtrsForInterp + loc[7]); + _MM_TRANSPOSE4_PS(p128[4], p128[5], p128[6], p128[7]); + p[0] = _mm256_setr_m128(p128[0], p128[4]); + p[1] = _mm256_setr_m128(p128[1], p128[5]); + p[2] = _mm256_setr_m128(p128[2], p128[6]); +} + +inline void rpp_resize_nn_load_f16pkd3_to_f32pln3_avx(Rpp16f *srcRowPtrsForInterp, Rpp32s *loc, __m256 *p) +{ + p[0] = _mm256_setr_ps((Rpp32f)*(srcRowPtrsForInterp + loc[0]), (Rpp32f)*(srcRowPtrsForInterp + loc[1]), + (Rpp32f)*(srcRowPtrsForInterp + loc[2]), (Rpp32f)*(srcRowPtrsForInterp + loc[3]), + (Rpp32f)*(srcRowPtrsForInterp + loc[4]), (Rpp32f)*(srcRowPtrsForInterp + loc[5]), + (Rpp32f)*(srcRowPtrsForInterp + loc[6]), (Rpp32f)*(srcRowPtrsForInterp + loc[7])); + + p[1] = _mm256_setr_ps((Rpp32f)*(srcRowPtrsForInterp + loc[0] + 1), 
(Rpp32f)*(srcRowPtrsForInterp + loc[1] + 1), + (Rpp32f)*(srcRowPtrsForInterp + loc[2] + 1), (Rpp32f)*(srcRowPtrsForInterp + loc[3] + 1), + (Rpp32f)*(srcRowPtrsForInterp + loc[4] + 1), (Rpp32f)*(srcRowPtrsForInterp + loc[5] + 1), + (Rpp32f)*(srcRowPtrsForInterp + loc[6] + 1), (Rpp32f)*(srcRowPtrsForInterp + loc[7] + 1)); + + p[2] = _mm256_setr_ps((Rpp32f)*(srcRowPtrsForInterp + loc[0] + 2), (Rpp32f)*(srcRowPtrsForInterp + loc[1] + 2), + (Rpp32f)*(srcRowPtrsForInterp + loc[2] + 2), (Rpp32f)*(srcRowPtrsForInterp + loc[3] + 2), + (Rpp32f)*(srcRowPtrsForInterp + loc[4] + 2), (Rpp32f)*(srcRowPtrsForInterp + loc[5] + 2), + (Rpp32f)*(srcRowPtrsForInterp + loc[6] + 2), (Rpp32f)*(srcRowPtrsForInterp + loc[7] + 2)); +} + inline void rpp_resize_nn_load_f32pln1(Rpp32f *srcRowPtrsForInterp, Rpp32s *loc, __m128 &p) { __m128 pTemp[4]; @@ -3892,6 +3952,22 @@ inline void rpp_resize_nn_load_f32pln1(Rpp32f *srcRowPtrsForInterp, Rpp32s *loc, p = _mm_unpacklo_ps(pTemp[0], pTemp[1]); // Unpack to obtain [R01|R11|R21|R31] } +inline void rpp_resize_nn_load_f32pln1_avx(Rpp32f *srcRowPtrsForInterp, Rpp32s *loc, __m256 &p) +{ + p = _mm256_setr_ps(*(srcRowPtrsForInterp + loc[0]), *(srcRowPtrsForInterp + loc[1]), + *(srcRowPtrsForInterp + loc[2]), *(srcRowPtrsForInterp + loc[3]), + *(srcRowPtrsForInterp + loc[4]), *(srcRowPtrsForInterp + loc[5]), + *(srcRowPtrsForInterp + loc[6]), *(srcRowPtrsForInterp + loc[7])); +} + +inline void rpp_resize_nn_load_f16pln1_avx(Rpp16f *srcRowPtrsForInterp, Rpp32s *loc, __m256 &p) +{ + p = _mm256_setr_ps((Rpp32f)*(srcRowPtrsForInterp + loc[0]), (Rpp32f)*(srcRowPtrsForInterp + loc[1]), + (Rpp32f)*(srcRowPtrsForInterp + loc[2]), (Rpp32f)*(srcRowPtrsForInterp + loc[3]), + (Rpp32f)*(srcRowPtrsForInterp + loc[4]), (Rpp32f)*(srcRowPtrsForInterp + loc[5]), + (Rpp32f)*(srcRowPtrsForInterp + loc[6]), (Rpp32f)*(srcRowPtrsForInterp + loc[7])); +} + inline void rpp_resize_nn_load_i8pkd3(Rpp8s *srcRowPtrsForInterp, Rpp32s *loc, __m128i &p) { __m128i px[4]; diff --git 
a/src/include/hip/rpp_hip_common.hpp b/src/include/hip/rpp_hip_common.hpp index 16e3a2765..721800c80 100644 --- a/src/include/hip/rpp_hip_common.hpp +++ b/src/include/hip/rpp_hip_common.hpp @@ -1944,7 +1944,8 @@ __device__ __forceinline__ float rpp_hip_rng_xorwow_f32(T *xorwowState) return outFloat - 1; // return 0 <= outFloat < 1 } -__device__ __forceinline__ void rpp_hip_rng_8_xorwow_f32(RpptXorwowState *xorwowState, d_float8 *randomNumbersPtr_f8) +template <typename T> +__device__ __forceinline__ void rpp_hip_rng_8_xorwow_f32(T *xorwowState, d_float8 *randomNumbersPtr_f8) { randomNumbersPtr_f8->f1[0] = rpp_hip_rng_xorwow_f32(xorwowState); randomNumbersPtr_f8->f1[1] = rpp_hip_rng_xorwow_f32(xorwowState); diff --git a/src/modules/cpu/host_tensor_effects_augmentations.hpp b/src/modules/cpu/host_tensor_effects_augmentations.hpp index 56d5ea817..ce7450aab 100644 --- a/src/modules/cpu/host_tensor_effects_augmentations.hpp +++ b/src/modules/cpu/host_tensor_effects_augmentations.hpp @@ -31,6 +31,7 @@ SOFTWARE. 
#include "kernel/noise_shot.hpp" #include "kernel/noise_gaussian.hpp" #include "kernel/non_linear_blend.hpp" +#include "kernel/jitter.hpp" #include "kernel/glitch.hpp" #include "kernel/water.hpp" #include "kernel/ricap.hpp" diff --git a/src/modules/cpu/kernel/jitter.hpp b/src/modules/cpu/kernel/jitter.hpp new file mode 100644 index 000000000..ec717150a --- /dev/null +++ b/src/modules/cpu/kernel/jitter.hpp @@ -0,0 +1,929 @@ +#include "rppdefs.h" +#include "rpp_cpu_simd.hpp" +#include "rpp_cpu_common.hpp" + +RppStatus jitter_u8_u8_host_tensor(Rpp8u *srcPtr, + RpptDescPtr srcDescPtr, + Rpp8u *dstPtr, + RpptDescPtr dstDescPtr, + Rpp32u *kernelSizeTensor, + RpptXorwowStateBoxMuller *xorwowInitialStatePtr, + RpptROIPtr roiTensorPtrSrc, + RpptRoiType roiType, + RppLayoutParams layoutParams, + rpp::Handle& handle) +{ + RpptROI roiDefault = {0, 0, (Rpp32s)srcDescPtr->w, (Rpp32s)srcDescPtr->h}; + Rpp32u numThreads = handle.GetNumThreads(); + + omp_set_dynamic(0); +#pragma omp parallel for num_threads(numThreads) + for(int batchCount = 0; batchCount < dstDescPtr->n; batchCount++) + { + RpptROI roi; + RpptROIPtr roiPtrInput = &roiTensorPtrSrc[batchCount]; + compute_roi_validation_host(roiPtrInput, &roi, &roiDefault, roiType); + + Rpp32u kernelSize = kernelSizeTensor[batchCount]; + Rpp32u bound = (kernelSize - 1) / 2; + Rpp32u heightLimit = roi.xywhROI.roiHeight - bound; + Rpp32u offset = batchCount * srcDescPtr->strides.nStride; + + Rpp8u *srcPtrImage, *dstPtrImage; + srcPtrImage = srcPtr + batchCount * srcDescPtr->strides.nStride; + dstPtrImage = dstPtr + batchCount * dstDescPtr->strides.nStride; + + Rpp8u *srcPtrChannel, *dstPtrChannel; + srcPtrChannel = srcPtrImage + (roi.xywhROI.xy.y * srcDescPtr->strides.hStride) + (roi.xywhROI.xy.x * layoutParams.bufferMultiplier); + dstPtrChannel = dstPtrImage; + + Rpp32u alignedLength = roi.xywhROI.roiWidth & ~7; // Align dst width to process 8 dst pixels per iteration + Rpp32u vectorIncrement = 24; + Rpp32u vectorIncrementPerChannel = 
8; + RpptXorwowStateBoxMuller xorwowState; + Rpp32s srcLocArray[8] = {0}; + + __m256i pxXorwowStateX[5], pxXorwowStateCounter; + rpp_host_rng_xorwow_state_offsetted_avx(xorwowInitialStatePtr, xorwowState, offset, pxXorwowStateX, &pxXorwowStateCounter); + __m256 pKernelSize = _mm256_set1_ps(kernelSize); + __m256 pChannel = _mm256_set1_ps(layoutParams.bufferMultiplier); + __m256 pHStride = _mm256_set1_ps(srcDescPtr->strides.hStride); + __m256 pHeightLimit = _mm256_set1_ps(heightLimit); + __m256 pWidthLimit = _mm256_set1_ps(roi.xywhROI.roiWidth - 1); + __m256 pBound = _mm256_set1_ps(bound); + + // Jitter with fused output-layout toggle (NHWC -> NCHW) + if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NCHW)) + { + Rpp8u *dstPtrRowR, *dstPtrRowG, *dstPtrRowB; + dstPtrRowR = dstPtrChannel; + dstPtrRowG = dstPtrRowR + dstDescPtr->strides.cStride; + dstPtrRowB = dstPtrRowG + dstDescPtr->strides.cStride; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp8u *dstPtrTempR, *dstPtrTempG, *dstPtrTempB; + dstPtrTempR = dstPtrRowR; + dstPtrTempG = dstPtrRowG; + dstPtrTempB = dstPtrRowB; + + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + __m256i pxRow; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + rpp_resize_nn_extract_pkd3_avx(srcPtrChannel, srcLocArray, pxRow); + rpp_simd_store(rpp_store24_u8pkd3_to_u8pln3_avx, dstPtrTempR, dstPtrTempG, dstPtrTempB, pxRow); + dstPtrTempR += vectorIncrementPerChannel; + dstPtrTempG += vectorIncrementPerChannel; + dstPtrTempB += vectorIncrementPerChannel; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + 
Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, srcDescPtr->c, loc); + *dstPtrTempR++ = *(srcPtrChannel + loc); + *dstPtrTempG++ = *(srcPtrChannel + 1 + loc); + *dstPtrTempB++ = *(srcPtrChannel + 2 + loc); + } + dstPtrRowR += dstDescPtr->strides.hStride; + dstPtrRowG += dstDescPtr->strides.hStride; + dstPtrRowB += dstDescPtr->strides.hStride; + } + } + + // Jitter with fused output-layout toggle (NCHW -> NHWC) + else if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NHWC)) + { + Rpp8u *dstPtrRow; + dstPtrRow = dstPtrChannel; + Rpp8u *srcPtrRowR, *srcPtrRowG, *srcPtrRowB; + srcPtrRowR = srcPtrChannel; + srcPtrRowG = srcPtrRowR + srcDescPtr->strides.cStride; + srcPtrRowB = srcPtrRowG + srcDescPtr->strides.cStride; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp8u *dstPtrTemp; + dstPtrTemp = dstPtrRow; + + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + __m256i pxRow[3]; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + rpp_resize_nn_extract_pln1_avx(srcPtrRowR, srcLocArray, pxRow[0]); + rpp_resize_nn_extract_pln1_avx(srcPtrRowG, srcLocArray, pxRow[1]); + rpp_resize_nn_extract_pln1_avx(srcPtrRowB, srcLocArray, pxRow[2]); + rpp_simd_store(rpp_store24_u8pln3_to_u8pkd3_avx, dstPtrTemp, pxRow); + dstPtrTemp += vectorIncrement; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, 
srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + *dstPtrTemp++ = *(srcPtrRowR + loc); + *dstPtrTemp++ = *(srcPtrRowG + loc); + *dstPtrTemp++ = *(srcPtrRowB + loc); + } + dstPtrRow += dstDescPtr->strides.hStride; + } + } + + // Jitter without fused output-layout toggle (NHWC -> NHWC) + else if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NHWC)) + { + Rpp8u *srcPtrRow, *dstPtrRow; + srcPtrRow = srcPtrChannel; + dstPtrRow = dstPtrChannel; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp8u *dstPtrTemp; + dstPtrTemp = dstPtrRow; + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + __m256i pxRow; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + rpp_resize_nn_extract_pkd3_avx(srcPtrRow, srcLocArray, pxRow); + rpp_simd_store(rpp_store24_u8_to_u8_avx, dstPtrTemp, pxRow); + dstPtrTemp += vectorIncrement; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + *dstPtrTemp++ = *(srcPtrRow + loc); + *dstPtrTemp++ = *(srcPtrRow + 1 + loc); + *dstPtrTemp++ = *(srcPtrRow + 2 + loc); + } + dstPtrRow += dstDescPtr->strides.hStride; + } + } + + // Jitter without fused output-layout toggle (NCHW -> NCHW) + else if ((srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NCHW)) + { + Rpp8u *dstPtrRow; + dstPtrRow = dstPtrChannel; + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp8u 
*dstPtrTemp; + dstPtrTemp = dstPtrRow; + + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + Rpp8u *dstPtrTempChn, *srcPtrTempChn; + srcPtrTempChn = srcPtrChannel; + dstPtrTempChn = dstPtrTemp; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + for(int c = 0; c < srcDescPtr->c; c++) + { + __m256i pxRow; + rpp_resize_nn_extract_pln1_avx(srcPtrTempChn, srcLocArray, pxRow); + rpp_storeu_si64((__m128i *)(dstPtrTempChn), _mm256_castsi256_si128(pxRow)); + srcPtrTempChn += srcDescPtr->strides.cStride; + dstPtrTempChn += dstDescPtr->strides.cStride; + } + dstPtrTemp += vectorIncrementPerChannel; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (;vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp8u *dstPtrTempChn = dstPtrTemp; + Rpp8u *srcPtrTempChn = srcPtrChannel; + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + for(int c = 0; c < srcDescPtr->c; c++) + { + *dstPtrTempChn = *(srcPtrTempChn + loc); + srcPtrTempChn += srcDescPtr->strides.cStride; + dstPtrTempChn += dstDescPtr->strides.cStride; + } + dstPtrTemp++; + } + dstPtrRow += dstDescPtr->strides.hStride; + } + } + } + + return RPP_SUCCESS; +} + +RppStatus jitter_f32_f32_host_tensor(Rpp32f *srcPtr, + RpptDescPtr srcDescPtr, + Rpp32f *dstPtr, + RpptDescPtr dstDescPtr, + Rpp32u *kernelSizeTensor, + RpptXorwowStateBoxMuller *xorwowInitialStatePtr, + RpptROIPtr roiTensorPtrSrc, + RpptRoiType roiType, + RppLayoutParams layoutParams, + rpp::Handle& handle) +{ + RpptROI roiDefault = {0, 0, (Rpp32s)srcDescPtr->w, (Rpp32s)srcDescPtr->h}; + Rpp32u numThreads = handle.GetNumThreads(); + 
+ omp_set_dynamic(0); +#pragma omp parallel for num_threads(numThreads) + for(int batchCount = 0; batchCount < dstDescPtr->n; batchCount++) + { + RpptROI roi; + RpptROIPtr roiPtrInput = &roiTensorPtrSrc[batchCount]; + compute_roi_validation_host(roiPtrInput, &roi, &roiDefault, roiType); + + Rpp32u kernelSize = kernelSizeTensor[batchCount]; + Rpp32u bound = (kernelSize - 1) / 2; + Rpp32u heightLimit = roi.xywhROI.roiHeight - bound; + Rpp32u offset = batchCount * srcDescPtr->strides.nStride; + + Rpp32f *srcPtrImage, *dstPtrImage; + srcPtrImage = srcPtr + batchCount * srcDescPtr->strides.nStride; + dstPtrImage = dstPtr + batchCount * dstDescPtr->strides.nStride; + + Rpp32f *srcPtrChannel, *dstPtrChannel; + srcPtrChannel = srcPtrImage + (roi.xywhROI.xy.y * srcDescPtr->strides.hStride) + (roi.xywhROI.xy.x * layoutParams.bufferMultiplier); + dstPtrChannel = dstPtrImage; + + Rpp32u alignedLength = roi.xywhROI.roiWidth & ~7; // Align dst width to process 8 dst pixels per iteration + Rpp32u vectorIncrement = 24; + Rpp32u vectorIncrementPerChannel = 8; + RpptXorwowStateBoxMuller xorwowState; + Rpp32s srcLocArray[8] = {0}; + + __m256i pxXorwowStateX[5], pxXorwowStateCounter; + rpp_host_rng_xorwow_state_offsetted_avx(xorwowInitialStatePtr, xorwowState, offset, pxXorwowStateX, &pxXorwowStateCounter); + __m256 pKernelSize = _mm256_set1_ps(kernelSize); + __m256 pChannel = _mm256_set1_ps(layoutParams.bufferMultiplier); + __m256 pHStride = _mm256_set1_ps(srcDescPtr->strides.hStride); + __m256 pHeightLimit = _mm256_set1_ps(heightLimit); + __m256 pWidthLimit = _mm256_set1_ps(roi.xywhROI.roiWidth-1); + __m256 pBound = _mm256_set1_ps(bound); + + + // Jitter with fused output-layout toggle (NHWC -> NCHW) + if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NCHW)) + { + Rpp32f *dstPtrRowR, *dstPtrRowG, *dstPtrRowB; + dstPtrRowR = dstPtrChannel; + dstPtrRowG = dstPtrRowR + dstDescPtr->strides.cStride; + dstPtrRowB = dstPtrRowG + 
dstDescPtr->strides.cStride; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp32f *dstPtrTempR, *dstPtrTempG, *dstPtrTempB; + dstPtrTempR = dstPtrRowR; + dstPtrTempG = dstPtrRowG; + dstPtrTempB = dstPtrRowB; + + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + __m256 pxRow[3]; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + rpp_simd_load(rpp_resize_nn_load_f32pkd3_to_f32pln3_avx, srcPtrChannel, srcLocArray, pxRow); + rpp_simd_store(rpp_store24_f32pln3_to_f32pln3_avx, dstPtrTempR, dstPtrTempG, dstPtrTempB, pxRow); + dstPtrTempR += vectorIncrementPerChannel; + dstPtrTempG += vectorIncrementPerChannel; + dstPtrTempB += vectorIncrementPerChannel; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + *dstPtrTempR++ = *(srcPtrChannel + loc); + *dstPtrTempG++ = *(srcPtrChannel + 1 + loc); + *dstPtrTempB++ = *(srcPtrChannel + 2 + loc); + } + dstPtrRowR += dstDescPtr->strides.hStride; + dstPtrRowG += dstDescPtr->strides.hStride; + dstPtrRowB += dstDescPtr->strides.hStride; + } + } + + // Jitter with fused output-layout toggle (NCHW -> NHWC) + else if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NHWC)) + { + Rpp32f *dstPtrRow; + dstPtrRow = dstPtrChannel; + Rpp32f *srcPtrRowR, *srcPtrRowG, *srcPtrRowB; + srcPtrRowR = srcPtrChannel; + srcPtrRowG = srcPtrRowR + srcDescPtr->strides.cStride; + srcPtrRowB = srcPtrRowG + srcDescPtr->strides.cStride; + 
+ for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp32f *dstPtrTemp; + dstPtrTemp = dstPtrRow; + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + __m256 pxRow[4]; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + rpp_simd_load(rpp_resize_nn_load_f32pln1_avx, srcPtrRowR, srcLocArray, pxRow[0]); + rpp_simd_load(rpp_resize_nn_load_f32pln1_avx, srcPtrRowG, srcLocArray, pxRow[1]); + rpp_simd_load(rpp_resize_nn_load_f32pln1_avx, srcPtrRowB, srcLocArray, pxRow[2]); + rpp_simd_store(rpp_store24_f32pln3_to_f32pkd3_avx, dstPtrTemp, pxRow); + dstPtrTemp += vectorIncrement; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + *dstPtrTemp++ = *(srcPtrRowR + loc); + *dstPtrTemp++ = *(srcPtrRowG + loc); + *dstPtrTemp++ = *(srcPtrRowB + loc); + } + dstPtrRow += dstDescPtr->strides.hStride; + } + } + + // Jitter without fused output-layout toggle (NHWC -> NHWC) + else if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NHWC)) + { + Rpp32f *srcPtrRow, *dstPtrRow; + srcPtrRow = srcPtrChannel; + dstPtrRow = dstPtrChannel; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp32f *dstPtrTemp; + dstPtrTemp = dstPtrRow; + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32s loc; + __m256 pRow; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, 
heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + rpp_simd_load(rpp_load8_f32_to_f32_avx, (srcPtrChannel + loc), &pRow); + rpp_simd_store(rpp_store8_f32_to_f32_avx, dstPtrTemp, &pRow); + dstPtrTemp += 3; + } +#endif + dstPtrRow += dstDescPtr->strides.hStride; + } + } + // Jitter without fused output-layout toggle (NCHW -> NCHW) + else if ((srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NCHW)) + { + Rpp32f *dstPtrRow; + dstPtrRow = dstPtrChannel; + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp32f *dstPtrTemp; + dstPtrTemp = dstPtrRow; + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + Rpp32f *srcPtrTempChn, *dstPtrTempChn; + srcPtrTempChn = srcPtrChannel; + dstPtrTempChn = dstPtrTemp; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + + for (int c = 0; c < dstDescPtr->c; c++) + { + __m256 pxRow; + rpp_simd_load(rpp_resize_nn_load_f32pln1_avx, srcPtrTempChn, srcLocArray, pxRow); + rpp_simd_store(rpp_store8_f32_to_f32_avx, dstPtrTempChn, &pxRow); + srcPtrTempChn += srcDescPtr->strides.cStride; + dstPtrTempChn += dstDescPtr->strides.cStride; + } + dstPtrTemp += vectorIncrementPerChannel; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (;vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32f *dstPtrTempChn = dstPtrTemp; + Rpp32f *srcPtrTempChn = srcPtrChannel; + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + for(int c = 0; c < srcDescPtr->c; c++) + { + *dstPtrTempChn = (Rpp32f)*(srcPtrTempChn + loc); + 
srcPtrTempChn += srcDescPtr->strides.cStride; + dstPtrTempChn += dstDescPtr->strides.cStride; + } + dstPtrTemp++; + } + dstPtrRow += dstDescPtr->strides.hStride; + } + } + } + + return RPP_SUCCESS; +} + +RppStatus jitter_f16_f16_host_tensor(Rpp16f *srcPtr, + RpptDescPtr srcDescPtr, + Rpp16f *dstPtr, + RpptDescPtr dstDescPtr, + Rpp32u *kernelSizeTensor, + RpptXorwowStateBoxMuller *xorwowInitialStatePtr, + RpptROIPtr roiTensorPtrSrc, + RpptRoiType roiType, + RppLayoutParams layoutParams, + rpp::Handle& handle) +{ + RpptROI roiDefault = {0, 0, (Rpp32s)srcDescPtr->w, (Rpp32s)srcDescPtr->h}; + Rpp32u numThreads = handle.GetNumThreads(); + + omp_set_dynamic(0); +#pragma omp parallel for num_threads(numThreads) + for(int batchCount = 0; batchCount < dstDescPtr->n; batchCount++) + { + RpptROI roi; + RpptROIPtr roiPtrInput = &roiTensorPtrSrc[batchCount]; + compute_roi_validation_host(roiPtrInput, &roi, &roiDefault, roiType); + + Rpp32u kernelSize = kernelSizeTensor[batchCount]; + Rpp32u bound = (kernelSize - 1) / 2; + Rpp32u heightLimit = roi.xywhROI.roiHeight - bound; + Rpp32u offset = batchCount * srcDescPtr->strides.nStride; + + Rpp16f *srcPtrImage, *dstPtrImage; + srcPtrImage = srcPtr + batchCount * srcDescPtr->strides.nStride; + dstPtrImage = dstPtr + batchCount * dstDescPtr->strides.nStride; + + Rpp16f *srcPtrChannel, *dstPtrChannel; + srcPtrChannel = srcPtrImage + (roi.xywhROI.xy.y * srcDescPtr->strides.hStride) + (roi.xywhROI.xy.x * layoutParams.bufferMultiplier); + dstPtrChannel = dstPtrImage; + + Rpp32u alignedLength = roi.xywhROI.roiWidth & ~7; // Align dst width to process 8 dst pixels per iteration + Rpp32u vectorIncrement = 24; + Rpp32u vectorIncrementPerChannel = 8; + RpptXorwowStateBoxMuller xorwowState; + Rpp32s srcLocArray[8] = {0}; + + __m256i pxXorwowStateX[5], pxXorwowStateCounter; + rpp_host_rng_xorwow_state_offsetted_avx(xorwowInitialStatePtr, xorwowState, offset, pxXorwowStateX, &pxXorwowStateCounter); + __m256 pKernelSize = 
_mm256_set1_ps(kernelSize); + __m256 pChannel = _mm256_set1_ps(layoutParams.bufferMultiplier); + __m256 pHStride = _mm256_set1_ps(srcDescPtr->strides.hStride); + __m256 pHeightLimit = _mm256_set1_ps(heightLimit); + __m256 pWidthLimit = _mm256_set1_ps(roi.xywhROI.roiWidth-1); + __m256 pBound = _mm256_set1_ps(bound); + + + // Jitter with fused output-layout toggle (NHWC -> NCHW) + if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NCHW)) + { + Rpp16f *dstPtrRowR, *dstPtrRowG, *dstPtrRowB; + dstPtrRowR = dstPtrChannel; + dstPtrRowG = dstPtrRowR + dstDescPtr->strides.cStride; + dstPtrRowB = dstPtrRowG + dstDescPtr->strides.cStride; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp16f *dstPtrTempR, *dstPtrTempG, *dstPtrTempB; + dstPtrTempR = dstPtrRowR; + dstPtrTempG = dstPtrRowG; + dstPtrTempB = dstPtrRowB; + + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + Rpp32f dstPtrTempR_ps[8], dstPtrTempG_ps[8], dstPtrTempB_ps[8]; + __m256 pxRow[3]; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + rpp_simd_load(rpp_resize_nn_load_f16pkd3_to_f32pln3_avx, srcPtrChannel, srcLocArray, pxRow); + rpp_simd_store(rpp_store24_f32pln3_to_f32pln3_avx, dstPtrTempR_ps, dstPtrTempG_ps, dstPtrTempB_ps, pxRow); + for(int cnt = 0; cnt < vectorIncrementPerChannel; cnt++) + { + dstPtrTempR[cnt] = (Rpp16f) dstPtrTempR_ps[cnt]; + dstPtrTempG[cnt] = (Rpp16f) dstPtrTempG_ps[cnt]; + dstPtrTempB[cnt] = (Rpp16f) dstPtrTempB_ps[cnt]; + } + dstPtrTempR += vectorIncrementPerChannel; + dstPtrTempG += vectorIncrementPerChannel; + dstPtrTempB += vectorIncrementPerChannel; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (; vectorLoopCount < 
roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + *dstPtrTempR++ = *(srcPtrChannel + loc); + *dstPtrTempG++ = *(srcPtrChannel + 1 + loc); + *dstPtrTempB++ = *(srcPtrChannel + 2 + loc); + } + dstPtrRowR += dstDescPtr->strides.hStride; + dstPtrRowG += dstDescPtr->strides.hStride; + dstPtrRowB += dstDescPtr->strides.hStride; + } + } + + // Jitter with fused output-layout toggle (NCHW -> NHWC) + else if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NHWC)) + { + Rpp16f *dstPtrRow; + dstPtrRow = dstPtrChannel; + Rpp16f *srcPtrRowR, *srcPtrRowG, *srcPtrRowB; + srcPtrRowR = srcPtrChannel; + srcPtrRowG = srcPtrRowR + srcDescPtr->strides.cStride; + srcPtrRowB = srcPtrRowG + srcDescPtr->strides.cStride; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp16f *dstPtrTemp; + dstPtrTemp = dstPtrRow; + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + Rpp32f dstPtrTemp_ps[25]; + __m256 pxRow[4]; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + rpp_simd_load(rpp_resize_nn_load_f16pln1_avx, srcPtrRowR, srcLocArray, pxRow[0]); + rpp_simd_load(rpp_resize_nn_load_f16pln1_avx, srcPtrRowG, srcLocArray, pxRow[1]); + rpp_simd_load(rpp_resize_nn_load_f16pln1_avx, srcPtrRowB, srcLocArray, pxRow[2]); + rpp_simd_store(rpp_store24_f32pln3_to_f32pkd3_avx, dstPtrTemp_ps, pxRow); + for(int cnt = 0; cnt < vectorIncrement; cnt++) + dstPtrTemp[cnt] = (Rpp16f) dstPtrTemp_ps[cnt]; + dstPtrTemp += vectorIncrement; + pCol = _mm256_add_ps(avx_p8, pCol); 
+ } +#endif + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + *dstPtrTemp++ = *(srcPtrRowR + loc); + *dstPtrTemp++ = *(srcPtrRowG + loc); + *dstPtrTemp++ = *(srcPtrRowB + loc); + } + dstPtrRow += dstDescPtr->strides.hStride; + } + } + + // Jitter without fused output-layout toggle (NHWC -> NHWC) + else if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NHWC)) + { + Rpp16f *srcPtrRow, *dstPtrRow; + srcPtrRow = srcPtrChannel; + dstPtrRow = dstPtrChannel; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp16f *dstPtrTemp; + dstPtrTemp = dstPtrRow; + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32f srcPtrTemp_ps[8], dstPtrTemp_ps[8]; + Rpp32s loc; + __m256 pRow; + + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + for(int cnt = 0; cnt < vectorIncrementPerChannel; cnt++) + { + srcPtrTemp_ps[cnt] = (Rpp32f)srcPtrChannel[loc + cnt]; + } + + rpp_simd_load(rpp_load8_f32_to_f32_avx, srcPtrTemp_ps, &pRow); + rpp_simd_store(rpp_store8_f32_to_f32_avx, dstPtrTemp_ps, &pRow); + + for(int cnt = 0; cnt < vectorIncrementPerChannel; cnt++) + { + dstPtrTemp[cnt] = (Rpp16f) dstPtrTemp_ps[cnt]; + } + dstPtrTemp += 3; + } +#endif + dstPtrRow += dstDescPtr->strides.hStride; + } + } + // Jitter without fused output-layout toggle (NCHW -> NCHW) + else if ((srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NCHW)) + { + Rpp16f *dstPtrRow; + dstPtrRow = dstPtrChannel; + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp16f 
*dstPtrTemp; + dstPtrTemp = dstPtrRow; + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + Rpp16f *srcPtrTempChn, *dstPtrTempChn; + srcPtrTempChn = srcPtrChannel; + dstPtrTempChn = dstPtrTemp; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + + for (int c = 0; c < dstDescPtr->c; c++) + { + Rpp32f dstPtrTemp_ps[8]; + __m256 pxRow; + rpp_simd_load(rpp_resize_nn_load_f16pln1_avx, srcPtrTempChn, srcLocArray, pxRow); + rpp_simd_store(rpp_store8_f32_to_f32_avx, dstPtrTemp_ps, &pxRow); + for(int cnt = 0; cnt < vectorIncrementPerChannel; cnt++) + { + dstPtrTempChn[cnt] = (Rpp16f) dstPtrTemp_ps[cnt]; + } + srcPtrTempChn += srcDescPtr->strides.cStride; + dstPtrTempChn += dstDescPtr->strides.cStride; + } + dstPtrTemp += vectorIncrementPerChannel; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (;vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp16f *dstPtrTempChn = dstPtrTemp; + Rpp16f *srcPtrTempChn = srcPtrChannel; + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + for(int c = 0; c < srcDescPtr->c; c++) + { + *dstPtrTempChn = (Rpp16f)*(srcPtrTempChn + loc); + srcPtrTempChn += srcDescPtr->strides.cStride; + dstPtrTempChn += dstDescPtr->strides.cStride; + } + dstPtrTemp++; + } + dstPtrRow += dstDescPtr->strides.hStride; + } + } + } + + return RPP_SUCCESS; +} + +RppStatus jitter_i8_i8_host_tensor(Rpp8s *srcPtr, + RpptDescPtr srcDescPtr, + Rpp8s *dstPtr, + RpptDescPtr dstDescPtr, + Rpp32u *kernelSizeTensor, + RpptXorwowStateBoxMuller *xorwowInitialStatePtr, + RpptROIPtr roiTensorPtrSrc, + RpptRoiType roiType, + RppLayoutParams 
layoutParams, + rpp::Handle& handle) +{ + RpptROI roiDefault = {0, 0, (Rpp32s)srcDescPtr->w, (Rpp32s)srcDescPtr->h}; + Rpp32u numThreads = handle.GetNumThreads(); + + omp_set_dynamic(0); +#pragma omp parallel for num_threads(numThreads) + for(int batchCount = 0; batchCount < dstDescPtr->n; batchCount++) + { + RpptROI roi; + RpptROIPtr roiPtrInput = &roiTensorPtrSrc[batchCount]; + compute_roi_validation_host(roiPtrInput, &roi, &roiDefault, roiType); + + Rpp32u kernelSize = kernelSizeTensor[batchCount]; + Rpp32u bound = (kernelSize - 1) / 2; + Rpp32u heightLimit = roi.xywhROI.roiHeight - bound; + Rpp32u offset = batchCount * srcDescPtr->strides.nStride; + + Rpp8s *srcPtrImage, *dstPtrImage; + srcPtrImage = srcPtr + batchCount * srcDescPtr->strides.nStride; + dstPtrImage = dstPtr + batchCount * dstDescPtr->strides.nStride; + + Rpp8s *srcPtrChannel, *dstPtrChannel; + srcPtrChannel = srcPtrImage + (roi.xywhROI.xy.y * srcDescPtr->strides.hStride) + (roi.xywhROI.xy.x * layoutParams.bufferMultiplier); + dstPtrChannel = dstPtrImage; + + Rpp32u alignedLength = roi.xywhROI.roiWidth & ~7; // Align dst width to process 8 dst pixels per iteration + Rpp32u vectorIncrement = 24; + Rpp32u vectorIncrementPerChannel = 8; + RpptXorwowStateBoxMuller xorwowState; + Rpp32s srcLocArray[8] = {0}; + + __m256i pxXorwowStateX[5], pxXorwowStateCounter; + rpp_host_rng_xorwow_state_offsetted_avx(xorwowInitialStatePtr, xorwowState, offset, pxXorwowStateX, &pxXorwowStateCounter); + __m256 pKernelSize = _mm256_set1_ps(kernelSize); + __m256 pChannel = _mm256_set1_ps(layoutParams.bufferMultiplier); + __m256 pHStride = _mm256_set1_ps(srcDescPtr->strides.hStride); + __m256 pHeightLimit = _mm256_set1_ps(heightLimit); + __m256 pWidthLimit = _mm256_set1_ps(roi.xywhROI.roiWidth-1); + __m256 pBound = _mm256_set1_ps(bound); + + // Jitter with fused output-layout toggle (NHWC -> NCHW) + if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NCHW)) + { + 
Rpp8s *dstPtrRowR, *dstPtrRowG, *dstPtrRowB; + dstPtrRowR = dstPtrChannel; + dstPtrRowG = dstPtrRowR + dstDescPtr->strides.cStride; + dstPtrRowB = dstPtrRowG + dstDescPtr->strides.cStride; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp8s *dstPtrTempR, *dstPtrTempG, *dstPtrTempB; + dstPtrTempR = dstPtrRowR; + dstPtrTempG = dstPtrRowG; + dstPtrTempB = dstPtrRowB; + + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + __m256i pxRow; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + rpp_resize_nn_extract_pkd3_avx(srcPtrChannel, srcLocArray, pxRow); + rpp_simd_store(rpp_store24_i8pkd3_to_i8pln3_avx, dstPtrTempR, dstPtrTempG, dstPtrTempB, pxRow); + dstPtrTempR += vectorIncrementPerChannel; + dstPtrTempG += vectorIncrementPerChannel; + dstPtrTempB += vectorIncrementPerChannel; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + *dstPtrTempR++ = *(srcPtrChannel + loc); + *dstPtrTempG++ = *(srcPtrChannel + 1 + loc); + *dstPtrTempB++ = *(srcPtrChannel + 2 + loc); + } + dstPtrRowR += dstDescPtr->strides.hStride; + dstPtrRowG += dstDescPtr->strides.hStride; + dstPtrRowB += dstDescPtr->strides.hStride; + } + } + + // Jitter with fused output-layout toggle (NCHW -> NHWC) + else if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NHWC)) + { + Rpp8s *dstPtrRow; + dstPtrRow = dstPtrChannel; + Rpp8s *srcPtrRowR, *srcPtrRowG, *srcPtrRowB; + srcPtrRowR = 
srcPtrChannel; + srcPtrRowG = srcPtrRowR + srcDescPtr->strides.cStride; + srcPtrRowB = srcPtrRowG + srcDescPtr->strides.cStride; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp8s *dstPtrTemp; + dstPtrTemp = dstPtrRow; + + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + __m256i pxRow[3]; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + rpp_resize_nn_extract_pln1_avx(srcPtrRowR, srcLocArray, pxRow[0]); + rpp_resize_nn_extract_pln1_avx(srcPtrRowG, srcLocArray, pxRow[1]); + rpp_resize_nn_extract_pln1_avx(srcPtrRowB, srcLocArray, pxRow[2]); + rpp_simd_store(rpp_store24_i8pln3_to_i8pkd3_avx, dstPtrTemp, pxRow); + dstPtrTemp += vectorIncrement; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + *dstPtrTemp++ = *(srcPtrRowR + loc); + *dstPtrTemp++ = *(srcPtrRowG + loc); + *dstPtrTemp++ = *(srcPtrRowB + loc); + } + dstPtrRow += dstDescPtr->strides.hStride; + } + } + + // Jitter without fused output-layout toggle (NHWC -> NHWC) + else if ((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NHWC)) + { + Rpp8s *srcPtrRow, *dstPtrRow; + srcPtrRow = srcPtrChannel; + dstPtrRow = dstPtrChannel; + + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp8s *dstPtrTemp; + dstPtrTemp = dstPtrRow; + + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + int vectorLoopCount = 0; +#if __AVX2__ + for (; 
vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + __m256i pxRow; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + rpp_resize_nn_extract_pkd3_avx(srcPtrRow, srcLocArray, pxRow); + rpp_simd_store(rpp_store24_i8_to_i8_avx, dstPtrTemp, pxRow); + dstPtrTemp += vectorIncrement; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (; vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + *dstPtrTemp++ = (Rpp8s)*(srcPtrRow + loc); + *dstPtrTemp++ = (Rpp8s)*(srcPtrRow + 1 + loc); + *dstPtrTemp++ = (Rpp8s)*(srcPtrRow + 2 + loc); + } + dstPtrRow += dstDescPtr->strides.hStride; + } + } + // Jitter without fused output-layout toggle (NCHW -> NCHW) + else if ((srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NCHW)) + { + Rpp8s *dstPtrRow; + dstPtrRow = dstPtrChannel; + for(int dstLocRow = 0; dstLocRow < roi.xywhROI.roiHeight; dstLocRow++) + { + Rpp8s *dstPtrTemp; + dstPtrTemp = dstPtrRow; + + __m256 pRow = _mm256_set1_ps(dstLocRow); + __m256 pCol = avx_pDstLocInit; + int vectorLoopCount = 0; +#if __AVX2__ + for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrementPerChannel) + { + Rpp8s *dstPtrTempChn, *srcPtrTempChn; + srcPtrTempChn = srcPtrChannel; + dstPtrTempChn = dstPtrTemp; + compute_jitter_src_loc_avx(pxXorwowStateX, &pxXorwowStateCounter, pRow, pCol, pKernelSize, pBound, pHeightLimit, pWidthLimit, pHStride, pChannel, srcLocArray); + for(int c = 0; c < srcDescPtr->c; c++) + { + __m256i pxRow; + rpp_resize_nn_extract_pln1_avx(srcPtrTempChn, srcLocArray, pxRow); + rpp_storeu_si64((__m128i *)(dstPtrTempChn), _mm256_castsi256_si128(pxRow)); + srcPtrTempChn += 
srcDescPtr->strides.cStride; + dstPtrTempChn += dstDescPtr->strides.cStride; + } + dstPtrTemp += vectorIncrementPerChannel; + pCol = _mm256_add_ps(avx_p8, pCol); + } +#endif + for (;vectorLoopCount < roi.xywhROI.roiWidth; vectorLoopCount++) + { + Rpp8s *dstPtrTempChn = dstPtrTemp; + Rpp8s *srcPtrTempChn = srcPtrChannel; + Rpp32s loc; + compute_jitter_src_loc(&xorwowState, dstLocRow, vectorLoopCount, kernelSize, heightLimit, roi.xywhROI.roiWidth, srcDescPtr->strides.hStride, bound, layoutParams.bufferMultiplier, loc); + for(int c = 0; c < srcDescPtr->c; c++) + { + *dstPtrTempChn = (Rpp8s)*(srcPtrTempChn + loc); + srcPtrTempChn += srcDescPtr->strides.cStride; + dstPtrTempChn += dstDescPtr->strides.cStride; + } + dstPtrTemp++; + } + dstPtrRow += dstDescPtr->strides.hStride; + } + } + } + + return RPP_SUCCESS; +} diff --git a/src/modules/hip/hip_tensor_effects_augmentations.hpp b/src/modules/hip/hip_tensor_effects_augmentations.hpp index f1da2cdb9..12e80a1f4 100644 --- a/src/modules/hip/hip_tensor_effects_augmentations.hpp +++ b/src/modules/hip/hip_tensor_effects_augmentations.hpp @@ -31,6 +31,7 @@ SOFTWARE. 
#include "kernel/noise_shot.hpp" #include "kernel/noise_gaussian.hpp" #include "kernel/non_linear_blend.hpp" +#include "kernel/jitter.hpp" #include "kernel/glitch.hpp" #include "kernel/water.hpp" #include "kernel/ricap.hpp" diff --git a/src/modules/hip/kernel/jitter.hpp b/src/modules/hip/kernel/jitter.hpp new file mode 100644 index 000000000..bbc407cda --- /dev/null +++ b/src/modules/hip/kernel/jitter.hpp @@ -0,0 +1,314 @@ +#include +#include "rpp_hip_common.hpp" +#include "rng_seed_stream.hpp" + +__device__ __forceinline__ void jitter_roi_and_srclocs_hip_compute(int4 *srcRoiPtr_i4, RpptXorwowStateBoxMuller *xorwowState, uint kernelSize, uint bound, int id_x, int id_y, d_float16 *locSrc_f16) +{ + d_float8 widthIncrement_f8, heightIncrement_f8; + rpp_hip_rng_8_xorwow_f32(xorwowState, &widthIncrement_f8); + rpp_hip_math_multiply8_const(&widthIncrement_f8, &widthIncrement_f8, static_cast(kernelSize)); + rpp_hip_rng_8_xorwow_f32(xorwowState, &heightIncrement_f8); + rpp_hip_math_multiply8_const(&heightIncrement_f8, &heightIncrement_f8, static_cast(kernelSize)); + + d_float8 increment_f8, locDst_f8x, locDst_f8y; + increment_f8.f4[0] = make_float4(0.0f, 1.0f, 2.0f, 3.0f); // 8 element vectorized kernel needs 8 increments - creating uint4 for increments 0, 1, 2, 3 here, and adding (float4)4 later to get 4, 5, 6, 7 incremented srcLocs + increment_f8.f4[1] = make_float4(4.0f, 5.0f, 6.0f, 7.0f); + locDst_f8x.f4[0] = static_cast(id_x) + increment_f8.f4[0]; + locDst_f8x.f4[1] = static_cast(id_x) + increment_f8.f4[1]; + locDst_f8y.f4[0] = locDst_f8y.f4[1] = (float4)id_y; + + locSrc_f16->f8[0].f4[0] = static_cast(srcRoiPtr_i4->x) + locDst_f8x.f4[0] + widthIncrement_f8.f4[0] - static_cast(bound); + locSrc_f16->f8[0].f4[1] = static_cast(srcRoiPtr_i4->x) + locDst_f8x.f4[1] + widthIncrement_f8.f4[1] - static_cast(bound); + locSrc_f16->f8[1].f4[0] = static_cast(srcRoiPtr_i4->y) + locDst_f8y.f4[0] + heightIncrement_f8.f4[0] - static_cast(bound); + locSrc_f16->f8[1].f4[1] = 
static_cast(srcRoiPtr_i4->y) + locDst_f8y.f4[1] + heightIncrement_f8.f4[1] - static_cast(bound); + + // Apply boundary checks and adjustments + for(int i = 0; i < 8; ++i) + { + locSrc_f16->f1[i] = fmaxf(fminf(floorf(locSrc_f16->f1[i]), static_cast(srcRoiPtr_i4->z - 1)), 0.0f); + locSrc_f16->f1[i + 8] = fmaxf(fminf(floorf(locSrc_f16->f1[i + 8]), static_cast(srcRoiPtr_i4->w - bound)), 0.0f); + } +} + +template +__global__ void jitter_pkd_tensor(T *srcPtr, + uint2 srcStridesNH, + T *dstPtr, + uint2 dstStridesNH, + uint *kernelsize, + RpptXorwowStateBoxMuller *xorwowInitialStatePtr, + uint *xorwowSeedStream, + RpptROIPtr roiTensorPtrSrc) +{ + int id_x = (hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x) * 8; + int id_y = hipBlockIdx_y * hipBlockDim_y + hipThreadIdx_y; + int id_z = hipBlockIdx_z * hipBlockDim_z + hipThreadIdx_z; + + if ((id_y >= roiTensorPtrSrc[id_z].xywhROI.roiHeight) || (id_x >= roiTensorPtrSrc[id_z].xywhROI.roiWidth)) + { + return; + } + + uint srcIdx = (id_z * srcStridesNH.x); + uint dstIdx = (id_z * dstStridesNH.x) + (id_y * dstStridesNH.y) + (id_x * 3); + uint seedStreamIdx = (id_y * dstStridesNH.y) + (hipBlockIdx_x * hipBlockDim_x) + hipThreadIdx_x; + uint kernelSize = kernelsize[id_z]; + uint bound = (kernelSize - 1) / 2; + + RpptXorwowStateBoxMuller xorwowState; + uint xorwowSeed = xorwowSeedStream[seedStreamIdx % SEED_STREAM_MAX_SIZE]; + xorwowState.x[0] = xorwowInitialStatePtr->x[0] + xorwowSeed; + xorwowState.x[1] = xorwowInitialStatePtr->x[1] + xorwowSeed; + xorwowState.x[2] = xorwowInitialStatePtr->x[2] + xorwowSeed; + xorwowState.x[3] = xorwowInitialStatePtr->x[3] + xorwowSeed; + xorwowState.x[4] = xorwowInitialStatePtr->x[4] + xorwowSeed; + xorwowState.counter = xorwowInitialStatePtr->counter + xorwowSeed; + + int4 srcRoi_i4 = *(int4 *)&roiTensorPtrSrc[id_z]; + d_float16 locSrc_f16; + jitter_roi_and_srclocs_hip_compute(&srcRoi_i4, &xorwowState, kernelSize, bound, id_x, id_y, &locSrc_f16); + + d_float24 dst_f24; + 
rpp_hip_interpolate24_nearest_neighbor_pkd3(srcPtr + srcIdx, srcStridesNH.y, &locSrc_f16, &srcRoi_i4, &dst_f24); + rpp_hip_pack_float24_pkd3_and_store24_pkd3(dstPtr + dstIdx, &dst_f24); +} + +template +__global__ void jitter_pln_tensor(T *srcPtr, + uint3 srcStridesNCH, + T *dstPtr, + uint3 dstStridesNCH, + int channelsDst, + uint *kernelsize, + RpptXorwowStateBoxMuller *xorwowInitialStatePtr, + uint *xorwowSeedStream, + RpptROIPtr roiTensorPtrSrc) +{ + int id_x = (hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x) * 8; + int id_y = hipBlockIdx_y * hipBlockDim_y + hipThreadIdx_y; + int id_z = hipBlockIdx_z * hipBlockDim_z + hipThreadIdx_z; + + if ((id_y >= roiTensorPtrSrc[id_z].xywhROI.roiHeight) || (id_x >= roiTensorPtrSrc[id_z].xywhROI.roiWidth)) + { + return; + } + + uint srcIdx = (id_z * srcStridesNCH.x); + uint dstIdx = (id_z * dstStridesNCH.x) + (id_y * dstStridesNCH.z) + id_x; + uint seedStreamIdx = (id_y * dstStridesNCH.z) + (hipBlockIdx_x * hipBlockDim_x) + hipThreadIdx_x; + uint kernelSize = kernelsize[id_z]; + uint bound = (kernelSize - 1) / 2; + + RpptXorwowStateBoxMuller xorwowState; + uint xorwowSeed = xorwowSeedStream[seedStreamIdx % SEED_STREAM_MAX_SIZE]; + xorwowState.x[0] = xorwowInitialStatePtr->x[0] + xorwowSeed; + xorwowState.x[1] = xorwowInitialStatePtr->x[1] + xorwowSeed; + xorwowState.x[2] = xorwowInitialStatePtr->x[2] + xorwowSeed; + xorwowState.x[3] = xorwowInitialStatePtr->x[3] + xorwowSeed; + xorwowState.x[4] = xorwowInitialStatePtr->x[4] + xorwowSeed; + xorwowState.counter = xorwowInitialStatePtr->counter + xorwowSeed; + + int4 srcRoi_i4 = *(int4 *)&roiTensorPtrSrc[id_z]; + d_float16 locSrc_f16; + jitter_roi_and_srclocs_hip_compute(&srcRoi_i4, &xorwowState, kernelSize, bound, id_x, id_y, &locSrc_f16); + + d_float8 dst_f8; + rpp_hip_interpolate8_nearest_neighbor_pln1(srcPtr + srcIdx, srcStridesNCH.z, &locSrc_f16, &srcRoi_i4, &dst_f8); + rpp_hip_pack_float8_and_store8(dstPtr + dstIdx, &dst_f8); + + if (channelsDst == 3) + { + srcIdx += 
srcStridesNCH.y; + dstIdx += dstStridesNCH.y; + + rpp_hip_interpolate8_nearest_neighbor_pln1(srcPtr + srcIdx, srcStridesNCH.z, &locSrc_f16, &srcRoi_i4, &dst_f8); + rpp_hip_pack_float8_and_store8(dstPtr + dstIdx, &dst_f8); + + srcIdx += srcStridesNCH.y; + dstIdx += dstStridesNCH.y; + + rpp_hip_interpolate8_nearest_neighbor_pln1(srcPtr + srcIdx, srcStridesNCH.z, &locSrc_f16, &srcRoi_i4, &dst_f8); + rpp_hip_pack_float8_and_store8(dstPtr + dstIdx, &dst_f8); + } +} + +template +__global__ void jitter_pkd3_pln3_tensor(T *srcPtr, + uint2 srcStridesNH, + T *dstPtr, + uint3 dstStridesNCH, + uint *kernelsize, + RpptXorwowStateBoxMuller *xorwowInitialStatePtr, + uint *xorwowSeedStream, + RpptROIPtr roiTensorPtrSrc) +{ + int id_x = (hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x) * 8; + int id_y = hipBlockIdx_y * hipBlockDim_y + hipThreadIdx_y; + int id_z = hipBlockIdx_z * hipBlockDim_z + hipThreadIdx_z; + + if ((id_y >= roiTensorPtrSrc[id_z].xywhROI.roiHeight) || (id_x >= roiTensorPtrSrc[id_z].xywhROI.roiWidth)) + { + return; + } + + uint srcIdx = (id_z * srcStridesNH.x); + uint dstIdx = (id_z * dstStridesNCH.x) + (id_y * dstStridesNCH.z) + id_x; + uint seedStreamIdx = (id_y * dstStridesNCH.z) + (hipBlockIdx_x * hipBlockDim_x) + hipThreadIdx_x; + uint kernelSize = kernelsize[id_z]; + uint bound = (kernelSize - 1) / 2; + + RpptXorwowStateBoxMuller xorwowState; + uint xorwowSeed = xorwowSeedStream[seedStreamIdx % SEED_STREAM_MAX_SIZE]; + xorwowState.x[0] = xorwowInitialStatePtr->x[0] + xorwowSeed; + xorwowState.x[1] = xorwowInitialStatePtr->x[1] + xorwowSeed; + xorwowState.x[2] = xorwowInitialStatePtr->x[2] + xorwowSeed; + xorwowState.x[3] = xorwowInitialStatePtr->x[3] + xorwowSeed; + xorwowState.x[4] = xorwowInitialStatePtr->x[4] + xorwowSeed; + xorwowState.counter = xorwowInitialStatePtr->counter + xorwowSeed; + + int4 srcRoi_i4 = *(int4 *)&roiTensorPtrSrc[id_z]; + d_float16 locSrc_f16; + jitter_roi_and_srclocs_hip_compute(&srcRoi_i4, &xorwowState, kernelSize, bound, id_x, 
id_y, &locSrc_f16); + + d_float24 dst_f24; + rpp_hip_interpolate24_nearest_neighbor_pkd3(srcPtr + srcIdx, srcStridesNH.y, &locSrc_f16, &srcRoi_i4, &dst_f24); + rpp_hip_pack_float24_pkd3_and_store24_pln3(dstPtr + dstIdx, dstStridesNCH.y, &dst_f24); +} + +template <typename T> +__global__ void jitter_pln3_pkd3_tensor(T *srcPtr, + uint3 srcStridesNCH, + T *dstPtr, + uint2 dstStridesNH, + uint *kernelsize, + RpptXorwowStateBoxMuller *xorwowInitialStatePtr, + uint *xorwowSeedStream, + RpptROIPtr roiTensorPtrSrc) +{ + int id_x = (hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x) * 8; + int id_y = hipBlockIdx_y * hipBlockDim_y + hipThreadIdx_y; + int id_z = hipBlockIdx_z * hipBlockDim_z + hipThreadIdx_z; + + if ((id_y >= roiTensorPtrSrc[id_z].xywhROI.roiHeight) || (id_x >= roiTensorPtrSrc[id_z].xywhROI.roiWidth)) + { + return; + } + + uint srcIdx = (id_z * srcStridesNCH.x); + uint dstIdx = (id_z * dstStridesNH.x) + (id_y * dstStridesNH.y) + (id_x * 3); + uint seedStreamIdx = (id_y * dstStridesNH.y) + (hipBlockIdx_x * hipBlockDim_x) + hipThreadIdx_x; + uint kernelSize = kernelsize[id_z]; + uint bound = (kernelSize - 1) / 2; + + RpptXorwowStateBoxMuller xorwowState; + uint xorwowSeed = xorwowSeedStream[seedStreamIdx % SEED_STREAM_MAX_SIZE]; + xorwowState.x[0] = xorwowInitialStatePtr->x[0] + xorwowSeed; + xorwowState.x[1] = xorwowInitialStatePtr->x[1] + xorwowSeed; + xorwowState.x[2] = xorwowInitialStatePtr->x[2] + xorwowSeed; + xorwowState.x[3] = xorwowInitialStatePtr->x[3] + xorwowSeed; + xorwowState.x[4] = xorwowInitialStatePtr->x[4] + xorwowSeed; + xorwowState.counter = xorwowInitialStatePtr->counter + xorwowSeed; + + int4 srcRoi_i4 = *(int4 *)&roiTensorPtrSrc[id_z]; + d_float16 locSrc_f16; + jitter_roi_and_srclocs_hip_compute(&srcRoi_i4, &xorwowState, kernelSize, bound, id_x, id_y, &locSrc_f16); + + d_float24 dst_f24; + rpp_hip_interpolate24_nearest_neighbor_pln3(srcPtr + srcIdx, &srcStridesNCH, &locSrc_f16, &srcRoi_i4, &dst_f24); + rpp_hip_pack_float24_pln3_and_store24_pkd3(dstPtr + 
dstIdx, &dst_f24); +} + +template <typename T> +RppStatus hip_exec_jitter_tensor(T *srcPtr, + RpptDescPtr srcDescPtr, + T *dstPtr, + RpptDescPtr dstDescPtr, + uint *kernelSizeTensor, + RpptXorwowStateBoxMuller *xorwowInitialStatePtr, + RpptROIPtr roiTensorPtrSrc, + RpptRoiType roiType, + rpp::Handle& handle) +{ + if (roiType == RpptRoiType::LTRB) + hip_exec_roi_converison_ltrb_to_xywh(roiTensorPtrSrc, handle); + + int globalThreads_x = (dstDescPtr->strides.hStride + 7) >> 3; + int globalThreads_y = dstDescPtr->h; + int globalThreads_z = dstDescPtr->n; + + Rpp32u *xorwowSeedStream; + xorwowSeedStream = (Rpp32u *)&xorwowInitialStatePtr[1]; + CHECK_RETURN_STATUS(hipMemcpyAsync(xorwowSeedStream, rngSeedStream4050, SEED_STREAM_MAX_SIZE * sizeof(Rpp32u), hipMemcpyHostToDevice, handle.GetStream())); + + if ((srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NHWC)) + { + hipLaunchKernelGGL(jitter_pkd_tensor, + dim3(ceil((float)globalThreads_x/LOCAL_THREADS_X), ceil((float)globalThreads_y/LOCAL_THREADS_Y), ceil((float)globalThreads_z/LOCAL_THREADS_Z)), + dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z), + 0, + handle.GetStream(), + srcPtr, + make_uint2(srcDescPtr->strides.nStride, srcDescPtr->strides.hStride), + dstPtr, + make_uint2(dstDescPtr->strides.nStride, dstDescPtr->strides.hStride), + kernelSizeTensor, + xorwowInitialStatePtr, + xorwowSeedStream, + roiTensorPtrSrc); + } + else if ((srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NCHW)) + { + hipLaunchKernelGGL(jitter_pln_tensor, + dim3(ceil((float)globalThreads_x/LOCAL_THREADS_X), ceil((float)globalThreads_y/LOCAL_THREADS_Y), ceil((float)globalThreads_z/LOCAL_THREADS_Z)), + dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z), + 0, + handle.GetStream(), + srcPtr, + make_uint3(srcDescPtr->strides.nStride, srcDescPtr->strides.cStride, srcDescPtr->strides.hStride), + dstPtr, + make_uint3(dstDescPtr->strides.nStride, dstDescPtr->strides.cStride, 
dstDescPtr->strides.hStride), + dstDescPtr->c, + kernelSizeTensor, + xorwowInitialStatePtr, + xorwowSeedStream, + roiTensorPtrSrc); + } + else if ((srcDescPtr->c == 3) && (dstDescPtr->c == 3)) + { + if ((srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NCHW)) + { + hipLaunchKernelGGL(jitter_pkd3_pln3_tensor, + dim3(ceil((float)globalThreads_x/LOCAL_THREADS_X), ceil((float)globalThreads_y/LOCAL_THREADS_Y), ceil((float)globalThreads_z/LOCAL_THREADS_Z)), + dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z), + 0, + handle.GetStream(), + srcPtr, + make_uint2(srcDescPtr->strides.nStride, srcDescPtr->strides.hStride), + dstPtr, + make_uint3(dstDescPtr->strides.nStride, dstDescPtr->strides.cStride, dstDescPtr->strides.hStride), + kernelSizeTensor, + xorwowInitialStatePtr, + xorwowSeedStream, + roiTensorPtrSrc); + } + else if ((srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NHWC)) + { + globalThreads_x = (srcDescPtr->strides.hStride + 7) >> 3; + hipLaunchKernelGGL(jitter_pln3_pkd3_tensor, + dim3(ceil((float)globalThreads_x/LOCAL_THREADS_X), ceil((float)globalThreads_y/LOCAL_THREADS_Y), ceil((float)globalThreads_z/LOCAL_THREADS_Z)), + dim3(LOCAL_THREADS_X, LOCAL_THREADS_Y, LOCAL_THREADS_Z), + 0, + handle.GetStream(), + srcPtr, + make_uint3(srcDescPtr->strides.nStride, srcDescPtr->strides.cStride, srcDescPtr->strides.hStride), + dstPtr, + make_uint2(dstDescPtr->strides.nStride, dstDescPtr->strides.hStride), + kernelSizeTensor, + xorwowInitialStatePtr, + xorwowSeedStream, + roiTensorPtrSrc); + } + } + + return RPP_SUCCESS; +} diff --git a/src/modules/rppt_tensor_effects_augmentations.cpp b/src/modules/rppt_tensor_effects_augmentations.cpp index 80e3d3a10..8fc2d00ee 100644 --- a/src/modules/rppt_tensor_effects_augmentations.cpp +++ b/src/modules/rppt_tensor_effects_augmentations.cpp @@ -932,6 +932,78 @@ RppStatus rppt_glitch_host(RppPtr_t srcPtr, return RPP_SUCCESS; } +/******************** jitter 
********************/ + +RppStatus rppt_jitter_host(RppPtr_t srcPtr, + RpptDescPtr srcDescPtr, + RppPtr_t dstPtr, + RpptDescPtr dstDescPtr, + Rpp32u *kernelSizeTensor, + Rpp32u seed, + RpptROIPtr roiTensorPtrSrc, + RpptRoiType roiType, + rppHandle_t rppHandle) +{ + RppLayoutParams layoutParams = get_layout_params(srcDescPtr->layout, srcDescPtr->c); + RpptXorwowStateBoxMuller xorwowInitialState[SIMD_FLOAT_VECTOR_LENGTH]; + rpp_host_rng_xorwow_f32_initialize_multiseed_stream_boxmuller(xorwowInitialState, seed); + + if ((srcDescPtr->dataType == RpptDataType::U8) && (dstDescPtr->dataType == RpptDataType::U8)) + { + jitter_u8_u8_host_tensor(static_cast<Rpp8u*>(srcPtr) + srcDescPtr->offsetInBytes, + srcDescPtr, + static_cast<Rpp8u*>(dstPtr) + dstDescPtr->offsetInBytes, + dstDescPtr, + kernelSizeTensor, + xorwowInitialState, + roiTensorPtrSrc, + roiType, + layoutParams, + rpp::deref(rppHandle)); + } + else if ((srcDescPtr->dataType == RpptDataType::F16) && (dstDescPtr->dataType == RpptDataType::F16)) + { + jitter_f16_f16_host_tensor(reinterpret_cast<Rpp16f*>(static_cast<Rpp8u*>(srcPtr) + srcDescPtr->offsetInBytes), + srcDescPtr, + reinterpret_cast<Rpp16f*>(static_cast<Rpp8u*>(dstPtr) + dstDescPtr->offsetInBytes), + dstDescPtr, + kernelSizeTensor, + xorwowInitialState, + roiTensorPtrSrc, + roiType, + layoutParams, + rpp::deref(rppHandle)); + } + else if ((srcDescPtr->dataType == RpptDataType::F32) && (dstDescPtr->dataType == RpptDataType::F32)) + { + jitter_f32_f32_host_tensor(reinterpret_cast<Rpp32f*>(static_cast<Rpp8u*>(srcPtr) + srcDescPtr->offsetInBytes), + srcDescPtr, + reinterpret_cast<Rpp32f*>(static_cast<Rpp8u*>(dstPtr) + dstDescPtr->offsetInBytes), + dstDescPtr, + kernelSizeTensor, + xorwowInitialState, + roiTensorPtrSrc, + roiType, + layoutParams, + rpp::deref(rppHandle)); + } + else if ((srcDescPtr->dataType == RpptDataType::I8) && (dstDescPtr->dataType == RpptDataType::I8)) + { + jitter_i8_i8_host_tensor(static_cast<Rpp8s*>(srcPtr) + srcDescPtr->offsetInBytes, + srcDescPtr, + static_cast<Rpp8s*>(dstPtr) + dstDescPtr->offsetInBytes, + dstDescPtr, + 
kernelSizeTensor, + xorwowInitialState, + roiTensorPtrSrc, + roiType, + layoutParams, + rpp::deref(rppHandle)); + } + + return RPP_SUCCESS; +} + /********************************************************************************************************************/ /*********************************************** RPP_GPU_SUPPORT = ON ***********************************************/ /********************************************************************************************************************/ @@ -1641,6 +1713,8 @@ RppStatus rppt_vignette_gpu(RppPtr_t srcPtr, #endif // backend } +/******************** erase ********************/ + RppStatus rppt_erase_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, @@ -1850,4 +1924,87 @@ RppStatus rppt_glitch_gpu(RppPtr_t srcPtr, #endif // backend } +/******************** jitter ********************/ + +RppStatus rppt_jitter_gpu(RppPtr_t srcPtr, + RpptDescPtr srcDescPtr, + RppPtr_t dstPtr, + RpptDescPtr dstDescPtr, + Rpp32u *kernelSizeTensor, + Rpp32u seed, + RpptROIPtr roiTensorPtrSrc, + RpptRoiType roiType, + rppHandle_t rppHandle) +{ +#ifdef HIP_COMPILE + + RpptXorwowStateBoxMuller xorwowInitialState; + xorwowInitialState.x[0] = 0x75BCD15 + seed; + xorwowInitialState.x[1] = 0x159A55E5 + seed; + xorwowInitialState.x[2] = 0x1F123BB5 + seed; + xorwowInitialState.x[3] = 0x5491333 + seed; + xorwowInitialState.x[4] = 0x583F19 + seed; + xorwowInitialState.counter = 0x64F0C9 + seed; + xorwowInitialState.boxMullerFlag = 0; + xorwowInitialState.boxMullerExtra = 0.0f; + + RpptXorwowStateBoxMuller *d_xorwowInitialStatePtr; + d_xorwowInitialStatePtr = reinterpret_cast<RpptXorwowStateBoxMuller*>(rpp::deref(rppHandle).GetInitHandle()->mem.mgpu.scratchBufferHip.floatmem); + CHECK_RETURN_STATUS(hipMemcpy(d_xorwowInitialStatePtr, &xorwowInitialState, sizeof(RpptXorwowStateBoxMuller), hipMemcpyHostToDevice)); + + if ((srcDescPtr->dataType == RpptDataType::U8) && (dstDescPtr->dataType == RpptDataType::U8)) + { + hip_exec_jitter_tensor(static_cast<Rpp8u*>(srcPtr) + 
srcDescPtr->offsetInBytes, + srcDescPtr, + static_cast(dstPtr) + dstDescPtr->offsetInBytes, + dstDescPtr, + kernelSizeTensor, + d_xorwowInitialStatePtr, + roiTensorPtrSrc, + roiType, + rpp::deref(rppHandle)); + } + else if ((srcDescPtr->dataType == RpptDataType::F16) && (dstDescPtr->dataType == RpptDataType::F16)) + { + hip_exec_jitter_tensor(reinterpret_cast(static_cast(srcPtr) + srcDescPtr->offsetInBytes), + srcDescPtr, + (half*) (static_cast(dstPtr) + dstDescPtr->offsetInBytes), + dstDescPtr, + kernelSizeTensor, + d_xorwowInitialStatePtr, + roiTensorPtrSrc, + roiType, + rpp::deref(rppHandle)); + } + else if ((srcDescPtr->dataType == RpptDataType::F32) && (dstDescPtr->dataType == RpptDataType::F32)) + { + hip_exec_jitter_tensor((Rpp32f*) (static_cast(srcPtr) + srcDescPtr->offsetInBytes), + srcDescPtr, + (Rpp32f*) (static_cast(dstPtr) + dstDescPtr->offsetInBytes), + dstDescPtr, + kernelSizeTensor, + d_xorwowInitialStatePtr, + roiTensorPtrSrc, + roiType, + rpp::deref(rppHandle)); + } + else if ((srcDescPtr->dataType == RpptDataType::I8) && (dstDescPtr->dataType == RpptDataType::I8)) + { + hip_exec_jitter_tensor(static_cast(srcPtr) + srcDescPtr->offsetInBytes, + srcDescPtr, + static_cast(dstPtr) + dstDescPtr->offsetInBytes, + dstDescPtr, + kernelSizeTensor, + d_xorwowInitialStatePtr, + roiTensorPtrSrc, + roiType, + rpp::deref(rppHandle)); + } + + return RPP_SUCCESS; +#elif defined(OCL_COMPILE) + return RPP_ERROR_NOT_IMPLEMENTED; +#endif // backend +} + #endif // GPU_SUPPORT diff --git a/utilities/test_suite/HIP/Tensor_hip.cpp b/utilities/test_suite/HIP/Tensor_hip.cpp index aad78241e..ec1b47d9b 100644 --- a/utilities/test_suite/HIP/Tensor_hip.cpp +++ b/utilities/test_suite/HIP/Tensor_hip.cpp @@ -66,7 +66,8 @@ int main(int argc, char **argv) bool additionalParamCase = (testCase == 8 || testCase == 21 || testCase == 23|| testCase == 24 || testCase == 40 || testCase == 41 || testCase == 49 || testCase == 54 || testCase == 79); bool kernelSizeCase = (testCase == 40 || 
testCase == 41 || testCase == 49 || testCase == 54); bool dualInputCase = (testCase == 2 || testCase == 30 || testCase == 33 || testCase == 61 || testCase == 63 || testCase == 65 || testCase == 68); - bool randomOutputCase = (testCase == 8 || testCase == 84 || testCase == 49 || testCase == 54); + bool randomOutputCase = (testCase == 6 || testCase == 8 || testCase == 84 || testCase == 49 || testCase == 54); + bool nonQACase = (testCase == 24); bool interpolationTypeCase = (testCase == 21 || testCase == 23 || testCase == 24 || testCase == 79); bool reductionTypeCase = (testCase == 87 || testCase == 88 || testCase == 89 || testCase == 90 || testCase == 91); bool noiseTypeCase = (testCase == 8); @@ -406,6 +407,10 @@ int main(int argc, char **argv) if(testCase == 46) CHECK_RETURN_STATUS(hipHostMalloc(&intensity, batchSize * sizeof(Rpp32f))); + Rpp32u *kernelSizeTensor; + if(testCase == 6) + CHECK_RETURN_STATUS(hipHostMalloc(&kernelSizeTensor, batchSize * sizeof(Rpp32u))); + RpptChannelOffsets *rgbOffsets; if(testCase == 35) CHECK_RETURN_STATUS(hipHostMalloc(&rgbOffsets, batchSize * sizeof(RpptChannelOffsets))); @@ -561,6 +566,22 @@ int main(int argc, char **argv) break; } + case 6: + { + testCaseName = "jitter"; + + Rpp32u seed = 1255459; + for (i = 0; i < batchSize; i++) + kernelSizeTensor[i] = 5; + + startWallTime = omp_get_wtime(); + if (inputBitDepth == 0 || inputBitDepth == 1 || inputBitDepth == 2 || inputBitDepth == 5) + rppt_jitter_gpu(d_input, srcDescPtr, d_output, dstDescPtr, kernelSizeTensor, seed, roiTensorPtrSrc, roiTypeSrc, handle); + else + missingFuncFlag = 1; + + break; + } case 8: { testCaseName = "noise"; @@ -709,6 +730,36 @@ int main(int argc, char **argv) break; } + case 24: + { + testCaseName = "warp_affine"; + + if ((interpolationType != RpptInterpolationType::BILINEAR) && (interpolationType != RpptInterpolationType::NEAREST_NEIGHBOR)) + { + missingFuncFlag = 1; + break; + } + + Rpp32f6 affineTensor_f6[batchSize]; + Rpp32f *affineTensor = (Rpp32f 
*)affineTensor_f6; + for (i = 0; i < batchSize; i++) + { + affineTensor_f6[i].data[0] = 1.23; + affineTensor_f6[i].data[1] = 0.5; + affineTensor_f6[i].data[2] = 0; + affineTensor_f6[i].data[3] = -0.8; + affineTensor_f6[i].data[4] = 0.83; + affineTensor_f6[i].data[5] = 0; + } + + startWallTime = omp_get_wtime(); + if (inputBitDepth == 0 || inputBitDepth == 1 || inputBitDepth == 2 || inputBitDepth == 5) + rppt_warp_affine_gpu(d_input, srcDescPtr, d_output, dstDescPtr, affineTensor, interpolationType, roiTensorPtrSrc, roiTypeSrc, handle); + else + missingFuncFlag = 1; + + break; + } case 26: { testCaseName = "lens_correction"; @@ -1448,7 +1499,7 @@ int main(int argc, char **argv) 1.QA Flag is set 2.input bit depth 0 (U8) 3.source and destination layout are the same*/ - if(qaFlag && inputBitDepth == 0 && (srcDescPtr->layout == dstDescPtr->layout) && !(randomOutputCase)) + if(qaFlag && inputBitDepth == 0 && (srcDescPtr->layout == dstDescPtr->layout) && !(randomOutputCase) && !(nonQACase)) { if (testCase == 87) compare_reduction_output(static_cast(reductionFuncResultArr), testCaseName, srcDescPtr, testCase, dst, scriptPath); @@ -1516,7 +1567,7 @@ int main(int argc, char **argv) 2.input bit depth 0 (Input U8 && Output U8) 3.source and destination layout are the same 4.augmentation case does not generate random output*/ - if(qaFlag && inputBitDepth == 0 && ((srcDescPtr->layout == dstDescPtr->layout) || pln1OutTypeCase) && !(randomOutputCase)) + if(qaFlag && inputBitDepth == 0 && ((srcDescPtr->layout == dstDescPtr->layout) || pln1OutTypeCase) && !(randomOutputCase) && !(nonQACase)) compare_output(outputu8, testCaseName, srcDescPtr, dstDescPtr, dstImgSizes, batchSize, interpolationTypeName, noiseTypeName, testCase, dst, scriptPath); // Calculate exact dstROI in XYWH format for OpenCV dump @@ -1603,6 +1654,8 @@ int main(int argc, char **argv) CHECK_RETURN_STATUS(hipHostFree(shapeTensor)); if(roiTensor != NULL) CHECK_RETURN_STATUS(hipHostFree(roiTensor)); + if(testCase == 6) + 
CHECK_RETURN_STATUS(hipHostFree(kernelSizeTensor)); free(input); free(input_second); free(output); diff --git a/utilities/test_suite/HIP/runTests.py b/utilities/test_suite/HIP/runTests.py index 01da79c8d..88606d0e1 100644 --- a/utilities/test_suite/HIP/runTests.py +++ b/utilities/test_suite/HIP/runTests.py @@ -276,7 +276,7 @@ def rpp_test_suite_parser_and_validator(): subprocess.run(["make", "-j16"], cwd=".") # nosec # List of cases supported -supportedCaseList = ['0', '1', '2', '4', '8', '13', '20', '21', '23', '26', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '45', '46', '54', '61', '63', '65', '68', '70', '79', '80', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92'] +supportedCaseList = ['0', '1', '2', '4', '6', '8', '13', '20', '21', '23', '26', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '45', '46', '54', '61', '63', '65', '68', '70', '79', '80', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92'] # Create folders based on testType and profilingOption if testType == 1 and profilingOption == "YES": @@ -484,7 +484,7 @@ def rpp_test_suite_parser_and_validator(): print_performance_tests_summary(logFile, functionalityGroupList, numRuns) # print the results of qa tests -nonQACaseList = ['8', '24', '54', '84'] # Add cases present in supportedCaseList, but without QA support +nonQACaseList = ['6', '8', '24', '54', '84'] # Add cases present in supportedCaseList, but without QA support if qaMode and testType == 0: qaFilePath = os.path.join(outFilePath, "QA_results.txt") diff --git a/utilities/test_suite/HOST/Tensor_host.cpp b/utilities/test_suite/HOST/Tensor_host.cpp index 4c3d4f0e8..bb1312a5e 100644 --- a/utilities/test_suite/HOST/Tensor_host.cpp +++ b/utilities/test_suite/HOST/Tensor_host.cpp @@ -66,7 +66,8 @@ int main(int argc, char **argv) bool additionalParamCase = (testCase == 8 || testCase == 21 || testCase == 23 || testCase == 24 || testCase == 79); bool dualInputCase = (testCase == 2 || 
testCase == 30 || testCase == 33 || testCase == 61 || testCase == 63 || testCase == 65 || testCase == 68); - bool randomOutputCase = (testCase == 8 || testCase == 84); + bool randomOutputCase = (testCase == 6 || testCase == 8 || testCase == 84); + bool nonQACase = (testCase == 24); bool interpolationTypeCase = (testCase == 21 || testCase == 23 || testCase == 24 || testCase == 79); bool reductionTypeCase = (testCase == 87 || testCase == 88 || testCase == 89 || testCase == 90 || testCase == 91); bool noiseTypeCase = (testCase == 8); @@ -517,6 +518,24 @@ int main(int argc, char **argv) break; } + case 6: + { + testCaseName = "jitter"; + + Rpp32u kernelSizeTensor[batchSize]; + Rpp32u seed = 1255459; + for (i = 0; i < batchSize; i++) + kernelSizeTensor[i] = 5; + + startWallTime = omp_get_wtime(); + startCpuTime = clock(); + if (inputBitDepth == 0 || inputBitDepth == 1 || inputBitDepth == 2 || inputBitDepth == 5) + rppt_jitter_host(input, srcDescPtr, output, dstDescPtr, kernelSizeTensor, seed, roiTensorPtrSrc, roiTypeSrc, handle); + else + missingFuncFlag = 1; + + break; + } case 8: { testCaseName = "noise"; @@ -672,6 +691,37 @@ int main(int argc, char **argv) break; } + case 24: + { + testCaseName = "warp_affine"; + + if ((interpolationType != RpptInterpolationType::BILINEAR) && (interpolationType != RpptInterpolationType::NEAREST_NEIGHBOR)) + { + missingFuncFlag = 1; + break; + } + + Rpp32f6 affineTensor_f6[batchSize]; + Rpp32f *affineTensor = (Rpp32f *)affineTensor_f6; + for (i = 0; i < batchSize; i++) + { + affineTensor_f6[i].data[0] = 1.23; + affineTensor_f6[i].data[1] = 0.5; + affineTensor_f6[i].data[2] = 0; + affineTensor_f6[i].data[3] = -0.8; + affineTensor_f6[i].data[4] = 0.83; + affineTensor_f6[i].data[5] = 0; + } + + startWallTime = omp_get_wtime(); + startCpuTime = clock(); + if (inputBitDepth == 0 || inputBitDepth == 1 || inputBitDepth == 2 || inputBitDepth == 5) + rppt_warp_affine_host(input, srcDescPtr, output, dstDescPtr, affineTensor, interpolationType, 
roiTensorPtrSrc, roiTypeSrc, handle); + else + missingFuncFlag = 1; + + break; + } case 26: { testCaseName = "lens_correction"; @@ -1462,7 +1512,7 @@ int main(int argc, char **argv) 1.QA Flag is set 2.input bit depth 0 (U8) 3.source and destination layout are the same*/ - if(qaFlag && inputBitDepth == 0 && (srcDescPtr->layout == dstDescPtr->layout) && !(randomOutputCase)) + if(qaFlag && inputBitDepth == 0 && (srcDescPtr->layout == dstDescPtr->layout) && !(randomOutputCase) && !(nonQACase)) { if (testCase == 87) compare_reduction_output(static_cast(reductionFuncResultArr), testCaseName, srcDescPtr, testCase, dst, scriptPath); @@ -1528,7 +1578,7 @@ int main(int argc, char **argv) 2.input bit depth 0 (Input U8 && Output U8) 3.source and destination layout are the same 4.augmentation case does not generate random output*/ - if(qaFlag && inputBitDepth == 0 && ((srcDescPtr->layout == dstDescPtr->layout) || pln1OutTypeCase) && !(randomOutputCase)) + if(qaFlag && inputBitDepth == 0 && ((srcDescPtr->layout == dstDescPtr->layout) || pln1OutTypeCase) && !(randomOutputCase) && !(nonQACase)) compare_output(outputu8, testCaseName, srcDescPtr, dstDescPtr, dstImgSizes, batchSize, interpolationTypeName, noiseTypeName, testCase, dst, scriptPath); // Calculate exact dstROI in XYWH format for OpenCV dump diff --git a/utilities/test_suite/HOST/runTests.py b/utilities/test_suite/HOST/runTests.py index 93cd64713..bec4de5be 100644 --- a/utilities/test_suite/HOST/runTests.py +++ b/utilities/test_suite/HOST/runTests.py @@ -258,7 +258,7 @@ def rpp_test_suite_parser_and_validator(): subprocess.run(["make", "-j16"], cwd=".") # nosec # List of cases supported -supportedCaseList = ['0', '1', '2', '4', '8', '13', '20', '21', '23', '26', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '45', '46', '54', '61', '63', '65', '68', '70', '79', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92'] +supportedCaseList = ['0', '1', '2', '4', '6', '8', '13', '20', 
'21', '23', '26', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '45', '46', '54', '61', '63', '65', '68', '70', '79', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92'] print("\n\n\n\n\n") print("##########################################################################################") @@ -309,7 +309,7 @@ def rpp_test_suite_parser_and_validator(): run_performance_test(loggingFolder, logFileLayout, srcPath1, srcPath2, dstPath, case, numRuns, testType, layout, qaMode, decoderType, batchSize, roiList) # print the results of qa tests -nonQACaseList = ['8', '24', '54', '84'] # Add cases present in supportedCaseList, but without QA support +nonQACaseList = ['6', '8', '24', '54', '84'] # Add cases present in supportedCaseList, but without QA support if qaMode and testType == 0: qaFilePath = os.path.join(outFilePath, "QA_results.txt") diff --git a/utilities/test_suite/rpp_test_suite_common.h b/utilities/test_suite/rpp_test_suite_common.h index 71ca9fb34..eddf78702 100644 --- a/utilities/test_suite/rpp_test_suite_common.h +++ b/utilities/test_suite/rpp_test_suite_common.h @@ -75,11 +75,13 @@ std::map augmentationMap = {1, "gamma_correction"}, {2, "blend"}, {4, "contrast"}, + {6, "jitter"}, {8, "noise"}, {13, "exposure"}, {20, "flip"}, {21, "resize"}, {23, "rotate"}, + {24, "warp_affine"}, {26, "lens_correction"}, {29, "water"}, {30, "non_linear_blend"}, From 2a4373e8cececf104d4c437e2e76475842967919 Mon Sep 17 00:00:00 2001 From: Abishek <52214183+r-abishek@users.noreply.github.com> Date: Tue, 23 Jul 2024 23:49:31 -0700 Subject: [PATCH 2/7] RPP Audio Support HIP - Non silent region (#395) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * added initial skeleton code for the NSR HIP kernel * added test suite support for audio in HIP * initial commit for working NSR kernel with batch size 1 * added max reduction kernel for finding max value in MMS buffer reorganized code for better readability initial commit where QA tests pass * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * optimized find region kernel * added profiler support for hip test suite * modified kernel launch configuration for moving_mean_square_hip_tensor kernel modified variable names for better readability * changed the pinned memory for mmsArr to HIP memory modified the block size for max kernel * modified the datatype for NSR HIP kernel outputs from float to int * modify NSR HOST kernel outputs to int * change shm_pos to smem_pos * minor change * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Update notes * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Voxel Flip on HIP and HOST (#285) * added support for flip voxel * added test suite support * added golden outputs for flip voxel made changes in test suite to run QA tests for flip * updated golden outputs with correct values * minor bug fix in the hip test suite * made changes to variable names for better readability fixed comments in test suite minor cleanup * combined the flip axis factor as ternary operator in HIP kernel added new enum for error handling when source and destination layouts are not matching * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted flip voxel golden outputs to bin files * changed copyright from 2023 to 2024 * Update flip_voxel.hpp license * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro 
Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t. QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color 
temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update
subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ...
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ...
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar 
multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = 
TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 
load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for 
Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * 
changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * Cmake fix to prevent warning * Fix paths in new python scripts * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * Test suite fixes after tensor_min / tensor_max HOST merge * Fix max case * QA tests fix for hip and host * naming convention changes as per new std * Substitute imagePartial with partial * Substitute imageMin/imageMax with min/max * Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize * Use variable instead of batchCount*4 * Use post increment effectivly * Resolve codacy warnings * Additional cleanup * remove unused variable * Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Remove auto merge boost * Spaces formatting * Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add support for mi300 (#269) * Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Cleanup by removing oneliner functions as inline * RPP Tensor Audio Support - To Decibels (#258) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Replace vectors with arrays * Cleanup * Replace Rpp64s with Rpp32s * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup 
* Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register 
--------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode parameter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * Fix build errors and qa tests in Audio Test suite * Remove auto-merge repeated funcs * Improve clarity on header docs * made changes based on review comments * stored golden outputs of to_decibels in binary file removed golden output text files for non silent region * removed unused parameter in verify_output function * updated list of cases supported in python script * added error handling for opening golden output file * Codacy fix and tests warning fix * Codacy fix * Codacy fix trial * codacy fix for checking boundaries of fstream --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Documentation - Bump rocm-docs-core[api_reference] from 0.30.2 to 0.30.3 in /docs/sphinx (#274) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.2 to 0.30.3. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.2...v0.30.3) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Adding issue template (#270) * Add files via upload * added ROCm v6, MI300, default component * Fix cast used in testsuite Includes minor fixes * Fix displaying f16 outputs * Optimize HOST min/max reduce function further * Fix spacing in HIP kernels * Fix PLN1 outputs for u8 and i8 datatypes of HOST backend * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Store reference outputs via map for min and max kernels * Update tensor_max.hpp license * Update tensor_min.hpp license * Fix output comparison check * Merge branch 'ar/opt_tensor_min_tensor_max' of https://github.com/r-abishek/rpp into sn/tensor_min_max * Modify exit condition used in outer most kernel * Modify srcIdx for HIP Tensor min * Using maximum as 255 for HIP Tensor min * Modify srcIdx for HIP Tensor max kernel Also fixes build error in testsuite * Fix corrupted outputs displayed for Tensor sum * Fix corruption issue seen with tensor sum kernel * Fix minimum for I8 Tensor max kernel * Modified HIP buffer initialization with a common function * Fix redefinition * Remove additional variables xAlignedLength * Remove unwanted xAlignedLength and xDiff * Remove redefinition of TensorSumReferenceOutputs * Fix for CI issue * Add parenthesis --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: fiona-gladwin Co-authored-by: Kiriti Gowda Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * CI - Update precheckin.groovy * added separate kernels for doing flip when horizontal flip is not set * fixed build issue * Add supported case * reverted incorrect changes happened with merge --------- Signed-off-by: dependabot[bot] Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sam Wu Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> 
Co-authored-by: Lisa Co-authored-by: Sundarrajan98 Co-authored-by: Pavel Tcherniaev Co-authored-by: fiona-gladwin Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * RPP Vignette Tensor on HOST and HIP (#311) * Add Vignette Tensor HOST and HIP Implementation * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite 
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * 
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testsuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode parameter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * 
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Add Vignette Tensor HOST and HIP Implementation * Address review comments * Update rpp_hip_common.hpp * Update vignette.hpp to add rpp_hip_math_nearbyintf8() * Update Tensor_hip.cpp to add hipHostFree * Fix init * Repeated initialization bugfix * Add host case 46 --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sam Wu Co-authored-by: Kiriti Gowda Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: Sundarrajan98 Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.37.1 to 0.38.0 in /docs/sphinx (#333) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.1 to 0.38.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.1...v0.38.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * minor code cleanup * changed the declaration of shared memory * RPP Tensor Audio Support - Resample (#310) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Initial commit - slice_audio * Initial commit - mel_filter_bank * Initial commit - spectrogram * Initial commit - resample * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Remove unused variables in header file * Add axes parameter * Replace Rpp64s with Rpp32s * Replace vectors with arrays Includes optimization * Cleanup * Cleanup * Cleanup and optimize * Move malloc outside openMP loop * Optimize and precompute cutOff * Cleanup * Fix buffer used * Fix buffer used * Additional Cleanup * Fix buffer allocation Includes minor optimization * Optimize post increment operation * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved 
codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testsuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused 
functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode parameter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * move Tensor_host_audio.cpp to host folder * fix qa mismatches * move Tensor_host_audio.cpp to host folder * fix qa mismatches * move Tensor_host_audio.cpp to host folder * Add spectrogram case in Tensor_host_audio.cpp * move Tensor_host_audio.cpp to host folder * fix qa mismatches * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * Add Doxygen comments * Add Doxygen comments * minor change * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python 
files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * removed unnecessary files * removed debugging print statement * updated copyright * updated description for resample based on latest changes * converted golden outputs for resample to binary files * Passed resampling window as a parameter to resampling function * RPP Magnitude on HOST and HIP (#278) * 
Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * removed unnecessary files removed unnecessary validation checks in test suite * modified sinc to use ONE_OVER_6 macro * combined srcLength and channels into single tensor removed the usage of quality parameter since not used in the kernel * minor change * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testsuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode parameter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar 
multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = 
TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * used std functions for floor and ceil use static_cast instead of floor in the resample kernel * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * 
experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum 
Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * 
Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests 
for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 
function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * Cmake fix to prevent warning * Fix paths in new python scripts * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * Test suite fixes after tensor_min / tensor_max HOST merge * Fix max case * QA tests fix for hip and host * naming convention changes as per new std * Substitute imagePartial with partial * Substitute imageMin/imageMax with min/max * Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize * Use variable instead of batchCount*4 * Use post increment effectively * Resolve codacy warnings * Additional cleanup * remove unused variable * Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Remove auto merge boost * Spaces formatting * Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add support for mi300 (#269) * Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Cleanup by removing oneliner functions as inline * RPP Tensor Audio Support - To Decibels (#258) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Replace vectors with arrays * Cleanup * Replace Rpp64s with Rpp32s * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup 
* Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register 
--------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode parameter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * Fix build errors and qa tests in Audio Test suite * Remove auto-merge repeated funcs * Improve clarity on header docs * made changes based on review comments * stored golden outputs of to_decibels in binary file removed golden output text files for non silent region * removed unused parameter in verify_output function * updated list of cases supported in python script * added error handling for opening golden output file * Codacy fix and tests warning fix * Codacy fix * Codacy fix trial * codacy fix for checking boundaries of fstream --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Documentation - Bump rocm-docs-core[api_reference] from 0.30.2 to 0.30.3 in /docs/sphinx (#274) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.2 to 0.30.3. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.2...v0.30.3) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Adding issue template (#270) * Add files via upload * added ROCm v6, MI300, default component * Fix cast used in testsuite Includes minor fixes * Fix displaying f16 outputs * Optimize HOST min/max reduce function further * Fix spacing in HIP kernels * Fix PLN1 outputs for u8 and i8 datatypes of HOST backend * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Store reference outputs via map for min and max kernels * Update tensor_max.hpp license * Update tensor_min.hpp license * Fix output comparison check * Merge branch 'ar/opt_tensor_min_tensor_max' of https://github.com/r-abishek/rpp into sn/tensor_min_max * Modify exit condition used in outer most kernel * Modify srcIdx for HIP Tensor min * Using maximum as 255 for HIP Tensor min * Modify srcIdx for HIP Tensor max kernel Also fixes build error in testsuite * Fix corrupted outputs displayed for Tensor sum * Fix corruption issue seen with tensor sum kernel * Fix minimum for I8 Tensor max kernel * Modified HIP buffer initialization with a common function * Fix redefinition * Remove additional variables xAlignedLength * Remove unwanted xAlignedLength and xDiff * Remove redefinition of TensorSumReferenceOutputs * Fix for CI issue * Add parenthesis --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: fiona-gladwin Co-authored-by: Kiriti Gowda Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * CI - Update precheckin.groovy * Png update (#316) * PNG file conversion * reference .png files * remove JPG files * edit IMAGE_PATH * RPP Test Suite Upgrade 6 - Restructure common HIP/HOST code (#315) * moved the common functions used in the python test suites to a common python script created helper function for displaying QA test summary * reversed the order of performance runs loop and decode loop in all test suites * modified remaining python scripts to use print qa helper function for displaying QA results * added new helper function 
for printing the performance test results as a summary * added caseMax, caseMin variables in image test suite made changes to run only necessary bitdepths needed in case of qa mode --------- Co-authored-by: sampath1117 * Fix build error * removed outBegin variable * remove duplicate line in readme --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 Co-authored-by: Sam Wu Co-authored-by: Pavel Tcherniaev Co-authored-by: fiona-gladwin Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com> * Docs - Missing input and output images for Doxygen (#331) * added missing outputs for image augmentations modified the description with correct output names * added gif for voxel input and outputs * modified the output images for water, resize_crop_mirror and resize_mirror_normalize --------- Co-authored-by: sampath1117 * Scratch buffers rename for HOST and HIP (#324) * Change all maskArr to scratchBufferHip * Change all tempFloatmem to scratchBufferHost * Update CMakeLists.txt Version updates * removed f16 includes since not needed for audio fixed indentation for some code removed stream sync for unnecessary places * modified scratch buffer name used in NSR hip kernel * moved gridStride as param from kernel launch added missing hipDeviceSynchronize() in test suite * restructured python test suite * minor change * removed gridStride based processing in moving_mean_square_hip_tensor kernel * build fix made changes to get max shared memory limit in system using handle function * add comment for smem_pos function * fixed spacing in Doxygen * Update 
CMakeLists.txt Version Upgrade * Bump rocm-docs-core[api_reference] from 0.38.1 to 1.0.0 in /docs/sphinx (#337) * Bump rocm-docs-core[api_reference] from 0.38.1 to 1.0.0 in /docs/sphinx Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.38.1 to 1.0.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.38.1...v1.0.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] * Use Python 3.10 in RTD config --------- Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sam Wu * Bump rocm-docs-core[api_reference] from 1.0.0 to 1.1.0 in /docs/sphinx (#339) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.0.0 to 1.1.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.0.0...v1.1.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Gaussian Noise Voxel Tensor on HOST and HIP (#323) * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite 
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post incrmeent operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * 
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * 
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Initial commit * Merge changes and fixes for gaussian noise 3d * Test suite merge and fixes for gaussian noise 3d * added initial support for gaussian noise HOST NDHWC variant * added NCDHW support * added u8 and i8 bitdepth support * updated gaussian noise voxel host outer api to match with hip api merged gaussian noise voxel kernel codes in 2d kernel codes * resolved black pixels issue across border * minor changes * modified HIP kernel as per the latest changes * modified the description as per the latest changes * made changes in test suite * added new host compute functions for gaussian noise 3d * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * moved the copy 3d function to rpp_cpu_common.hpp * reverted incorrect changes happened with merge * fix test suite issue with RMN * revert incorrect merge changes remove empty blank lines * modify suffix from 3d to voxel for gaussian noise added U8 support for gaussian noise HIP voxel kernel * added separate copy kernel for copying input to output when mean and stddev passed is 0 * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * fixed bug in test suite * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Update notes * change function name from CHECK to CHECK_RETURN_STATUS * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Voxel Flip on HIP and HOST (#285) * added support for flip voxel * added test suite support * added golden outputs for flip voxel made changes in test suite to run QA tests for flip * updated golden outputs with correct values * minor bug fix in the hip test suite * made changes to variable names for better readability fixed comments in test suite minor cleanup * combined the flip axis factor as ternary operator in HIP kernel added new enum for error handling when source and destination layouts are not matching * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted flip voxel golden outputs to bin files * changed copyright from 2023 to 2024 * Update flip_voxel.hpp license * License - updates to 
2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite 
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * 
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode parameter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * 
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all 
variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels separately * split kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed 
cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for 
HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water 
augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor 
update * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * Cmake fix to prevent warning * Fix paths in new python scripts * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * Test suite fixes after tensor_min / tensor_max HOST merge * Fix max case * QA tests fix for hip and host * naming convention changes as per new std * Substitute imagePartial with partial * Substitute imageMin/imageMax with min/max * Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize * Use variable instead of batchCount*4 * Use post increment effectively * Resolve codacy warnings * Additional cleanup * remove unused variable * Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0.
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Remove auto merge boost * Spaces formatting * Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add support for mi300 (#269) * Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Cleanup by removing oneliner functions as inline * RPP Tensor Audio Support - To Decibels (#258) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Replace vectors with arrays * Cleanup * Replace Rpp64s with Rpp32s * Optimize and precompute cutOff * Fix buff… * modify CHECK to CHECK_RETURN_STATUS in hip audio test suite * removed additional line added in merge * Minor common-fixes for HIP (#345) * Use scratchBufferHip * minor fix * remove additional variable use * Add CHECK_RETURN_STATUS to hip API * handle fix * Readme Updates: --usecase=rocm (#349) * RPP Tensor Audio Support - Spectrogram (#312) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed 
the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load 
and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * Initial commit - Spectrogram * Add QA .bin reference file * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost 
LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Address internal review comments * Modify cmakelist * Fix QA mismatch * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite 
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * 
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * 
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Fix build errors on OCL backend * Merge remote-tracking branch 'origin' into sn/audio_spectrogram_master_merge * Fix build error in tensor testsuite * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Update notes * Address review comments * Revert change in runTests.py * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Voxel Flip on HIP and HOST (#285) * added support for flip voxel * added test suite support * added golden outputs for flip voxel made changes in test suite to run QA tests for flip * updated golden outputs with correct values * minor bug fix in the hip test suite * made changes to variable names for better readability fixed comments in test suite minor cleanup * combined the flip axis factor as ternary operator in HIP kernel added new enum for error handling when source and destination layouts are not matching * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted flip voxel golden outputs to bin files * changed copyright from 2023 to 2024 * Update flip_voxel.hpp license * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro 
Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color 
temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update
subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ...
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ...
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar 
multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput =
TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 
load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for 
Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * 
changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp… * Update CHANGELOG.md (#352) * RPP Tensor Audio Support - Slice (#325) * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t. QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes *
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * 
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all 
variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed 
cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for 
HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water 
augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor 
update * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * Cmake fix to prevent warning * Fix paths in new python scripts * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * Test suite fixes after tensor_min / tensor_max HOST merge * Fix max case * QA tests fix for hip and host * naming convention changes as per new std * Substitute imagePartial with partial * Substitute imageMin/imageMax with min/max * Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize * Use variable instead of batchCount*4 * Use post increment effectively * Resolve codacy warnings * Additional cleanup * remove unused variable * Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Remove auto merge boost * Spaces formatting * Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add support for mi300 (#269) * Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Cleanup by removing oneliner functions as inline * RPP Tensor Audio Support - To Decibels (#258) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Replace vectors with arrays * Cleanup * Replace Rpp64s with Rpp32s * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup 
* Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register 
--------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode parameter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * Fix build errors and qa tests in Audio Test suite * Remove auto-merge repeated funcs * Improve clarity on header docs * made changes based on review comments * stored golden outputs of to_decibels in binary file removed golden output text files for non silent region * removed unused parameter in verify_output function * updated list of cases supported in python script * added error handling for opening golden output file * Codacy fix and tests warning fix * Codacy fix * Codacy fix trial * codacy fix for checking boundaries of fstream --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Documentation - Bump rocm-docs-core[api_reference] from 0.30.2 to 0.30.3 in /docs/sphinx (#274) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.2 to 0.30.3. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.2...v0.30.3) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Adding issue template (#270) * Add files via upload * added ROCm v6, MI300, default component * Fix cast used in testsuite Includes minor fixes * Fix displaying f16 outputs * Optimize HOST min/max reduce function further * Fix spacing in HIP kernels * Fix PLN1 outputs for u8 and i8 datatypes of HOST backend * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Store reference outputs via map for min and max kernels * Update tensor_max.hpp license * Update tensor_min.hpp license * Fix output comparison check * Merge branch 'ar/opt_tensor_min_tensor_max' of https://github.com/r-abishek/rpp into sn/tensor_min_max * Modify exit condition used in outer most kernel * Modify srcIdx for HIP Tensor min * Using maximum as 255 for HIP Tensor min * Modify srcIdx for HIP Tensor max kernel Also fixes build error in testsuite * Fix corrupted outputs displayed for Tensor sum * Fix corruption issue seen with tensor sum kernel * Fix minimum for I8 Tensor max kernel * Modified HIP buffer initialization with a common function * Fix redefinition * Remove additional variables xAlignedLength * Remove unwanted xAlignedLength and xDiff * Remove redefinition of TensorSumReferenceOutputs * Fix for CI issue * Add parenthesis --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: fiona-gladwin Co-authored-by: Kiriti Gowda Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * CI - Update precheckin.groovy * modified the slice kernel and api as per the latest changes * added test case of 1D slice in audio test suite * reverted unwanted changes * updated the slice voxel testing configuration to validate the kernel correctly * updated the description for slice voxel gpu kernel * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * revert incorrect changes happened with merge * fix build issue in test suite * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * added missed validation checks for slice api removed unnecessary param in HIP kernel * removed redundant variable * moved the initializations required for slice in test suite to a separate helper function * reorganized code for better reusability * add comment for init_slice_voxel() function * modify NSR kernel output types to make it compatible with latest slice * code cleanup added error code for layout mismatch * added slice test case in HOST Image test suite * added test case for slice in image HIP test suite * fixed layout condition check for NHWC slice kernel * minor change * added golden output for slice 2d and 3d cases * freed memory for buffers allocated for slice in test suite * updated the validation check for slice in voxel test suite * Update rpp_test_suite_common.h to add set_generic_descriptor_slice * Update Tensor_host.cpp * Update Tensor_hip.cpp * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Update notes * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Voxel Flip on HIP and HOST (#285) * added support for flip voxel * added test suite support * added golden outputs for flip voxel made changes in test suite to run QA tests for flip * updated golden outputs with correct values * minor bug fix in the hip test suite * made changes to variable names for better readability fixed comments in test suite minor cleanup * combined the flip axis factor as ternary operator in HIP kernel added new enum for error handling when source and destination layouts are not matching * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted flip voxel golden outputs to bin files * changed copyright from 2023 to 2024 * Update flip_voxel.hpp license * License - updates to 2024 and consistency changes (#298) * Match all 
CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed 
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite 
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps 
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2)… * RPP Tensor Audio Support - MelFilterBank (#332) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Initial commit - slice_audio * Initial commit - mel_filter_bank * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Remove unused variables in header file * Add axes parameter * Replace Rpp64s with Rpp32s * Replace vectors with arrays Includes optimization * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Fix buffer allocation Includes minor optimization * Optimize post increment operation * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor 
change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor 
changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * move Tensor_host_audio.cpp to host folder * fix qa mismatches * move Tensor_host_audio.cpp to host folder * fix qa mismatches * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * Add Doxygen comments * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * Initial commit - Spectrogram * Add QA .bin reference file * License - 
updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Address internal review comments * Modify cmakelist * Fix QA mismatch * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed
comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite
Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps
[rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post incrmeent operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * 
minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * 
minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: 
Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix *
remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Fix build errors on OCL backend * Fix spectrogram Removes slice kernel * Cleanup Modify reference outputs * Merge remote-tracking branch 'origin' into sn/audio_spectrogram_master_merge * Fix build error in tensor testsuite * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Change to camelCase for variable naming Also includes cleanup * Cleanup testsuite for MFB * Update notes * Address review comments * Revert change in runTests.py * Modified codes to use handle memory Also fixes reference output file * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.37.0...v0.37.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Voxel Flip on HIP and HOST (#285) * added support for flip voxel * added test suite support * added golden outputs for flip voxel made changes in test suite to run QA tests for flip * updated golden outputs with correct values * minor bug fix in the hip test suite * made changes to variable names for better readability fixed comments in test suite minor cleanup * combined the flip axis factor as ternary operator in HIP kernel added new enum for error handling when source and destination layouts are not matching * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted flip voxel golden outputs to bin files * changed copyright from 2023 to 2024 * Update flip_voxel.hpp license * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro 
Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color
temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update 
subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copywright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Intial commit - pre_emphasis_filter * Intial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post incrmeent operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar 
multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to add display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed treshold dictionary and added performance Noise treshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performane summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = 
TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 
load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels seperately * splitted kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for 
Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * 
changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HI… * RPP Tensor Normalize ND on HOST and HIP (#335) * Change enum name * Support Batch processing Includes few fixes * Fix testsuite * Add Voxel unittest change testSuite CMakeLists * Add Doxygen Voxel augmentations * minor change * Add readme for Voxel test suite * Cleanup Includes modification in function naming for fmadd operation * Modify HIP testsuite * Optimize AVX Includes testsuite name change for normalize * Fix output dump issue in HIP and profiler logs * Move __AVX2__ flag * Changes to remove localThreads definitions, add _hip to kernel names * Fix QA reference inputs Also includes reverting to 16 pixel load for AVX * Fix codacy warnings * Fix toggle variant HWC -> CHW * Fix conflicting ROI types in API between HIP and HOST Also includes U8 support for slice * Use ROI Tensor instead of roi pointer * Add support for ND channel normalize * Add support for ND channel normalize * Fix usage of begin values Includes fixing of function names as per axis_mask * Add support for audio kernel * resolved issue with QA mode after U8 addition * made changes to display the exact variant being run in QA mode and performance test mode * minor change * resolved issue with unit test mode changed few variables from snake_case to camel case * reset DEBUG_MODE flag * resolved issue with HIP profiler tests * Add testsuite support for audio * Fix audio normalize testsuite Also adds QA reference outputs for normalize audio * Cleanup * Improve readability for normalize ND QA mode * Support ND axes normalize * Add templated C version for u8->f32 and i8->f32 * Update docs Also adds error code for invalid datatype for Slice kernel * Fix i8->f32 datatype * Update docs * Modify normalize testsuite to supporting any ND kernel Fix merge issues Also removes other voxel kernels * Fix audio testsuite and runMiscTests script * Disable QA tests when toggle is set in runMiscTests script * Support internal mean 
and stddev computation for 3D * Fix Axis mask for 3D Includes cleanup and testsuite changes * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Update rppdefs.h for comments on2D/ 3D types * Rename to fused_multiply_add_scalar * Implement collapse axis functionality for ND * Implement mean and stddev internal compute for ND normalize * Fix paramStride after collapse axis for ND * Fix build error * Fix mean and stddev compute in ND Cleanup * Cleanup * Additional cleanup * Fix strides for 2D and 3D Also includes fix for normalize ND kernel after collapse axis * minor changes * added QA inputs for 3D data * fixed issue with idx used for mean and std dev in case of ND Normalize * resolved the segfault issue with collapse axis for batch size > 1 * Fix 3d mean and stddev compute for axismask 5 Includes cleanup * Cleanup 2d audio kernel and fix audio testsuite Also handled striding for mean and stddev tensors when input dimensions within batch differs * Fix maxSize compute in normalize ND kernel * fixed normalization function for 3D * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water 
case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * 
Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * Change names of ref outputs * Fix host test suite cmake * Add Voxel tests for ctest and CI * Remove boost deps and change name fmadd to fused_multiply_add_scalar * Add project name to remove warning * Add scriptPath variable usage to make paths generic for CI * Move CHECK to header * Add C++17 warning fix * Add clarity in final QA result display - match voxel tests with other tensor tests * Build fixes * Fix merge issue of double call to set_max_dimensions * Add clarity on QA test final result * Add references for sample nii image usage * Remove tensor voxel slice augmentation output sample from main ReadMe * Codacy fix * resolved output mismatch issue with axismask5 * Fix index of roiTensor used in maxSize compute Includes cleanup Adds QA inputs and outputs for 3d axis 0,1 with mean and stddev input * Add QA for 4d with internal mean and stddev compute for axis 0,1,3 * Add extra QA tests to support code coverage * Add comments * Update doxygen for normalize ND Includes minor fix in audio testsuite * added normalize hip codes * reverted unwanted changes happened with merge * remove ricap mods * removed unwanted file changes * minor bug fix * reverted back to 1 pixel load and store for 2D kernel for better performance * experimental change * removed experimental change made the compute mod function as inline * avoided the reusage of power inside for loop * allocated pinned memory in handle and used same buffers in normalize kernel * restructured code in ND kernel * made mean and stddev buffers as gpu memory instead of pinned memory * reveted back few changes in test suite for supporting qa mode with axismask 3 * added condition to compute param index only when max param volume is not 1 * fixed the issue with numDims in normalize HOST * added initial version for mean compute of 2D inputs for axisMask1 axisMask2 * added executor for mean kernel launch for 2D 
inputs * added kernels for mean compute for 2D inputs * added mean compute support for 2 axes cases for 3d inputs * added mean compute for axisMask 4 and axisMask 5 cases * added mean compute for axisMask 3 and axisMask 6 for 3d inputs * added support for axisMask 7 for 3D inputs * restructured kernel launch for mean compute for 2D and 3D inputs * combined all reduction kernels to single kernel * moved common reduction to a helper function so that it can be resued * added initial support for stddev 2d inputs * added stddev compute support for 2d and 3d inputs * bug fix on boundary condition && mean index calculation for 3D inputs * bug fix for axisMask 7 for 3D inputs * added initial support for nd mean and stddev compute * added final kernel for computing mean and std values for ND * optimized nd mean and stddev compute if number of meanss/stddev computations is lesser than max shared memory size * removed redundant code * nwc - fixed the performance issue with axismask 7 * resolved the performance issue with axisMask == 3 and axisMask == 4 * bug fix for axisMask == 4 * fixed the performance issues with axisMask 6 * removed the usage of mod calculation for normalize 2d kernel * removed the usage of mod calculation for normalize 3d kernels removed the usage of paramShape and paramStrides buffers from 2d and 3d kernels since not needed anymore * minor change for axisMask 6 * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * License - updates to 2024 and consistency changes (#298) * Match all CMakeLists.txt license as per RPP's outermost LICENSE file * Match all python files' license as per RPP's outermost LICENSE file * Match all .hpp files' license as per RPP's outermost LICENSE file * Match all .cpp files' 
license as per RPP's outermost LICENSE file * Match all .h files' license as per RPP's outermost LICENSE file * Remove all rights reserved as per LICENSE file * Remove double space in "Copyright (c) 2019 - 2023 Advanced Micro Devices, Inc." * Match all .cmake files' license as per RPP's outermost LICENSE file * Match all .cpp.in files' license as per RPP's outermost LICENSE file * Replace 283 occurrences in 282 files - 2023 to 2024 * Add "MIT License" title to 281 instances * Add missing license * Test - Update README.md for test_suite (#299) * Bump rocm-docs-core[api_reference] from 0.33.0 to 0.33.1 in /docs/sphinx (#301) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.0 to 0.33.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.0...v0.33.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump rocm-docs-core[api_reference] from 0.33.1 to 0.33.2 in /docs/sphinx (#302) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.1 to 0.33.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.1...v0.33.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * modified the axisMask order in kernel for better categorization * categorized kernels into multiple sections and added info * Update doc codeowners (#303) * Documentation - Bump rocm-docs-core[api_reference] from 0.33.2 to 0.34.0 in /docs/sphinx (#304) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.33.2 to 0.34.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.33.2...v0.34.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Test suite - upgrade 5 qa perf (#305) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove 
tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Abishek <52214183+r-abishek@users.noreply.github.com> Co-authored-by: Snehaa Giridharan Co-authored-by: r-abishek * RPP Color Temperature on HOST and HIP (#271) * Initial commit - Color Temperature HOST Tensor * Initial commit - Color Temperature HIP Tensor * Add color temperature golden outputs * address review comments * Use reinterpret_cast instead of static_cast * Combine templated functions to support all datatypes into one (got minor perf difference of order 3%) Also fixes indentation * Fix i8 datatype * Cleanup * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Fix PLN3 variant outputs Also modifies reference outputs * Update color_temperature.hpp license * Delete color_temperature_u8_Tensor_PKD3.csv * Delete color_temperature_u8_Tensor_PLN3.csv --------- Co-authored-by: snehaa8 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * RPP Voxel 3D Tensor Add/Subtract scalar on HOST and HIP (#272) * added HOST support for voxel add kernel * added HIP support for voxel add kernel * added test suite support for add scalar * added Doxygen support and modified hip kernel function names as per new standard * added HOST support for voxel subtract kernel * added HIP support for voxel subtract kernel * added test suite support * updated the golden outputs for subtract with correct values * removed unnecessary validation checks * Remove double spaces * Fix header * Fix all retval docs * Fix docs to add memory type * Fix comment * Add divider comment * Use 
post-increment efficiently * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted add and subtract scalar golden outputs to bin files * changed copyright from 2023 to 2024 * Update add_scalar.hpp license * Update subtract_scalar.hpp license --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * RPP Magnitude on HOST and HIP (#278) * Initial commit - Magnitude HOST Tensor * Add QA reference outputs * Update runTests.py * Initial commit - Magnitude HIP Tensor * Add dual input support in testsuite * Optimize HOST kernel further * Optimize i8 datatype further * Modify comments * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Update Copyright year * Combine templated functions to support all datatypes * Modify format of reference outputs * Update rppi_arithmetic_operations.h license * Update rppt_tensor_arithmetic_operations.h license * Update host_tensor_arithmetic_operations.hpp * Update magnitude.hpp license * Update hip_tensor_arithmetic_operations.hpp license * Delete magnitude_u8_Tensor_PKD3.csv * Delete magnitude_u8_Tensor_PLN1.csv * Delete magnitude_u8_Tensor_PLN3.csv * Update rpp_test_suite_common.h license * Update runTests.py license * Update Tensor_hip.cpp license * Update runTests.py license * Update Tensor_host.cpp license --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> * moved normalize from geometric to statistical * removed commented lines in test suite * renamed normalize_generic.hpp to normalize.hpp updated copyright * moved common helper in misc HOST and HIP test suites to a separate header file * Bump rocm-docs-core[api_reference] from 0.34.0 to 0.34.2 in /docs/sphinx (#309) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.0 to 0.34.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.0...v0.34.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Tensor Audio Support - Down Mixing (#296) * Initial commit - Non silent region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Initial commit - pre_emphasis_filter * Initial commit - down_mixing * Replace vectors with arrays * Cleanup * Minor cleanup * Optimize downmixing Kernel Includes cleanup * Replace Rpp64s with Rpp32s * Cleanup * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Optimize post increment operation * Optimize post increment operation * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unnecessary audio wav files fixed bug in ROI update for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * added doxygen changes for preemphasis filter * updated changes for preemphasis filter in test suite * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add 
three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testsuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 
variants * made changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode parameter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * move tensor_host_audio.cpp to host folder * Fix build errors and qa tests in Audio Test suite * Fix build errors and qa tests in Audio Test suite * Add reference output and test samples for downmix * Add down_mix in augmentation list and supported cases * Remove auto-merge repeated funcs * Improve clarity of header docs * Remove blank line * Improve clarity on header docs * Add Doxygen comments * minor change * converted golden outputs to binary file for downmixing * removed old golden output file for preemphasis and todecibels * modified info for downmixing as per new changes used handle memory for temporary buffers * formatting changes * moved the common code for SSE and AVX to outside * Update down_mixing.hpp license * Update rppt_tensor_audio_augmentations.h * combined the srcLength and channels tensors into single tensor --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sundarrajan98 * RPP Voxel 3D Tensor Multiply scalar on HOST and HIP (#306) * added HIP support for voxel scalar 
multiply kernel * added HOST support for voxel multiply kernel added golden outputs for voxel multiply kernel * merge with master * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * converted multiply scalar voxel golden outputs to bin files * changed copyright from 2023 to 2024 --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Test Suite Bugfix (#307) * experimental changes for adding qa mode for performance tests * made changes to display more information w.r.t QA results summary for performance tests * minor changes * Add changes to dump qa results to excel file * Add performance QA for three new tensor functions * update prerequisites in readme * added changes to handle unsupported cases * removed threshold dictionary and added performance Noise threshold add new dataset for performance QA * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparison functions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Changes to the performance summary dataframe * minor changes * Update CMakeLists.txt to add ${CMAKE_CURRENT_SOURCE_DIR} for CI * Update CMakeLists.txt fix * Update CMakeLists.txt fix * remove tabulate dependency * Update README.md to remove tabulate pip install * Fix for CI machine failure * Add note on performance * Fix segmentation fault * Revert QAmode to restrict HIP bitdepths * Use Rpp64u for HOST while comparing outputs * Fix ambiguous abs call * Fix for SLES CI HIP fail - error: incompatible pointer types assigning to 'unsigned long *' from 'unsigned long long *' - refOutput = 
TensorSumReferenceOutputs[numChannels].data(); --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM Co-authored-by: Snehaa Giridharan Co-authored-by: Pavel Tcherniaev * modified fill_roi_values function * made the changes w.r.t scriptPath * moved rpp_rsqrt_avx under rpp math helpers reverted unwanted file changes * Bump rocm-docs-core[api_reference] from 0.34.2 to 0.35.0 in /docs/sphinx (#313) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.34.2 to 0.35.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.34.2...v0.35.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Reduction - Tensor min and Tensor max on HOST and HIP (#260) * Minor Change * Add Validation check for DST_FOLDER path * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * Add Validation checks for all options in testAllScript.sh * Add sanity check for dual Input cases Set Max Dimension and Max Image Dump Replaced Fast DCT tag with Accurate DCT * Regenerate golden outputs using accurate dct Flag Add golden outputs for some new augmentations * Fix Flip golden outputs mismatch Fix PLN3 variants mismatch in QA mode * Add MAX_BATCH_SIZE check removed Augmentations function calls for failing Qa modes code cleanup * Add crop and gamma correction augmentations code cleanup * Add comments to functions in rpp_test_suite_common.h * minor change * code cleanup * minor code changes * Change roi and Image sizes for crop augmentation * Change numIterations option to numRuns Addressed PR comments * 
added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * Add turboJpeg header to update maxHeight and maxWidth values * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Change the performance Timings logic * Add Avx2 implementation for F32 and U8 toggle variants * minor change to support u8_f16 and u8_f32 cases * Regenerate LUT golden outputs with ACCURATE_DCT tag * Minor code changes * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * Made changes to the runTests.py in Host to remove testAllScipts.sh * Made changes to the runTests.py in HIP to remove testAllScipts.sh * Initial commit - Image min and max Reduction kernel Includes * u8 datatype for both min and max HOST Tensor of all variants. * Testsuite changes. * NWC -initial code for min max PLN3 - PLN3 * made changes to split min and max kernels separately * split kernels for min and max * made changes to print final max/min in the R,G,B channels * fixed inaccuracies in min/max computation * made changes to typecast intermediate output to output requested by user added comments for the code code cleanup and minor changes in test suite * fixed build issues removed image folders used for min, max and sum reverted unwanted file changes * minor changes in test suite * removed support for unwanted test case in Tensor_hip.cpp * Adds new option roi * remove testAllScripts.sh * Adds roi Option in HIP backend * Implement f32 variants * Implement f16 and i8 datatype variants * change F32 load and store logic * Add build flags in CMakeLists.txt to set AVX/SSE flags based on the system configuration * minor code changes * Initial commit - Image sum Reduction kernel Includes u8 PLN1 -> PLN1 conversion for HOST Tensor * Implement PKD3 and PLN3 for Image sum Tensor HOST * Support i8, f16 and 
f32 datatypes * Initial commit - Image sum Reduction HIP kernel Includes u8 PLN1 -> PLN1 conversion for Tensor * Implement PKD3 and PLN3 for Image sum Tensor HIP * Add support in testsuite Revert normalization for i8 HOST Tensor variants * Fix HIP testsuite Remove additional blanks for 1 channel output * Modify print statement in HIP testsuite * Improve readability for testsuite outputs * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * Fix HIP to support larger inputs * optimized load and store functions for water U8 and F32 variants in host removed commented code * Cleanup * removed golden outputs for water * minor changes * Cleanup Support Reduction QA test in testsuite * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * Remove unused variables and C style casting * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * Optimize u8 datatype further * Fix static_cast * made changes to resolve codacy warnings * changed cast to c++ style in hip kernel * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * added rotate case with golden outputs changed generic bilinear HOST codes to match with HIP codes * Add golden output for remaining all tensor augmentations * fix python script issues * Optimize u8 and i8 datatype Uses uint and int internal processing instead of float * Fix testsuite build errors * minor change * Fix QA check * Modify api naming from image_sum to 
tensor_sum Includes changes for both HOST and HIP * Support HIP Backend for RICAP * change rcm and rmn golden outputs * Fix HIP pkd3->pkd3 variant * changes based on review comments * change test_suite folder to tests * Optimize u8 and i8 datatype of HIP Includes modification in naming of shared memory * minor fix * changed generic nn F32 loads using gather and setr instructions * Optimize and cleanup U8 HIP * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Fix i8 datatype variants Includes cleanup * Fix the issues with color_to_greyscale * remove the empty folder creation * reverting back the folder name change * minor change * added comments for latest changes * minor change * Improve readability and Cleanup * Fix QA for HIP Includes cleanup * resolved review comments * minor change * Modify api naming from image_ to tensor_ for HOST * Add support for QA tests * removed range check for RMN U8-F32 and U8-F16 variants changed from hipMemset to hipMemsetAsync for RMN HIP Kernel removed multiplication by 255 for stdDev in RMN HOST U8-F16 and U8-F32 variants * Modify naming of shared memory with _smem in HIP Includes cleanup * Typecast and reuse markArr for HIP U8 and I8 * Cleanup and minor optimization * minor fix * fix codacy warnings * Additional cleanup * Cleanup and move #define * Changed the complexity of if statements in runTests.py * Cleanup testsuite Includes new golden outputs * Additional testsuite fixes * Minor cleanup * Codacy fixes * Fix codacy warnings * Codacy fix * Address other codacy warnings * cleanup * Change Image functions to generic * Update ricap.hpp with reference paper * resolved minor issues that happened with merge * minor changes * fixed minor issue with getting profiler times * minor formatting changes * resolved build issues in test suite renamed the min and max kernel file names * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST 
Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup * Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testsuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with 
latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register --------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * Cmake fix to prevent warning * Fix paths in new python scripts * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * Test suite fixes after tensor_min / tensor_max HOST merge * Fix max case * QA tests fix for hip and host * naming convention changes as per new std * Substitute imagePartial with partial * Substitute imageMin/imageMax with min/max * Replace hipMemset with hipMemsetAsync, and replace hipDeviceSynchronize with hipStreamSynchronize * Use variable instead of batchCount*4 * Use post increment effectively * Resolve codacy warnings * Additional cleanup * remove unused variable * Documentation - Bump rocm-docs-core[api_reference] from 0.28.0 to 0.29.0 in /docs/sphinx (#265) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.28.0 to 0.29.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Remove auto merge boost * Spaces formatting * Bump rocm-docs-core[api_reference] from 0.29.0 to 0.30.1 in /docs/sphinx (#268) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.29.0 to 0.30.1. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.29.0...v0.30.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * add support for mi300 (#269) * Documentation - Bump rocm-docs-core[api_reference] from 0.30.1 to 0.30.2 in /docs/sphinx (#273) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.1 to 0.30.2. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.1...v0.30.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Cleanup by removing oneliner functions as inline * RPP Tensor Audio Support - To Decibels (#258) * Initial commit - Non slient region detection Includes unittest setup * Initial commit - To Decibels Includes unittest setup * Replace vectors with arrays * Cleanup * Replace Rpp64s with Rpp32s * Optimize and precompute cutOff * Fix buffer used * Fix buffer used * Additional Cleanup * Update testsuite for Audio * code cleanup * Add Readme file for Audio test suite * changes based on review comments * minor change * Remove unittest folders and updated README.md * Remove unit tests * minor change * code cleanup * added common header file for audio helper functions * removed unncessary audio wav files fixed bug in ROI updation for audio test suite resolved issue in summary generation for performance tests in python * removed log file * added doxygen support for audio * added doxygen changes for to_decibels * updated test suite support for to_decibels * minor change * removed the usage of getMax function and used std::max_element * modularized code in test suite * merge with latest changes * minor change * minor change * resolved codacy warnings * Codacy fix - Remove unused cpuTime * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * resolved issue with file_system dependency in test suite * Doxygen changes changed malloc to new in NSR kernel * RPP RICAP Tensor for HOST and HIP (#213) * Initial commit - Ricap HOST Tensor Includes testsuite changes * Add QA tests for RICAP Used three_images_224x224_src1 folder to create golden outputs * Add three_images_224x224_src1 into TEST_IMAGES * Support HIP Backend for RICAP * Fix HIP pkd3->pkd3 variant * regenerated golden outputs for RICAP minor changes in HOST shell script for handling RICAP in QA mode * minor bug fix in RICAP HIP kernels * Improve readability and Cleanup 
* Additional cleanup * Cleanup testsuite Includes new golden outputs * Additional testuite fixes * Minor cleanup * Fix codacy warnings * Address other codacy warnings * Update ricap.hpp with reference paper * Add RICAP dataset path in readme * Make changes to error codes returned * Modify roi crop region for unit and perf tests * RPP Tensor Water Augmentation on HOST and HIP (#181) * added water HOST and HIP codes * added water case in test suite * added golden outputs for water * added omp thread changes for water augmentation * experimental changes * fixed output issue with AVX2 instructions * added AVX2 support for PKD3 load function minor changes in PLN variant load functions * nwc commit - added avx2 changes for u8 layout toggle variants but need to add store functions for completion * Add Avx2 implementation for F32 and U8 toggle variants * Add AVX2 support for u8 pkd3-pln3 and i8 pkd3-pln3 for water augmentation * change F32 load and store logic * optimized the store function for F32 PLN3-PKD3 * reverted back irrelevant changes * minor change * optimized load and store functions for water U8 and F32 variants in host removed commented code * removed golden outputs for water * minor changes * renamed few functions and removed unused functions updated i8 pln1 load as per the optimized u8 pln1 load * fixed bug in i8 load function * changed cast to c++ style resolved spacing issues and added comments for AVX codes for better understanding made changes to handle cases where QA Tests are not supported * added golden outputs for water * updated golden outputs with latest changes * modified the u8, i8 pkd3-pln3 function and added comments for the vectorized code * fixed minor bug in I8 variants * made to changes to resolve codacy warnings * changed cast to c++ style in hip kernel * changed generic nn F32 loads using gather and setr instructions * added comments for latest changes * minor change * added definition for storing 32 and 64 bits from a 128bit register 
--------- Co-authored-by: sampath1117 Co-authored-by: HazarathKumarM * Fix build error * CMakeLists - Version Update 1.5.0 - TOT Version * CHANGELOG Updates Version 1.5.0 placeholder * Boost deps fix for test suite --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda * Documentation - Readme & changelog updates (#251) * readme and changelog updates for 6.0 * minor update * added ctests for audio test suite for CI made changes to add more clarity on the QA Tests results * Cmake mods for ctest * HOST-only build error bugfix * added qa mode paramter to python audio script added golden output map for QA testing of Non silent region detection * minor change * Documentation - Bump rocm-docs-core[api_reference] from 0.26.0 to 0.27.0 in /docs/sphinx (#253) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.26.0 to 0.27.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.26.0...v0.27.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * RPP Resize Mirror Normalize Bugfix (#252) * added fix for hipMemset * remove pixel check for U8-F32 and U8-F16 for HOST codes --------- Co-authored-by: sampath1117 * added example for MMS calculation in comments for better understanding * Sphinx - updates (#257) * Sphinx - updates * Doxygen - Updates * Docs - Remove index.md * updated info used to for running audio test suite * removed bitdepth variable from audio test suite * added more information on computing NSR outputs in the example added * Fix doxygen for decibels Also removes extra QA reference files * Fix build errors and qa tests in Audio Test suite * Remove auto-merge repeated funcs * Improve clarity on header docs * made changes based on review comments * stored golden outputs of to_decibels in binary file removed golden output text files for non silent region * removed unused parameter in verify_output function * updated list of cases supported in python script * added error handling for opening golden output file * Codacy fix and tests warning fix * Codacy fix * Codacy fix trial * codacy fix for checking boundaries of fstream --------- Signed-off-by: dependabot[bot] Co-authored-by: Snehaa Giridharan Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Kiriti Gowda Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Documentation - Bump rocm-docs-core[api_reference] from 0.30.2 to 0.30.3 in /docs/sphinx (#274) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.30.2 to 0.30.3. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/RadeonOpenCompute/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.30.2...v0.30.3) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Adding issue template (#270) * Add files via upload * added ROCm v6, MI300, default component * Fix cast used in testsuite Includes minor fixes * Fix displaying f16 outputs * Optimize HOST min/max reduce function further * Fix spacing in HIP kernels * Fix PLN1 outputs for u8 and i8 datatypes of HOST backend * RPP Test Suite Upgrade 4 - CSV to BIN conversions for file size reduction (#293) * change golden outputs from .csv files to .bin files * Changed comparision funtions to use .bin files * Address review comments * minor change * Address review comments * minor change --------- Co-authored-by: HazarathKumarM * Bump rocm-docs-core[api_reference] from 0.31.0 to 0.33.0 in /docs/sphinx (#294) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.31.0 to 0.33.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.31.0...v0.33.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Store reference outputs via map for min and max kernels * Update tensor_max.hpp license * Update tensor_min.hpp license * Fix output comparison check * Merge branch 'ar/opt_tensor_min_tensor_max' of https://github.com/r-abishek/rpp into sn/tensor_min_max * Modify exit condition used in outer most kernel * Modify srcIdx for HIP Tensor min * Using maximum as 255 for HIP Tensor min * Modify srcIdx for HIP Tensor max kernel Also fixes build error in testsuite * Fix corrupted outputs displayed for Tensor sum * Fix corruption issue seen with tensor sum kernel * Fix minimum for I8 Tensor max kernel * Modified HIP buffer initialization with a common function * Fix redefinition * Remove additional variables xAlignedLength * Remove unwanted xAlignedLength and xDiff * Remove redefinition of TensorSumReferenceOutputs * Fix for CI issue * Add parenthesis --------- Signed-off-by: dependabot[bot] Co-authored-by: HazarathKumarM Co-authored-by: sampath1117 Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> Co-authored-by: fiona-gladwin Co-authored-by: Kiriti Gowda Co-authored-by: Lisa Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> * CI - Update precheckin.groovy * added bin golden input and output for 2d data made changes in test suite to support the reading and output comparision from bin files removed the olde golden input and output .txt files * added golden inputs for 2d mean and std added golden output for 2d when mean and std is passed from user modified the helper functions to calculate the strides for 2 modes of normalize * added golden input and output for 3D data * fix for output mean and stddev outputs compute for axisMask 3 * fixed the precision issue with 3d 
normalization kernel when mean and std is passed from user further cleanup in test suite * use static_cast instead of c style casting * added template argument to kernels for supporting multiple bitdepths * Revert rpp_load24_f32pkd3_to_f32pln3_avx() Cleanup comments in HOST normalize * Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sphinx (#319) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.0 to 0.35.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.0...v0.35.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Fixed output mismatch seen with 3d HOST normalize kernel when mean and stddev are passed from user * Fix outputs with 2d normalize HOST * Fix HOST 2d outputs when AxisMask is set to 1 with mean and stddev computed internally * Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sphinx (#322) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.35.1 to 0.36.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.35.1...v0.36.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Change all maskArr to scratchBufferHip * Change all tempFloatmem to scratchBufferHost * Cleanup * combined multiple params as a single param wherever possible in kernel launch made the descriptor pointer as pinned memory * Removed the unnecessary memcpy for ND normalize * added axisMask as additional param from test suite added caseMin, caseMax changes and qaMode parameter to python test suite used helper function for displaying qa mode results * remove unnecessary variable in test suite added roi start co-ordinates in index calculation * updated source index calculation with roi begin values for 2d and nd mean, stddev compute kernels * change variable from snake case to camel case updated source index calculation with roi begin values for 3d mean, stddev compute kernels * Modify HOST testsuite to process AxisMask * Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /docs/sphinx (#328) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Link cleanup (#326) * link updates * update tables * pare down index * API cleanup * consistency * verbiage * Update notes * fix the logic for ND ROI based index calculation * added helper function for setting the description pointer in misc test suite * Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /docs/sphinx (#329) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 0.37.0 to 0.37.1. - [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCom… * SWDEV-459739 - Remove the package obsolete setting (#353) The package was obsoleting itself and was causing upgrade issues. Removed the same. * modified code to use globalThreads_z from description pointer instead from handle api fixed incorrect warning message in test suite cmake * Audio support merge commit fixes (#354) * add NFT and NTF layouts * Set layout for spectrogram and melfilterbank directly in testsuite * Remove extra blank line in testsuite * minor changes in test suite * minor change in MFB description --------- Co-authored-by: Snehaa Giridharan Co-authored-by: sampath1117 * renamed instances of tensor_hip_audio to tensor_audio_hip * made helper functions inline * added comments for prev_pow2 and next_pow2 functions * change max reduction kernel block size from 512 to 256 * change base types to RPP types in hip executor * added error codes for tile length and shared memory * minor change * Bump rocm-docs-core[api_reference] from 1.1.1 to 1.1.2 in /docs/sphinx (#358) Bumps [rocm-docs-core[api_reference]](https://github.com/RadeonOpenCompute/rocm-docs-core) from 1.1.1 to 1.1.2. 
- [Release notes](https://github.com/RadeonOpenCompute/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/RadeonOpenCompute/rocm-docs-core/compare/v1.1.1...v1.1.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Docker updates (#356) * docker updates * remove spaces * remove spaces * install half from source * remove advance AMD flag * review comments * Version Updates (#359) * add versioning to rpp to check in other projects * add renamed file * remove square helper function in HIP kernel * optimize index computation in prefix_sim function * add more comments in code for better readability * Merge branch 'develop' into sr/nsr_hip * removed print statement for invalid cases in hip executor changed function name of smem_pos * vectorized computations in compute_prefix_sum function * increased scratch memory allocation and removed the hipMalloc and hipFree inside the hip kernel * added explanation for the scratch memory allocated in HIP backend * added comments for compute_prefix_sum helper function * added audio flag changes in HIP test suite cmake * removed empty blank line * added const variable for maximum MMS buffer size added CTests for audio HIP kernels * added few more comments for moving mean square hip kernel * minor change to comment * audio test suite changes for python 2 compatibility --------- Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: sampath1117 Co-authored-by: kiritigowda Co-authored-by: Lisa Co-authored-by: HazarathKumarM Co-authored-by: Sam Wu Co-authored-by: Snehaa Giridharan Co-authored-by: Snehaa-Giridharan <118163708+snehaa8@users.noreply.github.com> 
Co-authored-by: Sundarrajan98 Co-authored-by: Pavel Tcherniaev Co-authored-by: fiona-gladwin Co-authored-by: Lakshmi Kumar Co-authored-by: abhimeda <138710508+abhimeda@users.noreply.github.com> Co-authored-by: randyh62 <42045079+randyh62@users.noreply.github.com> Co-authored-by: raramakr <91213141+raramakr@users.noreply.github.com> --- include/rppdefs.h | 9 +- include/rppt_tensor_audio_augmentations.h | 108 +++-- src/modules/hip/handlehip.cpp | 7 +- .../hip/hip_tensor_audio_augmentations.hpp | 30 ++ .../kernel/non_silent_region_detection.hpp | 426 ++++++++++++++++++ .../rppt_tensor_audio_augmentations.cpp | 49 ++ utilities/test_suite/CMakeLists.txt | 9 + utilities/test_suite/HIP/CMakeLists.txt | 27 +- utilities/test_suite/HIP/Tensor_audio_hip.cpp | 246 ++++++++++ utilities/test_suite/HIP/runAudioTests.py | 295 ++++++++++++ .../test_suite/HOST/Tensor_audio_host.cpp | 3 + utilities/test_suite/common.py | 11 +- utilities/test_suite/rpp_test_suite_audio.h | 9 +- 13 files changed, 1174 insertions(+), 55 deletions(-) create mode 100644 src/modules/hip/hip_tensor_audio_augmentations.hpp create mode 100644 src/modules/hip/kernel/non_silent_region_detection.hpp create mode 100644 utilities/test_suite/HIP/Tensor_audio_hip.cpp create mode 100644 utilities/test_suite/HIP/runAudioTests.py diff --git a/include/rppdefs.h b/include/rppdefs.h index df006d348..4a40defa4 100644 --- a/include/rppdefs.h +++ b/include/rppdefs.h @@ -64,6 +64,7 @@ SOFTWARE. const float ONE_OVER_6 = 1.0f / 6; const float ONE_OVER_3 = 1.0f / 3; const float ONE_OVER_255 = 1.0f / 255; +const uint MMS_MAX_SCRATCH_MEMORY = 76800000; // maximum scratch memory size (number of floats) needed for MMS buffer in RNNT training /******************** RPP typedefs ********************/ @@ -136,7 +137,13 @@ typedef enum /*! \brief src and dst layout mismatch \ingroup group_rppdefs */ RPP_ERROR_LAYOUT_MISMATCH = -18, /*! \brief Number of channels is invalid. (Needs to adhere to function specification.) 
\ingroup group_rppdefs */ - RPP_ERROR_INVALID_CHANNELS = -19 + RPP_ERROR_INVALID_CHANNELS = -19, + /*! \brief Invalid output tile length (Needs to adhere to function specification.) \ingroup group_rppdefs */ + RPP_ERROR_INVALID_OUTPUT_TILE_LENGTH = -20, + /*! \brief Shared memory size needed is beyond the bounds (Needs to adhere to function specification.) \ingroup group_rppdefs */ + RPP_ERROR_OUT_OF_BOUND_SHARED_MEMORY_SIZE = -21, + /*! \brief Scratch memory size needed is beyond the bounds (Needs to adhere to function specification.) \ingroup group_rppdefs */ + RPP_ERROR_OUT_OF_BOUND_SCRATCH_MEMORY_SIZE = -22, } RppStatus; /*! \brief RPP rppStatus_t type enums diff --git a/include/rppt_tensor_audio_augmentations.h b/include/rppt_tensor_audio_augmentations.h index 2c964e87c..c8223845a 100644 --- a/include/rppt_tensor_audio_augmentations.h +++ b/include/rppt_tensor_audio_augmentations.h @@ -48,33 +48,55 @@ extern "C" { * \details Non Silent Region Detection augmentation for 1D audio buffer \n Finds the starting index and length of non silent region in the audio buffer by comparing the calculated short-term power with cutoff value passed - * \param[in] srcPtr source tensor in HOST memory - * \param[in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) - * \param[in] srcLengthTensor source audio buffer length (1D tensor in HOST memory, of size batchSize) - * \param[out] detectedIndexTensor beginning index of non silent region (1D tensor in HOST memory, of size batchSize) - * \param[out] detectionLengthTensor length of non silent region (1D tensor in HOST memory, of size batchSize) - * \param[in] cutOffDB cutOff in dB below which the signal is considered silent - * \param[in] windowLength window length used for computing short-term power of the signal - * \param[in] referencePower reference power that is used to convert the signal to dB - * \param[in] resetInterval number of samples after which the moving mean average 
is recalculated to avoid precision loss - * \param[in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() + * \param [in] srcPtr source tensor in HOST memory + * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) + * \param [in] srcLengthTensor source audio buffer length (1D tensor in HOST memory, of size batchSize) + * \param [out] detectedIndexTensor beginning index of non silent region (1D tensor in HOST memory, of size batchSize) + * \param [out] detectionLengthTensor length of non silent region (1D tensor in HOST memory, of size batchSize) + * \param [in] cutOffDB cutOff in dB below which the signal is considered silent + * \param [in] windowLength window length used for computing short-term power of the signal + * \param [in] referencePower reference power that is used to convert the signal to dB + * \param [in] resetInterval number of samples after which the moving mean average is recalculated to avoid precision loss + * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. * \retval RPP_SUCCESS Successful completion. * \retval RPP_ERROR* Unsuccessful completion. */ RppStatus rppt_non_silent_region_detection_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, Rpp32s *srcLengthTensor, Rpp32s *detectedIndexTensor, Rpp32s *detectionLengthTensor, Rpp32f cutOffDB, Rpp32s windowLength, Rpp32f referencePower, Rpp32s resetInterval, rppHandle_t rppHandle); +#ifdef GPU_SUPPORT +/*! 
\brief Non Silent Region Detection augmentation on HIP backend + * \details Non Silent Region Detection augmentation for 1D audio buffer + \n Finds the starting index and length of non silent region in the audio buffer by comparing the + calculated short-term power with cutoff value passed + * \param [in] srcPtr source tensor in HIP memory + * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) + * \param [in] srcLengthTensor source audio buffer length (1D tensor in Pinned/HIP memory, of size batchSize) + * \param [out] detectedIndexTensor beginning index of non silent region (1D tensor in Pinned/HIP memory, of size batchSize) + * \param [out] detectionLengthTensor length of non silent region (1D tensor in Pinned/HIP memory, of size batchSize) + * \param [in] cutOffDB cutOff in dB below which the signal is considered silent + * \param [in] windowLength window length used for computing short-term power of the signal + * \param [in] referencePower reference power that is used to convert the signal to dB + * \param [in] resetInterval number of samples after which the moving mean average is recalculated to avoid precision loss + * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() + * \return A \ref RppStatus enumeration. + * \retval RPP_SUCCESS Successful completion. + * \retval RPP_ERROR* Unsuccessful completion. + */ +RppStatus rppt_non_silent_region_detection_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, Rpp32s *srcLengthTensor, Rpp32s *detectedIndexTensor, Rpp32s *detectionLengthTensor, Rpp32f cutOffDB, Rpp32s windowLength, Rpp32f referencePower, Rpp32s resetInterval, rppHandle_t rppHandle); +#endif // GPU_SUPPORT + /*! 
\brief To Decibels augmentation on HOST backend * \details To Decibels augmentation for 1D audio buffer converts magnitude values to decibel values - * \param[in] srcPtr source tensor in HOST memory - * \param[in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) - * \param[out] dstPtr destination tensor in HOST memory - * \param[in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) - * \param[in] srcDims source tensor sizes for each element in batch (2D tensor in HOST memory, of size batchSize * 2) - * \param[in] cutOffDB minimum or cut-off ratio in dB - * \param[in] multiplier factor by which the logarithm is multiplied - * \param[in] referenceMagnitude Reference magnitude if not provided maximum value of input used as reference - * \param[in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() + * \param [in] srcPtr source tensor in HOST memory + * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) + * \param [out] dstPtr destination tensor in HOST memory + * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) + * \param [in] srcDims source tensor sizes for each element in batch (2D tensor in HOST memory, of size batchSize * 2) + * \param [in] cutOffDB minimum or cut-off ratio in dB + * \param [in] multiplier factor by which the logarithm is multiplied + * \param [in] referenceMagnitude Reference magnitude if not provided maximum value of input used as reference + * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. * \retval RPP_SUCCESS Successful completion. * \retval RPP_ERROR* Unsuccessful completion. @@ -83,14 +105,14 @@ RppStatus rppt_to_decibels_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_ /*! 
\brief Pre Emphasis Filter augmentation on HOST backend * \details Pre Emphasis Filter augmentation for audio data - * \param[in] srcPtr source tensor in HOST memory - * \param[in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) - * \param[out] dstPtr destination tensor in HOST memory - * \param[in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) - * \param[in] srcLengthTensor source audio buffer length (1D tensor in HOST memory, of size batchSize) - * \param[in] coeffTensor preemphasis coefficient (1D tensor in HOST memory, of size batchSize) - * \param[in] borderType border value policy - * \param[in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() + * \param [in] srcPtr source tensor in HOST memory + * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) + * \param [out] dstPtr destination tensor in HOST memory + * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) + * \param [in] srcLengthTensor source audio buffer length (1D tensor in HOST memory, of size batchSize) + * \param [in] coeffTensor preemphasis coefficient (1D tensor in HOST memory, of size batchSize) + * \param [in] borderType border value policy + * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. * \retval RPP_SUCCESS Successful completion. * \retval RPP_ERROR* Unsuccessful completion. @@ -99,13 +121,13 @@ RppStatus rppt_pre_emphasis_filter_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, /*! 
\brief Down Mixing augmentation on HOST backend * \details Down Mixing augmentation for audio data -* \param[in] srcPtr source tensor in HOST memory -* \param[in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) -* \param[out] dstPtr destination tensor in HOST memory -* \param[in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) -* \param[in] srcDimsTensor source audio buffer length and number of channels (1D tensor in HOST memory, of size batchSize * 2) -* \param[in] normalizeWeights bool flag to specify if normalization of weights is needed -* \param[in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() +* \param [in] srcPtr source tensor in HOST memory +* \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) +* \param [out] dstPtr destination tensor in HOST memory +* \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) +* \param [in] srcDimsTensor source audio buffer length and number of channels (1D tensor in HOST memory, of size batchSize * 2) +* \param [in] normalizeWeights bool flag to specify if normalization of weights is needed +* \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. * \retval RPP_SUCCESS Successful completion. * \retval RPP_ERROR* Unsuccessful completion. @@ -155,15 +177,15 @@ RppStatus rppt_mel_filter_bank_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, Rpp /*! 
\brief Resample augmentation on HOST backend * \details Resample augmentation for audio data -* \param[in] srcPtr source tensor in HOST memory -* \param[in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) -* \param[out] dstPtr destination tensor in HOST memory -* \param[in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) -* \param[in] inRate Input sampling rate (1D tensor in HOST memory, of size batchSize) -* \param[in] outRate Output sampling rate (1D tensor in HOST memory, of size batchSize) -* \param[in] srcDimsTensor source audio buffer length and number of channels (1D tensor in HOST memory, of size batchSize * 2) -* \param[in] window Resampling window (struct of type RpptRpptResamplingWindow) -* \param[in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() +* \param [in] srcPtr source tensor in HOST memory +* \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) +* \param [out] dstPtr destination tensor in HOST memory +* \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32) +* \param [in] inRate Input sampling rate (1D tensor in HOST memory, of size batchSize) +* \param [in] outRate Output sampling rate (1D tensor in HOST memory, of size batchSize) +* \param [in] srcDimsTensor source audio buffer length and number of channels (1D tensor in HOST memory, of size batchSize * 2) +* \param [in] window Resampling window (struct of type RpptResamplingWindow) +* \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. * \retval RPP_SUCCESS Successful completion. * \retval RPP_ERROR* Unsuccessful completion. 
diff --git a/src/modules/hip/handlehip.cpp b/src/modules/hip/handlehip.cpp index 42e72db98..08eb93674 100644 --- a/src/modules/hip/handlehip.cpp +++ b/src/modules/hip/handlehip.cpp @@ -239,7 +239,12 @@ struct HandleImpl } hipMalloc(&(this->initHandle->mem.mgpu.rgbArr.rgbmem), sizeof(RpptRGB) * this->nBatchSize); - hipMalloc(&(this->initHandle->mem.mgpu.scratchBufferHip.floatmem), sizeof(Rpp32f) * 8294400); // 3840 x 2160 + + /* (600000 + 293 + 128) * 128 - Maximum scratch memory required for Non Silent Region Detection HIP kernel used in RNNT training (uses a batchsize 128) + - 600000 is the maximum size that will be required for MMS buffer based on Librispeech dataset + - 293 is the size required for storing reduction outputs for 600000 size sample + - 128 is the size required for storing cutOffDB values for batch size 128 */ + hipMalloc(&(this->initHandle->mem.mgpu.scratchBufferHip.floatmem), sizeof(Rpp32f) * 76853888); } }; diff --git a/src/modules/hip/hip_tensor_audio_augmentations.hpp b/src/modules/hip/hip_tensor_audio_augmentations.hpp new file mode 100644 index 000000000..4ff4a2740 --- /dev/null +++ b/src/modules/hip/hip_tensor_audio_augmentations.hpp @@ -0,0 +1,30 @@ +/* +MIT License + +Copyright (c) 2019 - 2024 Advanced Micro Devices, Inc. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +*/ + +#ifndef HIP_TENSOR_AUDIO_AUGMENTATIONS_HPP +#define HIP_TENSOR_AUDIO_AUGMENTATIONS_HPP + +#include "kernel/non_silent_region_detection.hpp" + +#endif // HIP_TENSOR_AUDIO_AUGMENTATIONS_HPP diff --git a/src/modules/hip/kernel/non_silent_region_detection.hpp b/src/modules/hip/kernel/non_silent_region_detection.hpp new file mode 100644 index 000000000..80511464b --- /dev/null +++ b/src/modules/hip/kernel/non_silent_region_detection.hpp @@ -0,0 +1,426 @@ +#include +#include "rpp_hip_common.hpp" + +// -------------------- Set 0 - moving mean square kernel device helpers -------------------- + +// calculate the position in shared memory to avoid bank conflicts +__host__ __device__ __forceinline__ int compute_pos_in_smem(int pos) +{ + return pos + (pos >> 5); // pad one slot per 32 elements, assuming 32 shared memory banks +} + +/* compute prefix sum on the input buffer passed + the result is an exclusive prefix sum: each element becomes the sum of all previous elements in the input array, excluding the current element */ +__device__ __forceinline__ void compute_prefix_sum(float *input, uint bufferLength) +{ + int offset = 1; + int2 offset_i2 = static_cast<int2>(offset); + int2 offsetAB_i2 = make_int2(offset - 1, 2 * offset - 1); + int threadIdxMul2 = 2 * hipThreadIdx_x; + int blockDimMul2 = 2 * hipBlockDim_x; + + /* compute intermediate prefix sums in an up sweep manner + (each level in the hierarchy doubles the distance between the pairs of elements being added) */ + for (int d = bufferLength >> 1; d > 0; d >>= 1) + { + // 
syncthreads before proceeding to next iteration + __syncthreads(); + int dMul2 = 2 * d; + for (int idxMul2 = threadIdxMul2; idxMul2 < dMul2; idxMul2 += blockDimMul2) + { + int2 pos_i2 = (offset_i2 * static_cast<int2>(idxMul2)) + offsetAB_i2; + input[compute_pos_in_smem(pos_i2.y)] += input[compute_pos_in_smem(pos_i2.x)]; + } + offset <<= 1; + offset_i2 = static_cast<int2>(offset); + offsetAB_i2 = make_int2(offset - 1, 2 * offset - 1); + } + + if (hipThreadIdx_x == 0) + { + int last = bufferLength - 1; + input[compute_pos_in_smem(last)] = 0; + } + + /* compute final prefix sums in a down sweep manner + (each level in the hierarchy halves the distance between the pairs of elements being added) */ + for (int d = 1; d < bufferLength; d <<= 1) + { + offset >>= 1; + offset_i2 = static_cast<int2>(offset); + offsetAB_i2 = make_int2(offset - 1, 2 * offset - 1); + __syncthreads(); + // syncthreads before proceeding to next iteration + + int dMul2 = 2 * d; + for (int idxMul2 = threadIdxMul2; idxMul2 < dMul2; idxMul2 += blockDimMul2) + { + int2 pos_i2 = offset_i2 * static_cast<int2>(idxMul2) + offsetAB_i2; + int posA = compute_pos_in_smem(pos_i2.x); + int posB = compute_pos_in_smem(pos_i2.y); + float t = input[posA]; + input[posA] = input[posB]; + input[posB] += t; + } + } + __syncthreads(); +} + +// -------------------- Set 1 - moving mean square compute kernel -------------------- + +__global__ void moving_mean_square_hip_tensor(float *srcPtr, + uint nStride, + float *mmsArr, + int *srcLengthTensor, + int outputTileLength, + int windowLength, + float windowFactor, + int inputTileLength) +{ + int id_x = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x; + int id_z = hipBlockIdx_z * hipBlockDim_z + hipThreadIdx_z; + uint srcLength = srcLengthTensor[id_z]; + uint batchStride = id_z * nStride; + int blockStart = hipBlockIdx_x * outputTileLength; + + if (blockStart >= srcLength) + return; + + float *input = srcPtr + batchStride; + extern __shared__ float squaredPrefixSum_smem[]; + + float *inBlockPtr = 
srcPtr + batchStride + blockStart; + float *outBlockPtr = mmsArr + batchStride + blockStart; + + // find the valid output tile length values needed for given block + int validOutputTileLength = std::min<int>(outputTileLength, srcLength - blockStart); + + // assign pointers that point to block begin and block end locations + float *extendedBlockStart = inBlockPtr - windowLength; + float *extendedBlockEnd = inBlockPtr + validOutputTileLength; + + // load input data to shared memory + for(int pos = hipThreadIdx_x; pos < inputTileLength; pos += hipBlockDim_x) + { + float val = 0.0f; + auto extendedBlockPtr = extendedBlockStart + pos; + + /* check if extendedBlockPtr is within the valid region of input + and load the value from extendedBlockPtr if it is within valid region */ + if (extendedBlockPtr >= input && extendedBlockPtr < extendedBlockEnd) + val = *extendedBlockPtr; + squaredPrefixSum_smem[compute_pos_in_smem(pos)] = val * val; + } + + // compute prefix sum + compute_prefix_sum(squaredPrefixSum_smem, inputTileLength); + + // compute the mms value here + for(int pos = hipThreadIdx_x; pos < validOutputTileLength; pos += hipBlockDim_x) + outBlockPtr[pos] = windowFactor * ((inBlockPtr[pos] * inBlockPtr[pos]) + squaredPrefixSum_smem[compute_pos_in_smem(windowLength + pos)] - squaredPrefixSum_smem[compute_pos_in_smem(pos + 1)]); +} + +// -------------------- Set 2 - kernels for finding cutoffmag value -------------------- + +__global__ void max_reduction_hip_tensor(float *srcPtr, + uint nStride, + float *maxArr, + int *srcLengthTensor) +{ + int id_x = (hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x) * 8; + int id_z = hipBlockIdx_z * hipBlockDim_z + hipThreadIdx_z; + uint srcLength = srcLengthTensor[id_z]; + + uint srcIdx = id_z * nStride; + __shared__ float max_smem[256]; // 256 values of src in a 256 x 1 thread block + max_smem[hipThreadIdx_x] = srcPtr[srcIdx]; // initialization of LDS to start value using all 256 threads + + if (id_x >= srcLength) + return; + + if (id_x 
+ 8 > srcLength) + id_x -= (id_x + 8 - srcLength); + + srcIdx += id_x; + d_float8 src_f8; + rpp_hip_load8_and_unpack_to_float8(srcPtr + srcIdx, &src_f8); // load 8 pixels to local memory + rpp_hip_math_max8(&src_f8, &max_smem[hipThreadIdx_x]); + __syncthreads(); // syncthreads after max compute + + // Reduction of 256 floats on 256 threads per block in x dimension + for (int threadMax = 128; threadMax >= 1; threadMax /= 2) + { + if (hipThreadIdx_x < threadMax) + max_smem[hipThreadIdx_x] = fmaxf(max_smem[hipThreadIdx_x], max_smem[hipThreadIdx_x + threadMax]); + __syncthreads(); + } + + // Final store to dst + if (hipThreadIdx_x == 0) + { + int dstIdx = id_z * hipGridDim_x + hipBlockIdx_x; + maxArr[dstIdx] = max_smem[0]; + } +} + +__global__ void cutoffmag_hip_tensor(float *srcPtr, + int maxLength, + float *cutOffMagPtr, + float cutOff, + float referencePower, + bool referenceMax) +{ + int id_x = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x; + int id_z = hipBlockIdx_z * hipBlockDim_z + hipThreadIdx_z; + + // if referenceMax is set to true, perform final max reduction on srcPtr and compute cutOffMag + if(referenceMax) + { + uint srcIdx = id_z * maxLength; + __shared__ float max_smem[256]; // 256 values of src in a 256 x 1 thread block + max_smem[hipThreadIdx_x] = srcPtr[srcIdx]; // initialization of LDS to start value using all 256 threads + + if (id_x >= maxLength) + return; + + srcIdx += id_x; + float maxVal = srcPtr[srcIdx]; + while (id_x < maxLength) + { + maxVal = fmaxf(maxVal, srcPtr[srcIdx]); + id_x += hipBlockDim_x; + srcIdx += hipBlockDim_x; + } + max_smem[hipThreadIdx_x] = maxVal; + __syncthreads(); // syncthreads after max compute + + // Reduction of 256 floats on 256 threads per block in x dimension + for (int threadMax = 128; threadMax >= 1; threadMax /= 2) + { + if (hipThreadIdx_x < threadMax) + max_smem[hipThreadIdx_x] = max(max_smem[hipThreadIdx_x], max_smem[hipThreadIdx_x + threadMax]); + __syncthreads(); + } + + // Final store to dst + if 
(hipThreadIdx_x == 0) + cutOffMagPtr[id_z] = max_smem[0] * cutOff; + } + else + { + if (hipThreadIdx_x == 0) + cutOffMagPtr[id_z] = referencePower * cutOff; + } +} + +// -------------------- Set 3 - kernels for finding begin and length of NSR in inputs -------------------- + +__global__ void find_region_hip_tensor(float *srcPtr, + uint nStride, + int *beginTensor, + int *lengthTensor, + float *cutOffMagPtr, + int *srcLengthTensor, + float windowLength) +{ + int id_x = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x; + int id_z = hipBlockIdx_z * hipBlockDim_z + hipThreadIdx_z; + uint srcLength = srcLengthTensor[id_z]; + float cutOffMag = cutOffMagPtr[id_z]; + + __shared__ int beginResult; + __shared__ int endResult; + beginResult = srcLength; + endResult = 0; + __syncthreads(); + + int beginIdx = srcLength; + int endIdx = 0; + uint stridePerSample = id_z * nStride; + + // Find the begin index in src whose value is >= cutOffMag + for (int i = id_x; i < srcLength; i += hipBlockDim_x) + { + uint srcIdx = stridePerSample + i; + if (srcPtr[srcIdx] >= cutOffMag) + { + beginIdx = i; + atomicMin(&beginResult, beginIdx); + if(beginResult != srcLength) + break; + } + } + + // Find the end index in src whose value is >= cutOffMag + for (int i = id_x; i < srcLength; i += hipBlockDim_x) + { + uint srcIdx = stridePerSample + srcLength - 1 - i; + if (srcPtr[srcIdx] >= cutOffMag) + { + endIdx = srcLength - 1 - i; + atomicMax(&endResult, endIdx); + if(endResult != 0) + break; + } + } + + // Final store to dst + if(hipThreadIdx_x == 0) + { + if(beginResult == srcLength || endResult == 0) + { + beginTensor[id_z] = 0; + lengthTensor[id_z] = 0; + } + else + { + int detectBegin = beginResult; + int detectEnd = endResult - beginResult + 1; + + // if both starting index and length of nonsilent region is not 0 + // adjust the values as per the windowLength + if(detectBegin != 0 && detectEnd != 0) + { + int newBegin = max(detectBegin - (windowLength - 1), 0); + detectEnd += detectBegin - 
newBegin; + detectBegin = newBegin; + } + beginTensor[id_z] = detectBegin; + lengthTensor[id_z] = detectEnd; + } + } +} + +// -------------------- Set 4 - host helpers for kernel executor -------------------- + +// return the nearest previous power of 2 for the given number +inline Rpp32s prev_pow2(Rpp32s n) +{ + Rpp32s pow2 = 1; + while (n - pow2 > pow2) + pow2 += pow2; + + return pow2; +} + +// return the nearest next power of 2 for the given number +inline Rpp32s next_pow2(Rpp32s n) +{ + Rpp32s pow2 = 1; + while (n > pow2) + pow2 += pow2; + + return pow2; +} + +// -------------------- Set 5 - non silent region kernels executor -------------------- + +RppStatus hip_exec_non_silent_region_detection_tensor(Rpp32f *srcPtr, + RpptDescPtr srcDescPtr, + Rpp32s *srcLengthTensor, + Rpp32s *detectedIndexTensor, + Rpp32s *detectionLengthTensor, + Rpp32f cutOffDB, + Rpp32s windowLength, + Rpp32f referencePower, + Rpp32s resetInterval, + rpp::Handle& handle) +{ + // check if scratch memory size required for moving mean square is within the limits + if ((srcDescPtr->n * srcDescPtr->strides.nStride) > MMS_MAX_SCRATCH_MEMORY) + return RPP_ERROR_OUT_OF_BOUND_SCRATCH_MEMORY_SIZE; + + Rpp32f *mmsArr = handle.GetInitHandle()->mem.mgpu.scratchBufferHip.floatmem; + Rpp32s maxSharedMemoryInBytes = handle.GetLocalMemorySize(); + Rpp32s maxSharedMemoryElements = maxSharedMemoryInBytes / sizeof(Rpp32f); + Rpp32s kSharedMemBanks = 32; + Rpp32s inputTileLength = prev_pow2(maxSharedMemoryElements * kSharedMemBanks / (kSharedMemBanks + 1)); + + if (resetInterval > 0 && resetInterval < inputTileLength) + { + Rpp32s p = prev_pow2(resetInterval); + Rpp32s n = next_pow2(resetInterval); + if (p > windowLength) + inputTileLength = p; + else if (n < inputTileLength) + inputTileLength = n; + } + + Rpp32s sharedMemorySizeInBytes = compute_pos_in_smem(inputTileLength) * sizeof(Rpp32f); + Rpp32s outputTileLength = inputTileLength - windowLength; + Rpp32f windowFactor = 1.0f / windowLength; + + if 
(outputTileLength <= 0) + return RPP_ERROR_INVALID_OUTPUT_TILE_LENGTH; + + if (sharedMemorySizeInBytes > maxSharedMemoryInBytes) + return RPP_ERROR_OUT_OF_BOUND_SHARED_MEMORY_SIZE; + + // launch kernel to compute the values needed for MMS Array + Rpp32s globalThreads_x = ceil(static_cast<Rpp32f>(srcDescPtr->strides.nStride) / outputTileLength); + Rpp32s globalThreads_y = 1; + Rpp32s globalThreads_z = srcDescPtr->n; + + hipLaunchKernelGGL(moving_mean_square_hip_tensor, + dim3(globalThreads_x, globalThreads_y, globalThreads_z), + dim3(LOCAL_THREADS_X_1DIM, LOCAL_THREADS_Y_1DIM, LOCAL_THREADS_Z_1DIM), + sharedMemorySizeInBytes, + handle.GetStream(), + srcPtr, + srcDescPtr->strides.nStride, + mmsArr, + srcLengthTensor, + outputTileLength, + windowLength, + windowFactor, + inputTileLength); + + const Rpp32f cutOff = std::pow(10.0f, cutOffDB * 0.1f); + bool referenceMax = (!referencePower); + Rpp32f *partialMaxArr = mmsArr + srcDescPtr->n * srcDescPtr->strides.nStride; + + Rpp32s numBlocksPerSample = ceil(static_cast<Rpp32f>(srcDescPtr->strides.nStride) / (LOCAL_THREADS_X_1DIM * 8)); + Rpp32s cutOffMagKernelBlockSize = 1; + if (referenceMax) + { + // compute max value in MMS buffer + hipLaunchKernelGGL(max_reduction_hip_tensor, + dim3(numBlocksPerSample, 1, globalThreads_z), + dim3(LOCAL_THREADS_X_1DIM, LOCAL_THREADS_Y_1DIM, LOCAL_THREADS_Z_1DIM), + 0, + handle.GetStream(), + mmsArr, + srcDescPtr->strides.nStride, + partialMaxArr, + srcLengthTensor); + cutOffMagKernelBlockSize = 256; + } + // find the cutoff value in magnitude + Rpp32f *cutOffMagPtr = partialMaxArr + globalThreads_z * numBlocksPerSample; + hipLaunchKernelGGL(cutoffmag_hip_tensor, + dim3(1, 1, globalThreads_z), + dim3(cutOffMagKernelBlockSize, LOCAL_THREADS_Y_1DIM, LOCAL_THREADS_Z_1DIM), + 0, + handle.GetStream(), + partialMaxArr, + numBlocksPerSample, + cutOffMagPtr, + cutOff, + referencePower, + referenceMax); + + // find the begin and length values of NSR in inputs + hipLaunchKernelGGL(find_region_hip_tensor, + 
dim3(1, 1, globalThreads_z), + dim3(1024, LOCAL_THREADS_Y_1DIM, LOCAL_THREADS_Z_1DIM), + 0, + handle.GetStream(), + mmsArr, + srcDescPtr->strides.nStride, + detectedIndexTensor, + detectionLengthTensor, + cutOffMagPtr, + srcLengthTensor, + windowLength); + return RPP_SUCCESS; +} diff --git a/src/modules/rppt_tensor_audio_augmentations.cpp b/src/modules/rppt_tensor_audio_augmentations.cpp index abb3308bb..91893a2f2 100644 --- a/src/modules/rppt_tensor_audio_augmentations.cpp +++ b/src/modules/rppt_tensor_audio_augmentations.cpp @@ -29,6 +29,10 @@ SOFTWARE. #include "rppt_tensor_audio_augmentations.h" #include "cpu/host_tensor_audio_augmentations.hpp" +#ifdef HIP_COMPILE + #include "hip/hip_tensor_audio_augmentations.hpp" +#endif // HIP_COMPILE + /******************** non_silent_region_detection ********************/ RppStatus rppt_non_silent_region_detection_host(RppPtr_t srcPtr, @@ -271,4 +275,49 @@ RppStatus rppt_resample_host(RppPtr_t srcPtr, } } +/********************************************************************************************************************/ +/*********************************************** RPP_GPU_SUPPORT = ON ***********************************************/ +/********************************************************************************************************************/ + +#ifdef GPU_SUPPORT + +/******************** non_silent_region_detection ********************/ + +RppStatus rppt_non_silent_region_detection_gpu(RppPtr_t srcPtr, + RpptDescPtr srcDescPtr, + Rpp32s *srcLengthTensor, + Rpp32s *detectedIndexTensor, + Rpp32s *detectionLengthTensor, + Rpp32f cutOffDB, + Rpp32s windowLength, + Rpp32f referencePower, + Rpp32s resetInterval, + rppHandle_t rppHandle) +{ +#ifdef HIP_COMPILE + if (srcDescPtr->dataType == RpptDataType::F32) + { + + return hip_exec_non_silent_region_detection_tensor(static_cast(srcPtr), + srcDescPtr, + srcLengthTensor, + detectedIndexTensor, + detectionLengthTensor, + cutOffDB, + windowLength, + referencePower, 
+ resetInterval, + rpp::deref(rppHandle)); + } + else + { + return RPP_ERROR_NOT_IMPLEMENTED; + } + +#elif defined(OCL_COMPILE) + return RPP_ERROR_NOT_IMPLEMENTED; +#endif // backend +} + +#endif // GPU_SUPPORT #endif // AUDIO_SUPPORT diff --git a/utilities/test_suite/CMakeLists.txt b/utilities/test_suite/CMakeLists.txt index bb5987779..23515798b 100644 --- a/utilities/test_suite/CMakeLists.txt +++ b/utilities/test_suite/CMakeLists.txt @@ -120,6 +120,15 @@ if(Python3_FOUND) WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} ) endif(NIFTI_FOUND) + if(RPP_AUDIO_AUGMENTATIONS_SUPPORT_FOUND) + if(libsnd_LIBS) + add_test( + NAME rpp_qa_tests_tensor_audio_hip_all + COMMAND ${Python3_EXECUTABLE} ${ROCM_PATH}/share/rpp/test/HIP/runAudioTests.py --qa_mode 1 --batch_size 3 + WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} + ) + endif(libsnd_LIBS) + endif(RPP_AUDIO_AUGMENTATIONS_SUPPORT_FOUND) elseif( "${BACKEND}" STREQUAL "OCL") # TBD: Add OCL Tests diff --git a/utilities/test_suite/HIP/CMakeLists.txt b/utilities/test_suite/HIP/CMakeLists.txt index 864475f76..814b006fb 100644 --- a/utilities/test_suite/HIP/CMakeLists.txt +++ b/utilities/test_suite/HIP/CMakeLists.txt @@ -58,6 +58,10 @@ find_package(hip QUIET) find_package(OpenCV QUIET) find_package(TurboJpeg QUIET) find_package(NIFTI QUIET) +find_library(libsnd_LIBS + NAMES sndfile libsndfile + PATHS ${CMAKE_SYSTEM_PREFIX_PATH} ${LIBSND_ROOT_DIR} "/usr/local" + PATH_SUFFIXES lib lib64) # OpenMP find_package(OpenMP REQUIRED) @@ -111,4 +115,25 @@ if(NIFTI_FOUND AND OpenCV_FOUND) target_link_libraries(Tensor_voxel_hip ${OpenCV_LIBS} -lturbojpeg -lrpp ${hip_LIBRARIES} pthread ${LINK_LIBRARY_LIST} hip::device ${NIFTI_PACKAGE_PREFIX}NIFTI::${NIFTI_PACKAGE_PREFIX}niftiio) else() message("-- ${Yellow}Warning: libniftiio must be installed to install ${PROJECT_NAME}/Tensor_voxel_hip successfully!${ColourReset}") -endif() \ No newline at end of file +endif() + +if(RPP_AUDIO_SUPPORT) + if(NOT libsnd_LIBS) + message("-- ${Yellow}Warning: 
libsndfile must be installed to install ${PROJECT_NAME}/Tensor_audio_hip successfully!${ColourReset}") + else() + message("-- ${Green}${PROJECT_NAME} set to build with rpp and libsndfile ${ColourReset}") + set(COMPILER_FOR_HIP ${ROCM_PATH}/bin/hipcc) + set(CMAKE_CXX_COMPILER ${COMPILER_FOR_HIP}) + include_directories(${ROCM_PATH}/include ${ROCM_PATH}/include/rpp /usr/local/include) + link_directories(${ROCM_PATH}/lib /usr/local/lib) + include_directories(${SndFile_INCLUDE_DIRS}) + link_directories(${SndFile_LIBRARIES_DIR} /usr/local/lib/) + + add_executable(Tensor_audio_hip Tensor_audio_hip.cpp) + set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=gnu++17") + if(NOT APPLE) + set(LINK_LIBRARY_LIST ${LINK_LIBRARY_LIST} stdc++fs) + endif() + target_link_libraries(Tensor_audio_hip ${libsnd_LIBS} -lsndfile -lrpp pthread ${LINK_LIBRARY_LIST}) + endif() +endif() diff --git a/utilities/test_suite/HIP/Tensor_audio_hip.cpp b/utilities/test_suite/HIP/Tensor_audio_hip.cpp new file mode 100644 index 000000000..28efcc67d --- /dev/null +++ b/utilities/test_suite/HIP/Tensor_audio_hip.cpp @@ -0,0 +1,246 @@ +/* +MIT License + +Copyright (c) 2019 - 2024 Advanced Micro Devices, Inc. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +*/ + +#include "../rpp_test_suite_audio.h" + +int main(int argc, char **argv) +{ + // handle inputs + const int MIN_ARG_COUNT = 8; // program name + 7 arguments, since argv[7] is read below + if (argc < MIN_ARG_COUNT) + { + printf("\nImproper Usage! Needs all arguments!\n"); + printf("\nUsage: ./Tensor_audio_hip \n"); + return -1; + } + + char *src = argv[1]; + int testCase = atoi(argv[2]); + int testType = atoi(argv[3]); + int numRuns = atoi(argv[4]); + int batchSize = atoi(argv[5]); + char *dst = argv[6]; + string scriptPath = argv[7]; + + // validation checks + if (testType == 0 && batchSize != 3) + { + cout << "Error! QA Mode only runs with batchsize 3" << endl; + return -1; + } + + // set case names + string funcName = audioAugmentationMap[testCase]; + if (funcName.empty()) + { + if (testType == 0) + printf("\ncase %d is not supported\n", testCase); + + return -1; + } + + // initialize tensor descriptors + RpptDesc srcDesc, dstDesc; + RpptDescPtr srcDescPtr, dstDescPtr; + srcDescPtr = &srcDesc; + dstDescPtr = &dstDesc; + + // set src/dst data types in tensor descriptors + srcDescPtr->dataType = RpptDataType::F32; + dstDescPtr->dataType = RpptDataType::F32; + + // other initializations + int missingFuncFlag = 0; + int maxSrcChannels = 0; + int maxSrcWidth = 0, maxSrcHeight = 0; + int maxDstWidth = 0, maxDstHeight = 0; + Rpp64u iBufferSize = 0; + Rpp64u oBufferSize = 0; + static int noOfAudioFiles = 0; + + // string ops on function name + char src1[1000]; + strcpy(src1, src); + strcat(src1, "/"); + string func = funcName; + + // get number of audio files + vector<string> audioNames, audioFilesPath; + search_files_recursive(src, audioNames, audioFilesPath, ".wav"); + noOfAudioFiles = audioNames.size(); + if (noOfAudioFiles < batchSize || ((noOfAudioFiles % batchSize) != 
0)) + { + replicate_last_file_to_fill_batch(audioFilesPath[noOfAudioFiles - 1], audioFilesPath, audioNames, audioNames[noOfAudioFiles - 1], noOfAudioFiles, batchSize); + noOfAudioFiles = audioNames.size(); + } + + // find max audio dimensions in the input dataset + maxSrcHeight = 1; + maxDstHeight = 1; + set_audio_max_dimensions(audioFilesPath, maxSrcWidth, maxSrcChannels); + maxDstWidth = maxSrcWidth; + + // set numDims, offset, n/c/h/w values for src/dst + Rpp32u offsetInBytes = 0; + set_audio_descriptor_dims_and_strides(srcDescPtr, batchSize, maxSrcHeight, maxSrcWidth, maxSrcChannels, offsetInBytes); + int maxDstChannels = maxSrcChannels; + if(testCase == 3) + maxDstChannels = 1; + set_audio_descriptor_dims_and_strides(dstDescPtr, batchSize, maxDstHeight, maxDstWidth, maxDstChannels, offsetInBytes); + + // set buffer sizes for src/dst + iBufferSize = (Rpp64u)srcDescPtr->h * (Rpp64u)srcDescPtr->w * (Rpp64u)srcDescPtr->c * (Rpp64u)srcDescPtr->n; + oBufferSize = (Rpp64u)dstDescPtr->h * (Rpp64u)dstDescPtr->w * (Rpp64u)dstDescPtr->c * (Rpp64u)dstDescPtr->n; + + // allocate hip buffers for input & output + Rpp32f *inputf32 = (Rpp32f *)calloc(iBufferSize, sizeof(Rpp32f)); + Rpp32f *outputf32 = (Rpp32f *)calloc(oBufferSize, sizeof(Rpp32f)); + + void *d_inputf32, *d_outputf32; + CHECK_RETURN_STATUS(hipMalloc(&d_inputf32, iBufferSize * sizeof(Rpp32f))); + CHECK_RETURN_STATUS(hipMalloc(&d_outputf32, oBufferSize * sizeof(Rpp32f))); + + // allocate the buffers for audio length and channels + Rpp32s *srcLengthTensor, *channelsTensor; + CHECK_RETURN_STATUS(hipHostMalloc(&srcLengthTensor, batchSize * sizeof(Rpp32s))); + CHECK_RETURN_STATUS(hipHostMalloc(&channelsTensor, batchSize * sizeof(Rpp32s))); + + // allocate the buffers for src/dst dimensions for each element in batch + RpptImagePatch *srcDims = (RpptImagePatch *) calloc(batchSize, sizeof(RpptImagePatch)); + RpptImagePatch *dstDims = (RpptImagePatch *) calloc(batchSize, sizeof(RpptImagePatch)); + + Rpp32s *detectedIndex 
= nullptr, *detectionLength = nullptr; + if(testCase == 0) + { + CHECK_RETURN_STATUS(hipHostMalloc(&detectedIndex, batchSize * sizeof(Rpp32s))); + CHECK_RETURN_STATUS(hipHostMalloc(&detectionLength, batchSize * sizeof(Rpp32s))); + } + + // run case-wise RPP API and measure time + rppHandle_t handle; + hipStream_t stream; + CHECK_RETURN_STATUS(hipStreamCreate(&stream)); + rppCreateWithStreamAndBatchSize(&handle, stream, batchSize); + + int noOfIterations = (int)audioNames.size() / batchSize; + double maxWallTime = 0, minWallTime = 500, avgWallTime = 0; + string testCaseName; + printf("\nRunning %s %d times (each time with a batch size of %d audio files) and computing mean statistics...", func.c_str(), numRuns, batchSize); + for (int iterCount = 0; iterCount < noOfIterations; iterCount++) + { + // read and decode audio and fill the audio dim values + read_audio_batch_and_fill_dims(srcDescPtr, inputf32, audioFilesPath, iterCount, srcLengthTensor, channelsTensor); + CHECK_RETURN_STATUS(hipMemcpy(d_inputf32, inputf32, iBufferSize * sizeof(Rpp32f), hipMemcpyHostToDevice)); + for (int perfRunCount = 0; perfRunCount < numRuns; perfRunCount++) + { + double startWallTime, endWallTime; + double wallTime; + switch (testCase) + { + case 0: + { + testCaseName = "non_silent_region_detection"; + Rpp32f cutOffDB = -60.0; + Rpp32s windowLength = 2048; + Rpp32f referencePower = 0.0f; + Rpp32s resetInterval = 8192; + + startWallTime = omp_get_wtime(); + rppt_non_silent_region_detection_gpu(d_inputf32, srcDescPtr, srcLengthTensor, detectedIndex, detectionLength, cutOffDB, windowLength, referencePower, resetInterval, handle); + + break; + } + default: + { + missingFuncFlag = 1; + break; + } + } + CHECK_RETURN_STATUS(hipDeviceSynchronize()); + + endWallTime = omp_get_wtime(); + if (missingFuncFlag == 1) + { + printf("\nThe functionality %s doesn't yet exist in RPP\n", func.c_str()); + return -1; + } + + wallTime = endWallTime - startWallTime; + maxWallTime = std::max(maxWallTime, wallTime); + 
minWallTime = std::min(minWallTime, wallTime); + avgWallTime += wallTime; + } + + // QA mode - verify outputs with golden outputs. Below code doesn’t run for performance tests + if (testType == 0) + { + // For testCase 0, the verify_non_silent_region_detection function is used for QA testing + if (testCase == 0) + verify_non_silent_region_detection(detectedIndex, detectionLength, testCaseName, batchSize, audioNames, dst); + + /* Dump the outputs to csv files for debugging + Runs only if + 1. DEBUG_MODE is enabled + 2. Current iteration is 1st iteration + 3. Test case is not 0 */ + if (DEBUG_MODE && iterCount == 0 && testCase != 0) + { + std::ofstream refFile; + refFile.open(func + ".csv"); + for (int i = 0; i < oBufferSize; i++) + refFile << *(outputf32 + i) << "\n"; + refFile.close(); + } + } + } + rppDestroyGPU(handle); + + // performance test mode + if (testType == 1) + { + // display measured times + maxWallTime *= 1000; + minWallTime *= 1000; + avgWallTime *= 1000; + avgWallTime /= (numRuns * noOfIterations); + cout << fixed << "\nmax,min,avg wall times in ms/batch = " << maxWallTime << "," << minWallTime << "," << avgWallTime; + } + + cout << endl; + + // free memory + free(srcDims); + free(dstDims); + free(inputf32); + free(outputf32); + CHECK_RETURN_STATUS(hipFree(d_inputf32)); + CHECK_RETURN_STATUS(hipFree(d_outputf32)); + CHECK_RETURN_STATUS(hipHostFree(srcLengthTensor)); + CHECK_RETURN_STATUS(hipHostFree(channelsTensor)); + if (detectedIndex != nullptr) + CHECK_RETURN_STATUS(hipHostFree(detectedIndex)); + if (detectionLength != nullptr) + CHECK_RETURN_STATUS(hipHostFree(detectionLength)); + return 0; +} diff --git a/utilities/test_suite/HIP/runAudioTests.py b/utilities/test_suite/HIP/runAudioTests.py new file mode 100644 index 000000000..02e746c96 --- /dev/null +++ b/utilities/test_suite/HIP/runAudioTests.py @@ -0,0 +1,295 @@ +""" +MIT License + +Copyright (c) 2019 - 2024 Advanced Micro Devices, Inc. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +""" + +import os +import sys +sys.dont_write_bytecode = True +sys.path.append(os.path.join(os.path.dirname( __file__ ), '..' 
)) +from common import * + +# Set the timestamp +timestamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S") + +scriptPath = os.path.dirname(os.path.realpath(__file__)) +inFilePath = scriptPath + "/../TEST_AUDIO_FILES/three_samples_single_channel_src1" +outFolderPath = os.getcwd() +buildFolderPath = os.getcwd() +caseMin = 0 +caseMax = 0 + + +# Get a list of log files based on a flag for preserving output +def get_log_file_list(): + return [ + outFolderPath + "/OUTPUT_PERFORMANCE_AUDIO_LOGS_HIP_" + timestamp + "/Tensor_audio_hip_raw_performance_log.txt", + ] + +def case_file_check(CASE_FILE_PATH, new_file): + try: + case_file = open(CASE_FILE_PATH,'r') + for line in case_file: + print(line) + if not(line.startswith('"Name"')): + new_file.write(line) + case_file.close() + return True + except IOError: + print("Unable to open case results") + return False + +# Generate performance reports based on counters and a list of types +def generate_performance_reports(RESULTS_DIR): + import pandas as pd + pd.options.display.max_rows = None + # Generate performance report + df = pd.read_csv(RESULTS_DIR + "/consolidated_results.stats.csv") + df["AverageMs"] = df["AverageNs"] / 1000000 + dfPrint = df.drop(['Percentage'], axis = 1) + dfPrint["HIP Kernel Name"] = dfPrint.iloc[:,0].str.lstrip("Hip_") + dfPrint_noIndices = dfPrint.astype(str) + dfPrint_noIndices.replace(['0', '0.0'], '', inplace = True) + dfPrint_noIndices = dfPrint_noIndices.to_string(index = False) + print(dfPrint_noIndices) + +def run_unit_test_cmd(srcPath, case, numRuns, testType, batchSize, outFilePath): + print("./Tensor_audio_hip " + srcPath + " " + str(case) + " " + str(numRuns) + " " + str(testType) + " " + str(numRuns) + " " + str(batchSize)) + result = subprocess.Popen([buildFolderPath + "/build/Tensor_audio_hip", srcPath, str(case), str(testType), str(numRuns), str(batchSize), outFilePath, scriptPath], stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + 
print(stdout_data.decode()) + +def run_performance_test_cmd(loggingFolder, srcPath, case, numRuns, testType, batchSize, outFilePath): + with open(loggingFolder + "/Tensor_audio_hip_raw_performance_log.txt", "a") as logFile: + print("./Tensor_audio_hip " + srcPath + " " + str(case) + " " + str(numRuns) + " " + str(testType) + " " + str(numRuns) + " " + str(batchSize)) + process = subprocess.Popen([buildFolderPath + "/build/Tensor_audio_hip", srcPath, str(case), str(testType), str(numRuns), str(batchSize), outFilePath, scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec + read_from_subprocess_and_write_to_log(process, logFile) + print("------------------------------------------------------------------------------------------") + +def run_performance_test_with_profiler_cmd(loggingFolder, srcPath, case, numRuns, testType, batchSize, outFilePath): + if not os.path.isdir(outFilePath + "/case_" + case): + os.mkdir(outFilePath + "/case_" + case) + with open(loggingFolder + "/Tensor_audio_hip_raw_performance_log.txt", "a") as logFile: + print("\nrocprof --basenames on --timestamp on --stats -o " + outFilePath + "/case_" + str(case) + "/output_case" + str(case) + ".csv ./Tensor_audio_hip " + srcPath + " " + str(case) + " " + str(numRuns) + " " + str(testType) + " " + str(numRuns) + " " + str(batchSize)) + process = subprocess.Popen([ 'rocprof', '--basenames', 'on', '--timestamp', 'on', '--stats', '-o', outFilePath + "/case_" + case + "/output_case" + case + ".csv", "./Tensor_audio_hip", srcPath, str(case), str(testType), str(numRuns), str(batchSize), outFilePath, scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec + while True: + output = process.stdout.readline() + if not output and process.poll() is not None: + break + print(output.strip()) + output_str = output.decode('utf-8') + logFile.write(output_str) + print("------------------------------------------------------------------------------------------") + +def run_test(loggingFolder, 
srcPath, case, numRuns, testType, batchSize, outFilePath, profilingOption = "NO"): + print("\n\n\n\n") + print("--------------------------------") + print("Running a New Functionality...") + print("--------------------------------") + if testType == 0: + run_unit_test_cmd(srcPath, case, numRuns, testType, batchSize, outFilePath) + elif testType == 1 and profilingOption == "NO": + run_performance_test_cmd(loggingFolder, srcPath, case, numRuns, testType, batchSize, outFilePath) + else: + run_performance_test_with_profiler_cmd(loggingFolder, srcPath, case, numRuns, testType, batchSize, outFilePath) + +# Parse and validate command-line arguments for the RPP test suite +def rpp_test_suite_parser_and_validator(): + parser = argparse.ArgumentParser() + parser.add_argument("--input_path", type = str, default = inFilePath, help = "Path to the input folder") + parser.add_argument("--case_start", type = int, default = caseMin, help = "Testing start case # - Range must be in [" + str(caseMin) + ":" + str(caseMax) + "]") + parser.add_argument("--case_end", type = int, default = caseMax, help = "Testing end case # - Range must be in [" + str(caseMin) + ":" + str(caseMax) + "]") + parser.add_argument('--test_type', type = int, default = 0, help = "Type of Test - (0 = QA tests / 1 = Performance tests)") + parser.add_argument('--qa_mode', type = int, default = 0, help = "Run with qa_mode? Output audio data from tests will be compared with golden outputs - (0 / 1)", required = False) + parser.add_argument('--case_list', nargs = "+", help = "List of case numbers to test", required = False) + parser.add_argument('--profiling', type = str , default = 'NO', help = 'Run with profiler? 
- (YES/NO)', required = False) + parser.add_argument('--num_runs', type = int, default = 1, help = "Specifies the number of runs for running the performance tests") + parser.add_argument('--preserve_output', type = int, default = 1, help = "preserves the output of the program - (0 = override output / 1 = preserve output )") + parser.add_argument('--batch_size', type = int, default = 1, help = "Specifies the batch size to use for running tests. Default is 1.") + args = parser.parse_args() + + # check if the folder exists + validate_path(args.input_path) + + # validate the parameters passed by user + if ((args.case_start < caseMin or args.case_start > caseMax) or (args.case_end < caseMin or args.case_end > caseMax)): + print("Starting case# and Ending case# must be in the " + str(caseMin) + ":" + str(caseMax) + " range. Aborting!") + exit(0) + elif args.case_end < args.case_start: + print("Ending case# must be greater than starting case#. Aborting!") + exit(0) + elif args.test_type < 0 or args.test_type > 1: + print("Test Type# must be in the 0 / 1. Aborting!") + exit(0) + elif args.case_list is not None and args.case_start != caseMin and args.case_end != caseMax: + print("Invalid input! Please provide only 1 option between case_list, case_start and case_end") + exit(0) + elif args.qa_mode < 0 or args.qa_mode > 1: + print("QA mode must be in the 0 / 1. Aborting!") + exit(0) + elif args.num_runs <= 0: + print("Number of Runs must be greater than 0. Aborting!") + exit(0) + elif args.batch_size <= 0: + print("Batch size must be greater than 0. Aborting!") + exit(0) + elif args.profiling != 'YES' and args.profiling != 'NO': + print("Profiling option value must be either 'YES' or 'NO'.") + exit(0) + elif args.preserve_output < 0 or args.preserve_output > 1: + print("Preserve Output must be in the 0/1 (0 = override / 1 = preserve). Aborting") + exit(0) + elif args.test_type == 0 and args.input_path != inFilePath: + print("Invalid input path! 
QA mode can run only with path:", inFilePath) + exit(0) + + if args.case_list is None: + args.case_list = range(args.case_start, args.case_end + 1) + args.case_list = [str(x) for x in args.case_list] + else: + for case in args.case_list: + if int(case) < caseMin or int(case) > caseMax: + print("Invalid case number " + str(case) + "! Case number must be in the " + str(caseMin) + ":" + str(caseMax) + " range. Aborting!") + exit(0) + return args + +args = rpp_test_suite_parser_and_validator() +srcPath = args.input_path +caseStart = args.case_start +caseEnd = args.case_end +testType = args.test_type +caseList = args.case_list +qaMode = args.qa_mode +profilingOption = args.profiling +numRuns = args.num_runs +preserveOutput = args.preserve_output +batchSize = args.batch_size +outFilePath = " " + +# Override testType to 0 if testType is 1 and qaMode is 1 +if testType == 1 and qaMode == 1: + print("WARNING: QA Mode cannot be run with testType = 1 (performance tests). Resetting testType to 0") + testType = 0 + +# set the output folders and number of runs based on type of test (unit test / performance test) +if(testType == 0): + outFilePath = outFolderPath + "/QA_RESULTS_AUDIO_HIP_" + timestamp + numRuns = 1 +elif(testType == 1): + if "--num_runs" not in sys.argv: + numRuns = 100 #default numRuns for running performance tests + outFilePath = outFolderPath + "/OUTPUT_PERFORMANCE_AUDIO_LOGS_HIP_" + timestamp +else: + print("Invalid TEST_TYPE specified. 
TEST_TYPE should be 0/1 (0 = QA tests / 1 = Performance tests)") + exit(0) + +if preserveOutput == 0: + validate_and_remove_folders(outFolderPath, "QA_RESULTS_AUDIO_HIP") + validate_and_remove_folders(outFolderPath, "OUTPUT_PERFORMANCE_AUDIO_LOGS_HIP") + +os.mkdir(outFilePath) +loggingFolder = outFilePath +dstPath = outFilePath + +# Validate DST_FOLDER +validate_and_remove_files(dstPath) + +# Recreate a clean build folder +if os.path.exists(buildFolderPath + "/build"): + shutil.rmtree(buildFolderPath + "/build") +os.makedirs(buildFolderPath + "/build") +os.chdir(buildFolderPath + "/build") + +# Run cmake and make commands +subprocess.call(["cmake", scriptPath], cwd=".") # nosec +subprocess.call(["make", "-j16"], cwd=".") # nosec + +# List of cases supported +supportedCaseList = ['0'] +if qaMode and batchSize != 3: + print("QA tests can only run with a batch size of 3.") + exit(0) + +for case in caseList: + if "--input_path" not in sys.argv: + if case == "3": + srcPath = scriptPath + "/../TEST_AUDIO_FILES/three_sample_multi_channel_src1" + else: + srcPath = inFilePath + + if case not in supportedCaseList: + continue + run_test(loggingFolder, srcPath, case, numRuns, testType, batchSize, outFilePath, profilingOption) + +# print the results of qa tests +nonQACaseList = [] # Add cases present in supportedCaseList, but without QA support +if testType == 0: + qaFilePath = os.path.join(outFilePath, "QA_results.txt") + checkFile = os.path.isfile(qaFilePath) + if checkFile: + print("---------------------------------- Results of QA Test - Tensor_audio_hip -----------------------------------\n") + print_qa_tests_summary(qaFilePath, supportedCaseList, nonQACaseList) + +# Performance tests +if testType == 1 and profilingOption == "NO": + logFileList = get_log_file_list() + for logFile in logFileList: + print_performance_tests_summary(logFile, "", numRuns) +elif testType == 1 and profilingOption == "YES": + RESULTS_DIR = outFolderPath + 
"/OUTPUT_PERFORMANCE_AUDIO_LOGS_HIP_" + timestamp + 
print("RESULTS_DIR = " + RESULTS_DIR) + CONSOLIDATED_FILE = RESULTS_DIR + "/consolidated_results.stats.csv" + + CASE_NUM_LIST = caseList + BIT_DEPTH_LIST = [2] + OFT_LIST = [0] + + # Open csv file + new_file = open(CONSOLIDATED_FILE, 'w') + new_file.write('"HIP Kernel Name","Calls","TotalDurationNs","AverageNs","Percentage"\n') + + # Loop through cases + for CASE_NUM in CASE_NUM_LIST: + # Set results directory + CASE_RESULTS_DIR = RESULTS_DIR + "/case_" + str(CASE_NUM) + print("CASE_RESULTS_DIR = " + CASE_RESULTS_DIR) + + # Loop through bit depths + for BIT_DEPTH in BIT_DEPTH_LIST: + # Loop through output format toggle cases + for OFT in OFT_LIST: + # Write into csv file + CASE_FILE_PATH = CASE_RESULTS_DIR + "/output_case" + str(CASE_NUM) + ".stats.csv" + print("CASE_FILE_PATH = " + CASE_FILE_PATH) + fileCheck = case_file_check(CASE_FILE_PATH, new_file) + if fileCheck == False: + continue + + new_file.close() + subprocess.call(['chown', str(os.getuid()) + ':' + str(os.getgid()), CONSOLIDATED_FILE]) # nosec + try: + generate_performance_reports(RESULTS_DIR) + except ImportError: + print("\nPandas not available! 
Results of GPU profiling experiment are available in the following files:\n" + \ + CONSOLIDATED_FILE + "\n") + except IOError: + print("Unable to open results in " + CONSOLIDATED_FILE) diff --git a/utilities/test_suite/HOST/Tensor_audio_host.cpp b/utilities/test_suite/HOST/Tensor_audio_host.cpp index 3ec2e0060..32c0c7087 100644 --- a/utilities/test_suite/HOST/Tensor_audio_host.cpp +++ b/utilities/test_suite/HOST/Tensor_audio_host.cpp @@ -266,6 +266,7 @@ int main(int argc, char **argv) } set_audio_descriptor_dims_and_strides_nostriding(dstDescPtr, batchSize, maxDstHeight, maxDstWidth, maxDstChannels, offsetInBytes); + dstDescPtr->numDims = 3; // Set buffer sizes for src/dst unsigned long long spectrogramBufferSize = (unsigned long long)dstDescPtr->h * (unsigned long long)dstDescPtr->w * (unsigned long long)dstDescPtr->c * (unsigned long long)dstDescPtr->n; @@ -381,6 +382,8 @@ int main(int argc, char **argv) set_audio_descriptor_dims_and_strides_nostriding(srcDescPtr, batchSize, maxSrcHeight, maxSrcWidth, maxSrcChannels, offsetInBytes); set_audio_descriptor_dims_and_strides_nostriding(dstDescPtr, batchSize, maxDstHeight, maxDstWidth, maxDstChannels, offsetInBytes); + srcDescPtr->numDims = 3; + dstDescPtr->numDims = 3; // Set buffer sizes for src/dst unsigned long long spectrogramBufferSize = (unsigned long long)srcDescPtr->h * (unsigned long long)srcDescPtr->w * (unsigned long long)srcDescPtr->c * (unsigned long long)srcDescPtr->n; diff --git a/utilities/test_suite/common.py b/utilities/test_suite/common.py index 527b40ead..699495b39 100644 --- a/utilities/test_suite/common.py +++ b/utilities/test_suite/common.py @@ -28,6 +28,12 @@ import datetime import shutil +try: + from errno import FileExistsError +except ImportError: + # Python 2 compatibility + FileExistsError = OSError + imageAugmentationMap = { 0: ["brightness", "HOST", "HIP"], 1: ["gamma_correction", "HOST", "HIP"], @@ -318,8 +324,9 @@ def read_from_subprocess_and_write_to_log(process, logFile): output = 
process.stdout.readline() if not output and process.poll() is not None: break - print(output.strip()) - logFile.write(output) + output = output.decode().strip() # Decode bytes to string and strip extra whitespace + print(output) + logFile.write(output + '\n') # Returns the layout name based on layout value def get_layout_name(layout): diff --git a/utilities/test_suite/rpp_test_suite_audio.h b/utilities/test_suite/rpp_test_suite_audio.h index c70c6659f..cce2c35aa 100644 --- a/utilities/test_suite/rpp_test_suite_audio.h +++ b/utilities/test_suite/rpp_test_suite_audio.h @@ -25,11 +25,6 @@ SOFTWARE. #include "rpp_test_suite_common.h" #include #include -#include - -using half_float::half; -using namespace std; -typedef half Rpp16f; // Include this header file to use functions from libsndfile #include @@ -57,7 +52,7 @@ std::map> NonSilentRegionReferenceOutputs = // sets descriptor dimensions and strides of src/dst inline void set_audio_descriptor_dims_and_strides(RpptDescPtr descPtr, int batchSize, int maxHeight, int maxWidth, int maxChannels, int offsetInBytes) { - descPtr->numDims = 4; + descPtr->numDims = 2; descPtr->offsetInBytes = offsetInBytes; descPtr->n = batchSize; descPtr->h = maxHeight; @@ -75,7 +70,7 @@ inline void set_audio_descriptor_dims_and_strides(RpptDescPtr descPtr, int batch // sets descriptor dimensions and strides of src/dst inline void set_audio_descriptor_dims_and_strides_nostriding(RpptDescPtr descPtr, int batchSize, int maxHeight, int maxWidth, int maxChannels, int offsetInBytes) { - descPtr->numDims = 4; + descPtr->numDims = 2; descPtr->offsetInBytes = offsetInBytes; descPtr->n = batchSize; descPtr->h = maxHeight; From 2a13f48d97f83a6311e6ec9fc70288ba3b1d9973 Mon Sep 17 00:00:00 2001 From: Abishek <52214183+r-abishek@users.noreply.github.com> Date: Tue, 23 Jul 2024 23:51:21 -0700 Subject: [PATCH 3/7] RPP Test Suite - Replace fstrings and subprocess calls to support old/new versions of Python (#399) * fix errors in test suite * changes to 
support tests for all the versions of python * Replace .format() for strings and use concatenation * Fix CI failures * Modify test suite to support pandas functions for python2 --------- Co-authored-by: HazarathKumarM Co-authored-by: Kiriti Gowda --- utilities/test_suite/HIP/runMiscTests.py | 29 ++++++----- utilities/test_suite/HIP/runTests.py | 60 ++++++++++++---------- utilities/test_suite/HIP/runVoxelTests.py | 40 ++++++++------- utilities/test_suite/HOST/runAudioTests.py | 21 ++++---- utilities/test_suite/HOST/runMiscTests.py | 17 +++--- utilities/test_suite/HOST/runTests.py | 53 ++++++++++--------- utilities/test_suite/HOST/runVoxelTests.py | 21 ++++---- utilities/test_suite/common.py | 28 +++++++++- utilities/test_suite/rpp_test_suite_misc.h | 1 + 9 files changed, 157 insertions(+), 113 deletions(-) diff --git a/utilities/test_suite/HIP/runMiscTests.py b/utilities/test_suite/HIP/runMiscTests.py index f4adbde28..ee97f4547 100644 --- a/utilities/test_suite/HIP/runMiscTests.py +++ b/utilities/test_suite/HIP/runMiscTests.py @@ -74,24 +74,25 @@ def generate_performance_reports(RESULTS_DIR): print(dfPrint_noIndices) def run_unit_test_cmd(numDims, case, numRuns, testType, toggle, batchSize, outFilePath, additionalArg): - print(f"./Tensor_misc_hip {case} {testType} {toggle} {numDims} {batchSize} {numRuns} {additionalArg}") - result = subprocess.run([buildFolderPath + "/build/Tensor_misc_hip", str(case), str(testType), str(toggle), str(numDims), str(batchSize), str(numRuns), str(additionalArg), outFilePath, scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_misc_hip " + str(case) + " " + str(testType) + " " + str(toggle) + " " + str(numDims) + " " + str(batchSize) + " " + str(numRuns) + " " + str(additionalArg)) + result = subprocess.Popen([buildFolderPath + "/build/Tensor_misc_hip", str(case), str(testType), str(toggle), str(numDims), str(batchSize), str(numRuns), str(additionalArg), outFilePath, scriptPath], 
stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) print("------------------------------------------------------------------------------------------") def run_performance_test_cmd(loggingFolder, numDims, case, numRuns, testType, toggle, batchSize, outFilePath, additionalArg): - with open("{}/Tensor_misc_hip_raw_performance_log.txt".format(loggingFolder), "a") as logFile: - print(f"./Tensor_misc_hip {case} {testType} {toggle} {numDims} {batchSize} {numRuns} {additionalArg}") - process = subprocess.Popen([buildFolderPath + "/build/Tensor_misc_hip", str(case), str(testType), str(toggle), str(numDims), str(batchSize), str(numRuns), str(additionalArg), outFilePath, scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True) # nosec + with open(loggingFolder + "/Tensor_misc_hip_raw_performance_log.txt", "a") as logFile: + logFile.write("./Tensor_misc_hip " + str(case) + " " + str(testType) + " " + str(toggle) + " " + str(numDims) + " " + str(batchSize) + " " + str(numRuns) + " " + str(additionalArg) + "\n") + process = subprocess.Popen([buildFolderPath + "/build/Tensor_misc_hip", str(case), str(testType), str(toggle), str(numDims), str(batchSize), str(numRuns), str(additionalArg), outFilePath, scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec read_from_subprocess_and_write_to_log(process, logFile) def run_performance_test_with_profiler_cmd(loggingFolder, numDims, case, numRuns, testType, toggle, batchSize, outFilePath, additionalArg): - if not os.path.exists(f"{outFilePath}/case_{case}"): - os.mkdir(f"{outFilePath}/case_{case}") + if not os.path.exists(outFilePath + "/case_" + str(case)): + os.mkdir(outFilePath + "/case_" + str(case)) - with open("{}/Tensor_misc_hip_raw_performance_log.txt".format(loggingFolder), "a") as logFile: - print(f"\nrocprof --basenames on --timestamp on --stats -o {outFilePath}/case_{case}/output_case{case}.csv ./Tensor_misc_hip {case} {testType} 
{toggle} {numDims} {batchSize} {numRuns} {additionalArg}") - process = subprocess.Popen([ 'rocprof', '--basenames', 'on', '--timestamp', 'on', '--stats', '-o', f"{outFilePath}/case_{case}/output_case{case}.csv", "./Tensor_misc_hip", str(case), str(testType), str(toggle), str(numDims), str(batchSize), str(numRuns), str(additionalArg), outFilePath, scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True) # nosec + with open(loggingFolder + "/Tensor_misc_hip_raw_performance_log.txt", "a") as logFile: + logFile.write("\nrocprof --basenames on --timestamp on --stats -o " + outFilePath + "/case_" + str(case) + "/output_case" + str(case) + ".csv ./Tensor_misc_hip " + str(case) + " " + str(testType) + " " + str(toggle) + " " + str(numDims) + " " + str(batchSize) + " " + str(numRuns) + " " + str(additionalArg) + "\n") + process = subprocess.Popen(['rocprof', '--basenames', 'on', '--timestamp', 'on', '--stats', '-o', outFilePath + "/case_" + str(case) + "/output_case" + str(case) + ".csv", "./Tensor_misc_hip", str(case), str(testType), str(toggle), str(numDims), str(batchSize), str(numRuns), str(additionalArg), outFilePath, scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec read_from_subprocess_and_write_to_log(process, logFile) print("------------------------------------------------------------------------------------------") @@ -206,8 +207,8 @@ def rpp_test_suite_parser_and_validator(): os.chdir(buildFolderPath + "/build") # Run cmake and make commands -subprocess.run(["cmake", scriptPath], cwd=".") # nosec -subprocess.run(["make", "-j16"], cwd=".") # nosec +subprocess.call(["cmake", scriptPath], cwd=".") # nosec +subprocess.call(["make", "-j16"], cwd=".") # nosec supportedCaseList = ['0', '1', '2'] for case in caseList: @@ -253,7 +254,7 @@ def rpp_test_suite_parser_and_validator(): continue new_file.close() - subprocess.call(['chown', '{}:{}'.format(os.getuid(), os.getgid()), CONSOLIDATED_FILE]) # nosec + subprocess.call(['chown', 
str(os.getuid()) + ':' + str(os.getgid()), CONSOLIDATED_FILE]) # nosec try: generate_performance_reports(RESULTS_DIR) except ImportError: diff --git a/utilities/test_suite/HIP/runTests.py b/utilities/test_suite/HIP/runTests.py index 88606d0e1..cb4bc8bda 100644 --- a/utilities/test_suite/HIP/runTests.py +++ b/utilities/test_suite/HIP/runTests.py @@ -70,35 +70,39 @@ def run_unit_test(srcPath1, srcPath2, dstPathTemp, case, numRuns, testType, layo if case == "40" or case == "41" or case == "49" or case == "54": for kernelSize in range(3, 10, 2): - print(f"./Tensor_hip {srcPath1} {srcPath2} {dstPath} {bitDepth} {outputFormatToggle} {case} {kernelSize}") - result = subprocess.run([buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), str(kernelSize), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_hip " + srcPath1 + " " + srcPath2 + " " + dstPath + " " + str(bitDepth) + " " + str(outputFormatToggle) + " " + str(case) + " " + str(kernelSize)) + result = subprocess.Popen([buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), str(kernelSize), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) elif case == "8": # Run all variants of noise type functions with additional argument of noiseType = gausssianNoise / shotNoise / saltandpepperNoise for noiseType in range(3): - print(f"./Tensor_hip {srcPath1} {srcPath2} {dstPathTemp} {bitDepth} {outputFormatToggle} {case} {noiseType} ") - result = subprocess.run([buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPathTemp, str(bitDepth), 
str(outputFormatToggle), str(case), str(noiseType), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_hip " + srcPath1 + " " + srcPath2 + " " + dstPathTemp + " " + str(bitDepth) + " " + str(outputFormatToggle) + " " + str(case) + " " + str(noiseType)) + result = subprocess.Popen([buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), str(noiseType), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) elif case == "21" or case == "23" or case == "24" or case == "79": # Run all variants of interpolation functions with additional argument of interpolationType = bicubic / bilinear / gaussian / nearestneigbor / lanczos / triangular interpolationRange = 6 if case =='79': interpolationRange = 2 for interpolationType in range(interpolationRange): - print(f"./Tensor_hip {srcPath1} {srcPath2} {dstPathTemp} {bitDepth} {outputFormatToggle} {case} {interpolationType}") - result = subprocess.run([buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), str(interpolationType), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_hip " + srcPath1 + " " + srcPath2 + " " + dstPathTemp + " " + str(bitDepth) + " " + str(outputFormatToggle) + " " + str(case) + " " + str(interpolationType)) + result = subprocess.Popen([buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), str(interpolationType), str(numRuns), 
str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) else: - print(f"./Tensor_hip {srcPath1} {srcPath2} {dstPathTemp} {bitDepth} {outputFormatToggle} {case} 0 {numRuns} {testType} {layout}") - result = subprocess.run([buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), "0", str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_hip " + srcPath1 + " " + srcPath2 + " " + dstPathTemp + " " + str(bitDepth) + " " + str(outputFormatToggle) + " " + str(case) + " 0 " + str(numRuns) + " " + str(testType) + " " + str(layout)) + result = subprocess.Popen([buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), "0", str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) print("------------------------------------------------------------------------------------------") def run_performance_test_cmd(loggingFolder, logFileLayout, srcPath1, srcPath2, dstPath, bitDepth, outputFormatToggle, case, additionalParam, numRuns, testType, layout, qaMode, decoderType, batchSize, roiList): - with open("{}/Tensor_hip_{}_raw_performance_log.txt".format(loggingFolder, logFileLayout), "a") as logFile: - print(f"./Tensor_hip {srcPath1} {srcPath2} {dstPath} {bitDepth} {outputFormatToggle} {case} {additionalParam} 0 ") - process = subprocess.Popen([buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPath, str(bitDepth), str(outputFormatToggle), str(case), 
str(additionalParam), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True) # nosec + with open(loggingFolder + "/Tensor_hip_" + logFileLayout + "_raw_performance_log.txt", "a") as logFile: + print("./Tensor_hip " + srcPath1 + " " + srcPath2 + " " + dstPath + " " + str(bitDepth) + " " + str(outputFormatToggle) + " " + str(case) + " " + str(additionalParam)) + process = subprocess.Popen([buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPath, str(bitDepth), str(outputFormatToggle), str(case), str(additionalParam), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec read_from_subprocess_and_write_to_log(process, logFile) def run_performance_test(loggingFolder, logFileLayout, srcPath1, srcPath2, dstPath, case, numRuns, testType, layout, qaMode, decoderType, batchSize, roiList): @@ -133,11 +137,11 @@ def run_performance_test(loggingFolder, logFileLayout, srcPath1, srcPath2, dstPa def run_performance_test_with_profiler(loggingFolder, logFileLayout, srcPath1, srcPath2, dstPath, bitDepth, outputFormatToggle, case, additionalParam, additionalParamType, numRuns, testType, layout, qaMode, decoderType, batchSize, roiList): addtionalParamString = additionalParamType + str(additionalParam) layoutName = get_layout_name(layout) - if not os.path.isdir(f"{dstPath}/Tensor_{layoutName}/case_{case}"): - os.mkdir(f"{dstPath}/Tensor_{layoutName}/case_{case}") - with open(f"{loggingFolder}/Tensor_hip_{logFileLayout}_raw_performance_log.txt", "a") as logFile: - print(f'rocprof --basenames on --timestamp on --stats -o {dstPath}/Tensor_{layoutName}/case_{case}/output_case{case}_bitDepth{bitDepth}_oft{outputFormatToggle}{addtionalParamString}.csv ./Tensor_hip {srcPath1} {srcPath2} {bitDepth} {outputFormatToggle} {case} 
{additionalParam} 0') - process = subprocess.Popen(['rocprof', '--basenames', 'on', '--timestamp', 'on', '--stats', '-o', f'{dstPath}/Tensor_{layoutName}/case_{case}/output_case{case}_bitDepth{bitDepth}_oft{outputFormatToggle}{addtionalParamString}.csv', buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPath, str(bitDepth), str(outputFormatToggle), str(case), str(additionalParam), str(numRuns), str(testType), str(layout), '0', str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec + if not os.path.isdir(dstPath + "/Tensor_" + layoutName + "/case_" + str(case)): + os.makedirs(dstPath + "/Tensor_" + layoutName + "/case_" + str(case)) + with open(loggingFolder + "/Tensor_hip_" + logFileLayout + "_raw_performance_log.txt", "a") as logFile: + logFile.write("rocprof --basenames on --timestamp on --stats -o " + dstPath + "/Tensor_" + layoutName + "/case_" + str(case) + "/output_case" + str(case) + "_bitDepth" + str(bitDepth) + "_oft" + addtionalParamString + ".csv ./Tensor_hip " + srcPath1 + " " + srcPath2 + " " + str(bitDepth) + " " + str(outputFormatToggle) + " " + str(case) + " " + str(additionalParam) + " 0\n") + process = subprocess.Popen(['rocprof', '--basenames', 'on', '--timestamp', 'on', '--stats', '-o', dstPath + "/Tensor_" + layoutName + "/case_" + str(case) + "/output_case" + str(case) + "_bitDepth" + str(bitDepth) + "_oft" + addtionalParamString + ".csv", buildFolderPath + "/build/Tensor_hip", srcPath1, srcPath2, dstPath, str(bitDepth), str(outputFormatToggle), str(case), str(additionalParam), str(numRuns), str(testType), str(layout), '0', str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec while True: output = process.stdout.readline() if not output and process.poll() is not None: @@ -172,7 +176,7 @@ def rpp_test_suite_parser_and_validator(): # validate the parameters passed by user if 
((args.case_start < caseMin or args.case_start > caseMax) or (args.case_end < caseMin or args.case_end > caseMax)): - print(f"Starting case# and Ending case# must be in the {caseMin}:{caseMax} range. Aborting!") + print("Starting case# and Ending case# must be in the " + str(caseMin) + ":" + str(caseMax) + " range. Aborting!") exit(0) elif args.case_end < args.case_start: print("Ending case# must be greater than starting case#. Aborting!") @@ -214,7 +218,7 @@ def rpp_test_suite_parser_and_validator(): else: for case in args.case_list: if int(case) < caseMin or int(case) > caseMax: - print(f"Invalid case number {case}! Case number must be in the {caseMin}:{caseMax} range. Aborting!") + print("Invalid case number " + str(case) + "! Case number must be in the " + str(caseMin) + ":" + str(caseMax) + " range. Aborting!") exit(0) return args @@ -272,17 +276,17 @@ def rpp_test_suite_parser_and_validator(): os.chdir(buildFolderPath + "/build") # Run cmake and make commands -subprocess.run(["cmake", scriptPath], cwd=".") # nosec -subprocess.run(["make", "-j16"], cwd=".") # nosec +subprocess.call(["cmake", scriptPath], cwd=".") # nosec +subprocess.call(["make", "-j16"], cwd=".") # nosec # List of cases supported supportedCaseList = ['0', '1', '2', '4', '6', '8', '13', '20', '21', '23', '26', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '45', '46', '54', '61', '63', '65', '68', '70', '79', '80', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92'] # Create folders based on testType and profilingOption if testType == 1 and profilingOption == "YES": - os.makedirs(f"{dstPath}/Tensor_PKD3") - os.makedirs(f"{dstPath}/Tensor_PLN1") - os.makedirs(f"{dstPath}/Tensor_PLN3") + os.makedirs(dstPath + "/Tensor_PKD3") + os.makedirs(dstPath + "/Tensor_PLN1") + os.makedirs(dstPath + "/Tensor_PLN3") print("\n\n\n\n\n") print("##########################################################################################") @@ -453,7 +457,7 @@ def 
rpp_test_suite_parser_and_validator(): continue new_file.close() - subprocess.call(['chown', '{}:{}'.format(os.getuid(), os.getgid()), RESULTS_DIR + "/consolidated_results_" + TYPE + ".stats.csv"]) # nosec + subprocess.call(['chown', str(os.getuid()) + ':' + str(os.getgid()), RESULTS_DIR + "/consolidated_results_" + TYPE + ".stats.csv"]) # nosec try: generate_performance_reports(d_counter, TYPE_LIST, RESULTS_DIR) diff --git a/utilities/test_suite/HIP/runVoxelTests.py b/utilities/test_suite/HIP/runVoxelTests.py index f3ad38025..31c9dd22f 100644 --- a/utilities/test_suite/HIP/runVoxelTests.py +++ b/utilities/test_suite/HIP/runVoxelTests.py @@ -57,20 +57,23 @@ def func_group_finder(case_number): return "miscellaneous" def run_unit_test_cmd(headerPath, dataPath, dstPathTemp, layout, case, numRuns, testType, qaMode, batchSize): - print(f"./Tensor_voxel_hip {headerPath} {dataPath} {dstPathTemp} {layout} {case} {numRuns} {testType} {qaMode} {batchSize} {bitDepth}") - result = subprocess.run([buildFolderPath + "/build/Tensor_voxel_hip", headerPath, dataPath, dstPathTemp, str(layout), str(case), str(numRuns), str(testType), str(qaMode), str(batchSize), str(bitDepth), scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_voxel_hip " + headerPath + " " + dataPath + " " + dstPathTemp + " " + str(layout) + " " + str(case) + " " + str(numRuns) + " " + str(testType) + " " + str(qaMode) + " " + str(batchSize) + " " + str(bitDepth)) + result = subprocess.Popen([buildFolderPath + "/build/Tensor_voxel_hip", headerPath, dataPath, dstPathTemp, str(layout), str(case), str(numRuns), str(testType), str(qaMode), str(batchSize), str(bitDepth), scriptPath], stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) print("------------------------------------------------------------------------------------------") def run_performance_test_cmd(loggingFolder, logFileLayout, headerPath, dataPath, 
dstPathTemp, layout, case, numRuns, testType, qaMode, batchSize): - with open(f"{loggingFolder}/Tensor_voxel_hip_{logFileLayout}_raw_performance_log.txt", "a") as logFile: - print(f"./Tensor_voxel_hip {headerPath} {dataPath} {dstPathTemp} {layout} {case} {numRuns} {testType} {qaMode} {batchSize} {bitDepth}") - process = subprocess.Popen([buildFolderPath + "/build/Tensor_voxel_hip", headerPath, dataPath, dstPathTemp, str(layout), str(case), str(numRuns), str(testType), str(qaMode), str(batchSize), str(bitDepth), scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True) # nosec + with open(loggingFolder + "/Tensor_voxel_hip_" + logFileLayout + "_raw_performance_log.txt", "a") as logFile: + logFile.write("./Tensor_voxel_hip " + headerPath + " " + dataPath + " " + dstPathTemp + " " + str(layout) + " " + str(case) + " " + str(numRuns) + " " + str(testType) + " " + str(qaMode) + " " + str(batchSize) + " " + str(bitDepth) + "\n") + process = subprocess.Popen([buildFolderPath + "/build/Tensor_voxel_hip", headerPath, dataPath, dstPathTemp, str(layout), str(case), str(numRuns), str(testType), str(qaMode), str(batchSize), str(bitDepth), scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec while True: output = process.stdout.readline() if not output and process.poll() is not None: break - print(output.strip()) + output = output.decode().strip() # Decode bytes to string and strip extra whitespace + print(output) + logFile.write(output) if "Running" in output or "max,min,avg wall times" in output: cleanedOutput = ''.join(char for char in output if 32 <= ord(char) <= 126) # Remove control characters cleanedOutput = cleanedOutput.strip() # Remove leading/trailing whitespace @@ -81,14 +84,15 @@ def run_performance_test_cmd(loggingFolder, logFileLayout, headerPath, dataPath, def run_performance_test_with_profiler_cmd(loggingFolder, logFileLayout, headerPath, dataPath, dstPathTemp, layout, case, numRuns, testType, qaMode, batchSize): layoutName = 
get_layout_name(layout) - if not os.path.exists(f"{loggingFolder}/Tensor_{layoutName}/case_{case}"): - os.mkdir(f"{loggingFolder}/Tensor_{layoutName}/case_{case}") + directory_path = os.path.join(loggingFolder, "Tensor_" + layoutName, "case_" + str(case)) + if not os.path.exists(directory_path): + os.mkdir(directory_path) bitDepths = [0, 2] for bitDepth in bitDepths: - with open(f"{loggingFolder}/Tensor_voxel_hip_{logFileLayout}_raw_performance_log.txt", "a") as logFile: - print(f"\nrocprof --basenames on --timestamp on --stats -o {dstPathTemp}/Tensor_{layoutName}/case_{case}/output_case{case}.csv ./Tensor_voxel_hip {headerPath} {dataPath} {dstPathTemp} {layout} {case}{numRuns} {testType} {qaMode} {batchSize} {bitDepth}") - process = subprocess.Popen([ 'rocprof', '--basenames', 'on', '--timestamp', 'on', '--stats', '-o', f"{dstPath}/Tensor_{layoutName}/case_{case}/output_case{case}.csv", buildFolderPath + "/build/Tensor_voxel_hip", headerPath, dataPath, dstPathTemp, str(layout), str(case), str(numRuns), str(testType), str(qaMode), str(batchSize), str(bitDepth), scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec + with open(loggingFolder + "/Tensor_voxel_hip_" + logFileLayout + "_raw_performance_log.txt", "a") as logFile: + logFile.write("\nrocprof --basenames on --timestamp on --stats -o " + dstPathTemp + "/Tensor_" + layoutName + "/case_" + str(case) + "/output_case" + str(case) + ".csv ./Tensor_voxel_hip " + headerPath + " " + dataPath + " " + dstPathTemp + " " + str(layout) + " " + str(case) + " " + str(numRuns) + " " + str(testType) + " " + str(qaMode) + " " + str(batchSize) + " " + str(bitDepth) + "\n") + process = subprocess.Popen(['rocprof', '--basenames', 'on', '--timestamp', 'on', '--stats', '-o', dstPath + "/Tensor_" + layoutName + "/case_" + str(case) + "/output_case" + str(case) + ".csv", buildFolderPath + "/build/Tensor_voxel_hip", headerPath, dataPath, dstPathTemp, str(layout), str(case), str(numRuns), str(testType), str(qaMode), 
str(batchSize), str(bitDepth), scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec while True: output = process.stdout.readline() if not output and process.poll() is not None: @@ -227,17 +231,17 @@ def rpp_test_suite_parser_and_validator(): os.chdir(buildFolderPath + "/build") # Run cmake and make commands -subprocess.run(["cmake", scriptPath], cwd=".") # nosec -subprocess.run(["make", "-j16"], cwd=".") # nosec +subprocess.call(["cmake", scriptPath], cwd=".") # nosec +subprocess.call(["make", "-j16"], cwd=".") # nosec # List of cases supported supportedCaseList = ['0', '1', '2', '3', '4', '5', '6'] # Create folders based on testType and profilingOption if testType == 1 and profilingOption == "YES": - os.makedirs(f"{dstPath}/Tensor_PKD3") - os.makedirs(f"{dstPath}/Tensor_PLN1") - os.makedirs(f"{dstPath}/Tensor_PLN3") + os.makedirs(dstPath + "/Tensor_PKD3") + os.makedirs(dstPath + "/Tensor_PLN1") + os.makedirs(dstPath + "/Tensor_PLN3") print("\n\n\n\n\n") print("##########################################################################################") @@ -322,7 +326,7 @@ def rpp_test_suite_parser_and_validator(): continue new_file.close() - subprocess.call(['chown', '{}:{}'.format(os.getuid(), os.getgid()), RESULTS_DIR + "/consolidated_results_" + TYPE + ".stats.csv"]) # nosec + subprocess.call(['chown', str(os.getuid()) + ':' + str(os.getgid()), RESULTS_DIR + "/consolidated_results_" + TYPE + ".stats.csv"]) # nosec try: generate_performance_reports(d_counter, TYPE_LIST, RESULTS_DIR) diff --git a/utilities/test_suite/HOST/runAudioTests.py b/utilities/test_suite/HOST/runAudioTests.py index c0600c057..a1771716b 100644 --- a/utilities/test_suite/HOST/runAudioTests.py +++ b/utilities/test_suite/HOST/runAudioTests.py @@ -45,15 +45,16 @@ def get_log_file_list(): ] def run_unit_test_cmd(srcPath, case, numRuns, testType, batchSize, outFilePath): - print(f"./Tensor_audio_host {srcPath} {case} {numRuns} {testType} {numRuns} {batchSize}") - result = 
subprocess.run([buildFolderPath + "/build/Tensor_audio_host", srcPath, str(case), str(testType), str(numRuns), str(batchSize), outFilePath, scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_audio_host " + srcPath + " " + str(case) + " " + str(numRuns) + " " + str(testType) + " " + str(numRuns) + " " + str(batchSize)) + result = subprocess.Popen([buildFolderPath + "/build/Tensor_audio_host", srcPath, str(case), str(testType), str(numRuns), str(batchSize), outFilePath, scriptPath], stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) print("------------------------------------------------------------------------------------------") def run_performance_test_cmd(loggingFolder, srcPath, case, numRuns, testType, batchSize, outFilePath): - with open("{}/Tensor_audio_host_raw_performance_log.txt".format(loggingFolder), "a") as logFile: - print(f"./Tensor_audio_host {srcPath} {case} {numRuns} {testType} {numRuns} {batchSize} ") - process = subprocess.Popen([buildFolderPath + "/build/Tensor_audio_host", srcPath, str(case), str(testType), str(numRuns), str(batchSize), outFilePath, scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True) # nosec + with open(loggingFolder + "/Tensor_audio_host_raw_performance_log.txt", "a") as logFile: + logFile.write("./Tensor_audio_host " + srcPath + " " + str(case) + " " + str(numRuns) + " " + str(testType) + " " + str(numRuns) + " " + str(batchSize) + "\n") + process = subprocess.Popen([buildFolderPath + "/build/Tensor_audio_host", srcPath, str(case), str(testType), str(numRuns), str(batchSize), outFilePath, scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec read_from_subprocess_and_write_to_log(process, logFile) print("------------------------------------------------------------------------------------------") @@ -87,7 +88,7 @@ def rpp_test_suite_parser_and_validator(): # validate the parameters 
passed by user if ((args.case_start < caseMin or args.case_start > caseMax) or (args.case_end < caseMin or args.case_end > caseMax)): - print(f"Starting case# and Ending case# must be in the {caseMin}:{caseMax} range. Aborting!") + print("Starting case# and Ending case# must be in the " + str(caseMin) + ":" + str(caseMax) + " range. Aborting!") exit(0) elif args.case_end < args.case_start: print("Ending case# must be greater than starting case#. Aborting!") @@ -120,7 +121,7 @@ def rpp_test_suite_parser_and_validator(): else: for case in args.case_list: if int(case) < caseMin or int(case) > caseMax: - print(f"Invalid case number {case}! Case number must be in the {caseMin}:{caseMax} range. Aborting!") + print("Invalid case number " + str(case) + "! Case number must be in the " + str(caseMin) + ":" + str(caseMax) + " range. Aborting!") exit(0) return args @@ -171,8 +172,8 @@ def rpp_test_suite_parser_and_validator(): os.chdir(buildFolderPath + "/build") # Run cmake and make commands -subprocess.run(["cmake", scriptPath], cwd=".") # nosec -subprocess.run(["make", "-j16"], cwd=".") # nosec +subprocess.call(["cmake", scriptPath], cwd=".") # nosec +subprocess.call(["make", "-j16"], cwd=".") # nosec # List of cases supported supportedCaseList = ['0', '1', '2', '3', '4', '5', '6', '7'] diff --git a/utilities/test_suite/HOST/runMiscTests.py b/utilities/test_suite/HOST/runMiscTests.py index 0f428fe40..931838f71 100644 --- a/utilities/test_suite/HOST/runMiscTests.py +++ b/utilities/test_suite/HOST/runMiscTests.py @@ -47,15 +47,16 @@ def get_log_file_list(): ] def run_unit_test_cmd(numDims, case, numRuns, testType, toggle, batchSize, outFilePath, additionalArg): - print(f"./Tensor_misc_host {case} {testType} {toggle} {numDims} {batchSize} {numRuns} {additionalArg}") - result = subprocess.run([buildFolderPath + "/build/Tensor_misc_host", str(case), str(testType), str(toggle), str(numDims), str(batchSize), str(numRuns), str(additionalArg), outFilePath, scriptPath], 
stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_misc_host " + str(case) + " " + str(testType) + " " + str(toggle) + " " + str(numDims) + " " + str(batchSize) + " " + str(numRuns) + " " + str(additionalArg)) + result = subprocess.Popen([buildFolderPath + "/build/Tensor_misc_host", str(case), str(testType), str(toggle), str(numDims), str(batchSize), str(numRuns), str(additionalArg), outFilePath, scriptPath], stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) print("------------------------------------------------------------------------------------------") def run_performance_test_cmd(loggingFolder, numDims, case, numRuns, testType, toggle, batchSize, outFilePath, additionalArg): - with open("{}/Tensor_misc_host_raw_performance_log.txt".format(loggingFolder), "a") as logFile: - print(f"./Tensor_misc_host {case} {testType} {toggle} {numDims} {batchSize} {numRuns} {additionalArg}") - process = subprocess.Popen([buildFolderPath + "/build/Tensor_misc_host", str(case), str(testType), str(toggle), str(numDims), str(batchSize), str(numRuns), str(additionalArg), outFilePath, scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True) # nosec + with open(loggingFolder + "/Tensor_misc_host_raw_performance_log.txt", "a") as logFile: + logFile.write("./Tensor_misc_host " + str(case) + " " + str(testType) + " " + str(toggle) + " " + str(numDims) + " " + str(batchSize) + " " + str(numRuns) + " " + str(additionalArg) + "\n") + process = subprocess.Popen([buildFolderPath + "/build/Tensor_misc_host", str(case), str(testType), str(toggle), str(numDims), str(batchSize), str(numRuns), str(additionalArg), outFilePath, scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec read_from_subprocess_and_write_to_log(process, logFile) def run_test(loggingFolder, numDims, case, numRuns, testType, toggle, batchSize, outFilePath, additionalArg = ""): @@ -162,8 +163,8 @@ def 
rpp_test_suite_parser_and_validator(): os.chdir(buildFolderPath + "/build") # Run cmake and make commands -subprocess.run(["cmake", scriptPath], cwd=".") # nosec -subprocess.run(["make", "-j16"], cwd=".") # nosec +subprocess.call(["cmake", scriptPath], cwd=".") # nosec +subprocess.call(["make", "-j16"], cwd=".") # nosec supportedCaseList = ['0', '1', '2'] for case in caseList: diff --git a/utilities/test_suite/HOST/runTests.py b/utilities/test_suite/HOST/runTests.py index bec4de5be..7386b364b 100644 --- a/utilities/test_suite/HOST/runTests.py +++ b/utilities/test_suite/HOST/runTests.py @@ -71,34 +71,37 @@ def run_unit_test(srcPath1, srcPath2, dstPathTemp, case, numRuns, testType, layo if case == "8": # Run all variants of noise type functions with additional argument of noiseType = gausssianNoise / shotNoise / saltandpepperNoise for noiseType in range(3): - print(f"./Tensor_host {srcPath1} {srcPath2} {dstPathTemp} {bitDepth} {outputFormatToggle} {case} {noiseType} 0 ") - result = subprocess.run([buildFolderPath + "/build/Tensor_host", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), str(noiseType), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_host " + srcPath1 + " " + srcPath2 + " " + dstPathTemp + " " + str(bitDepth) + " " + str(outputFormatToggle) + " " + str(case) + " " + str(noiseType) + " 0") + result = subprocess.Popen([buildFolderPath + "/build/Tensor_host", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), str(noiseType), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE, stderr=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) elif case == "21" or case == "23" or case == "24"
or case == "79": # Run all variants of interpolation functions with additional argument of interpolationType = bicubic / bilinear / gaussian / nearestneigbor / lanczos / triangular interpolationRange = 6 if case =='79': interpolationRange = 2 for interpolationType in range(interpolationRange): - print(f"./Tensor_host {srcPath1} {srcPath2} {dstPathTemp} {bitDepth} {outputFormatToggle} {case} {interpolationType} 0") - result = subprocess.run([buildFolderPath + "/build/Tensor_host", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), str(interpolationType), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_host " + srcPath1 + " " + srcPath2 + " " + dstPathTemp + " " + str(bitDepth) + " " + str(outputFormatToggle) + " " + str(case) + " " + str(interpolationType) + " 0") + result = subprocess.Popen([buildFolderPath + "/build/Tensor_host", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), str(interpolationType), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE, stderr=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) else: - print(f"./Tensor_host {srcPath1} {srcPath2} {dstPathTemp} {bitDepth} {outputFormatToggle} {case} 0 {numRuns} {testType} {layout} 0") - result = subprocess.run([buildFolderPath + "/build/Tensor_host", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), "0", str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_host " + srcPath1 + " " + srcPath2 + " " + dstPathTemp + " " + str(bitDepth) + " " +
str(outputFormatToggle) + " " + str(case) + " 0 " + str(numRuns) + " " + str(testType) + " " + str(layout) + " 0") + result = subprocess.Popen([buildFolderPath + "/build/Tensor_host", srcPath1, srcPath2, dstPathTemp, str(bitDepth), str(outputFormatToggle), str(case), "0", str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) print("------------------------------------------------------------------------------------------") def run_performance_test_cmd(loggingFolder, logFileLayout, srcPath1, srcPath2, dstPath, bitDepth, outputFormatToggle, case, additionalParam, numRuns, testType, layout, qaMode, decoderType, batchSize, roiList): if qaMode == 1: - with open("{}/BatchPD_host_{}_raw_performance_log.txt".format(loggingFolder, logFileLayout), "a") as logFile: - process = subprocess.Popen([buildFolderPath + "/build/BatchPD_host_" + logFileLayout, srcPath1, srcPath2, str(bitDepth), str(outputFormatToggle), str(case), str(additionalParam), "0"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True) # nosec + with open(loggingFolder + "/BatchPD_host_" + logFileLayout + "_raw_performance_log.txt", "a") as logFile: + process = subprocess.Popen([buildFolderPath + "/build/BatchPD_host_" + logFileLayout, srcPath1, srcPath2, str(bitDepth), str(outputFormatToggle), str(case), str(additionalParam), "0"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec read_from_subprocess_and_write_to_log(process, logFile) - with open("{}/Tensor_host_{}_raw_performance_log.txt".format(loggingFolder, logFileLayout), "a") as logFile: - print(f"./Tensor_host {srcPath1} {srcPath2} {dstPath} {bitDepth} {outputFormatToggle} {case} {additionalParam} 0 ") - process = subprocess.Popen([buildFolderPath + "/build/Tensor_host", srcPath1, srcPath2, dstPath, str(bitDepth), str(outputFormatToggle), str(case), 
str(additionalParam), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True) # nosec + with open(loggingFolder + "/Tensor_host_" + logFileLayout + "_raw_performance_log.txt", "a") as logFile: + logFile.write("./Tensor_host " + srcPath1 + " " + srcPath2 + " " + dstPath + " " + str(bitDepth) + " " + str(outputFormatToggle) + " " + str(case) + " " + str(additionalParam) + " 0\n") + process = subprocess.Popen([buildFolderPath + "/build/Tensor_host", srcPath1, srcPath2, dstPath, str(bitDepth), str(outputFormatToggle), str(case), str(additionalParam), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec read_from_subprocess_and_write_to_log(process, logFile) def run_performance_test(loggingFolder, logFileLayout, srcPath1, srcPath2, dstPath, case, numRuns, testType, layout, qaMode, decoderType, batchSize, roiList): @@ -154,7 +157,7 @@ def rpp_test_suite_parser_and_validator(): # validate the parameters passed by user if ((args.case_start < caseMin or args.case_start > caseMax) or (args.case_end < caseMin or args.case_end > caseMax)): - print(f"Starting case# and Ending case# must be in the {caseMin}:{caseMax} range. Aborting!") + print("Starting case# and Ending case# must be in the " + str(caseMin) + ":" + str(caseMax) + " range. Aborting!") exit(0) elif args.case_end < args.case_start: print("Ending case# must be greater than starting case#. Aborting!") @@ -193,7 +196,7 @@ def rpp_test_suite_parser_and_validator(): else: for case in args.case_list: if int(case) < caseMin or int(case) > caseMax: - print(f"Invalid case number {case}! Case number must be in the {caseMin}:{caseMax} range. Aborting!") + print("Invalid case number " + str(case) + "! Case number must be in the " + str(caseMin) + ":" + str(caseMax) + " range. 
Aborting!") exit(0) return args @@ -254,8 +257,8 @@ def rpp_test_suite_parser_and_validator(): os.chdir(buildFolderPath + "/build") # Run cmake and make commands -subprocess.run(["cmake", scriptPath], cwd=".") # nosec -subprocess.run(["make", "-j16"], cwd=".") # nosec +subprocess.call(["cmake", scriptPath], cwd=".") # nosec +subprocess.call(["make", "-j16"], cwd=".") # nosec # List of cases supported supportedCaseList = ['0', '1', '2', '4', '6', '8', '13', '20', '21', '23', '26', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '45', '46', '54', '61', '63', '65', '68', '70', '79', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92'] @@ -443,23 +446,23 @@ def rpp_test_suite_parser_and_validator(): passedCases = df['Test_Result'].eq('PASSED').sum() failedCases = df['Test_Result'].eq('FAILED').sum() - summaryRow = {'BatchPD_Augmentation_Type': pd.NA, - 'Tensor_Augmentation_Type': pd.NA, - 'Performance Speedup (%)': pd.NA, - 'Test_Result': f'Final Results of Tests: Passed: {passedCases}, Failed: {failedCases}'} + summaryRow = {'BatchPD_Augmentation_Type': None, + 'Tensor_Augmentation_Type': None, + 'Performance Speedup (%)': None, + 'Test_Result': 'Final Results of Tests: Passed: ' + str(passedCases) + ', Failed: ' + str(failedCases)} - print("\n", df.to_markdown()) + print("\n" + dataframe_to_markdown(df)) # Append the summary row to the DataFrame # Convert the dictionary to a DataFrame summaryRow = pd.DataFrame([summaryRow]) - df = pd.concat([df, summaryRow], ignore_index=True) + df = pd.concat([df, summaryRow], ignore_index=True, sort = True) df.to_excel(excelFilePath, index=False) print("\n-------------------------------------------------------------------" + resultsInfo + "\n\n-------------------------------------------------------------------") print("\nIMPORTANT NOTE:") print("- The following performance comparison shows Performance Speedup percentages between times measured on previous generation RPP-BatchPD APIs 
against current generation RPP-Tensor APIs.") - print(f"- All APIs have been improved for performance ranging from {0}% (almost same) to {100}% faster.") + print("- All APIs have been improved for performance ranging from " + str(0) + "% (almost same) to " + str(100) + "% faster.") print("- Random observations of negative speedups might always occur due to current test machine temperature/load variances or other CPU/GPU state-dependent conditions.") print("\n-------------------------------------------------------------------\n") elif (testType == 1 and qaMode == 0): diff --git a/utilities/test_suite/HOST/runVoxelTests.py b/utilities/test_suite/HOST/runVoxelTests.py index 3dbe0baa5..f44c05f78 100644 --- a/utilities/test_suite/HOST/runVoxelTests.py +++ b/utilities/test_suite/HOST/runVoxelTests.py @@ -58,20 +58,23 @@ def func_group_finder(case_number): return "miscellaneous" def run_unit_test_cmd(headerPath, dataPath, dstPathTemp, layout, case, numRuns, testType, qaMode, batchSize): - print(f"./Tensor_voxel_host {headerPath} {dataPath} {dstPathTemp} {layout} {case} {numRuns} {testType} {qaMode} {batchSize} {bitDepth}") - result = subprocess.run([buildFolderPath + "/build/Tensor_voxel_host", headerPath, dataPath, dstPathTemp, str(layout), str(case), str(numRuns), str(testType), str(qaMode), str(batchSize), str(bitDepth), scriptPath], stdout=subprocess.PIPE) # nosec - print(result.stdout.decode()) + print("./Tensor_voxel_host " + headerPath + " " + dataPath + " " + dstPathTemp + " " + str(layout) + " " + str(case) + " " + str(numRuns) + " " + str(testType) + " " + str(qaMode) + " " + str(batchSize) + " " + str(bitDepth)) + result = subprocess.Popen([buildFolderPath + "/build/Tensor_voxel_host", headerPath, dataPath, dstPathTemp, str(layout), str(case), str(numRuns), str(testType), str(qaMode), str(batchSize), str(bitDepth), scriptPath], stdout=subprocess.PIPE) # nosec + stdout_data, stderr_data = result.communicate() + print(stdout_data.decode()) 
     print("------------------------------------------------------------------------------------------")
 
 def run_performance_test_cmd(loggingFolder, logFileLayout, headerPath, dataPath, dstPathTemp, layout, case, numRuns, testType, qaMode, batchSize):
-    with open(f"{loggingFolder}/Tensor_voxel_host_{logFileLayout}_raw_performance_log.txt", "a") as logFile:
-        print(f"./Tensor_voxel_host {headerPath} {dataPath} {dstPathTemp} {layout} {case} {numRuns} {testType} {qaMode} {batchSize} {bitDepth}")
-        process = subprocess.Popen([buildFolderPath + "/build/Tensor_voxel_host", headerPath, dataPath, dstPathTemp, str(layout), str(case), str(numRuns), str(testType), str(qaMode), str(batchSize), str(bitDepth), scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True) # nosec
+    with open(loggingFolder + "/Tensor_voxel_host_" + logFileLayout + "_raw_performance_log.txt", "a") as logFile:
+        logFile.write("./Tensor_voxel_host " + headerPath + " " + dataPath + " " + dstPathTemp + " " + str(layout) + " " + str(case) + " " + str(numRuns) + " " + str(testType) + " " + str(qaMode) + " " + str(batchSize) + " " + str(bitDepth) + "\n")
+        process = subprocess.Popen([buildFolderPath + "/build/Tensor_voxel_host", headerPath, dataPath, dstPathTemp, str(layout), str(case), str(numRuns), str(testType), str(qaMode), str(batchSize), str(bitDepth), scriptPath], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) # nosec
         while True:
             output = process.stdout.readline()
             if not output and process.poll() is not None:
                 break
-            print(output.strip())
+            output = output.decode().strip() # Decode bytes to string and strip extra whitespace
+            print(output)
+            logFile.write(output)
             if "Running" in output or "max,min,avg wall times" in output:
                 cleanedOutput = ''.join(char for char in output if 32 <= ord(char) <= 126) # Remove control characters
                 cleanedOutput = cleanedOutput.strip() # Remove leading/trailing whitespace
@@ -203,8 +206,8 @@ def rpp_test_suite_parser_and_validator():
 os.chdir(buildFolderPath + "/build")
 
 # Run cmake and make commands
-subprocess.run(["cmake", scriptPath], cwd=".") # nosec
-subprocess.run(["make", "-j16"], cwd=".") # nosec
+subprocess.call(["cmake", scriptPath], cwd=".") # nosec
+subprocess.call(["make", "-j16"], cwd=".") # nosec
 
 # List of cases supported
 supportedCaseList = ['0', '1', '2', '3', '4', '5', '6']
diff --git a/utilities/test_suite/common.py b/utilities/test_suite/common.py
index 699495b39..19c0529df 100644
--- a/utilities/test_suite/common.py
+++ b/utilities/test_suite/common.py
@@ -27,6 +27,13 @@
 import sys
 import datetime
 import shutil
+import pandas as pd
+
+try:
+    from errno import FileExistsError
+except ImportError:
+    # Python 2 compatibility
+    FileExistsError = OSError
 
 try:
     from errno import FileExistsError
@@ -179,7 +186,7 @@ def case_file_check(CASE_FILE_PATH, TYPE, TENSOR_TYPE_LIST, new_file, d_counter)
 def directory_name_generator(qaMode, affinity, layoutType, case, path, func_group_finder):
     if qaMode == 0:
         functionality_group = func_group_finder(int(case))
-        dst_folder_temp = f"{path}/rpp_{affinity}_{layoutType}_{functionality_group}"
+        dst_folder_temp = path + "/rpp_" + affinity + "_" + layoutType + "_" + functionality_group
     else:
         dst_folder_temp = path
 
@@ -360,3 +367,22 @@ def func_group_finder(case_number):
         if case_number in value:
             return key
     return "miscellaneous"
+
+def dataframe_to_markdown(df):
+    # Calculate the maximum width of each column
+    column_widths = {}
+    for col in df.columns:
+        max_length = len(col)
+        for value in df[col]:
+            max_length = max(max_length, len(str(value)))
+        column_widths[col] = max_length
+
+    # Create the header row
+    md = '| ' + ' | '.join([col.ljust(column_widths[col]) for col in df.columns]) + ' |\n'
+    md += '| ' + ' | '.join(['-' * column_widths[col] for col in df.columns]) + ' |\n'
+
+    # Create the data rows
+    for i, row in df.iterrows():
+        md += '| ' + ' | '.join([str(value).ljust(column_widths[df.columns[j]]) for j, value in enumerate(row.values)]) + ' |\n'
+
+    return md
diff --git a/utilities/test_suite/rpp_test_suite_misc.h b/utilities/test_suite/rpp_test_suite_misc.h
index 0a4197caa..9ef118c48 100644
--- a/utilities/test_suite/rpp_test_suite_misc.h
+++ b/utilities/test_suite/rpp_test_suite_misc.h
@@ -98,6 +98,7 @@ void fill_roi_values(Rpp32u nDim, Rpp32u batchSize, Rpp32u *roiTensor, bool qaMo
         case 3:
         {
             std::array roi = {0, 0, 0, 50, 50, 8};
+            for(int i = 0, j = 0; i < batchSize ; i++, j += 6)
                 std::copy(roi.begin(), roi.end(), &roiTensor[j]);
             break;
             exit(0);

From 6e30d487bfda1f2d6625c24e2078cba03cc15808 Mon Sep 17 00:00:00 2001
From: Abishek <52214183+r-abishek@users.noreply.github.com>
Date: Tue, 23 Jul 2024 23:51:48 -0700
Subject: [PATCH 4/7] Doxyfile update - Add arithmetic and audio rppt groups (#404)

* Update Doxyfile to add arithmetic and audio rppt groups
* Update .Doxyfile
* Update Doxyfile
* Update .Doxyfile to alphabetical order inputs
* Update docs/doxygen/Doxyfile to alphabetical order inputs
* Update arguments addressing doxygen warnings
* Update rppt_tensor_color_augmentations.h to address doxygen warnings
* Update rppt_tensor_effects_augmentations.h to address doxygen warnings
* Update rppt_tensor_filter_augmentations.h to address doxygen warnings
* Update rppt_tensor_geometric_augmentations.h to address doxygen warnings
* Update rppt_tensor_logical_operations.h to address doxygen warnings
* Update rppt_tensor_morphological_operations.h to address doxygen warnings
* Update rppt_tensor_statistical_operations.h

---------

Co-authored-by: Kiriti Gowda
---
 .Doxyfile                                     | 10 ++--
 docs/doxygen/Doxyfile                         |  8 +--
 include/rppt_tensor_arithmetic_operations.h   |  6 +--
 include/rppt_tensor_color_augmentations.h     | 38 ++++++-------
 include/rppt_tensor_effects_augmentations.h   | 54 +++++++++----------
 include/rppt_tensor_filter_augmentations.h    |  4 +-
 include/rppt_tensor_geometric_augmentations.h | 50 ++++++++---------
 include/rppt_tensor_logical_operations.h      | 10 ++--
 .../rppt_tensor_morphological_operations.h    |  6 +--
 include/rppt_tensor_statistical_operations.h  | 22 ++++----
 10 files changed, 105 insertions(+), 103 deletions(-)

diff --git a/.Doxyfile b/.Doxyfile
index 066a53c02..dac8a3acc 100644
--- a/.Doxyfile
+++ b/.Doxyfile
@@ -960,16 +960,16 @@ INPUT = README.md \
                          include/rppi_logical_operations.h \
                          include/rppi_morphological_transforms.h \
                          include/rppi_statistical_operations.h \
+                         include/rppt_tensor_arithmetic_operations.h \
+                         include/rppt_tensor_audio_augmentations.h \
                          include/rppt_tensor_color_augmentations.h \
                          include/rppt_tensor_data_exchange_operations.h \
                          include/rppt_tensor_effects_augmentations.h \
                          include/rppt_tensor_filter_augmentations.h \
                          include/rppt_tensor_geometric_augmentations.h \
+                         include/rppt_tensor_logical_operations.h \
                          include/rppt_tensor_morphological_operations.h \
-                         include/rppt_tensor_statistical_operations.h \
-                         include/rppt_tensor_arithmetic_operations.h \
-                         include/rppt_tensor_audio_augmentations.h \
-                         include/rppt_tensor_logical_operations.h
+                         include/rppt_tensor_statistical_operations.h
 
 
 # This tag can be used to specify the character encoding of the source files
@@ -2381,7 +2381,7 @@ INCLUDE_FILE_PATTERNS =
 # recursively expanded use the := operator instead of the = operator.
 # This tag requires that the tag ENABLE_PREPROCESSING is set to YES.
 
-PREDEFINED = GPU_SUPPORT RPP_BACKEND_HIP HIP_COMPILE
+PREDEFINED = GPU_SUPPORT RPP_BACKEND_HIP HIP_COMPILE AUDIO_SUPPORT
 
 # If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then this
 # tag can be used to specify a list of macro names that should be expanded. The
The diff --git a/docs/doxygen/Doxyfile b/docs/doxygen/Doxyfile index 18d9a73bc..9773637df 100644 --- a/docs/doxygen/Doxyfile +++ b/docs/doxygen/Doxyfile @@ -962,14 +962,16 @@ INPUT = ../../README.md \ ../../include/rppi_logical_operations.h \ ../../include/rppi_morphological_transforms.h \ ../../include/rppi_statistical_operations.h \ + ../../include/rppt_tensor_arithmetic_operations.h \ + ../../include/rppt_tensor_audio_augmentations.h \ ../../include/rppt_tensor_color_augmentations.h \ ../../include/rppt_tensor_data_exchange_operations.h \ ../../include/rppt_tensor_effects_augmentations.h \ ../../include/rppt_tensor_filter_augmentations.h \ ../../include/rppt_tensor_geometric_augmentations.h \ + ../../include/rppt_tensor_logical_operations.h \ ../../include/rppt_tensor_morphological_operations.h \ - ../../include/rppt_tensor_statistical_operations.h \ - ../../include/rppt_tensor_logical_operations.h + ../../include/rppt_tensor_statistical_operations.h # This tag can be used to specify the character encoding of the source files @@ -2381,7 +2383,7 @@ INCLUDE_FILE_PATTERNS = # recursively expanded use the := operator instead of the = operator. # This tag requires that the tag ENABLE_PREPROCESSING is set to YES. -PREDEFINED = GPU_SUPPORT RPP_BACKEND_HIP HIP_COMPILE +PREDEFINED = GPU_SUPPORT RPP_BACKEND_HIP HIP_COMPILE AUDIO_SUPPORT # If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then this # tag can be used to specify a list of macro names that should be expanded. The diff --git a/include/rppt_tensor_arithmetic_operations.h b/include/rppt_tensor_arithmetic_operations.h index 4ffd24156..88a8d76a2 100644 --- a/include/rppt_tensor_arithmetic_operations.h +++ b/include/rppt_tensor_arithmetic_operations.h @@ -190,7 +190,7 @@ RppStatus rppt_subtract_scalar_gpu(RppPtr_t srcPtr, RpptGenericDescPtr srcGeneri * \retval RPP_SUCCESS Successful completion. * \retval RPP_ERROR* Unsuccessful completion. 
*/ -RppStatus rppt_multiply_scalar_host(RppPtr_t srcPtr, RpptGenericDescPtr srcGenericDescPtr, RppPtr_t dstPtr, RpptGenericDescPtr dstGenericDescPtr, Rpp32f *subtractTensor, RpptROI3DPtr roiGenericPtrSrc, RpptRoi3DType roiType, rppHandle_t rppHandle); +RppStatus rppt_multiply_scalar_host(RppPtr_t srcPtr, RpptGenericDescPtr srcGenericDescPtr, RppPtr_t dstPtr, RpptGenericDescPtr dstGenericDescPtr, Rpp32f *mulTensor, RpptROI3DPtr roiGenericPtrSrc, RpptRoi3DType roiType, rppHandle_t rppHandle); #ifdef GPU_SUPPORT /*! \brief Multiply scalar augmentation on HIP backend @@ -226,7 +226,7 @@ RppStatus rppt_multiply_scalar_gpu(RppPtr_t srcPtr, RpptGenericDescPtr srcGeneri * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -248,7 +248,7 @@ RppStatus rppt_magnitude_host(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDescPtr sr * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
diff --git a/include/rppt_tensor_color_augmentations.h b/include/rppt_tensor_color_augmentations.h index 3b39448eb..62ef13715 100644 --- a/include/rppt_tensor_color_augmentations.h +++ b/include/rppt_tensor_color_augmentations.h @@ -54,7 +54,7 @@ extern "C" { * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] alphaTensor alpha values for brightness calculation (1D tensor in HOST memory, of size batchSize, with 0 <= alpha <= 20 for each image in batch) * \param [in] betaTensor beta values for brightness calculation (1D tensor in HOST memory, of size batchSize, with 0 <= beta <= 255 for each image in batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -76,7 +76,7 @@ RppStatus rppt_brightness_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] alphaTensor alpha values for brightness calculation (1D tensor in pinned/HOST memory, of size batchSize, with 0 <= alpha <= 20 for each image in batch) * \param [in] betaTensor beta values for brightness calculation (1D tensor in pinned/HOST memory, of size batchSize, with 0 <= beta <= 255 for each image in batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -97,7 +97,7 @@ RppStatus rppt_brightness_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] gammaTensor gamma values for gamma correction calculation (1D tensor in HOST memory, of size batchSize with gamma >= 0 for each image in batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -118,7 +118,7 @@ RppStatus rppt_gamma_correction_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, Rp * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] gammaTensor gamma values for gamma correction calculation (1D tensor in pinned/HOST memory, of size batchSize with gamma >= 0 for each image in batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -141,7 +141,7 @@ RppStatus rppt_gamma_correction_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, Rpp * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] alphaTensor alpha values for alpha-blending (1D tensor in HOST memory, of size batchSize with the transparency factor transparency factor 0 <= alpha <= 1 for each image in batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -164,7 +164,7 @@ RppStatus rppt_blend_host(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDescPtr srcDes * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] alphaTensor alpha values for alpha-blending (1D tensor in pinned/HOST memory, of size batchSize with the transparency factor transparency factor 0 <= alpha <= 1 for each image in batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -188,7 +188,7 @@ RppStatus rppt_blend_gpu(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDescPtr srcDesc * \param [in] contrastTensor contrast modification parameter for color_twist calculation (1D tensor in HOST memory, of size batchSize with 0 < contrastTensor[i] <= 255 for each image in batch) * \param [in] hueTensor hue modification parameter for color_twist calculation (1D tensor in HOST memory, of size batchSize with 0 <= hueTensor[i] <= 359 for each image in batch) * \param [in] saturationTensor saturation modification parameter for color_twist calculation (1D tensor in HOST memory, of size batchSize with saturationTensor[i] >= 0 for each image in batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -212,7 +212,7 @@ RppStatus rppt_color_twist_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_ * \param [in] contrastTensor contrast modification parameter for color_twist calculation (1D tensor in pinned/HOST memory, of size batchSize with 0 < contrastTensor[i] <= 255 for each image in batch) * \param [in] hueTensor hue modification parameter for color_twist calculation (1D tensor in pinned/HOST memory, of size batchSize with 0 <= hueTensor[i] <= 359 for each image in batch) * \param [in] saturationTensor saturation modification parameter for color_twist calculation (1D tensor in pinned/HOST memory, of size batchSize with saturationTensor[i] >= 0 for each image in batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -236,7 +236,7 @@ RppStatus rppt_color_twist_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] contrastTensor contrast modification parameter for color_jitter calculation (1D tensor in HOST memory, of size batchSize with 0 < contrastTensor[i] <= 255 for each image in batch) * \param [in] hueTensor hue modification parameter for color_jitter calculation (1D tensor in HOST memory, of size batchSize with 0 <= hueTensor[i] <= 359 for each image in batch) * \param [in] saturationTensor saturation modification parameter for color_jitter calculation (1D tensor in HOST memory, of size batchSize with saturationTensor[i] >= 0 for each image in batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. @@ -257,7 +257,7 @@ RppStatus rppt_color_jitter_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] rgbTensor R/G/B values for color casting calculation (2D tensor in HOST memory, of size sizeof(RpptRGB) * batchSize with 0 <= rgbTensor[n]. 
<= 255 for each image in batch) * \param [in] alphaTensor alpha values for color casting calculation (1D tensor in HOST memory, of size sizeof(Rpp32f) * batchSize with alphaTensor[i] >= 0 for each image in batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. @@ -279,7 +279,7 @@ RppStatus rppt_color_cast_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] rgbTensor R/G/B values for color casting calculation (2D tensor in pinned/HOST memory, of size sizeof(RpptRGB) * batchSize with 0 <= rgbTensor[n]. 
<= 255 for each image in batch) * \param [in] alphaTensor alpha values for color casting calculation (1D tensor in pinned/HOST memory, of size sizeof(Rpp32f) * batchSize with alphaTensor[i] >= 0 for each image in batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. @@ -300,7 +300,7 @@ RppStatus rppt_color_cast_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] exposureFactorTensor exposure factor values for exposure adjustment (1D tensor in HOST memory, of size batchSize, with exposureFactorTensor[n] >= 0 for each image in the batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus 
enumeration. @@ -321,7 +321,7 @@ RppStatus rppt_exposure_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t d * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] exposureFactorTensor exposure factor values for exposure adjustment (1D tensor in pinned/HOST memory, of size batchSize, with exposureFactorTensor[n] >= 0 for each image in the batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -343,7 +343,7 @@ RppStatus rppt_exposure_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t ds * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] contrastFactorTensor contrast factor values for contrast calculation (1D tensor in HOST memory, of size batchSize with contrastFactorTensor[n] > 0 for each image in a batch)) * \param [in] contrastCenterTensor contrast center values for contrast calculation (1D tensor in HOST memory, of size batchSize) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -365,7 +365,7 @@ RppStatus rppt_contrast_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t d * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] contrastFactorTensor contrast factor values for contrast calculation (1D tensor in pinned/HOST memory, of size batchSize with contrastFactorTensor[n] > 0 for each image in a batch)) * \param [in] contrastCenterTensor contrast center values for contrast calculation (1D tensor in pinned/HOST memory, of size batchSize) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -386,7 +386,7 @@ RppStatus rppt_contrast_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t ds * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] lutPtr lut Array in HOST memory, containing a single integer look up table of length 65536, to be used for all images in the batch - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -407,7 +407,7 @@ RppStatus rppt_lut_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] lutPtr lut Array in pinned/HOST memory, containing a single integer look up table of length 65536, to be used for all images in the batch - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -428,7 +428,7 @@ RppStatus rppt_lut_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] adjustmentValueTensor adjustment values for color temperature calculation (1D tensor of size batchSize with -100 <= adjustmentValueTensor[i] >= 100 for each image in batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -449,7 +449,7 @@ RppStatus rppt_color_temperature_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, R * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] adjustmentValueTensor adjustment values for color temperature calculation (1D tensor of size batchSize with -100 <= adjustmentValueTensor[i] >= 100 for each image in batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
diff --git a/include/rppt_tensor_effects_augmentations.h b/include/rppt_tensor_effects_augmentations.h index 085d62b84..b185e0081 100644 --- a/include/rppt_tensor_effects_augmentations.h +++ b/include/rppt_tensor_effects_augmentations.h @@ -56,7 +56,7 @@ extern "C" { * \param [in] gridRatio gridRatio value for gridmask calculation = black square width / tileWidth (a single Rpp32f number with 0 <= gridRatio <= 1 that applies to all images in the batch) * \param [in] gridAngle gridAngle value for gridmask calculation = grid rotation angle in radians (a single Rpp32f number that applies to all images in the batch) * \param [in] translateVector translateVector for gridmask calculation = grid X and Y translation lengths in pixels (a single RpptUintVector2D x,y value pair that applies to all images in the batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -80,7 +80,7 @@ RppStatus rppt_gridmask_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t d * \param [in] gridRatio gridRatio value for gridmask calculation = black square width / tileWidth (a single Rpp32f number with 0 <= gridRatio <= 1 that applies to all images in the batch) * \param [in] gridAngle gridAngle value for gridmask calculation = grid rotation angle in radians (a single Rpp32f number that applies to all images in the batch) * \param [in] translateVector translateVector for gridmask calculation = grid X and Y translation lengths in pixels (a single RpptUintVector2D x,y value pair that applies to all images in the batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -103,7 +103,7 @@ RppStatus rppt_gridmask_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t ds * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] spatterColor RGB values to use for the spatter augmentation (A single set of 3 Rpp8u values as RpptRGB that applies to all images in the batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 1920 and roiTensorSrc[i].xywhROI.roiHeight <= 1080) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorPtrSrc[i].xywhROI.roiWidth <= 1920 and roiTensorPtrSrc[i].xywhROI.roiHeight <= 1080) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -126,7 +126,7 @@ RppStatus rppt_spatter_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t ds * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] spatterColor RGB values to use for the spatter augmentation (A single set of 3 Rpp8u values as RpptRGB that applies to all images in the batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 1920 and roiTensorSrc[i].xywhROI.roiHeight <= 1080) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorPtrSrc[i].xywhROI.roiWidth <= 1920 and roiTensorPtrSrc[i].xywhROI.roiHeight <= 1080) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -151,7 +151,7 @@ RppStatus rppt_spatter_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dst * \param [in] saltValueTensor A user-defined salt noise value (1D tensor in HOST memory, of size batchSize with 0 <= saltValueTensor[i] <= 1 for each image in batch) * \param [in] pepperValueTensor A user-defined pepper noise value (1D tensor in HOST memory, of size batchSize with 0 <= pepperValueTensor[i] <= 1 for each image in batch) * \param [in] seed A user-defined seed value (single Rpp32u value) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -176,7 +176,7 @@ RppStatus rppt_salt_and_pepper_noise_host(RppPtr_t srcPtr, RpptDescPtr srcDescPt * \param [in] saltValueTensor A user-defined salt noise value (1D tensor in pinned/HOST memory, of size batchSize with 0 <= saltValueTensor[i] <= 1 for each image in batch) * \param [in] pepperValueTensor A user-defined pepper noise value (1D tensor in pinned/HOST memory, of size batchSize with 0 <= pepperValueTensor[i] <= 1 for each image in batch) * \param [in] seed A user-defined seed value (single Rpp32u value) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -198,7 +198,7 @@ RppStatus rppt_salt_and_pepper_noise_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] shotNoiseFactorTensor shotNoiseFactor values for each image, which are used to compute the lambda values in a poisson distribution (1D tensor in HOST memory, of size batchSize with shotNoiseFactorTensor[i] >= 0 for each image in batch) * \param [in] seed A user-defined seed value (single Rpp32u value) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -220,7 +220,7 @@ RppStatus rppt_shot_noise_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] shotNoiseFactorTensor shotNoiseFactor values for each image, which are used to compute the lambda values in a poisson distribution (1D tensor in pinned/HOST memory, of size batchSize with shotNoiseFactorTensor[i] >= 0 for each image in batch) * \param [in] seed A user-defined seed value (single Rpp32u value) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -243,7 +243,7 @@ RppStatus rppt_shot_noise_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] meanTensor mean values for each image, which are used to compute the generalized Box-Mueller transforms in a gaussian distribution (1D tensor in HOST memory, of size batchSize with meanTensor[i] >= 0 for each image in batch) * \param [in] stdDevTensor stdDev values for each image, which are used to compute the generalized Box-Mueller transforms in a gaussian distribution (1D tensor in HOST memory, of size batchSize with stdDevTensor[i] >= 0 for each image in batch) * \param [in] seed A user-defined seed value (single Rpp32u value) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -266,7 +266,7 @@ RppStatus rppt_gaussian_noise_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppP * \param [in] meanTensor mean values for each image, which are used to compute the generalized Box-Mueller transforms in a gaussian distribution (1D tensor in pinned/HOST memory, of size batchSize with meanTensor[i] >= 0 for each image in batch) * \param [in] stdDevTensor stdDev values for each image, which are used to compute the generalized Box-Mueller transforms in a gaussian distribution (1D tensor in pinned/HOST memory, of size batchSize with stdDevTensor[i] >= 0 for each image in batch) * \param [in] seed A user-defined seed value (single Rpp32u value) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -289,7 +289,7 @@ RppStatus rppt_gaussian_noise_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPt * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] stdDevTensor stdDev values for each image, which are used to compute the generalized Box-Mueller transforms in a gaussian distribution (1D tensor in HOST memory, of size batchSize with stdDevTensor[i] >= 0 for each image in batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -312,7 +312,7 @@ RppStatus rppt_non_linear_blend_host(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDes * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] stdDevTensor stdDev values for each image, which are used to compute the generalized Box-Mueller transforms in a gaussian distribution (1D tensor in pinned/HOST memory, of size batchSize with stdDevTensor[i] >= 0 for each image in batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -338,7 +338,7 @@ RppStatus rppt_non_linear_blend_gpu(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDesc * \param[in] freqYTensor freqY values for water effect (1D tensor in HOST memory, of size batchSize) * \param[in] phaseXTensor amplitudeY values for water effect (1D tensor in HOST memory, of size batchSize) * \param[in] phaseYTensor amplitudeY values for water effect (1D tensor in HOST memory, of size batchSize) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -364,7 +364,7 @@ RppStatus rppt_water_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstP * \param[in] freqYTensor freqY values for water effect (1D tensor in pinned/HOST memory, of size batchSize) * \param[in] phaseXTensor amplitudeY values for water effect (1D tensor in pinned/HOST memory, of size batchSize) * \param[in] phaseYTensor amplitudeY values for water effect (1D tensor in pinned/HOST memory, of size batchSize) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -433,7 +433,7 @@ RppStatus rppt_ricap_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPt * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param[in] vignetteIntensityTensor intensity values to quantify vignette effect (1D tensor of size batchSize with 0 < vignetteIntensityTensor[n] for each image in batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -455,7 +455,7 @@ RppStatus rppt_vignette_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t d * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param[in] vignetteIntensityTensor intensity values to quantify vignette effect (1D tensor of size batchSize with 0 < vignetteIntensityTensor[n] for each image in batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. @@ -568,7 +568,7 @@ RppStatus rppt_gaussian_noise_voxel_gpu(RppPtr_t srcPtr, RpptGenericDescPtr srcD - Erase-region anchor boxes on each image given by the user must not overlap * \param [in] colorsTensor RGB values to use for each erase-region inside each image in the batch. (colors[i] will have range equivalent of srcPtr) * \param [in] numBoxesTensor number of erase-regions per image, for each image in the batch. 
(numBoxesTensor[n] >= 0) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. @@ -593,7 +593,7 @@ RppStatus rppt_erase_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstP - Erase-region anchor boxes on each image given by the user must not overlap * \param [in] colorsTensor RGB values to use for each erase-region inside each image in the batch. (colors[i] will have range equivalent of srcPtr) * \param [in] numBoxesTensor number of erase-regions per image, for each image in the batch. (numBoxesTensor[n] >= 0) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -603,13 +603,6 @@ RppStatus rppt_erase_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstP RppStatus rppt_erase_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, RpptRoiLtrb *anchorBoxInfoTensor, RppPtr_t colorsTensor, Rpp32u *numBoxesTensor, RpptROIPtr roiTensorPtrSrc, RpptRoiType roiType, rppHandle_t rppHandle); #endif // GPU_SUPPORT -/*! @} - */ - -#ifdef __cplusplus -} -#endif - /*! \brief Glitch augmentation on HOST backend for a NCHW/NHWC layout tensor * \details The glitch augmentation adds a glitch effect for a batch of RGB(3 channel) / greyscale(1 channel) images with an NHWC/NCHW tensor layout.
* - srcPtr depth ranges - Rpp8u (0 to 255), Rpp16f (0 to 1), Rpp32f (0 to 1), Rpp8s (-128 to 127). @@ -622,7 +615,7 @@ RppStatus rppt_erase_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPt * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] rgbOffsets RGB offset values to use for the glitch augmentation (A single set of 3 Rppi point values that applies to all images in the batch. * For each point and for each image in the batch: 0 < point.x < width, 0 < point.y < height) - * \param [in] roiTensorSrc ROI data for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. @@ -644,7 +637,7 @@ RppStatus rppt_glitch_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dst * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] rgbOffsets RGB offset values to use for the glitch augmentation (A 1D tensor in pinned/HOST memory contains single set of 3 Rppi point values that applies to all images in the batch. 
* For each point and for each image in the batch: 0 < point.x < width, 0 < point.y < height) - * \param [in] roiTensorSrc ROI data for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. @@ -653,4 +646,11 @@ RppStatus rppt_glitch_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dst */ RppStatus rppt_glitch_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, RpptChannelOffsets *rgbOffsets, RpptROIPtr roiTensorPtrSrc, RpptRoiType roiType, rppHandle_t rppHandle); #endif // GPU_SUPPORT + +/*! 
@} + */ + +#ifdef __cplusplus +} +#endif #endif // RPPT_TENSOR_EFFECTS_AUGMENTATIONS_H diff --git a/include/rppt_tensor_filter_augmentations.h b/include/rppt_tensor_filter_augmentations.h index 7ea8d00c6..992631c49 100644 --- a/include/rppt_tensor_filter_augmentations.h +++ b/include/rppt_tensor_filter_augmentations.h @@ -57,7 +57,7 @@ extern "C" { * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] kernelSize kernel size for box filter (a single Rpp32u odd number with kernelSize = 3/5/7/9 that applies to all images in the batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -83,7 +83,7 @@ RppStatus rppt_box_filter_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] stdDevTensor stdDev values for gaussian calculation (1D tensor in pinned/HOST memory, of size batchSize, for each image in batch) * \param [in] kernelSize kernel size for gaussian filter (a single Rpp32u odd number with kernelSize = 3/5/7/9 that applies to all images in the batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
diff --git a/include/rppt_tensor_geometric_augmentations.h b/include/rppt_tensor_geometric_augmentations.h index 884127a71..6da067844 100644 --- a/include/rppt_tensor_geometric_augmentations.h +++ b/include/rppt_tensor_geometric_augmentations.h @@ -52,7 +52,7 @@ extern "C" { * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -72,7 +72,7 @@ RppStatus rppt_crop_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPt * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -95,7 +95,7 @@ RppStatus rppt_crop_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr * \param [in] offsetTensor offset values for normalization (1D tensor in HOST memory, of size batchSize, with offsetTensor[n] <= 0) * \param [in] multiplierTensor multiplier values for normalization (1D tensor in HOST memory, of size batchSize, with multiplierTensor[n] > 0) * \param [in] mirrorTensor mirror flag values to set mirroring on/off (1D tensor in HOST memory, of size batchSize, with mirrorTensor[n] = 0/1) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -118,7 +118,7 @@ RppStatus rppt_crop_mirror_normalize_host(RppPtr_t srcPtr, RpptDescPtr srcDescPt * \param [in] offsetTensor offset values for normalization (1D tensor in pinned/HOST memory, of size batchSize, with offsetTensor[n] <= 0) * \param [in] multiplierTensor multiplier values for normalization (1D tensor in pinned/HOST memory, of size batchSize, with multiplierTensor[n] > 0) * \param [in] mirrorTensor mirror flag values to set mirroring on/off (1D tensor in pinned/HOST memory, of size batchSize, with mirrorTensor[n] = 0/1) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -140,7 +140,7 @@ RppStatus rppt_crop_mirror_normalize_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] affineTensor affine matrix values for transformation calculation (2D tensor in HOST memory, of size batchSize * 6 for each image in batch) * \param [in] interpolationType Interpolation type used (RpptInterpolationType::XYWH or RpptRoiType::LTRB) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -162,7 +162,7 @@ RppStatus rppt_warp_affine_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_ * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] affineTensor affine matrix values for transformation calculation (2D tensor in pinned/HOST memory, of size batchSize * 6 for each image in batch) * \param [in] interpolationType Interpolation type used (RpptInterpolationType::XYWH or RpptRoiType::LTRB) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -184,7 +184,7 @@ RppStatus rppt_warp_affine_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] horizontalTensor horizontal flag values to set horizontal flip on/off (1D tensor in HOST memory, of size batchSize, with horizontalTensor[i] = 0/1) * \param [in] verticalTensor vertical flag values to set vertical flip on/off (1D tensor in HOST memory, of size batchSize, with verticalTensor[i] = 0/1) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -206,7 +206,7 @@ RppStatus rppt_flip_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPt * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] horizontalTensor horizontal flag values to set horizontal flip on/off (1D tensor in pinned/HOST memory, of size batchSize, with horizontalTensor[i] = 0/1) * \param [in] verticalTensor vertical flag values to set vertical flip on/off (1D tensor in pinned/HOST memory, of size batchSize, with verticalTensor[i] = 0/1) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -228,7 +228,7 @@ RppStatus rppt_flip_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] dstImgSizes destination image sizes ( \ref RpptImagePatchPtr type pointer to array, in HOST memory, of size batchSize) * \param [in] interpolationType Interpolation type used in \ref RpptInterpolationType - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -250,7 +250,7 @@ RppStatus rppt_resize_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dst * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] dstImgSizes destination image sizes ( \ref RpptImagePatchPtr type pointer to array, in pinned/HOST memory, of size batchSize) * \param [in] interpolationType Interpolation type used in \ref RpptInterpolationType - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -275,7 +275,7 @@ RppStatus rppt_resize_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstP * \param [in] meanTensor mean value for each image in the batch (meanTensor[n] >= 0, 1D tensor in HOST memory, of size = batchSize for greyscale images, size = batchSize * 3 for RGB images)) * \param [in] stdDevTensor standard deviation value for each image in the batch (stdDevTensor[n] >= 0, 1D tensor in HOST memory, of size = batchSize for greyscale images, size = batchSize * 3 for RGB images) * \param [in] mirrorTensor mirror flag value to set mirroring on/off (1D tensor in HOST memory, of size batchSize, with mirrorTensor[n] = 0/1) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -300,7 +300,7 @@ RppStatus rppt_resize_mirror_normalize_host(RppPtr_t srcPtr, RpptDescPtr srcDesc * \param [in] meanTensor mean value for each image in the batch (meanTensor[n] >= 0, 1D tensor in pinned/HOST memory, of size = batchSize for greyscale images, size = batchSize * 3 for RGB images)) * \param [in] stdDevTensor standard deviation value for each image in the batch (stdDevTensor[n] >= 0, 1D tensor in pinned/HOST memory, of size = batchSize for greyscale images, size = batchSize * 3 for RGB images) * \param [in] mirrorTensor mirror flag value to set mirroring on/off (1D tensor in pinned/HOST memory, of size batchSize, with mirrorTensor[n] = 0/1) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -323,7 +323,7 @@ RppStatus rppt_resize_mirror_normalize_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescP * \param [in] dstImgSizes destination image sizes ( \ref RpptImagePatchPtr type pointer to array, in HOST memory, of size batchSize) * \param [in] interpolationType Interpolation type used in \ref RpptInterpolationType * \param [in] mirrorTensor mirror flag value to set mirroring on/off (1D tensor in HOST memory, of size batchSize, with mirrorTensor[n] = 0/1) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -346,7 +346,7 @@ RppStatus rppt_resize_crop_mirror_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, * \param [in] dstImgSizes destination image sizes ( \ref RpptImagePatchPtr type pointer to array, in pinned/HOST memory, of size batchSize) * \param [in] interpolationType Interpolation type used in \ref RpptInterpolationType * \param [in] mirrorTensor mirror flag value to set mirroring on/off (1D tensor in pinned/HOST memory, of size batchSize, with mirrorTensor[n] = 0/1) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -368,7 +368,7 @@ RppStatus rppt_resize_crop_mirror_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, R * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] angle image rotation angle in degrees - positive deg-anticlockwise/negative deg-clockwise (1D tensor in HOST memory, of size batchSize) * \param [in] interpolationType Interpolation type used (RpptInterpolationType::XYWH or RpptRoiType::LTRB) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -390,7 +390,7 @@ RppStatus rppt_rotate_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dst * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] angle image rotation angle in degrees - positive deg-anticlockwise/negative deg-clockwise (1D tensor in pinned/HOST memory, of size batchSize) * \param [in] interpolationType Interpolation type used (RpptInterpolationType::XYWH or RpptRoiType::LTRB) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -412,7 +412,7 @@ RppStatus rppt_rotate_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstP * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -434,7 +434,7 @@ RppStatus rppt_phase_host(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDescPtr srcDes * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -500,7 +500,7 @@ RppStatus rppt_slice_gpu(RppPtr_t srcPtr, RpptGenericDescPtr srcGenericDescPtr, * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] cropRoiTensor crop co-ordinates in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] patchRoiTensor patch co-ordinates in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) @@ -526,7 +526,7 @@ RppStatus rppt_crop_and_patch_host(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDescP * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source 
tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] cropRoiTensor crop co-ordinates in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] patchRoiTensor patch co-ordinates in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) @@ -598,7 +598,7 @@ RppStatus rppt_flip_voxel_gpu(RppPtr_t srcPtr, RpptGenericDescPtr srcGenericDesc * \param [in] colRemapTable Rpp32f column numbers in HOST memory for every pixel in the input batch of images (Restrictions - rois in the colRemapTable data for each image in batch must match roiTensorSrc) * \param [in] tableDescPtr rowRemapTable and colRemapTable common tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = F32, layout = NHWC, c = 1) * \param [in] interpolationType Interpolation type used in \ref RpptInterpolationType (Restrictions - Supports only NEAREST_NEIGHBOR and BILINEAR) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] 
rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. @@ -623,7 +623,7 @@ RppStatus rppt_remap_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstP * \param [in] colRemapTable Rpp32f column numbers in HIP memory for every pixel in the input batch of images (Restrictions - rois in the colRemapTable data for each image in batch must match roiTensorSrc) * \param [in] tableDescPtr rowRemapTable and colRemapTable common tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = F32, layout = NHWC, c = 1) * \param [in] interpolationType Interpolation type used in \ref RpptInterpolationType (Restrictions - Supports only NEAREST_NEIGHBOR and BILINEAR) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. @@ -650,7 +650,7 @@ RppStatus rppt_remap_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPt * \param [in] tableDescPtr table tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = F32, layout = NHWC, c = 1) * \param [in] cameraMatrixTensor contains camera intrinsic parameters required to compute lens corrected image. (1D tensor of size 9 * batchSize) * \param [in] distortionCoeffsTensor contains distortion coefficients required to compute lens corrected image. 
(1D tensor of size 8 * batchSize) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. @@ -677,7 +677,7 @@ RppStatus rppt_lens_correction_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, Rpp * \param [in] tableDescPtr table tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = F32, layout = NHWC, c = 1) * \param [in] cameraMatrixTensor contains camera intrinsic parameters required to compute lens corrected image. (1D tensor of size 9 * batchSize) * \param [in] distortionCoeffsTensor contains distortion coefficients required to compute lens corrected image. (1D tensor of size 8 * batchSize) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -728,4 +728,4 @@ RppStatus rppt_transpose_gpu(RppPtr_t srcPtr, RpptGenericDescPtr srcGenericDescP #ifdef __cplusplus } #endif -#endif // RPPT_TENSOR_GEOMETRIC_AUGMENTATIONS_H \ No newline at end of file +#endif // RPPT_TENSOR_GEOMETRIC_AUGMENTATIONS_H diff --git a/include/rppt_tensor_logical_operations.h b/include/rppt_tensor_logical_operations.h index 3a4685167..28dff69ce 100644 --- a/include/rppt_tensor_logical_operations.h +++ b/include/rppt_tensor_logical_operations.h @@ -54,7 +54,7 @@ extern "C" { * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -76,7 +76,7 @@ RppStatus rppt_bitwise_and_host(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDescPtr * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -98,7 +98,7 @@ RppStatus rppt_bitwise_and_gpu(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDescPtr s * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HOST memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -120,7 +120,7 @@ RppStatus rppt_bitwise_or_host(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDescPtr s * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -136,4 +136,4 @@ RppStatus rppt_bitwise_or_gpu(RppPtr_t srcPtr1, RppPtr_t srcPtr2, RpptDescPtr sr #ifdef __cplusplus } #endif -#endif // RPPT_TENSOR_LOGICAL_OPERATIONS_H \ No newline at end of file +#endif // RPPT_TENSOR_LOGICAL_OPERATIONS_H diff --git a/include/rppt_tensor_morphological_operations.h b/include/rppt_tensor_morphological_operations.h index eb879af5c..126c4757a 100644 --- a/include/rppt_tensor_morphological_operations.h +++ b/include/rppt_tensor_morphological_operations.h @@ -57,7 +57,7 @@ extern "C" { * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] kernelSize kernel size for box filter (a single Rpp32u odd number with kernelSize = 3/5/7/9 that applies to all images in the batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -82,7 +82,7 @@ RppStatus rppt_erode_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPt * \param [out] dstPtr destination tensor in HIP memory * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr) * \param [in] kernelSize kernel size for box filter (a single Rpp32u odd number with kernelSize = 3/5/7/9 that applies to all images in the batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -98,4 +98,4 @@ RppStatus rppt_dilate_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstP #ifdef __cplusplus } #endif -#endif // RPPT_TENSOR_MORPHOLOGICAL_OPERATIONS_H \ No newline at end of file +#endif // RPPT_TENSOR_MORPHOLOGICAL_OPERATIONS_H diff --git a/include/rppt_tensor_statistical_operations.h b/include/rppt_tensor_statistical_operations.h index 441816ea3..ca464340b 100644 --- a/include/rppt_tensor_statistical_operations.h +++ b/include/rppt_tensor_statistical_operations.h @@ -50,7 +50,7 @@ extern "C" { * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] tensorSumArr destination array in HOST memory * \param [in] tensorSumArrLength length of provided destination array (Restrictions - if srcDescPtr->c == 1 then tensorSumArrLength >= srcDescPtr->n, and if srcDescPtr->c == 3 then tensorSumArrLength >= srcDescPtr->n * 4) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorPtrSrc[i].xywhROI.roiWidth <= 3840 and roiTensorPtrSrc[i].xywhROI.roiHeight <= 2160) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -68,7 +68,7 @@ RppStatus rppt_tensor_sum_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] tensorSumArr destination array in HIP memory * \param [in] tensorSumArrLength length of provided destination array (Restrictions - if srcDescPtr->c == 1 then tensorSumArrLength >= srcDescPtr->n, and if srcDescPtr->c == 3 then tensorSumArrLength >= srcDescPtr->n * 4) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorPtrSrc[i].xywhROI.roiWidth <= 3840 and roiTensorPtrSrc[i].xywhROI.roiHeight <= 2160) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -86,7 +86,7 @@ RppStatus rppt_tensor_sum_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] minArr destination array in HOST memory * \param [in] minArrLength length of provided destination array (Restrictions - if srcDescPtr->c == 1 then tensorSumArrLength >= srcDescPtr->n, and if srcDescPtr->c == 3 then tensorSumArrLength >= srcDescPtr->n * 4) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorPtrSrc[i].xywhROI.roiWidth <= 3840 and roiTensorPtrSrc[i].xywhROI.roiHeight <= 2160) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -104,7 +104,7 @@ RppStatus rppt_tensor_min_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] minArr destination array in HIP memory * \param [in] minArrLength length of provided destination array (Restrictions - if srcDescPtr->c == 1 then tensorSumArrLength >= srcDescPtr->n, and if srcDescPtr->c == 3 then tensorSumArrLength >= srcDescPtr->n * 4) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -122,7 +122,7 @@ RppStatus rppt_tensor_min_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] maxArr destination array in HOST memory * \param [in] maxArrLength length of provided destination array (Restrictions - if srcDescPtr->c == 1 then tensorSumArrLength >= srcDescPtr->n, and if srcDescPtr->c == 3 then tensorSumArrLength >= srcDescPtr->n * 4) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -140,7 +140,7 @@ RppStatus rppt_tensor_max_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] maxArr destination array in HIP memory * \param [in] maxArrLength length of provided destination array (Restrictions - if srcDescPtr->c == 1 then tensorSumArrLength >= srcDescPtr->n, and if srcDescPtr->c == 3 then tensorSumArrLength >= srcDescPtr->n * 4) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -201,7 +201,7 @@ RppStatus rppt_normalize_gpu(RppPtr_t srcPtr, RpptGenericDescPtr srcGenericDescP * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] tensorMeanArr destination array in HOST memory * \param [in] tensorMeanArrLength length of provided destination array (Restrictions - if srcDescPtr->c == 1 then tensorMeanArrLength = srcDescPtr->n, and if srcDescPtr->c == 3 then tensorMeanArrLength = srcDescPtr->n * 4) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -219,7 +219,7 @@ RppStatus rppt_tensor_mean_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_ * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 1/3) * \param [out] tensorMeanArr destination array in HIP memory * \param [in] tensorMeanArrLength length of provided destination array (Restrictions - if srcDescPtr->c == 1 then tensorMeanArrLength = srcDescPtr->n, and if srcDescPtr->c == 3 then tensorMeanArrLength = srcDescPtr->n * 4) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -238,7 +238,7 @@ RppStatus rppt_tensor_mean_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t * \param [out] tensorStddevArr destination array in HOST memory * \param [in] tensorStddevArrLength length of provided destination array (Restrictions - if srcDescPtr->c == 1 then tensorStddevArrLength = srcDescPtr->n, and if srcDescPtr->c == 3 then tensorStddevArrLength = srcDescPtr->n * 4) * \param [in] meanTensor mean values for stddev calculation (1D tensor of size batchSize * 4 in format (MeanR, MeanG, MeanB, MeanImage) for each image in batch) - * \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) + * \param [in] roiTensorPtrSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HOST handle created with \ref rppCreateWithBatchSize() * \return A \ref RppStatus enumeration. 
@@ -257,7 +257,7 @@ RppStatus rppt_tensor_stddev_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPt * \param [out] tensorStddevArr destination array in HIP memory * \param [in] tensorStddevArrLength length of provided destination array (Restrictions - if srcDescPtr->c == 1 then tensorStddevArrLength = srcDescPtr->n, and if srcDescPtr->c == 3 then tensorStddevArrLength = srcDescPtr->n * 4) * \param [in] meanTensor mean values for stddev calculation (1D tensor of size batchSize * 4 in format (MeanR, MeanG, MeanB, MeanImage) for each image in batch) - * \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) + * \param [in] roiTensorPtrSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y)) | (Restrictions - roiTensorSrc[i].xywhROI.roiWidth <= 3840 and roiTensorSrc[i].xywhROI.roiHeight <= 2160) * \param [in] roiType ROI type used (RpptRoiType::XYWH or RpptRoiType::LTRB) * \param [in] rppHandle RPP HIP handle created with \ref rppCreateWithStreamAndBatchSize() * \return A \ref RppStatus enumeration. 
@@ -273,4 +273,4 @@ RppStatus rppt_tensor_stddev_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr #ifdef __cplusplus } #endif -#endif // RPPT_TENSOR_STATISTICAL_OPERATIONS_H \ No newline at end of file +#endif // RPPT_TENSOR_STATISTICAL_OPERATIONS_H From 9ef84bb5a4c947da56976e5fdc4d261e3fe6a11d Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 23 Jul 2024 23:52:21 -0700 Subject: [PATCH 5/7] Docs - Bump rocm-docs-core[api_reference] from 1.5.0 to 1.5.1 in /docs/sphinx (#406) Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.5.0 to 1.5.1. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.5.0...v1.5.1) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Kiriti Gowda --- docs/sphinx/requirements.in | 2 +- docs/sphinx/requirements.txt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sphinx/requirements.in b/docs/sphinx/requirements.in index a88668ba5..c316de276 100644 --- a/docs/sphinx/requirements.in +++ b/docs/sphinx/requirements.in @@ -1 +1 @@ -rocm-docs-core[api_reference]==1.5.0 +rocm-docs-core[api_reference]==1.5.1 diff --git a/docs/sphinx/requirements.txt b/docs/sphinx/requirements.txt index 54fbfde32..2c9286b18 100644 --- a/docs/sphinx/requirements.txt +++ b/docs/sphinx/requirements.txt @@ -110,7 +110,7 @@ requests==2.28.2 # via # pygithub # sphinx -rocm-docs-core[api-reference]==1.5.0 +rocm-docs-core[api-reference]==1.5.1 # via -r requirements.in smmap==5.0.0 # via gitdb From 3ab40bd266098681c795e12c7516adda23dcc234 Mon Sep 17 00:00:00 2001 From: sampath1117 Date: Thu, 25 Jul 2024 07:54:29 +0000 Subject: [PATCH 6/7] reverted unwanted changes happened with merge --- utilities/test_suite/HIP/Tensor_audio_hip.cpp | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/utilities/test_suite/HIP/Tensor_audio_hip.cpp b/utilities/test_suite/HIP/Tensor_audio_hip.cpp index 845f6705b..a2d4158c8 100644 --- a/utilities/test_suite/HIP/Tensor_audio_hip.cpp +++ b/utilities/test_suite/HIP/Tensor_audio_hip.cpp @@ -194,6 +194,7 @@ int main(int argc, char **argv) Rpp32f scaleRatio = outRateTensor[i] / inRateTensor[i]; srcDimsTensor[j] = srcLengthTensor[i]; srcDimsTensor[j + 1] = channelsTensor[i]; + dstDims[i].width = static_cast(std::ceil(scaleRatio * srcLengthTensor[i])); dstDims[i].height = 1; maxDstWidth = std::max(maxDstWidth, static_cast(dstDims[i].width)); } @@ -221,12 +222,19 @@ int main(int argc, char **argv) } default: { - printf("\nThe functionality %s doesn't yet exist in RPP\n", func.c_str()); - return -1; + missingFuncFlag = 1; + break; 
} } CHECK_RETURN_STATUS(hipDeviceSynchronize()); + endWallTime = omp_get_wtime(); + if (missingFuncFlag == 1) + { + printf("\nThe functionality %s doesn't yet exist in RPP\n", func.c_str()); + return -1; + } + wallTime = endWallTime - startWallTime; maxWallTime = std::max(maxWallTime, wallTime); minWallTime = std::min(minWallTime, wallTime); From 30bf80c14e6f144d983c8e61edf398acf9c6e396 Mon Sep 17 00:00:00 2001 From: sampath1117 Date: Thu, 25 Jul 2024 07:56:30 +0000 Subject: [PATCH 7/7] remove duplicate code added in merge --- utilities/test_suite/common.py | 16 ++-------------- 1 file changed, 2 insertions(+), 14 deletions(-) diff --git a/utilities/test_suite/common.py b/utilities/test_suite/common.py index 15b2c701b..a0f37ffa2 100644 --- a/utilities/test_suite/common.py +++ b/utilities/test_suite/common.py @@ -29,18 +29,6 @@ import shutil import pandas as pd -try: - from errno import FileExistsError -except ImportError: - # Python 2 compatibility - FileExistsError = OSError - -try: - from errno import FileExistsError -except ImportError: - # Python 2 compatibility - FileExistsError = OSError - try: from errno import FileExistsError except ImportError: @@ -386,9 +374,9 @@ def dataframe_to_markdown(df): # Create the header row md = '| ' + ' | '.join([col.ljust(column_widths[col]) for col in df.columns]) + ' |\n' md += '| ' + ' | '.join(['-' * column_widths[col] for col in df.columns]) + ' |\n' - + # Create the data rows for i, row in df.iterrows(): md += '| ' + ' | '.join([str(value).ljust(column_widths[df.columns[j]]) for j, value in enumerate(row.values)]) + ' |\n' - + return md