Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPP Audio Support HIP - Downmixing #401

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
67 commits
Select commit Hold shift + click to select a range
c33af22
Bump rocm-docs-core[api_reference] from 0.35.0 to 0.35.1 in /docs/sph…
dependabot[bot] Mar 6, 2024
14f6334
Bump rocm-docs-core[api_reference] from 0.35.1 to 0.36.0 in /docs/sph…
dependabot[bot] Mar 12, 2024
95c3272
Merge branch 'master' into develop
kiritigowda Mar 12, 2024
3973c34
added api support for ToDecibels HIP kernel
sampath1117 Mar 19, 2024
3f08f90
added test suite support for audio in HIP
sampath1117 Mar 8, 2024
b9c0788
added profiler support for hip test suite
sampath1117 Mar 18, 2024
641f653
Docs - Bump rocm-docs-core[api_reference] from 0.36.0 to 0.37.0 in /d…
dependabot[bot] Mar 20, 2024
5568573
Link cleanup (#326)
LisaDelaney Mar 20, 2024
a6749ba
Update notes
LisaDelaney Mar 20, 2024
a255906
Docs - Bump rocm-docs-core[api_reference] from 0.37.0 to 0.37.1 in /d…
dependabot[bot] Mar 22, 2024
d3df761
RPP Voxel Flip on HIP and HOST (#285)
r-abishek Mar 23, 2024
ebecb42
RPP Vignette Tensor on HOST and HIP (#311)
r-abishek Mar 23, 2024
fc1410b
Bump rocm-docs-core[api_reference] from 0.37.1 to 0.38.0 in /docs/sph…
dependabot[bot] Mar 27, 2024
cb3d539
added initial api and test suite support for downmixing hip kernel
sampath1117 Mar 29, 2024
7fe2a0e
Merge branch 'develop' into sr/downmixing_hip
sampath1117 Mar 29, 2024
e80533a
initial working commit for downmixing hip kernel
sampath1117 Mar 29, 2024
50a743f
added support fo copy input to output when channels is 1
sampath1117 Mar 29, 2024
fe06106
minor code cleanup
sampath1117 Apr 1, 2024
3ebd7c3
RPP Tensor Audio Support - Resample (#310)
r-abishek Apr 3, 2024
76f31df
Docs - Missing input and output images for Doxygen (#331)
r-abishek Apr 3, 2024
b83f910
Scratch buffers rename for HOST and HIP (#324)
r-abishek Apr 3, 2024
ebeb131
Update CMakeLists.txt
kiritigowda Apr 3, 2024
7c194b2
Merge branch 'develop' into sr/downmixing_hip
sampath1117 Apr 8, 2024
cb06f7f
added missing hipDeviceSynchronize() in test suite
sampath1117 Apr 8, 2024
0e51993
removed f16 includes since not needed for audio
sampath1117 Apr 4, 2024
004e1d6
restructured python test suite
sampath1117 Apr 4, 2024
b8f5c60
fixed spacing in Doxygen
sampath1117 Apr 11, 2024
1147bfe
Update CMakeLists.txt
kiritigowda Apr 12, 2024
9d48447
Merge remote-tracking branch 'develop' into sr/downmixing_hip
sampath1117 Apr 16, 2024
5e3fc7a
Bump rocm-docs-core[api_reference] from 0.38.1 to 1.0.0 in /docs/sphi…
dependabot[bot] Apr 18, 2024
b6b7cc5
Bump rocm-docs-core[api_reference] from 1.0.0 to 1.1.0 in /docs/sphin…
dependabot[bot] Apr 25, 2024
e16ad7a
RPP Gaussian Noise Voxel Tensor on HOST and HIP (#323)
r-abishek Apr 26, 2024
9394c78
Merge branch 'develop' into sr/downmixing_hip
sampath1117 Apr 30, 2024
06263a5
modify CHECK to CHECK_RETURN_STATUS
sampath1117 Apr 30, 2024
a7e71a3
Merge branch 'develop' into sr/downmixing_hip
sampath1117 May 2, 2024
77e14ef
Minor common-fixes for HIP (#345)
r-abishek May 7, 2024
34f3f6d
Readme Updates: --usecase=rocm (#349)
kiritigowda May 8, 2024
ab52683
RPP Tensor Audio Support - Spectrogram (#312)
r-abishek May 8, 2024
ee0d6fe
Update CHANGELOG.md (#352)
r-abishek May 8, 2024
2decd32
RPP Tensor Audio Support - Slice (#325)
r-abishek May 8, 2024
30ce1d6
RPP Tensor Audio Support - MelFilterBank (#332)
r-abishek May 8, 2024
64ae74f
RPP Tensor Normalize ND on HOST and HIP (#335)
r-abishek May 9, 2024
1a3015c
SWDEV-459739 - Remove the package obsolete setting (#353)
raramakr May 9, 2024
b926816
Merge branch 'develop' into sr/downmixing_hip
sampath1117 May 9, 2024
7cb3c03
changed globalThreads_z to use batchsize from description pointer
sampath1117 May 9, 2024
4cb8d4b
Audio support merge commit fixes (#354)
r-abishek May 9, 2024
8aab10c
Merge branch 'develop' into sr/downmixing_hip
sampath1117 May 17, 2024
9d68c49
removed if else block based on channels inside kernel
sampath1117 May 17, 2024
195e4a4
rename instances of tensor_hip_audio to tensor_audio_hip
sampath1117 May 17, 2024
807bbe8
modified verify_output to have different cutoff for HIP and HOST back…
sampath1117 May 15, 2024
b8264f7
vectorized channels loop in hip kernel
sampath1117 May 19, 2024
5cf7360
Merge branch 'develop' into sr/downmixing_hip
sampath1117 May 30, 2024
d62e9cb
Merge branch 'develop' into sr/downmixing_hip
sampath1117 Jun 18, 2024
239bf76
removed the multiplication with normalizeWeight for every channel value
sampath1117 Jun 18, 2024
fd21921
vectorized the writes for hip kernel
sampath1117 Jun 18, 2024
23072da
moved constant compute outside the loop
sampath1117 Jun 18, 2024
3b00c64
reverted back to unvectorized writes version
sampath1117 Jun 21, 2024
0c71365
Merge branch 'develop' into sr/downmixing_hip
sampath1117 Jul 12, 2024
a2e321e
Merge pull request #261 from sampath1117/sr/downmixing_hip
r-abishek Jul 16, 2024
315b64b
audio test suite changes for python 2 compatibility
sampath1117 Jul 17, 2024
7f9635a
added validation checks for numDims
sampath1117 Jul 17, 2024
611fb3f
Merge pull request #295 from sampath1117/sr/down_mixing_hip_pr_changes
r-abishek Jul 17, 2024
76ee4fe
Merge branch 'develop' into ar/audio_support_4_down_mixing_hip
kiritigowda Jul 23, 2024
0745bc0
Merge branch 'develop' into sr/down_mixing_develop_merge
sampath1117 Jul 25, 2024
4aabed4
removed duplicate code added in merge
sampath1117 Jul 25, 2024
940e7a0
Merge pull request #298 from sampath1117/sr/down_mixing_develop_merge
r-abishek Jul 25, 2024
af1a238
Merge branch 'develop' of github.com:ROCm/rpp into ar/audio_support_4…
r-abishek Aug 1, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 19 additions & 2 deletions include/rppt_tensor_audio_augmentations.h
Original file line number Diff line number Diff line change
Expand Up @@ -141,9 +141,9 @@ RppStatus rppt_pre_emphasis_filter_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr,
/*! \brief Down Mixing augmentation on HOST backend
* \details Down Mixing augmentation for audio data
* \param [in] srcPtr source tensor in HOST memory
* \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32)
* \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 2 or 3 (for single-channel or multi-channel audio tensor), offsetInBytes >= 0, dataType = F32)
* \param [out] dstPtr destination tensor in HOST memory
* \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 3, offsetInBytes >= 0, dataType = F32)
* \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 2, offsetInBytes >= 0, dataType = F32)
* \param [in] srcDimsTensor source audio buffer length and number of channels (1D tensor in HOST memory, of size batchSize * 2)
* \param [in] normalizeWeights bool flag to specify if normalization of weights is needed
* \param [in] rppHandle RPP HOST handle created with <tt>\ref rppCreateWithBatchSize()</tt>
Expand All @@ -153,6 +153,23 @@ RppStatus rppt_pre_emphasis_filter_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr,
*/
RppStatus rppt_down_mixing_host(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32s *srcDimsTensor, bool normalizeWeights, rppHandle_t rppHandle);

#ifdef GPU_SUPPORT
/*! \brief Down Mixing augmentation on HIP backend
* \details Down Mixing augmentation for audio data
* \param [in] srcPtr source tensor in HIP memory
* \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 2 or 3 (for single-channel or multi-channel audio tensor), offsetInBytes >= 0, dataType = F32)
* \param [out] dstPtr destination tensor in HIP memory
* \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 2, offsetInBytes >= 0, dataType = F32)
* \param [in] srcDimsTensor source audio buffer length and number of channels (1D tensor in HIP/Pinned memory, of size batchSize * 2)
* \param [in] normalizeWeights bool flag to specify if normalization of weights is needed
* \param [in] rppHandle RPP HIP handle created with <tt>\ref rppCreateWithStreamAndBatchSize()</tt>
* \return A <tt> \ref RppStatus</tt> enumeration.
* \retval RPP_SUCCESS Successful completion.
* \retval RPP_ERROR* Unsuccessful completion.
*/
RppStatus rppt_down_mixing_gpu(RppPtr_t srcPtr, RpptDescPtr srcDescPtr, RppPtr_t dstPtr, RpptDescPtr dstDescPtr, Rpp32s *srcDimsTensor, bool normalizeWeights, rppHandle_t rppHandle);
#endif // GPU_SUPPORT

/*! \brief Produces a spectrogram from a 1D audio buffer on HOST backend
* \details Spectrogram for 1D audio buffer
* \param [in] srcPtr source tensor in HOST memory
Expand Down
1 change: 1 addition & 0 deletions src/modules/hip/hip_tensor_audio_augmentations.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ SOFTWARE.
#define HIP_TENSOR_AUDIO_AUGMENTATIONS_HPP

#include "kernel/non_silent_region_detection.hpp"
#include "kernel/down_mixing.hpp"
#include "kernel/to_decibels.hpp"

#endif // HIP_TENSOR_AUDIO_AUGMENTATIONS_HPP
72 changes: 72 additions & 0 deletions src/modules/hip/kernel/down_mixing.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
#include <hip/hip_runtime.h>
#include "rpp_hip_common.hpp"

__global__ void down_mixing_hip_tensor(float *srcPtr,
uint srcStride,
float *dstPtr,
uint dstStride,
int2 *srcDimsTensor)

{
int id_x = hipBlockIdx_x * hipBlockDim_x + hipThreadIdx_x;
int id_z = hipBlockIdx_z * hipBlockDim_z + hipThreadIdx_z;
int srcLength = srcDimsTensor[id_z].x;
int channels = srcDimsTensor[id_z].y;

if (id_x >= srcLength)
return;

float outVal = 0.0f;
uint srcIdx = id_z * srcStride + id_x * channels;
int i = 0;
int alignedChannels = (channels / 8) * 8;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is rounding down. Is that what you want?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes Rajy
Since we want to do 8 pixel vectorized processing for alignedChannels value


// do 8 pixel load till alignedChannels value
if (alignedChannels)
{
d_float8 outVal_f8;
outVal_f8.f4[0] = static_cast<float4>(0.0f);
outVal_f8.f4[1] = outVal_f8.f4[0];
for(; i < alignedChannels; i += 8, srcIdx += 8)
{
d_float8 src_f8;
rpp_hip_load8_and_unpack_to_float8(srcPtr + srcIdx, &src_f8);
rpp_hip_math_add8(&outVal_f8, &src_f8, &outVal_f8);
}
outVal_f8.f4[0] += outVal_f8.f4[1];
outVal += (outVal_f8.f1[0] + outVal_f8.f1[1] + outVal_f8.f1[2] + outVal_f8.f1[3]);
}
// process remaining channels
for(; i < channels; i++, srcIdx++)
outVal += srcPtr[srcIdx];
outVal *= (1.f / channels);

uint dstIdx = id_z * dstStride + id_x;
dstPtr[dstIdx] = outVal;
}

RppStatus hip_exec_down_mixing_tensor(Rpp32f *srcPtr,
RpptDescPtr srcDescPtr,
Rpp32f *dstPtr,
RpptDescPtr dstDescPtr,
Rpp32s *srcDimsTensor,
bool normalizeWeights,
rpp::Handle& handle)
{
Rpp32s globalThreads_x = dstDescPtr->strides.nStride;
Rpp32s globalThreads_y = 1;
Rpp32s globalThreads_z = dstDescPtr->n;

hipLaunchKernelGGL(down_mixing_hip_tensor,
dim3(ceil((Rpp32f)globalThreads_x/LOCAL_THREADS_X_1DIM), ceil((Rpp32f)globalThreads_y/LOCAL_THREADS_Y_1DIM), ceil((Rpp32f)globalThreads_z/LOCAL_THREADS_Z_1DIM)),
dim3(LOCAL_THREADS_X_1DIM, LOCAL_THREADS_Y_1DIM, LOCAL_THREADS_Z_1DIM),
0,
handle.GetStream(),
srcPtr,
srcDescPtr->strides.nStride,
dstPtr,
dstDescPtr->strides.nStride,
reinterpret_cast<int2 *>(srcDimsTensor));

return RPP_SUCCESS;
}
36 changes: 36 additions & 0 deletions src/modules/rppt_tensor_audio_augmentations.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -362,5 +362,41 @@ RppStatus rppt_to_decibels_gpu(RppPtr_t srcPtr,
#endif // backend
}

/******************** down_mixing ********************/

RppStatus rppt_down_mixing_gpu(RppPtr_t srcPtr,
RpptDescPtr srcDescPtr,
RppPtr_t dstPtr,
RpptDescPtr dstDescPtr,
Rpp32s *srcDimsTensor,
bool normalizeWeights,
rppHandle_t rppHandle)
{
#ifdef HIP_COMPILE
Rpp32u tensorDims = srcDescPtr->numDims - 1; // exclude batchsize from input dims
if (tensorDims != 1 && tensorDims != 2)
return RPP_ERROR_INVALID_SRC_DIMS;

if ((srcDescPtr->dataType == RpptDataType::F32) && (dstDescPtr->dataType == RpptDataType::F32))
{
hip_exec_down_mixing_tensor(static_cast<Rpp32f*>(srcPtr),
srcDescPtr,
static_cast<Rpp32f*>(dstPtr),
dstDescPtr,
srcDimsTensor,
normalizeWeights,
rpp::deref(rppHandle));
}
else
{
return RPP_ERROR_NOT_IMPLEMENTED;
}

return RPP_SUCCESS;
#elif defined(OCL_COMPILE)
return RPP_ERROR_NOT_IMPLEMENTED;
#endif // backend
}

#endif // GPU_SUPPORT
#endif // AUDIO_SUPPORT
28 changes: 28 additions & 0 deletions utilities/test_suite/HIP/Tensor_audio_hip.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,10 @@ int main(int argc, char **argv)
set_audio_descriptor_dims_and_strides(srcDescPtr, batchSize, maxSrcHeight, maxSrcWidth, maxSrcChannels, offsetInBytes);
int maxDstChannels = maxSrcChannels;
if(testCase == 3)
{
srcDescPtr->numDims = 3;
maxDstChannels = 1;
}
set_audio_descriptor_dims_and_strides(dstDescPtr, batchSize, maxDstHeight, maxDstWidth, maxDstChannels, offsetInBytes);

// set buffer sizes for src/dst
Expand All @@ -131,6 +134,11 @@ int main(int argc, char **argv)
CHECK_RETURN_STATUS(hipHostMalloc(&srcDims, batchSize * sizeof(RpptImagePatch)));
CHECK_RETURN_STATUS(hipHostMalloc(&dstDims, batchSize * sizeof(RpptImagePatch)));

// allocate the buffer for srcDimsTensor
Rpp32s *srcDimsTensor;
if(testCase == 3)
CHECK_RETURN_STATUS(hipHostMalloc(&srcDimsTensor, batchSize * 2 * sizeof(Rpp32s)));

Rpp32s *detectedIndex = nullptr, *detectionLength = nullptr;
if(testCase == 0)
{
Expand Down Expand Up @@ -190,6 +198,24 @@ int main(int argc, char **argv)

break;
}
case 3:
{
testCaseName = "down_mixing";
bool normalizeWeights = false;

for (int i = 0, j = 0; i < batchSize; i++, j += 2)
{
srcDimsTensor[j] = srcLengthTensor[i];
srcDimsTensor[j + 1] = channelsTensor[i];
dstDims[i].height = srcLengthTensor[i];
dstDims[i].width = 1;
}

startWallTime = omp_get_wtime();
rppt_down_mixing_gpu(d_inputf32, srcDescPtr, d_outputf32, dstDescPtr, srcDimsTensor, normalizeWeights, handle);

break;
}
default:
{
missingFuncFlag = 1;
Expand Down Expand Up @@ -263,6 +289,8 @@ int main(int argc, char **argv)
CHECK_RETURN_STATUS(hipHostFree(channelsTensor));
CHECK_RETURN_STATUS(hipHostFree(srcDims));
CHECK_RETURN_STATUS(hipHostFree(dstDims));
if(testCase == 3)
CHECK_RETURN_STATUS(hipHostFree(srcDimsTensor));
if (detectedIndex != nullptr)
CHECK_RETURN_STATUS(hipHostFree(detectedIndex));
if (detectionLength != nullptr)
Expand Down
4 changes: 2 additions & 2 deletions utilities/test_suite/HIP/runAudioTests.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@
outFolderPath = os.getcwd()
buildFolderPath = os.getcwd()
caseMin = 0
caseMax = 1
caseMax = 3


# Get a list of log files based on a flag for preserving output
Expand Down Expand Up @@ -224,7 +224,7 @@ def rpp_test_suite_parser_and_validator():
subprocess.call(["make", "-j16"], cwd=".") # nosec

# List of cases supported
supportedCaseList = ['0', '1']
supportedCaseList = ['0', '1', '3']
if qaMode and batchSize != 3:
print("QA tests can only run with a batch size of 3.")
exit(0)
Expand Down
3 changes: 3 additions & 0 deletions utilities/test_suite/HOST/Tensor_audio_host.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,10 @@ int main(int argc, char **argv)
set_audio_descriptor_dims_and_strides(srcDescPtr, batchSize, maxSrcHeight, maxSrcWidth, maxSrcChannels, offsetInBytes);
int maxDstChannels = maxSrcChannels;
if(testCase == 3)
{
srcDescPtr->numDims = 3;
maxDstChannels = 1;
}
set_audio_descriptor_dims_and_strides(dstDescPtr, batchSize, maxDstHeight, maxDstWidth, maxDstChannels, offsetInBytes);

// create generic descriptor in case of slice
Expand Down