SSE optimization for flip and rotate #10
Closed
rrawther changed the title from "SSE optimization for flip" to "SSE optimization for flip and rotate" on Nov 15, 2019
@Reza-Najafi waiting for @rrawther to fix conflicts
Forget Rajy's branch; I merged everything manually into the develop branch.
Quoted message from LakshmiKumar23 (Friday, November 15, 2019 2:56:48 PM): @Reza-Najafi waiting for @rrawther to fix conflicts
LokeshBonta pushed a commit to LokeshBonta/rpp that referenced this pull request on Aug 6, 2020
kiritigowda pushed a commit that referenced this pull request on Aug 25, 2020
* Changed Channel extract and channel combine function call * updated erode dilate kernals [OCL] * Non Working [FULLY BUILD] code for min_max_loc and mean_stddev * Updated Rain GPU kernel for multiple destination image calls [OCL] * Updated Median, Non Max and Histogram and added support for mean * Updated tensor [OCL] * updated table lookup [OCL] * small updates in mean and stddev [OCL] * Full functioning code for mean and standard deviation [OCL] * Added Support to Min Max Location [OCL] * Added support for gaussian_image_pyramid [OCL] * Added support for laplacian_image_pyramid [OCL] * small modification in LIP [OCL] * small modification in Min Max Location and Mean stddev [OCL] * box filter hisEq [OCL] * Added support for gaussian filter * Added support for bin in Histogram [OCL] * updated sobel [OCL] * Update in Temperature [CPU] * FIX SNP CPU half noise issue [OCL] * fin small change in Absolute difference [OCL] * Small changes in Custom convolution and table lookup [OCL} * Fix regressions due to scripting [cl & CPU]. * fix histogram [OCL] * Updated snow [OCL] * updated snow [CPU] * small update in Snow [OCL] * Modify filter_operations to add gaussian_filter with same backend as blur * Fix issue with rain Grey Scale [OCL] * Fix Rain GPU Transparancy [OCL] * Add Kernel Caching using Map/Kernelmanger * Resolved histogram grayscale issue in GPU * Resolved histogram grayscale issue in GPU * Fix the bug in warp affine planar call * Fix issue with resize crop validation [cl & CPU]. * Cl_enque_buffer, the argument is set to CL_FALSE * Fix Gamma correction [OCL] * minor changes to gamma_correction, vignette commons, flip functionalities * Fix Jitter with new Implementation [CPU & OCL] * Modify brightness bug that gave patches in output * Fix the buy with Lens correction [OCL]. 
* Fix Median filter issue * Modify rotate to match GPU functionality * Fix median filter * merge abi-dev-host-ms4 to main-hipcl-dev * Fix a round about fix for Hue and Saturation Shift * Modify scale to match GPU functionality * Fix a round about fix for Hue and Saturation Shift * Fix syntax error in hsvkernel * changed CL_False to CL_True in minmax location * Resolve merg * Modify warp affine to match GPU - inversion exists * Add validation for Warp Affine Matrix * changes in Warp Affine * Add Blocking calls [CL_TRUE flag is on] * Removed validation printf statements in the library * Removed a syntax error * Add extra validation for contrast * Fix issue with rain [CPU] * Added support to new Pixelate [OCL & CPU] * Fix issue with Fish eye [OCL] * Modify Histogram Implementation * Histogram Balance Fix * Update Readme.md Amended the list * Update Readme.md * Fix Histogram Planar Version * Add new support to Histogram [OCL] * Remove all files to include batch version * Move Mem-Mgmt_HIP branch files to master * Update Readme.md * Put all the recent changes or RPP here * Fix Border issues in crop mirror normalize and crop * Fix Crop mirror normalize border issue * Add RPP UnitTests * Add f32 support for crop_mirror_normalize * Add f32 support for crop * Add f32 support for resize_crop_mirror * Add f32 support for resize and resize_crop * Add f32 support for color_twist * Correct blur * Add f32 support for rotate * Add f16 host support for rotate, resize, resize_crop, crop, resize_crop_mirror, crop_mirror_normalize, color_twist * Major changes to host test suite * Separate host test suites for pkd3 and pln1 * modify rpp_unittests host * correct additional folder creation and readme * Minor correction in pln1/pkd3 host test scripts * Add basic float tensor support * Add FP32 and FP16 support for Crop function * Fix bug in crop * crop mirror normalize report * Float Support for Rotate GPU * Add Kernel Support in OCL for colorTwist and resize funtionalities * Add float 
support for ColorTwist and Resize Crop Mirror - FP16 and FP32 * Code Refactoring and Rotate Support for FP16 and FP32 * Fix Rotate Float issue * Fix FP32 Rotate Issue * Add Resize Function * Add Resize Crop Mirror in GPU OCL * Fix Typo * Add Resize Crop GPU FP16 and FP32 support * Update rppdefs.h * Crop Mirror Normalize Support is added * Support for ColorTwist in Float space * Update Colortwwist.cl - temp * Remove MIOPEN dependency in RPP build set-up * Update colortwist.cl * Fix Bug in ColorTwist * Fix Bug in ColorTwist (#6) * API refactoring for fused_functions * Fix make_data-type bug and code formatting * Testsuite for Float Support Functions * Removed the brace in switchcase * Add free statements for unreleased memomry and f16 fix for colortwist * rename folders * Fix Resize for U8 case * minor change in BatchPD host * Fix type error in resize.cl * Fix float errors for resize fucntions * foramt file * Fix Bug in ColorTwist (#6) (#8) * Fix Bug in ColorTwist (#6) (#8) (#9) * Update * update (#10) * Fix Bug in ColorTwist (#6) * Fix Bug in ColorTwist (#6) (#8) (#9) * Update * Format files * New Changes (#11) * Fix Bug in ColorTwist (#6) * Fix Bug in ColorTwist (#6) (#8) (#9) * Update * Format files * Correct f16 color twist host bug * Change test suite to input 0-1 normalized values for all f16/f32 functionalities * Refactor API code for geometry_transforms * Added Testsuite for Float Functions in OCL * AMD Docs * Create install.rst * Update index.rst * Add host support for u8->f16 and u8->f32 for resize, crop, crop_mirror_normalize * Add host support in test suite for u8->f16 and u8->f32 for resize, crop, crop_mirror_normalize * Add host support for i8 in resize, crop, cmn, rotate, resize_crop, resize_crop_mirror and color_twist * Add host test suite support for i8 * Add host support for u8->i8 in crop, resize, crop_mirror_normalize * modify test suite * Add host plan1 test suite to SOW3_HOST * crop mirror normalize full support in w.r.t type change and layout 
change * Add API calls for CMN function for new set of variations * Fix bug with respect to I8 * change type info in kernels * Fix cmn bub * Support I8 for Rotate * Int 8 support for colortwist and code refactoring * Add int8 support for resize crop mirror function * resize crop mirror int8 support is added * Crop various variations are added * Add crop support for all the conversions * Add host support for resize outputFormatToggle * Add host support for crop outputFormatToggle * Add host support for rotate outputFormatToggle * Add host support for resize_crop outputFormatToggle * Add host support for resize_crop_mirror outputFormatToggle * Add host support for crop_mirror_normalize outputFormatToggle * Add host support for color_twist outputFormatToggle and all other pln->pkd support * Add missing pln3 API for crop host * Major modifications in test suite and ReadMe for pkd3, pln3 and pln1 inputs for host * Modify resize kernel * Add outputtoggle in the API and functions * Add new changes to all the fused function w.r.t to outputFormatToggle * Add pln3 api for Crop on GPU * add missing API for resize cro * Fix compilation bugs * Remove unnecessary functions and fix build bug * Add ocl testing framework * Fix bug in rotate helper * Minor temp changes in test code to accomodate PKD3 input U8 cases with toggle format * Correct resize_u8_i8_pkd * Fix resize kenel issues for output toogle change * colortwist bug fix * Fix colortwist bug * resize tensor fix * Minor mods to both pln3 and pkd3 test suite to accomodate CMN's ability to do U8 format toggles * Corrections in PLN3 input funcitons for host * Fix bugs in Fused function new code * Add changes relatedd to planar format in padded * Fix issues with pln3 colortwist * Fix issue with test suite * Add pln3 testing and fix issues * Modify a few things in test script * Fix pln3 issue for FP16 for Rotate * Fix index issues with Test suit * Add output layout toggle for host API * ix pln3 issues in test suite Fix pln1 
issues in testsuite Fix other minor bugs * Change paramerter order in resize pd pln host * remove print statements * Update README.MD * Codacy issues corrections in utilities/rpp-unittests * Codacy issues corrections for resize kernel * Codacy issues corrections in utilities/rpp-unittests OCL/HIP * Codacy issues corrections in utilities/rpp-unittests * Codacy issues corrections in utilities/rpp-unittests * Fix some codecy issues * Remove some Codecy issues in rpp unnittests * Remove a few codecy issues * Remove Print statements Co-authored-by: Muthukumaravel <muthukumaravel@multicorewareinc.com> Co-authored-by: shobana-mcw <shobana@multicorewareinc.com> Co-authored-by: r-abishekmcw <abishek@multicorewareinc.com> Co-authored-by: LokeshBonta <you@example.com> Co-authored-by: Reza <Seyedreza.Najafi@amd.com> Co-authored-by: Swetha B S <swetha@multicorewareinc.com>
kiritigowda pushed a commit that referenced this pull request on Oct 29, 2020
* Modify phase for visualization * Pre-MS4 optimizations on arithmetic_operations * Pre-MS4 optimizations on arithmetic_operations * Pre-MS4 optimizations on morphological_transforms * Added support for table lookup [OCL] * Fix issues with pixelate greyscale. * Pre-MS4 optimizations on color_model_conversions * Modify sobel_filter functionality to match GPU impl. * mean and stddev base function [OCL] * Changed Channel extract and channel combine function call * updated erode dilate kernals [OCL] * Non Working [FULLY BUILD] code for min_max_loc and mean_stddev * Updated Rain GPU kernel for multiple destination image calls [OCL] * Updated Median, Non Max and Histogram and added support for mean * Updated tensor [OCL] * updated table lookup [OCL] * small updates in mean and stddev [OCL] * Full functioning code for mean and standard deviation [OCL] * Added Support to Min Max Location [OCL] * Added support for gaussian_image_pyramid [OCL] * Added support for laplacian_image_pyramid [OCL] * small modification in LIP [OCL] * small modification in Min Max Location and Mean stddev [OCL] * box filter hisEq [OCL] * Added support for gaussian filter * Added support for bin in Histogram [OCL] * updated sobel [OCL] * Update in Temperature [CPU] * FIX SNP CPU half noise issue [OCL] * fin small change in Absolute difference [OCL] * Small changes in Custom convolution and table lookup [OCL} * Fix regressions due to scripting [cl & CPU]. * fix histogram [OCL] * Updated snow [OCL] * updated snow [CPU] * small update in Snow [OCL] * Modify filter_operations to add gaussian_filter with same backend as blur * Fix issue with rain Grey Scale [OCL] * Fix Rain GPU Transparancy [OCL] * Add Kernel Caching using Map/Kernelmanger * Resolved histogram grayscale issue in GPU * Resolved histogram grayscale issue in GPU * Fix the bug in warp affine planar call * Fix issue with resize crop validation [cl & CPU]. 
* Cl_enque_buffer, the argument is set to CL_FALSE * Fix Gamma correction [OCL] * minor changes to gamma_correction, vignette commons, flip functionalities * Fix Jitter with new Implementation [CPU & OCL] * Modify brightness bug that gave patches in output * Fix the buy with Lens correction [OCL]. * Fix Median filter issue * Modify rotate to match GPU functionality * Fix median filter * merge abi-dev-host-ms4 to main-hipcl-dev * Fix a round about fix for Hue and Saturation Shift * Modify scale to match GPU functionality * Fix a round about fix for Hue and Saturation Shift * Fix syntax error in hsvkernel * changed CL_False to CL_True in minmax location * Resolve merg * Modify warp affine to match GPU - inversion exists * Add validation for Warp Affine Matrix * changes in Warp Affine * Add Blocking calls [CL_TRUE flag is on] * Removed validation printf statements in the library * Removed a syntax error * Add extra validation for contrast * Fix issue with rain [CPU] * Added support to new Pixelate [OCL & CPU] * Fix issue with Fish eye [OCL] * Modify Histogram Implementation * Histogram Balance Fix * Update Readme.md Amended the list * Update Readme.md * Fix Histogram Planar Version * Add new support to Histogram [OCL] * Remove all files to include batch version * Move Mem-Mgmt_HIP branch files to master * Update Readme.md * Put all the recent changes or RPP here * Fix Border issues in crop mirror normalize and crop * Fix Crop mirror normalize border issue * Add RPP UnitTests * Add f32 support for crop_mirror_normalize * Add f32 support for crop * Add f32 support for resize_crop_mirror * Add f32 support for resize and resize_crop * Add f32 support for color_twist * Correct blur * Add f32 support for rotate * Add f16 host support for rotate, resize, resize_crop, crop, resize_crop_mirror, crop_mirror_normalize, color_twist * Major changes to host test suite * Separate host test suites for pkd3 and pln1 * modify rpp_unittests host * correct additional folder creation and 
readme * Minor correction in pln1/pkd3 host test scripts * Add basic float tensor support * Add FP32 and FP16 support for Crop function * Fix bug in crop * crop mirror normalize report * Float Support for Rotate GPU * Add Kernel Support in OCL for colorTwist and resize funtionalities * Add float support for ColorTwist and Resize Crop Mirror - FP16 and FP32 * Code Refactoring and Rotate Support for FP16 and FP32 * Fix Rotate Float issue * Fix FP32 Rotate Issue * Add Resize Function * Add Resize Crop Mirror in GPU OCL * Fix Typo * Add Resize Crop GPU FP16 and FP32 support * Update rppdefs.h * Crop Mirror Normalize Support is added * Support for ColorTwist in Float space * Update Colortwwist.cl - temp * Update colortwist.cl * Fix Bug in ColorTwist * Fix Bug in ColorTwist (#6) * API refactoring for fused_functions * Fix make_data-type bug and code formatting * Testsuite for Float Support Functions * Removed the brace in switchcase * Add free statements for unreleased memomry and f16 fix for colortwist * rename folders * Fix Resize for U8 case * minor change in BatchPD host * Fix type error in resize.cl * Fix float errors for resize fucntions * foramt file * Fix Bug in ColorTwist (#6) (#8) * Fix Bug in ColorTwist (#6) (#8) (#9) * Update * update (#10) * Fix Bug in ColorTwist (#6) * Fix Bug in ColorTwist (#6) (#8) (#9) * Update * Format files * New Changes (#11) * Fix Bug in ColorTwist (#6) * Fix Bug in ColorTwist (#6) (#8) (#9) * Update * Format files * Correct f16 color twist host bug * Change test suite to input 0-1 normalized values for all f16/f32 functionalities * Refactor API code for geometry_transforms * Added Testsuite for Float Functions in OCL * Add host support for u8->f16 and u8->f32 for resize, crop, crop_mirror_normalize * Add host support in test suite for u8->f16 and u8->f32 for resize, crop, crop_mirror_normalize * Add host support for i8 in resize, crop, cmn, rotate, resize_crop, resize_crop_mirror and color_twist * Add host test suite support for i8 
* Add host support for u8->i8 in crop, resize, crop_mirror_normalize * modify test suite * Add host plan1 test suite to SOW3_HOST * crop mirror normalize full support in w.r.t type change and layout change * Add API calls for CMN function for new set of variations * Fix bug with respect to I8 * change type info in kernels * Fix cmn bub * Support I8 for Rotate * Int 8 support for colortwist and code refactoring * Add int8 support for resize crop mirror function * resize crop mirror int8 support is added * Crop various variations are added * Add crop support for all the conversions * Add host support for resize outputFormatToggle * Add host support for crop outputFormatToggle * Add host support for rotate outputFormatToggle * Add host support for resize_crop outputFormatToggle * Add host support for resize_crop_mirror outputFormatToggle * Add host support for crop_mirror_normalize outputFormatToggle * Add host support for color_twist outputFormatToggle and all other pln->pkd support * Add missing pln3 API for crop host * Major modifications in test suite and ReadMe for pkd3, pln3 and pln1 inputs for host * Modify resize kernel * Add outputtoggle in the API and functions * Add new changes to all the fused function w.r.t to outputFormatToggle * Add pln3 api for Crop on GPU * add missing API for resize cro * Fix compilation bugs * Remove unnecessary functions and fix build bug * Add ocl testing framework * Fix bug in rotate helper * Minor temp changes in test code to accomodate PKD3 input U8 cases with toggle format * Correct resize_u8_i8_pkd * Fix resize kenel issues for output toogle change * colortwist bug fix * Fix colortwist bug * resize tensor fix * Minor mods to both pln3 and pkd3 test suite to accomodate CMN's ability to do U8 format toggles * Corrections in PLN3 input funcitons for host * Fix bugs in Fused function new code * Add changes relatedd to planar format in padded * Fix issues with pln3 colortwist * Fix issue with test suite * Add pln3 testing and fix 
issues * Modify a few things in test script * Fix pln3 issue for FP16 for Rotate * Fix index issues with Test suit * Add output layout toggle for host API * ix pln3 issues in test suite Fix pln1 issues in testsuite Fix other minor bugs * Change paramerter order in resize pd pln host * remove print statements * Add unittest * Fix HIP backend issues * able to build hip * Changed cmakelists for linking issues * Change include hip/hip_hcc.h to hip/hip_ext.h to avoid warning Co-authored-by: Muthukumaravel <muthukumaravel@multicorewareinc.com> Co-authored-by: shobana-mcw <shobana@multicorewareinc.com> Co-authored-by: LokeshBonta <you@example.com> Co-authored-by: Reza <Seyedreza.Najafi@amd.com> Co-authored-by: LokeshBonta <lokeshpsn93@gmail.com> Co-authored-by: Lokesh Bonta <lokeswara@multicorewareinc.com> Co-authored-by: Swetha B S <swetha@multicorewareinc.com>
AVX_simd added for flip