Improved #4 / 3494: AVX2 bugfixes + no code duplication for the integer workhorses in there #5

GerHobbelt · 2021-07-13T09:01:04Z

Improved #4 / tesseract-ocr#3494: AVX2 bugfixes + no code duplication for the integer workhorses in there tesseract-ocr#3495

Same as patch-4 (tesseract-ocr#3494), but now with reduced code duplication: for TFloat to work, we don't need to duplicate the integer work functions, since only the ExtractResults[8,16] functions need different implementations for float vs. double. These are therefore common to both implementations:

static void PartialMatrixDotVector64(const int8_t *wi, const TFloat *scales, const int8_t *u,
                                     int num_in, TFloat *v) {

static void PartialMatrixDotVector32(const int8_t *wi, const TFloat *scales, const int8_t *u,
                                     int num_in, TFloat *v) {

static void PartialMatrixDotVector16(const int8_t *wi, const TFloat *scales, const int8_t *u,
                                     int num_in, TFloat *v) {

static inline void PartialMatrixDotVector8(const int8_t *wi, const TFloat *scales, const int8_t *u,
                                           int num_in, TFloat *v) {

static void matrixDotVector(int dim1, int dim2, const int8_t *wi, const TFloat *scales,
                            const int8_t *u, TFloat *v) {

(extract from tesseract-ocr#3490)
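
To illustrate the idea above, here is a minimal, portable sketch (no AVX2 intrinsics) of how the integer accumulation can be shared while only the result extraction differs by floating-point type. It is not the code from this PR: `IntegerDotProducts`, `MatrixDotVectorSketch`, the single overloaded `ExtractResults`, the `SKETCH_USE_FLOAT` macro, and the plain row-major weight layout are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <vector>

// Assumption: TFloat is a build-time alias for float or double.
// SKETCH_USE_FLOAT is a hypothetical macro, not taken from the PR.
#ifdef SKETCH_USE_FLOAT
using TFloat = float;
#else
using TFloat = double;
#endif

// Shared integer workhorse (hypothetical name): it only touches int8/int32
// data, so the same code serves both the float and the double build.
inline void IntegerDotProducts(const int8_t *wi, const int8_t *u, int num_in,
                               int num_out, int32_t *acc) {
  for (int o = 0; o < num_out; ++o) {
    int32_t total = 0;
    for (int i = 0; i < num_in; ++i) {
      // Row-major weights: row o holds the num_in weights for output o.
      total += static_cast<int32_t>(wi[o * num_in + i]) * u[i];
    }
    acc[o] = total;
  }
}

// Type-specific step: convert the int32 sums and apply the per-output scale.
// Only this part needs one implementation per floating-point type.
inline void ExtractResults(const int32_t *acc, const float *scales, int n,
                           float *v) {
  for (int o = 0; o < n; ++o) {
    v[o] = static_cast<float>(acc[o]) * scales[o];
  }
}

inline void ExtractResults(const int32_t *acc, const double *scales, int n,
                           double *v) {
  for (int o = 0; o < n; ++o) {
    v[o] = static_cast<double>(acc[o]) * scales[o];
  }
}

// Driver (hypothetical name): overload resolution on TFloat picks the
// matching ExtractResults; the integer part is not duplicated.
inline void MatrixDotVectorSketch(int num_out, int num_in, const int8_t *wi,
                                  const TFloat *scales, const int8_t *u,
                                  TFloat *v) {
  std::vector<int32_t> acc(num_out);
  IntegerDotProducts(wi, u, num_in, num_out, acc.data());
  ExtractResults(acc.data(), scales, num_out, v);
}
```

Calling `MatrixDotVectorSketch(2, 3, wi, scales, u, v)` with a 2x3 weight matrix fills `v` with two scaled dot products. Switching `TFloat` changes only which `ExtractResults` overload is selected.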


GerHobbelt · 2021-07-13T12:00:54Z

Annulled due to #2 (comment). Given the goals specified there, these functions must exist in both the float and double instantiations anyway.

This was referenced Jul 13, 2021

Improved #3494: AVX2 bugfixes + no code duplication for the integer workhorses in there tesseract-ocr/tesseract#3495

Closed

consistent TFloat prototypes #2

Merged

GerHobbelt closed this Jul 13, 2021
