Commit
Despite this CUDA code having compiled on many different machines, only ozstar threw a compile error, caused by taking the absolute value of an int8_t m_accum value and adding it to or subtracting it from a float. This is fixed by pre-computing the absolute values of these m_accums and passing them in. There appears to be no impact on performance.

While we're here, I made more pointers const. Unfortunately, this did not affect performance either; maybe the compiler was already performing the associated optimisations.

Also remove a directive that ignored null pointer warnings; the newest version of bindgen has fixed the underlying issue. Also tidy the body of fee_kernel; I like not having to indent code if I don't have to.
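For readers unfamiliar with the pattern, the pre-computation fix can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `precompute_abs`, `offset_by_abs` and the `HOST_DEVICE` macro are hypothetical names, and it assumes the absolute values are computed once on the host and handed to the kernel as floats, so that device code never calls abs on an int8_t.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Lets the same source build with nvcc or with a plain C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical host-side helper: take the absolute value of each int8_t
// once, up front. Widening to int first keeps the std::abs call
// unambiguous and avoids overflow for -128.
inline std::vector<float> precompute_abs(const std::vector<int8_t>& m_accums) {
    std::vector<float> out;
    out.reserve(m_accums.size());
    for (const int8_t m : m_accums) {
        out.push_back(static_cast<float>(std::abs(static_cast<int>(m))));
    }
    return out;
}

// Hypothetical device-callable helper: it only ever sees floats, so no
// mixed int8_t/float abs arithmetic remains in device code.
HOST_DEVICE inline float offset_by_abs(const float x, const float abs_m_accum,
                                       const bool add) {
    return add ? x + abs_m_accum : x - abs_m_accum;
}
```

Under these assumptions the kernel would receive the pre-computed array as a `const float *` argument alongside its existing inputs, which also fits the const-pointer change mentioned above.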