Math library #25
Replies: 41 comments 30 replies
-
I am working on an easy to automate/replicate version of this. I may have hit a Verilator bug, we'll see.
-
Did my example work for you? I had no issues.
-
I just committed some example test structure for 'math_pkg'. If you run it, you should get a Verilator compile failure.
And lolz
-
in trying to narrow down to And get just the original circular logic error - darn - nothing silly about
-
It's supposed to run like the u24mult demo.
-
Oh, a final data point like I mentioned: I now get a GHDL error when using last night's OSS CAD Suite build tarball.
-
Per above, my thinking is that it's a GHDL plugin for Yosys problem, before jumping to it being a Verilator issue.
-
Also, I set up an example that tries to mimic even more closely your original working setup from the start of this discussion.
My (slightly older) local build gets the Verilator error like above.
My latest OSS CAD Suite binaries get the GHDL assertion, also like we've seen.
-
Actually, reading about the Verilator warning, it might be OK.
That would explain why it synthesizes fine. We may want to disable this warning and just watch out for DIDNOTCONVERGE? I am trying to see whether this really shows up as circular Verilog or not...
-
Also, this VHDL that was related to our last GHDL plugin for Yosys issue (line 3047 in 85c7bca) changes the behavior from the GHDL assertion to the Verilator circular logic error and back when commented/uncommented.
-
I received Verilator warnings that I just ignored, since the code was
generated anyway.
Just to check: did my sqrt example work for you? I've got it running.
Maybe that's the case; please confirm.
On Sun, Oct 3, 2021 at 16:20, Julian Kemmerer ***@***.***> wrote:
… Also this VHDL that was related to our last GHDL plugin for yosys issue
https://github.com/JulianKemmerer/PipelineC/blob/85c7bcace91b48b52714e0c118a05578c26d9888/src/VHDL.py#L3047
Changes the behavior from GHDL assertion to Verilator circular logic error
and back if un/commented
-
I just checked in a fix to the generated VHDL that should help with circular logic problems, and set Verilator to ignore that UNOPTFLAT warning. Right now all the ~math library tests (10 random nums each) pass. I'm just happy they compile :-p
I swore I saw something weird in fp32sub; I will try to run that for increasingly more random numbers. But feel free to give it a go yourself. What do you think, @suarezvictor?
-
I'm truly inspired by the progress, now that we can write an algorithm
and easily test that it behaves correctly.
The next step will be (I hope) to give some estimate of timing performance and
resource usage.
On Sun, Oct 3, 2021 at 20:26, Julian Kemmerer ***@***.***> wrote:
… I just checked in a fix to the generated vhdl that should help with
circular logic problems and set verilator to ignore that UNOPTFLAT warning.
Right now all the ~math library tests (10 random nums each) all pass. Im
just happy they compile :-p
TESTS: fp32sub rsqrtf u24add u24mult
./src/pipelinec ./examples/verilator/math_pkg/$TEST/$TEST.c --sim_comb
--verilator --main_cpp ./examples/verilator/math_pkg/$TEST/test.cpp
I swore I saw something weird in the fp32sub - will try to run that for increasingly more random numbers.
But feel free to give it a go yourself.
What do ya think @suarezvictor ?
-
I just updated the above tests to do more test cases. I think we need to prescribe a range for this approximate
But there is definitely an issue with the
-
I really think, and agree, that the comparison shouldn't be with the C math library
for fast implementations like the provided one. But a comparison with the
verilated version should match to within a rounding error. Said another
way, even without conversion to logic, rsqrt is not the same as 1/sqrt as
calculated by the standard implementation. In this special case, an error
model (which already exists) should be applied to the calculated tolerance, but
for the moment we can move forward without tests that deep.
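To make the tolerance point concrete, here is a minimal C sketch. The `rsqrt_approx` function is the well-known bit-trick approximation with one Newton step, used purely as a stand-in for "a fast implementation"; it is an assumption and not necessarily the function tested in this thread. `within_rel_tol` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in fast 1/sqrt(x): classic magic-constant guess plus one Newton step.
   Assumption for illustration only, not the thread's actual rsqrtf source. */
static float rsqrt_approx(float x)
{
    uint32_t i;
    float y;
    memcpy(&i, &x, sizeof i);          /* reinterpret float bits safely */
    i = 0x5f3759dfu - (i >> 1);        /* initial guess */
    memcpy(&y, &i, sizeof y);
    return y * (1.5f - 0.5f * x * y * y); /* one Newton-Raphson refinement */
}

/* Hypothetical helper: is `result` within rel_tol * |expected| of `expected`? */
static int within_rel_tol(float result, float expected, double rel_tol)
{
    assert(rel_tol >= 0.0);
    double diff = (double)result - (double)expected;
    double mag = (double)expected;
    if (diff < 0.0) diff = -diff;
    if (mag < 0.0) mag = -mag;
    return diff <= rel_tol * mag;
}
```

With this stand-in, `rsqrt_approx(4.0f)` lands within 0.5% of 0.5 but nowhere near a 1e-6 relative tolerance, which is why comparing an approximate function against exact 1/sqrt needs a looser, function-specific bound, while comparing the simulated hardware against the same C function can stay tight.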
On Sun, Oct 3, 2021 at 21:26, Julian Kemmerer ***@***.***> wrote:
… Actually for rsqrtf it makes more sense to do like you did and compare
against the simulated llvm func implementation instead of 1/sqrt as I am
now above
-
My main idea for floating point operations may diverge a bit from the
current course. What I would do is copy the implementation tricks of a
proven library and port them to PipelineC.
See for example this one, which seems very clean:
https://github.com/LiraNuna/soft-ieee754/blob/master/includes/ieee754.hpp
Subtraction and addition are based on two interesting functions
called "renormalize" and "from unsigned".
Once the "BITS(x, 5, 3)"-like macros are implemented, you can
also copy the template-constant operations to get double
precision as well with a single implementation, by calculating compile-time
constants appropriately.
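As a rough idea of what a "BITS(x, 5, 3)"-like macro could look like in plain C, here is a small sketch. It assumes the arguments mean "bits hi down to lo, inclusive"; the name `bits` and that argument order are my assumptions, not something taken from the linked library or from PipelineC.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [hi:lo] (inclusive) of a 32-bit value.
   Assumed semantics for a BITS(x, hi, lo)-style macro. */
static inline uint32_t bits(uint32_t x, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    unsigned width = hi - lo + 1;
    /* Avoid the undefined 1u << 32 when the full word is requested */
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (x >> lo) & mask;
}

#define BITS(x, hi, lo) bits((uint32_t)(x), (hi), (lo))
```

For a float bit pattern such as 0x457AF67D, `BITS(v, 31, 31)`, `BITS(v, 30, 23)` and `BITS(v, 22, 0)` give the sign, exponent and mantissa fields, which is the same slicing that helpers like `float_30_23` express by name.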
On Wed, Oct 6, 2021 at 07:10, Victor Suarez Rovere ***@***.***> wrote:
… Maybe a silly proposal but what about using add function and flip the sign
bit of the second operand? Only one function to debug and may use less
resources also.
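A minimal C sketch of that proposal, assuming IEEE 754 single precision: flip bit 31 of the second operand and reuse the adder. `fp32_add` here is just a placeholder for whatever add implementation is being debugged, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder for the adder under test (the generated hardware would go here) */
static float fp32_add(float a, float b)
{
    return a + b;
}

/* a - b computed as a + (-b), where -b is b with its sign bit flipped */
static float fp32_sub_via_add(float a, float b)
{
    assert(sizeof(float) == sizeof(uint32_t));
    uint32_t b_bits;
    float neg_b;
    memcpy(&b_bits, &b, sizeof b_bits);
    b_bits ^= 0x80000000u;              /* toggle sign bit only */
    memcpy(&neg_b, &b_bits, sizeof neg_b);
    return fp32_add(a, neg_b);
}
```

The design upside is that any fix to the adder automatically applies to subtraction too, so only one datapath needs the random-test treatment.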
On Wed, Oct 6, 2021 at 06:51, Victor Suarez Rovere ***@***.***> wrote:
> Another way of testing it is to use the bit-exact class in "bitregs.h"
> and run a debugger on that, another advantage of C compatibility. I can
> try that route (but please be patient until I finish my consulting tasks)
>
> On Wed, Oct 6, 2021 at 01:43, Julian Kemmerer ***@***.***> wrote:
>
>> OK up to 1 million test cases for fp32sub - but getting a handful of
>> interesting failures
>>
>> x: float -1.734040e-15, uint32 0xA6F9E6C9 y: float -1.777618e-15, uint32 0xA7001746 c_result: float 4.357847e-17, uint32 0x2448F860 result: float 4.357857e-17, uint32 0x2448F880 err: 1.05879e-22 allowed_err: 4.35785e-23 FAILED
>> x: float 1.822899e+13, uint32 0x5584A225 y: float 1.729264e+13, uint32 0x557BA419 c_result: float 9.363543e+11, uint32 0x535A0310 result: float 9.363553e+11, uint32 0x535A0320 err: 1.04858e+06 allowed_err: 936354 FAILED
>> x: float 4.330267e+12, uint32 0x547C0E03 y: float 4.572113e+12, uint32 0x548510E6 c_result: float -2.418459e+11, uint32 0xD2613C90 result: float -2.418462e+11, uint32 0xD2613CA0 err: 262144 allowed_err: 241846 FAILED
>> x: float -5.907621e-11, uint32 0xAE81E8F3 y: float -5.666673e-11, uint32 0xAE793911 c_result: float -2.409479e-12, uint32 0xAC298D50 result: float -2.409482e-12, uint32 0xAC298D60 err: 3.46945e-18 allowed_err: 2.40948e-18 FAILED
>> x: float -4.415078e+06, uint32 0xCA86BCCB y: float -4.192051e+06, uint32 0xCA7FDCCB c_result: float -2.230268e+05, uint32 0xC859CCB0 result: float -2.230270e+05, uint32 0xC859CCC0 err: 0.25 allowed_err: 0.223027 FAILED
>> x: float -1.395268e+11, uint32 0xD201F1C5 y: float -1.337852e+11, uint32 0xD1F931C1 c_result: float -5.741552e+09, uint32 0xCFAB1C90 result: float -5.741560e+09, uint32 0xCFAB1CA0 err: 8192 allowed_err: 5741.55 FAILED
>> x: float -5.685577e+17, uint32 0xDCFC7D87 y: float -5.888213e+17, uint32 0xDD02BE9E c_result: float 2.026362e+16, uint32 0x5A8FFB50 result: float 2.026366e+16, uint32 0x5A8FFB60 err: 3.43597e+10 allowed_err: 2.02636e+10 FAILED
>> 1000000 outputs checked.
>> Test failed!
>>
>> The first one
>>
>> x: float -1.734040e-15, uint32 0xA6F9E6C9
>> y: float -1.777618e-15, uint32 0xA7001746
>> c_result: float 4.357847e-17, uint32 0x2448F860
>> result: float 4.357857e-17, uint32 0x2448F880
>> err: 1.05879e-22 allowed_err: 4.35785e-23 FAILED
>>
>> Similar to other cases, the least significant 4 bits of the mantissa
>> (rightmost hex char) are zeros, but then the first non-zero bits are
>> slightly off ... maybe in a 'rounding' sort of way, idk
>>
>> Any thoughts on this kinda of mismatch?
>>
>> I am going to confirm I see this in modelsim too for sanity - easy
>> enough to do after this many tries
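One way to quantify this kind of mismatch is to count the distance in ULPs between the two bit patterns instead of looking at decimal error. The sketch below is a hypothetical helper (not part of the test bench in this thread) that works directly on the uint32 values printed in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Map a float bit pattern to an integer that grows monotonically with the
   float's value, so that differences count representable steps (ULPs). */
static int64_t ordered_key(uint32_t bits)
{
    uint32_t magnitude = bits & 0x7FFFFFFFu;
    return (bits & 0x80000000u) ? -(int64_t)magnitude : (int64_t)magnitude;
}

/* Number of representable floats between a and b (0 means equal, +0 == -0) */
static uint32_t ulp_distance(uint32_t a, uint32_t b)
{
    /* NaN patterns have no meaningful ordering, so reject them */
    assert((a & 0x7FFFFFFFu) <= 0x7F800000u);
    assert((b & 0x7FFFFFFFu) <= 0x7F800000u);
    int64_t d = ordered_key(a) - ordered_key(b);
    return (uint32_t)(d < 0 ? -d : d);
}
```

For the first failure, `ulp_distance(0x2448F860, 0x2448F880)` is 32, and most of the others come out at 16. The allowed_err values in the log look like 1e-6 times |c_result|, which for a float is roughly 8 to 17 ULPs, so these results sit just outside that bound rather than being wildly wrong.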
>>
-
OK, the FP32 add shows similar issues, so I think the same issue from fp32sub exists in fp32 add:
float BIN_OP_PLUS_float_float_float(float left, float right)
{
// Get exponent for left and right
uint8_t left_exponent;
left_exponent = float_30_23(left);
uint8_t right_exponent;
right_exponent = float_30_23(right);
float x;
float y;
// Step 1: Copy inputs into x and y so that x's exponent >= y's.
// ?????????MAYBE TODO:
// Is this only needed for shift operation that takes unsigned only?
// ALLOW SHIFT BY NEGATIVE?????
// OR NO since that loses upper MSBs of mantissa which is not acceptable? IDK too many drinks
if ( right_exponent > left_exponent ) // Lazy switch to GT
{
x = right;
y = left;
}
else
{
x = left;
y = right;
}
// Step 2: Break apart into S E M
// X
uint23_t x_mantissa;
x_mantissa = float_22_0(x);
uint8_t x_exponent;
x_exponent = float_30_23(x);
uint1_t x_sign;
x_sign = float_31_31(x);
// Y
uint23_t y_mantissa;
y_mantissa = float_22_0(y);
uint8_t y_exponent;
y_exponent = float_30_23(y);
uint1_t y_sign;
y_sign = float_31_31(y);
// Mantissa needs +3b wider
// [sign][overflow][hidden][23 bit mantissa]
// Put 0's in overflow bit and sign bit
// Put a 1 hidden bit if exponent is non-zero.
// X
// Determine hidden bit
uint1_t x_hidden_bit;
if(x_exponent == 0) // lazy switch to ==
{
x_hidden_bit = 0;
}
else
{
x_hidden_bit = 1;
}
// Apply hidden bit
uint24_t x_mantissa_w_hidden_bit;
x_mantissa_w_hidden_bit = uint1_uint23(x_hidden_bit, x_mantissa);
// Y
// Determine hidden bit
uint1_t y_hidden_bit;
if(y_exponent == 0) // lazy switch to ==
{
y_hidden_bit = 0;
}
else
{
y_hidden_bit = 1;
}
// Apply hidden bit
uint24_t y_mantissa_w_hidden_bit;
y_mantissa_w_hidden_bit = uint1_uint23(y_hidden_bit, y_mantissa);
// Step 3: Un-normalize Y (including hidden bit) so that xexp == yexp.
// Already swapped left/right based on exponent
// diff will be >= 0
uint8_t diff;
diff = x_exponent - y_exponent;
// Shift y by diff (bit manip pipelined function)
uint24_t y_mantissa_w_hidden_bit_unnormalized;
y_mantissa_w_hidden_bit_unnormalized = y_mantissa_w_hidden_bit >> diff;
// Step 4: If necessary, negate mantissas (twos comp) such that add makes sense
// STEP 2.B moved here
// Make wider for twos comp/sign
int25_t x_mantissa_w_hidden_bit_sign_adj;
int25_t y_mantissa_w_hidden_bit_sign_adj;
if(x_sign) //if(x_sign == 1)
{
x_mantissa_w_hidden_bit_sign_adj = uint24_negate(x_mantissa_w_hidden_bit); //Returns +1 wider signed, int25t
}
else
{
x_mantissa_w_hidden_bit_sign_adj = x_mantissa_w_hidden_bit;
}
if(y_sign) // if(y_sign == 1)
{
y_mantissa_w_hidden_bit_sign_adj = uint24_negate(y_mantissa_w_hidden_bit_unnormalized);
}
else
{
y_mantissa_w_hidden_bit_sign_adj = y_mantissa_w_hidden_bit_unnormalized;
}
// Step 5: Compute sum
int26_t sum_mantissa;
sum_mantissa = x_mantissa_w_hidden_bit_sign_adj + y_mantissa_w_hidden_bit_sign_adj;
// Step 6: Save sign flag and take absolute value of sum.
uint1_t sum_sign;
sum_sign = int26_25_25(sum_mantissa);
uint26_t sum_mantissa_unsigned;
sum_mantissa_unsigned = int26_abs(sum_mantissa);
// Step 7: Normalize sum and exponent. (Three cases.)
uint1_t sum_overflow;
sum_overflow = uint26_24_24(sum_mantissa_unsigned);
uint8_t sum_exponent_normalized;
uint23_t sum_mantissa_unsigned_normalized;
if (sum_overflow) //if ( sum_overflow == 1 )
{
// Case 1: Sum overflow.
// Right shift significand by 1 and increment exponent.
sum_exponent_normalized = x_exponent + 1;
sum_mantissa_unsigned_normalized = uint26_23_1(sum_mantissa_unsigned);
}
else if(sum_mantissa_unsigned == 0) // lazy switch to ==
{
//
// Case 3: Sum is zero.
sum_exponent_normalized = 0;
sum_mantissa_unsigned_normalized = 0;
}
else
{
// Case 2: Sum is nonzero and did not overflow.
// Dont waste zeros at start of mantissa
// Find position of first non-zero digit from left
// Know bit25(sign) and bit24(overflow) are not set
// Hidden bit is [23], can narrow down to 24b wide including hidden bit
uint24_t sum_mantissa_unsigned_narrow;
sum_mantissa_unsigned_narrow = sum_mantissa_unsigned;
uint5_t leading_zeros; // width = ceil(log2(len(sumsig)))
leading_zeros = count0s_uint24(sum_mantissa_unsigned_narrow); // Count from left/msbs downto, uintX_count0s counts from right
// NOT CHECKING xexp < adj
// Case 2b: Adjust significand and exponent.
sum_exponent_normalized = x_exponent - leading_zeros;
sum_mantissa_unsigned_normalized = sum_mantissa_unsigned_narrow << leading_zeros;
}
// Declare the output portions
uint23_t z_mantissa;
uint8_t z_exponent;
uint1_t z_sign;
z_sign = sum_sign;
z_exponent = sum_exponent_normalized;
z_mantissa = sum_mantissa_unsigned_normalized;
// Assemble output
return float_uint1_uint8_uint23(z_sign, z_exponent, z_mantissa);
}
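For anyone following the normalization step (Case 2) outside of PipelineC, here is a plain C sketch of counting leading zeros in a 24-bit significand. `clz24` is a hypothetical stand-in for what `count0s_uint24` is described as doing in the comments above; it is not the PipelineC built-in.

```c
#include <assert.h>
#include <stdint.h>

/* Count zero bits from bit 23 downward until the first set bit.
   Returns 24 when the input is zero. */
static unsigned clz24(uint32_t x)
{
    assert(x <= 0xFFFFFFu);             /* value must fit in 24 bits */
    unsigned count = 0;
    for (uint32_t probe = 0x800000u; probe != 0 && (x & probe) == 0; probe >>= 1) {
        count++;
    }
    return count;
}
```

As an example, a sum magnitude of 0x0797EF has 5 leading zeros, so Case 2 shifts it left by 5 to get 0xF2FDE0 (hidden bit back at position 23) and subtracts 5 from the exponent.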
-
Did you see the header-only library I posted? Aren't there good tricks to
copy? In that library everything is based around an interesting normalization
function that seems to simplify everything.
On Wed, Oct 6, 2021 at 19:39, Julian Kemmerer ***@***.***> wrote:
… OK, the FP32 add shows similar issues, so I think the same issue from fp32sub
exists
fp32 add
x: float 4.015406e+03, uint32 0x457AF67D y: float -4.258397e+03, uint32 0xC585132D c_result: float -2.429915e+02, uint32 0xC372FDD0 result: float -2.429917e+02, uint32 0xC372FDE0 err: 0.000244141 allowed_err: 0.000242991 FAILED
x: float 3.079372e-02, uint32 0x3CFC431B y: float -3.176257e-02, uint32 0xBD021978 c_result: float -9.688530e-04, uint32 0xBA7DFAA0 result: float -9.688549e-04, uint32 0xBA7DFAC0 err: 1.86265e-09 allowed_err: 9.68853e-10 FAILED
x: float 6.835042e+35, uint32 0x7B03A35C y: float -6.513701e+35, uint32 0xFAFAE60D c_result: float 3.213411e+34, uint32 0x78C60AB0 result: float 3.213415e+34, uint32 0x78C60AC0 err: 3.96141e+28 allowed_err: 3.21341e+28 FAILED
x: float -1.684015e+07, uint32 0xCB807AEB y: float 1.606630e+07, uint32 0x4B7526FD c_result: float -7.738490e+05, uint32 0xC93CED90 result: float -7.738500e+05, uint32 0xC93CEDA0 err: 1 allowed_err: 0.773849 FAILED
x: float -9.023538e+15, uint32 0xDA003B71 y: float 8.817388e+15, uint32 0x59FA9AF1 c_result: float -2.061504e+14, uint32 0xD73B7E20 result: float -2.061509e+14, uint32 0xD73B7E40 err: 5.36871e+08 allowed_err: 2.0615e+08 FAILED
x: float -6.069733e-05, uint32 0xB87E9543 y: float 6.428654e-05, uint32 0x3886D193 c_result: float 3.589212e-06, uint32 0x3670DE30 result: float 3.589215e-06, uint32 0x3670DE40 err: 3.63798e-12 allowed_err: 3.58921e-12 FAILED
x: float 5.333232e+08, uint32 0x4DFE4EEF y: float -5.494575e+08, uint32 0xCE03003A c_result: float -1.613430e+07, uint32 0xCB7630A0 result: float -1.613434e+07, uint32 0xCB7630C0 err: 32 allowed_err: 16.1343 FAILED
x: float 6.323657e+26, uint32 0x6C02C529 y: float -6.097366e+26, uint32 0xEBFC2E5F c_result: float 2.262910e+25, uint32 0x6995BF30 result: float 2.262914e+25, uint32 0x6995BF40 err: 3.68935e+19 allowed_err: 2.26291e+19 FAILED
x: float 3.810096e-06, uint32 0x367FB0F5 y: float -3.834480e-06, uint32 0xB680A9EF c_result: float -2.438378e-08, uint32 0xB2D17480 result: float -2.438401e-08, uint32 0xB2D17500 err: 2.27374e-13 allowed_err: 2.43838e-14 FAILED
x: float -5.382599e+36, uint32 0xFC8194D4 y: float 5.219303e+36, uint32 0x7C7B4CDF c_result: float -1.632965e+35, uint32 0xF9FB9920 result: float -1.632968e+35, uint32 0xF9FB9940 err: 3.16913e+29 allowed_err: 1.63297e+29 FAILED
x: float -3.024259e-02, uint32 0xBCF7BF4F y: float 3.187624e-02, uint32 0x3D0290A8 c_result: float 1.633646e-03, uint32 0x3AD62010 result: float 1.633648e-03, uint32 0x3AD62020 err: 1.86265e-09 allowed_err: 1.63365e-09 FAILED
x: float 5.217902e+05, uint32 0x48FEC7C7 y: float -5.445391e+05, uint32 0xC904F1B2 c_result: float -2.274891e+04, uint32 0xC6B1B9D0 result: float -2.274894e+04, uint32 0xC6B1B9E0 err: 0.03125 allowed_err: 0.0227489 FAILED
x: float -1.727113e-12, uint32 0xABF311CF y: float 1.822396e-12, uint32 0x2C003D60 c_result: float 9.528326e-14, uint32 0x29D68F10 result: float 9.528337e-14, uint32 0x29D68F20 err: 1.0842e-19 allowed_err: 9.52833e-20 FAILED
x: float 1.492741e-08, uint32 0x328039B6 y: float -1.454009e-08, uint32 0xB279CBFB c_result: float 3.873177e-10, uint32 0x2FD4EE20 result: float 3.873186e-10, uint32 0x2FD4EE40 err: 8.88178e-16 allowed_err: 3.87318e-16 FAILED
x: float -4.595889e+15, uint32 0xD9829F7E y: float 4.434701e+15, uint32 0x597C1563 c_result: float -1.611882e+14, uint32 0xD7129990 result: float -1.611885e+14, uint32 0xD71299A0 err: 2.68435e+08 allowed_err: 1.61188e+08 FAILED
1000000 outputs checked.
Test failed!
-
Hopefully in a couple of days I can provide an implementation.
On Wed, Oct 6, 2021 at 21:46, Julian Kemmerer ***@***.***> wrote:
… I did see ieee754.hpp but its a little difficult to translate to C -
looking at it more.
It seems like I am dropping lsbs from the mantissa but idk where/how.
-
I hope you solve the issue, but computing floating point with just
integers is an old problem with many proven implementations. If it keeps
being hard, maybe the best solution is to copy another implementation that
has already faced such problems, and maybe others not yet discovered.
On Thu, Oct 7, 2021 at 00:53, Julian Kemmerer ***@***.***> wrote:
… I think that I just am not using enough fractional bits for the mantissa.
x: float 4.015406e+03, uint32 0x457AF67D
y: float -4.258397e+03, uint32 0xC585132D
c_result: float -2.429915e+02, uint32 0xC372FDD0 11000011011100101111110111010000
result: float -2.429917e+02, uint32 0xC372FDE0 11000011011100101111110111100000
err: 0.000244141 allowed_err: 0.000242991 FAILED
I think I am losing some lsbs from the y mantissa when shifting right to match
the x exponent - idk where else the lsbs would be coming from
uint8_t diff;
diff = x_exponent - y_exponent;
uint24_t y_mantissa_w_hidden_bit_unnormalized;
y_mantissa_w_hidden_bit_unnormalized = y_mantissa_w_hidden_bit >> diff;
And no worries take your time
This stuff drains me so I'll probably be needing some time myself too
-
But then, inspired by looking at the softfloat code and seeing that they pad their mantissa on the right with 6 bits, I did the same, giving myself more lsbs. And now fp32add is passing the tests, and another 10x more random tests after that too. Seems good. So I am going to get that checked in tonight. Finally getting over this bug, it seems.
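To see why the extra right-hand bits matter, here is a stripped-down C sketch of the same add flow with a configurable number of guard bits. It is a simplification written for illustration, not the PipelineC source and not softfloat code: it only handles normal, finite inputs, truncates instead of rounding, and the name `fp32_add_bits` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified fp32 add on raw bit patterns. `guard` is the number of zero
   bits appended to the right of each 24-bit significand before alignment. */
static uint32_t fp32_add_bits(uint32_t a, uint32_t b, unsigned guard)
{
    assert(guard <= 30); /* keeps every intermediate inside 64 bits */

    /* Order the operands so that x has the larger (or equal) exponent */
    uint32_t a_exp = (a >> 23) & 0xFFu;
    uint32_t b_exp = (b >> 23) & 0xFFu;
    uint32_t x = (b_exp > a_exp) ? b : a;
    uint32_t y = (b_exp > a_exp) ? a : b;
    int32_t x_exp = (int32_t)((x >> 23) & 0xFFu);
    int32_t y_exp = (int32_t)((y >> 23) & 0xFFu);

    /* Restore the hidden bit, then pad on the right with guard bits */
    int64_t x_man = (int64_t)((x & 0x7FFFFFu) | 0x800000u) << guard;
    int64_t y_man = (int64_t)((y & 0x7FFFFFu) | 0x800000u) << guard;

    /* Align y to x; anything shifted below bit 0 is lost here */
    unsigned diff = (unsigned)(x_exp - y_exp);
    y_man = (diff < 63u) ? (y_man >> diff) : 0;

    /* Signed add, then split back into sign and magnitude */
    if (x >> 31) x_man = -x_man;
    if (y >> 31) y_man = -y_man;
    int64_t sum = x_man + y_man;
    uint32_t sign = (sum < 0) ? 1u : 0u;
    uint64_t mag = (uint64_t)(sum < 0 ? -sum : sum);
    if (mag == 0) return 0;

    /* Normalize so the hidden bit sits at position 23 + guard */
    uint64_t hidden = (uint64_t)1 << (23 + guard);
    int32_t exp = x_exp;
    while (mag >= (hidden << 1)) { mag >>= 1; exp++; }
    while (mag < hidden)         { mag <<= 1; exp--; }

    /* Drop the guard bits (truncation) and reassemble the fields */
    uint32_t man = (uint32_t)(mag >> guard) & 0x7FFFFFu;
    return (sign << 31) | ((uint32_t)exp << 23) | man;
}
```

Using the failing case from the log (x = 0x457AF67D, y = 0xC585132D), `fp32_add_bits(x, y, 0)` gives 0xC372FDE0, the mismatching `result`, while `fp32_add_bits(x, y, 6)` gives 0xC372FDD0, matching `c_result`. The one bit lost when aligning y gets multiplied up by the 5-bit renormalization shift, which is where the 16 ULP error comes from.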
-
This is quite good Julian, could you show the softfloat code you have used?
-
All of the Thinking about what to do next...
-
I started an issue:
-
I want to put some thought into next steps for the math package. I am picturing something as simple as a GitHub wiki page with a bunch of tables. The table header looks like, for example:
where the last column has a link to download a .zip of the source files. That makes me wonder how to go about packaging these VHDL outputs: folks are probably going to use more than one of these 'cores' at a time too, and they can't interfere with each other.
-
I like your idea of tables on a GitHub wiki.
What I'd suggest is that we make a new repo for libraries and projects
"made with PipelineC", like most other projects do. That would encourage
additions by others.
On Sat, Oct 9, 2021 at 00:07, Julian Kemmerer ***@***.***> wrote:
… Also any opinions on where how to publish this? I was picturing on github?
Via the PipelineC wiki?
-
Another download column could have Verilog sources translated with Yosys.
-
Page 203 of the Intel HLS document lists the supported math functions (math.h-like functions and others, such as fixed point support): https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/hls/mnl-hls-reference.pdf
-
I bet it's illegal or against some terms to, for example, run Intel HLS and publicly share the output HDL, i.e. doing what we are doing.
-
By no means am I proposing to use Intel HLS; I just posted the list of
supported functions to offer an alternative.
Regarding Xilinx code, I've seen many sources published under the Apache
license.
It seems, for example, that to express integers of different widths a
template type is normally used (ap_int<> in Xilinx tools, ac_int<> in Intel
HLS), and we could follow a similar convention.
…On Mon, Oct 11, 2021 at 2:04 PM Julian Kemmerer ***@***.***> wrote:
I bet its illegal or against some terms to, for example, run Intel HLS and
publicly share the output HDL i.e. doing what we are doing.
Seems like a unique opportunity we have here using PipelineC to do the
pipelining instead of some big vendor HLS tool
-
I got a float function, "1/sqrt(x)", calculated with both C and PipelineC, and the results match. So we are now ready to implement more complex functions.
As seen, the results of the calculations agree to within rounding error (only the least significant bits of the respective mantissas differ).
The commands to achieve this are as follows:
Note that VHDL std08 is needed.
Source of function implementation is as follows (note that it can be compiled with a regular C/C++ compiler):
The optimized version of the * operator is not used; it relies on the PipelineC default * operator. The function as shown is a version of a normal C function translated using LLVM and a C code generator.
The program to test the simulated calculation against the compiled results is as follows:
For compilation and execution, these commands are needed:
Lots of improvements can be made. The first one, I think, would be to connect the function's argument to the simulator instead of using a constant value, so better tests can be run.