Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: floats precision and scale corrections #40357

Closed
bpintea opened this issue Mar 22, 2019 · 2 comments · Fixed by #40467
Closed

SQL: floats precision and scale corrections #40357

bpintea opened this issue Mar 22, 2019 · 2 comments · Fixed by #40467
Labels
:Analytics/SQL SQL querying

Comments

@bpintea
Copy link
Contributor

bpintea commented Mar 22, 2019

This issue is proposing to change the reported MAXIMUM_SCALE, MINIMUM_SCALE and potentially PRECISION values in SYS TYPES answer.

  • MINIMUM_SCALE
    The floats are not variable precision in ES. Furthermore, according to ODBC's SQLGetTypeInfo, If a data type has a fixed scale, the MINIMUM_SCALE and MAXIMUM_SCALE columns both contain this value. So I believe MINIMUM_SCALE should contain the same values as MAXIMUM_SCALE.

  • MAXIMUM_SCALE

    • HALF_FLOAT
      Currently its value is 16. However, it's precision is half of that of a FLOAT. So it should be adjusted down accordingly. (Also, see example[1] below.)
    • SCALED_FLOAT
      Currently 19, probably suggested by it being backed up by a signed long, whose precision is 19. However, this is handled as a floating point (probably double?), so it's scale should be adjusted accordingly. (Also, see example[2] below.)
  • PRECISION
    This column has currently the same values as the MAXIMUM_SCALE. Its value however is that of the maximum number of digits (whole and fractional plus dot and sign or exponential), so it should be in any case greater than that. Current implementation of operations - like addition, for instance - is consistent with the currently reported value, in that it truncates the result. But this truncation seems incorrect, as it's effectively reducing the type's scale. (Also, see example[3] below.). IMO, both the reported precision and operations should be corrected. (There might be Java readily-available calculations for the maximum precision of floats, like for C/C++.)

Example:

DELETE floats
PUT floats
{
  "mappings" : {
    "properties": {
      "scaled_float": {
        "type": "scaled_float"
        , "scaling_factor": 1000.0
      },
      "half_float": {
        "type": "half_float"
      },
      "float": {
        "type": "float"
      },
      "double": {
        "type": "double"
      }
    }
  }
}

PUT floats/_doc/1
{
  "scaled_float": 9223372036854775807,
  "half_float": 1.98765432100123456789,
  "float": 1.98765432100123456789,
  "double": 1.98765432100123456789
}
sql> select * from floats;

      double      |      float       |  half_float   |    scaled_float
------------------+------------------+---------------+--------------------
1.9876543210012345|1.9876543283462524|1.9873046875   |9.223372036854776E15

Above one can see that:

  • the double will correctly retain 16 decimals, in accordance to Java's double implementation;
  • the float is also consistent with float implementation
  • [1] the half_float is down to 3 (vs 16 reported now);
  • [2] the scaled_float won't report with higher scale than a double's (vs 19 reported).

For the PRECISION point[3]:

sql> select 1000000+double from floats;
 1000000+double
-----------------
1000001.987654321

In this case, the PRECISION seems kept, but the scale (and thus float data type's precision, in non-SQL talk) is diminished, the type's reported scale then no longer being correct.

An alternative implementation:

MariaDB [(none)]> select 1000000+1.98765432100123456789;
+--------------------------------+
| 1000000+1.98765432100123456789 |
+--------------------------------+
|   1000001.98765432100123456789 |
+--------------------------------+
1 row in set (0.00 sec)

@bpintea bpintea added the :Analytics/SQL SQL querying label Mar 22, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@bpintea
Copy link
Contributor Author

bpintea commented Mar 25, 2019

Rectification: for PRECISION, the ODBC documents the values to use for the floating types as column size (as 3rd column in the type information API) as follows:

  • SQL_DOUBLE (ES DOUBLE): 15
  • SQL_REAL (ES FLOAT): 7
    and thus correspondingly, for or half precision, 16bit representation, lg(211):
  • HALF_FLOAT: 3

costin added a commit to costin/elasticsearch that referenced this issue Mar 26, 2019
costin added a commit that referenced this issue Mar 28, 2019
fixmebot bot referenced this issue in VectorXz/elasticsearch Apr 22, 2021
fixmebot bot referenced this issue in VectorXz/elasticsearch May 28, 2021
fixmebot bot referenced this issue in VectorXz/elasticsearch Aug 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/SQL SQL querying
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants