What went wrong?

After indexing the Github dataset by repo_main_language and year, we started running queries to test the tolerance in each group. In some of those queries, elements were missing. For example:

df_qbeast.where("""repo_main_language == "SAS" and year == 2022""").count()

was not equal to

df_parquet.where("""repo_main_language == "SAS" and year == 2022""").count()

Digging deeper, we found that the year == 2022 predicate was translated into the space (from = 1.0, to = 1.0). The condition that decides whether or not a cube is retrieved is such that, in the case of f == t == 1.0, the answer is always false.
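The degenerate space can be illustrated with a small, self-contained sketch. This is not the actual qbeast-spark code: the min-max normalization, the year range, and the half-open overlap test below are all assumptions, chosen only to show why a query space with from == to == 1.0 can never intersect any cube.

```scala
// Hypothetical model of the problem (not the actual qbeast-spark code).

// Assumption: indexed values are min-max normalized into [0, 1].
def normalize(value: Double, min: Double, max: Double): Double =
  (value - min) / (max - min)

// The maximum indexed year maps exactly to the upper bound 1.0, so
// `year == 2022` becomes the degenerate space (from = 1.0, to = 1.0).
// The year range 2008..2022 is hypothetical.
val f = normalize(2022, min = 2008, max = 2022) // 1.0
val t = f                                       // point query: from == to

// Assumption: cubes are selected with a half-open overlap test on [from, to).
case class Space(from: Double, to: Double)

def intersects(query: Space, cube: Space): Boolean =
  query.from < cube.to && query.to > cube.from

val cube = Space(0.0, 1.0) // a cube spanning the whole dimension

intersects(Space(0.5, 1.0), cube) // a non-degenerate query matches
intersects(Space(f, t), cube)     // f == t == 1.0 never matches: 1.0 < 1.0 is false
```

Under these assumptions, any fix would need to handle the point-query case from == to explicitly, for example by treating it as a closed interval.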
How to reproduce?
1. Code that triggered the bug:

val df_qbeast = spark.read.format("qbeast").load("path")
val df_parquet = spark.read.format("parquet").load("path")

df_qbeast.where("""repo_main_language == "SAS" and year == 2022""").count()
df_parquet.where("""repo_main_language == "SAS" and year == 2022""").count()

2. Branch and commit id:
main bb08083
3. Spark version:
On the spark shell run spark.version: 3.1.2
4. Hadoop version:
On the spark shell run org.apache.hadoop.util.VersionInfo.getVersion(): 3.2.0
5. How are you running Spark?
Are you running Spark inside a container? Are you launching the app on a remote K8s cluster? Or are you just running the tests on a local computer?
6. Stack trace:
Trace of the log/error messages.