Pramen-Py get_latest_available_date throws an error for delta tables #166

Closed
jirifilip opened this issue Mar 13, 2023 · 0 comments · Fixed by #217
Labels
bug Something isn't working Pramen-Py

Comments

@jirifilip
Collaborator

Describe the bug

In get_latest_available_date, reading the partitioning column of a delta table produces a list of strings instead of a list of datetime.date values; the cause is not yet known. The method relies on receiving dates, so it fails for delta tables.
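To illustrate the type mismatch described above, here is a minimal sketch of normalizing partition values before picking the latest one. The helpers `to_date` and `latest_date` are hypothetical and not part of Pramen-Py, the `%Y-%m-%d` string format is an assumption about how the delta partition values are rendered, and this is not necessarily how the fix in #217 works.

```python
import datetime
from typing import Iterable, Union

# A partition value may arrive as a string (observed for delta)
# or as a datetime.date (observed for parquet).
DateLike = Union[str, datetime.date]


def to_date(value: DateLike, fmt: str = "%Y-%m-%d") -> datetime.date:
    """Coerce a partition value to datetime.date (hypothetical helper)."""
    # datetime.datetime is a subclass of datetime.date, so check it first.
    if isinstance(value, datetime.datetime):
        return value.date()
    if isinstance(value, datetime.date):
        return value
    # Assumption: string values follow the ISO-like format given by fmt.
    return datetime.datetime.strptime(value, fmt).date()


def latest_date(values: Iterable[DateLike]) -> datetime.date:
    """Return the most recent date among mixed string/date values."""
    dates = [to_date(v) for v in values]
    if not dates:
        raise ValueError("no partition values available")
    return max(dates)
```

With this kind of normalization, `latest_date(["2023-03-22", "2023-03-23"])` and `latest_date([datetime.date(2023, 3, 22), datetime.date(2023, 3, 23)])` would both yield a `datetime.date`, regardless of which table format produced the values.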

Failing test case

This test case passes for parquet but fails for delta:

import datetime
from datetime import date as d

import pytest

# MetastoreTable, MetastoreReader, MetastoreWriter, InfoDateSettings and
# TableFormat come from Pramen-Py; `spark` and `tmp_path` are pytest fixtures.


@pytest.mark.parametrize("table_format", (TableFormat.parquet, TableFormat.delta))
def test_get_latest_available_date(
        spark,
        tmp_path,
        table_format
):
    # Write a small table partitioned by info_date in the given format.
    test_table_df = spark.createDataFrame([
        (1, "John", d(2023, 3, 23)),
        (2, "Jack", d(2023, 3, 23))
    ]).toDF("id", "name", "info_date")
    metastore_table = MetastoreTable(
        name="test_table",
        format=table_format,
        path=tmp_path.as_posix(),
        info_date_settings=InfoDateSettings(column="info_date"),
    )
    writer = MetastoreWriter(
        spark=spark,
        tables=[metastore_table],
        info_date=d(2022, 3, 23),
    )
    writer.write("test_table", test_table_df)

    # Read the latest partition date back through the metastore reader.
    metastore_reader = MetastoreReader(spark=spark, tables=[metastore_table])
    latest_available_date = metastore_reader.get_latest_available_date("test_table")  # delta fails here

    assert isinstance(latest_available_date, datetime.date)

Expected behavior

The test should pass for both parquet and delta.

jirifilip added the bug and Pramen-Py labels Mar 13, 2023
jirifilip added 10 commits that referenced this issue: 2 on Jun 22, 2 on Jun 27, and 6 on Jun 28, 2023