Commit 5539c51

kevjumba authored and felixwang9817 committed
fix: Addresses ZeroDivisionError when materializing file source with same timestamps (#2551)
* Add docs for Go feature server
  Signed-off-by: Felix Wang <wangfelix98@gmail.com>
* Update go feature server docs
  Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
* Address review components
  Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
* Fix
  Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
* Revert
  Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
* Fix
  Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
* Update comment
  Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
* Fix
  Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
* Fix
  Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
* Revert indent
  Signed-off-by: Kevin Zhang <kzhang@tecton.ai>
* fix comment
  Signed-off-by: Kevin Zhang <kzhang@tecton.ai>

Co-authored-by: Felix Wang <wangfelix98@gmail.com>
1 parent 39d3f9d commit 5539c51

2 files changed, +19 -5 lines changed

docs/reference/feature-servers/go-feature-retrieval.md

+1 -1
@@ -10,7 +10,7 @@ The Go Feature Retrieval component currently only supports Redis and Sqlite as o
 
 ## Installation
 
-As long as you are running macOS or linux x86 with python version 3.7-3.10, the go component comes pre-compiled when you run install feast.
+As long as you are running macOS or linux, on x86, with python version 3.7-3.10, the go component comes pre-compiled when you install feast.
 
 For developers, if you want to build from source, run `make compile-go-lib` to build and compile the go server.
 
sdk/python/feast/infra/offline_stores/file.py

+18 -4
@@ -299,11 +299,25 @@ def evaluate_offline_job():
                 if created_timestamp_column
                 else [event_timestamp_column]
             )
+            # try-catch block is added to deal with this issue https://github.com/dask/dask/issues/8939.
+            # TODO(kevjumba): remove try catch when fix is merged upstream in Dask.
+            try:
+                if created_timestamp_column:
+                    source_df = source_df.sort_values(by=created_timestamp_column,)
+
+                source_df = source_df.sort_values(by=event_timestamp_column)
+
+            except ZeroDivisionError:
+                # Use 1 partition to get around case where everything in timestamp column is the same so the partition algorithm doesn't
+                # try to divide by zero.
+                if created_timestamp_column:
+                    source_df = source_df.sort_values(
+                        by=created_timestamp_column, npartitions=1
+                    )
 
-            if created_timestamp_column:
-                source_df = source_df.sort_values(by=created_timestamp_column)
-
-            source_df = source_df.sort_values(by=event_timestamp_column)
+                source_df = source_df.sort_values(
+                    by=event_timestamp_column, npartitions=1
+                )
 
             source_df = source_df[
                 (source_df[event_timestamp_column] >= start_date)
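
The fallback in the hunk above can be illustrated with a small pure-Python sketch. This is a toy model, not Dask's actual partitioning code: `toy_partitioned_sort` and `sort_with_fallback` are hypothetical names, and the range-bucketing logic is only an assumption about how identical values could lead to a division by zero. It shows the same shape as the patch: try the multi-partition sort first, then retry with a single partition when `ZeroDivisionError` is raised.

```python
from typing import List


def toy_partitioned_sort(values: List[int], npartitions: int) -> List[int]:
    """Toy range-partitioned sort (hypothetical; NOT Dask's real algorithm)."""
    if npartitions == 1:
        # A single partition needs no range boundaries, so nothing is divided.
        return sorted(values)
    lo, hi = min(values), max(values)
    buckets: List[List[int]] = [[] for _ in range(npartitions)]
    for v in values:
        # Raises ZeroDivisionError when hi == lo, i.e. every value is identical.
        position = (v - lo) / (hi - lo)
        index = min(int(position * npartitions), npartitions - 1)
        buckets[index].append(v)
    # Buckets cover increasing value ranges, so concatenating them is sorted.
    return [v for bucket in buckets for v in sorted(bucket)]


def sort_with_fallback(values: List[int], npartitions: int = 4) -> List[int]:
    """Try the partitioned sort; fall back to one partition on failure."""
    try:
        return toy_partitioned_sort(values, npartitions)
    except ZeroDivisionError:
        # Same idea as the patch: one partition avoids the zero-width range.
        return toy_partitioned_sort(values, npartitions=1)
```

In this sketch, a list of distinct timestamps (as integers) goes through the bucketed path, while a list where every timestamp is the same takes the single-partition path. Catching only `ZeroDivisionError` keeps unrelated failures visible, which matches the narrow `except` clause in the commit.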
