thread '<unnamed>' panicked when dropping column #11410

Closed
2 tasks done
MarcoGorelli opened this issue Sep 29, 2023 · 3 comments · Fixed by #16981
Labels: A-panic, accepted, bug, P-high, python

Comments

@MarcoGorelli
Collaborator

Checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

This came up in a training session today:

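import polars as pl

df = pl.LazyFrame({'date': [1, 2, 3], 'symbol': [4, 5, 6]})
dates = df.select('date').unique()
symbols = df.select('symbol').unique()
# Join on a constant key, then drop the resulting 'literal' column; collect() panics
symbols.join(dates, left_on=pl.lit(1), right_on=pl.lit(1)).drop('literal').collect()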

Log output

No response

Issue description

I get a mysterious:

---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
Cell In[1], line 4
      2 dates = df.select('date').unique()
      3 symbols = df.select('symbol').unique()
----> 4 symbols.join(dates, left_on=pl.lit(1), right_on=pl.lit(1)).drop('literal').collect()

File ~/tmp/.venv/lib/python3.10/site-packages/polars/utils/deprecation.py:95, in deprecate_renamed_parameter.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
     90 @wraps(function)
     91 def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
     92     _rename_keyword_argument(
     93         old_name, new_name, kwargs, function.__name__, version
     94     )
---> 95     return function(*args, **kwargs)

File ~/tmp/.venv/lib/python3.10/site-packages/polars/lazyframe/frame.py:1711, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, no_optimization, slice_pushdown, comm_subplan_elim, comm_subexpr_elim, streaming, **kwargs)
   1698     comm_subplan_elim = False
   1700 ldf = self._ldf.optimization_toggle(
   1701     type_coercion,
   1702     predicate_pushdown,
   (...)
   1709     eager,
   1710 )
-> 1711 return wrap_df(ldf.collect())

PanicException: called `Option::unwrap()` on a `None` value

Expected behavior

No error; the 'literal' column should simply be dropped.

Installed versions

--------Version info---------
Polars:              0.19.5
Index type:          UInt32
Platform:            Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python:              3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]

----Optional dependencies----
adbc_driver_sqlite:  <not installed>
cloudpickle:         2.2.1
connectorx:          <not installed>
deltalake:           0.9.0
fsspec:              2023.6.0
gevent:              <not installed>
matplotlib:          3.7.1
numpy:               1.25.1
openpyxl:            <not installed>
pandas:              2.1.1
pyarrow:             12.0.1
pydantic:            2.0.2
pyiceberg:           <not installed>
pyxlsb:              <not installed>
sqlalchemy:          2.0.20
xlsx2csv:            <not installed>
xlsxwriter:          <not installed>
@MarcoGorelli MarcoGorelli added bug Something isn't working python Related to Python Polars labels Sep 29, 2023
@cmdlineluser
Contributor

This seems to be a projection_pushdown issue: the query collects fine with that optimization disabled.

symbols.join(dates, left_on=pl.lit(1), right_on=pl.lit(1)).drop('literal').collect(projection_pushdown=False)
# shape: (9, 2)
# ┌────────┬──────┐
# │ symbol ┆ date │
# │ ---    ┆ ---  │
# │ i64    ┆ i64  │
# ╞════════╪══════╡
# │ 5      ┆ 1    │
# │ 5      ┆ 3    │
# │ 5      ┆ 2    │
# │ 4      ┆ 1    │
# │ 4      ┆ 3    │
# │ 4      ┆ 2    │
# │ 6      ┆ 1    │
# │ 6      ┆ 3    │
# │ 6      ┆ 2    │
# └────────┴──────┘
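
To see why nine rows is the expected result here, a plain-Python sketch (no Polars involved) may help: when both sides join on the same constant key, every left row matches every right row, so the join degenerates into a Cartesian product. The helper name `join_on_constant` is made up for this illustration and only models the join semantics, not how Polars executes it.

```python
from itertools import product

symbols = [4, 5, 6]
dates = [1, 2, 3]


def join_on_constant(left, right, key=1):
    """Inner join where every row on both sides carries the same constant key."""
    left_keyed = [(key, value) for value in left]
    right_keyed = [(key, value) for value in right]
    # Every left key equals every right key, so all pairs survive the match.
    return [
        (l_value, r_value)
        for l_key, l_value in left_keyed
        for r_key, r_value in right_keyed
        if l_key == r_key
    ]


rows = join_on_constant(symbols, dates)
print(len(rows))  # number of joined rows
print(sorted(rows) == sorted(product(symbols, dates)))  # same pairs as a cross product
```

If a cross product is the actual goal, Polars' `join` also accepts `how="cross"`, which needs no join keys and so never introduces a 'literal' column in the first place.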

@henryharbeck
Contributor

Selecting any column afterwards also panics, even a non-literal one:

df = pl.LazyFrame({'date': [1,2,3], 'symbol': [4,5,6]})
dates = df.select('date').unique()
symbols = df.select('symbol').unique()

symbols.join(dates, left_on=pl.lit(1), right_on=pl.lit(1)).collect() # works
symbols.join(dates, left_on=pl.lit(1), right_on=pl.lit(1)).with_columns().collect() # works
symbols.join(dates, left_on=pl.lit(1), right_on=pl.lit(1)).select(pl.all()).collect() # panics
symbols.join(dates, left_on=pl.lit(1), right_on=pl.lit(1)).select("symbol").collect() # panics

It makes sense that drop panics too, as it is now implemented as select per #10885
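
As a rough illustration of that point, here is a toy model in plain Python of a `drop` built on top of `select`. It is not Polars' implementation; the function names and the column list are assumptions made for the example. The takeaway is only that if `drop` delegates to `select`, anything that breaks the `select` path breaks `drop` as well.

```python
def select(columns, wanted):
    """Keep only the requested columns, preserving their original order."""
    unknown = [name for name in wanted if name not in columns]
    if unknown:
        raise KeyError(f"unknown columns: {unknown}")
    return [name for name in columns if name in wanted]


def drop(columns, unwanted):
    """Drop columns by selecting everything that is not listed in `unwanted`."""
    return select(columns, [name for name in columns if name not in unwanted])


# Hypothetical schema of the joined frame; the real column order may differ.
joined_columns = ["symbol", "literal", "date"]
remaining = drop(joined_columns, ["literal"])
print(remaining)
```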

@mcrumiller
Contributor

Not sure if this is related to #9621. I don't think that using expressions in the on parameter should add columns to the resulting frame, which is why we're ending up with a literal column.

@stinodego stinodego added needs triage Awaiting prioritization by a maintainer P-high Priority: high and removed needs triage Awaiting prioritization by a maintainer labels Jan 13, 2024
@github-project-automation github-project-automation bot moved this to Ready in Backlog Jan 20, 2024
@coastalwhite coastalwhite added the A-panic Area: code that results in panic exceptions label Jun 14, 2024
@github-project-automation github-project-automation bot moved this from Ready to Done in Backlog Jun 16, 2024
@c-peters c-peters added the accepted Ready for implementation label Jun 16, 2024
Projects
Archived in project
8 participants