Cannot use colMany to access anything wrapped in Option #168
I have an alternative proposal here. What if we had a way to flatten schemas with Optional fields? Example:
case class O(x: Option[Int], y: Option[Int])
val df: TypedDataset[O] = ...
val dfFlat = df.flatten : TypedDataset[(Int,Int)]
What happens here is that
@palmerlao Do you think this would solve your use case? |
Let me know if this is what you mean. In your example, say that However, I think there is some interest at my company for somehow building over frameless with Monocle. Do you think that would be something that other people find useful? |
I see what you mean |
This is also an issue that we got bitten by lately. Unfortunately we cannot adapt our model and thus, for now, need to use some hacky non-typechecked workarounds, which makes me sad. I am way too new to shapeless and frameless to make a valuable contribution here, but I really hope that this is solvable in general. |
I was hoping to get this working in #204, but I hit a wall with UDFs. The idea is to be able to do a map on an optional column. In the meantime you can probably work around this using a UDF, but you will have to serialize the entire column. If that is not an issue for you, then a UDF is a fairly OK typesafe workaround:
t.makeUDF((x: Option[Foo]) => x.map(_.bar + 1)) |
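To make the UDF workaround above a bit more concrete, here is a minimal sketch of the kind of function one might hand to makeUDF. It uses plain Scala only, so it runs without Spark or frameless. The Foo case class and its bar field are assumptions standing in for the thread's Foo, and the object and function names are hypothetical.

```scala
object OptionUdfSketch {
  // Hypothetical record type standing in for the thread's Foo (an assumption).
  final case class Foo(bar: Int)

  // The function body one might pass to a UDF: it maps over the Option,
  // so a missing value stays None and a present value becomes Some(bar + 1).
  val barPlusOne: Option[Foo] => Option[Int] =
    (x: Option[Foo]) => x.map(_.bar + 1)
}
```

The point of mapping inside the Option is that the result type stays Option[Int], which mirrors how a nullable input column would produce a nullable output column.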
Got the same issue. |
Ran into this pretty quick as soon as we tried to do |
@palmerlao #479 helps with this issue. |
My understanding is that Option should be used to represent columns that one might mark nullable in vanilla Spark. I tried something along the lines of the following:
The last line resulted in
What I think is reasonable is to return something of type TypedColumn[A, Option[Int]]. For comparison, in regular Spark:
which I would expect as.select(as.colMany('ob, 'i)).show().run to be roughly equivalent to, up to some decisions on whether to display null or None.
Perhaps a reasonable way to approach this problem is to integrate the column selection mechanism with some kind of optics.
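As a rough illustration of the semantics requested here, the following plain-Scala sketch shows what reaching through an optional field looks like at the value level. The field names ob and i come from the colMany('ob, 'i) call; the case class shapes, the class name B, and the helper obI are assumptions, since the original snippet is not available. It does not use frameless or Spark.

```scala
object NestedOptionSketch {
  // Hypothetical model: an outer record whose field ob is optional,
  // and an inner record with an Int field i (shapes are assumptions).
  final case class B(i: Int)
  final case class A(ob: Option[B])

  // Reaching i through the optional ob yields Option[Int]:
  // None when ob is absent, Some(i) otherwise.
  def obI(a: A): Option[Int] = a.ob.map(_.i)
}
```

Under this reading, a column selected through an Option would carry the Option in its result type, which matches the proposed TypedColumn[A, Option[Int]]. It is also the behaviour an Optional-style optic would give when composed through ob and i.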