-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vector PRF #25
Vector PRF #25
Conversation
Looks great, thanks! I prefer class-based Transformer definitions for stuff that isn't ad hoc. I'm happy to convert it to that format if you're okay with it? |
I guess the advantage of a transformer is that its parameters can be tuned in a gridsearch. I think we should commit a notebook to the examples folder. I have one that can be associted. |
Sounds good. Yeah, there are other benefits too, like you can get a clearer repr. I think more extensive tests would also be good. I just put in one basic one to make sure I didn’t mess something up in the new version. |
Thoughts on replacing this boilerplate with a decorator? # BEFORE:
def transform(self, inp: pd.DataFrame) -> pd.DataFrame:
"""Transforms the input DataFrame query-by-query."""
return pt.apply.by_query(self.transform_by_query, add_ranks=False)(inp)
def transform_by_query(self, inp: pd.DataFrame) -> pd.DataFrame:
pta.validate.result_frame(inp, extra_columns=['query', 'query_vec', 'doc_vec'])
...
# AFTER:
@pta.transform.by_query(add_ranks=False)
def transform(self, inp: pd.DataFrame) -> pd.DataFrame:
pta.validate.result_frame(inp, extra_columns=['query', 'query_vec', 'doc_vec'])
... It could live in alpha as we refine on the idea/name/etc. |
sure. |
It's a fairly niche style, thus far. Perhaps setup a gist so we can continue discussion there. I guess this in effect making transform() methods that can be slightly specialised (eg guaranteed to be called once per query) without needing to have multiple more transform_*() methods in the API specification |
Yeah, I'd rather not have any more I'm also no stranger to just doing a |
Alright, I'm happy with this version. I went ahead and added |
ok, I'll make a separate PR with a notebook. |
Going to merge it? |
Ok! Sounds like it's urgent. |
WIP.
Our results:
Results from Pyserini:
(I think they use FAISS as a retrieval backend)