Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/add_search_dataset_endpoint #16

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

teezzan
Copy link
Contributor

@teezzan teezzan commented Jul 13, 2024

  • Added local search functionality using the search query parameter to filter datasets by name.

  • Implemented local pagination by calculating start and end indices based on page and page_size parameters.

Copy link
Contributor

@MHHukiewitz MHHukiewitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it is right now, I think we need to adjust the AARS to accomodate for the pagination issue.

Ideally, we'd want to get the total amount of pages or objects in the PaginationResponse object, this way we can avoid calling .all(). Think about all datatypes, that have a potentially unlimited amount of records, like Datasets being constantly added, etc.

Implementing search is always difficult in these scenarios, but should be possible using the filter method (check out *__in or other modifiers to the search params in the docs/code). If necessary, let's implement more modifiers to have a better search.

page: int = 1,
page_size: int = 20,
) -> List[DatasetResponse]:
) -> dict:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be a Pydantic type for the API docs

if by:
dataset_resp = Dataset.filter(owner=by)
else:
dataset_resp = Dataset.fetch_objects()
datasets = await dataset_resp.page(page=page, page_size=page_size)

all_datasets = await dataset_resp.all()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a very costly operation, only use it if you cannot avoid it


start = (page - 1) * page_size
end = start + page_size
datasets = all_datasets[start:end]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the .page operation instead of fetching all datasets

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants