docs(website): improve search #925

Merged: 8 commits merged into main from fix-search on Mar 17, 2024

@etiennebacher (Collaborator) commented Mar 15, 2024

Close #905

It seems that the index.json used for search cannot match on parts of words, e.g. sum_hor in pl_sum_horizontal, which explains why the search results were so poor.

This PR adds custom separators when building the index.json for search, so that pl_ and other prefixes are no longer treated as part of the function name when searching. Note that [\\s\\-]+ is the default, so only what comes after it is customized.
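To illustrate why the separator matters, here is a minimal Python sketch. It is not the actual index builder: `tokenize` and `prefix_match` are hypothetical helpers that model indexing as "split on the separator, then match the query against the start of each token", and `CUSTOM_SEPARATOR` is an illustrative pattern, not the one used in this PR.

```python
import re

# Default separator quoted above (written here as a Python raw string).
DEFAULT_SEPARATOR = r"[\s\-]+"
# Hypothetical custom separator: the default plus two illustrative prefixes.
# The actual pattern added by this PR may differ.
CUSTOM_SEPARATOR = r"[\s\-]+|pl_|DataFrame_"


def tokenize(text: str, separator: str) -> list[str]:
    """Split text on the separator regex and drop empty tokens."""
    return [tok for tok in re.split(separator, text) if tok]


def prefix_match(query: str, tokens: list[str]) -> bool:
    """Simplified lookup: True if any indexed token starts with the query."""
    return any(tok.startswith(query) for tok in tokens)


default_tokens = tokenize("pl_sum_horizontal", DEFAULT_SEPARATOR)
custom_tokens = tokenize("pl_sum_horizontal", CUSTOM_SEPARATOR)

print(default_tokens, prefix_match("sum_hor", default_tokens))
print(custom_tokens, prefix_match("sum_hor", custom_tokens))
```

Under this simplified model, the default separator keeps pl_sum_horizontal as a single token, so sum_hor has nothing to match from the start of a token, whereas stripping the pl_ prefix leaves a token that begins with sum_hor.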

Here are some search suggestions before and after this change.

sum_horizontal

Old:

image

New:

image

strptime

Old:

image

New:

image

sink

Old:

image

New:

image

Review comment on altdoc/mkdocs_static.yml (outdated, resolved)
@eitsupi (Collaborator) commented Mar 17, 2024

I understand. But I don't want to close #686 because this is not a fundamental solution.
Do we need to also add $ as a separator?

@etiennebacher (Collaborator, Author) commented

> I don't want to close #686 because this is not a fundamental solution.

We can close #686 later, I've updated the initial post of this PR.

> Do we need to also add $ as a separator?

No, because the search index is built on the Markdown files, which still use _ as the separator, e.g. DataFrame_clone(). We only replace those with $ in the post-processing script, which is the last thing to run.

@eitsupi (Collaborator) commented Mar 17, 2024

> No, because the search index is built on the Markdown files, which still use _ as the separator, e.g. DataFrame_clone(). We only replace those with $ in the post-processing script, which is the last thing to run.

I understand that. My point was that since $ is used in places such as examples, shouldn't that also be registered as a separator?

@etiennebacher (Collaborator, Author) commented

> My point was that since $ is used in places such as examples, shouldn't that also be registered as a separator?

I added it; it does indeed allow a few more matches.
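As a rough illustration of why adding $ helps, here is a small Python sketch under the same simplifying assumption that indexing means splitting on a separator regex and matching queries against the start of each token. The two patterns and the `df$clone()` snippet are illustrative assumptions, not the exact configuration or content of the docs.

```python
import re

# Hypothetical separators: one without "$" and one with "$" added.
WITHOUT_DOLLAR = r"[\s\-]+|DataFrame_"
WITH_DOLLAR = r"[\s\-]+|\$|DataFrame_"


def tokenize(text: str, separator: str) -> list[str]:
    """Split text on the separator regex and drop empty tokens."""
    return [tok for tok in re.split(separator, text) if tok]


# Illustrative line such as might appear in an example section.
example_line = "df$clone()"

tokens_without = tokenize(example_line, WITHOUT_DOLLAR)
tokens_with = tokenize(example_line, WITH_DOLLAR)

# A query for "clone" only matches the start of a token once "$" is a separator.
match_without = any(tok.startswith("clone") for tok in tokens_without)
match_with = any(tok.startswith("clone") for tok in tokens_with)

print(tokens_without, match_without)
print(tokens_with, match_with)
```

In this model, method calls written with $ in examples become searchable by method name, which is consistent with the extra matches mentioned above.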

@etiennebacher etiennebacher merged commit 93b45d6 into main Mar 17, 2024
@etiennebacher etiennebacher deleted the fix-search branch March 17, 2024 16:31