Search Improvements #6318

Merged on Jan 24, 2023 (49 commits)


@jasonvarga (Member) commented Jul 8, 2022

This PR will target master since there are some minor breaking changes.

Searchables

The plan here is to make search deal with dedicated classes and interfaces to make extending it simpler.

For example, when you perform a search, instead of getting back the models (like an Entry) directly, you'd get SearchResult objects. You'll also be able to make any class implement Searchable, so that it knows how to convert itself to a result.

Then each "type" would have a Provider class that provides Searchables that can be indexed. It would know how to do the opposite too: take an indexed item (which is just an array of values) and get the actual item (e.g. an Entry).
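To make the shape of this concrete, here is a minimal, framework-free sketch of the three roles described above. Every name in it (the `Searchable` methods, `SearchResult`, `Article`, `ArticleProvider`) is a hypothetical stand-in chosen for illustration, not the actual interface or signatures introduced by this PR.

```php
<?php

// Hypothetical, simplified stand-ins; not Statamic's actual classes or signatures.

interface Searchable
{
    // A unique string that identifies this item inside an index.
    public function getSearchReference(): string;

    // The item knows how to convert itself to a result.
    public function toSearchResult(): SearchResult;
}

final class SearchResult
{
    public function __construct(
        public Searchable $searchable,
        public string $type
    ) {
    }
}

final class Article implements Searchable
{
    public function __construct(public string $id, public string $title)
    {
    }

    public function getSearchReference(): string
    {
        return 'article::'.$this->id;
    }

    public function toSearchResult(): SearchResult
    {
        return new SearchResult($this, 'article');
    }
}

final class ArticleProvider
{
    /** @param Article[] $articles */
    public function __construct(private array $articles)
    {
    }

    /** Provide the searchables that can be indexed. */
    public function provide(): array
    {
        return $this->articles;
    }

    /** The opposite direction: turn indexed references back into actual items. */
    public function find(array $references): array
    {
        $byReference = [];
        foreach ($this->articles as $article) {
            $byReference[$article->getSearchReference()] = $article;
        }

        return array_values(array_intersect_key($byReference, array_flip($references)));
    }
}
```

The point of the split is that the index only ever stores plain values plus a reference, and the provider for each type is the one place that knows how to resolve that reference.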

Replaces #5179
Fixes: statamic/ideas#519

Localization

At the moment, localization and search don't play well together.

You can now opt to have localized indexes by specifying the sites.

'myindex' => [
  'driver' => 'local',
  'searchables' => 'all',
+ 'sites' => ['en', 'fr'], // or 'sites' => 'all'
]

Doing this will result in one index per site (myindex_en, myindex_fr), and localized content will go into the appropriate one. This is the approach Algolia recommends. If an index contains non-localizable items (e.g. assets or users), they'll be available in every localized index.
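The naming scheme can be sketched as follows. `expandIndexNames()` and its `$allSites` argument are hypothetical helpers written for illustration, and the assumption that an index without a `sites` key stays a single index is inferred from the feature being opt-in; this is not the code in the PR.

```php
<?php

// Hypothetical helper: expands one configured index into its per-site index names.
function expandIndexNames(string $name, array $config, array $allSites): array
{
    // Assumption: without a 'sites' key the index is not localized.
    if (! isset($config['sites'])) {
        return [$name];
    }

    // 'all' means every configured site; otherwise use the listed handles.
    $sites = $config['sites'] === 'all' ? $allSites : (array) $config['sites'];

    return array_map(fn (string $site) => "{$name}_{$site}", array_values($sites));
}

$names = expandIndexNames('myindex', ['sites' => ['en', 'fr']], ['en', 'fr', 'de']);
// $names is ['myindex_en', 'myindex_fr']
```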

Fixes #2699

Terms

At the moment, Terms are what get indexed, not LocalizedTerms, so saving a localized version of a term doesn't update anything. This goes along with the localization improvement mentioned above.

In order to fix this, querying terms will need to change a bit. Right now if you were to do Term::all(), you'd get LocalizedTerms, but only from one site. It'll need to change to include terms from all sites.

Fixes #3274

Filtering searchables (and drafts)

It's probably not a great idea to index drafts. Yes, the search:results tag filters them out, but if you're pushing them to Algolia and then using JavaScript to get the results directly from Algolia, you'll need to filter them out yourself.

Draft entries will now be filtered out by default.

You can also configure what items are indexed.

// config/statamic/search.php

'indexes' => [
  'blog' => [
    'driver' => 'local',
    'searchables' => ['collection:blog'],
    'filter' => BlogSearchFilter::class
  ],
],

class BlogSearchFilter
{
  public function handle($entry)
  {
    return $entry->published() && $entry->is_searchable !== false;
  }
}

Setting a custom filter will override the default filter (for entries, the default is where drafts are filtered out). In the above example, we make sure to continue filtering out drafts, and also check an is_searchable field, which could be a toggle on the entry.
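As a rough check of that filter's behavior, the sketch below runs it against three stand-in entries. `StubEntry` is a hypothetical test double, not a Statamic class; it only mimics the `published()` method and `is_searchable` property that the filter reads.

```php
<?php

// The filter from the example above, repeated so this sketch is self-contained.
class BlogSearchFilter
{
    public function handle($entry)
    {
        return $entry->published() && $entry->is_searchable !== false;
    }
}

// Hypothetical stand-in for an entry; only what the filter touches.
final class StubEntry
{
    public function __construct(private bool $published, public ?bool $is_searchable = null)
    {
    }

    public function published(): bool
    {
        return $this->published;
    }
}

$entries = [
    'published' => new StubEntry(true),        // toggle never set
    'draft'     => new StubEntry(false),       // unpublished
    'hidden'    => new StubEntry(true, false), // published, toggle switched off
];

$filter = new BlogSearchFilter;

// Keep only the keys of entries the filter accepts.
$indexed = array_keys(array_filter($entries, fn ($entry) => $filter->handle($entry)));
// $indexed is ['published']
```

Comparing against `false` strictly means an entry that has never had the toggle set (a null value) is still indexed.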

Fixes #5239

Breaking Changes

  • Draft entries will not be indexed.
  • Result classes will be returned from search queries instead of the actual objects. This makes no difference within templates, but could if you're doing searches in PHP and relying on getting Entry objects, etc.
  • Entries etc will implement Searchable, which will require anyone with custom Entry classes that don't extend our Entry class to implement it too. If they extend our classes (which seems to be the majority) no changes will be needed.
  • Term queries will return multiple localized terms.
  • LocalizedTerm@reference now includes the site handle. e.g. term::tags::foo becomes term::tags::foo::english. The id doesn't change.
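If you store or parse term references yourself, the last item above is the one to watch. The following sketch shows one way to read both shapes; `parseTermReference()` is a hypothetical helper, and treating a missing fourth segment as "no site" is an assumption made for illustration.

```php
<?php

// Hypothetical helper: splits a term reference into its parts.
function parseTermReference(string $reference): array
{
    $parts = explode('::', $reference);

    if (count($parts) < 3 || $parts[0] !== 'term') {
        throw new InvalidArgumentException("Not a term reference: {$reference}");
    }

    return [
        'taxonomy' => $parts[1],
        'slug'     => $parts[2],
        // Assumption: references in the old format have no site segment.
        'site'     => $parts[3] ?? null,
    ];
}

$new = parseTermReference('term::tags::foo::english');
// ['taxonomy' => 'tags', 'slug' => 'foo', 'site' => 'english']

$old = parseTermReference('term::tags::foo');
// ['taxonomy' => 'tags', 'slug' => 'foo', 'site' => null]
```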

Todo

  • Make term queries return LocalizedTerms, and avoid de-duplicating them.
  • Fix the localized index issue
  • Avoid indexing drafts
    • Not just drafts. Allow users to configure what should be indexed.
    • Remove from index when draft/field changes, not just when being deleted.
  • Saving terms should update the appropriate indexes

- refactor to result classes
- providers are responsible for finding their corresponding items instead of doing Data::find
- added a generic result class for the 90% use case
- a provider must be instantiated when it is registered, instead of passing the class name as a string
- clean up search tag class
- search:results paginate no longer needs the "as" parameter (breaking change)
- algolia adds search_score, which is just a decrementing number
@jasonvarga changed the title from "Searchables" to "Search Improvements" on Jul 14, 2022

@edalzell (Contributor) commented

I'd also love to be able to exclude entries from the index, like here: #6552.

Our use case would be a toggle on the entry to indicate not to search.

jasonvarga and others added 2 commits January 12, 2023 14:53
# Conflicts:
#	src/Assets/Asset.php
#	src/Auth/User.php
#	src/Entries/Entry.php
#	tests/Search/SearchablesTest.php
@jasonvarga added this to the 3.4 milestone on Jan 12, 2023
@jasonvarga changed the base branch from 3.3 to master on January 12, 2023 20:05
@jasonvarga marked this pull request as ready for review on January 24, 2023 15:07
5 participants