This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

GeoSearch — Add the _geoBoundingBox built-in filter #223

Merged
merged 4 commits into from
Apr 3, 2023
4 changes: 2 additions & 2 deletions open-api.yaml
@@ -192,7 +192,7 @@ components:
- Mixed: `["something > 1 AND genres=comedy", "genres=horror OR title=batman"]`

> info
> _geoRadius({lat}, {lng}, {distance_in_meters}) built-in filter rule can be used to filter documents within a geo circle.
> _geoRadius({lat}, {lng}, {distance_in_meters}) and _geoBoundingBox([{lat}, {lng}], [{lat}, {lng}]) built-in filter rules can be used to filter documents within geo shapes.

> warn
> Attribute(s) used in `filter` should be declared as filterable attributes. See [Filtering and Faceted Search](https://docs.meilisearch.com/reference/features/filtering_and_faceted_search.html).
@@ -1027,7 +1027,7 @@ components:
- Mixed: `["something > 1 AND genres=comedy", "genres=horror OR title=batman"]`

> info
> _geoRadius({lat}, {lng}, {distance_in_meters}) built-in filter rule can be used to filter documents within a geo circle.
> _geoRadius({lat}, {lng}, {distance_in_meters}) and _geoBoundingBox([{lat}, {lng}], [{lat}, {lng}]) built-in filter rules can be used to filter documents within geo shapes.

> warn
> Attribute(s) used in `filter` should be declared as filterable attributes. See [Filtering and Faceted Search](https://docs.meilisearch.com/reference/features/filtering_and_faceted_search.html).
3 changes: 3 additions & 0 deletions text/0034-telemetry-policies.md
@@ -114,6 +114,7 @@ The collected data is sent to [Segment](https://segment.com/). Segment is a plat
| `sort.with_geoPoint` | `true` if the sort rule `_geoPoint` was used in this batch, otherwise `false` | true | `Documents Searched POST`, `Documents Searched GET` |
| `sort.avg_criteria_number` | Average number of sort criteria among all requests containing the `sort` parameter in this batch | 2 | `Documents Searched POST`, `Documents Searched GET` |
| `filter.with_geoRadius` | `true` if the filter rule `_geoRadius` was used in this batch, otherwise `false` | false | `Documents Searched POST`, `Documents Searched GET` |
| `filter.with_geoBoundingBox` | `true` if the filter rule `_geoBoundingBox` was used in this batch, otherwise `false` | false | `Documents Searched POST`, `Documents Searched GET` |
| `filter.most_used_syntax` | Most used filter syntax among all requests containing the `filter` parameter in this batch | string | `Documents Searched POST`, `Documents Searched GET` |
| `q.max_terms_number` | Highest number of terms given for the `q` parameter in this batch | 5 | `Documents Searched POST`, `Documents Searched GET` |
| `pagination.max_limit` | Highest value given for the `limit` parameter in this batch | 60 | `Documents Searched POST`, `Documents Searched GET` |
@@ -251,6 +252,7 @@ This property allows us to gather essential information to better understand on
| sort.with_geoPoint | Has the built-in sort rule `_geoPoint` been used in the aggregated event? | `true` |
| sort.avg_criteria_number | The average number of sort criteria among all the requests containing the `sort` parameter in the aggregated event. `"sort": []` counts as `0`, while not sending `sort` does not influence the average. | `2` |
| filter.with_geoRadius | Has the built-in filter rule `_geoRadius` been used in the aggregated event? | `false` |
| filter.with_geoBoundingBox | Has the built-in filter rule `_geoBoundingBox` been used in the aggregated event? | `false` |
| filter.avg_criteria_number | The average number of filter criteria among all the requests containing the `filter` parameter in the aggregated event. `"filter": []` counts as `0`, while not sending `filter` does not influence the average. | `4` |
| filter.most_used_syntax | The most used filter syntax among all the requests containing the `filter` parameter in the aggregated event. `string` / `array` / `mixed` | `mixed` |
| q.max_terms_number | The maximum number of terms for the `q` parameter among all requests in the aggregated event. | `5` |
@@ -284,6 +286,7 @@ This property allows us to gather essential information to better understand on
| sort.with_geoPoint | Has the built-in sort rule `_geoPoint` been used in the aggregated event? | `true` |
| sort.avg_criteria_number | The average number of sort criteria among all the requests containing the `sort` parameter in the aggregated event. `"sort": []` counts as `0`, while not sending `sort` does not influence the average. | `2` |
| filter.with_geoRadius | Has the built-in filter rule `_geoRadius` been used in the aggregated event? | `false` |
| filter.with_geoBoundingBox | Has the built-in filter rule `_geoBoundingBox` been used in the aggregated event? | `false` |
| filter.avg_criteria_number | The average number of filter criteria among all the requests containing the `filter` parameter in the aggregated event. `"filter": []` counts as `0`, while not sending `filter` does not influence the average. | `4` |
| filter.most_used_syntax | The most used filter syntax among all the requests containing the `filter` parameter in the aggregated event. `string` / `array` / `mixed` | `mixed` |
| q.max_terms_number | The maximum number of terms for the `q` parameter among all requests in the aggregated event. | `5` |
35 changes: 18 additions & 17 deletions text/0059-geo-search.md
@@ -14,16 +14,15 @@ The purpose of this specification is to add a first iteration of the **geosearch
#### Summary Key points

- Documents MUST have a `_geo` reserved object to be geosearchable.
- Filter documents by a given geo radius using the built-in filter `_geoRadius({lat}, {lng}, {distance_in_meters})`. It is possible to cumulate several geosearch filters within the `filter` field.
- Filter documents by a given geo radius using the built-in filter `_geoRadius({lat}, {lng}, {distance_in_meters})`.
- Filter documents by a given geo bounding box using the built-in filter `_geoBoundingBox([{lat}, {lng}], [{lat}, {lng}])`. The first pair of coordinates represents the top right corner of the bounding box, while the second pair represents the bottom left corner.
- Several geosearch filters can be combined within the `filter` field.
- Sort documents in ascending/descending order around a geo point. e.g. `_geoPoint({lat}, {lng}):asc`.
- It is possible to filter and/or sort by geographical criteria of the user's choice.
- `_geo` must be set as a filterable attribute to use geo filtering capabilities.
- `_geo` must be set as a sortable attribute to use geo sort capabilities.
- There is no `geo` ranking rule that can be manipulated by the user. This one is automatically integrated in the ranking rule `sort` by default and activated by sorting using the `_geoPoint({lat}, {lng})` built-in sort rule.
- Using `_geoPoint({lat}, {lng})` in the `sort` parameter at search leads the engine to return a `_geoDistance` within the search results. This field represents the distance in meters of the document from the specified `_geoPoint`.
- Add an `invalid_geo_field` error.
- Add an alternative message for `invalid_sort` and `invalid_filter` error to handle reserved keywords.
- `invalid_criterion` is renamed to `invalid_ranking_rule` and add an alternative message to handle reserved keywords.

### II. Motivation

@@ -134,6 +133,16 @@ csv format example

> The `_geo` field has to be set in `filterableAttributes` setting by the developer to activate geo filtering capabilities at search.

**`_geoBoundingBox` built-in filter rule definition**

- Name: `_geoBoundingBox`
- Signature: ([{lat:float}:required, {lng:float}:required], [{lat:float}:required, {lng:float}:required])
- Not required

The first pair of coordinates represents the top right corner of the bounding box, while the second pair represents the bottom left corner.

> The developer must add the `_geo` field to the `filterableAttributes` setting to activate geo filtering capabilities at search.
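
To make the argument order concrete, here is a minimal sketch of how a client might assemble a `_geoBoundingBox` filter expression. The helper name `geo_bounding_box_filter` and the coordinates are hypothetical and only serve as an illustration; the expression format and the corner order (top right first, bottom left second) follow the definition above.

```python
def geo_bounding_box_filter(top_right, bottom_left):
    """Build a `_geoBoundingBox` filter expression (hypothetical helper).

    Both arguments are (lat, lng) pairs. Following the definition above,
    the top right corner comes first and the bottom left corner second.
    """
    tr_lat, tr_lng = top_right
    bl_lat, bl_lng = bottom_left
    return f"_geoBoundingBox([{tr_lat}, {tr_lng}], [{bl_lat}, {bl_lng}])"


# Illustrative coordinates only; they are not taken from the specification.
search_body = {
    "filter": geo_bounding_box_filter((45.494181, 9.214024), (45.449484, 9.179175))
}
```

The resulting `search_body` could then be sent as the body of a search request, provided `_geo` has been declared as a filterable attribute.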

#### GET Search `/indexes/{indexUid}/search`

```
@@ -148,7 +157,7 @@ csv format example
}
```

- 🔴 Specifying parameters that do not conform to the `_geoRadius` signature causes the API to return an [invalid_search_parameter_filter](0061-error-format-and-definitions.md#invalid_search_parameter_filter) error.
- 🔴 Specifying parameters that do not conform to the `_geoRadius` or `_geoBoundingBox` signature causes the API to return an [invalid_search_parameter_filter](0061-error-format-and-definitions.md#invalid_search_parameter_filter) error.
- 🔴 Using `_geoDistance`, `_geo` or `_geoPoint` in a filter expression causes the API to return an [invalid_search_parameter_filter](0061-error-format-and-definitions.md#invalid_search_parameter_filter) error.

---
@@ -184,7 +193,7 @@ Following the [`sort` specification feature](https://github.com/meilisearch/spec
}
```
- 🔴 Specifying parameters that do not conform to the `_geoPoint` signature causes the API to return an [invalid_search_parameter_sort](0061-error-format-and-definitions.md#invalid_search_parameter_sort) error.
- 🔴 Using `_geoDistance`, `_geo` or `_geoRadius` in a sort expression causes the API to return an [invalid_search_parameter_sort](0061-error-format-and-definitions.md#invalid_search_parameter_sort) error.
- 🔴 Using `_geoDistance`, `_geo`, `_geoRadius` or `_geoBoundingBox` in a sort expression causes the API to return an [invalid_search_parameter_sort](0061-error-format-and-definitions.md#invalid_search_parameter_sort) error.

---

@@ -199,23 +208,16 @@ Following the [`sort` specification feature](https://github.com/meilisearch/spec
- Type: int
- Not required

> 💡 The `_geoDistance` response field is only computed and shown when the end-user has sorted documents around a `_geoPoint`. So if the end-user filters documents using a `_geoRadius` built-in filter without sorting them around a `_geoPoint`, the `_geoDistance` field will not appear in the search response.
> 💡 The `_geoDistance` response field is only computed and shown when the end-user has sorted documents around a `_geoPoint`. So if the end-user filters documents using a `_geoRadius` or `_geoBoundingBox` built-in filter without sorting them around a `_geoPoint`, the `_geoDistance` field will not appear in the search response.

---

#### Related Ranking Rules Settings API Errors

- 🔴 Specifying a custom ranking rule with `_geo`, `_geoDistance`, `_geoPoint`, or `_geoRadius` returns an [invalid_settings_ranking_rules](0061-error-format-and-definitions.md#invalid_settings_ranking_rules) error.
- 🔴 Specifying a custom ranking rule with `_geo`, `_geoDistance`, `_geoPoint`, `_geoRadius` or `_geoBoundingBox` returns an [invalid_settings_ranking_rules](0061-error-format-and-definitions.md#invalid_settings_ranking_rules) error.

---

### IV. Finalized Key Changes

- Add a `_geo` reserved field on JSON and CSV format to index a geo point coordinates for a document.
- Add a `_geoPoint(lat, lng)` built-in sort rule.
- Add a `_geoRadius(lat, lng, distance_in_meters)` built-in filter rule.
- Return a `_geoDistance` in `hits` objects representing the distance in meters computed from the `_geoPoint` built-in sort rule.

## 2. Technical Aspects

### I. Measuring
@@ -225,8 +227,7 @@ Following the [`sort` specification feature](https://github.com/meilisearch/spec

## 3. Future Possibilities

- Add built-in filter to filter documents within `polygon` and `bounding-box`.
- Add built-in filter to filter documents within `polygon`.
- Handling array of geo points in the document object.
- Handling multiple geo formats for the `_geo` field. e.g. "{lat},{lng}", a geohash etc.
- Handling distance in other formats (like the imperial format). **It's easy to implement on the user side though.**
- Handling position in other formats. It seems that [degrees and minutes](https://www.pacioos.hawaii.edu/voyager-news/lat-long-formats/) are also used a lot. **It's easy to implement on the user side though.**
2 changes: 1 addition & 1 deletion text/0061-error-format-and-definitions.md
@@ -1095,7 +1095,7 @@ HTTP Code: `400 Bad Request` when `Synchronous`
}
```

#### Variant: Specifying a custom ranking rule on reserved expression `_geoRadius`
#### Variant: Specifying a custom ranking rule on reserved expressions `_geoRadius` / `_geoBoundingBox`

```json
{
6 changes: 4 additions & 2 deletions text/0118-search-api.md
@@ -94,7 +94,7 @@ expression = or
or = and ("OR" WS+ and)*
and = not ("AND" WS+ not)*
not = ("NOT" WS+ not) | primary
primary = "(" WS* expression WS* ")" | geoRadius | in | condition | exists | not_exists | to
primary = "(" WS* expression WS* ")" | geoRadius | geoBoundingBox | in | condition | exists | not_exists | to
in = attribute "IN" WS* "[" value_list "]"
condition = attribute ("=" | "!=" | ">" | ">=" | "<" | "<=") value
exists = attribute "EXISTS"
@@ -163,6 +163,7 @@ The grammar for the value of a filterable attribute is the same as the grammar f
- OR: `filter OR filter`
- NOT: `NOT filter`
- GeoSearch: `_geoRadius(lat, lng, distance)`
- GeoSearch: `_geoBoundingBox([lat, lng], [lat, lng])`

###### 3.1.2.1.5 Equality

@@ -350,7 +351,8 @@ attribute != value1 AND attribute != value2 AND ...

###### 3.1.2.1.12 Geo Search

The `_geoRadius` operator selects the documents whose geographical coordinates fall within a certain range of a given coordinate. See [GeoSearch](0059-geo-search.md) for more information.
- The `_geoRadius` operator selects the documents whose geographical coordinates fall within a certain range of a given coordinate. See [GeoSearch](0059-geo-search.md) for more information.
- The `_geoBoundingBox` operator selects the documents whose geographical coordinates fall within the rectangle described by the given top right and bottom left coordinates. See [GeoSearch](0059-geo-search.md) for more information.
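
As a rough illustration of what "falling within" the bounding box could mean, the sketch below tests a point against the two corners. It is not the engine's implementation: the function name `in_bounding_box` is hypothetical, and the inclusive bounds and the assumption that the box does not cross the antimeridian are simplifications made for this example.

```python
def in_bounding_box(point, top_right, bottom_left):
    """Return True if `point` lies inside the box (hypothetical sketch).

    All arguments are (lat, lng) pairs. Assumes inclusive bounds and a
    box that does not cross the antimeridian (left lng <= right lng).
    """
    lat, lng = point
    top, right = top_right
    bottom, left = bottom_left
    return bottom <= lat <= top and left <= lng <= right


# Illustrative coordinates only.
inside = in_bounding_box((45.47, 9.19), (45.494181, 9.214024), (45.449484, 9.179175))
outside = in_bounding_box((48.85, 2.35), (45.494181, 9.214024), (45.449484, 9.179175))
```

With these sample values, the first point lies between both corners on each axis, while the second lies north and west of the box.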

##### 3.1.2.2. Array Syntax
