This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

GeoSearch — Add the _geoBoundingBox built-in filter #223

Merged
merged 4 commits into from
Apr 3, 2023
Changes from 3 commits
4 changes: 2 additions & 2 deletions open-api.yaml
@@ -192,7 +192,7 @@ components:
- Mixed: `["something > 1 AND genres=comedy", "genres=horror OR title=batman"]`

> info
> _geoRadius({lat}, {lng}, {distance_in_meters}) built-in filter rule can be used to filter documents within a geo circle.
> _geoRadius({lat}, {lng}, {distance_in_meters}) and _geoBoundingBox([{lat}, {lng}], [{lat}, {lng}]) built-in filter rules can be used to filter documents within a geo circle or a geo bounding box, respectively.

> warn
> Attribute(s) used in `filter` should be declared as filterable attributes. See [Filtering and Faceted Search](https://docs.meilisearch.com/reference/features/filtering_and_faceted_search.html).
@@ -1027,7 +1027,7 @@ components:
- Mixed: `["something > 1 AND genres=comedy", "genres=horror OR title=batman"]`

> info
> _geoRadius({lat}, {lng}, {distance_in_meters}) built-in filter rule can be used to filter documents within a geo circle.
> _geoRadius({lat}, {lng}, {distance_in_meters}) and _geoBoundingBox([{lat}, {lng}], [{lat}, {lng}]) built-in filter rules can be used to filter documents within a geo circle or a geo bounding box, respectively.

> warn
> Attribute(s) used in `filter` should be declared as filterable attributes. See [Filtering and Faceted Search](https://docs.meilisearch.com/reference/features/filtering_and_faceted_search.html).
3 changes: 3 additions & 0 deletions text/0034-telemetry-policies.md
@@ -114,6 +114,7 @@ The collected data is sent to [Segment](https://segment.com/). Segment is a plat
| `sort.with_geoPoint` | `true` if the sort rule `_geoPoint` was used in this batch, otherwise `false` | true | `Documents Searched POST`, `Documents Searched GET` |
| `sort.avg_criteria_number` | Average number of sort criteria among all requests containing the `sort` parameter in this batch | 2 | `Documents Searched POST`, `Documents Searched GET` |
| `filter.with_geoRadius` | `true` if the filter rule `_geoRadius` was used in this batch, otherwise `false` | false | `Documents Searched POST`, `Documents Searched GET` |
| `filter.with_geoBoundingBox` | `true` if the filter rule `_geoBoundingBox` was used in this batch, otherwise `false` | false | `Documents Searched POST`, `Documents Searched GET` |
| `filter.most_used_syntax` | Most used filter syntax among all requests containing the `filter` parameter in this batch | string | `Documents Searched POST`, `Documents Searched GET` |
| `q.max_terms_number` | Highest number of terms given for the `q` parameter in this batch | 5 | `Documents Searched POST`, `Documents Searched GET` |
| `pagination.max_limit` | Highest value given for the `limit` parameter in this batch | 60 | `Documents Searched POST`, `Documents Searched GET` |
@@ -251,6 +252,7 @@ This property allows us to gather essential information to better understand on
| sort.with_geoPoint | Whether the built-in sort rule `_geoPoint` was used in the aggregated event. | `true` |
| sort.avg_criteria_number | The average number of sort criteria among all the requests containing the `sort` parameter in the aggregated event. `"sort": []` counts as `0`, while not sending `sort` does not influence the average. | `2` |
| filter.with_geoRadius | Whether the built-in filter rule `_geoRadius` was used in the aggregated event. | `false` |
| filter.with_geoBoundingBox | Whether the built-in filter rule `_geoBoundingBox` was used in the aggregated event. | `false` |
| filter.avg_criteria_number | The average number of filter criteria among all the requests containing the `filter` parameter in the aggregated event. `"filter": []` counts as `0`, while not sending `filter` does not influence the average. | `4` |
| filter.most_used_syntax | The most used filter syntax among all the requests containing the `filter` parameter in the aggregated event. `string` / `array` / `mixed` | `mixed` |
| q.max_terms_number | The maximum number of terms for the `q` parameter among all requests in the aggregated event. | `5` |
@@ -284,6 +286,7 @@ This property allows us to gather essential information to better understand on
| sort.with_geoPoint | Whether the built-in sort rule `_geoPoint` was used in the aggregated event. | `true` |
| sort.avg_criteria_number | The average number of sort criteria among all the requests containing the `sort` parameter in the aggregated event. `"sort": []` counts as `0`, while not sending `sort` does not influence the average. | `2` |
| filter.with_geoRadius | Whether the built-in filter rule `_geoRadius` was used in the aggregated event. | `false` |
| filter.with_geoBoundingBox | Whether the built-in filter rule `_geoBoundingBox` was used in the aggregated event. | `false` |
| filter.avg_criteria_number | The average number of filter criteria among all the requests containing the `filter` parameter in the aggregated event. `"filter": []` counts as `0`, while not sending `filter` does not influence the average. | `4` |
| filter.most_used_syntax | The most used filter syntax among all the requests containing the `filter` parameter in the aggregated event. `string` / `array` / `mixed` | `mixed` |
| q.max_terms_number | The maximum number of terms for the `q` parameter among all requests in the aggregated event. | `5` |
33 changes: 16 additions & 17 deletions text/0059-geo-search.md
@@ -14,16 +14,15 @@ The purpose of this specification is to add a first iteration of the **geosearch
#### Summary Key points

- Documents MUST have a `_geo` reserved object to be geosearchable.
- Filter documents by a given geo radius using the built-in filter `_geoRadius({lat}, {lng}, {distance_in_meters})`. It is possible to cumulate several geosearch filters within the `filter` field.
- Filter documents by a given geo radius using the built-in filter `_geoRadius({lat}, {lng}, {distance_in_meters})`.
- Filter documents by a given geo bounding box using the built-in filter `_geoBoundingBox([{lat}, {lng}], [{lat}, {lng}])`. The first pair of coordinates represents the top left corner of the bounding box, while the second pair represents the bottom right corner.
- Several geosearch filters can be combined within the `filter` field.
- Sort documents in ascending/descending order around a geo point. e.g. `_geoPoint({lat}, {lng}):asc`.
- It is possible to filter and/or sort by geographical criteria of the user's choice.
- `_geo` must be set as a filterable attribute to use geo filtering capabilities.
- `_geo` must be set as a sortable attribute to use geo sort capabilities.
- There is no `geo` ranking rule that can be manipulated by the user. This one is automatically integrated in the ranking rule `sort` by default and activated by sorting using the `_geoPoint({lat}, {lng})` built-in sort rule.
- Using `_geoPoint({lat}, {lng})` in the `sort` parameter at search leads the engine to return a `_geoDistance` within the search results. This field represents the distance in meters of the document from the specified `_geoPoint`.
- Add an `invalid_geo_field` error.
- Add an alternative message for `invalid_sort` and `invalid_filter` error to handle reserved keywords.
- `invalid_criterion` is renamed to `invalid_ranking_rule` and add an alternative message to handle reserved keywords.

### II. Motivation

@@ -134,6 +133,14 @@ csv format example

> The `_geo` field has to be set in `filterableAttributes` setting by the developer to activate geo filtering capabilities at search.

**`_geoBoundingBox` built-in filter rule definition**

- Name: `_geoBoundingBox`
- Signature: ([{lat:float}:required, {lng:float}:required], [{lat:float}:required, {lng:float}:required])
- Not required

> The `_geo` field has to be set in `filterableAttributes` setting by the developer to activate geo filtering capabilities at search.
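
As an illustration of the signature above, here is a minimal client-side sketch that builds a `_geoBoundingBox` filter string from two coordinate pairs. The helper name `geo_bounding_box_filter`, the latitude/longitude range checks, and the sample payload (including the `type` attribute) are assumptions made for this example; the specification itself only requires four floats.

```python
def geo_bounding_box_filter(top_left, bottom_right):
    """Build a `_geoBoundingBox` filter string from two (lat, lng) pairs.

    Hypothetical helper, not part of any Meilisearch SDK.
    """
    for lat, lng in (top_left, bottom_right):
        # Illustrative sanity checks (assumption: standard WGS 84 ranges).
        if not -90.0 <= lat <= 90.0:
            raise ValueError(f"latitude out of range: {lat}")
        if not -180.0 <= lng <= 180.0:
            raise ValueError(f"longitude out of range: {lng}")
    (lat1, lng1), (lat2, lng2) = top_left, bottom_right
    return f"_geoBoundingBox([{lat1}, {lng1}], [{lat2}, {lng2}])"


# Hypothetical search payload combining the geo filter with another condition.
payload = {
    "q": "",
    "filter": geo_bounding_box_filter((45.49, 9.10), (45.44, 9.25))
    + " AND type = restaurant",
}
```

The resulting string can be sent as the `filter` search parameter, provided `_geo` (and, in this example, `type`) is declared in `filterableAttributes`.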

#### GET Search `/indexes/{indexUid}/search`

```
@@ -148,7 +155,7 @@ csv format example
}
```

- 🔴 Specifying parameters that do not conform to the `_geoRadius` signature causes the API to return an [invalid_search_parameter_filter](0061-error-format-and-definitions.md#invalid_search_parameter_filter) error.
- 🔴 Specifying parameters that do not conform to the `_geoRadius` or `_geoBoundingBox` signature causes the API to return an [invalid_search_parameter_filter](0061-error-format-and-definitions.md#invalid_search_parameter_filter) error.
- 🔴 Using `_geoDistance`, `_geo` or `_geoPoint` in a filter expression causes the API to return an [invalid_search_parameter_filter](0061-error-format-and-definitions.md#invalid_search_parameter_filter) error.
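
To make the signature check concrete, the sketch below parses a `_geoBoundingBox` expression with a regular expression and rejects anything that does not match the two-pair shape. This is not the engine's parser: the function name `parse_geo_bounding_box`, the accepted number format, and the use of `ValueError` as a stand-in for the `invalid_search_parameter_filter` error are all assumptions for illustration.

```python
import re

# Assumed number format: optional sign, digits, optional decimal part.
_NUM = r"[-+]?\d+(?:\.\d+)?"
_BBOX = re.compile(
    rf"^_geoBoundingBox\(\s*\[\s*({_NUM})\s*,\s*({_NUM})\s*\]"
    rf"\s*,\s*\[\s*({_NUM})\s*,\s*({_NUM})\s*\]\s*\)$"
)


def parse_geo_bounding_box(expr):
    """Return ((lat, lng), (lat, lng)) or raise ValueError.

    ValueError stands in for the API's invalid_search_parameter_filter error.
    """
    match = _BBOX.match(expr.strip())
    if match is None:
        raise ValueError(f"invalid _geoBoundingBox expression: {expr!r}")
    lat1, lng1, lat2, lng2 = map(float, match.groups())
    return (lat1, lng1), (lat2, lng2)
```

With this sketch, `_geoBoundingBox([45.49, 9.1], [45.44, 9.25])` parses into two pairs, while a flat argument list such as `_geoBoundingBox(45.49, 9.1, 45.44, 9.25)` is rejected.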

---
@@ -184,7 +191,7 @@ Following the [`sort` specification feature](https://github.com/meilisearch/spec
}
```
- 🔴 Specifying parameters that do not conform to the `_geoPoint` signature causes the API to return an [invalid_search_parameter_sort](0061-error-format-and-definitions.md#invalid_search_parameter_sort) error.
- 🔴 Using `_geoDistance`, `_geo` or `_geoRadius` in a sort expression causes the API to return an[invalid_search_parameter_sort](0061-error-format-and-definitions.md#invalid_search_parameter_sort) error.
- 🔴 Using `_geoDistance`, `_geo`, `_geoRadius` or `_geoBoundingBox` in a sort expression causes the API to return an [invalid_search_parameter_sort](0061-error-format-and-definitions.md#invalid_search_parameter_sort) error.

---

@@ -199,23 +206,16 @@ Following the [`sort` specification feature](https://github.com/meilisearch/spec
- Type: int
- Not required

> 💡 `_geoDistance` response field is only computed and shown when the end-user have sorted documents around a `_geoPoint`. So if the end-user filters documents using a `_geoRadius` built-in filter without sorting them around a `_geoPoint`, this field `_geoDistance` will not appear in the search response.
> 💡 The `_geoDistance` response field is only computed and shown when the end-user has sorted documents around a `_geoPoint`. If the end-user filters documents using a `_geoRadius` or `_geoBoundingBox` built-in filter without sorting them around a `_geoPoint`, `_geoDistance` will not appear in the search response.

---

#### Related Ranking Rules Settings API Errors

- 🔴 Specifying a custom ranking rule with `_geo`, `_geoDistance`, `_geoPoint`, or `_geoRadius` returns an [invalid_settings_ranking_rules](0061-error-format-and-definitions.md#invalid_settings_ranking_rules) error.
- 🔴 Specifying a custom ranking rule with `_geo`, `_geoDistance`, `_geoPoint`, `_geoRadius` or `_geoBoundingBox` returns an [invalid_settings_ranking_rules](0061-error-format-and-definitions.md#invalid_settings_ranking_rules) error.

---

### IV. Finalized Key Changes

- Add a `_geo` reserved field on JSON and CSV format to index a geo point coordinates for a document.
- Add a `_geoPoint(lat, lng)` built-in sort rule.
- Add a `_geoRadius(lat, lng, distance_in_meters)` built-in filter rule.
- Return a `_geoDistance` in `hits` objects representing the distance in meters computed from the `_geoPoint` built-in sort rule.

## 2. Technical Aspects

### I. Measuring
@@ -225,8 +225,7 @@ Following the [`sort` specification feature](https://github.com/meilisearch/spec

## 3. Future Possibilities

- Add built-in filter to filter documents within `polygon` and `bounding-box`.
- Add built-in filter to filter documents within `polygon`.
- Handling array of geo points in the document object.
- Handling multiple geo formats for the `_geo` field. e.g. "{lat},{lng}", a geohash etc.
- Handling distance in other formats (like the imperial format). **It's easy to implement on the user side though.**
- Handling position in other formats. It seems that [degrees and minutes](https://www.pacioos.hawaii.edu/voyager-news/lat-long-formats/) are also used a lot. **It's easy to implement on the user side though.**
2 changes: 1 addition & 1 deletion text/0061-error-format-and-definitions.md
@@ -1095,7 +1095,7 @@ HTTP Code: `400 Bad Request` when `Synchronous`
}
```

#### Variant: Specifying a custom ranking rule on reserved expression `_geoRadius`
#### Variant: Specifying a custom ranking rule on reserved expressions `_geoRadius` / `_geoBoundingBox`

```json
{
6 changes: 4 additions & 2 deletions text/0118-search-api.md
@@ -94,7 +94,7 @@ expression = or
or = and ("OR" WS+ and)*
and = not ("AND" WS+ not)*
not = ("NOT" WS+ not) | primary
primary = "(" WS* expression WS* ")" | geoRadius | in | condition | exists | not_exists | to
primary = "(" WS* expression WS* ")" | geoRadius | geoBoundingBox | in | condition | exists | not_exists | to
in = attribute "IN" WS* "[" value_list "]"
condition = attribute ("=" | "!=" | ">" | ">=" | "<" | "<=") value
exists = attribute "EXISTS"
@@ -163,6 +163,7 @@ The grammar for the value of a filterable attribute is the same as the grammar f
- OR: `filter OR filter`
- NOT: `NOT filter`
- GeoSearch: `_geoRadius(lat, lng, distance)`
- GeoSearch: `_geoBoundingBox([lat, lng], [lat, lng])`

###### 3.1.2.1.5 Equality

@@ -350,7 +351,8 @@ attribute != value1 AND attribute != value2 AND ...

###### 3.1.2.1.12 Geo Search

The `_geoRadius` operator selects the documents whose geographical coordinates fall within a certain range of a given coordinate. See [GeoSearch](0059-geo-search.md) for more information.
- The `_geoRadius` operator selects the documents whose geographical coordinates fall within a certain range of a given coordinate. See [GeoSearch](0059-geo-search.md) for more information.
- The `_geoBoundingBox` operator selects the documents whose geographical coordinates fall within the rectangular area described by the two given coordinate pairs. See [GeoSearch](0059-geo-search.md) for more information.
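
The selection performed by `_geoBoundingBox` can be pictured with the following sketch. It assumes the corner order given in the GeoSearch summary (first pair top left, second pair bottom right), treats the edges as inclusive, and ignores boxes that cross the antimeridian; none of these details are confirmed here as the engine's actual behavior, and `in_bounding_box` is a hypothetical name.

```python
def in_bounding_box(point, top_left, bottom_right):
    """Return True if `point` (lat, lng) lies inside the box.

    Assumptions: inclusive edges, top_left is north-west, bottom_right is
    south-east, and the box does not cross the 180° meridian.
    """
    lat, lng = point
    top, left = top_left
    bottom, right = bottom_right
    return bottom <= lat <= top and left <= lng <= right


# A document located in Milan, tested against a box around the city.
milan = (45.46, 9.19)
print(in_bounding_box(milan, (45.49, 9.10), (45.44, 9.25)))
```

A document whose `_geo` coordinates lie outside either the latitude or the longitude interval is excluded from the results.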

##### 3.1.2.2. Array Syntax

Expand Down