Releases · pelias/api

Address parsing is huge for an address geocoder, and this release takes a first crack at it using the AddressIt module. AddressIt is a freeform street address parser designed to take a piece of text and convert it into a structured address that can be processed in different systems.

> var addressit = require('addressit')

> addressit('123 main st new york ny 10010 usa')
{ text: '123 main st new york ny 10010 usa',
  parts: [],
  unit: undefined,
  number: 123,
  street: 'main st',
  state: 'NY',
  country: 'USA',
  postalcode: 10010,
  regions: [ 'new york' ] }

Before the Pelias API calls addressit for address parsing, it runs some basic checks on the query so that we don't slow things down drastically when parsing is unnecessary. For example, address parsing is not needed in the following cases:

input=a or input=au or input=aus - if the input has 3 or fewer characters, we can assume it's not a fully formed address. In fact, we can go one step further and target only the admin layers: results such as austin or australia should be relevant and, more importantly, fast.
input=boston or input=frankfurt or input=somereallybigname or input=new york - if the input is just one or two tokens and does not contain a number, we can get away with targeting only the admin and poi layers.
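
The two pre-checks above could be sketched roughly as follows. This is a minimal illustration, not the code in helper/query_parser.js: the function name chooseLayers, the layer names, and the shape of the return value are all assumptions made for the example.

```javascript
// Hypothetical helper: decide which layers to target before running the
// (comparatively expensive) address parser. The layer names 'admin', 'poi'
// and 'address' are assumptions for illustration.
function chooseLayers(input) {
  const text = input.trim();
  const tokens = text.split(/\s+/).filter(Boolean);
  const hasNumber = /\d/.test(text);

  // 3 or fewer characters: too short to be a full address, admin only.
  if (text.length <= 3) {
    return { layers: ['admin'], parseAddress: false };
  }

  // One or two tokens without a digit: admin and poi layers are enough.
  if (tokens.length <= 2 && !hasNumber) {
    return { layers: ['admin', 'poi'], parseAddress: false };
  }

  // Everything else goes through full address parsing.
  return { layers: ['admin', 'poi', 'address'], parseAddress: true };
}

console.log(chooseLayers('aus'));
console.log(chooseLayers('new york'));
console.log(chooseLayers('123 main st new york ny 10010 usa'));
```

Checking the cheap conditions first means short autocomplete-style inputs never reach the parser at all.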

In all other cases, we parse the address and use the address parts to query the ES index. Here's a sample mapping:

number + street -> name.default
number -> address.number
street -> address.street
postalcode -> address.zip
state -> admin1_abbr
country -> alpha3
regions -> admin2
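
As a rough sketch of the mapping above, the parsed address could be translated into index fields like this. The function name toQueryFields is hypothetical, and passing regions through as an array is an assumption; the release notes do not say how multiple regions are combined.

```javascript
// Hypothetical translation of addressit output into index field names,
// following the sample mapping listed above.
function toQueryFields(parsed) {
  const fields = {};
  const hasNumber = parsed.number !== undefined;

  if (hasNumber && parsed.street) {
    fields['name.default'] = parsed.number + ' ' + parsed.street;
  }
  if (hasNumber) fields['address.number'] = parsed.number;
  if (parsed.street) fields['address.street'] = parsed.street;
  if (parsed.postalcode !== undefined) fields['address.zip'] = parsed.postalcode;
  if (parsed.state) fields['admin1_abbr'] = parsed.state;
  if (parsed.country) fields['alpha3'] = parsed.country;
  // Assumption: regions are kept as an array and matched against admin2.
  if (parsed.regions && parsed.regions.length > 0) {
    fields['admin2'] = parsed.regions.slice();
  }
  return fields;
}

// The parser output shown earlier for '123 main st new york ny 10010 usa'.
const parsed = {
  text: '123 main st new york ny 10010 usa',
  parts: [],
  unit: undefined,
  number: 123,
  street: 'main st',
  state: 'NY',
  country: 'USA',
  postalcode: 10010,
  regions: ['new york']
};

console.log(toQueryFields(parsed));
```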

Sometimes the address parser comes back empty-handed:

> addressit('123 chelsea, london')
{ text: '123 chelsea, london',
  parts: [],
  unit: undefined,
  state: undefined,
  country: undefined,
  postalcode: undefined,
  regions: [ '123 chelsea', 'london' ] }

In this case, we fall back to the naive approach we implemented months ago: we split the input on the comma, assume everything after the comma is an admin part, and add a match block to the should array. So we query name.default with 123 chelsea, and the should array in the query tries to match london against all 5 admin fields:

admin0
admin1
admin1_abbr
admin2
alpha3
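
A minimal sketch of this comma fallback, assuming a standard Elasticsearch bool query. The function name buildFallbackQuery and the placement of the name.default match under must are assumptions for illustration; the actual query is built in query/search.js.

```javascript
// The 5 admin fields listed above.
const ADMIN_FIELDS = ['admin0', 'admin1', 'admin1_abbr', 'admin2', 'alpha3'];

// Hypothetical fallback: split on the first comma, query name.default with
// the left part, and add one match block per admin field for the right part.
function buildFallbackQuery(input) {
  const commaIndex = input.indexOf(',');
  const name = (commaIndex === -1 ? input : input.slice(0, commaIndex)).trim();
  const admin = commaIndex === -1 ? '' : input.slice(commaIndex + 1).trim();

  const query = {
    bool: {
      // Assumption: the name match is required, admin matches only boost.
      must: [{ match: { 'name.default': name } }],
      should: []
    }
  };

  if (admin) {
    for (const field of ADMIN_FIELDS) {
      query.bool.should.push({ match: { [field]: admin } });
    }
  }
  return query;
}

console.log(JSON.stringify(buildFallbackQuery('123 chelsea, london'), null, 2));
```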

All of this logic lives in helper/query_parser.js and is well documented with in-line comments. The query changes can be seen in query/search.js.

An additional 104 test cases were written to test all of the logic described above and the query building, bringing the grand total of unit tests for the API to 708!

21 Jul 14:31

hkrishna

2.1.1

5401ed9

Deleting code is so much fun

Code cleanup - deleted all suggester related code (843 deletions) FTW!
Tech Debt - Better 408/500 error handling

21 Jul 14:26

hkrishna

2.1.0

28e656f

Minor cleanup -> minor speedup

Minor cleanup, minor speedup and minor performance improvement - brought to you by:

removed exact_match script
increased search radius to 500 km

21 Jul 14:23

hkrishna

2.0.0

812aa9f

NGRAMS

This release is a big one: we are using ngrams to analyze/tokenize and are officially moving away from the context suggester, which is memory intensive and wasn't letting us build an autocomplete suggester on a global scale.

Some major features:

partial matching using the ngrams approach ftw! https://www.elastic.co/guide/en/elasticsearch/guide/current/_ngrams_for_partial_matching.html
better support for geohashes https://github.com/pelias/schema/blob/ngram/mappings/partial/centroid.js
explicit definitions of how field data is to be stored
improved punctuation https://github.com/pelias/schema/blob/ngram/punctuation.js
improved synonyms: https://github.com/pelias/schema/blob/ngram/street_suffix.js
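
To illustrate why ngrams enable partial matching, here is a small sketch that splits a token into character n-grams. It is plain JavaScript for illustration only; in the release the tokenizing is done by the Elasticsearch analyzers defined in pelias/schema, and the function name and gram sizes here are assumptions.

```javascript
// Illustrative character n-gram generator (not the Elasticsearch tokenizer).
// Returns every substring of `token` whose length is between minGram and
// maxGram, grouped by size.
function ngrams(token, minGram, maxGram) {
  const grams = [];
  for (let size = minGram; size <= maxGram; size++) {
    for (let start = 0; start + size <= token.length; start++) {
      grams.push(token.slice(start, start + size));
    }
  }
  return grams;
}

console.log(ngrams('york', 2, 3));
// → [ 'yo', 'or', 'rk', 'yor', 'ork' ]
```

Because fragments such as yo and yor are indexed alongside the full word, a partially typed query can match the record without a separate suggester structure held in memory.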

06 May 20:25

sevko

1.2.1

dff3f98

category scoring, bbox format change, fix JSONification bug

Minor improvements/bugfixes:

change the API bbox parameter format to a more conventional format (#117)
fix a bounding-box runtime error (#124)
boost records belonging to certain feature categories (#106)

3.1.2 (2016-08-19)

Bug Fixes

3.1.1 (2016-08-09)

Bug Fixes

3.1.0 (2016-08-05)

Features

3.0.2 (2016-08-03)

Bug Fixes

3.0.1 (2016-07-28)

Bug Fixes
