
@searchable generated DynamoDB-to-ES trigger converts JSON integer to decimal. #371

Open
andrew-aernos opened this issue Oct 7, 2019 · 7 comments

andrew-aernos commented Oct 7, 2019


Describe the bug
Having AWS custom scalar timestamp fields (or integer types) in a type marked @model and @searchable causes the generated DynamoDB-to-Elasticsearch trigger to index those integer values as decimals.

To Reproduce
Steps to reproduce the behavior:

  1. Create a type in schema.graphql:

type Foo @model @searchable {
  id: ID!
  content: String!
  createdAtEpochSeconds: AWSTimestamp
}

  2. Deploy with amplify push.
  3. Using the generated API, create an example Foo in the API console:
mutation {
  createFoo(input: {content: "blah"}) {
    id
    createdAtEpochSeconds
  }
}
  4. Check the DynamoDB-to-ES trigger's CloudWatch log.

The DynamoDB New Image event JSON is okay:
{
  'id': '079c4bdf-61ea-416a-857e-35492a6b0145', 
  'createdAt': {'S': '2019-10-07T21:54:17.996Z'}, 
  'createdAtEpochSeconds': {'N': '1570485257'}, 
  'version': {'N': '1'}
}

But posting to ES turns the integer numbers into floats/doubles:

{
  'id': '079c4bdf-61ea-416a-857e-35492a6b0145', 
  'createdAt': '2019-10-07T21:54:17.996Z', 
  'createdAtEpochSeconds': 1570485257.0,   // <= causing headache in searchFoos()
  'version': 1.0 // <= problematic
}
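The .0 suffix is consistent with the trigger's Lambda deserializing DynamoDB 'N' values into Python Decimal objects and then coercing them to float before posting to ES. A minimal stdlib sketch of that effect (the deserialize_number helper is hypothetical, standing in for whatever deserializer the trigger actually uses):

```python
import json
from decimal import Decimal

def deserialize_number(n: str) -> Decimal:
    # Stand-in for a DynamoDB deserializer, which returns Decimal for 'N' values
    return Decimal(n)

image = {'createdAtEpochSeconds': {'N': '1570485257'}, 'version': {'N': '1'}}
# Coercing Decimal to float before JSON-encoding reproduces the '.0' artifact
doc = {k: float(deserialize_number(v['N'])) for k, v in image.items()}
print(json.dumps(doc))
# {"createdAtEpochSeconds": 1570485257.0, "version": 1.0}
```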

Expected behavior
I expect both 'createdAtEpochSeconds' and 'version' to be indexed as integers. The API.service.ts produced by amplify codegen declares SearchableIntFilterInput for these fields.

The epoch-seconds value 1570485257 turns into 1570485257.0, and then somehow into the scientific-notation string 1.570485257E9 when searching the records. This causes downstream serialize/deserialize issues in searchFoos, for example when resolving the createdAtEpochSeconds AWSTimestamp field:

"Can't serialize value (/searchFoos/items[0]/createdAtEpochSeconds) : Unable to serialize `1.570485257E9` as a valid timestamp. Ensure that the value provided is an Integer that lies within the limits specified in this scalar's description."

Desktop:

  • OS: Windows 10
  • Browser: Chrome

@andrew-aernos andrew-aernos changed the title @searchable generated DynamoDB trigger converts JSON integer to decimal. @searchable generated DynamoDB-to-ES trigger converts JSON integer to decimal. Oct 7, 2019

yuth commented Oct 8, 2019

Elasticsearch infers the types of indexed fields from the content it finds, not from the types defined in the schema. I am marking this as an enhancement and adding it to our backlog.
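One way to preempt that dynamic type inference is to create the index with an explicit mapping before the first document is indexed. A sketch in Kibana console syntax (the index name foo is an assumption; Amplify derives the real index name from the model type, and on Elasticsearch 6.x the properties block must sit under an additional doc-type level):

```
PUT foo
{
  "mappings": {
    "properties": {
      "createdAtEpochSeconds": { "type": "long" },
      "version": { "type": "long" }
    }
  }
}
```

As noted further down the thread, this may still not help if the incoming value itself has already been turned into a float by the trigger.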


jcbdev commented Jun 29, 2020

@yuth

I am experiencing this problem with the _lastChangedAt field used in conflict resolution (I think?)

The error I get is:

Can't serialize value (/searchEvents/items[7]/_lastChangedAt) : Unable to serialize `1.59339025024E12` as a valid timestamp.
 Ensure that the value provided is an Integer that lies within the limits specified in this scalar's description.

This is a straightforward @searchable directive on an API with conflict resolution (nothing special or custom going on).

I could probably put the mapping for the field in Kibana before the index is created, but this seems like a pretty big bug??

@DaZhang1994

Well, I used an UGLY method to suppress this exception.

If you don't need the '_lastChangedAt' field (which is added by the Amplify framework by default), you can try overriding its value to 0 (or any other integer below MAX_LONG), because ES (Elasticsearch) will convert the field type to float automatically if the value is larger than the maximum of the 'long' type.

I first tried creating a type mapping manually on AWS ES and setting '_lastChangedAt' to the 'long' type, but it didn't work: ES ignores your type mapping if the field value exceeds MAX_LONG and still converts it to float automatically :(

So I wrote a pipeline on AWS ES using Kibana (pre-installed by AWS) like this (a pipeline is an interceptor/AOP-style hook in ES):

PUT _ingest/pipeline/overrideLastChangedAt
{
  "description": "set _lastChangedAt field to 0 (we need to change this value to an integer, but cant convert such a large number, so we set it to 0)",
  "processors" : [
    {
      "set" : {
        "field" : "_lastChangedAt",
        "value"  : 0,
        "override": true
      }
    }
  ]
}
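As a sanity check of what that set processor does to each document, here is a toy Python imitation (not the ES implementation, just the observable behavior for a top-level field):

```python
def apply_set_processor(doc: dict, field: str, value, override: bool = True) -> dict:
    # Imitates the ES 'set' ingest processor for a top-level field:
    # writes `value` into `field`, replacing any existing value when override=True
    if override or field not in doc:
        doc[field] = value
    return doc

doc = {'_lastChangedAt': 1593996144640.0, 'id': 'abc'}
print(apply_set_processor(doc, '_lastChangedAt', 0))
# {'_lastChangedAt': 0, 'id': 'abc'}
```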

Then invoke the pipeline by overwriting your default AWS Lambda trigger (created by amplify automatically).
Find this line:

req = AWSRequest(method=method, url=proto + host +
                    quote(path), data=payload, headers={'Host': host, 'Content-Type': 'application/json'})

Change it to:

req = AWSRequest(method=method, url=proto + host +
                    quote(path) + '?pipeline=overrideLastChangedAt', data=payload, headers={'Host': host, 'Content-Type': 'application/json'})

All done.


jcbdev commented Jul 5, 2020

@DaZhang1994 Interesting.

I haven't had a chance to try solving this yet, as I have other more pressing issues in my current project, but I may need to do this, so thanks!

As a matter of interest, when you did the custom type mapping, did you try the "date" format instead of "long"? I have no idea if it works, but I was going to try it.


DaZhang1994 commented Jul 6, 2020

> @DaZhang1994 Interesting.
>
> I haven't had a chance to try solving this yet, as I have other more pressing issues in my current project, but I may need to do this, so thanks!
>
> As a matter of interest, when you did the custom type mapping, did you try the "date" format instead of "long"? I have no idea if it works, but I was going to try it.

I tried, but none of them worked ('long', 'int', 'date').

But I found another way to solve this problem by checking AWS lambda log.

Deserialized doc_fields: {'createdAt': '2020-07-06T00:42:24.619Z', '_lastChangedAt': 1593996144640.0, 'title': 'myModelTitle', '__typename': 'MyModelType', 'id': 'xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', '_version': 1.0, 'updatedAt': '2020-07-06T00:42:24.619Z'}

The problem is '_lastChangedAt': 1593996144640.0, which means the AWS default lambda function (or its DynamoDB deserialization library) converts _lastChangedAt from an integer to a float. We just need to convert it back by modifying the default lambda function.

After this line

        doc_fields = ddb_deserializer.deserialize({'M': ddb[image_name]})

Add this (it casts the value from float back to int in Python):

        doc_fields['_lastChangedAt'] = int(doc_fields['_lastChangedAt'])

Tested: it now works without the exception.
If you have more fields like '_lastChangedAt' that hit the same exception, you can apply the same conversion to them.
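The same idea can be generalized so that any whole-number float in the document is cast back to int before posting, instead of naming each field individually. A sketch, not part of the generated trigger:

```python
def restore_ints(value):
    # Recursively cast whole-number floats (e.g. Decimal-derived DynamoDB numbers)
    # back to int so ES indexes them as longs instead of floats
    if isinstance(value, float) and value.is_integer():
        return int(value)
    if isinstance(value, dict):
        return {k: restore_ints(v) for k, v in value.items()}
    if isinstance(value, list):
        return [restore_ints(v) for v in value]
    return value

doc_fields = {'_lastChangedAt': 1593996144640.0, '_version': 1.0, 'title': 'myModelTitle'}
print(restore_ints(doc_fields))
# {'_lastChangedAt': 1593996144640, '_version': 1, 'title': 'myModelTitle'}
```

Note that this also converts intentional floats whose value happens to be whole (e.g. a price of 2.0), so restrict it to known fields if that matters for your schema.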

@saltonmassally

@yuth is there an ETA on this?

@jcbdev
Copy link

jcbdev commented Jun 17, 2021

I've submitted a PR which I think should fix this for all timestamp fields. Hopefully it will get some love from the amplify team.

aws-amplify/amplify-cli#7534

EDIT: I closed this PR but then reopened it after fixing a bug I found.

7 participants