How put an array of objects at Google Cloud Datastore? #3361

Closed
gustavorps opened this issue May 3, 2017 · 7 comments
Labels
api: datastore Issues related to the Datastore API.

Comments

gustavorps commented May 3, 2017

OS : Ubuntu 14.04.5
Python: 3.4.3
google-cloud-datastore: 0.24.0

Code example:

from google.cloud import datastore

class GoogleDatastorePipeline(object):
    
    def __init__(self, settings, stats):
        self.client = datastore.Client()
        self.batch = self.client.batch()

    def open(self):
        self.batch.begin()

    def process(self):
        key = self.client.key('Article')
        entity = datastore.Entity(key)
        entity['name'] = 'A post'
        entity['content'] = '<html></html>'
        # How do I insert this as an array of objects? Indexing is optional.
        entity['authors'] = [{
            'name': 'Author 1', 
            'type': 'person', 
        },{
            'name': 'Author 2', 
            'type': 'organization', 
        }]

        self.batch.put(entity)

    def close(self):
        self.batch.commit()

Stacktrace:

  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/ubuntu/workspace/ze/pipelines/__init__.py", line 215, in process_item
    self.batch.put(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/batch.py", line 199, in put
    _assign_entity_to_pb(entity_pb, entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/batch.py", line 319, in _assign_entity_to_pb
    bare_entity_pb = helpers.entity_to_protobuf(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 219, in entity_to_protobuf
    _set_protobuf_value(value_pb, value)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 408, in _set_protobuf_value
    attr, val = _pb_attr_value(val)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 325, in _pb_attr_value
    raise ValueError("Unknown protobuf attr type %s" % type(val))
ValueError: Unknown protobuf attr type <class 'dict'>
2017-05-03 01:30:16 [scrapy.core.engine] INFO: Closing spider (finished)
2017-05-03 01:30:16 [google_auth_httplib2] DEBUG: Making request: POST https://accounts.google.com/o/oauth2/token
2017-05-03 01:30:16 [scrapy.core.engine] ERROR: Scraper close failure
Traceback (most recent call last):
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/gax/retry.py", line 120, in inner
    return to_call(*args)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/gax/retry.py", line 68, in inner
    return a_func(*updated_args, **kwargs)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/grpc/_channel.py", line 507, in __call__
    return _end_unary_response_blocking(state, call, False, deadline)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/grpc/_channel.py", line 455, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.INVALID_ARGUMENT, Mutation is missing operation.)>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/_gax.py", line 74, in _catch_remap_gax_error
    yield
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/_gax.py", line 173, in commit
    return super(GAPICDatastoreAPI, self).commit(*args, **kwargs)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/gapic/datastore/v1/datastore_client.py", line 345, in commit
    return self._commit(request, options)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/gax/api_callable.py", line 419, in inner
    return api_caller(api_call, this_settings, request)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/gax/api_callable.py", line 407, in base_caller
    return api_call(*args)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/gax/api_callable.py", line 368, in inner
    return a_func(*args, **kwargs)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/gax/retry.py", line 126, in inner
    ' classified as transient', exception)
google.gax.errors.RetryError: GaxError(Exception occurred in retry method that was not classified as transient, caused by <_Rendezvous of RPC that terminated with (StatusCode.INVALID_ARGUMENT, Mutation is missing operation.)>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/ubuntu/workspace/ze/pipelines/__init__.py", line 193, in close_spider
    self.batch.commit()
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/batch.py", line 273, in commit
    self._commit()
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/batch.py", line 249, in _commit
    self.project, mode, self._mutations, transaction=self._id)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/_gax.py", line 173, in commit
    return super(GAPICDatastoreAPI, self).commit(*args, **kwargs)
  File "/usr/lib/python3.4/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/_gax.py", line 82, in _catch_remap_gax_error
    six.reraise(error_class, new_exc, sys.exc_info()[2])
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/_gax.py", line 74, in _catch_remap_gax_error
    yield
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/_gax.py", line 173, in commit
    return super(GAPICDatastoreAPI, self).commit(*args, **kwargs)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/gapic/datastore/v1/datastore_client.py", line 345, in commit
    return self._commit(request, options)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/gax/api_callable.py", line 419, in inner
    return api_caller(api_call, this_settings, request)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/gax/api_callable.py", line 407, in base_caller
    return api_call(*args)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/gax/api_callable.py", line 368, in inner
    return a_func(*args, **kwargs)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/gax/retry.py", line 126, in inner
    ' classified as transient', exception)
google.cloud.exceptions.BadRequest: 400 Mutation is missing operation.

gustavorps commented May 3, 2017

I tested the JSON below in the Datastore Entities console (https://console.cloud.google.com/datastore/entities/new) and it worked!

{
    "values": [
        {
            "entityValue": {
                "properties": {
                    "name": {
                        "stringValue": "NAME1"
                    },
                    "type": {
                        "stringValue": "TYPE1"
                    }
                }
            }
        },
        {
            "entityValue": {
                "properties": {
                    "name": {
                        "stringValue": "NAME2"
                    },
                    "type": {
                        "stringValue": "TYPE2"
                    }
                }
            }
        }
    ]
}

[Screenshot from the Google Cloud Datastore Entities console]

But the same structure does not work from Python:

entity['author'] = {
    'values': [
        {
            'entityValue': {
                'properties': {
                    'name': {
                        'stringValue': 'NAME1'
                    },
                    'type': {
                        'stringValue': 'TYPE1'
                    }
                }
            }
        },{
            'entityValue': {
                'properties': {
                    'name': {
                        'stringValue': 'NAME2'
                    },
                    'type': {
                        'stringValue': 'TYPE2'
                    }
                }
            }
        }
    ]
}

Traceback

Traceback (most recent call last):
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/ubuntu/workspace/ze/pipelines/__init__.py", line 247, in process_item
    self.client.put(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/client.py", line 384, in put
    self.put_multi(entities=[entity])
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/client.py", line 408, in put_multi
    current.put(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/batch.py", line 199, in put
    _assign_entity_to_pb(entity_pb, entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/batch.py", line 319, in _assign_entity_to_pb
    bare_entity_pb = helpers.entity_to_protobuf(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 219, in entity_to_protobuf
    _set_protobuf_value(value_pb, value)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 408, in _set_protobuf_value
    attr, val = _pb_attr_value(val)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 325, in _pb_attr_value
    raise ValueError("Unknown protobuf attr type %s" % type(val))

dhermes commented May 3, 2017

Thanks for filing, @gustavorps. I'll look into it.

dhermes added the api: datastore and type: bug labels May 3, 2017

dhermes commented May 3, 2017

@gustavorps Two things:

  • As your second stacktrace shows, dict is not supported as an entity type. Here is the list of supported types, though we should do a better job surfacing this list / making __setitem__ fail for a bad type. What you want to use there is another Entity, not a dict.
  • Do you want a Batch or a Transaction? A batch will just accumulate mutations, but won't commit them transactionally.
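
The batch-versus-transaction choice in the second point can be sketched as below. This is a minimal illustration, not the library's recommended pattern: `save` and its `atomic` flag are hypothetical names, and the sketch assumes that the objects returned by `client.batch()` and `client.transaction()` are used as context managers that commit on a clean exit and roll back when an exception escapes the block.

```python
def save(client, entities, atomic=False):
    """Write entities through a transaction (atomic=True) or a plain batch.

    `client` is assumed to be a google.cloud.datastore.Client; `save` and
    `atomic` are hypothetical names used only for this sketch.
    """
    # Pick the writer: a transaction commits all-or-nothing, while a batch
    # only groups the mutations into a single commit request.
    context = client.transaction() if atomic else client.batch()
    with context as writer:
        for entity in entities:
            writer.put(entity)
```

Because the `with` block owns the begin/commit/rollback sequence, a `put()` that raises never leads to a later `commit()` of a half-built batch.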

I am preemptively closing this issue since the dict -> Entity fix covers your two errors (the second error occurs because batch.put() will fail with a bad entity with a dict field and if the put() failed then there will be no mutations to commit).
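
For a pipeline like the one in the report, where `begin()`, `put()` and `commit()` are called from separate methods, one way to avoid the follow-up commit error is to remember that a `put()` failed and roll back instead of committing. The sketch below is only an illustration: `BatchWriter` is a hypothetical wrapper, and it assumes the wrapped object exposes `begin()`, `put()`, `commit()` and `rollback()` as a Datastore batch does.

```python
class BatchWriter(object):
    """Hypothetical wrapper: commit only if every put() succeeded."""

    def __init__(self, batch):
        self.batch = batch    # e.g. the object returned by client.batch()
        self.failed = False

    def open(self):
        self.batch.begin()

    def process(self, entity):
        try:
            self.batch.put(entity)
        except Exception:
            # Remember the failure, then let the original error propagate.
            self.failed = True
            raise

    def close(self):
        if self.failed:
            # Discard the batch rather than committing it after a failed put().
            self.batch.rollback()
        else:
            self.batch.commit()
```

With this shape, only the original `ValueError` from `put()` is reported; the secondary "Mutation is missing operation" error from `commit()` is not triggered.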

We can re-open and continue discussion if you think there are more things to discuss.

dhermes closed this as completed May 3, 2017
dhermes added the docs label and removed the type: bug label May 3, 2017
dhermes added a commit to dhermes/google-cloud-python that referenced this issue May 3, 2017
H/T to @gustavorps for bringing this up in googleapis#3361.

Also snuck in a change in `google.cloud.datastore.helpers` to use
`six.binary_type` in place of `(str, bytes)`. (It wasn't a Py3 error
before because that check came **after** a `six.text_type` check.)

gustavorps commented May 3, 2017

Thanks for the fast reply, @dhermes.
I'm now using an Entity instead of a dict. I tried two ways of adding an array of entity values, with these results:

First Form

key = client.key('Article')
entity = datastore.Entity(key)
entity['name'] = 'A post'
entity['content'] = '<html></html>'

author1 = datastore.Entity('author1')
author1['name'] = 'NAME1'
author1['type'] = 'TYPE1'

author2 = datastore.Entity('author2')
author2['name'] = 'NAME2'
author2['type'] = 'TYPE2'

entity['author'] = [author1, author2]
client.put(entity)

Traceback

Traceback (most recent call last):
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/ubuntu/workspace/ze/pipelines/__init__.py", line 234, in process_item
    self.client.put(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/client.py", line 384, in put
    self.put_multi(entities=[entity])
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/client.py", line 408, in put_multi
    current.put(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/batch.py", line 199, in put
    _assign_entity_to_pb(entity_pb, entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/batch.py", line 319, in _assign_entity_to_pb
    bare_entity_pb = helpers.entity_to_protobuf(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 219, in entity_to_protobuf
    _set_protobuf_value(value_pb, value)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 420, in _set_protobuf_value
    _set_protobuf_value(i_pb, item)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 414, in _set_protobuf_value
    entity_pb = entity_to_protobuf(val)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 209, in entity_to_protobuf
    key_pb = entity.key.to_protobuf()
AttributeError: 'str' object has no attribute 'to_protobuf'

Second Form

key = client.key('Article')
entity = datastore.Entity(key)
entity['name'] = 'A post'
entity['content'] = '<html></html>'

author1 = datastore.Entity('author1')
author1['name'] = 'NAME1'
author1['type'] = 'TYPE1'

author2 = datastore.Entity('author2')
author2['name'] = 'NAME2'
author2['type'] = 'TYPE2'

entity['author'] = { 'values': [author1, author2] }
client.put(entity)

Traceback

Traceback (most recent call last):
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/ubuntu/workspace/ze/pipelines/__init__.py", line 234, in process_item
    self.client.put(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/client.py", line 384, in put
    self.put_multi(entities=[entity])
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/client.py", line 408, in put_multi
    current.put(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/batch.py", line 199, in put
    _assign_entity_to_pb(entity_pb, entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/batch.py", line 319, in _assign_entity_to_pb
    bare_entity_pb = helpers.entity_to_protobuf(entity)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 219, in entity_to_protobuf
    _set_protobuf_value(value_pb, value)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 408, in _set_protobuf_value
    attr, val = _pb_attr_value(val)
  File "/home/ubuntu/workspace/env/lib/python3.4/site-packages/google/cloud/datastore/helpers.py", line 325, in _pb_attr_value
    raise ValueError("Unknown protobuf attr type %s" % type(val))
ValueError: Unknown protobuf attr type <class 'dict'>

I decided to use the client directly instead of a transaction because my code runs for more than one minute and only performs inserts, and a batch, as you said, just accumulates mutations without committing them transactionally.

dhermes commented May 3, 2017

datastore.Entity takes a key as the first argument, not a string ('author1' or 'author2'). This is why you saw

AttributeError: 'str' object has no attribute 'to_protobuf'
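
Putting both fixes together (an Entity instead of a dict, and a key instead of a string), the authors list from the original report could be built roughly as follows. This is a sketch under stated assumptions: `to_embedded` and `build_article` are hypothetical helpers, and it assumes that `datastore.Entity()` may be created without a key when the entity is only embedded as a property value.

```python
def to_embedded(items, entity_factory):
    """Copy each plain dict into a fresh object built by entity_factory()."""
    embedded = []
    for item in items:
        # With google-cloud-datastore, entity_factory is datastore.Entity,
        # called with no key because the result is only an embedded value.
        child = entity_factory()
        child.update(item)
        embedded.append(child)
    return embedded


def build_article(client, authors):
    # Imported here so to_embedded() stays usable without the library installed.
    from google.cloud import datastore

    # The first argument is a Key built by the client, not a string.
    article = datastore.Entity(client.key('Article'))
    article['name'] = 'A post'
    article['content'] = '<html></html>'
    article['authors'] = to_embedded(authors, datastore.Entity)
    return article
```

Calling `client.put(build_article(client, authors))` with a list of plain dicts would then store the list as an array of embedded entities rather than raising on the `dict` values.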

gustavorps commented May 3, 2017

My bad, @dhermes!
Everything works now. Thanks very much for the support.

You can close the issue.

dhermes commented May 3, 2017

Cheers!

dhermes added a commit that referenced this issue May 9, 2017
H/T to @gustavorps for bringing this up in #3361.

Also snuck in a change in `google.cloud.datastore.helpers` to use
`six.binary_type` in place of `(str, bytes)`. (It wasn't a Py3 error
before because that check came **after** a `six.text_type` check.)