Add support for multiple data elements in notifications #191

ohylli · 2019-05-13T10:45:33Z

The notify API can now process notifications whose data array contains more than one element.
Previously it returned an error message with the HTTP status 400.
reporter.reporter.notify method now loops through each data element
performing the operations previously done to the single element such as adding the time index and validating.
translators.crate.CrateTranslator._preprocess_values method now adds None to the values list if an entity update doesnot have a value for a given attribute.
Before it threw a KeyError exception. This allows _insert_entities_of_type to successfully insert
entities that have different attributes.

Tests were modified to test for this new feature:

Test data was added to conftest
test_notify.test_no_multiple_data_elements was renamed and modified to test the notify API with the test data.
test_insert_same_type_entities_with_different_attrs test was added to test_crate.
It uses the test data to check that the above described modification to CrateTranslator works.
Utils.common.check_notifications_record method was modified to use None if there is no value for an attribute in an entity update.
This was required for the method to work in the test_insert_same_type_entities_with_different_attrs test.

This pull request solves issue #185

The notify API can now process notifications whose data array contains more than one element. Previously it returned an error message with the HTTP status 400. reporter.reporter.notify method now loops through each data element performing the operations previously done to the single element such as adding the time index and validating. translators.crate.CrateTranslator._preprocess_values method now adds None to the values list if an entity update doesnot have a value for a given attribute. Before it threw a KeyError exception. This allows _insert_entities_of_type to successfully insert entities that have different attributes. Tests were modified to test for this new feature: - Test data was added to conftest - test_notify.test_no_multiple_data_elements was renamed and modified to test the notify API with the test data. - test_insert_same_type_entities_with_different_attrs test was added to test_crate. It uses the test data to check that the above described modification to CrateTranslator works. - Utils.common.check_notifications_record method was modified to use None if there is no value for an attribute in an entity update. This was required for the method to work in the test_insert_same_type_entities_with_different_attrs test.

chicco785

@ohylli thx for your pr! I did few comments, the experts for a proper review are @taliaga and @c0c0n3 ;)

chicco785 · 2019-05-13T12:19:30Z

src/conftest.py

@@ -147,6 +147,53 @@ def entity():
    }
    return entity

+@pytest.fixture
+def sameTypeEntitiesWithDifferentAttrs():
+    """


i wonder why only testing same entity id?

No special reason for testing with same entity id. It just happens to match my use case for this feature. Different ids would work as well although test_insert_same_type_entities_with_different_attrs would require some changes since query would return 2 records not one and utils.common.check_notifications_record requires that all entities have the same id and there is only one record.

chicco785 · 2019-05-13T12:22:00Z

src/translators/crate.py

-                    raise NotImplementedError(msg)
+                except KeyError:
+                    # this entity update does not have a value for the column so use None which will be inserted as NULL to the db.
+                    values.append( None )


not sure about the implications of this change? @taliaga @c0c0n3 ?

I think this is actually a good change. In fact, you get a KeyError when accessing a key that's not in a dictionary. In our case that can happen only if the attribute dictionary has no value key, so setting it to None which will ultimately result in a DB null for the corresponding field is actually consistent with what we've done in PR #122. Also, PR #122 should make this code block unreachable actually, in other words I think the except clause is dead code!

I'm not 100% sure it's dead code as it could still be subject to input in key-values format. True, a key-values input might break earlier than this step, but still. In any case, I agree with chicco that it'd be a safer play to at least add a debug log record here stating which attribute of which entity is being "nullified".

With my modifications this except clause is not dead code. Its required for example when inserting the new sameTypeEntitiesWithDifferentAttrs test entities I added to src/conftest. There the first entity has attributes temperature and pressure and the second only temperature. This is different from the situation in pull request #122 where the attribute is present but has no value where as here the attribute pressure is completely missing from the entity update. So when _preprocess_values processes the latter entity update the KeyError exception is raised and None is added as the value for pressure. Of course None could be added earlier in src/reporter/reporter.py as its done with missing values but I did it here since it was a simple change.

oh right! thanks for clarifying I missed that.

chicco785 · 2019-05-13T12:24:11Z

src/utils/common.py

-        n_values = [e[a]['value'] for e in notifications]
+        # collect values for the attribute a from the entities in the notification
+        # if an entity does not have a value for the attribute use None
+        n_values = [e.get( a, { 'value': None } )['value'] for e in notifications]


so basically, this is the place where "none" is entered if an attribute value is missing.

i think a specific test to validate this change is needed. also this change may be better to be in a separate pr
@taliaga @c0c0n3 ?

As far as I can tell, the setting of missing attribute values to None happens in the reporter

https://github.com/smartsdk/ngsi-timeseries-api/pull/122/files/a78eb256fbe307599ce1c6fef2e0f9623f9f3c50#diff-f796da60daee45177c252e52fc78b4aaR96:

Again, this is a change we brought in with PR #122---see my comment above.
On the other hand this block of code @ohylli changed is part of the check_notifications_record function which is only called from test code, so if all tests pass, this must be a good change, right? However, in principle this line shouldn't be needed, again because of PR #122. So I suggest rolling back the changes to this file, unless doing so causes some tests to fail, in which case we'd have to figure out why.

This modification is required by the new Test_insert_same_type_entities_with_different_attrs test I added where I use this method. This test uses the new test data I added to src/conftest. As I said in my previous comment about my modification to _preprocess_values the situation with this modification is also different from pr #122 since here the attribute is completely missing instead of only the value missing.

I understand this is a needed change because now with this PR the symmetry of the entities (in terms of set of attributes) is no longer hold, because now an a (attribute) of entity e1 (entity) might not exist in another e2.

The catch I mentioned earlier is, under this assumption of symmetry, the check_notifications_record method has been taking the set of attributes from the first record... https://github.com/smartsdk/ngsi-timeseries-api/blob/f639c6c3a5c4d774b59db8e16716b0616ca747ab/src/utils/common.py#L221. To make sure you test all attributes (their values and Nones), you should instead take the complete set of attributes (considering all records).

Thanks in advance for your patience with our comments. Since this PR is changing a critical assumption, we want to make sure we don't oversee places where this symmetry break could cause more problems (specially those that do not break tests)

On a side note,

Maybe slightly cleaner: n_values = [e[a]['value'] if a in e else None for e in notifications]

No need for a separate PR IMO

Is this required for this pull request since all tests including my new one gives one record for this. check_notifications_record function also has an assert for checking that there is only one record.

Of course if we end up creating more tests for this pr then check_notifications_record probably will require this modification.

no problem, we can take care of that later.

taliaga

Thanks for your PR @ohylli . All in all, as you said in the issue, the introduced change is simple in its own; it'll just require a bit of effort on the testing side.

taliaga · 2019-05-13T12:24:02Z

src/conftest.py

@@ -147,6 +147,53 @@ def entity():
    }
    return entity

+@pytest.fixture
+def sameTypeEntitiesWithDifferentAttrs():


maybe sameEntityWithDifferentAttrs ?

anyway not a big deal :)

I changed the name and the related test name since it is more clear.

taliaga · 2019-05-13T12:25:11Z

src/reporter/reporter.py

+        # Validate entity update
+        error = _validate_payload(entity)
+        if error:
+            return error, 400


If there were many, maybe we could report all errors at once. Wdyt?

This seems not to be possible or at least not simple. Only things _validate_payload reports errors about are missing type or id but Connexion alrady checks them before an gives its own error which only reports the issue it noticed first. I know nothing about Connexion so I don't know if this could be some how done with it.

also, we'll have to think about what to do if validation for a record fails. What if we have a batch of 10 and two entities fail validation? Are we going to ditch the whole lot?! Or should we rather accept the 8 that are valid and return an error message indicating partial failure, possibly with an indication of which entity failed validation and why?

Will be covered by #193

taliaga · 2019-05-13T12:27:22Z

src/reporter/tests/test_notify.py

+    Test that the notify API can process notifications containing multiple elements in the data array.
+    """
+    notification['data'] = sameTypeEntitiesWithDifferentAttrs
+    print(json.dumps(notification))


leftover to be removed

I removed it.

taliaga · 2019-05-14T07:20:53Z

src/reporter/tests/test_notify.py

-                        }
-                    }
-    notification['data'].append(second_element)
+def test_multiple_data_elements(notification, sameTypeEntitiesWithDifferentAttrs, clean_mongo, clean_crate):


we should test what happens with multiple elements of different ngsi types, right? Regardless if this PR aims to support that as well or not.

In both cases we should also have a new test that checks the actual values (and the Nones) being retrieved from the translator (this one is just to make sure the request was processed)

yes, definitely.

Do you mean tests for the notify API i.e. reporter.notify or data base insert i.e. CrateTranslator.insert. For the latter there I already added one test as you noticed. For testing elements with different entity types does the existing test_query_all in test_crate work or is a new test with entities of different types including same type entities with different attributes required.

Both. In your use-case, are you inserting multiple entities always of the same type or sometimes mixed?

BTW, we'll not block this PR for this missing test, but it's something that needs to be added.

In my use-case the entities are always of the same type.

taliaga · 2019-05-14T07:28:26Z

src/translators/crate.py

-                    raise NotImplementedError(msg)
+                except KeyError:
+                    # this entity update does not have a value for the column so use None which will be inserted as NULL to the db.
+                    values.append( None )


I'm not 100% sure it's dead code as it could still be subject to input in key-values format. True, a key-values input might break earlier than this step, but still. In any case, I agree with chicco that it'd be a safer play to at least add a debug log record here stating which attribute of which entity is being "nullified".

taliaga · 2019-05-14T08:02:42Z

src/translators/tests/test_crate.py

@@ -29,6 +29,21 @@ def test_insert_entity(translator, entity):

    check_notifications_record([entity], loaded_entities)

+def test_insert_same_type_entities_with_different_attrs( translator, sameTypeEntitiesWithDifferentAttrs ):


Ok, good, this is one of the tests I mentioned above. There's a catch though at assertion time, see my comment on common.py

taliaga · 2019-05-14T08:12:41Z

src/utils/common.py

-        n_values = [e[a]['value'] for e in notifications]
+        # collect values for the attribute a from the entities in the notification
+        # if an entity does not have a value for the attribute use None
+        n_values = [e.get( a, { 'value': None } )['value'] for e in notifications]


I understand this is a needed change because now with this PR the symmetry of the entities (in terms of set of attributes) is no longer hold, because now an a (attribute) of entity e1 (entity) might not exist in another e2.

The catch I mentioned earlier is, under this assumption of symmetry, the check_notifications_record method has been taking the set of attributes from the first record... https://github.com/smartsdk/ngsi-timeseries-api/blob/f639c6c3a5c4d774b59db8e16716b0616ca747ab/src/utils/common.py#L221. To make sure you test all attributes (their values and Nones), you should instead take the complete set of attributes (considering all records).

Thanks in advance for your patience with our comments. Since this PR is changing a critical assumption, we want to make sure we don't oversee places where this symmetry break could cause more problems (specially those that do not break tests)

On a side note,

Maybe slightly cleaner: n_values = [e[a]['value'] if a in e else None for e in notifications]

No need for a separate PR IMO

…ntities_withDifferent_attrs

- remove unnecessary print from test_notify - Change one expression to a cleaner one in check_notifications_record

ohylli · 2019-05-16T11:14:43Z

@chicco785, @c0c0n3, @taliaga I think I have addressed all of your comments with commits or my own comments. Since you have not responded for a few days I wanted to check if you are expecting something from me or if you just have not yet had time to look into this which I naturally understand.

c0c0n3 · 2019-05-16T11:48:12Z

@ohylli sorry for the slow reply :-)
we actually talked about this PR yesterday and are thinking of merging it in. You can ignore my comments about error handling and batch processing, since we decided to implement those features in a separate PR. For the rest, we're all very happy with your work and would like to thank you very much for taking the time to do this.

ghost · 2019-05-20T08:54:58Z

Hi @ohylli and @c0c0n3 Sir,

I am also facing the same issue in our project and so I am interested in this feature to be merged as soon as possible.
Could please merge this issue or let me know in case any contribution is required. I am happy to help.

Thanks

c0c0n3 · 2019-05-21T08:32:28Z

Hi @SourabhChourasiya-NEC, thanks for offering to help :-)
We should be able to work on this next week and will definitely let you know if we need an extra pair of hands, thanks!

chicco785 reviewed May 13, 2019

View reviewed changes

taliaga suggested changes May 14, 2019

View reviewed changes

ohylli added 2 commits May 14, 2019 13:21

Rename sameTypeEntitiesWithDifferentAttrs and test_insert_same_type_e…

c424a18

…ntities_withDifferent_attrs

Minor changes suggested by reviewers

ea9f93c

- remove unnecessary print from test_notify - Change one expression to a cleaner one in check_notifications_record

c0c0n3 mentioned this pull request May 16, 2019

Batch processing of notifications #193

Open

taliaga approved these changes May 22, 2019

View reviewed changes

taliaga merged commit 56cb44b into orchestracities:master May 22, 2019

c0c0n3 mentioned this pull request Jun 7, 2019

How to deal with nulls? #204

Open

		@@ -29,6 +29,21 @@ def test_insert_entity(translator, entity):

		check_notifications_record([entity], loaded_entities)

		def test_insert_same_type_entities_with_different_attrs( translator, sameTypeEntitiesWithDifferentAttrs ):

Add support for multiple data elements in notifications #191

Add support for multiple data elements in notifications #191

Conversation

ohylli commented May 13, 2019

chicco785 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

taliaga left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ohylli commented May 16, 2019

c0c0n3 commented May 16, 2019

ghost commented May 20, 2019

c0c0n3 commented May 21, 2019