Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for multiple data elements in notifications #191

Merged
merged 3 commits into from
May 22, 2019

Conversation

ohylli
Copy link
Contributor

@ohylli ohylli commented May 13, 2019

The notify API can now process notifications whose data array contains more than one element.
Previously it returned an error message with the HTTP status 400.
reporter.reporter.notify method now loops through each data element
performing the operations previously done to the single element such as adding the time index and validating.
translators.crate.CrateTranslator._preprocess_values method now adds None to the values list if an entity update doesnot have a value for a given attribute.
Before it threw a KeyError exception. This allows _insert_entities_of_type to successfully insert
entities that have different attributes.

Tests were modified to test for this new feature:

  • Test data was added to conftest
  • test_notify.test_no_multiple_data_elements was renamed and modified to test the notify API with the test data.
  • test_insert_same_type_entities_with_different_attrs test was added to test_crate.
    It uses the test data to check that the above described modification to CrateTranslator works.
  • Utils.common.check_notifications_record method was modified to use None if there is no value for an attribute in an entity update.
    This was required for the method to work in the test_insert_same_type_entities_with_different_attrs test.

This pull request solves issue #185

The notify API can now process notifications whose data array contains more than one element.
Previously it returned an error message with the HTTP status 400.
reporter.reporter.notify method now loops through each data element
performing the operations previously done to the single element such as adding the time index and validating.
translators.crate.CrateTranslator._preprocess_values method now adds None to the values list if an entity update doesnot have a value for a given attribute.
Before it threw a KeyError exception. This allows _insert_entities_of_type to successfully insert
entities that have different attributes.

Tests were modified to test for this new feature:
- Test data was added to conftest
- test_notify.test_no_multiple_data_elements was renamed and modified to test the notify API with the test data.
- test_insert_same_type_entities_with_different_attrs test was added to test_crate.
   It uses the test data to check that the above described modification to CrateTranslator works.
- Utils.common.check_notifications_record method was modified to use None if there is no value for an attribute in an entity update.
  This was required for the method to work in the test_insert_same_type_entities_with_different_attrs test.
Copy link
Contributor

@chicco785 chicco785 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ohylli thx for your pr! I did few comments, the experts for a proper review are @taliaga and @c0c0n3 ;)

@@ -147,6 +147,53 @@ def entity():
}
return entity

@pytest.fixture
def sameTypeEntitiesWithDifferentAttrs():
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder why only testing same entity id?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No special reason for testing with same entity id. It just happens to match my use case for this feature. Different ids would work as well although test_insert_same_type_entities_with_different_attrs would require some changes since query would return 2 records not one and utils.common.check_notifications_record requires that all entities have the same id and there is only one record.

raise NotImplementedError(msg)
except KeyError:
# this entity update does not have a value for the column so use None which will be inserted as NULL to the db.
values.append( None )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about the implications of this change? @taliaga @c0c0n3 ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is actually a good change. In fact, you get a KeyError when accessing a key that's not in a dictionary. In our case that can happen only if the attribute dictionary has no value key, so setting it to None which will ultimately result in a DB null for the corresponding field is actually consistent with what we've done in PR #122. Also, PR #122 should make this code block unreachable actually, in other words I think the except clause is dead code!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure it's dead code as it could still be subject to input in key-values format. True, a key-values input might break earlier than this step, but still. In any case, I agree with chicco that it'd be a safer play to at least add a debug log record here stating which attribute of which entity is being "nullified".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my modifications this except clause is not dead code. Its required for example when inserting the new sameTypeEntitiesWithDifferentAttrs test entities I added to src/conftest. There the first entity has attributes temperature and pressure and the second only temperature. This is different from the situation in pull request #122 where the attribute is present but has no value where as here the attribute pressure is completely missing from the entity update. So when _preprocess_values processes the latter entity update the KeyError exception is raised and None is added as the value for pressure. Of course None could be added earlier in src/reporter/reporter.py as its done with missing values but I did it here since it was a simple change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right! thanks for clarifying I missed that.

n_values = [e[a]['value'] for e in notifications]
# collect values for the attribute a from the entities in the notification
# if an entity does not have a value for the attribute use None
n_values = [e.get( a, { 'value': None } )['value'] for e in notifications]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so basically, this is the place where "none" is entered if an attribute value is missing.

i think a specific test to validate this change is needed. also this change may be better to be in a separate pr
@taliaga @c0c0n3 ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, the setting of missing attribute values to None happens in the reporter

https://github.com/smartsdk/ngsi-timeseries-api/pull/122/files/a78eb256fbe307599ce1c6fef2e0f9623f9f3c50#diff-f796da60daee45177c252e52fc78b4aaR96:

Again, this is a change we brought in with PR #122---see my comment above.
On the other hand this block of code @ohylli changed is part of the check_notifications_record function which is only called from test code, so if all tests pass, this must be a good change, right? However, in principle this line shouldn't be needed, again because of PR #122. So I suggest rolling back the changes to this file, unless doing so causes some tests to fail, in which case we'd have to figure out why.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This modification is required by the new Test_insert_same_type_entities_with_different_attrs test I added where I use this method. This test uses the new test data I added to src/conftest. As I said in my previous comment about my modification to _preprocess_values the situation with this modification is also different from pr #122 since here the attribute is completely missing instead of only the value missing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this is a needed change because now with this PR the symmetry of the entities (in terms of set of attributes) is no longer hold, because now an a (attribute) of entity e1 (entity) might not exist in another e2.

The catch I mentioned earlier is, under this assumption of symmetry, the check_notifications_record method has been taking the set of attributes from the first record... https://github.com/smartsdk/ngsi-timeseries-api/blob/f639c6c3a5c4d774b59db8e16716b0616ca747ab/src/utils/common.py#L221. To make sure you test all attributes (their values and Nones), you should instead take the complete set of attributes (considering all records).

Thanks in advance for your patience with our comments. Since this PR is changing a critical assumption, we want to make sure we don't oversee places where this symmetry break could cause more problems (specially those that do not break tests)

On a side note,

  • Maybe slightly cleaner: n_values = [e[a]['value'] if a in e else None for e in notifications]
  • No need for a separate PR IMO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this required for this pull request since all tests including my new one gives one record for this. check_notifications_record function also has an assert for checking that there is only one record.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course if we end up creating more tests for this pr then check_notifications_record probably will require this modification.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no problem, we can take care of that later.

Copy link
Contributor

@taliaga taliaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR @ohylli . All in all, as you said in the issue, the introduced change is simple in its own; it'll just require a bit of effort on the testing side.

src/conftest.py Outdated
@@ -147,6 +147,53 @@ def entity():
}
return entity

@pytest.fixture
def sameTypeEntitiesWithDifferentAttrs():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe sameEntityWithDifferentAttrs ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyway not a big deal :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the name and the related test name since it is more clear.

# Validate entity update
error = _validate_payload(entity)
if error:
return error, 400
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there were many, maybe we could report all errors at once. Wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems not to be possible or at least not simple. Only things _validate_payload reports errors about are missing type or id but Connexion alrady checks them before an gives its own error which only reports the issue it noticed first. I know nothing about Connexion so I don't know if this could be some how done with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, we'll have to think about what to do if validation for a record fails. What if we have a batch of 10 and two entities fail validation? Are we going to ditch the whole lot?! Or should we rather accept the 8 that are valid and return an error message indicating partial failure, possibly with an indication of which entity failed validation and why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be covered by #193

Test that the notify API can process notifications containing multiple elements in the data array.
"""
notification['data'] = sameTypeEntitiesWithDifferentAttrs
print(json.dumps(notification))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover to be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it.

}
}
notification['data'].append(second_element)
def test_multiple_data_elements(notification, sameTypeEntitiesWithDifferentAttrs, clean_mongo, clean_crate):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should test what happens with multiple elements of different ngsi types, right? Regardless if this PR aims to support that as well or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In both cases we should also have a new test that checks the actual values (and the Nones) being retrieved from the translator (this one is just to make sure the request was processed)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, definitely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean tests for the notify API i.e. reporter.notify or data base insert i.e. CrateTranslator.insert. For the latter there I already added one test as you noticed. For testing elements with different entity types does the existing test_query_all in test_crate work or is a new test with entities of different types including same type entities with different attributes required.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both. In your use-case, are you inserting multiple entities always of the same type or sometimes mixed?

BTW, we'll not block this PR for this missing test, but it's something that needs to be added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my use-case the entities are always of the same type.

raise NotImplementedError(msg)
except KeyError:
# this entity update does not have a value for the column so use None which will be inserted as NULL to the db.
values.append( None )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure it's dead code as it could still be subject to input in key-values format. True, a key-values input might break earlier than this step, but still. In any case, I agree with chicco that it'd be a safer play to at least add a debug log record here stating which attribute of which entity is being "nullified".

@@ -29,6 +29,21 @@ def test_insert_entity(translator, entity):

check_notifications_record([entity], loaded_entities)

def test_insert_same_type_entities_with_different_attrs( translator, sameTypeEntitiesWithDifferentAttrs ):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, good, this is one of the tests I mentioned above. There's a catch though at assertion time, see my comment on common.py

n_values = [e[a]['value'] for e in notifications]
# collect values for the attribute a from the entities in the notification
# if an entity does not have a value for the attribute use None
n_values = [e.get( a, { 'value': None } )['value'] for e in notifications]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this is a needed change because now with this PR the symmetry of the entities (in terms of set of attributes) is no longer hold, because now an a (attribute) of entity e1 (entity) might not exist in another e2.

The catch I mentioned earlier is, under this assumption of symmetry, the check_notifications_record method has been taking the set of attributes from the first record... https://github.com/smartsdk/ngsi-timeseries-api/blob/f639c6c3a5c4d774b59db8e16716b0616ca747ab/src/utils/common.py#L221. To make sure you test all attributes (their values and Nones), you should instead take the complete set of attributes (considering all records).

Thanks in advance for your patience with our comments. Since this PR is changing a critical assumption, we want to make sure we don't oversee places where this symmetry break could cause more problems (specially those that do not break tests)

On a side note,

  • Maybe slightly cleaner: n_values = [e[a]['value'] if a in e else None for e in notifications]
  • No need for a separate PR IMO

ohylli added 2 commits May 14, 2019 13:21
- remove unnecessary print from test_notify
- Change one expression to a cleaner one in check_notifications_record
@ohylli
Copy link
Contributor Author

ohylli commented May 16, 2019

@chicco785, @c0c0n3, @taliaga I think I have addressed all of your comments with commits or my own comments. Since you have not responded for a few days I wanted to check if you are expecting something from me or if you just have not yet had time to look into this which I naturally understand.

@c0c0n3
Copy link
Member

c0c0n3 commented May 16, 2019

@ohylli sorry for the slow reply :-)
we actually talked about this PR yesterday and are thinking of merging it in. You can ignore my comments about error handling and batch processing, since we decided to implement those features in a separate PR. For the rest, we're all very happy with your work and would like to thank you very much for taking the time to do this.

@ghost
Copy link

ghost commented May 20, 2019

Hi @ohylli and @c0c0n3 Sir,

I am also facing the same issue in our project and so I am interested in this feature to be merged as soon as possible.
Could please merge this issue or let me know in case any contribution is required. I am happy to help.

Thanks

@c0c0n3
Copy link
Member

c0c0n3 commented May 21, 2019

Hi @SourabhChourasiya-NEC, thanks for offering to help :-)
We should be able to work on this next week and will definitely let you know if we need an extra pair of hands, thanks!

@taliaga taliaga merged commit 56cb44b into orchestracities:master May 22, 2019
@c0c0n3 c0c0n3 mentioned this pull request Jun 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants