Extending BatchActuateStreamResponse #103
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #103 +/- ##
==========================================
- Coverage 59.35% 59.32% -0.03%
==========================================
Files 33 33
Lines 16048 16063 +15
==========================================
+ Hits 9525 9530 +5
- Misses 6523 6533 +10
if let Some(signal_id) = batch_actuate_stream_response.signal_id {
    match signal_id.signal {
        Some(proto::signal_id::Signal::Path(path)) => {
            msg = msg + ", path: " + &path;
Maybe better to use format!() here to create the debug messages?
You mean having something like
msg = format!("{}, path: {}", msg, &path);
Yes, exactly that.
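A minimal sketch of the format!-based version, reusing the msg, path and error variables from the diffs in this PR (the surrounding control flow is condensed here and only illustrative):

// Condensed sketch: assumes `signal_id`, `msg` and `error` are in scope as in the diff.
if let Some(proto::signal_id::Signal::Path(path)) = signal_id.signal {
    // format!() replaces the chain of String concatenations
    msg = format!("{}, path: {}", msg, path);
}
msg = format!(
    "{}, error code: {}, error message: {}",
    msg, error.code, error.message
);
debug!(msg)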
proto/kuksa/val/v2/val.proto
Outdated
@@ -280,7 +282,12 @@ message BatchActuateStreamRequest {
  repeated ActuateRequest actuate_requests = 1;
}

// Message that can be used by provider to indicate that an actuation request was not accepted.
// It is optional for the provider to use this message, and for performance reasons it is recommended
I don't really disagree with what you said here, just to qualify it a bit: I think it won't add a big overhead if the provider returns an OK response in case of success, since kuksa_client actuate is a unary call (it is not supposed to be used at as high a frequency as the publish datapoint stream). If we one day add actuate_stream to kuksa_client, it should definitely not return an OK response upon success, to reduce stream overhead.
Yes, returning feedback to the originator of the actuate call seems way better than introducing any form of optimization here. But it should probably be made a lot clearer what the expected provider behaviour is here.
I have no problem changing it so that it is expected or even required that a result is returned. But if so, do we want to say something more about the meaning of OK and when it should be sent? Like, does it mean:
I acknowledge reception and did not find any problem with the request
or
I confirm that the request has been executed!
Or something in between? Or do we leave it to the provider? It could have an impact if we have the ambition to return a response to the requester - like, how long do we want to wait for a response?
Looks good to me apart from the one comment I added.
I am also wondering: is it possible that a provider has a different accepted value range for a signal than databroker? The provider should build its value ranges for a signal according to the metadata received from databroker, right?
message BatchActuateStreamResponse {
  SignalID signal_id = 1;
  Error error = 2;
This is a bit of an architectural question, but do we really want a single Error enum that will contain every error possible for all functions in the whole service? Wouldn't it be better to have specific errors for different functions, i.e. an ActuationError that only contains the errors possible when trying to actuate?
I think the enum I used matched quite well. At least I can find use-cases for all of them:
- OK - Good to have a default value that indicates no error
- INVALID_ARGUMENT - the value received is out of the expected range
- NOT_FOUND - the signal id is not known by the provider
- PERMISSION_DENIED - the signal is known by the provider, but the provider does not accept that it is changed. A possible example could be that the provider has not registered itself for the signal id, so it does not expect to get it.
Of course one can argue that everything here is INVALID_ARGUMENT, so we do not need the rest. But to me the set seems quite reasonable.
https://github.com/eclipse-kuksa/kuksa-databroker/blob/main/proto/kuksa/val/v2/types.proto#L64
message Error {
  ErrorCode code = 1;
  string message = 2;
}
enum ErrorCode {
  OK = 0;
  INVALID_ARGUMENT = 1;
  NOT_FOUND = 2;
  PERMISSION_DENIED = 3;
}
Given that this could be further refined in the future, I think using ErrorCode is sufficient for now, as long as it meets current needs.
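On the provider side, filling in such a response could look roughly like this. This is only a sketch: the proto::* type and field names are assumed to follow the prost-generated code for kuksa.val.v2, and the helper function is invented for illustration.

// Hypothetical helper: report that a signal path is unknown to this provider.
fn not_found_response(path: String) -> proto::BatchActuateStreamResponse {
    proto::BatchActuateStreamResponse {
        signal_id: Some(proto::SignalId {
            signal: Some(proto::signal_id::Signal::Path(path)),
        }),
        error: Some(proto::Error {
            // NOT_FOUND: signal id is not known by the provider
            code: proto::ErrorCode::NotFound as i32,
            message: "signal is not provided by this provider".to_string(),
        }),
    }
}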
        }
    }
    msg = msg + ", error code: " + &error.code.to_string() + ", error message: " + &error.message;
    debug!(msg)
Maybe this is just a first step in implementing this. But shouldn't the error coming from the provider be forwarded to the originator of the actuate(...) call, i.e. the "consumer"?
Yes, I looked mostly at the Provider rpc.
But I assume we need quite a lot of refactoring if we are to return results to the "consumer". Some aspects I think of:
- If Actuate/BatchActuate are to return actual results, the Databroker will need to wait until the providers have replied (for BatchActuate there could be multiple, right?).
- That means the providers MUST send a response also on OK, possibly combined with a timeout in Databroker so that the actuation is considered unsuccessful if Databroker does not get a response within X seconds (a rough sketch of that follows below).
- Then we maybe also need some form of request-id/statefulness, so we know which request an error belongs to. At least theoretically there might be multiple consumers trying to change the same signal, right? And it can happen partially in parallel.
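Purely as a sketch of the timeout idea mentioned above (assuming tokio; the channel type, timeout value and error strings are all invented, and nothing like this exists in Databroker today):

use std::time::Duration;
use tokio::sync::oneshot;
use tokio::time::timeout;

// Hypothetical: `rx` is resolved by the provider stream handler when a
// response for the matching request id arrives.
async fn await_provider_reply(
    rx: oneshot::Receiver<Result<(), String>>,
) -> Result<(), String> {
    match timeout(Duration::from_secs(5), rx).await {
        Ok(Ok(result)) => result, // provider replied (OK or error)
        Ok(Err(_)) => Err("provider dropped the reply channel".into()),
        Err(_) => Err("no provider response within 5 seconds".into()),
    }
}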
If I remember correctly, what you are discussing here was postponed, as we were not sure if we wanted to make databroker wait until it receives an ACK of receipt and forwards it to the consumer.
So in my opinion, this PR is enough for now if databroker can output some debug actuate information.
// - Databroker sends BatchActuateStreamRequest -> Provider may return one or more BatchActuateStreamResponse upon error,
//   but should for performance reasons send nothing upon success.
//   It is up to the provider to decide if the stream shall be closed,
//   as of today Databroker will not react on the received error message.
Why should anyone close the stream?
I see it as a reasonable behavior if you as a provider get a lot of junk from the Databroker, like if you get a lot of BatchActuateStreamRequest where the signal id does not match the IDs provided in the ProvideActuationRequest you have sent, or the values do not match the value limitations the Provider has internally. I think it is up to the Provider to decide how many errors it will tolerate before it gets pissed off and closes the stream.
Like, if a provider has received 1000 requests and all of them have been faulty, why should it keep the stream open?
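A sketch of that "tolerate N errors, then close" idea on the provider side (entirely illustrative; the threshold, struct and method are made up and not part of this PR):

// Hypothetical book-keeping a provider could use to decide when to give up.
const MAX_FAULTY_REQUESTS: u32 = 1000;

struct ProviderStreamState {
    faulty_requests: u32,
}

impl ProviderStreamState {
    /// Returns true when the provider should close the stream.
    fn record_faulty_request(&mut self) -> bool {
        self.faulty_requests += 1;
        self.faulty_requests >= MAX_FAULTY_REQUESTS
    }
}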
Yes, returning feedback to the originator of the actuate call seems way better than introducing any form of optimization here. But it should probably be made a lot clearer what the expected provider behaviour is here.
Not necessarily, as I see it. Taking the Kuksa CAN Provider as an example, its main source of truth is a vspec file that is loaded as configuration. It expects the databroker to have a matching vspec, at least for the files they have in common; it does not read metadata from Databroker. As a provider typically needs business logic to convert from a VSS value to some vehicle network, that conversion mechanism is likely hard-coded or based on some config that is defined before a connection to Databroker is established. In the best of worlds the vspec files used by Velocitas Apps, Databroker and Providers should match, but there is no guarantee for that. There is also the theoretical possibility of a malfunctioning or compromised Databroker that does not perform the checks it should or sends updates to the "wrong" provider. So as a summary: in a system that is correctly configured and without bugs there should never be any reason for a provider to reject a request - at least as long as we only reject if the signal name or value is incorrect.
Looks good and enough to me for now. I agree with @argerus that errors should be propagated to the consumer, but we might need some more time to refine that concept.
This PR is based on analysis of OpenProviderStream at https://github.com/eclipse-kuksa/kuksa-databroker/blob/main/proto/kuksa/val/v2/val.proto#L184. One identified gap is if the provider for some reason does not like the actuate request received from Databroker. As of today the only counter measure it can take is to close the stream.
This PR proposes to extend BatchActuateStreamResponse so that the provider can send error messages back to databroker. It can send as many BatchActuateStreamResponse messages as it likes. It reuses the existing Error message from types.proto.
A hypothetical use case is that the provider for some reason has a different accepted value range for a signal. Then it could return INVALID_ARGUMENT. Databroker will as of today just print this info at debug level, but it could help for troubleshooting. The provider may in addition decide to close the stream, if it thinks the error is so severe that it does not want to talk to Databroker any longer. A hypothetical example could be if the Provider assumes that the Databroker is compromised or has a misconfiguration.
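For the hypothetical range-check use case, the provider-side validation could look roughly like this (a sketch only: the ValueRange struct and limits are invented, and the proto::* names again assume the prost-generated kuksa.val.v2 types):

// Hypothetical: the provider's own configured limits for a signal.
struct ValueRange {
    min: f32,
    max: f32,
}

// Returns an INVALID_ARGUMENT error if the requested value is outside the
// provider's accepted range, or None if the value is acceptable.
fn check_value(range: &ValueRange, value: f32) -> Option<proto::Error> {
    if value < range.min || value > range.max {
        Some(proto::Error {
            code: proto::ErrorCode::InvalidArgument as i32,
            message: format!(
                "value {} outside accepted range [{}, {}]",
                value, range.min, range.max
            ),
        })
    } else {
        None
    }
}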