"Something went wrong" when commissioning through Android app #463
Comments
It seems we get the information from the CommissioningRequestMetadata object at https://github.com/home-assistant/android/blob/5d779a27aa98bd5f075c843537a46a71be8ba387/app/src/full/java/io/homeassistant/companion/android/matter/MatterCommissioningService.kt#L61. We could get more specific information there e.g. the device location (IP) directly. With that we'd not need to rediscover the device on the Matter Server side. |
If we get some more info about the device, we could do a call to "get discoverable nodes" first |
I first considered using The |
FWIW, the
Let me know if you want a build of the app with specific changes for the on network commissioning. |
The discriminator is a static, per-device unique value. I expect Google just passes the one they learned from the QR code or manual pairing code there, so I'd guess that value is there. We'll have to try it out.

What I am not sure about is if and how we can distinguish between a short and a long discriminator. It does matter, as we need to tell the CommissionOnNetwork API which one to filter for. Reading the Discriminator docs I don't think there is a way to know for sure 😢 If we can't learn it from the Android API, then we probably just pass the discriminator as an integer to the backend for now, and see how to go about it there.

I am guessing that we get 0x300 (768) as the short discriminator if the discriminator is 0x348 (the long discriminator example of 840 from the spec). If that is the case, then we can just mask the lower 8 bits and, if they are zero, assume a short discriminator.

So we'd introduce the following websocket API in Core:
@websocket_api.websocket_command(
{
vol.Required(TYPE): "matter/commission_on_network",
vol.Required("pin"): int,
vol.Optional("discriminator"): int,
}
) |
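The masking idea from the comment above could look roughly like the following. This is only a sketch under the assumption stated there (that a short discriminator arrives as the upper 4 bits of the 12-bit value with the lower 8 bits zeroed); the function names are made up for illustration and are not part of Core or the Matter Server.

```python
def is_probably_short_discriminator(value: int) -> bool:
    """Guess whether a 12-bit value carries only a short discriminator.

    Assumption: a short discriminator is delivered as the upper 4 bits
    with the lower 8 bits zeroed, e.g. 0x300 for the long value 0x348.
    """
    if not 0 <= value <= 0xFFF:
        raise ValueError("discriminator must fit in 12 bits")
    # Lower 8 bits all zero: treat as short discriminator.
    return (value & 0xFF) == 0


def short_from_long(value: int) -> int:
    """Return the 4 most significant bits of a 12-bit discriminator."""
    return (value >> 8) & 0xF
```

Note that this heuristic misclassifies a genuine long discriminator whose lower 8 bits happen to be zero (0x300 itself is a valid long value), so it can only ever be a best guess.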
Yeah if you can create a build which passes the discriminator as outlined above would be perfect 👍 |
The Matter server logs look something like this (this capture has some additional hacked in log entries):
|
I would also be happy to use such a build... since I am having the same issue: |
Looking at it again, it looks like we confused who creates each object... The After scanning completes, the app receives a |
Hm, right, so I guess this API is mainly meant for device manufacturer. They would limit the device to their PID/VID. It is weird though that |
When using the Android Matter commissioning flow from the Home Assistant App (via CommissioningRequest), we get the passcode only, no QR code or manual pairing code (see CommissioningRequestMetadata documentation). This means there is no discriminator. In other words, the server will try to find a commissionable device and use the passcode against the first one found. It seems that sometimes another device is announced as commissionable at the same time, which obviously has a different passcode, so commissioning fails quite quickly. The App shows "Something went wrong". This seems to be particularly common when using Thread: the Thread border router forwards and caches DNS-SD/mDNS service information about commissionable Thread devices with its SRP server service. Entries can stay cached for up to 2h. So when a communication breakdown happens after a particular device went into commissioning mode, this commissionable service entry lingers in the network for quite some time. Now if the OTBR sends this entry before the new/valid entry, the Matter Server tries to commission a device which is no longer responding. The CommissioningRequestMetadata has a way to get the device's IP address. This change extends the commission_on_network WS endpoint to also take an IP address. The SDK CommissionIP service is used to commission the device. Note: CommissionIP is currently marked deprecated. This is mainly to prevent implementations that would ask users for IP addresses, which is not the intended way to implement commissioning. However, for this particular use case the API seems very sensible and works as intended. Fixes: home-assistant-libs#463
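To illustrate the precedence described in that change, here is a minimal sketch of how a caller might assemble the arguments for the extended command. The helper name and the argument keys (`setup_pin_code`, `ip_addr`, `discriminator`) are assumptions for illustration only; the thread does not show the actual message format.

```python
from typing import Any, Dict, Optional


def build_commission_args(
    pin: int,
    ip_addr: Optional[str] = None,
    discriminator: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble hypothetical arguments for an on-network commissioning call.

    Preference order in this sketch: a known IP address wins, then a
    discriminator filter, and only as a last resort the passcode alone
    (which is the case that picks up stale or unrelated devices).
    """
    args: Dict[str, Any] = {"setup_pin_code": pin}  # key name is an assumption
    if ip_addr is not None:
        args["ip_addr"] = ip_addr  # commission the device at this address
    elif discriminator is not None:
        args["discriminator"] = discriminator  # narrow discovery instead
    return args
```

Preferring the IP address avoids rediscovering the device on the server side, which is exactly where the stale SRP entries get in the way.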
Actually, for this error it seems more typical that the commissioning fails quickly:
From start to failure it took just 6s, which indicates that the Matter Server was communicating with the wrong device. This is with the logging enhancements. In the current 5.0.3. Matter server |
Further investigation shows that certain devices announce themselves twice on the network, e.g. the "Aqara Door and Window Sensor P2" seems to announce twice as a commissionable device when using the App commissioning mode:
Looking at the SRP entries on the OTBR shows that the two entries have different:
And
It seems that the Matter SDK does not like it when two devices announce themselves as commissionable. They do use different discriminators, but point to the same device, so in theory, when not checking the discriminator, the SDK should resolve to either of these two hosts (and it shouldn't matter which one, as they resolve to the same host). But in testing the SDK trips every time with this device when using When using |
I just realized that the current implementation is problematic for WiFi devices if they resolve to link-local addresses:
|
With #27 merged, the scope id is accepted. But with an invalid scope id the API just falls back to the host OS routing table, which on systems running containers typically doesn't prefer the primary interface, so commissioning attempts still fail:
|
When receiving link-local IPv6 addresses, remove the scope by default, as the scope id is typically from a remote machine (e.g. from the Android phone). If the Matter server was started with a primary interface set, scope the link-local address to that interface instead. This fixes a commissioning issue for WiFi devices when using the Android in-app commissioning flow, which sends the IP address alongside. Related-to: home-assistant-libs#463
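The rescoping described in that change can be sketched with the standard library alone. This is not the Matter Server's actual code: the function name and the `primary_interface` parameter are hypothetical, and only the behaviour described above (drop the foreign scope, optionally apply the local primary interface) is modelled.

```python
import ipaddress
from typing import Optional


def rescope_link_local(address: str, primary_interface: Optional[str] = None) -> str:
    """Drop a foreign scope id from a link-local IPv6 address.

    The scope id usually comes from the remote machine (e.g. the phone),
    so it is meaningless on the server. If a primary interface is known,
    scope the address to that interface instead.
    """
    # Split off any "%scope" suffix before parsing the address itself.
    bare, _, _remote_scope = address.partition("%")
    ip = ipaddress.ip_address(bare)
    if ip.version == 6 and ip.is_link_local:
        if primary_interface:
            return f"{ip}%{primary_interface}"
        return str(ip)
    # Anything that is not IPv6 link-local is passed through unchanged.
    return address
```

For example, `rescope_link_local("fe80::1%wlan0", "eth0")` yields `fe80::1%eth0`, while the same call without a primary interface yields the unscoped `fe80::1`.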
When trying to commission a device through the Android App sometimes simply "Something went wrong" is returned.
Note: There might be multiple issues behind that particular error, but this outlines what I've been able to reproduce here.
There are two ways to on-board a Matter device to Home Assistant using Android:
When using method 2, the Android app uses the
matter/commission_on_network
WS command is used. The passcode is passed to this endpoint, with no further information. The Matter Server (SDK) then searches for any commissionable device on the network and (presumably) picks the first one it hears about. This of course could be a different one. Furthermore, a BR might cache a DNS-SD entry for a commissionable device (any BR runs an SRP server, which caches service announcements on behalf of Thread devices). Now if a Thread device got reset during commissioning at some point, such a stale entry might linger for up to 2h (in my particular test case).
I've noticed this by checking the service announcements during a failed commissioning attempt. Two announcements were present:
And it turned out that in this case the second announcement was from a stale Thread device from the OTBR Add-on:
(I've run this command for debugging from within the container)