-
Notifications
You must be signed in to change notification settings - Fork 608
AVS will become abnormal after Internet disconnect / resume #727
Comments
Hi! |
Hi @mavamazon , Here I update another test result yesterday, and I need to correct the condition in my origin report. |
Hi @zeusshuang, To help with the investigation, is this a new issue introduced by 1.7.1? Or have you seen this occur in previous versions of the SDK? |
Hi @kjkh , It also happens on previous SDK. At least we start to look into this case since 1.6. |
Hi @zeusshuang , from your logs 2018-05-18 01:00:57.116 [ 14] 0 HTTP2StreamPool:getStream:streamId=49,numAcquiredStreams=2 I can see that your utterance was tried to be sent to the server, but unfortunately timed out and was not able to send. Can you check while the issue is happening that your device can connect to the server by issuing the following command: curl https://avs-alexa-na.amazon.com It should return a response. After issuing that command, try to trigger your application again. |
Hi @visayamv , I tried it, it will respond "Unauthorized_request_exception". But it's no help, the symptom is the same as my report before. BTW, this log "HTTP2Transport:networkLoop_582::Request Timeout", is my added from case#582 suggestion. |
Hi @zeusshuang, I'm actively looking at this issue and have not been able to reproduce it yet. When you are running this test, are you using the SampleApp from 1.7.1, or are you testing a custom application. Are you testing it on an embedded platform, or on a desktop? I do not have a router handy, but I set up an equivalent test environment by running SampleApp from 1.7.1 on my mac, connecting over WiFi to my cell phone. I then disabled cellular data on the phone while leaving the WiFi access point active. This should be equivalent to your scenario of pulling the internet connection from a router. In this test, I was able to observe the tunein station ceasing to play after disabling the internet connection. After 10 minutes, I restored the internet connection and was able to ask alexa for a joke successfully. I did not see the tunin station resume playing however. If you are not testing with plain SampleApp on desktop, could I trouble you to try it and check whether you can reproduce the issue using your router to break the internet connection? In the mean time, I will continue to investigate here. Ken |
Hi @kencecka , I tested it on our embedded system. By your test steps, I think it should be the same as my tests. Not sure in your test, will the player still play stream after Internet disconnected (due to it has buffer) ? |
Hi @kencecka , Sorry to update late. Our team verified this case with Amazon development Kit (Intel MIC solution), and we can see the same case. Here I attached 5 logs, two are okay to return normal cases, three are failed cases. faile_play_and_ping_timeout_curl_error.txt |
Hi @zeusshuang, Looking at failed_read1351.txt: The last activity on AudioPlayer's MediaPlayer thread is here:
When the state switches to LISTENING, a context is needed to send the Recognize event to AVS, leading to:
...which leads to:
... which leads to:
...which should be handled by AudioPlayer's MediaPlayer thread (i.e. thread 3). However, we never see a subsequent log with 'handleGetOffsetCalled' for thread 3, and a couple of seconds later ContextManager's timeout expires:
...which cancels the pending recognize event before it is sent. So, it appears that this is all caused by thread 3 getting stuck. That thread is driven by gstreamer. We expect that besides called things like handleGetOffset() it should be calling AttachmentReaderSource::handleReadData() over and over. If this thread is behaving as we expect, it seems very likely that it is in AttachmentReaderSource::handleReadData()'s call to m_reader->read(). In 1.7.1, that is expected to be a non-blocking call because AudioPlayer::handlePlayDirective() creates a NONBLOCKING attachment reader. In addition, the call to m_reader->read() is called with a 1 millisecond timeout value. Either should prevent this call from blocking. I suggest that we add a new log line in AttachmentReaderSource::handleReadData(), right before the call to m_reader->read(). For example:
If we reproduce this problem with the new log line, and we see the new log line without a subsequent log line emitted after the call to m_reader->read(), then we can be sure that is where the thread is getting stuck, and then dig deeper in to how that could be happening. If we do not find that pattern in the log, we will know that we need to track down other places this thread could be hung up. If by chance you can reproduce this problem under a debugger, it might be even easier to just get a backtrace of where this thread is blocked when the problem occurs. Can you reproduce the problem with the additional log line and/or provide us a backtrace of the blocked thread? Best regards, |
Hi @scotthea-amazon , The last logs for you is from Amazon development kit, it's AVS 1.6 (We have no newer to upgrade it, so we tried 1.6). I attached the new log by your suggestion. By our investigating, if the media player is in playing state and finally found no buffer and Internet failed, it will go to stuck. If it's not in playing state, eg. in idle or pause state, it will be safe. |
Hi @zeusshuang, Thank you for the logs and information. This line in particular is very interesting to me: It looks like AudioPlayer is indeed trying to pause and get out of the playing state (as it should be), but is failing. This ties in with what you said about the MediaPlayer getting stuck. I will investigate this and see if it's helpful. Thanks, |
Hi @zeusshuang, Since it appears that the blockage is occurring in the MediaPlayer, and likely within gstreamer, I suggest reproducing the problem with gstreamer's logging enabled. See: debugging tools. Something else that might help would be to capture a backtrace of all threads once the app is in this problem state. That might tell us where things are getting stuck. Also, we have still been unable to reproduce the problem from your description. Might there be something special about your media playback, network, or other configuration details that would help us to reproduce the issue? Thank you, |
Hi @scotthea-amazon , Thanks for the help. |
Hi, Thanks for pointing this issue out. We'll include a fix in the next release. Thanks, |
This issue should be fixed in v1.9. Thanks! |
Hi @kclchan , Could you help to point which files need merger into v1.7 from v1.9 If we need to use AVS SDK v1.7 ? Thanks |
Hi @jackeyu, unfortunately the fix involves changes to multiple files and is not simple to merge into v1.7. I would recommend upgrading to v1.9. |
Hi AVS team,
AVS SDK 1.7.1
I attached a log for this case, and it can be 100% reproduced.
-- while you listen audio start, remove the Internet from WiFi AP (but WiFi is still on)
-- TuneIn will continue to play 1 or 2 minutes, maybe buffering.
-- wait about 6 minutes (this log, I wait 10 minutes)
-- put Internet resume on WiFi AP, and make sure Internet is actually working.
-- No audio streaming resume from TuneIn
-- You will see the BBC TuneIn is still running on APP or Web APP (in playing state)
-- AVS go into listening state, no response. From the APP history, no text is detected.
AVS will never return until restart it.
And it's very interesting, if it's not in playing state, and test it in Idle state, I mean, in idle state, just disconnect Internet and resume, it will be fine and no problem.
Why I do this test is that, during past months test, we sometimes get AVS stuck in overnight test,
the symptom is AVS is no response, but whole system seems fine.
To review the source code, we found a ping time out for 5 minutes in "HTTP2Transport.cpp".
So I design this test to check if it will be problem while it failed to ping timeout.
And if you change 5 minutes to 1 minutes for example, it will reduce your test time.
I found, while you see the log "[ 3] 9 AttachmentReaderSource:read:size=0,status=1" dump,
it will go to fail.
In the log, you will see my first trigger it at line#5, and let it running for 10 minutes.
Next trigger is at line#7178, and line#7692. The AVS will go into listening state but no response.
2.log
The text was updated successfully, but these errors were encountered: