ESP8266: Avoid duplicate data sends (5.15) #13164

Merged
merged 7 commits into from
Jul 28, 2020

Conversation

ccli8
Contributor

@ccli8 ccli8 commented Jun 22, 2020

Summary of changes

Backport of #12157 to mbed-os-5.15.


Pull request type

[x] Patch update (Bug fix / Target update / Docs update / Test update / Refactor)
[ ] Feature update (New feature / Functionality change / New API)
[ ] Major update (Breaking change, e.g. Return code change / API behaviour change)

Test results

[ ] No Tests required for this change (e.g. docs only update)
[x] Covered by existing mbed-os tests (Greentea or Unittest)
[ ] Tests / results supplied as part of this PR

Reviewers

@michalpasztamobica


michalpasztamobica and others added 6 commits June 19, 2020 17:45
We are now checking if ESP8266 has confirmed receiving data over serial
port with an undocumented (but existing) "Recv x bytes" message. Next we
are explicitly waiting for an official "SEND OK".
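
A minimal sketch of the two-stage confirmation described in this commit message, written as a standalone function over already-received reply lines. The names `SendResult` and `check_send_reply` are hypothetical and are not taken from the driver; the real code works on the AT parser stream rather than on a vector of strings. The `Pending` state is an assumption made here to show why the two stages matter: once the modem has acknowledged the bytes, resending them would risk duplicate data.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical result codes for this sketch (not the driver's real enum).
enum class SendResult { Ok, NotAcknowledged, Pending, Failed };

// Inspect the modem's reply lines after `expected` payload bytes were written.
// Stage 1: "Recv <n> bytes" must report exactly the number of bytes written.
// Stage 2: a later "SEND OK" confirms the send; "SEND FAIL" reports failure.
SendResult check_send_reply(const std::vector<std::string> &lines, int expected)
{
    bool acked = false;
    for (const std::string &line : lines) {
        if (!acked) {
            int n = 0;
            if (std::sscanf(line.c_str(), "Recv %d bytes", &n) == 1 && n == expected) {
                acked = true;
            }
            continue;
        }
        if (line == "SEND OK") {
            return SendResult::Ok;
        }
        if (line == "SEND FAIL") {
            return SendResult::Failed;
        }
    }
    // Acknowledged but no final status yet: the caller should keep waiting
    // rather than resend. Never acknowledged: the modem did not take the data.
    return acked ? SendResult::Pending : SendResult::NotAcknowledged;
}
```

For example, the replies `"Recv 4 bytes"` followed by `"SEND OK"` yield `SendResult::Ok` for a 4-byte write, while a lone `"ERROR"` yields `SendResult::NotAcknowledged`.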
1.  Fix 'spurious close' by calling close() in open(). 'Spurious close' becomes frequent and can no longer be ignored once send() is asynchronous. The user can retry open() until the 'spurious close' takes effect.
2.  Allow only one actively sending socket, because:
    (1) The ESP8266 AT packets 'SEND OK'/'SEND FAIL' are not associated with a socket ID, so there is no way to tell which socket they belong to.
    (2) In the original implementation, ESP8266::send() is synchronous, which already implies only one actively sending socket.
3.  Register the 'SEND OK'/'SEND FAIL' OOBs in the ESP8266::ESP8266 constructor, like the others. Do not tie OOB management to the send status, because the ESP8266 modem may not reply with these packets in error cases.
4.  Now that ESP8266::send() is asynchronous, drop the code that calls _parser.recv("SEND OK")/_parser.recv("SEND FAIL"). Those calls and the 'SEND OK'/'SEND FAIL' OOBs would both consume the same packets and complicate flow control.
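
The single-active-sender rule in points 2 and 3 can be modelled roughly as follows. This is a simplified sketch, not the driver code: `SingleSenderGate`, `begin_send`, `on_send_ok` and `on_send_fail` are hypothetical names, and the real driver registers its handlers with the AT command parser instead of exposing them as plain methods.

```cpp
// Hypothetical names throughout; a simplified model of the rule above.
enum class Status { Ok, WouldBlock, Error };

class SingleSenderGate {
public:
    // Start an asynchronous send on `socket_id`. Only one send may be in
    // flight at a time, because 'SEND OK'/'SEND FAIL' carry no socket ID.
    Status begin_send(int socket_id)
    {
        if (_busy) {
            return Status::WouldBlock;   // some socket is still sending
        }
        _busy = true;
        _active_id = socket_id;
        return Status::Ok;
    }

    // Out-of-band handlers, meant to be registered once (e.g. in a constructor).
    void on_send_ok()   { finish(Status::Ok); }
    void on_send_fail() { finish(Status::Error); }

    bool busy() const { return _busy; }
    int active_id() const { return _active_id; }
    Status last_status() const { return _last; }

private:
    // Attribute the status to the one active socket and release the gate.
    void finish(Status s)
    {
        if (!_busy) {
            return;                      // stray status with nothing in flight
        }
        _last = s;
        _busy = false;
        _active_id = -1;
    }

    bool _busy = false;
    int _active_id = -1;
    Status _last = Status::Ok;
};
```

With this model, a second `begin_send` call made before `on_send_ok` or `on_send_fail` arrives returns `Status::WouldBlock`, so the caller retries later instead of interleaving two sends whose results could not be told apart.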
This is because the ESP8266 is now waiting for SEND OK and takes much
longer to complete the send_repeat and echo_burst tests in RAAS
(but has a higher pass ratio).
@ciarmcom
Member

@ccli8, thank you for your changes.
@michalpasztamobica @ARMmbed/mbed-os-maintainers please review.

@ciarmcom ciarmcom requested review from michalpasztamobica and a team June 22, 2020 03:00
Contributor

@michalpasztamobica michalpasztamobica left a comment


If the fix is required on the 5.15 branch, then I don't mind. Technically this looks sane.

@ccli8
Contributor Author

ccli8 commented Jun 22, 2020

If the fix is required on the 5.15 branch, then I don't mind. Technically this looks sane.

@michalpasztamobica Some Nuvoton targets use the ESP8266 as their default network interface and still need 5.15 support, so this PR is backported.

0xc0170
0xc0170 previously approved these changes Jun 22, 2020
@mergify mergify bot added needs: CI and removed needs: review labels Jun 22, 2020
@0xc0170
Contributor

0xc0170 commented Jun 26, 2020

CI started

@mbed-ci

mbed-ci commented Jun 26, 2020

Test run: FAILED

Summary: 2 of 10 test jobs failed
Build number : 2
Build artifacts

Failed test jobs:

  • jenkins-ci/mbed-os-ci_dynamic-memory-usage-lts
  • jenkins-ci/mbed-os-ci_greentea-test-lts

@mergify mergify bot added needs: work and removed needs: CI labels Jun 26, 2020
@michalpasztamobica
Contributor

The greentea failures look relevant: they happened on K66F+ESP8266, not on other targets.
These GT tests failed:

[2020-06-26T14:39:24.795Z] 	test case: 'TCPSOCKET_SEND_REPEAT' ........................................................... FAIL in 10.86 sec
[2020-06-26T14:39:24.795Z] 	test case: 'TCPSOCKET_SEND_TIMEOUT' .......................................................... FAIL in 0.10 sec
[2020-06-26T14:39:24.795Z] 	test case: 'TCPSOCKET_SETSOCKOPT_KEEPALIVE_VALID' ............................................ OK in 0.12 sec
[2020-06-26T14:39:24.795Z] 	test case: 'TCPSOCKET_THREAD_PER_SOCKET_SAFETY' .............................................. FAIL in 2.81 sec

@ccli8 , are you able to try it on your desk and see if you can reproduce it?

The dynamic memory check looks rather suspicious, as it claims to have failed with K66F and ethernet:

[2020-06-26T13:56:30.748Z] | GCC_ARM-K66F | K66F          | ethernet   | ethernet   | 0      | 1      | ERROR  | 78.27              |

@mergify mergify bot dismissed 0xc0170’s stale review June 30, 2020 08:02

Pull request has been modified.

@ccli8
Contributor Author

ccli8 commented Jun 30, 2020

are you able to try it on your desk and see if you can reproduce it?

@michalpasztamobica Reproducible on my NUMAKER_IOT_M487. With send(...) now non-blocking, the modem can still be busy sending previous data and fail AT+CIPCLOSE in close(...). The netsocket-tcp tests require close(...) to succeed. I added more retries in close(...) so that it succeeds often enough to pass the GT tests.

@michalpasztamobica
Contributor

@ccli8 , I wonder - is the _busy flag set in the scenario you described? If so - perhaps it would be safer to only loop if the flag is indeed set? With the latest change, not only is the maximum loop count increased but also a 1s delay added to every loop, which may block close() for up to 10 seconds.
Also - if possible, it would be worth checking if the socket is in non-blocking mode and only looping in that case :).
Let me know your thoughts on this.

With send(...) now non-blocking, the modem can still be busy sending previous data and reply 'busy' to the current 'AT+CIPCLOSE' command.
In blocking mode, add retries to avoid spurious close to some degree.
This is required to pass the GT netsocket-tcp/netsocket-tls tests, which expect close(...) to succeed.
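
A rough sketch of the retry policy this commit message describes: retry the close only for blocking callers, and give up after a bounded number of attempts. `close_with_retry`, `try_close` and `wait` are hypothetical names used for illustration; in the driver the attempt would be an AT+CIPCLOSE exchange, and the retry count and delay are not specified here.

```cpp
#include <functional>

// Hypothetical helper. `try_close` stands in for one close attempt and
// returns true on success; `wait` stands in for the delay between attempts.
bool close_with_retry(const std::function<bool()> &try_close,
                      const std::function<void()> &wait,
                      bool blocking, int max_retries)
{
    if (try_close()) {
        return true;
    }
    if (!blocking) {
        return false;   // non-blocking callers get the failure right away
    }
    for (int i = 0; i < max_retries; ++i) {
        wait();         // give the modem time to finish the pending send
        if (try_close()) {
            return true;
        }
    }
    return false;
}
```

Passing the attempt and the delay as callables keeps the sketch independent of any RTOS or serial API; the worst-case blocking time is `max_retries` times the delay plus the time spent in the attempts themselves.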
@ccli8 ccli8 force-pushed the esp8266_send_retry_5.15 branch from 07ad4a8 to 935134e Compare July 1, 2020 08:04
@ccli8
Contributor Author

ccli8 commented Jul 1, 2020

@michalpasztamobica Restored ESP8266::close(...) and moved the retry loop to ESP8266Interface::socket_close(...), where it applies only in blocking mode.

@0xc0170
Contributor

0xc0170 commented Jul 22, 2020

CI started

@mbed-ci

mbed-ci commented Jul 22, 2020

Test run: SUCCESS

Summary: 10 of 10 test jobs passed
Build number : 3
Build artifacts

@adbridge
Contributor

Approved by @andypowers

@adbridge adbridge merged commit a85f2bb into ARMmbed:mbed-os-5.15 Jul 28, 2020
ccli8 added a commit to OpenNuvoton/NuMaker-mbed-Aliyun-IoT-CSDK-example that referenced this pull request Aug 28, 2020
Update to mbed-os 6.0 is suspended due to incompatibility.

Update to mbed-os 5.15.5 which fixes ESP8266 driver issue:
ARMmbed/mbed-os#13164
ccli8 added a commit to OpenNuvoton/NuMaker-mbed-Aliyun-IoT-CSDK-OTA-example that referenced this pull request Aug 28, 2020
Update to mbed-os 6.0 is suspended due to incompatibility.

Update to mbed-os 5.15.5 which fixes ESP8266 driver issue:
ARMmbed/mbed-os#13164
ccli8 added a commit to OpenNuvoton/Mbed-to-Azure-IoT-Hub that referenced this pull request Sep 30, 2020
Update to mbed-os 5.15.5 which fixes ESP8266 driver issue:
ARMmbed/mbed-os#13164
@cyliangtw cyliangtw deleted the esp8266_send_retry_5.15 branch March 9, 2023 03:59
ccli8 added a commit to ccli8/NuMaker-mbed-ce-AWS-IoT-example that referenced this pull request Nov 26, 2024
Update to mbed-os 6.0 is suspended due to incompatibility.

Update to mbed-os 5.15.5 which fixes ESP8266 driver issue:
ARMmbed/mbed-os#13164
ccli8 added a commit to ccli8/NuMaker-mbed-ce-wifi-tcp-example that referenced this pull request Jan 7, 2025
Update to mbed-os 6.0 is suspended due to incompatibility.

Update to mbed-os 5.15.5 which fixes ESP8266 driver issue:
ARMmbed/mbed-os#13164
6 participants