Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update of mesh network library. #4718

Merged
merged 29 commits into from
Aug 1, 2018
Merged

Update of mesh network library. #4718

merged 29 commits into from
Aug 1, 2018

Conversation

aerlon
Copy link
Contributor

@aerlon aerlon commented May 10, 2018

The mesh network library was not working well, so I took a look at it and managed to improve it to a relatively well working state.

As noted in the new library readme: The library has been tested and works with Arduino core for ESP8266 version 2.3.0 (with default lwIP) and 2.4.1 (with lwIP 1.4). Random disconnects and crashes seem to happen with the most recent master branch, so better stick to the release builds for now. Also, nodes seem to be unable to connect to more than one other node when using lwIP 2.0.

…ring initial connection to speed up connection time. Add separate handlers for requests and responses. Add network password. Provide more detailed code example. Add optional verbose mode. Improve comments. Add readme file.
@d-a-v
Copy link
Collaborator

d-a-v commented May 11, 2018

About lwip2 not working properly, static IP bug was recently fixed in lwip2, after core-2.4.1 release.
Can you print out the full version string Esp.getFullVersion() (also visible in debug mode) ?

@aerlon
Copy link
Contributor Author

aerlon commented May 11, 2018

The output from ESP.getFullVersion() is: SDK:2.2.1(cfd48f3)/Core:win-2.5.0-dev/lwIP:1.4.0rc2

I noticed that a lwIP2 fix was released a few days ago (#4677) but I get the same result both with and without that fix.

In fact, I think the problem is unrelated to static IP. Even when I set the network to only use DHCP the problems persist.

The main issue when using the current master branch seems to be that just after transmitting something to an AP with client.print(), the following condition sometimes turns true: if (WiFi.status() == WL_DISCONNECTED || !curr_client.connected()) . More precisely, !curr_client.connected() turns true even though WiFi.status() == WL_DISCONNECTED is false. The AP still receives the station's transmission.

Indeed, removing the !curr_client.connected() check from the condition above makes the mesh code work well even with the most recent master branch (and lwIP 1.4), but it seems to me that if a station can both send and receive transmissions from an AP then !curr_client.connected() should be false. Especially since this is the case in previous releases. Is this a topic for a bug report?

@aerlon
Copy link
Contributor Author

aerlon commented May 11, 2018

I also did a trial run with lwIP2 and the current master branch with the !curr_client.connected() check removed.

Though transmissions work well initially as with lwIP1.4, whenever a connection to a second AP is established (meaning static IP is disabled and DHCP is activated (via WiFi.config(0u,0u,0u))) the node becomes unable to connect to any AP (connection attempts take more than 10 s).

Output from ESP.getFullVersion(): SDK:2.2.1(cfd48f3)/Core:win-2.5.0-dev/lwIP:2.0.3(STABLE-2_0_3_RELEASE/glue:arduino-2.4.1-7-g2b827f8)

@d-a-v
Copy link
Collaborator

d-a-v commented May 11, 2018

Well, that's a lot of informations. Can we accurately summarize ?

  • (WiFiClient:) curr_client.connected() is not true but nonetheless connected and working when a new client has come (via WiFiServer?), in some mode (STA, AP, AP+STA?), with both lwip1.4 and 2.

  • lwip2 does not behave correctly (at least not like lwip1.4) when connecting to a dhcp server (provided by another ESP in AP mode? locally in AP+STA mode? both ?) using WiFi.config(0u,0u,0u) (I'm not sure this is the recommended way to get to dhcp-client mode)

You'll understand that we need to reproduce bugs separately. As it is more easy to try and fix bugs with MCVEs, it'd be great if you could open issues with them.

…tructor. Make use of static IP optional by moving static IP initialization code to setStaticIP method. Increase scanning interval from one to two seconds in the HelloMesh.ino example to increase chances of successful connections. Update comments. Update README.rst.
@aerlon
Copy link
Contributor Author

aerlon commented May 11, 2018

Heh, yeah. And still I tried to be brief.

Your summary is correct. Answers:
Via WiFiServer? Yes.
Mode is STA for station, AP+STA for AP.

Server is provided by another ESP in AP+STA mode.
Checking the source at line 223-230 WiFi.config(0u,0u,0u) seemed to be the most supported method. I'd be happy to hear of a more official one, though, because I found no documentation about it.

I'll see if I can reduce the code to simpler test cases for you and open issues with them. ETA a day or two.

@d-a-v
Copy link
Collaborator

d-a-v commented May 11, 2018

Thanks.
About WiFi.config(0u,0u,0u) you are right, this seems to be the way. I'll check that with lwip2.

@d-a-v
Copy link
Collaborator

d-a-v commented May 14, 2018

I tried calling WiFi.config(0u,0u,0u) when IP is static with lwIP-v2 and it gets back to dhcp-client mode.

@aerlon
Copy link
Contributor Author

aerlon commented May 14, 2018

@d-a-v Issues #4728 and #4727 have been filed regarding the problems discussed above.

While working on #4727 I noticed that with the latest master branch and lwIP2 it is possible for a station to switch between WiFi networks even while using static IP, provided WiFi.mode(WIFI_OFF) is called before the switch. This is not possible with any other lwIP or ESP8266 Arduino core.

Previously I did not find a way to have a station switch networks while maintaining static IP (connections to all but the first network would always fail, thus the use of WiFi.config(0u,0u,0u) discussed above to activate DHCP), but it would seem possible after all.

Is there an officially supported way to have a station switch WiFi network while maintaining static IP that I've missed?

@d-a-v
Copy link
Collaborator

d-a-v commented May 14, 2018

Thanks for the two mcve!
I will look into and try them as soon as I can make a 3-esp setup to play with.

Is there an officially supported way to have a station switch WiFi network while maintaining static IP that I've missed?

I personally don't know. I think it is always safe and recommended to reset IP addresses (static or dhcp) after any network change (ethernet plugout-plugin, access-point change).

@aerlon
Copy link
Contributor Author

aerlon commented May 14, 2018

You're welcome!

OK.

@devyte
Copy link
Collaborator

devyte commented Jul 3, 2018

@aerlon is there anything pending for this?

@aerlon
Copy link
Contributor Author

aerlon commented Jul 3, 2018

@devyte Yes. I've been working on a large update during the last few weeks, trying to make the library more flexible. It is pretty much done now, so the plan is to push the update to this PR this week. The update will also modify the library documentation so that it is up to date with the recent bug fixes for lwIP2.

@devyte
Copy link
Collaborator

devyte commented Jul 4, 2018

Great! In that case I'll wait.

aerlon and others added 2 commits July 8, 2018 12:31
…reatly increase flexibility.

Changes:
* Make WiFi-connection related variables static to allow for the use of multiple ESP8266WiFiMesh instances on a single node (useful e.g. when communicating with several different mesh networks).
* Make it possible to choose AP port, which is helpful when using multiple ESP8266WiFiMesh AP:s on a single node.
* Add user-customizable network filter.
* Make activation of own AP optional for each mesh node.
* Add ways to change mesh network name and node id for existing ESP8266WiFiMesh instances.
* Add verboseModePrint method to clean up the code.
* Add reactivation of static IP after successful data transfers to speed up re-connection attempts.
* Add empty_IP constant which can be used to check if static IP is disabled for a ESP8266WiFiMesh instance.
* Remove the WiFiClient _client class variable in ESP8266WiFiMesh since there is no need to save _client in the class instance.
* Add transmission status as a return value from attemptTransmission.
* Pass calling ESP8266WiFiMesh instance pointer to callback functions to allow for greater range of actions in callbacks.
* Make transmission message a class variable to allow it to be stored in the class and accessed from callbacks.
* Add getters for mesh name and node id to ESP8266WiFiMesh.
* Add getter and setter for networkFilter to ESP8266WiFiMesh.
* Increase range of available node_id:s by changing the type to String and adding functions to convert between String and uint64_t using a customizable radix between 2 and 36.
* Make it possible to connect to several nodes during each attemptTransmission call.
* Add static connection_queue and latest_transmission_outcomes vectors to the ESP8266WiFiMesh class, a NetworkInfo class and a TransmissionResult class to aid in bookkeeping when connecting to several AP:s during one attemptTransmission call.
* Make wifi_channel and BSSID optional when connecting to an AP (though excluding them will slow down the connection process).
* Add optional scan and static ip optimizations available in Arduino core for ESP8266 version 2.4.2.
* Add functions to check lwIP version in order to enable WiFi optimizations only available with lwIP2.
* Add concluding_disconnect, initial_disconnect and no_scan options to the attemptTransmission method.
* Update documentation.
@aerlon
Copy link
Contributor Author

aerlon commented Jul 22, 2018

@devyte As per our discussions I have added a compatibility layer to the new library code. The old mesh library API is now deprecated but valid until core version 2.5.0.

unsigned int request_i = 0;
unsigned int response_i = 0;

String manageRequest(String request);
String manageRequest(String request, ESP8266WiFiMesh *mesh_instance);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The request argument is declared for passing by value, which needlessly duplicated mem. It should probably be declared as reference, or better a const reference.
While strictly speaking it's ok to declare the mesh_instance as a pointer, it's better to declare as reference to make it explicit that the arg can never be a nullptr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

@@ -1,50 +1,129 @@
#include <ESP8266WiFi.h>
#include <ESP8266WiFiMesh.h>

String mesh_name = "Mesh_Node";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be declared as String meshName(F("meshNode"));
Also, please use camelCase for naming convention.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Camel case or snake case work either way for me. I noticed snake case was used in the original mesh library, but I take it style conventions has changed since then?

Regarding the actual mesh name, though, I think it may be prudent to not write it in the same style as a programming object. It is just a network name, after all, and its lack of relationship to other program objects may be easier for the novice user to deduce when it does not follow the typical naming conventions for objects in the library. Thus String meshName(F("Mesh_Node")); or String meshName(F("MeshNode_")); or some such mesh name format may be useful.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style wasn't really pursued before, commenting on it is rather recent. In general, current naming convention is camelCase for C++ and snake_case for C.

String manageRequest(String request);
String manageRequest(String request, ESP8266WiFiMesh *mesh_instance);
transmission_status_t manageResponse(String response, ESP8266WiFiMesh *mesh_instance);
void networkFilter(int number_of_networks, ESP8266WiFiMesh *mesh_instance);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mesh_instanc declare as reference

@returns The string to send back to the other node
*/
String manageRequest(String request) {
String manageRequest(String request, ESP8266WiFiMesh *mesh_instance) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

request => ref, mesh_instance => ref

Serial.println(request);

/* return a string to send back */
char response[60];
sprintf(response, "Hello world response #%d from Mesh_Node%d.", response_i++, ESP.getChipId());
sprintf(response, "Hello world response #%d from %s%s.", response_i++, mesh_instance->getMeshName().c_str(), mesh_instance->getNodeID().c_str());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This string could easily be made larger than the size of response[60] by a user playing with the example. I suggest a more robust approach.

* @param base The radix of "string". Must be between 2 and 36.
* @returns A uint64_t of the string, using radix "base" during decoding.
*/
uint64_t ESP8266WiFiMesh::StringToUint64(String string, byte base)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string => ref

* Note that using a base higher than 16 increases likelihood of randomly generating ssid strings containing controversial words.
*
* @param string The string to convert to uint64_t. String must use radix "base".
* @param base The radix of "string". Must be between 2 and 36.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this 2-36 range enforced somehow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The problem is that I’m not sure how I would do that in a nice way since I can’t throw exceptions in this environment (as far as I know). The core conversion functions used by my code (String() and strtoul()) also seem to accept arguments that are out of range, so I just propagated that behaviour through my functions. Of course I could clamp the values to the 2-36 range, but then I would silently convert invalid input arguments to other values which I think could confuse the user more than just getting wrong results. Ideas are welcome.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WTB exceptions :) there have been several attempts, some I suspect are rather close.

I suggest a debug message and/or an assert for now.

}

return result;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this function fail? what happens if the string passed as arg is invalid or has a mistake?
Does it make sense to put these conversion files elsewhere? At least in standalone files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, any invalid input is treated as a 0.

}

return (result == "" ? "0" : result);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to put this function in a standalone file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I could do that. Do you mean as a separate utility class or just definitions in a separate .cpp?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just movemto some separate file would be enough. The functions are generic and not specific to the contents of these files.

* Note that using a base higher than 16 increases likelihood of randomly generating ssid strings containing controversial words.
*
* @param number The number to convert to a string with radix "base".
* @param base The radix to convert "number" into. Must be between 2 and 36.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this 2-36 range enforced somehow?

@devyte
Copy link
Collaborator

devyte commented Jul 24, 2018

@aerlon I had originally thought to wait on making this review, because most of what I saw was guidelines and cosmetic. However, I just took a last look before merging and noticed the passing by value instead of reference, and the duplicate declarations of the std::function types, so I decided to make the full review now.
Do you think you can still make the changes for 2.4.2?

@aerlon
Copy link
Contributor Author

aerlon commented Jul 24, 2018

@devyte Today is a little busy, but I should have time to go through this tomorrow.

@aerlon
Copy link
Contributor Author

aerlon commented Jul 26, 2018

@devyte Thanks for the feedback! I left some comments for further discussion on a few of the points. I’ll have to do some testing before answering a few of the others.

Actually it is quite good that you postponed the merge because looking through the new API for this I got an idea for a change that would be nice to get in before the code goes public. I’ll see if I can find you on Gitter tomorrow to discuss how creative I should get while implementing that change.

Ready by 1st of August you say? No problem! Software is always ready on time, as I’m sure you know. (Seriously though, it should be OK.)

@devyte
Copy link
Collaborator

devyte commented Jul 26, 2018

Code reviews always help in some way :)
Just look me up to discuss.

"We need it for yesterday, bug free, and performance tested"

d-a-v and others added 3 commits July 30, 2018 14:19
…tance.

- Separate methods for changing mesh name and node id from AP control methods.
- Add methods getAPController and isAPController to better handle situations when multiple ESP8266WiFiMesh instances take turns to be in control of the AP.
- Create separate UtilityMethods.cpp file for utility methods.
- Improve code efficiency and robustness, e.g. by passing arguments by reference instead of by value for non-POD types and employing typedefs.
- Update README.md.
@aerlon
Copy link
Contributor Author

aerlon commented Jul 31, 2018

@devyte Now all but the camelCase style issue should be fixed. I’ll take care of that one later today, then the PR should be ready for release. Unless new issues pop up.

*/
static const IPAddress emptyIP;

static String uint64ToString(uint64_t number, byte base = 16);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like I wasn't clear about this. I meant move this completely out of the class. These two functions have nothing to do with the mesh nodes, or wifi, or even with the ESP really. They should be free standing functions.
But let's leave that for a subsequent cleanup.

* Calculate the current lwIP version number and store the numbers in the _lwipVersion array.
* lwIP version can be changed in the "Tools" menu of Arduino IDE.
*/
void ESP8266WiFiMesh::storeLwipVersion()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should be possible to simplify this. As it is, it's rather fragile.
Note: investigate getting lwip version at compile time.

Copy link
Contributor Author

@aerlon aerlon Aug 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. However there seemed to be no official way to get the lwIP version when I wrote this code, so I had to make my own. Still, a more officially supported method would be better.

} else if (transmissionResult.transmissionStatus == TS_TRANSMISSION_COMPLETE) {
// No need to do anything, transmission was successful.
} else {
Serial.println("Invalid transmission status for " + transmissionResult.SSID + "!");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be an assert, because if the condition ever happens it's due to a programming error or mem corruption.

@devyte
Copy link
Collaborator

devyte commented Aug 1, 2018

A few minor cleanups pending, but good enough to merge as-is.

@devyte devyte merged commit 7d5997d into esp8266:master Aug 1, 2018
@aerlon
Copy link
Contributor Author

aerlon commented Aug 2, 2018

Nice. I will open a new PR when I’m done with the fixes for the remaining issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants