Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot stream transforms via the frameTransformClient from Windows to Ubuntu because yarp::os::Time::now() is platform dependent #3038

Closed
S-Dafarra opened this issue Nov 7, 2023 · 15 comments · Fixed by #3041
Labels
Issue Type: Bug Involves some intervention from a system administrator

Comments

@S-Dafarra
Copy link
Contributor

Describe the bug
We have two PCs, one running Windows 11 to access Windows device, and one running Ubuntu, connected to the same network. The Windows PC runs the frameTransformServer device and an application publishes transforms. These transforms are then read on the Ubuntu machine using the following configuration

    <device type="frameTransformClient" name="TransformClient">
        <param name="period">0.01</param>
        <param name="filexml_option">ftc_yarp_only.xml</param>
        <param name="ft_client_prefix">/IFrameTransformToIWear/tf</param>
        <param name="local_rpc">/IFrameTransformToIWear/tf/local_rpc</param>
    </device>

When we attempt to read the transform, we get an error of the type

[ERROR] |yarp.device.FrameTransformClient| getTransform(): Frames  root_link_desired  and  vive_tracker_right_elbow_pose  are not connected

By enabling the verbosity of the FrameTransformStorage (basically commenting the lines

if (config.check("FrameTransform_verbose_debug") &&
config.find("FrameTransform_verbose_debug").asInt32() == 1)
on the Ubuntu machine)

we obtain the following messages

[DEBUG] |yarp.device.frameTransformContainer|ftc_storage.container| At time 1699376406.449361 Transform expired: openxr_origin_raw -> openxr_origin with timestamp 1218348.232036
[DEBUG] |yarp.device.frameTransformContainer|ftc_storage.container| At time 1699376406.449361 Transform expired: openxr_origin_raw -> openxr_head with timestamp 1218348.233145
[DEBUG] |yarp.device.frameTransformContainer|ftc_storage.container| At time 1699376406.449361 Transform expired: openxr_origin_raw -> vive_tracker_left_elbow_pose with timestamp 1218348.232411
[DEBUG] |yarp.device.frameTransformContainer|ftc_storage.container| At time 1699376406.449361 Transform expired: openxr_origin_raw -> vive_tracker_waist_pose with timestamp 1218348.233626
[DEBUG] |yarp.device.frameTransformContainer|ftc_storage.container| At time 1699376406.449361 Transform expired: openxr_origin -> stable_waist with timestamp 1218348.234057
[DEBUG] |yarp.device.frameTransformContainer|ftc_storage.container| At time 1699376406.449361 Transform expired: openxr_origin -> world with timestamp 1218348.234509
[DEBUG] |yarp.device.frameTransformContainer|ftc_storage.container| At time 1699376406.449361 Transform expired: openxr_origin -> root_link_desired with timestamp 1218348.235078

It is possible to notice that there are two largely different timestamps: 1699376406.449361, and 1218348.235078. 1699376406.449361 is the Unix epoch (see https://www.epochconverter.com/clock). We instead verified that 1218348.235078 is the output of yarp::os::Time::now() on the Windows machine.

Indeed, the Windows epoch is completely different: https://devblogs.microsoft.com/oldnewthing/20090306-00/?p=18913

This might be related to the fact that time_since_epoch, used in

return std::chrono::time_point_cast<std::chrono::duration<double>>(std::chrono::high_resolution_clock::now()).time_since_epoch().count();

is guaranteed to be relative to the UNIX epoch only since C++20: https://stackoverflow.com/a/6012671

With @randaz81, we also tried to compile YARP with C++20 on Windows, but given the corresponding high number of issues, we paused the activity.

To Reproduce
Steps to reproduce the behavior:
Run yarp::os::Time::now() on Windows and Ubuntu to check the difference.

Expected behavior
A clear and concise description of what you expected to happen.
If yarp::os::Time::now()returns the same on each machine, then also the frameTransformClient issue should be solved.

Screenshots
If applicable, add screenshots to help explain your problem.

Configuration (please complete the following information):

  • OS: Windows 11 - Ubuntu 20.04
  • yarp version: 3.7.2
  • compiler: Visual Studio 22 17.4.4 - gcc version 9.4.0

Additional context
Add any other context about the problem here.

@S-Dafarra
Copy link
Contributor Author

I wonder if we could modify yarp::os::Time::now() to return explicitly the time since January 1, 1970

@randaz81
Copy link
Member

randaz81 commented Nov 8, 2023

This issue has a big impact on all setups where windows machines have to communicate with linux machines. It may involve not only the frameTransform-related software but also other modules, such as camera/encoders synchronization. It must be noted that this issue cannot be caught by our CI which does not test communication between different os.

@randaz81 randaz81 added the Issue Type: Bug Involves some intervention from a system administrator label Nov 8, 2023
@S-Dafarra
Copy link
Contributor Author

What about defining a CustomClock with a defined epoch, and use it for all the applications that require synchronization between different machines?

cc @traversaro @GiulioRomualdi

@randaz81
Copy link
Member

randaz81 commented Nov 8, 2023

Even if this is tipically done in simulation when you use stream the gazebo clock throught a port, I do not think propagating a custom clock through the network is a good idea.
The issue must be solved at its root, as you mentioned:

I wonder if we could modify yarp::os::Time::now() to return explicitly the time since January 1, 1970

@traversaro
Copy link
Member

As this timestamp is meant to be interoperable with ROS, I think we can check which timestamp code is used in ROS and use that.

@traversaro
Copy link
Member

As this timestamp is meant to be interoperable with ROS, I think we can check which timestamp code is used in ROS and use that.

See:

at this point, I lost where the actual get_now of the clock used by default is defined, as I guess multiple clock can be defined in ROS. However, apparently tf2 uses a different definition of time, see https://github.com/ros2/geometry2/blob/e8a7223bb6a55c8116f439dbbc947d93885d29f9/tf2/src/time.cpp#L41 , that is converted to a double in https://github.com/ros2/geometry2/blob/e8a7223bb6a55c8116f439dbbc947d93885d29f9/tf2/src/time.cpp#L64 . So either std::chrono::system_clock::now() uses the UNIX epoch for some reason in Windows, or the same problem is there also in ROS.

@traversaro
Copy link
Member

However, apparently tf2 uses a different definition of time, see https://github.com/ros2/geometry2/blob/e8a7223bb6a55c8116f439dbbc947d93885d29f9/tf2/src/time.cpp#L41 , that is converted to a double in https://github.com/ros2/geometry2/blob/e8a7223bb6a55c8116f439dbbc947d93885d29f9/tf2/src/time.cpp#L64 . So either std::chrono::system_clock::now() uses the UNIX epoch for some reason in Windows, or the same problem is there also in ROS.

No, apparently that function exists but then it is not used in the actual nodes that publish transform, see for example: https://github.com/ros2/geometry2/blob/e8a7223bb6a55c8116f439dbbc947d93885d29f9/tf2_ros/src/static_transform_broadcaster_node.cpp#L73C31-L73C34 , see also ros2/geometry2#67 .

@traversaro
Copy link
Member

Ok, I found the code in ROS2 were they account for different epochs between Windows and Linux, exactly how we were discussing to do @S-Dafarra :

@S-Dafarra
Copy link
Contributor Author

Ok, I found the code in ROS2 were they account for different epochs between Windows and Linux, exactly how we were discussing to do @S-Dafarra :

The second link does not work for me

@traversaro
Copy link
Member

You are right, I blindly copied it from the code. By looking in webarchive, the updated link is https://support.microsoft.com/en-us/topic/bf03df72-96e4-59f3-1d02-b6781002dc7f .

@traversaro
Copy link
Member

You are right, I blindly copied it from the code. By looking in webarchive, the updated link is https://support.microsoft.com/en-us/topic/bf03df72-96e4-59f3-1d02-b6781002dc7f .

Fixed upstream in ros2/rcutils#438 .

@S-Dafarra
Copy link
Contributor Author

I have tried to get the number of seconds since 1st January 1970. In order to test it, I started editing these lines

while (true) {
double t = yarp::os::Time::now();
if (ros) {
int sec = (int) t;
int nsec = (int)((t-sec)*1e9);
yCInfo(COMPANION, "%d %d", sec, nsec);
} else {
yCInfo(COMPANION, "%f", t);
}
fflush(stdout);
clk.delay(0.1);
}

with the following:

    while (true) {
        double t = yarp::os::Time::now();

        auto currentTime = std::chrono::high_resolution_clock::now();

        std::tm epochTimeInfo = {};
        epochTimeInfo.tm_year = 1970 - 1900;
        epochTimeInfo.tm_mon = 0;
        epochTimeInfo.tm_mday = 1;
        epochTimeInfo.tm_hour = 0;
        epochTimeInfo.tm_min = 0;
        epochTimeInfo.tm_sec = 0;
        time_t epochTimeWithZone = std::mktime(&epochTimeInfo);
        auto epochTimeUTC = std::mktime(std::gmtime(&epochTimeWithZone));
        auto epochTimeManual = std::chrono::high_resolution_clock::from_time_t(epochTimeUTC);

        auto epochTimeChrono = std::chrono::time_point<std::chrono::high_resolution_clock>{};

        double seconds_since_epoch = std::chrono::duration<double>(currentTime - epochTimeManual).count();
        double epochVariation = std::chrono::duration<double>(epochTimeChrono - epochTimeManual).count();

        if (ros) {
            int sec = (int) t;
            int nsec = (int)((t-sec)*1e9);
            yCInfo(COMPANION, "%d %d", sec, nsec);
        } else {
            yCInfo(COMPANION, "yarp %f  --  since epoch %f -- epoch difference %f", t, seconds_since_epoch, epochVariation);
        }
        fflush(stdout);
        clk.delay(0.1);
    }

but I get the following result

sdafarra@iiticublap260:~$ yarp time
yarp 1699443274.573688  --  since epoch 1699450474.573688 -- epoch difference 7200.000000
yarp 1699443274.673824  --  since epoch 1699450474.673824 -- epoch difference 7200.000000
yarp 1699443274.773941  --  since epoch 1699450474.773941 -- epoch difference 7200.000000
yarp 1699443274.874056  --  since epoch 1699450474.874056 -- epoch difference 7200.000000
yarp 1699443274.974158  --  since epoch 1699450474.974159 -- epoch difference 7200.000000
yarp 1699443275.074309  --  since epoch 1699450475.074309 -- epoch difference 7200.000000

I have verified that the first number is coincident with https://www.epochconverter.com/clock. There is 2 hours difference I need to iron out. I suspect that there is a weird timezone difference applied when converting to a std::chrono::time_point.

@S-Dafarra
Copy link
Contributor Author

To fix the conversion issue, I checked how the elapsedTimeChrono was changing when applying gmtime to it, and then use that difference to fix the conversion of the "manual" epoch. The code is as follows:

    while (true) {
        double t = yarp::os::Time::now();

        auto currentTime = std::chrono::high_resolution_clock::now();

        std::tm epochTimeInfo = {};
        epochTimeInfo.tm_year = 1970 - 1900;
        epochTimeInfo.tm_mon = 0;
        epochTimeInfo.tm_mday = 1;
        epochTimeInfo.tm_hour = 0;
        epochTimeInfo.tm_min = 0;
        epochTimeInfo.tm_sec = 0;
        time_t epochTimeWithZone = std::mktime(&epochTimeInfo);

        auto epochTimeChrono = std::chrono::time_point<std::chrono::high_resolution_clock>{};
        time_t epochTimeChrono_time_t = std::chrono::high_resolution_clock::to_time_t(epochTimeChrono);
        time_t epochTimeChrono_time_t_UTC = std::mktime(std::gmtime(&epochTimeChrono_time_t));
        auto utc_diff = epochTimeChrono_time_t - epochTimeChrono_time_t_UTC;

        auto epochTimeManual = std::chrono::high_resolution_clock::from_time_t(epochTimeWithZone + utc_diff);

        double seconds_since_epoch = std::chrono::duration<double>(currentTime - epochTimeManual).count();
        double epochVariation = std::chrono::duration<double>(epochTimeChrono - epochTimeManual).count();

        if (ros) {
            int sec = (int) t;
            int nsec = (int)((t-sec)*1e9);
            yCInfo(COMPANION, "%d %d", sec, nsec);
        } else {
            yCInfo(COMPANION, "yarp %f  --  since epoch %f -- epoch difference %f", t, seconds_since_epoch, epochVariation);
        }
        fflush(stdout);
        clk.delay(0.1);
    }

and the result is

sdafarra@iiticublap260:~$ yarp time
yarp 1699445471.501894  --  since epoch 1699445471.501894 -- epoch difference 0.000000
yarp 1699445471.602027  --  since epoch 1699445471.602027 -- epoch difference 0.000000
yarp 1699445471.702150  --  since epoch 1699445471.702150 -- epoch difference 0.000000
yarp 1699445471.802274  --  since epoch 1699445471.802274 -- epoch difference 0.000000
yarp 1699445471.902438  --  since epoch 1699445471.902438 -- epoch difference 0.000000
yarp 1699445472.002641  --  since epoch 1699445472.002642 -- epoch difference 0.000000
yarp 1699445472.102862  --  since epoch 1699445472.102863 -- epoch difference 0.000000
yarp 1699445472.203131  --  since epoch 1699445472.203132 -- epoch difference 0.000000
yarp 1699445472.303377  --  since epoch 1699445472.303378 -- epoch difference 0.000000

@S-Dafarra
Copy link
Contributor Author

S-Dafarra commented Nov 8, 2023

I tried the same code on Windows. Initially it was not compiling because from_time_t and to_time_t are not defined for std::chrono::high_resolution_clock (indeed they are not mentioned in https://en.cppreference.com/w/cpp/chrono/high_resolution_clock)
By changing to std::chrono::system_clock

i.e.
while (true) {
        double t = yarp::os::Time::now();

        auto currentTime = std::chrono::system_clock::now();

        std::tm epochTimeInfo = {};
        epochTimeInfo.tm_year = 1970 - 1900;
        epochTimeInfo.tm_mon = 0;
        epochTimeInfo.tm_mday = 1;
        epochTimeInfo.tm_hour = 0;
        epochTimeInfo.tm_min = 0;
        epochTimeInfo.tm_sec = 0;
        time_t epochTimeWithZone = std::mktime(&epochTimeInfo);

        auto epochTimeChrono = std::chrono::time_point<std::chrono::system_clock> {};
        time_t epochTimeChrono_time_t = std::chrono::system_clock::to_time_t(epochTimeChrono);
        time_t epochTimeChrono_time_t_UTC = std::mktime(std::gmtime(&epochTimeChrono_time_t));
        auto utc_diff = epochTimeChrono_time_t - epochTimeChrono_time_t_UTC;

        auto epochTimeManual = std::chrono::system_clock::from_time_t(epochTimeWithZone + utc_diff);

        double seconds_since_epoch = std::chrono::duration<double>(currentTime - epochTimeManual).count();
        double epochVariation = std::chrono::duration<double>(epochTimeChrono - epochTimeManual).count();

        if (ros) {
            int sec = (int)t;
            int nsec = (int)((t - sec) * 1e9);
            yCInfo(COMPANION, "%d %d", sec, nsec);
        } else {
            yCInfo(COMPANION, "yarp %f  --  since epoch %f -- epoch difference %f", t, seconds_since_epoch, epochVariation);
        }
        fflush(stdout);
        clk.delay(0.1);
    }

I got the following output:

(robsub) C:\>yarp time
yarp 1291206.673404  --  since epoch 1699449263.785837 -- epoch difference 0.000000
yarp 1291206.774669  --  since epoch 1699449263.887101 -- epoch difference 0.000000
yarp 1291206.875579  --  since epoch 1699449263.988012 -- epoch difference 0.000000
yarp 1291206.976467  --  since epoch 1699449264.088900 -- epoch difference 0.000000
yarp 1291207.077158  --  since epoch 1699449264.189591 -- epoch difference 0.000000
yarp 1291207.177768  --  since epoch 1699449264.290201 -- epoch difference 0.000000
yarp 1291207.278573  --  since epoch 1699449264.391006 -- epoch difference 0.000000
yarp 1291207.379331  --  since epoch 1699449264.491764 -- epoch difference 0.000000
yarp 1291207.480272  --  since epoch 1699449264.592705 -- epoch difference 0.000000
yarp 1291207.581725  --  since epoch 1699449264.694158 -- epoch difference 0.000000

The interesting part is that now also on Windows the epoch difference is zero.

So I tried to change

return std::chrono::time_point_cast<std::chrono::duration<double>>(std::chrono::high_resolution_clock::now()).time_since_epoch().count();

to

    return std::chrono::time_point_cast<std::chrono::duration<double>>(std::chrono::system_clock::now()).time_since_epoch().count();

as @traversaro hinted, and I obtained

(robsub) C:\>yarp time
yarp 1699449322.318464  --  since epoch 1699449322.318465 -- epoch difference 0.000000
yarp 1699449322.419178  --  since epoch 1699449322.419178 -- epoch difference 0.000000
yarp 1699449322.519740  --  since epoch 1699449322.519741 -- epoch difference 0.000000
yarp 1699449322.620194  --  since epoch 1699449322.620195 -- epoch difference 0.000000
yarp 1699449322.720970  --  since epoch 1699449322.720970 -- epoch difference 0.000000
yarp 1699449322.821424  --  since epoch 1699449322.821425 -- epoch difference 0.000000
yarp 1699449322.921975  --  since epoch 1699449322.921976 -- epoch difference 0.000000
yarp 1699449323.022848  --  since epoch 1699449323.022848 -- epoch difference 0.000000

So I guess that the problem was with std::chrono::high_resolution_clock all along. Reading https://en.cppreference.com/w/cpp/chrono/high_resolution_clock I noticed the following

It is often just an alias for std::chrono::steady_clock or std::chrono::system_clock, but which one it is depends on the library or configuration. When it is a system_clock, it is not monotonic (e.g., the time can go backwards). For example, as of 2023, libstdc++ has it aliased to system_clock "until higher-than-nanosecond definitions become feasible"[1], MSVC has it as steady_clock[2], and libc++ uses steady_clock when the C++ standard library implementation supports a monotonic clock and system_clock otherwise[3].

(This also explains why from_time_t and to_time_t are defined for std::chrono::high_resolution_clock on Ubuntu)

In fact, I tried to change to std::chrono::steady_clock and I had again

(robsub) C:\>yarp time
yarp 1291864.924830  --  since epoch 1699449922.036908 -- epoch difference 0.000000
yarp 1291865.026306  --  since epoch 1699449922.138378 -- epoch difference 0.000000

and reading its documentation

This clock is not related to wall clock time (for example, it can be time since last reboot), and is most suitable for measuring intervals.

which explains why it is so small. Indeed, after a reboot of the machine (not after power on/power off) I obtained

(robsub) C:\>yarp time
yarp 41.976821  --  since epoch 1699450399.333492 -- epoch difference 0.000000
yarp 42.077304  --  since epoch 1699450399.433975 -- epoch difference 0.000000
yarp 42.178113  --  since epoch 1699450399.534784 -- epoch difference 0.000000
yarp 42.278720  --  since epoch 1699450399.635391 -- epoch difference 0.000000

This behavior is the same with std::chrono::high_resolution_clock.

@S-Dafarra S-Dafarra changed the title Cannot stream transforms via the frameTransformClient from Windows to Ubuntu because ``yarp::os::Time::now() is platform dependent Cannot stream transforms via the frameTransformClient from Windows to Ubuntu because yarp::os::Time::now() is platform dependent Nov 8, 2023
randaz81 added a commit to randaz81/yarp that referenced this issue Nov 8, 2023
…k instead of std::chrono::high_resolution_clock.

This makes yarp timestamps comparable between windows and linux machines. See issue robotology#3038
@randaz81 randaz81 linked a pull request Nov 8, 2023 that will close this issue
@S-Dafarra
Copy link
Contributor Author

I applied #3041 locally and the problem with the frameTransformClient seems fixed.

randaz81 added a commit to randaz81/yarp that referenced this issue Nov 10, 2023
…k instead of std::chrono::high_resolution_clock.

This makes yarp timestamps comparable between windows and linux machines. See issue robotology#3038
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue Type: Bug Involves some intervention from a system administrator
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants