Significant overhead/latency (about 50ms) #54

Closed
KronosTheLate opened this issue Jun 21, 2023 · 5 comments · Fixed by #63

Comments

@KronosTheLate
Contributor

I mentioned in a comment on this issue that I had some latency issues when using RemoteREPL with my Raspberry Pi. I have now repeated the test against localhost, so with no SSH and with everything running on the same modern computer. There is STILL almost 50 ms of latency from just evaluating 1 and returning the result:

julia> @btime @remote 1
  43.513 ms (66 allocations: 3.61 KiB)
1

After running using ProfileView and then @profview @remote 1, I get the following flamegraph:
[Image: flamegraph of @remote 1]

From the top, the call sites that make up the flamegraph are:

./task.jl:795, MethodInstance for poptask(::Base.InvasiveLinkedListSynchronized{Task})
./task.jl:804, MethodInstance for wait()
./condition.jl:106, MethodInstance for wait(::Base.GenericCondition{Base.Threads.SpinLock})
./stream.jl:413, MethodInstance for wait_readnb(::Sockets.TCPSocket, ::Int64)
./stream.jl:106, eof [inlined]
./stream.jl:925, MethodInstance for read(::Sockets.TCPSocket, ::Type{UInt8})
/buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:782, deserialize [inlined]
/buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:769, MethodInstance for deserialize(::Sockets.TCPSocket)
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:207, MethodInstance for var"#send_and_receive#40"(::Bool, ::typeof(RemoteREPL.send_and_receive), ::RemoteREPL.Connection, ::Tuple{Symbol, Int64})
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:199, send_and_receive [inlined]
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:382, MethodInstance for (::RemoteREPL.var"#47#48"{RemoteREPL.Connection, Int64})()
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:178, MethodInstance for var"#ensure_connected!#39"(::Int64, ::typeof(RemoteREPL.ensure_connected!), ::RemoteREPL.var"#47#48"{RemoteREPL.Connection, Int64}, ::RemoteREPL.Connection)
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:174, ensure_connected! [inlined]
/home/dennishb/.julia/packages/RemoteREPL/BFqrB/src/client.jl:380, MethodInstance for remote_eval_and_fetch(::RemoteREPL.Connection, ::Int64)
./boot.jl:360, eval [inlined]

I am not sure whether this can be improved, or whether this wait time is unavoidable when dealing with networks. But it seems worth investigating whether the ~50 ms latency on every remote call can be avoided.

@xgdgsc
Contributor

xgdgsc commented Jun 22, 2023

JuliaLang/julia#31842 ?

@c42f
Collaborator

c42f commented Jun 23, 2023

Naively 50 ms seems pretty crazy high on the loopback interface?

I expect this is more a Julia issue than a problem in this package but if we can invent a workaround that's great. Thanks @xgdgsc for the link :-)

@KronosTheLate
Contributor Author

The linked issue has a comment where Jeff says that the culprit is the "Nagle algorithm". It can be disabled:

help?> Sockets.nagle
  nagle(socket::Union{TCPServer, TCPSocket}, enable::Bool)


  Enables or disables Nagle's algorithm on a given TCP server or socket.

  │ Julia 1.3
  │
  │  This function requires Julia 1.3 or later.

Should we use Sockets.nagle to disable this algorithm by default? I imagine we generally do not want a 50 ms delay just to send fewer packets on a communication channel that is not shared by multiple users.
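
For illustration, here is a minimal sketch of what disabling Nagle's algorithm on a loopback connection could look like. It only uses the Sockets standard library. The helper name loopback_pair_without_nagle and the port hint 9000 are made up for this example and are not part of RemoteREPL.

```julia
using Sockets

# Hypothetical helper (not RemoteREPL's API): open a loopback TCP connection
# and disable Nagle's algorithm on both ends of it.
function loopback_pair_without_nagle()
    # listenany picks a free port at or above the hint (9000 is arbitrary)
    port, server = listenany(ip"127.0.0.1", 9000)
    accepted = @async accept(server)        # accept in a background task
    client = connect(ip"127.0.0.1", port)
    serverside = fetch(accepted)
    Sockets.nagle(client, false)            # false disables Nagle's algorithm
    Sockets.nagle(serverside, false)
    return server, serverside, client
end

server, serverside, client = loopback_pair_without_nagle()
write(client, UInt8(1))                     # one small write from the client
reply = read(serverside, UInt8)             # read that byte on the server side
close(client); close(serverside); close(server)
```

Whether the call belongs on the client socket, the server socket, or both is a design decision for the package. The sketch sets it on both ends simply to be explicit.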

@jpsamaroo

Correct, you should not be using Nagle's algorithm for interactive sockets - it's intended for high-bandwidth, high-latency TCP connections (such as data downloads).

@KronosTheLate
Contributor Author

PR created. The effect was a 74x reduction in overhead, from adding a single line!
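
For anyone who wants to check the effect outside RemoteREPL, here is a rough sketch that times serialize/deserialize round trips over loopback with and without Nagle's algorithm. The echo server, the function names, and the port hint 9100 are assumptions made for this example. This is not RemoteREPL's protocol, and the numbers will depend on the OS and Julia version.

```julia
using Sockets, Serialization

# Hypothetical echo server: deserializes one value per request and
# serializes it straight back to the client.
function start_echo_server(; disable_nagle::Bool)
    port, server = listenany(ip"127.0.0.1", 9100)   # 9100 is an arbitrary hint
    @async begin
        sock = accept(server)
        disable_nagle && Sockets.nagle(sock, false)
        try
            while isopen(sock)
                serialize(sock, deserialize(sock))
            end
        catch err
            # the client closing its socket ends the loop with an EOF or IO error
            err isa EOFError || err isa Base.IOError || rethrow()
        finally
            close(sock)
        end
    end
    return port, server
end

# Time n request/response round trips and return the mean in seconds.
function mean_roundtrip(; disable_nagle::Bool, n::Int = 20)
    port, server = start_echo_server(disable_nagle = disable_nagle)
    client = connect(ip"127.0.0.1", port)
    disable_nagle && Sockets.nagle(client, false)
    serialize(client, (:warmup, 0))
    deserialize(client)                     # warm up compilation before timing
    elapsed = @elapsed for i in 1:n
        serialize(client, (:eval, i))
        deserialize(client)
    end
    close(client)
    close(server)
    return elapsed / n
end

with_nagle = mean_roundtrip(disable_nagle = false)
without_nagle = mean_roundtrip(disable_nagle = true)
println("mean round trip with Nagle:    ", round(with_nagle * 1e3, digits = 3), " ms")
println("mean round trip without Nagle: ", round(without_nagle * 1e3, digits = 3), " ms")
```

The sketch makes no claim about the size of the difference. It only gives a way to compare the two settings on a given machine.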

@c42f c42f closed this as completed in #63 Jul 16, 2024