Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce lock contention for frame writes (with extra tests) #354

Conversation

michaelklishin
Copy link
Member

This is #350 by @YulerB and @vendre21 with some additional tests from our team. See #350 for the original discussion.

The idea is the following:

  • Frames and frame sets (for frames that carry content and need, ahem, content framing) are written in a single operation to a memory-backed stream
  • BufferedStream is no longer used (it was not efficiently used for writes anyway)
  • The above seeks to guarantee that socket stream writes cannot be interleaved in a way that will confuse RabbitMQ protocol parser
  • As such, locking is no longer necessary

This PR was created to simplify the final rounds of QA:

  • Extra tests if our team comes up with any
  • A fresh discussion without a dozen of comments about line endings :)

@bording
Copy link
Collaborator

bording commented Sep 19, 2017

@michaelklishin I've been meaning to try out these changes, but haven't gotten to them yet.

The only way I can think to make trying it out easier would be for your to publish some sort of prerelease package, perhaps to a custom MyGet feed instead of nuget.org.

Did you see @danielmarbach's request on the other PR about adding tests with several different payload sizes? Seems like a good idea to validate this continues to be better as message size increases.

@michaelklishin
Copy link
Member Author

@bording OK, we had an AppVeyor feed for development releases but I'm afraid that thing hasn't been updated in a while.

I am going to add more tests with different payload sizes, very good idea @danielmarbach. Any particular values you have in mind?

@michaelklishin
Copy link
Member Author

I did some benchmarking with 20 runs of the concurrent connection access test suite on two machines (from 2017 and 2013).

In one case, median went down from 1.35s (master) to 1.04s (this PR). In the other, from 1.41s to 1.06s. While that's not a lot of data points, on a CPU (32 threads, together with RabbitMQ completely max out 8 hyperthreads on both machines) + loopback I/O heavy workload, removing the lock makes a nice difference.

@bording
Copy link
Collaborator

bording commented Sep 20, 2017

@michaelklishin That sounds really good! I spent some time today getting some of my testing setup ready to try out the changes here, so I should have something to report back tomorrow.

Copy link
Contributor

@kjnilsson kjnilsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this generally looks good. Just a few formatting, tidy up changes. Naturally it is almost impossible to say this is definitely safe but it looks ok to me.

@@ -1099,7 +1099,7 @@ public void HeartbeatWriteTimerCallback(object state)
if (!m_closed)
{
WriteFrame(m_heartbeatFrame);
m_frameHandler.Flush();
//m_frameHandler.Flush();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove commented code

@@ -465,11 +464,11 @@ public void ModelSend(MethodBase method, ContentHeaderBase header, byte[] body)
{
if (method.HasContent)
{
lock (m_flowSendLock)
{
//lock (m_flowSendLock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove commented code

@@ -908,11 +912,11 @@ public void HandleChannelFlow(bool active)
}
else
{
lock (m_flowSendLock)
{
//lock (m_flowSendLock)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove commented code

@@ -767,7 +767,7 @@ public void MainLoopIteration()
}
}

public void NotifyHeartbeatListener()
public void NotifyHeartbeatListener()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove double space

@@ -88,7 +87,7 @@ public abstract class ModelBase : IFullModel, IRecoverable

private EventHandler<EventArgs> m_recovery;

public IConsumerDispatcher ConsumerDispatcher { get; private set; }
public IConsumerDispatcher ConsumerDispatcher { get; private set; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove extra space

{
throw new UnexpectedFrameException(f);
}
byte[] fragment = f.Payload;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there real value in removing this variable? I would have thought (hoped) the compiler would optimise this away anyway.

@YulerB
Copy link
Contributor

YulerB commented Sep 21, 2017

Since the frame sizes are calculable, might be wothwhile setting the capacity on the memory streams to avoid resizing and memory fragmentation.

For example,

  1. HEADER var ms = new MemoryStream(8);
  2. FRAME-> var ms = new MemoryStream(frame.Size);
  3. FRAMESET-> var ms = new MemoryStream(frames.Sum(=>.Size));

@danielmarbach
Copy link
Collaborator

@YulerB if you worry about allocs then better no use Linq ;)

@YulerB
Copy link
Contributor

YulerB commented Sep 21, 2017

@danielmarbach Linq gets a bad rap, but I know what you mean.

Many of the Framesets with 4k, which would require 2 re-allocs of the byte buffer in the MemoryStream.

Many of the message like Ack are much smaller than the default 1k, so we have over-allocation of memory .

This might avoid some of the memory thrashing.

To avoid linq, here is an extension method.

    public int Sum<T>(this IEnumerable<T> list, Func<T, int> expression)
    {
        int sum = 0;
        foreach (var item in list)
        {
            sum += expression(item);
        }
        return sum;
    }

@kjnilsson
Copy link
Contributor

@bording @michaelklishin I've set our ci to push each PR build to myget - would this work for testing this PR? https://www.myget.org/feed/Packages/rabbitmq

@bording
Copy link
Collaborator

bording commented Sep 25, 2017

@kjnilsson Yes, having a MyGet feed that the results of each PR will be helpful for the future.

FYI, assuming your MyGet account name is rabbitmq, then the feed URL is https://www.myget.org/F/rabbitmq/api/v3/index.json. It doesn't look like any packages have been pushed to it yet, though!

@bording
Copy link
Collaborator

bording commented Sep 25, 2017

I have done some testing of these changes with a locally built copy of the client. I am seeing a small benefit on some tests that specifically ensure that we're publishing messages with multiple channels from the same connection.

I'm also not seeing any performance regressions, so overall it seems like this PR is 👍 from me, with the caveat that I haven't spent a lot of time actually looking at the content of the changes.

@michaelklishin
Copy link
Member Author

@bording my personal concerns were about concurrency hazard safety but I assume if there were any failures in your test suites you'd have mentioned it. I'll do some last round naming and test structure changes before merging.

As integration test case subclasses are meant to do. Squashes
a warning.
@bording
Copy link
Collaborator

bording commented Sep 26, 2017

@michaelklishin I'll try and run a few more rounds of tests to be sure, but I've not seen any problems so far.

There are no "read frames" and "write frames" in the
protocol. "Incoming" and "outgoing" frames are also
not well established terms but at least they hint
at what the difference is and when they are used.

There are existing methods with identical names (ReadFrame, WriteFrame) as well,
which isn't helping.
@michaelklishin michaelklishin dismissed kjnilsson’s stale review September 26, 2017 04:25

Addressed Karl's feedback.

@michaelklishin michaelklishin merged commit 7e69311 into master Sep 26, 2017
@michaelklishin
Copy link
Member Author

In the end, so far we didn't manage to run into any concurrency hazards with this implementation and indeed there's a nice efficiency gain with concurrent writes on a shared connection. Thank you, @YulerB and @vendre21! Your contribution will be availalbe in the 5.1.0 version of the client :)

@michaelklishin
Copy link
Member Author

We published a 5.1.0-pre1 to nuget.org for others to try this change out.

@YulerB
Copy link
Contributor

YulerB commented Sep 26, 2017

Thanks for you help on this @michaelklishin, @bording, @danielmarbach, @kjnilsson, @lukebakken, @vendre21.

@lukebakken lukebakken deleted the rabbitmq-dotnet-client-350-connection-write-lock-contention-reduction branch January 4, 2020 12:23
michaelklishin added a commit that referenced this pull request Jan 7, 2020
when transmitting multiple frames at once.

While changes in #350, #354 are supposed to be safe, there's some
evidence (#681) that sometimes they are not which leads to incorrect
byte stream interleaving on the wire.
michaelklishin added a commit that referenced this pull request Jan 7, 2020
when transmitting multiple frames at once.

While changes in #350, #354 are supposed to be safe, there's some
evidence (#681) that sometimes they are not which leads to incorrect
byte stream interleaving on the wire.
lukebakken pushed a commit that referenced this pull request Jan 30, 2020
when transmitting multiple frames at once.

While changes in #350, #354 are supposed to be safe, there's some
evidence (#681) that sometimes they are not which leads to incorrect
byte stream interleaving on the wire.

Update API approval test expectations

Re-introduce lock for all socket writes

The official `NetworkStream` docs are confusing:

https://docs.microsoft.com/en-us/dotnet/api/system.net.sockets.networkstream?view=netframework-4.5.1

> Read and write operations can be performed simultaneously on an
instance of the NetworkStream class without the need for
synchronization. As long as there is one unique thread for the write
operations and one unique thread for the read operations, there will be
no cross-interference between read and write threads and no
synchronization is required.

This is a poorly-written way to say "multiple threads must be synchronized".

Fixes #681

Update API approval test expectations
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants