Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use AtomicU64 instead of u64 for the position field in Receiver #70

Closed
wants to merge 4 commits into from

Conversation

hozan23
Copy link
Contributor

@hozan23 hozan23 commented Jul 4, 2024

The recv(), recv_direct(), recv_blocking(), and try_recv() methods currently require &mut self to modify the value of the position field. By using AtomicU64 for the position field eliminates the need for mutability.

Regarding performance, it's worth noting that on most processor architectures, there is no much difference between an atomic store and a regular non-atomic store (source: 1, 2).

Additionally, Atomic operations are generally not as expensive as mutexes, as mutexes themselves are implemented using atomic operations but with additional overhead for locking.

This PR has breaking API changes.

Fixes issue #66

@hozan23
Copy link
Contributor Author

hozan23 commented Jul 4, 2024

Hello @zeenix, Please let me know what do you think about this PR.

@zeenix
Copy link
Member

zeenix commented Jul 4, 2024

Regarding performance, it's worth noting that on most processor architectures, there is no much difference between an atomic store and a regular non-atomic store (source).

Thanks for the citing a source on this. I'll read it later..

Additionally, Atomic operations are generally not as expensive as mutexes, as mutexes themselves are implemented using atomic operations but with additional overhead for locking.

It would be nice to have some numbers on this but I guess we know for sure that at least atomics are not more expensive. So if atomics are not more expensive than their non-atomic primitive types for most machines out there, there is no downside to using atomics and that's good enough for me.

Copy link
Member

@zeenix zeenix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Could you please also bump the semver in this PR (in a separate commit on top please)? That would help avoid breaking API by accident.

hozan23 added 2 commits July 17, 2024 00:11
The `recv()`, `recv_direct()`, `recv_blocking()`, and `try_recv()`
methods currently require `&mut self` to modify the value of the
`position` field. By using AtomicU64 for the `position` field eliminates
the need for mutability.

Fixes issue smol-rs#66
@hozan23
Copy link
Contributor Author

hozan23 commented Jul 18, 2024

After conducting benchmarks for the new changes, it turns out using atomic operations results in a performance regression of around 10-11%. This regression is likely to increase with more receivers.

Here are the benchmark results on an Intel x86_64 architecture CPU by running cargo bench on the main repo:

Without atomic operations:

1 -> 1                  time:   [118.89 ns 119.26 ns 119.71 ns]
Found 21 outliers among 100 measurements (21.00%)
  3 (3.00%) high mild
  18 (18.00%) high severe

1 -> 2                  time:   [148.34 ns 148.65 ns 149.01 ns]
Found 18 outliers among 100 measurements (18.00%)
  1 (1.00%) high mild
  17 (17.00%) high severe

1 -> 4                  time:   [213.03 ns 213.59 ns 214.25 ns]
Found 19 outliers among 100 measurements (19.00%)
  7 (7.00%) high mild
  12 (12.00%) high severe

1 -> 8                  time:   [337.38 ns 337.90 ns 338.52 ns]
Found 19 outliers among 100 measurements (19.00%)
  2 (2.00%) high mild
  17 (17.00%) high severe

1 -> 16                 time:   [585.31 ns 586.83 ns 588.65 ns]
Found 19 outliers among 100 measurements (19.00%)
  3 (3.00%) high mild
  16 (16.00%) high severe

With atomic operations:

1 -> 1                  time:   [121.59 ns 121.96 ns 122.42 ns]
                        change: [+1.3652% +1.9991% +2.5771%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 20 outliers among 100 measurements (20.00%)
  20 (20.00%) high severe

1 -> 2                  time:   [158.20 ns 158.62 ns 159.08 ns]
                        change: [+6.2135% +6.5239% +6.8636%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 18 outliers among 100 measurements (18.00%)
  7 (7.00%) high mild
  11 (11.00%) high severe

1 -> 4                  time:   [232.68 ns 233.26 ns 233.93 ns]
                        change: [+9.0008% +9.3453% +9.7053%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

1 -> 8                  time:   [370.89 ns 371.66 ns 372.54 ns]
                        change: [+10.014% +10.440% +10.926%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

1 -> 16                 time:   [653.93 ns 655.80 ns 657.86 ns]
                        change: [+11.406% +11.762% +12.118%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 23 outliers among 100 measurements (23.00%)
  1 (1.00%) high mild
  22 (22.00%) high severe

I am closing this PR due to this regression.

@zeenix
Copy link
Member

zeenix commented Jul 18, 2024

@hozan23 Thanks so much for your hard work still. I certainly learnt things in the course of this PR and now we've a very good answer to "why not allow Receiver to be non-mut?".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants