Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf improvements #166

Merged
merged 4 commits into from
Jan 20, 2021
Merged

Perf improvements #166

merged 4 commits into from
Jan 20, 2021

Conversation

iffyio
Copy link
Collaborator

@iffyio iffyio commented Dec 22, 2020

Ran some tests sending as many packets to the proxy and made some changes in the process.

  • Limit the max number of packets we're processing at a time.

  • Make session to not expose any mutable APIs. This allows us to not have to avoid locking the
    session when working with it. A thread only acquires a lock in order to put a packet on the
    socket and it immediately releases it after that.

  • Use an atomic integer to track session expiration rather than an explicit lock.

  • Enable parking_lot mutex internally within Tokio. Tokio's lock is really slow and
    is currently one bottleneck in the code. It has the option to use parking_lot internally
    which helped speed things up a bit.
    Ideally, we won't have any tokio locks on the hot path when processing
    packets - currently the session map uses this because that lock is held across await
    boundaries - will look into whether we can avoid this later on.

This change also fixes a couple race conditions

  • In ensure_session, between grabbing a read and write lock, another thread could create
    a session, leading to multiple sessions being created.

  • A similar bug in prune_sessions - if a session receives a packet between listing
    remove_keys and actually removing sessions, we will incorrectly remove that session

Ran some tests sending as many packets to the proxy and made some changes in the process.

- Limit the max number of packets we're processing at a time.

- Make session to not expose any mutable APIs. This allows us to not have to avoid locking the
  session when working with it. A thread only acquires a lock in order to put a packet on the
  socket and it immediately releases it after that.

- Use an atomic integer to track session expiration rather than an explicit lock.

- Enable parking_lot mutex internally within Tokio. Tokio's lock is really slow and
  is currently one bottleneck in the code. It has the option to use parking_lot internally
  which helped speed things up a bit.
  Ideally, we won't have any tokio locks on the hot path when processing
  packets - currently the session map uses this because that lock is held across await
  boundaries - will look into whether we can avoid this later on.

This change also fixes a couple race conditions

- In ensure_session, between grabbing a read and write lock, another thread could create
  a session, leading to multiple sessions being created.

- A similar bug in prune_sessions - if a session receives a packet between listing
  `remove_keys` and actually removing sessions, we will incorrectly remove that session
Copy link
Contributor

@markmandel markmandel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of fun stuff! Nice!

I mentioned in the comment - can you share how you performed the performance tests (also how you tracked the metrics)? Maybe this is the start of some performance suites?

If you can't, can you paste at least some of the results in here? Would be good to have even just a light history of performance metrics on the project that we can look back on over time, even if it's just for a baseline.

Cargo.toml Show resolved Hide resolved
src/proxy/server/mod.rs Show resolved Hide resolved
src/proxy/server/mod.rs Show resolved Hide resolved
src/proxy/server/mod.rs Show resolved Hide resolved
Ok(_) => {
session.increment_expiration().await;
if let Some(response) = result {
for endpoint in response.endpoints.iter() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pull this out into another function? Even if we pulled out the block at 274 (inside for endpoint in response.endpoints.iter() {) into it's own function - session_send_packet (as a thought on a name?) - I think it would be easier to follow, as the complicated locking logic would be wrapped into it's own area.

Feels to me like this code block is a bit too large inside this tokio::spawn to otherwise track what is happening.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted to session_send_packet functions!

src/proxy/server/mod.rs Show resolved Hide resolved
src/proxy/sessions/session.rs Show resolved Hide resolved
pub async fn increment_expiration(&mut self) {
let expiration = self.expiration.clone();
Session::inc_expiration(expiration).await
/// increment_expiration set the increments the expiration value by the session timeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// increment_expiration set the increments the expiration value by the session timeout
/// update_expiration increments the expiration value by the session timeout

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

Session::inc_expiration(expiration).await
/// increment_expiration set the increments the expiration value by the session timeout
pub fn update_expiration(&self, ttl: Duration) -> Result<()> {
Self::do_update_expiration(&self.expiration, ttl)
}

/// increment_expiration increments the expiration value by the session timeout (internal)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// increment_expiration increments the expiration value by the session timeout (internal)
/// do_update_expiration increments the expiration value by the session timeout (internal)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

}

/// Sends a packet to the Session's dest.
pub async fn send_to(&mut self, buf: &[u8]) -> Result<Option<usize>> {
pub async fn send_to(&self, buf: &[u8]) -> Result<Option<usize>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're in here, should this be "send" rather than "send_to" ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Contributor

@markmandel markmandel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍨

@markmandel markmandel merged commit 7405bd1 into master Jan 20, 2021
@markmandel markmandel deleted the iu/perf-update branch January 20, 2021 02:00
@markmandel markmandel added the area/performance Anything to do with Quilkin being slow, or making it go faster. label Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/performance Anything to do with Quilkin being slow, or making it go faster. cla: yes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants