Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial pass on adding proxy peer server and configuration #10223

Merged
merged 35 commits into from
Apr 29, 2022

Conversation

dboslee
Copy link
Contributor

@dboslee dboslee commented Feb 8, 2022

This implements the proxy service for peer proxies to connect and dial nodes through.

Closes https://github.com/gravitational/cloud/issues/1305

@russjones russjones added the cloud Cloud label Feb 9, 2022
@russjones
Copy link
Contributor

@dboslee Ping us when this is ready to review.

@dboslee dboslee marked this pull request as ready for review February 10, 2022 17:54
@dboslee dboslee requested review from knisbet, NajiObeid and russjones and removed request for jimbishopp February 10, 2022 17:54
lib/proxy/middleware.go Outdated Show resolved Hide resolved
lib/proxy/server.go Outdated Show resolved Hide resolved
expectErr: true,
},
{
desc: "certificates with correct role from different CAs",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this particular test case does not really represent what happens in the code.
The expected error is the result of the way the test is setup rather than actually be a legitimate error. I've removed it from my client branch as I was reworking the tests a bit after I implemented the equivalent of getConfigForClient on the client side.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the issue. This is specifically testing the server side cert validation. Could we leave this and have a separate test for the client side?

One change I might add is setting the tc.client.InsecureSkipVerify = true to ensure the failure happens server side.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I am seeing this wrong, there's nothing in the code itself that prevents a connection between a client and a server with certs from two different CAs. It is just triggered by the way the test case is implemented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The underlying call to ClientHandshake/ServerHandshake should return an error in this case.

@russjones russjones requested review from espadolini and rosstimothy and removed request for russjones March 3, 2022 19:25
@russjones
Copy link
Contributor

@dboslee Is this PR ready for review?

@dboslee
Copy link
Contributor Author

dboslee commented Mar 4, 2022

@dboslee Is this PR ready for review?

Yep, but I did open up this issue for discussion that could lead to the configuration changing slightly.

name string
in string
want ProxyPeering
wantErr func(*testing.T, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
wantErr func(*testing.T, error)
wantErr require.ErrorAssertionFunc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will resolve this once we settle on a config format.

Comment on lines 927 to 935
ProxyPeering ProxyPeering = 9 [ (gogoproto.jsontag) = "proxy_peering,omitempty" ];
}

// ProxyPeeringMode represents the cluster proxy peering mode.
enum ProxyPeering {
// Disabled is set if proxy peering is disabled.
Disabled = 0;
// Enabled is set if proxy peering is enabled.
Enabled = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the faux boolean could we do something that is both backward compatible and open to changes in the future?

Suggested change
ProxyPeering ProxyPeering = 9 [ (gogoproto.jsontag) = "proxy_peering,omitempty" ];
}
// ProxyPeeringMode represents the cluster proxy peering mode.
enum ProxyPeering {
// Disabled is set if proxy peering is disabled.
Disabled = 0;
// Enabled is set if proxy peering is enabled.
Enabled = 1;
TunnelDialingStrategy TunnelDialingStrategy = 9 [ (gogoproto.jsontag) = "tunnel_dialing_strategy,omitempty" ];
}
// TunnelDialingStrategy
enum TunnelDialingStrategy {
// Legacy/WarDialer/Whatever we are calling it
Legacy = 0;
// ProxyPeering
ProxyPeering = 1;
.....
// SomeNewMechanism in the future could go here
SomeNewMechanism = 2;

The naming in the suggestion could use some work - it was meant more to get the suggestion across than be anything concrete.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I would be curious to hear your thoughts on updating the configuration based on this discussion as well. Basically another configuration we want to add is reversetunnel_count which would be dependent on ProxyPeering being enabled.

It feels a bit wrong but maybe not a big deal for reversetunnel_count to be ignored when TunnelDialingStrategy is not equal to ProxyPeering.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't fully looked into that discussion yet - but as far as allowing each type to have its own config parameters wouldn't that be similar to say how teleport storage configuration can have different parameters based on the type?

  storage:
    type: etcd
    peers: [https://etcd-0.etcd:2379, https://etcd-1.etcd:2379, https://etcd-2.etcd:2379]
    tls_cert_file: /etc/etcd/certs/client-cert.pem
    tls_key_file: /etc/etcd/certs/client-key.pem
    tls_ca_file: /etc/etcd/certs/ca-cert.pem
    prefix: teleport
  storage:
    type: dynamodb
    table_name: tablename
    region: us-east-2

lib/proxy/auth.go Outdated Show resolved Hide resolved
lib/proxy/conn.go Outdated Show resolved Hide resolved
lib/proxy/conn.go Outdated Show resolved Hide resolved
lib/proxy/service_test.go Outdated Show resolved Hide resolved
lib/proxy/conn.go Outdated Show resolved Hide resolved
lib/proxy/conn.go Outdated Show resolved Hide resolved
lib/proxy/service.go Outdated Show resolved Hide resolved

sent, received := pipeConn(stream.Context(), streamConn, nodeConn)
log.Debugf("Closing dial request from peer. sent: %d reveived %d", sent, received)
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this return any errors from streamConn/pipeConn instead of always returning nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed comments on error handling here 39555e5

Copy link
Contributor

@rosstimothy rosstimothy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan on adding any integration tests or extending the existing reverse tunnel tests to cover both agent mesh and proxy peering?

Comment on lines 173 to 179
func() {
defer conn.Close()

client := proto.NewProxyServiceClient(conn)
_, err = client.DialNode(context.Background())
tc.assertErr(t, err)
}()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really need to be wrapped in a function?

Suggested change
func() {
defer conn.Close()
client := proto.NewProxyServiceClient(conn)
_, err = client.DialNode(context.Background())
tc.assertErr(t, err)
}()
defer conn.Close()
client := proto.NewProxyServiceClient(conn)
_, err = client.DialNode(context.Background())
tc.assertErr(t, err)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rosstimothy these server tests change a little bit on the client branch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It made sense when this wasn't wrapped in the t.Run(tc.desc, func(t *testing.T). But agreed not necessary now.

@dboslee
Copy link
Contributor Author

dboslee commented Mar 21, 2022

Do you plan on adding any integration tests or extending the existing reverse tunnel tests to cover both agent mesh and proxy peering?

@rosstimothy I think that will make sense to add in the pr that has the changes for the agent/reverse tunneling.

@pierrebeaucamp
Copy link
Contributor

@rosstimothy / @fspmarshall / @espadolini could we get another review please?

api/types/types.proto Outdated Show resolved Hide resolved
api/types/tunnel_strategy.go Outdated Show resolved Hide resolved
api/types/tunnel_strategy.go Show resolved Hide resolved
api/types/tunnel_strategy.go Outdated Show resolved Hide resolved
api/types/tunnel_strategy.go Show resolved Hide resolved
api/types/tunnel_strategy.go Outdated Show resolved Hide resolved
lib/proxy/conn.go Outdated Show resolved Hide resolved
Comment on lines +59 to +60
func (c *streamConn) Read(b []byte) (n int, err error) {
c.rLock.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (c *streamConn) Read(b []byte) (n int, err error) {
c.rLock.Lock()
func (c *streamConn) Read(b []byte) (n int, err error) {
if len(b) == 0 {
return 0, nil
}
c.rLock.Lock()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather let this through that way it will fail if the stream is closed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A zero read would still be a noop if we had a non-empty buffer ready.

lib/proxy/conn.go Outdated Show resolved Hide resolved
lib/proxy/conn.go Outdated Show resolved Hide resolved
@dboslee dboslee requested review from espadolini and removed request for knisbet April 8, 2022 18:15
lib/proxy/conn.go Outdated Show resolved Hide resolved
Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>
}

// streamConn wraps a grpc stream with a net.streamConn interface.
type streamConn struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some test coverage for this and pipeConn? They seem like pretty fundamental components yet are only indirectly tested through DialNode. Some dedicated unit tests would ensure all the edge cases are taken care of

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've addressed these comments here a4053ec

Recv() (*proto.Frame, error)
}

// streamConn wraps a grpc stream with a net.streamConn interface.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// streamConn wraps a grpc stream with a net.streamConn interface.
// streamConn wraps a grpc stream with a net.Conn interface.

}
}

// Read reads data reveived over the grpc stream.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Read reads data reveived over the grpc stream.
// Read reads data received over the grpc stream.

func (c *streamConn) Close() error {
var err error
if cstream, ok := c.stream.(grpc.ClientStream); ok {
err = cstream.CloseSend()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// CloseSend closes the send direction of the stream. It closes the stream
// when non-nil error is met. It is also not safe to call CloseSend
// concurrently with SendMsg.
https://pkg.go.dev/google.golang.org/grpc#ClientStream

Should we prevent Close and Write from being called concurrently?

"github.com/sirupsen/logrus"
)

// proxyService impelements the grpc ProxyService.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// proxyService impelements the grpc ProxyService.
// proxyService implements the grpc ProxyService.

log logrus.FieldLogger
}

// DialNode opens a bidrectional stream to the requested node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// DialNode opens a bidrectional stream to the requested node.
// DialNode opens a bidirectional stream to the requested node.

return nil
}

func (s *TunnelStrategyV1) CheckAndSetDefaults() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing godoc

Comment on lines 84 to 85
require.NoError(t, err)
require.Equal(t, len(data), n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assertions in spawned goroutines will cause problems due to semantics of t.FailNow. Here and in all other occurrences in this file the data needs to be sent over a channel and asserted on the main goroutine that is running the test

Copy link
Contributor Author

@dboslee dboslee Apr 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I believe using assert fixes this since it calls testing.Errorf and testing.Log under the hood which allows the test to continue and fails at the end. Updated here 7c047ff.

If its preferred to use require I can move all the checks to the the main goroutine but using assert feels a little easier to read/reason about in this case.

@dboslee dboslee changed the base branch from master to david/proxy-peering April 29, 2022 20:53
@dboslee dboslee merged commit 16a4c96 into david/proxy-peering Apr 29, 2022
@dboslee dboslee deleted the david/proxy-peering-server branch April 24, 2023 17:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cloud Cloud
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants