Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caddyfile support #217

Merged
merged 33 commits into from
Jul 23, 2024
Merged

Caddyfile support #217

merged 33 commits into from
Jul 23, 2024

Conversation

vnxme
Copy link
Collaborator

@vnxme vnxme commented Jul 16, 2024

Caddyfile support

No, you aren't sleeping. Yes, you are reading correctly. This PR introduces a fully fledged caddyfile support for mholt/caddy-l4 module. And you can try it right now. Please find details below.

Key features

  • Single caddyfile for both http and layer4 applications.
  • Native caddyfile syntax, no quoted json is required.
  • All the layer4 matchers and handlers are covered.
  • Both global and listener wrapper configs are supported.
  • No breaking changes, existing json configs keep working.

Approach

General

  • The layer4 application can be configured either separately (via layer4 global directive), or as a listener wrapper around the http application (via layer4 directive inside listener_wrappers section of servers global directive):
    {
        # layer4 global configuration below
        layer4 {
            <addresses...> {
                ...
            }
            <addresses...> {
                ...
            }
        }
        servers {
            listener_wrappers {
                # layer4 listener wrapper configuration below
                layer4 {
                    ...
                }
                tls
            }
        }
    }
    
  • Global layer4 block contains a number of servers, and each server is essentially a separately configured application. Each server block is prepended by a list of network addresses. At least one network address must be provided.
  • Inside a server block there are named matcher sets, routes, and matching_timeout option. The same structure applies to layer4 listener wrapper and subroute handler blocks.
  • Layer4 servers, listener wrapper and subroutes are independent, i.e. they can't share matcher sets or routes.

Matcher sets

  • Named matcher sets are introduced with @ at the beginning and generally work the same as http named matchers. However, layer4 named matcher sets can't be mixed with http named matchers.
  • Named matcher sets are parsed first, i.e. a route directive can use named matcher sets that are defined after (below it within the same block of a caddyfile). For each composed route, named matcher sets are ordered in a way they are listed after the corresponding route directive.
  • A named matcher set can have one or many matchers (layer4.matchers.*), an empty named matcher set is invalid:
    @a <matcher> [<matcher_args...>]
    @b {
        <matcher> [<matcher_args...>]
        <matcher> {
            <matcher_option> [<matcher_option_args...>]
        }
    }
    @c <matcher> {
        <matcher_option> [<matcher_option_args...>]
    }
    
  • If a named matcher set block contains multiple matchers, matcher names are ensured to be unique, i.e. each kind of matcher can only be defined once in a matcher set. Multiple matchers are ANDed.
  • If a named matcher set is never listed after any route directive in the same block, it is completely ignored.
  • Basically any matcher supports one option to be set on the same line, or multiple options to be set in its block. Individual matcher syntax depends on the module used. See comments for caddyfile.Unmarshaler implementations for additional guidance.

Routes

  • The only directive consuming layer4 named matcher sets is route. If multiple named matcher sets are provided after route, they are ORed. A route directive without matcher sets is valid, it will match any connection it gets.
  • Routes are ordered in a way route directives are provided in a caddyfile, i.e. no internal reordering occurs.
  • One or many handlers can be provided in a block:
    route @a @b {
      <handler> [<handler_args...>]
    }
    route @c {
        <handler> [<handler_args...>]
        <handler> {
            <handler_option> [<handler_option_args...>]
        }
    }
    
  • If multiple handlers are provided, they are allowed to be of the same kind, i.e. not unique. Handlers are processed one by one, their order is preserved.
  • A handler-free route with or without matchers is also valid, it will pass the connection further, unless this behavior is eventually changed (out of scope of this PR).
  • Basically handlers only support multiple options to be set in their blocks, except for proxy handler which treats same-line options as upstream addresses. Individual handler syntax depends on the module used. See comments for caddyfile.Unmarshaler implementations for additional guidance.

Disclaimer

  • Although I've tested a lot of possible combinations of matchers, handlers and their options, some cases might have been missed. Feel free to share any config which doesn't work as expected. Please provide your caddyfile and caddy adapt output.
  • I only tested caddy adapt command, i.e. no tests have been done to confirm that matchers and handlers do work. Given I haven't changed anything in the inner logic, I expect no matching or handling regressions could be traced to this PR.
  • I only tested environment variables evaluated at launch ({$VAR} syntax), they do work. No placeholders have been tested, but nothing should prevent them (at least, {env.*}, {system.*} and {time.*}) from working correctly. However, some options do strict syntax checks (e.g. numbers, durations, IPs), so that they can't validate placeholders without additional changes.

Known limitations

There are a few functionality limitations, mostly due to the differences between caddyfile and json syntaxes. They are listed below.

  • http matcher doesn't support multiple inner matcher sets (nested OR logic). Use multiple named matcher sets instead:
    # if (http.host == example.com AND http.method == GET) 
    # OR (http.method == HEAD), use <handler>
    @a http {
        host example.com
        method GET
    }
    @b http method HEAD
    route @a @b {
        <handler>
    }
    
  • not matcher doesn't support multiple inner matcher sets (nested OR logic). Use multiple named matcher sets instead:
    # if !(tls.alpn == http/1 AND local_ip == 127.0.0.1)
    # OR !(tls.alpn == http/2), use <handler>
    @c not {
        local_ip 127.0.0.1
        tls alpn http/1
    }
    @d not tls alpn http/2
    route @c @d {
        <handler>
    }
    
  • not matcher, as well as any other layer4 matcher, is only allowed to be present once in a matcher set. While it's not a problem for other matchers, NOT AND NOT logic is only supported via subroutes:
    # if !(postgres) AND !(tls.sni == postgres.example.com), use <handler>
    @e not postgres
    route @e {
        subroute {
            @f not tls sni postgres.example.com
            route @f {
                <handler>
            }
        }
    }
    

Testing

Given @mholt is most likely busy with some developments of higher importance, I would reasonably expect a delay before this PR is merged. Unless you are patient enough, you can test and use these changes right now.

  • Beginner. Try pre-compiled docker images vnxme/caddy:layer4-caddyfile. They have a minimum of the latest Caddy, and layer4 module with caddyfile support.
  • Medium. Compile Caddy on your own with any additional modules you like:
    xcaddy build v2.8.4 \
        --with github.com/mholt/caddy-l4=github.com/vnxme/caddy-l4@caddyfile
    
  • Developer. You must know better than me how to check out and compile Caddy in your favourite IDE.

Config examples (layer4 parts only)

Own working config from #6439

OpenConnect VPN (ocserv) multiplexing
{
    servers {
        listener_wrappers {
            layer4 {
                @ocserv tls sni ocserv.example.com
                route @ocserv {
                    proxy ocserv.address.local:443
                }
                @others not tls sni ocserv.example.com
                route @others
            }
            tls
        }
    }
}

Some configs from #101 and #102

SSH multiplexing
{
    servers {
        listener_wrappers {
            layer4 {
                @ssh ssh
                route @ssh {
                    proxy localhost:22
                }
                # an empty route below is currenty required for fallback functionality
                # it will pass the connection further to the default tls listener wrapper
                route
            }
            tls
        }
    }
}
SSH over TLS
{
    servers {
        listener_wrappers {
            layer4 {
                @ssh tls sni ssh.example.com
                route @ssh {
                    tls
                    proxy localhost:22
                }
                # an empty route below is currenty required for fallback functionality
                # it will pass the connection further to the default tls listener wrapper
                route
            }
            tls
        }
    }
}

Some configs from #209

Prefetch
{
    layer4 {
        0.0.0.0:10443 {
            @pp proxy_protocol
            route @pp {
                proxy_protocol {
                    allow 0.0.0.0/0
                }
            }
            @tls_localhost tls sni localhost
            route @tls_localhost {
                tls
            }
            @http_localhost http host localhost
            route @http_localhost {
                proxy 127.0.0.1:10080
            }
        }
    }
}
SOCKSv4 before SOCKSv5
{
    layer4 {
        0.0.0.0:1080 {
            @s4 socks4
            route @s4 {
                proxy 10.64.0.1:1080
            }
            @s5 socks5
            route @s5 {
                socks5 {
                    credentials user password
                }
            }
        }
    }
}
Subroute
{
    layer4 {
        0.0.0.0:10443 {
            @level1 tls
            route @level1 {
                subroute {
                    @level2 tls sni localhost
                    route @level2 {
                        tls
                        proxy 127.0.0.1:10080
                    }
                }
            }
        }
    }
}
TLS in TLS
{
    layer4 {
        0.0.0.0:10443 {
            route {
                tls
            }
            @tls_inner tls sni inner
            route @tls_inner {
                tls
                proxy 127.0.0.1:10080
            }
        }
    }
}
Listener wrapper matcher with no handler
{
    servers {
        listener_wrappers {
            layer4 {
                @first tls sni localhost
                route @first {
                    tls
                    proxy 127.0.0.1:10080
                }
                @second http
                route @second
            }
        }
    }
}

Included changes

Below is a list of caddyfile directives (handlers, matchers, options), for which caddyfile.Unmarshaler interface or a similar function has been implemented in this PR.

Global Directives

Handlers (layer4.handlers.*)

Proxy load balancing selection policies (layer4.proxy.selection_policies.*):

Proxy upstream inner blocks:

TLS inner blocks:

Listener Wrappers (caddy.listeners.*)

Matchers (layer4.matchers.*)

TLS handshake matchers (tls.handshake_match.*):

Proposed changes

caddyserver/caddy repo, non-breaking:

  • Inner matchers local_ip, remote_ip and sni of tls matcher don't implement caddyfile.Unmarshaler interface. Replacement functions and a special workaround are temporarily placed in modules/l4tls/matcher.go to support these matchers. They can be removed, when native implementations are merged (PR to follow).
  • Inner blocks connection_policy and cert_selection of tls handler don't implement caddyfile.Unmarshaler interface. Replacement functions are temporarily placed in modules/l4tls/handler.go to support these blocks. They can be removed, when native implementations are merged (PR to follow).
  • Inner block tls of proxy handler doesn't implement caddyfile.Unmarshaler interface. Replacement function is temporarily placed in modules/l4proxy/upstream.go to support this block. It can be removed, when native implementation is merged (PR to follow).

mholt/caddy-l4 repo, breaking:

  • Layer4 remote IP matcher is called ip, while HTTP and TLS remote IP matchers are called remote_ip. It seems confusing to me, and I suggest we rename it to remote_ip for consistency, probably with some deprecation warnings. If agreed, a PR will follow.

Discussions

There is a big thread on caddyfile support (#16) created over 3 years ago, and 100+ people have expressed their support so far. Below are my 2 cents regarding the ideas discussed:

  • First of all, credits to @RussellLuo for sharing a quick implementation of caddyfile support.
  • I like what @francislavoie suggested here , but it's really ambitious and difficult to implement in a non-breaking way. It would be great if Caddy v3 could feature something like that.
  • I appreciate @mholt defined 2 possible ways of implementing caddyfile support for layer4 servers here. While I understand a separate ServerType could be a good way to implement the above, I decided to stick to the global directive.
  • I also support this suggestion by @RussellLuo, but it would be a breaking change. Definitely, it's also a syntax to consider for Caddy v3.
  • I understand the @CEbbinghaus's arguments for a separate configuration file here, but having a single configuration file for everything is a very nice feature of Caddy that I wouldn't like to be dropped.
  • Finally, what I suggest is to go step by step. Let's have a global option merged first. Then, when/if we agree on the syntax, it's not a big deal to prepare a separate ServerType, given all matchers and handlers are already covered by this PR.

@francislavoie
Copy link
Collaborator

Impressive amount of work!

I like what @francislavoie suggested #16 (comment) , but it's really ambitious and difficult to implement in a non-breaking way.

Not really - the plan is to roll caddy-l4 into Caddy mainline, and then build out Caddyfile support from there. I think it would be extremely unlikely to be an actual breaking change for anyone if we use a scheme prefix or whatever for site addresses because it's not current valid config.

My main concern is that if we introduce it like this now, then we would need to ask users to migrate their config somehow later when we do roll it into mainline. But we can mark it as experimental in the README for now I guess, noting that it will likely change later.

@vnxme
Copy link
Collaborator Author

vnxme commented Jul 17, 2024

the plan is to roll caddy-l4 into Caddy mainline

Can't wait for it to happen. Hope this PR will speed up the process.

if we use a scheme prefix

So my understanding is all server blocks in a caddyfile are currently processed by ServerType.Setup() here. I can definitely prepare a similar ServerType for layer4 functionality. What I don't understand is how to make Caddy use different ServerTypes (http and layer4 ones) for the same adapter, given no same-named adapters are allowed to be registered here.

My main concern is that if we introduce it like this now, then we would need to ask users to migrate their config somehow later when we do roll it into mainline. But we can mark it as experimental in the README for now I guess, noting that it will likely change later.

I don't mind if we continue working on it together. What you are talking about (layer4 global directive and readme changes) is only covered in steps 1/27 and 27/27. If you helped me solve the problem with ServerTypes described above, I would prepare a few more changes to replace global directive with scheme-specific server blocks.

@francislavoie
Copy link
Collaborator

What I don't understand is how to make Caddy use different ServerTypes (http and layer4 ones)

I want to throw away the concept of "server type" in the Caddyfile adapter. It was a misguided idea. It doesn't work in practice. That would be a purely internal change though, but reorganizing that would make it easier to make site blocks pluggable by changing whether we call down into httpcaddyfile or something else, per site block.

Either way, I think we should wait until we're ready to move caddy-l4 into mainline, which will still be a little while because there's a bit of an ongoing debate about how matchers work (see the other recent PRs) that needs to be resolved first, and a few other API stability details.

@vnxme
Copy link
Collaborator Author

vnxme commented Jul 17, 2024

Either way, I think we should wait

Sorry, I can see no point in waiting. I have seen the recent PR battle between #208 and #210, and I'm personally interested in @WeidiDeng and @ydylla having come to a mutually acceptable solution, because the problem they are trying to solve affects my use case (see caddyserver/caddy#6439). They seem to be busy with other things, just like @mholt. But all of these problems have nothing to do with caddyfile support which this PR is about.

reorganizing that would make it easier to make site blocks pluggable by changing whether we call down into httpcaddyfile or something else, per site block.

Why don't we do it in parallel with fixing caddy-l4 stability? It will be easier/faster to move caddy-l4 into the mainline, if it has caddyfile support. Is there anything, other than waiting, I could help you with in order to make this PR mergeable?

@vnxme
Copy link
Collaborator Author

vnxme commented Jul 17, 2024

I'm hesitating about steps 8-11/27 that deal with proxy handler. There are at least 2 possible syntax options:

  • nested health_checks, load_balancing and tls blocks which are implemented for layer4 as of now:

    proxy handler syntax, option 1
    proxy [<upstreams...>] {
        health_checks {
      	  active {
      		  interval <duration>
      		  port <int>
      		  timeout <duration>
      	  }
      	  passive {
      		  fail_duration <duration>
      		  max_fails <int>
      		  unhealthy_connection_count <int>
      	  }
        }
        load_balancing {
      	  selection <policy> [<policy_options>]
      	  try_duration <duration>
      	  try_interval <duration>
        }
        proxy_protocol <v1|v2>
        upstream [<args...>] {
      	  dial <address:port> [<address:port>]
      	  max_connections <int>
      	  tls {
      		  ca <module>
      		  client_auth <automate_name> | <cert_file> <key_file>
      		  curves <curves...>
      		  except_ports <ports...>
      		  handshake_timeout <duration>
      		  insecure_skip_verify
      		  renegotiation <never|once|freely>
      		  server_name <name>
      		  # DEPRECATED:
      		  root_ca_pool <certificates...>
      		  root_ca_pem_files <certificates...>
      	  }
        }
        upstream [<args...>]
    }
    
  • flat health_, lb_ and tls_ prefixed options consistent with reverse_proxy documentation here and here, like this:

    proxy handler syntax, option 2
    proxy [<upstreams...>] {
        health_interval <duration>
        health_port <int>
        health_timeout <duration>
        fail_duration <duration>
        max_fails <int>
        # NOTE: option below is called "unhealthy_request_count" in the reverse_proxy docs
        unhealthy_connection_count <int>
        lb_policy <name> [<options...>]
        lb_try_duration <duration>
        lb_try_interval <duration>
        upstream [<addresses...>] {
      	  dial <address:port> [<address:port>]
      	  # NOTE: option below is called "max_conns_per_host" in the reverse_proxy docs
      	  max_connections <int>
      	  tls
      	  tls_trust_pool <module>
      	  tls_client_auth <automate_name> | <cert_file> <key_file>
      	  tls_curves <curves...>
      	  tls_except_ports <ports...>
      	  tls_timeout <duration>
      	  tls_insecure_skip_verify
      	  tls_renegotiation <never|once|freely>
      	  tls_server_name <name>
      	  # NOTE: option below is deprecated and even not parsed by HTTPTransport.UnmarshalCaddyfile()
      	  # [tls_]root_ca_pool <certificates...>
      	  tls_trusted_ca_certs <certificates...>
      	  }
        }
    }
    

It's worth noting that reverse_proxy of the http app and proxy of the layer4 app are different handlers, and their options doesn't have to match. Still, I like the idea of switching from option 1 (having maximum consistency with json option names) to option 2 (having maximum consistency with existing caddyfile documentation). But before that we need to agree on the issues described in the code notes above (differently named options).

One place where we can't fully merge what is done for reverse_proxy to proxy is to directive. The latter handler has max_connections and tls (block or set of options) set per upstream, while the former shares these options across all upstreams. Besides, proxy allows each upstream to have multiple dial addresses (dial is []string), while reverse_proxy allows only one dial address (dial is simply string). I'm not sure why such a difference exists and what the use of multiple addresses per single upstream is, but it all means we can't substitute upstream block inside proxy for to subdirective, unless we agree to have another known functionality limitation (as compared to layer4 json config).

@mholt
Copy link
Owner

mholt commented Jul 17, 2024

Wow -- an incredible effort! Nice work, @vnxme. And this is one of the most organized, best-documented contributions I have ever seen. You get a gold star. 🌟

Just catching up on this thread... (Disclaimer: I have not reviewed the code change in any detail yet. Just a quick skim. I'm reading the discussion for starters.)

First, I want to clarify that I'm inclined to accept this contribution, perhaps with some revisions/discussion first... I know Francis has some concerns about future plans related to moving this into the mainline repo, so I want to make sure we can all be satisfied with a big change like this.

This module is experimental -- and documented as "in development" and to expect breaking changes.

Here's how I view this repository: it's the perfect place to experiment and try different ways to get it right, or to learn valuable lessons, before we merge into mainline Caddy.

So, in my opinion, it's OK if this isn't how it's done later when/if it becomes a standard part of Caddy. If the Caddyfile config looks one way while it's here, but then it changes when/if it gets merged into Caddy, that's OK because their entire build config will have to change anyway (they won't be compiling in a separate caddy-l4 plugin anymore).

Was there much or any code that had to be copied out of Caddy to make this work, @vnxme ?

Regarding

I'm hesitating about steps 8-11/27 that deal with proxy handler. There are at least 2 possible syntax options

This is a great question... I think I like flatter (not nested) configs in general (nesting can get unwieldy IMO, but sometimes it can be helpful). That said, a couple thoughts to help guide the decision...

  • How often are active and passive health checks both defined together? I don't see this often, and when only 1 is defined, I definitely prefer the flatter config syntax.
  • Given the per-connection control and flexibility in this module, I do think the tls block makes sense to me in this case.

Indentation/nesting can be easier for some people to grok or more difficult for others, since it can be harder to scan. But it does add structure. So... up for debate.

The latter handler has max_connections and tls (block or set of options) set per upstream, while the former shares these options across all upstreams.

The reason for this is because for the HTTP reverse_proxy, we use Go's underlying http.Transport which manages TCP connections for us, so we're limited to its exported config API.

I'm not sure why such a difference exists and what the use of multiple addresses per single upstream is

This is a really cool feature IMO! Even if niche. I actually used something like this with a backup service I wrote years ago. The basic idea is that you can have one downstream client that is able to communicate with multiple identical upstreams simultaneously. One upload to the proxy, then it tees to multiple upstreams, and their outputs all get aggregated to the client. This requires special handling by the client and the servers have to be in the same state (meaning, they have to be in a state where one will error and the others don't, for example, to avoid breaking your business logic) -- but it is quite useful in some cases. I think we can even improve on how this situation is handled and make more things configurable, but I digress; that's a discussion for another day.

Okay, now back to the top...

Overall I think this is a solid first pass. Probably just about how I would do it.

Known limitations

All probably fine, especially for now. We can iterate later if needed. :)

Layer4 remote IP matcher is called ip, while HTTP and TLS remote IP matchers are called remote_ip. It seems confusing to me, and I suggest we rename it to remote_ip for consistency, probably with some deprecation warnings. If agreed, a PR will follow.

Good point... the distinction is, of course, for local_ip (which I don't think even exists in this module... yet?), which I believe refers to the IP address the remote used to connect to this machine (so it could be 127.0.0.1, an internal IP like 10.* something, or a public IP... depending on the network interface, NAT, etc). (Correct me if I'm wrong.)

So, yeah, maybe we should fix this. I'm down for remote_ip.

I like what @francislavoie suggested #16 (comment) , but it's really ambitious and difficult to implement in a non-breaking way. It would be great if Caddy v3 could feature something like that.

I do like this idea, and I would love to see something like this stubbed out, but it doesn't have to be right now. Hopefully without a Caddy v3 😄 (which we don't have plans for).

I appreciate @mholt defined 2 possible ways of implementing caddyfile support for layer4 servers #16 (comment). While I understand a separate ServerType could be a good way to implement the above, I decided to stick to the global directive.

I'm OK with this approach for now, as it allows combining HTTP and layer4 together easier. The new ServerType would be a good way to implement an improved, experimental Caddyfile syntax to eventually upgrade what we currently have.

I also support #16 (comment) by @RussellLuo, but it would be a breaking change. Definitely, it's also a syntax to consider for Caddy v3.

Yep, this is also a decent idea; though like I said I prefer flatter configs when possible :)

I understand the @CEbbinghaus's arguments for a separate configuration file #16 (comment), but having a single configuration file for everything is a very nice feature of Caddy that I wouldn't like to be dropped.

I agree we can and should keep it to 1 file.

Finally, what I suggest is to go step by step. Let's have a global option merged first. Then, when/if we agree on the syntax, it's not a big deal to prepare a separate ServerType, given all matchers and handlers are already covered by this PR.

Yep, that's what I'm thinking too; even if implementing a new ServerType is currently impractical (which is I think why @francislavoie says it is a "misguided idea"), we can still reuse most of the code here related to parsing tokens, UnmarshalCaddyfile, etc. We just would wire it up a little differently at the highest level.

Anyway, I'd like to recommend that we carry this PR through to completion and merge it. It doesn't have to be the perfect implementation and it doesn't have to be "complete" (in the sense that there is nothing left to do -- it just has to work well). We can figure out how to work it into the Caddyfile more natively later; most of the code will remain, regardless.

@francislavoie Is that agreeable? We can get this into a merge-worthy state, then figure out how to upgrade the Caddyfile as a whole.

@mholt mholt added the enhancement New feature or request label Jul 17, 2024
@francislavoie
Copy link
Collaborator

Yep 👍

To clarify what I meant, the "server types" idea was to have entirely separate Caddyfile parsing per type, but what we actually want is separate site block parsing per type. It's currently at the wrong layer/level of the Caddyfile IMO. Just needs to be moved down to the site address layer and fork the parsing based on the detected type from the site address. But site addresses still need to be able to send data up to the top level to be assembled after (logging, tls, etc) which live outside the app in question.

@mholt
Copy link
Owner

mholt commented Jul 17, 2024

@francislavoie Ok, that sounds good. I like that idea. Thanks!

@vnxme Would you be okay helping us with massaging this into the new Caddyfile improvements when they are being worked on/released?

Another question -- does this have some sort of default directive order, or do all connection handlers need to be inside a route block?

@WeidiDeng
Copy link
Contributor

@mholt I don't think a default order is needed, unlike http. People will write the directives in the order they need. Although some of the handlers behave like middlewares in http, their purpose depends on the handlers besides them due to the diverse need of the users.

@vnxme
Copy link
Collaborator Author

vnxme commented Jul 18, 2024

@mholt, I appreciate your positive feedback, thank you!

Code sharing

Was there much or any code that had to be copied out of Caddy to make this work

I don't think I had to copy much from Caddy mainline. However, I studied the approaches you had taken there to ensure as much consistency as possible. For example, ParseCaddyfileNestedMatcherSet() defined in layer4/caddyfile.go is implemented in almost the same way as ParseCaddyfileNestedMatcherSet() from modules/caddyhttp/matchers.go. Generally I tried to reuse the code existing in the mainline for layer4 purposes. Let me know if you find any inefficiencies.

On the other hand, there is some code that I propose to move from this PR to the mainline. See caddyserver/caddy#6461 and caddyserver/caddy#6462. As mentioned above, we may also need to move the code related to tls option inside upstream directive of proxy handler, unless we get it flattened.

Syntax of proxy handler

This is a great question... I think I like flatter (not nested) configs in general (nesting can get unwieldy IMO, but sometimes it can be helpful). That said, a couple thoughts to help guide the decision...

  • How often are active and passive health checks both defined together? I don't see this often, and when only 1 is defined, I definitely prefer the flatter config syntax.
  • Given the per-connection control and flexibility in this module, I do think the tls block makes sense to me in this case.

Indentation/nesting can be easier for some people to grok or more difficult for others, since it can be harder to scan. But it does add structure. So... up for debate.

I agree that the flatter syntax looks better. In my view, it is also true for tls block. This way it will be easier for users to remember tls_ options for both proxy and reverse_proxy. And we won't loose any control and flexibility features, given tls_ options are kept inside upstream block, i.e. adjusted per each upstream.

Renaming of ip handler into remote_ip

Good point... the distinction is, of course, for local_ip (which I don't think even exists in this module... yet?), which I believe refers to the IP address the remote used to connect to this machine (so it could be 127.0.0.1, an internal IP like 10.* something, or a public IP... depending on the network interface, NAT, etc). (Correct me if I'm wrong.)

So, yeah, maybe we should fix this. I'm down for remote_ip.

OK, I will prepare a separate PR with this breaking change.

Further caddyfile modifications

To clarify what I meant, the "server types" idea was to have entirely separate Caddyfile parsing per type, but what we actually want is separate site block parsing per type. It's currently at the wrong layer/level of the Caddyfile IMO. Just needs to be moved down to the site address layer and fork the parsing based on the detected type from the site address. But site addresses still need to be able to send data up to the top level to be assembled after (logging, tls, etc) which live outside the app in question.

I support this point of view. However, I doubt I could implement it on my own without breaking anything inside Caddy.

Would you be okay helping us with massaging this into the new Caddyfile improvements when they are being worked on/released?

Yep, sure.

Handler directive ordering

does this have some sort of default directive order, or do all connection handlers need to be inside a route block?

With regards to handlers, neither default directive order exists, nor internal reordering occurs. As mentioned in the approach section of the opening post, handler order is preserved, i.e. if tls goes before proxy in a caddyfile, layer4 does TLS termination, than proxies.

People will write the directives in the order they need.

That's exactly what I assumed and took for granted in my approach.

Code check failures

I noticed, the automatic code checks are getting failed complaining about caddytls.* dependencies. However, the code compiles flawlessly both in my IDE and via xcaddy to build docker images. Did I forget to adjust anything else to pass these code checks?

@vnxme
Copy link
Collaborator Author

vnxme commented Jul 18, 2024

What I'm also hesitating about is how strict caddyfile syntax checks must be. Examples:

  • A directive doesn't support any same-line options, but a user provides one or a few. In most places of the mainline code caddyfile.Dispenser.ArgErr() is called, and so do I in this PR. However, there is at least one place in the mainline code where it's done differently: such same-line arguments are just skipped without any complaints.
  • An option expects some value that should be convertible into a number, e.g. duration, int, float. Then we have no choice, but to try parse this value into an expected type and return an error if it won't convert. Otherwise json marshalling and unmarshalling won't work properly. Both mainline and this PR share this approach.
  • An option expects addresses or address:port combinations. We can either simply put these strings to the corresponding fields, or parse addresses first, and fill the corresponding fields unless parsing fails. The latter is what I do in this PR. But if we go a little deeper into details, addresses are validated using different functions across layer4 code. E.g. addresses after local_ip matcher are parsed with layer4.ParseNetworks(), that doesn't accept {env.VAR} syntax, while addresses after upstream option of proxy handler are parsed with caddy.ParseNetworkAddress(), that does accept placeholders.

This is just to reconfirm that I'm not too strict with caddyfile syntax checks. Because I'm not sure that I treat placeholders correctly this way. As noted above, I checked {$VAR} syntax, and it works well for numbers and addresses. However, I vaguely understand whether we need {env.VAR} syntax to work with layer4. Would be grateful if somebody could share examples of using run-time evaluated environment variables and/or other placeholders with layer4.

@mholt
Copy link
Owner

mholt commented Jul 19, 2024

I noticed, the automatic code checks are getting failed complaining about caddytls.* dependencies. However, the code compiles flawlessly both in my IDE and via xcaddy to build docker images. Did I forget to adjust anything else to pass these code checks?

caddytls.MatchLocalIP is new, but is in 2.8.4: caddyserver/caddy@26748d0

So we might need to update go.mod here, since we're using new API surface.

What I'm also hesitating about is how strict caddyfile syntax checks must be. Examples:

You are very right, it is all over the place. Partly because I only went to lengths to make it strict based on bug reports or questions where people had the syntax wrong but no error. Sometimes it's not even harmful, but can be confusing.

So generally I leave it up to user feedback to make it more strict (because I'm lazy with stuff like that) but it doesn't hurt to be more strict if it's not too complex. Don't stress to make it perfect everywhere though if it's tedious.

@mholt
Copy link
Owner

mholt commented Jul 19, 2024

@vnxme Would you like to join our developer Slack? Just let me know which email to invite.

Also, after we merge this I can grant you collaborator privileges so you can continue to contribute in branches, instead of forks, if you want.

@francislavoie
Copy link
Collaborator

A directive doesn't support any same-line options, but a user provides one or a few. In most places of the mainline code caddyfile.Dispenser.ArgErr() is called, and so do I in this PR. However, there is at least one place in the mainline code where it's done differently: such same-line arguments are just skipped without any complaints.

I don't know why it was done that way in that spot. That's definitely weird. We tend to lean on the stricter side, rejecting anything not usable.

An option expects some value that should be convertible into a number, e.g. duration, int, float. Then we have no choice, but to try parse this value into an expected type and return an error if it won't convert. Otherwise json marshalling and unmarshalling won't work properly. Both mainline and this PR share this approach.

Yep 👍

E.g. addresses after local_ip matcher are parsed with layer4.ParseNetworks(), that doesn't accept {env.VAR} syntax, while addresses after upstream option of proxy handler are parsed with caddy.ParseNetworkAddress(), that does accept placeholders.

Yeah ideally most things should support placeholders if possible, at runtime. Thankfully with Caddyfile syntax {$ENV} style is possible as a recourse.

@vnxme
Copy link
Collaborator Author

vnxme commented Jul 19, 2024

So we might need to update go.mod here, since we're using new API surface.

That's exactly what I missed, thanks. I used go mod tidy to generate new go.mod and go.sum files. Hope, it will build now.

Would you like to join our developer Slack? Just let me know which email to invite.

Thanks for trust. I have little experience with Slack. If you think I might be of any help there, I don't mind. Do you have a public address? I could drop an e-mail there, so that not to publish it here.

Also, after we merge this I can grant you collaborator privileges so you can continue to contribute in branches, instead of forks, if you want.

I'm quite fine with working in forks. I think, it's more secure. Given I'm not a professional developer, what I would like most when I propose some changes is not to break anything because of my negligence.

Yeah ideally most things should support placeholders if possible, at runtime. Thankfully with Caddyfile syntax {$ENV} style is possible as a recourse.

Then, I believe, more tests would be needed to identify the places where runtime variables don't work. I suppose, key risk areas are IP addresses. I will try to investigate it myself. If anybody finds such a place before me, please don't hesitate to report here.

By the way, I have flattened proxy handler syntax, as discussed above. See 4 fixes for steps 8-11/27.

@mholt
Copy link
Owner

mholt commented Jul 20, 2024

@vnxme Sure, no problem. Send me an email to the address near the top part of this page: https://caddyserver.com/sponsor

Will take a closer look in the next few days (maybe after the weekend) :)

@vnxme
Copy link
Collaborator Author

vnxme commented Jul 21, 2024

Squashed a few fixes and rebased this PR onto master with #218 merged.

@mholt
Copy link
Owner

mholt commented Jul 22, 2024

This LGTM -- is there anything else we're waiting on before merging?

@francislavoie
Copy link
Collaborator

I'd like to see some Caddyfile adapt tests if that's not too much to ask. It would serve as examples as well to make sure the adapter functions produce the expected JSON.

See the Caddy main repo, there's tests like these https://github.com/caddyserver/caddy/tree/master/caddytest/integration/caddyfile_adapt called from https://github.com/caddyserver/caddy/blob/master/caddytest/integration/caddyfile_adapt_test.go I don't know how possible it is to write tests like this from a plugin package, but would be nice to have.

vnxme added 19 commits July 23, 2024 22:55
- caddyconfig.JSON() interprets AuthMethods []uint8 as a list of bytes
- changing type of AuthMethods to []uint16 resolves the problem
and inner structs:
- caddytls.MatchLocalIP
- caddytls.MatchRemoteIP
- caddytls.MatchServerName
and inner structs:
 - caddytls.ConnectionPolicy
 - caddytls.CustomCertSelectionPolicy
@vnxme
Copy link
Collaborator Author

vnxme commented Jul 23, 2024

Rebased this PR onto master with #219 merged.

@francislavoie Thanks for a great idea. I've prepared at least 1 test per each handler and matcher. Hope it would be enough. By the way, I've caught a few copy-and-paste bugs, so it was definitely worth it.

My suggestion for the next steps:

@mholt mholt linked an issue Jul 23, 2024 that may be closed by this pull request
@mholt mholt merged commit ec5408a into mholt:master Jul 23, 2024
6 checks passed
@mholt
Copy link
Owner

mholt commented Jul 23, 2024

Excellent work. Thank you so much @vnxme. Many people are grateful to you for this!

I'll attend to those other PRs next.

@vnxme vnxme deleted the caddyfile branch July 24, 2024 07:52
@techknowlogick
Copy link

Thank you, @vnxme, for your significant work on this PR, and @mholt and @francislavoie, for your thorough reviews on it, too. <3

@louis-lau
Copy link

Incredible! Was just looking into using this and dreading the JSON, much love from me.

PS, the readme still says "Since this app does not support Caddyfile (yet?)"

@mholt
Copy link
Owner

mholt commented Jul 27, 2024

@louis-lau That's being fixed in #223 -- which I'm still reviewing 😅

@qm3ster
Copy link

qm3ster commented Dec 23, 2024

Oh my god, this is insane. Never in a thousand years did I imagine Caddyfile support would get added, and frankly I had come to terms with it, but here we go?!
"Boar Head Exploding" Apple emoji

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Support Caddyfile
7 participants