Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add diff/sync support for Konnect documents #358

Merged
merged 30 commits into from
May 17, 2021

Conversation

rainest
Copy link
Contributor

@rainest rainest commented Apr 30, 2021

Follow-up to #336. This adds support for diff and sync on top of dump, along with the associated necessary code in the differ, file builder, and solver. Some notes:

  • This is prescriptive about the layout of the document content files on disk. They must match the structure used by dump. My rationale despite previous discussion otherwise was that (a) we basically have to be prescriptive about the service package and service version structure without some sort of manifest file and (b) that paths are really names, and there shouldn't be any reason to actually handle any more complex directory tree structure below the package and version directories.
  • Originally, we intended to assume that the content directories would be under the same directory as konnect.yaml. In the course of writing this, I found that we actually support multiple state files, which is at odds with our original intent. The implementation in this PR uses the "first" state file's directory, but that's a cop-out (and I'm not even sure the order is consistent). How do we want to handle this? My first thought is that we use the state file directory if there's only a single state file, but if there are multiple, we have an additional flag to specify the content directories' location and require it when you provide multiple states.
  • In some manual test runs, I found that DeepEqual() finds a spurious difference in documents attached to service packages under some unknown circumstances. I believe this is somehow related to the versions attached to the parent package, since their addresses are the only difference I spotted walking through the code, although the values within them are the same and they shouldn't result in inequality detection. I disabled foreign checks, which clears that issue, and doesn't present any obvious issue--you can't change the foreign relationships of documents since they're embedded within their parent in the target state.

Travis Raines added 6 commits April 29, 2021 15:31
Practical testing revealed some incorrect endpoints and broken upstream
behavior for PATCH. Updates now use a DELETE/PUT (similar to the web UI)
rather than a PATCH.
@rainest rainest requested a review from a team as a code owner April 30, 2021 20:38
Copy link
Contributor

@mflendrich mflendrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but see comments.

konnect/document_service.go Outdated Show resolved Hide resolved
@@ -11,7 +11,6 @@ const (
type ParentInfoer interface {
URL() string
Key() string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a reminder that, as discussed yesterday, the URL could probably play the role of the key.

If you think that makes sense, we could simplify this interface by removing Key.

// EqualWithOpts returns true if d1 and d2 are equal.
// If ignoreID is set to true, IDs will be ignored while comparison.
// If ignoreTS is set to true, timestamp fields will be ignored.
func (d1 *Document) EqualWithOpts(d2 *Document,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem of comparing structs with certain fields ignored has been solved already: go-cmp with FilterPath and Ignore can do that and I've successfully used that in other projects.

Not requesting changes here - sharing for visibility of the alternative, potentially better, approach.

@hbagdi
Copy link
Member

hbagdi commented May 6, 2021

Please hold from merging this.
I'll work with @rainest to get this merged.

@hbagdi
Copy link
Member

hbagdi commented May 6, 2021

Originally, we intended to assume that the content directories would be under the same directory as konnect.yaml. In the course of writing this, I found that we actually support multiple state files, which is at odds with our original intent. The implementation in this PR uses the "first" state file's directory, but that's a cop-out (and I'm not even sure the order is consistent). How do we want to handle this? My first thought is that we use the state file directory if there's only a single state file, but if there are multiple, we have an additional flag to specify the content directories' location and require it when you provide multiple states.

I think the state file will in most cases point to a directly rather than multiple files.
We also support a hierarchy of files, so multiple files in different directories are also supported.

This is prescriptive about the layout of the document content files on disk. They must match the structure used by dump. My rationale despite previous discussion otherwise was that (a) we basically have to be prescriptive about the service package and service version structure without some sort of manifest file and (b) that paths are really names, and there shouldn't be any reason to actually handle any more complex directory tree structure below the package and version directories.

The point is that we don't want users to follow the specific structure. If they want to have a different structure for any reason, they should be allowed. Is there a technical blocker to implement that?
EDIT: You might wonder why am I so adamant about this particular thing. There are two reasons:

  1. decK has intentionally not been opinionated about these layouts. It gives some degree of flexibility to our end users. Plus, if now we start becoming specific about the layout, we are breaking homogeneity in the product experience (which might be okay in future but I don't think right now is the time).
  2. Konnect is still early in the product lifecycle (birth to death). These assumptions look good and fair today but it is very natural to have changes in these assumptions in future. Minimizing the API surface we have to conform to for forward-compatibility usually leads to better product experience and also eases maintainability.

@hbagdi
Copy link
Member

hbagdi commented May 6, 2021

BUG:
1.I created a service package and a version. Added a service document and an OAS spec.
2.deck konnect dump
3. Delete package from the UI
4. deck konnect sync -> throws 500 for the create document for version

@rainest rainest force-pushed the wip/konnect-documents branch 2 times, most recently from 9b25cbf to d87e76a Compare May 7, 2021 21:33
@codecov-commenter
Copy link

codecov-commenter commented May 7, 2021

Codecov Report

Merging #358 (0fe75e3) into feat/konnect-documents (653773c) will decrease coverage by 6.27%.
The diff coverage is 23.56%.

Impacted file tree graph

@@                    Coverage Diff                     @@
##           feat/konnect-documents     #358      +/-   ##
==========================================================
- Coverage                   52.08%   45.81%   -6.28%     
==========================================================
  Files                          61       85      +24     
  Lines                        5049     5867     +818     
==========================================================
+ Hits                         2630     2688      +58     
- Misses                       2109     2857     +748     
- Partials                      310      322      +12     
Impacted Files Coverage Δ
cmd/common_konnect.go 14.58% <0.00%> (-0.64%) ⬇️
dump/dump_konnect.go 26.96% <0.00%> (+0.29%) ⬆️
file/konnect.go 0.00% <0.00%> (ø)
file/types.go 43.43% <ø> (ø)
file/writer.go 13.41% <0.00%> (-0.07%) ⬇️
solver/document.go 0.00% <0.00%> (ø)
solver/solver.go 0.00% <0.00%> (ø)
state/builder.go 0.00% <0.00%> (ø)
state/konnect_types.go 15.90% <6.66%> (-7.43%) ⬇️
file/builder.go 52.47% <38.23%> (+1.59%) ⬆️
... and 28 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 653773c...0fe75e3. Read the comment docs.

@rainest
Copy link
Contributor Author

rainest commented May 7, 2021

I think the state file will in most cases point to a directory rather than multiple files.

We do totally support multiple files in different directories though. This is valid:

/tmp/kdeck konnect diff --konnect-email "traines+konnect@konghq.com" --konnect-password-file pass \
    -s /tmp/wat/konnect.yaml \
    -s /tmp/why/konnect.yaml

The point is that we don't want users to follow the specific structure. If they want to have a different structure for any reason, they should be allowed. Is there a technical blocker to implement that?

From a design perspective, I'm inclined to use the single format for both dump (where we have to choose the layout anyway) and sync unless we have a known use case that would require the additional flexibility already. Limiting the decisions users have to make is beneficial if we don't know of a reason they'd actually need the flexibility.

If we do receive feedback that users need more flexibility here, I don't think we're locked in to this layout: we can add an alternative means of specifying the path while keeping this as a default, similar to how we default to kong.yaml/konnect.yaml but use your chosen state file if you specify another. We'd further have the benefit of understanding why users need the flexibility, allowing us to design around that need and bounce prototypes off interested users.

From a technical perspective, the more flexible approach requires some sort of manifest indicating the file locations AFAICT. IMO the easiest option for that is reintroducing the LocalPath field I had on Documents originally, and having them point to an absolute location. Earlier discussion indicated we didn't want this.

f9d1f12 shortens the content diffs. Content is stripped from the objects we pass to getDiff() (we still need to run that for the other fields) and passed through a unified diff instead, which is printed after the getDiff() output. It uses a Document-specific path in the crud.Update clause. If we need something similar for other types in the future, we should rework that into an interface.

7fb98fe makes Documents not arrays in the file. This aligns with what the GUI does, though not with the API, which does support multiple Documents as long as only one is published (they're just entirely unused in the GUI).

file/types.go Outdated Show resolved Hide resolved
cmd/common_konnect.go Show resolved Hide resolved
konnect/document_service.go Show resolved Hide resolved
@@ -14,6 +19,45 @@ var (
differ = gojsondiff.New()
)

func getDocumentDiff(a, b *state.Document) (string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: study and evaluate this

solver/utils.go Show resolved Hide resolved
Travis Raines added 4 commits May 11, 2021 14:14
Add checks to confirm a Document exists before acting on it.

7fb98fe missed several locations that previously relied on an empty
range to avoid interactions with nil Documents. Without the range, those
sections of code would attempt to interact with a Document even if the
parent Service Package/Version had none set, resulting in nil deref
panics.
Use local relative paths for Document objects in the state file. When
comparing against and syncing to Konnect state, strip these local paths
and use only the base filename with a prepended slash to match the path
value format created by the Konnect GUI.
Konnect Document updates used to DELETE existing Documents before PUTing
the updated Document. This DELETE call is unnecessary, as you can PUT
the updated Document immediately (although you cannot PATCH it).
@rainest
Copy link
Contributor Author

rainest commented May 12, 2021

In retrospect, the f9d1f12 explanation would have benefited from an example. In this case, swagger.json is a single-line JSON file (the Swagger pet store). Performing any sort of naive text diff against this would print a large amount of text, since all changes would be within the one line. The JSON document diff uses MarshalIndent() to pretty-print the JSON blobs first and then performs a unified diff on the results. This only shows the section that changes:

$ /tmp/kdeck konnect diff --konnect-email "traines+konnect@konghq.com" --konnect-password-file pass
updating document /swagger.json  {        
   "id": "084708f1-9af4-4a38-b6d3-6c593b87c552",
   "path": "/swagger.json",
   "published": true
 }
--- old
+++ new
@@ -229,7 +229,7 @@
 						]
 					}
 				],
-				"summary": "Add a new pet bloo bloo bloo to the store",
+				"summary": "Add a new pet to the store",
 				"tags": [
 					"pet"
 				]

Alternative would be some sort of JSON structure-aware diff that can instead show diffs at an object path or a word diff that can better handle small changes in a long line. I don't know of a readily available Go library for these.

19ddb57 handles using the Document Path field as both the Path value when interacting with the Konnect API and a filepath relative to the state file directory. Having now implemented this, I'm further convinced that we should use a separate state file field (localPath or filepath or whatever) instead of a field that serves both purposes.

With a single field, we effectively discard the original path value retrieved from the API when dumping, and must rebuild it using StripLocalDocumentPath() and FilenameToName(). This is additional complexity, breaks with other decK objects (which we can marshal and unmarshal without additional transformations), and imposes the same sort of hidden requirements we'd wanted to avoid by not enforcing the dump filestructure, just elsewhere, and worse, as it overlaps with content that's actually sent to Konnect (whereas the enforced filenames were internal to decK).

@hbagdi
Copy link
Member

hbagdi commented May 12, 2021

@rainest you might want to rebase the feat/konnect-documents branch on top of main and resolve all the conflicts and then rebase wip/konnect-documents on top of feat/konnect-documents. I expect there to be significant merge conflicts.

cmd/common_konnect.go Outdated Show resolved Hide resolved
solver/utils.go Outdated Show resolved Hide resolved
bContent = string(bBytes)
}
edits := myers.ComputeEdits(span.URIFromPath("old"), aContent, bContent)
contentDiff = fmt.Sprint(gotextdiff.ToUnified("old", "new", aContent, edits))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please rename old and new to actual path to the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both versions share the same local path--sort of, anyway, since one was pulled from the Konnect API. There aren't actually separate local filepaths to show.

We potentially could reconstruct the local paths and then append suffixes to get something like servicePackage/serviceVersion/foo.json.old and servicePackage/serviceVersion/foo.json.new, but it's rather convoluted, since these are state.Document structs (which contain the path value we actually get from/push to Konnect) rather than file.Document structs (where the path is the local path, which you see in konnect.yaml). IMO trying to pass that along from the konnect.yaml read onward into the state struct and hiding it from the diff mechanism probably isn't worth the effort at present.

solver/utils.go Show resolved Hide resolved
utils/utils.go Show resolved Hide resolved
@hbagdi
Copy link
Member

hbagdi commented May 12, 2021

With a single field, we effectively discard the original path value retrieved from the API when dumping, and must rebuild it using StripLocalDocumentPath() and FilenameToName(). This is additional complexity, breaks with other decK objects (which we can marshal and unmarshal without additional transformations), and imposes the same sort of hidden requirements we'd wanted to avoid by not enforcing the dump filestructure, just elsewhere, and worse, as it overlaps with content that's actually sent to Konnect (whereas the enforced filenames were internal to decK).

This is certainly more complex in decK's code but it makes the end-user experience better by a) being flexible (flexibility is good ro not is a separate discussion) b) hiding away the wrongs of the API of Konnect.

the same sort of hidden requirements we'd wanted to avoid by not enforcing the dump filestructure

Can you elaborate? How does it impose hidden requirements?
I do see the point that it overlaps with the setting in Konnect API but I think that's on Konnect API to solve and not decK.

Kong's API was mature enough that we could simply dump configuration as we saw in the API. This is not the case with Konnect's API so we can't treat both as the same thing. The argument of "this is how it was done for Kong" (unfortunately) doesn't apply to Konnect objects. This is sadly the case because of realities we live in. I do not like what we are doing here either but we have to put the end-user workflow and experience above code complexities in this area.

Travis Raines and others added 9 commits May 12, 2021 12:52
Rework the loop that retrieves documents associated with service
versions to iterate over version objects in the inner loop rather than
iterating over version indices.

The previous version of this loop could encounter a concurrency bug
where the version passed to ListAllForParent() was not the version in
the outer anonymous function, causing deck to fetch the same document
twice and fail current state building validity checks. The exact cause
of this bug is not clear (we do not modify the service package slice and
specified indices in the anonymous function call), but it does not
appear to occur with the revised loop.
Co-authored-by: Harry <harrybagdi@gmail.com>
@rainest
Copy link
Contributor Author

rainest commented May 12, 2021

you might want to rebase the feat/konnect-documents branch on top of main and resolve all the conflicts and then rebase wip/konnect-documents on top of feat/konnect-documents. I expect there to be significant merge conflicts.

IMO merge this into feat/konnect-documents and then merge main into that before PRing feat/konnect-documents into main. There are a few merge conflicts, but nothing difficult to resolve based on an initial trial run. Rebasing would probably overcomplicate things because history is pretty long at this point.

Re #358 (comment) the issue I have now is with the dual usage of the path field. We'd still offer the same level of flexibility to arrange content files as you like with separate path (for the Konnect API object) and localPath fields.

The hidden limitation arises from the translation between the value in the state file and the value sent to Konnect. If, for instance, you edit the state file to add a new version and document, you have to know that decK will strip off all but the filename from the path you use when setting the path value in Konnect. There's a tradeoff between economizing on state file fields and fields that have a distinct purpose that they satisfy well. The latter also being cleaner to implement is an additional point for that option in a holistic evaluation of the feature from both a UX and code maintainability/readability perspective.

@rainest rainest requested a review from hbagdi May 12, 2021 23:20
@hbagdi
Copy link
Member

hbagdi commented May 13, 2021

IMO merge this into feat/konnect-documents and then merge main into that before PRing feat/konnect-documents into main. There are a few merge conflicts, but nothing difficult to resolve based on an initial trial run. Rebasing would probably overcomplicate things because history is pretty long at this point.

That will complicate the history a bit but that's fine too then.
Please organize the commits properly in that case. All the wip commits need to go away - commits should tell a story and not contain retakes of various scenes. Please consult the history of this repo for an idea if you need one.

Copy link
Member

@hbagdi hbagdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there.

Parent: &konnect.ServiceVersion{
ID: kong.String("abc"),
},
Content: kong.String("{\"foo\":\"foo\",\"bar\":\"bar\"}"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace this and all other JSON strings with unescaped versions.

Suggested change
Content: kong.String("{\"foo\":\"foo\",\"bar\":\"bar\"}"),
Content: kong.String(`{"foo":"foo","bar":"bar"}`),

solver/utils.go Outdated Show resolved Hide resolved
solver/utils_test.go Outdated Show resolved Hide resolved
.golangci.yml Outdated Show resolved Hide resolved
@rainest rainest requested a review from hbagdi May 14, 2021 16:58
"b": 2
},
"foo": "foo"
}`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you find this ugly, you could use https://github.com/lithammer/dedent to make this better. We should absolutely do this, not a blocker for merge.

Copy link
Member

@hbagdi hbagdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your work here!
I'll leave merge to you as per your convenience.

@rainest rainest merged commit dc55797 into feat/konnect-documents May 17, 2021
@rainest rainest deleted the wip/konnect-documents branch May 17, 2021 20:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants