Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider Method for Disabling Legacy SDK Data Consistency Warning Log Demotion #1227

Closed
bflad opened this issue Jul 27, 2023 · 7 comments · Fixed by #1229
Closed

Consider Method for Disabling Legacy SDK Data Consistency Warning Log Demotion #1227

bflad opened this issue Jul 27, 2023 · 7 comments · Fixed by #1229
Labels
enhancement New feature or request
Milestone

Comments

@bflad
Copy link
Contributor

bflad commented Jul 27, 2023

Terraform Version

Terraform v1.5.4
on darwin_arm64

Use Cases

Terraform provider resources written with the legacy terraform-plugin-sdk Go module have a special flag enabled in the protocol which demotes certain cases of data consistency errors into warning logs:

TIMESTAMP [WARN]  Provider "TYPE" produced an invalid plan for ADDRESS, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:
      - .ATTRIBUTE: planned value cty.False for a non-computed attribute

If the problematic attribute value that raises the warning log is referenced elsewhere in a configuration, Terraform will raise an error diagnostic in the receiving location, which can help with finding these cases, but has a relatively low hit rate as compared to every attribute value that has this issue. This was a pragmatic choice for the existing provider ecosystem since at the time there was not a updated provider development SDK to match Terraform 0.12 and later concepts. It would have been extremely difficult for practitioners to upgrade Terraform if a part of doing so would have been to force all providers to fix all instances of these data inconsistency errors.

Fast forward to the present, the newer terraform-plugin-framework Go module is available, but it intentionally does not enable the protocol flag since a goal for moving the provider ecosystem forward is to not support errant provider implementations with respect to Terraform's intended data handling rules. Provider developers are embarking on the journey of migrating their terraform-plugin-sdk based providers to terraform-plugin-framework, and potentially encountering these data consistency errors for the first time either during initial acceptance testing after migrating their code or in worst case scenarios, after its been released after the migration.

At scale with larger providers, such as a major cloud provider, it is difficult to know about these underlying issues since they only exist in logging. Most practitioners/environments are not running with Terraform's logging capabilities enabled (e.g. setting the TF_LOG environment variable), and even for environments where this is enabled, it is difficult for developers to discover and remediate these issues since there are no native provider acceptance testing capabilities to detect this situation short of writing acceptance tests which explicitly check every attribute value in every resource.

Attempted Solutions

Rely on practitioner bug reports of the downstream resource error (when triggered) or provider developers enabling logging in acceptance testing environments and searching those logs for the relevant log entries.

Proposal

Introduce an environment variable so provider developers can have these errors raised during provider acceptance testing. In the core logic that checks for the legacy type system flag, conditionalize the warning log demotion on also not having the environment variable set. Normal practitioners should never encounter the newer behavior. Provider developers have the option of setting this environment variable in their provider acceptance testing environments. The terraform-plugin-framework migration guide can explicitly mention this environment variable as a preparation step.

References

No response

@bflad bflad added the enhancement New feature or request label Jul 27, 2023
@bflad
Copy link
Contributor Author

bflad commented Jul 27, 2023

Another potential option we have would be doing something similar using an environment variable within terraform-plugin-sdk itself to not set the legacy type system protocol flag at all. I'm not sure if there would be additional considerations on the core side of doing this though.

@jbardin
Copy link
Member

jbardin commented Jul 28, 2023

If the SDK were to just not set the flag, I think you would get the same behavior you are proposing, and I think it would be the easiest solution to conditionally switching the behavior.

One thing to consider though is that many of these errors are unavoidable with the legacy SDK, so enabling the strict protocol handling will make many resources non-functional. This will raise the error to the developer of course, but there's nothing that can be done about it without proactively moving to the plugin-framework. Since it requires proactive involvement regardless, are you sure it's more useful than directly looking for the warnings in the logs?

@apparentlymart
Copy link
Contributor

apparentlymart commented Jul 28, 2023

I agree with @jbardin that I'd prefer a solution that's entirely within the SDKv2 codebase so that provider developers can enable it independently of what version of Terraform Core they are using.

If it were in the SDK then it also opens some additional options for how to enable it, instead of (or in addition to) environment variables:

  • A special build tag or linker-set variable at build time.
  • Something a provider developer can activate using an extra function call in their provider's init code.
  • The previous item but on a per-resource-type basis rather than globally. e.g. a new method on ResourceData that means "I'm intending not to need the legacy SDK exception for this response".

The difference here is admittedly quite marginal, but it would avoid any situations where end-users might intentionally or unintentionally enable this setting on released providers and then open confusing bug reports about it.

With that said, I also share @jbardin's concern that this would probably just raise unavoidable errors where the only path forward would be to switch to the framework anyway. I wonder if we can instead try to document better how to find and understand these warning messages as a preparation step while planning a migration to the framework.

@bflad
Copy link
Contributor Author

bflad commented Jul 31, 2023

One thing to consider though is that many of these errors are unavoidable with the legacy SDK, so enabling the strict protocol handling will make many resources non-functional.

With that said, I also share @jbardin's concern that this would probably just raise unavoidable errors where the only path forward would be to switch to the framework anyway.

I was hoping that we might be able to tailor the error handling on core's side to avoid raising errors for unavoidable legacy SDK behaviors, such as providers returning unexpected values without Computed being set, since enabling that in the legacy SDK brings other potentially undesirable behaviors of copying prior state, etc.


The real crux here is that there is no intention of a team spending a time to properly design something for this situation. The goal was to loosely help provider developers (this feature request came from a major cloud partner) find out about data consistency issues separately from migrating their resources to the new framework when dealing with hundreds of resources. I will transfer this issue to the terraform-plugin-sdk repository for future consideration though.

@bflad bflad transferred this issue from hashicorp/terraform Jul 31, 2023
bflad added a commit that referenced this issue Aug 8, 2023
…stem protocol flags

Reference: #1227

Introduces new `Resource` type `EnableLegacyTypeSystemApplyErrors` and `EnableLegacyTypeSystemPlanErrors` fields that prevent the `PlanResourceChange` and `ApplyResourceChange` protocol operation responses from setting the `UnsafeLegacyTypeSystem` flag. When Terraform encounters the `UnsafeLegacyTypeSystem` flag, it will demote certain data consistency issues from error diagnostics that would immediately prevent Terraform workflows from completing to warning logs, which can be difficult for provider developers to discover. Provider developers are not  generally expected to try enabling these settings unless discovering these data consistency issues upfront is desired before migrating resources to terraform-plugin-framework.

This setting is on the individual resource level because these errors can be quite prevalent in existing terraform-plugin-sdk resources and therefore it would be problematic to not give provider developers more control over the behavior.

The settings are split between plan and apply because certain

This change set includes new website documentation to discuss these data consistency issues and how to potentially resolve them more in-depth. It is expected that the framework migration guide will be updated to recommend enabling these settings before migrating to help provider developers understand the issues before migrating. The error resolution section has one particular error to start and it is expected that this section will grow over time as provider developers report the various data consistency errors that Terraform can raise.
bflad added a commit that referenced this issue Aug 8, 2023
…stem protocol flags

Reference: #1227

Introduces new `Resource` type `EnableLegacyTypeSystemApplyErrors` and `EnableLegacyTypeSystemPlanErrors` fields that prevent the `PlanResourceChange` and `ApplyResourceChange` protocol operation responses from setting the `UnsafeLegacyTypeSystem` flag. When Terraform encounters the `UnsafeLegacyTypeSystem` flag, it will demote certain data consistency issues from error diagnostics that would immediately prevent Terraform workflows from completing to warning logs, which can be difficult for provider developers to discover. Provider developers are not  generally expected to try enabling these settings unless discovering these data consistency issues upfront is desired before migrating resources to terraform-plugin-framework.

This setting is on the individual resource level because these errors can be quite prevalent in existing terraform-plugin-sdk resources and therefore it would be problematic to not give provider developers more control over the behavior.

The settings are split between plan and apply because the protocol flag is on two separate protocol operations and because certain errors for one operation may be unavoidable, so it leaves provider developers the option to still raise errors for the other operation.

This change set includes new website documentation to discuss these data consistency issues and how to potentially resolve them more in-depth. It is expected that the framework migration guide will be updated to recommend enabling these settings before migrating to help provider developers understand the issues before migrating. The error resolution section has one particular error to start and it is expected that this section will grow over time as provider developers report the various data consistency errors that Terraform can raise.
@bflad
Copy link
Contributor Author

bflad commented Aug 8, 2023

Raised #1229 to add helper/schema.Resource level flags for this and some website documentation around the issues.

bflad added a commit that referenced this issue Aug 8, 2023
…stem protocol flags

Reference: #1227

Introduces new `Resource` type `EnableLegacyTypeSystemApplyErrors` and `EnableLegacyTypeSystemPlanErrors` fields that prevent the `PlanResourceChange` and `ApplyResourceChange` protocol operation responses from setting the `UnsafeLegacyTypeSystem` flag. When Terraform encounters the `UnsafeLegacyTypeSystem` flag, it will demote certain data consistency issues from error diagnostics that would immediately prevent Terraform workflows from completing to warning logs, which can be difficult for provider developers to discover. Provider developers are not  generally expected to try enabling these settings unless discovering these data consistency issues upfront is desired before migrating resources to terraform-plugin-framework.

This setting is on the individual resource level because these errors can be quite prevalent in existing terraform-plugin-sdk resources and therefore it would be problematic to not give provider developers more control over the behavior.

The settings are split between plan and apply because the protocol flag is on two separate protocol operations and because certain errors for one operation may be unavoidable, so it leaves provider developers the option to still raise errors for the other operation.

This change set includes new website documentation to discuss these data consistency issues and how to potentially resolve them more in-depth. It is expected that the framework migration guide will be updated to recommend enabling these settings before migrating to help provider developers understand existing issues before migrating. The error resolution section has one particular error to start and it is expected that this section will grow over time as provider developers report the various data consistency errors that Terraform can raise.
bflad added a commit that referenced this issue Aug 24, 2023
…stem protocol flags (#1229)

Reference: #1227

Introduces new `Resource` type `EnableLegacyTypeSystemApplyErrors` and `EnableLegacyTypeSystemPlanErrors` fields that prevent the `PlanResourceChange` and `ApplyResourceChange` protocol operation responses from setting the `UnsafeLegacyTypeSystem` flag. When Terraform encounters the `UnsafeLegacyTypeSystem` flag, it will demote certain data consistency issues from error diagnostics that would immediately prevent Terraform workflows from completing to warning logs, which can be difficult for provider developers to discover. Provider developers are not  generally expected to try enabling these settings unless discovering these data consistency issues upfront is desired before migrating resources to terraform-plugin-framework.

This setting is on the individual resource level because these errors can be quite prevalent in existing terraform-plugin-sdk resources and therefore it would be problematic to not give provider developers more control over the behavior.

The settings are split between plan and apply because the protocol flag is on two separate protocol operations and because certain errors for one operation may be unavoidable, so it leaves provider developers the option to still raise errors for the other operation.

This change set includes new website documentation to discuss these data consistency issues and how to potentially resolve them more in-depth. It is expected that the framework migration guide will be updated to recommend enabling these settings before migrating to help provider developers understand existing issues before migrating. The error resolution section has one particular error to start and it is expected that this section will grow over time as provider developers report the various data consistency errors that Terraform can raise.
@bflad bflad added this to the v2.28.0 milestone Aug 24, 2023
@apparentlymart
Copy link
Contributor

I missed the reply above before and I don't think what I have to say here really changes what you implemented but just for posterity in case someone finds this later and wonders where that dangling thread leads... 😀

The flag in the provider protocol that this is now allowing provider developers to choose to switch off was at the time the most granular kind of detection we could imagine, based both on how the provider schema works and on how Terraform Core deals with provider configurations internally.

While it's possible that with fresh eyes we could imagine a more granular form of it that e.g. allows disregarding only a subset of the possible errors based on additional details in the schema, I don't think any of us is still familiar enough with the details of that tradeoff -- which itself arose from carefully navigating conflicting constraints both on the Terraform Core and the SDK sides -- to be able to design such a thing without some non-trivial research effort first, which is hard to justify for a legacy escape hatch that we're trying to move away from.

What you implemented here seems like a nice compromise, which is hopefully granular enough to make it useful as a migration aid without burdening provider developers with all of the legacy SDK's quirks all at once.

techknowlogick pushed a commit to go-gitea/terraform-provider-gitea that referenced this issue Sep 8, 2023
… to v2.29.0 (#17)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/hashicorp/terraform-plugin-sdk/v2](https://github.com/hashicorp/terraform-plugin-sdk) | require | minor | `v2.27.0` -> `v2.29.0` |

---

### Release Notes

<details>
<summary>hashicorp/terraform-plugin-sdk (github.com/hashicorp/terraform-plugin-sdk/v2)</summary>

### [`v2.29.0`](https://github.com/hashicorp/terraform-plugin-sdk/releases/tag/v2.29.0)

[Compare Source](hashicorp/terraform-plugin-sdk@v2.28.0...v2.29.0)

NOTES:

-   all: This Go module has been updated to Go 1.20 per the [Go support policy](https://go.dev/doc/devel/release#policy). It is recommended to review the [Go 1.20 release notes](https://go.dev/doc/go1.20) before upgrading. Any consumers building on earlier Go versions may experience errors. ([#&#8203;1245](hashicorp/terraform-plugin-sdk#1245))

FEATURES:

-   helper/schema: Upgrade to protocol version 5.4, which can significantly reduce memory usage with Terraform 1.6 and later when a configuration includes multiple instances of the same provider ([#&#8203;1234](hashicorp/terraform-plugin-sdk#1234))

ENHANCEMENTS:

-   helper/validation: Added `AllDiag` and `AnyDiag`, which are `SchemaValidateDiagFunc` variants of `All` and `Any` ([#&#8203;1155](hashicorp/terraform-plugin-sdk#1155))
-   helper/validation: Added quoting in `StringInSlice` error diagnostic output to prevent confusion with values that contain spaces ([#&#8203;464](hashicorp/terraform-plugin-sdk#464))

### [`v2.28.0`](https://github.com/hashicorp/terraform-plugin-sdk/releases/tag/v2.28.0)

[Compare Source](hashicorp/terraform-plugin-sdk@v2.27.0...v2.28.0)

NOTES:

-   helper/schema: The `Resource` type `EnableApplyLegacyTypeSystemErrors` and `EnablePlanLegacyTypeSystemErrors` fields can be enabled to more easily discover resource data consistency errors which Terraform would normally demote to warning logs. Before enabling the flag in a production release for a resource, the resource should be exhaustively acceptance tested as there may be unrecoverable error situations for practitioners. It is recommended to first enable and test in environments where it is easy to clean up resources, potentially outside of Terraform. ([#&#8203;1227](hashicorp/terraform-plugin-sdk#1227))

ENHANCEMENTS:

-   helper/schema: Added `Resource` type `EnableLegacyTypeSystemApplyErrors` field, which will prevent Terraform from demoting data consistency errors to warning logs during `ApplyResourceChange` (`Create`, `Update`, and `Delete`) operations with the resource ([#&#8203;1227](hashicorp/terraform-plugin-sdk#1227))
-   helper/schema: Added `Resource` type `EnableLegacyTypeSystemPlanErrors` field, which can be used to prevent Terraform from demoting data consistency errors to warning logs during `PlanResourceChange` operations with the resource ([#&#8203;1227](hashicorp/terraform-plugin-sdk#1227))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi43OS4xIiwidXBkYXRlZEluVmVyIjoiMzYuNzkuMSIsInRhcmdldEJyYW5jaCI6Im1haW4ifQ==-->

Reviewed-on: https://gitea.com/gitea/terraform-provider-gitea/pulls/17
Co-authored-by: Renovate Bot <renovate-bot@gitea.com>
Co-committed-by: Renovate Bot <renovate-bot@gitea.com>
@github-actions
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
3 participants