Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Terraform inappropriratly renames columns instead of adding/removing what is being specified #753

Closed
asosnovsky opened this issue Nov 9, 2021 · 9 comments
Labels
bug Used to mark issues with provider's incorrect behavior

Comments

@asosnovsky
Copy link

asosnovsky commented Nov 9, 2021

Provider Version: 0.25.19

Terraform Version: 1.0.10

Describe the bug

Say I have a deployment like this

resource "snowflake_database" "db" {
  name = "test_db"
}

resource "snowflake_schema" "schema" {
  database = snowflake_database.db.name
  name     = "example"
}

resource "snowflake_table" "tb" {
  database = snowflake_database.db.name
  schema   = snowflake_schema.schema.name
  name     = "tb1"
  column {
    name    = "name"
    type    = "VARCHAR(255)"
    comment = "User name"
  }
  column {
    name    = "age"
    type    = "NUMERIC"
    comment = "User's age"
  }
}

I applied the changes, and then one day a requirement is added, we need a new column added to tb1!
So when I include the column:

resource "snowflake_table" "tb" {
  database = snowflake_database.db.name
  schema   = snowflake_schema.schema.name
  name     = "tb1"
  column {
    name    = "name"
    type    = "VARCHAR(255)"
    comment = "User name"
  }
  column {
    name = "new_col"
    type = "VARCHAR(255)"
  }
  column {
    name    = "age"
    type    = "NUMERIC"
    comment = "User's age"
  }
}

I would expect there to be a change that specifies a new column is to be added, instead I get this:
image

Notice how the column is shifting everything down, renames an existing column and then adds a column that was already there.

Additionally, I noticed that if after this apply, I were to say remove the 'user name' i.e.

resource "snowflake_table" "tb" {
  database = snowflake_database.db.name
  schema   = snowflake_schema.schema.name
  name     = "tb1"
  column {
    name = "new_col"
    type = "VARCHAR(255)"
  }
  column {
    name    = "age"
    type    = "NUMERIC"
    comment = "User's age"
  }
}

It shifts all the columns upward...
image

I suspect this is due to the implementation detail of storing the NewColumn as a list and not checking if we are actually looking at the same column. Perhaps generating a unique hash to represent each column during the creation phase would resolve this issue (though it may be a big breaking change).

@asosnovsky asosnovsky added the bug Used to mark issues with provider's incorrect behavior label Nov 9, 2021
@dlawrences
Copy link

dlawrences commented Nov 15, 2021

This seems to be happening depending on how the order of the initial column names is affected by the addition of a new column. Say you add something that is going to find itself at the end of the list of updated columns, sorted alphabetically descending. The change will be the expected one. Otherwise it will "shift" columns as you correctly pointed out.

Per my own research, it seems to be something done per design by the Terraform Diff, and as described here there are two main solutions:

  • either change the schema to correctly qualify the objects that shouldn't be diff'd based on their order in a list or
  • use a Customized Diff

I find it surprising this has only been identified right now, although support for column additions has been added prior to 0.25.0 through #503, so it's been out there for a few good months

@dlawrences
Copy link

Btw, per our testing, it seems it is only the plan that is wrong and the actual deployment that happens against the table is per what you would expect. Seems this is because in
https://github.com/chanzuckerberg/terraform-provider-snowflake/blob/5b46be640d16086bbfddabbba3e206d2b15e78ad/pkg/resources/table.go#L624-L692

the changes are re-identified correctly at column name level instead of based on indexes. Can you confirm this is also what you see? i.e. the terraform plan and the terraform apply changes are unreliable, one should apply and see the actual effect.

@asosnovsky
Copy link
Author

asosnovsky commented Nov 18, 2021

@dlawrences you seem to be correct! the plan is wrong but it is deployed correctly!

I tested this by running the original terraform above, then inserting some data to the tables.
Then running the second terraform (where I add a column), and the data in snowflake appeared to be as expected!

Do you think that this should at least be mentioned somewhere in the docs, just so as to warn people that the plan may just show the wrong diff, while the deployment of the change is as expected?

@dlawrences
Copy link

I believe so @asosnovsky, I'll try shipping a PR on the docs for the team to review unless you want to go at it first

@dlawrences
Copy link

dlawrences commented Jan 29, 2022

Some updates here from my end. This matter still persists and the main reason is that Snowflake and the provider obviously care about the order of columns in the table resource.

This is apparent once you dig through the provider resources. The table resource uses TypeList as the type of the "column" property.

https://github.com/chanzuckerberg/terraform-provider-snowflake/blob/3b279055b66cd62f43da05559506f1afa282aa16/pkg/resources/table.go#L47

TypeList is a type that Terraform considers as ordered, so that means that any change in the order of list elements will trigger changes (potentially destructive).

The order of the columns will basically translate to the positional order in the database since columns are added in the order in which they are located within the list.

How we managed to solve it? Well, I need to give a little bit more background first:

On our end, we enable developer configuration of tables through the usage of YAML files (1 yaml file = 1 table). The developer configures columns in the right order in the YAML file and they expect them to be deployed in that very order.

We parse out the text from these YAML files and generate resources of this provider dynamically through some modules we have created internally. On our end, we reproduced the issue because, in the first place, we were coalescing all the columns under a map using the following structure:

columns = {
  "column_name_1" = {
    type = "type of the column"
    comment = "comment of the column"
    nullable = "nullable property"
    default = "default value strategy"
  }
}

As you can tell, the name of columns act as keys within the map. And after digging some more, I found out that map keys are iterated in lexicographic order always.

This basically explains why we were seeing all these weird changes being present in the plan output. When you would add a new column that would break the previous lexicographic order, since Terraform cares about the order of elements in the column property of the resource, the changes would appear as destructive.

To avoid this, we changed the representation of columns to be a list instead of a map and added the name as a property of each object instead:

columns = [
  {
    name = "column name"
    type = "type of the column"
    comment = "comment of the column"
    nullable = "nullable property"
    default = "default value strategy"
  }
]

When populating such a list of columns, the order in which the text of each column is parsed from YAML is retained. This means that if you add a column that still breaks the lexicographic order after of all columns in the YAML file, the column will not be destructive as previously since the order of elements in the list is not affected by their name anymore.

@truepill-fiona-tang
Copy link

truepill-fiona-tang commented Jun 15, 2022

Are there any plans to resolve this? Our plans are getting very noisy because of this issue, making it tricky to see what the genuine changes are.

@uristernik
Copy link

Any news on that?

@sfc-gh-asawicki
Copy link
Collaborator

We are closing this issue as part of a cleanup described in announcement. If you believe that the issue is still valid in v0.89.0, please open a new ticket.

@sfc-gh-asawicki sfc-gh-asawicki closed this as not planned Won't fix, can't repro, duplicate, stale Apr 30, 2024
@sfc-gh-jcieslak
Copy link
Collaborator

sfc-gh-jcieslak commented Oct 23, 2024

Hi all 👋
Recently, we've done more in-depth research on a few topics; one of them was ignoring the list item's order after creation and handling individual list item updates. When applied to the table columns use case it should solve most of the issues, but some of them won't, like column shift after dropping the first column (it may update the columns instead, but that shouldn't be an issue). A more detailed summary of the research should be posted (and announced in the discussions) soon, so stay tuned. The fix for table columns should be adjusted with the overall table refactor for v1 (that is marked as an essential object). As the issue is closed and the author doesn't mind the thread can be still monitored by looking at #420.

sfc-gh-jcieslak added a commit that referenced this issue Nov 5, 2024
Test cases for changes in lists and sets. From the given time, I only
went through essential cases that consisted of:
- Ignoring the order of list items after creation.
- Having the ability to update an item while ignoring the order.
For the testing, I created a test resource because currently, we don't
have anything to test more complex examples of certain resource
behaviors (temporary providers we create for custom diff testing are not
sufficient in this case). The resource is only added to the resource
list whenever a special env is set (we could remove it entirely and
leave some documentation in the resource file (and acc test file) on how
to prepare for the tests). The imitation of external storage was done by
creating a struct and its global instance (some of the things needed to
be exported since external changes tested in acceptance tests needed to
access those). Certain resource fields were researched to test different
approaches, each is described in the implementation file.

Also added an assert on lists/sets that is able to assert items in order
independent manner (it was needed for the tests of the proposals).

> Note: Only lists were tested, because there was no major issue with
them in current resources. For the lists the following issues were
addressed: #420, #753

## Next pr
- Apply (parameterized tests in object renaming test cases)
#3130 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Used to mark issues with provider's incorrect behavior
Projects
None yet
Development

No branches or pull requests

6 participants