scanner.Scanner/lexer.Tokenize does not properly decrement IndentLevel #672

Open
nmiyake opened this issue Feb 24, 2025 · 0 comments
Labels
bug Something isn't working

nmiyake commented Feb 24, 2025

Describe the bug
When using lexer.Tokenize to tokenize input, a token's IndentLevel is not reset correctly when the indentation decreases by more than one level: the value is decremented by only 1, regardless of how many levels the indentation actually drops.

As a concrete example, for the input YAML:

top-level-one-indent-0:
  one-indent-1:
    - one-indent-2:
        one-indent-3:
          one-indent-4: four-value
top-level-two-indent-0:
  two-indent-1:
    two-indent-2:
      two-indent-3: three-value

The indent level for top-level-two-indent-0 should be 0, but it is reported as 3 (because the last indent level for one-indent-4 was 4, and when the scanner processes top-level-two-indent-0, it only decrements the indent level by 1). This mismatch is also propagated down through all child nodes.

To Reproduce

Go Playground:

https://go.dev/play/p/QI6toGvzMCE

Self-contained test file:

package yamlpatch

import (
	"fmt"
	"testing"

	"github.com/goccy/go-yaml/lexer"
	"github.com/goccy/go-yaml/token"
)

func TestTokenize(t *testing.T) {
	const in = `top-level-one-indent-0:
  one-indent-1:
    - one-indent-2:
        one-indent-3:
          one-indent-4: four-value
top-level-two-indent-0:
  two-indent-1:
    two-indent-2:
      two-indent-3: three-value`

	getTokenFn := func(val string, tokens token.Tokens) *token.Token {
		for _, currToken := range tokens {
			if currToken.Value == val {
				return currToken
			}
		}
		return nil
	}
	printPosFn := func(val string, tokens token.Tokens) {
		fmt.Printf("%s: %#v\n", val, getTokenFn(val, tokens).Position)
	}

	tokens := lexer.Tokenize(in)

	// top-level-one-indent-0: &token.Position{Line:1, Column:1, Offset:1, IndentNum:0, IndentLevel:0}
	printPosFn("top-level-one-indent-0", tokens)

	// one-indent-1: &token.Position{Line:2, Column:3, Offset:27, IndentNum:2, IndentLevel:1}
	printPosFn("one-indent-1", tokens)

	// one-indent-2: &token.Position{Line:3, Column:7, Offset:47, IndentNum:4, IndentLevel:2}
	printPosFn("one-indent-2", tokens)

	// one-indent-3: &token.Position{Line:4, Column:9, Offset:69, IndentNum:8, IndentLevel:3}
	printPosFn("one-indent-3", tokens)

	// one-indent-4: &token.Position{Line:5, Column:11, Offset:93, IndentNum:10, IndentLevel:4}
	printPosFn("one-indent-4", tokens)

	// top-level-two-indent-0: &token.Position{Line:6, Column:1, Offset:118, IndentNum:0, IndentLevel:3}
	printPosFn("top-level-two-indent-0", tokens)
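	gotPos := getTokenFn("top-level-two-indent-0", tokens).Position
	if gotPos.IndentLevel != 0 {
		// Report the failure through the test framework instead of panicking,
		// so the remaining positions are still printed.
		t.Errorf("wrong indent level: expected 0, was %d", gotPos.IndentLevel)
	}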

	// two-indent-1: &token.Position{Line:7, Column:3, Offset:144, IndentNum:2, IndentLevel:4}
	printPosFn("two-indent-1", tokens)

	// two-indent-2: &token.Position{Line:8, Column:5, Offset:162, IndentNum:4, IndentLevel:5}
	printPosFn("two-indent-2", tokens)

	// two-indent-3: &token.Position{Line:9, Column:7, Offset:182, IndentNum:6, IndentLevel:6}
	printPosFn("two-indent-3", tokens)
}

Expected behavior
For this specific input, the IndentLevel for top-level-two-indent-0 should be 0.

In general, the scanner's updateIndentLevel function should decrement the indent level based on the content/context instead of always decrementing by just 1. The relevant code is in go-yaml/scanner/scanner.go, lines 150 to 154 in abc7083:

} else if s.prevLineIndentNum > s.indentNum {
	if s.indentLevel > 0 {
		s.indentLevel--
	}
}

@nmiyake nmiyake added the bug Something isn't working label Feb 24, 2025
@nmiyake nmiyake changed the title lexer.Tokenize does not properly decrement IndentLevel scanner.Scanner/lexer.Tokenize does not properly decrement IndentLevel Feb 24, 2025