Feature/question: how to distinguish between unparsed data and strings? #661
Comments
@felixfontein It's difficult to provide hints without understanding exactly what you want to achieve, but basically, anything possible with gopkg.in/yaml.v3 is also possible with this library. According to the YAML specification, these are interpreted as strings at the AST level. If you want to decode them into arbitrary Go types (such as time.Time or int), type information is required. If you need to determine the type at the AST level, you must add a tag like [...] Also, you can directly decode into a specified Go value using [...]
@goccy thanks for your reply! What I basically need to achieve is to read a YAML file, walk through its structure, and identify the type of each value. I need this to create SOPS' internal representation of the data (and comments), where keys and values in mappings and elements in lists have the right types. For that I looked into the AST output, since it seems similar to gopkg.in/yaml.v3's [...]
to get hold of the value - it will be of type [...] With goccy/go-yaml, I don't see how I can get that information from [...] So, given a [...]
- 12398132981498123981 # integer
- "12398132981498123981" # string
- 2024-01-01 # date
- "2024-01-01" # string
But goccy/go-yaml's [...]
Ok, I think I figured out how to distinguish a [...]
@felixfontein Yes, if you just need to determine whether it's a quoted string, that approach will work fine.
In case anyone has a similar problem, the following documents contain regular expressions that match all strings representing integers, floating point numbers, and timestamps supported by YAML:
I'm a bit torn about sexagesimal support. gopkg.in/yaml.v3, goccy/go-yaml, and ruamel.yaml do not seem to support it, though PyYAML does. (I guess it didn't always support it.)
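For illustration, here is a minimal sketch of what such a regex-based classification could look like in Go, using only the standard library. It does not use goccy/go-yaml at all: the caller is assumed to already know the scalar's text and whether it was quoted. The names Kind and ClassifyScalar are made up for this example, and the patterns are simplified versions of the integer and float rules of the YAML 1.2 core schema and of the YAML 1.1 timestamp type. Booleans, null, and sexagesimal numbers are deliberately left out.

```go
package main

import "regexp"

// Kind is a hypothetical label for what a YAML scalar represents.
type Kind string

const (
	KindInt       Kind = "int"
	KindFloat     Kind = "float"
	KindTimestamp Kind = "timestamp"
	KindString    Kind = "string"
)

var (
	// Decimal, octal (0o...), and hexadecimal (0x...) integers.
	intRe = regexp.MustCompile(`^(?:[-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+)$`)
	// Decimal floats with optional fraction and exponent.
	floatRe = regexp.MustCompile(`^[-+]?(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)(?:[eE][-+]?[0-9]+)?$`)
	// Infinity and not-a-number spellings.
	specialFloatRe = regexp.MustCompile(`^(?:[-+]?\.(?:inf|Inf|INF)|\.(?:nan|NaN|NAN))$`)
	// A plain date (YYYY-MM-DD) or a date-time with optional fraction and zone.
	timestampRe = regexp.MustCompile(`^(?:[0-9]{4}-[0-9]{2}-[0-9]{2}` +
		`|[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}(?:[Tt]|[ \t]+)[0-9]{1,2}:[0-9]{2}:[0-9]{2}` +
		`(?:\.[0-9]*)?(?:[ \t]*(?:Z|[-+][0-9]{1,2}(?::[0-9]{2})?))?)$`)
)

// ClassifyScalar guesses the kind of a scalar from its source text.
// Quoted scalars are always strings; only plain scalars are inspected.
func ClassifyScalar(raw string, quoted bool) Kind {
	if quoted {
		return KindString
	}
	switch {
	case intRe.MatchString(raw):
		return KindInt
	case floatRe.MatchString(raw), specialFloatRe.MatchString(raw):
		return KindFloat
	case timestampRe.MatchString(raw):
		return KindTimestamp
	}
	return KindString
}
```

With this sketch, ClassifyScalar("12398132981498123981", false) yields KindInt and ClassifyScalar("2024-01-01", false) yields KindTimestamp, while the quoted variants of both yield KindString. The classification works on the text alone, so it does not matter whether the number fits into a native Go type.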
Determining the Go type at the ast.Node level is not a common approach, so if you want to do it manually, you will need to implement the type-checking logic yourself. I want to make this project the de facto standard YAML library for Go. However, even though a large number of users are already using this library, the number of stars required for standardization is still significantly lacking. If you don’t mind, please lend me your support. If you haven’t starred it yet, please do so. I would also greatly appreciate it if you consider becoming a sponsor or recommending it to your friends. I hope that as this project grows, it will benefit all Go developers.
Is your feature request related to a problem? Please describe.
I'm looking at migrating SOPS to goccy/go-yaml. For that I need to be able to parse arbitrary YAML documents and process their structure without assumptions about what it should look like. While playing around with parser.ParseBytes() and the resulting AST, I noticed that I don't know how to figure out whether something that ends up as a StringNode is actually a string, or something else (date/timestamp, integer, float):
[...]
All the above sequence entries result in a StringNode. If you quote all the above numbers with "...", then you also get StringNodes with the same values. I don't see how to figure out from a StringNode whether it actually represents a string, or real data (like a date, timestamp, large integer, large float).
What gopkg.in/yaml.v3 does:
[...] time.Time objects.
[...] (98123918398129831987841872387138712837) is parsed as a floating point number (distorting the value).
Describe the solution you'd like
Describe alternatives you've considered
I'm not sure what the best way to proceed is. The numbers above have been picked so that they cannot be parsed as Golang integers or floats. (The date, and timestamps in general, can be represented by Golang types.)
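To make the size problem concrete, the following sketch uses only the Go standard library to check one of the numbers quoted above. The helper names fitsInt64, parseBigInt, and survivesFloat64 are hypothetical and exist only for this illustration. They show that the value overflows int64, that math/big can hold it without loss, and that a detour through float64 changes it.

```go
package main

import (
	"math/big"
	"strconv"
)

// fitsInt64 reports whether the decimal integer s can be held in an int64.
func fitsInt64(s string) bool {
	_, err := strconv.ParseInt(s, 10, 64)
	return err == nil
}

// parseBigInt parses a decimal integer of any size without losing digits.
func parseBigInt(s string) (*big.Int, bool) {
	return new(big.Int).SetString(s, 10)
}

// survivesFloat64 reports whether the decimal integer s is still exactly
// the same value after being parsed into a float64.
func survivesFloat64(s string) bool {
	exact, ok := new(big.Int).SetString(s, 10)
	if !ok {
		return false
	}
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		// Out of range for float64 (or not a number at all).
		return false
	}
	// Convert the float64 back to an integer and compare with the exact value.
	back, _ := new(big.Float).SetFloat64(f).Int(nil)
	return back.Cmp(exact) == 0
}
```

For 98123918398129831987841872387138712837, fitsInt64 returns false, parseBigInt returns a value whose String() is the original text, and survivesFloat64 returns false, which is the distortion described above.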
Maybe:
1. [...] time.Time to represent all dates and timestamps.)
2. [...] StringNode which tells what the data actually is (actual string; date/timestamp; integer; hexadecimal integer; octal integer; floating point number) if there is no native representation.
3. [...] StringNode with type info? (That would allow lossless transformations of floats, for example, making it possible to distinguish between 1.10 and 1.1.)
(Obviously all three can also be implemented together. I'm currently tending to like 2. and 3. most.)
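The point about lossless float handling can be sketched in plain Go, independent of any YAML library. TypedScalar and reformatFloat below are hypothetical names for this example and are not part of goccy/go-yaml. The idea is that keeping the source text next to a kind label preserves the difference between 1.10 and 1.1, while storing only the parsed float64 loses it.

```go
package main

import "strconv"

// TypedScalar is a hypothetical node shape that keeps the source text
// next to a label describing what the text represents.
type TypedScalar struct {
	Raw  string // text exactly as written in the YAML document
	Kind string // for example "float", "int", "timestamp", or "string"
}

// Float64 converts the text on demand; Raw stays untouched.
func (s TypedScalar) Float64() (float64, error) {
	return strconv.ParseFloat(s.Raw, 64)
}

// reformatFloat parses raw into a float64 and formats it again using the
// shortest representation, which is what happens when only the parsed
// value is stored.
func reformatFloat(raw string) (string, error) {
	f, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return "", err
	}
	return strconv.FormatFloat(f, 'f', -1, 64), nil
}
```

Here reformatFloat("1.10") returns "1.1", so the trailing zero is gone after a round trip through float64, whereas TypedScalar{Raw: "1.10", Kind: "float"} can still be written back unchanged.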
Ref: getsops/sops#1616