Have clean_whitespace re-run type inference #1464

Merged: 5 commits, Jan 1, 2024
4 changes: 2 additions & 2 deletions docs/src/manpage.md
@@ -2312,7 +2312,7 @@ MILLER(1) MILLER(1)
(class=math #args=1) Ceiling: nearest integer at or above.

1mclean_whitespace0m
(class=string #args=1) Same as collapse_whitespace and strip.
(class=string #args=1) Same as collapse_whitespace and strip, followed by type inference.

1mcollapse_whitespace0m
(class=string #args=1) Strip repeated whitespace from string.
@@ -3011,7 +3011,7 @@ MILLER(1) MILLER(1)
strmatch(12345, "34") is true

1mstrmatchx0m
(class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in constrast to the `=~` operator. As well, while the `=~` operator limits matches to \1 through \9, an arbitrary number are supported here.
(class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in contrast to the `=~` operator. As well, while the `=~` operator limits matches to \1 through \9, an arbitrary number are supported here.
Examples:
strmatchx("a", "abc") returns:
{
4 changes: 2 additions & 2 deletions docs/src/manpage.txt
@@ -2291,7 +2291,7 @@ MILLER(1) MILLER(1)
(class=math #args=1) Ceiling: nearest integer at or above.

1mclean_whitespace0m
(class=string #args=1) Same as collapse_whitespace and strip.
(class=string #args=1) Same as collapse_whitespace and strip, followed by type inference.

1mcollapse_whitespace0m
(class=string #args=1) Strip repeated whitespace from string.
@@ -2990,7 +2990,7 @@ MILLER(1) MILLER(1)
strmatch(12345, "34") is true

1mstrmatchx0m
(class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in constrast to the `=~` operator. As well, while the `=~` operator limits matches to \1 through \9, an arbitrary number are supported here.
(class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in contrast to the `=~` operator. As well, while the `=~` operator limits matches to \1 through \9, an arbitrary number are supported here.
Examples:
strmatchx("a", "abc") returns:
{
4 changes: 2 additions & 2 deletions docs/src/reference-dsl-builtin-functions.md
@@ -1209,7 +1209,7 @@ capitalize (class=string #args=1) Convert string's first character to uppercase

### clean_whitespace
<pre class="pre-non-highlight-non-pair">
clean_whitespace (class=string #args=1) Same as collapse_whitespace and strip.
clean_whitespace (class=string #args=1) Same as collapse_whitespace and strip, followed by type inference.
</pre>


@@ -1364,7 +1364,7 @@ strmatch(12345, "34") is true

### strmatchx
<pre class="pre-non-highlight-non-pair">
strmatchx (class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in constrast to the `=~` operator. As well, while the `=~` operator limits matches to \1 through \9, an arbitrary number are supported here.
strmatchx (class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in contrast to the `=~` operator. As well, while the `=~` operator limits matches to \1 through \9, an arbitrary number are supported here.
Examples:
strmatchx("a", "abc") returns:
{
4 changes: 2 additions & 2 deletions man/manpage.txt
@@ -2291,7 +2291,7 @@ MILLER(1) MILLER(1)
(class=math #args=1) Ceiling: nearest integer at or above.

1mclean_whitespace0m
(class=string #args=1) Same as collapse_whitespace and strip.
(class=string #args=1) Same as collapse_whitespace and strip, followed by type inference.

1mcollapse_whitespace0m
(class=string #args=1) Strip repeated whitespace from string.
@@ -2990,7 +2990,7 @@ MILLER(1) MILLER(1)
strmatch(12345, "34") is true

1mstrmatchx0m
(class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in constrast to the `=~` operator. As well, while the `=~` operator limits matches to \1 through \9, an arbitrary number are supported here.
(class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in contrast to the `=~` operator. As well, while the `=~` operator limits matches to \1 through \9, an arbitrary number are supported here.
Examples:
strmatchx("a", "abc") returns:
{
4 changes: 2 additions & 2 deletions man/mlr.1
@@ -3100,7 +3100,7 @@ Map example: apply({"a":1, "b":3, "c":5}, func(k,v) {return {toupper(k): v ** 2}
.RS 0
.\}
.nf
(class=string #args=1) Same as collapse_whitespace and strip.
(class=string #args=1) Same as collapse_whitespace and strip, followed by type inference.
.fi
.if n \{\
.RE
@@ -4675,7 +4675,7 @@ strmatch(12345, "34") is true
.RS 0
.\}
.nf
(class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \e1, \e2, etc. are not set, in constrast to the `=~` operator. As well, while the `=~` operator limits matches to \e1 through \e9, an arbitrary number are supported here.
(class=string #args=2) Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \e1, \e2, etc. are not set, in contrast to the `=~` operator. As well, while the `=~` operator limits matches to \e1 through \e9, an arbitrary number are supported here.
Examples:
strmatchx("a", "abc") returns:
{
3 changes: 2 additions & 1 deletion pkg/bifs/strings.go
@@ -344,11 +344,12 @@ func BIF_capitalize(input1 *mlrval.Mlrval) *mlrval.Mlrval {

// ----------------------------------------------------------------
func BIF_clean_whitespace(input1 *mlrval.Mlrval) *mlrval.Mlrval {
return BIF_strip(
mv := BIF_strip(
BIF_collapse_whitespace_regexp(
input1, _whitespace_regexp,
),
)
return mlrval.FromInferredType(mv.String())
}

// ================================================================
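
The change to `BIF_clean_whitespace` above can be pictured with a standalone sketch. It does not use Miller's `mlrval` package: `inferType` and `cleanWhitespace` are hypothetical names, and the inference here (try int, then float, else keep the string) is a simplified assumption rather than Miller's actual rules. It only illustrates why re-running inference after stripping matters: a field read as `" 4"` becomes the number 4 instead of the string `"4"`.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// whitespaceRun matches one or more consecutive whitespace characters.
var whitespaceRun = regexp.MustCompile(`\s+`)

// inferType is a simplified, hypothetical stand-in for type inference:
// it tries int, then float, and otherwise keeps the value as a string.
// Miller's real inference rules are richer than this.
func inferType(s string) any {
	if i, err := strconv.ParseInt(s, 10, 64); err == nil {
		return i
	}
	if f, err := strconv.ParseFloat(s, 64); err == nil {
		return f
	}
	return s
}

// cleanWhitespace collapses repeated whitespace to a single space,
// strips leading and trailing whitespace, then re-infers the type
// from the cleaned text.
func cleanWhitespace(s string) any {
	collapsed := whitespaceRun.ReplaceAllString(s, " ")
	stripped := strings.TrimSpace(collapsed)
	return inferType(stripped)
}

func main() {
	for _, in := range []string{" 4", "  3.5 ", " a   b "} {
		out := cleanWhitespace(in)
		fmt.Printf("%q -> %v (%T)\n", in, out, out)
	}
}
```

In this sketch `" 4"` comes back as an `int64`, which mirrors what the new test case checks: after `clean-whitespace`, `typeof($d)` is `int` and `$d + 5` is computed numerically.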
4 changes: 2 additions & 2 deletions pkg/dsl/cst/builtin_function_manager.go
@@ -355,7 +355,7 @@ used within subsequent DSL statements. See also "Regular expressions" at ` + lib
{
name: "strmatchx",
class: FUNC_CLASS_STRING,
help: `Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in constrast to the ` + "`=~` operator. As well, while the `=~` operator limits matches to \\1 through \\9, an arbitrary number are supported here.",
help: `Extended information for whether the stringable first argument matches the regular-expression second argument. Regex captures are provided in the return-value map; \1, \2, etc. are not set, in contrast to the ` + "`=~` operator. As well, while the `=~` operator limits matches to \\1 through \\9, an arbitrary number are supported here.",
examples: []string{
`strmatchx("a", "abc") returns:`,
` {`,
@@ -444,7 +444,7 @@ used within subsequent DSL statements. See also "Regular expressions" at ` + lib
{
name: "clean_whitespace",
class: FUNC_CLASS_STRING,
help: "Same as collapse_whitespace and strip.",
help: "Same as collapse_whitespace and strip, followed by type inference.",
unaryFunc: bifs.BIF_clean_whitespace,
},

1 change: 1 addition & 0 deletions test/cases/dsl-clean-whitespace/0010/cmd
@@ -0,0 +1 @@
mlr --icsv --ojson clean-whitespace then put -f ${CASEDIR}/mlr ${CASEDIR}/input.csv
Empty file.
18 changes: 18 additions & 0 deletions test/cases/dsl-clean-whitespace/0010/expout
@@ -0,0 +1,18 @@
[
{
"a": 1,
"b": 2,
"c": 3,
"d": 4,
"e": 9,
"t": "int"
},
{
"a": 5,
"b": 6,
"c": 7,
"d": 8,
"e": 13,
"t": "int"
}
]
3 changes: 3 additions & 0 deletions test/cases/dsl-clean-whitespace/0010/input.csv
@@ -0,0 +1,3 @@
a, b, c, d
1, 2, 3, 4
5, 6, 7, 8
2 changes: 2 additions & 0 deletions test/cases/dsl-clean-whitespace/0010/mlr
@@ -0,0 +1,2 @@
$e = $d + 5;
$t = typeof($d)