Miller 6.5.0 #1134

Merged
merged 1 commit into from
Nov 27, 2022
124 changes: 1 addition & 123 deletions docs/src/10min.md
@@ -39,9 +39,6 @@ purple,triangle,false,7,65,80.1405,5.8240
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

But `mlr cat` can also do format conversion -- for example, you can pretty-print in tabular format:
@@ -61,9 +58,6 @@ purple triangle false 7 65 80.1405 5.8240
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

`mlr head` and `mlr tail` count records rather than lines. Whether you're getting the first few records or the last few, the CSV header is included either way:
@@ -77,9 +71,6 @@ yellow,triangle,true,1,11,43.6498,9.8870
red,square,true,2,15,79.2778,0.0130
red,circle,true,3,16,13.8103,2.9010
red,square,false,4,48,77.5542,7.4670
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -91,9 +82,6 @@ purple,triangle,false,7,65,80.1405,5.8240
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -120,9 +108,6 @@ go tool pprof -http=:8080 foo-stream
"rate": 8.2430
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

You can sort on a single field:
@@ -142,9 +127,6 @@ purple square false 10 91 72.3735 8.2430
yellow triangle true 1 11 43.6498 9.8870
purple triangle false 5 51 81.2290 8.5910
purple triangle false 7 65 80.1405 5.8240
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

Or, you can sort primarily alphabetically on one field, then secondarily numerically descending on another field, and so on:
@@ -164,9 +146,6 @@ red square true 2 15 79.2778 0.0130
purple triangle false 7 65 80.1405 5.8240
purple triangle false 5 51 81.2290 8.5910
yellow triangle true 1 11 43.6498 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

If there are fields you don't want to see in your data, you can use `cut` to keep only the ones you want, in the same order they appeared in the input data:
@@ -186,9 +165,6 @@ triangle false
circle true
circle true
square false
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

You can also use `cut -o` to keep specified fields, but in your preferred order:
@@ -208,9 +184,6 @@ false triangle
true circle
true circle
false square
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

You can use `cut -x` to omit fields you don't care about:
@@ -230,9 +203,6 @@ purple 7 65 80.1405 5.8240
yellow 8 73 63.9785 4.2370
yellow 9 87 63.5058 8.3350
purple 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

Even though Miller's main selling point is name-indexing, sometimes you really want to refer to a field name by its positional index. Use `$[[3]]` to access the name of field 3 or `$[[[3]]]` to access the value of field 3:
@@ -252,9 +222,6 @@ purple triangle false 7 65 80.1405 5.8240
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -272,9 +239,6 @@ purple triangle NEW 7 65 80.1405 5.8240
yellow circle NEW 8 73 63.9785 4.2370
yellow circle NEW 9 87 63.5058 8.3350
purple square NEW 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

You can find the full list of verbs at the [Verbs Reference](reference-verbs.md) page.
@@ -292,9 +256,6 @@ red square true 2 15 79.2778 0.0130
red circle true 3 16 13.8103 2.9010
red square false 4 48 77.5542 7.4670
red square false 6 64 77.1991 9.5310
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -304,9 +265,6 @@ go tool pprof -http=:8080 foo-stream
color shape flag k index quantity rate
red square true 2 15 79.2778 0.0130
red circle true 3 16 13.8103 2.9010
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

## Computing new fields
@@ -331,9 +289,6 @@ purple triangle false 7 65 80.1405 5.8240 13.760388049450551 purple_triangl
yellow circle true 8 73 63.9785 4.2370 15.09995279679018 yellow_circle
yellow circle true 9 87 63.5058 8.3350 7.619172165566886 yellow_circle
purple square false 10 91 72.3735 8.2430 8.779995147397793 purple_square
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

When you create a new field, it can immediately be used in subsequent statements:
@@ -356,9 +311,6 @@ purple triangle false 7 65 80.1405 5.8240 66 4363
yellow circle true 8 73 63.9785 4.2370 74 5484
yellow circle true 9 87 63.5058 8.3350 88 7753
purple square false 10 91 72.3735 8.2430 92 8474
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

For `put` and `filter` we were able to type out expressions using a programming-language syntax.
@@ -379,9 +331,6 @@ Zone,Total MWh
17,39.8
24,7.4
30,50.5
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -393,9 +342,6 @@ Zone Total MWh
17 39.8
14 27.2
24 7.4
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

For `put` and `filter` expressions, use `${...}`:
@@ -409,9 +355,6 @@ Zone Total MWh Total KWh
17 39.8 39800
24 7.4 7400
30 50.5 50500
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

See also the [section on field names](reference-dsl-variables.md#field-names).
@@ -458,9 +401,6 @@ a,b,c
1,2,3
4,5,6
7,8,9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

## Chaining verbs together
@@ -475,12 +415,6 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

This works fine -- but Miller also lets you chain verbs together using the word `then`. Think of this as a Miller-internal pipe that lets you use fewer keystrokes:
@@ -493,9 +427,6 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

As another convenience, you can put the filename first using `--from`. When you're interacting with your data at the command line, this makes it easier to up-arrow and append to the previous command:
@@ -508,9 +439,6 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -524,9 +452,6 @@ shape quantity
square 72.3735
circle 63.5058
circle 63.9785
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

## Sorts and stats
@@ -543,9 +468,6 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

Lots of Miller commands take a `-g` option for group-by: here, `head -n 1 -g shape` outputs the first record for each distinct value of the `shape` field. This means we're finding the record with highest `index` field for each distinct `shape` field:
@@ -558,9 +480,6 @@ color shape flag k index quantity rate
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
purple triangle false 7 65 80.1405 5.8240
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

Statistics can be computed with or without group-by field(s):
@@ -574,9 +493,6 @@ shape quantity_count quantity_min quantity_mean quantity_max
triangle 3 43.6498 68.33976666666666 81.229
square 4 72.3735 76.60114999999999 79.2778
circle 3 13.8103 47.0982 63.9785
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -591,9 +507,6 @@ circle red 1 13.8103 13.8103 13.8103
triangle purple 2 80.1405 80.68475000000001 81.229
circle yellow 2 63.5058 63.742149999999995 63.9785
square purple 1 72.3735 72.3735 72.3735
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

If your output has a lot of columns, you can use XTAB format to line things up vertically for you instead:
@@ -611,9 +524,6 @@ rate_p75 8.5910
rate_p90 9.8870
rate_p99 9.8870
rate_p100 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

## Unicode and internationalization
@@ -646,9 +556,6 @@ UTF-8 data. For example:
κόκκινο κύκλος αληθινό 3 16 13.8103 2.9010
κίτρινο κύκλος αληθινό 8 73 63.9785 4.2370
κίτρινο κύκλος αληθινό 9 87 63.5058 8.3350
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -666,9 +573,6 @@ go tool pprof -http=:8080 foo-stream
κόκκινο τετράγωνο ψευδές 6 64 77.1991 9.5310
μοβ τρίγωνο ψευδές 7 65 80.1405 5.8240
μοβ τετράγωνο ψευδές 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -686,9 +590,6 @@ go tool pprof -http=:8080 foo-stream
желтый КРУГ истина 8 73 63.9785 4.2370 6
желтый КРУГ истина 9 87 63.5058 8.3350 6
фиолетовый КВАДРАТ ложь 10 91 72.3735 8.2430 10
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

## File formats and format conversion
@@ -788,9 +689,6 @@ a matter of specifying input-format and output-format flags:
"rate": 0.0130
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -800,9 +698,6 @@ go tool pprof -http=:8080 foo-stream
color,shape,flag,k,index,quantity,rate
yellow,triangle,true,1,11,43.6498,9.8870
red,square,true,2,15,79.2778,0.0130
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

However, if JSON data has map-valued or array-valued fields, Miller gives you choices on how to
@@ -843,9 +738,6 @@ We can convert this to CSV, or other tabular formats:
<pre class="pre-non-highlight-in-pair">
hostname,pid,req.id,req.method,req.path,req.host,req.headers.host,req.headers.user-agent,res.status_code,res.header.content-type,res.header.content-encoding
localhost,12345,6789,GET,api/check,foo.bar,bar.baz,browser,200,text,plain
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -863,9 +755,6 @@ req.headers.user-agent browser
res.status_code 200
res.header.content-type text
res.header.content-encoding plain
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

These transformations are reversible:
@@ -897,12 +786,6 @@ These transformations are reversible:
}
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

See the [flatten/unflatten page](flatten-unflatten.md) for more information.
@@ -992,14 +875,9 @@ If you like, you can first copy off your original data somewhere else, before do

Lastly, using `tee` within `put`, you can split your input data into separate files per one or more field names:

<pre class="pre-highlight-in-pair">
<pre class="pre-highlight-non-pair">
<b>mlr --csv --from example.csv put -q 'tee > $shape.".csv", $*'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
<b>cat circle.csv</b>