Commit

Release v1.17.8
- Allow URLs and Table identification functions to be used as table identifiers.
mithrandie committed Jul 24, 2022
2 parents bd009ac + 492d021 commit 1fd08b0
Showing 44 changed files with 4,341 additions and 2,896 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,11 @@
# Change Log

## Version 1.17.8

Released on Jul 24, 2022

- Allow URLs and Table identification functions to be used as table identifiers.

## Version 1.17.7

Released on Jul 3, 2022
13 changes: 12 additions & 1 deletion README.md
@@ -6,7 +6,7 @@ SQL-like query language for csv
[![codecov](https://codecov.io/gh/mithrandie/csvq/branch/master/graph/badge.svg)](https://codecov.io/gh/mithrandie/csvq)
[![License: MIT](https://img.shields.io/badge/License-MIT-lightgrey.svg)](https://opensource.org/licenses/MIT)

csvq is a command line tool to operate CSV files.
Csvq is a command line tool to operate CSV files.
You can read, update, and delete CSV records with SQL-like queries.

You can also execute multiple operations sequentially in managed transactions by passing a procedure or using the interactive shell.
@@ -16,6 +16,17 @@ In the multiple operations, you can use variables, cursors, temporary tables, an
[![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/mithrandie/csvq?color=%2320b2aa&label=GitHub%20Release&sort=semver)](https://github.com/mithrandie/csvq/releases/latest)
[![GitHub tag (latest SemVer)](https://img.shields.io/github/v/tag/qittu/csvq-deb?color=%2320b2aa&label=Launchpad%20PPA)](https://launchpad.net/~mithrandie/+archive/ubuntu/csvq)

## Intended Use
Csvq is intended for one-time queries and for routine processing described in source files, on amounts of data that spreadsheet applications can handle.

It is not suitable for very large data, since all data is kept in memory while queries are executed.
There is no indexing or calculation order optimization, and execution is not fast because of the mechanisms included for updating data and supporting various other features.

However, it runs as a single executable binary, so you don't have to worry about troublesome dependencies during installation.
You can not only write and run your own queries, but also give the source files to your coworkers to run.

This tool may be useful for those who want to handle data easily without having to think about troublesome matters.

## Features

* CSV File Operation
69 changes: 35 additions & 34 deletions docs/_posts/2006-01-02-flag.md
@@ -10,40 +10,41 @@ A flag is a representation of a [command option]({{ '/reference/command.html#opt

## Flags

| name | type | description |
| :- | :- | :- |
| @@REPOSITORY | string | Directory path where files are located |
| @@TIMEZONE | string | Default TimeZone |
| @@DATETIME_FORMAT | string | Datetime Format to parse strings |
| @@ANSI_QUOTES | boolean | Use double quotation mark as identifier enclosure |
| @@STRICT_EQUAL | boolean | Compare strictly that two values are equal for DISTINCT, GROUP BY and ORDER BY |
| @@WAIT_TIMEOUT | float | Limit of the waiting time in seconds to wait for locked files to be released |
| @@IMPORT_FORMAT | string | Default format to load files |
| @@DELIMITER | string | Field delimiter for CSV |
| @@ALLOW_UNEVEN_FIELDS | boolean | Allow loading CSV files with uneven field length |
| @@DELIMITER_POSITIONS | string | Delimiter positions for Fixed-Length Format |
| @@JSON_QUERY | string | Query for JSON data |
| @@ENCODING | string | Character encoding |
| @@NO_HEADER | boolean | Import first line as a record |
| @@WITHOUT_NULL | boolean | Parse empty fields as empty strings |
| @@STRIP_ENDING_LINE_BREAK | boolean | Strip line break from the end of files and query results |
| @@FORMAT | string | Format of query results |
| @@WRITE_ENCODING | string | Character encoding of query results |
| @@WRITE_DELIMITER | string | Field delimiter for query results in CSV |
| @@WRITE_DELIMITER_POSITIONS | string | Delimiter positions for query results in Fixed-Length Format |
| @@WITHOUT_HEADER | boolean | Write without the header line in query results |
| @@LINE_BREAK | string | Line Break in query results |
| @@ENCLOSE_ALL | boolean | Enclose all string values in CSV |
| @@JSON_ESCAPE | string | JSON escape type of query results |
| @@PRETTY_PRINT | boolean | Make JSON output easier to read in query results |
| @@EAST_ASIAN_ENCODING | boolean | Count ambiguous characters as fullwidth |
| @@COUNT_DIACRITICAL_SIGN | boolean | Count diacritical signs as halfwidth |
| @@COUNT_FORMAT_CODE | boolean | Count format characters and zero-width spaces as halfwidth |
| @@COLOR | boolean | Use ANSI color escape sequences |
| @@QUIET | boolean | Suppress operation log output |
| @@LIMIT_RECURSION | integer | Maximum number of iterations for recursive queries |
| @@CPU | integer | Hint for the number of cpu cores to be used |
| @@STATS | boolean | Show execution time |
| name | type | description |
|:----------------------------|:--------|:-------------------------------------------------------------------------------|
| @@REPOSITORY | string | Directory path where files are located |
| @@TIMEZONE | string | Default TimeZone |
| @@DATETIME_FORMAT | string | Datetime Format to parse strings |
| @@ANSI_QUOTES | boolean | Use double quotation mark as identifier enclosure |
| @@STRICT_EQUAL | boolean | Compare strictly that two values are equal for DISTINCT, GROUP BY and ORDER BY |
| @@WAIT_TIMEOUT | float | Limit of the waiting time in seconds to wait for locked files to be released |
| @@IMPORT_FORMAT | string | Default format to load files |
| @@DELIMITER | string | Field delimiter for CSV |
| @@ALLOW_UNEVEN_FIELDS | boolean | Allow loading CSV files with uneven field length |
| @@DELIMITER_POSITIONS | string | Delimiter positions for Fixed-Length Format |
| @@JSON_QUERY | string | Query for JSON data |
| @@ENCODING | string | Character encoding |
| @@NO_HEADER | boolean | Import first line as a record |
| @@WITHOUT_NULL | boolean | Parse empty fields as empty strings |
| @@STRIP_ENDING_LINE_BREAK | boolean | Strip line break from the end of files and query results |
| @@FORMAT | string | Format of query results |
| @@WRITE_ENCODING | string | Character encoding of query results |
| @@WRITE_DELIMITER | string | Field delimiter for query results in CSV |
| @@WRITE_DELIMITER_POSITIONS | string | Delimiter positions for query results in Fixed-Length Format |
| @@WITHOUT_HEADER | boolean | Write without the header line in query results |
| @@LINE_BREAK | string | Line Break in query results |
| @@ENCLOSE_ALL | boolean | Enclose all string values in CSV |
| @@JSON_ESCAPE | string | JSON escape type of query results |
| @@PRETTY_PRINT | boolean | Make JSON output easier to read in query results |
| @@SCIENTIFIC_NOTATION | boolean | Use Scientific Notation for large exponents in output |
| @@EAST_ASIAN_ENCODING | boolean | Count ambiguous characters as fullwidth |
| @@COUNT_DIACRITICAL_SIGN | boolean | Count diacritical signs as halfwidth |
| @@COUNT_FORMAT_CODE | boolean | Count format characters and zero-width spaces as halfwidth |
| @@COLOR | boolean | Use ANSI color escape sequences |
| @@QUIET | boolean | Suppress operation log output |
| @@LIMIT_RECURSION | integer | Maximum number of iterations for recursive queries |
| @@CPU | integer | Hint for the number of cpu cores to be used |
| @@STATS | boolean | Show execution time |


### SET FLAG
113 changes: 87 additions & 26 deletions docs/_posts/2006-01-02-select-query.md
@@ -128,12 +128,10 @@ table_entity

table_identifier
: table_name
| url
| table_identification_function
| STDIN

inline_table_identifier
: table_name
| url

laterable_table
: subquery
| subquery alias
@@ -158,19 +156,29 @@ join_condition
: ON condition
| USING (column_name [, column_name, ...])

table_identification_function
: FILE::(file_path)
| INLINE::(file_path)
| URL::(url_string)
| DATA::(data_string)

table_object
: CSV(delimiter, table_identifier [, encoding [, no_header [, without_null]]])
| FIXED(delimiter_positions, table_identifier [, encoding [, no_header [, without_null]]])
| JSON(json_query, table_identifier)
| JSONL(json_query, table_identifier)
| LTSV(table_identifier [, encoding [, without_null]])

inline_table_object
inline_table_object -- Deprecated. Table identification functions can be used instead.
: CSV_INLINE(delimiter, inline_table_identifier [, encoding [, no_header [, without_null]]])
| CSV_INLINE(delimiter, csv_data)
| JSON_INLINE(json_query, inline_table_identifier [, encoding [, no_header [, without_null]]])
| JSON_INLINE(json_query, json_data)

inline_table_identifier
: table_name
| url_identifier

```

_table_name_
@@ -179,7 +187,7 @@ _table_name_
A _table_name_ represents a file path, a [temporary table]({{ '/reference/temporary-table.html' | relative_url }}), or an [inline table]({{ '/reference/common-table-expression.html' | relative_url }}).
You can use absolute path or relative path from the directory specified by the ["--repository" option]({{ '/reference/command.html#options' | relative_url }}) as a file path.

When the file name extension is ".csv", ".tsv", ".json" or ".txt", the format to be loaded is automatically determined by the file extension and you can omit it.
When the file name extension is ".csv", ".tsv", ".json", ".jsonl" or ".txt", the format to be loaded is automatically determined by the file extension, and you can omit it.

```sql
FROM `user.csv` -- Relative path
@@ -190,12 +198,68 @@ _table_name_
The specifications of the command options are used as file attributes such as encoding to be loaded.
If you want to specify the different attributes for each file, you can use _table_object_ expressions for each file to load.

Once a file is loaded, then the data is cached and it can be loaded with only file name after that within the transaction.
Once a file is loaded, then the data is cached, and it can be loaded with only file name after that within the transaction.

_url_
: [identifier]({{ '/reference/statement.html#parsing' | relative_url }})
: A string of characters representing a URL that starts with a scheme name and a colon.

A URL of the http or https scheme to refer to a resource.
"http", "https" and "file" schemes are available.

```sql
https://example.com/files/data.csv -- Remote resource downloaded using HTTP GET method
file:///C:/Users/yourname/files/data.csv -- Local file specified by absolute path
file:./data.csv -- Local file specified by relative path
```
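
As a rough illustration of the three schemes listed above, the following Go sketch classifies a URL by its scheme with the standard `net/url` package. `classifyTableURL` is a hypothetical helper written for this note; it is not part of csvq, and csvq's own parsing may differ.

```go
package main

import (
	"fmt"
	"net/url"
)

// classifyTableURL is a hypothetical helper, not csvq code. It reports
// whether raw uses one of the schemes named in the documentation and, if so,
// whether it points at a remote resource or a local file.
func classifyTableURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	switch u.Scheme {
	case "http", "https":
		return "remote", nil
	case "file":
		return "local", nil
	default:
		return "", fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
}

func main() {
	examples := []string{
		"https://example.com/files/data.csv",
		"file:///C:/Users/yourname/files/data.csv",
		"file:./data.csv",
	}
	for _, raw := range examples {
		kind, err := classifyTableURL(raw)
		fmt.Println(raw, kind, err)
	}
}
```

Any other scheme is rejected here only because the documentation lists these three; how csvq itself reports an unsupported scheme is not shown in this commit.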

An inline table is created from remote resources.
The downloaded data is cached until the transaction ends.

The file format is automatically determined when the HTTP response specifies one of the following content types.

| MIME type | Format |
|:-----------------|:-------|
| text/csv | CSV |
| application/json | JSON |
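
The mapping in the table above can be sketched in Go as follows. `formatFromContentType` is a hypothetical name used only for illustration, and the handling of parameters such as `charset` is an assumption rather than csvq's documented behavior.

```go
package main

import (
	"fmt"
	"mime"
)

// formatFromContentType is a hypothetical helper, not csvq's implementation.
// It maps a Content-Type header value to a format name from the table and
// returns "" when the media type is not listed there.
func formatFromContentType(header string) string {
	// ParseMediaType strips parameters such as "; charset=utf-8"
	// and lowercases the media type.
	mediaType, _, err := mime.ParseMediaType(header)
	if err != nil {
		return ""
	}
	switch mediaType {
	case "text/csv":
		return "CSV"
	case "application/json":
		return "JSON"
	}
	return ""
}

func main() {
	fmt.Println(formatFromContentType("text/csv; charset=utf-8"))
	fmt.Println(formatFromContentType("application/json"))
}
```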

_table_identification_function_
: Function notation with a name followed by two colons.

- FILE::(file_path)

file_path: [string]({{ '/reference/value.html#string' | relative_url }})

This is the same as specifying a file using _table_name_.

- INLINE::(file_path)

file_path: [string]({{ '/reference/value.html#string' | relative_url }})

Files read by this function are not cached and cannot be updated.

- URL::(url_string)

url_string: [string]({{ '/reference/value.html#string' | relative_url }})

When specifying a resource using _url_, the path must be encoded, but this function does not require encoding.

- DATA::(data_string)

data_string: [string]({{ '/reference/value.html#string' | relative_url }})

This function creates an inline table from a string.

Example of use in a query:

```sql
SELECT id,
tag_name,
(SELECT COUNT(*) FROM JSON('', DATA::(assets))) AS number_of_assets,
published_at
FROM https://api.github.com/repos/mithrandie/csvq/releases
WHERE prerelease = false
ORDER BY published_at DESC
LIMIT 10
```
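
The note on `URL::` says that a path written as a bare _url_ must be encoded, while the function form accepts an unencoded string. As a sketch of what "encoded" means for a path, the Go snippet below percent-encodes a path containing a space with the standard `net/url` package. `encodedURL` and the example host are hypothetical, and this is not a description of how csvq performs the encoding internally.

```go
package main

import (
	"fmt"
	"net/url"
)

// encodedURL is a hypothetical helper that builds a URL string whose path
// is percent-encoded, as a bare url identifier would need to be.
func encodedURL(scheme, host, path string) string {
	u := url.URL{Scheme: scheme, Host: host, Path: path}
	return u.String()
}

func main() {
	// The unencoded path contains a space.
	fmt.Println(encodedURL("https", "example.com", "/files/my data.csv"))
}
```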

_alias_
: [identifier]({{ '/reference/statement.html#parsing' | relative_url }})
@@ -217,28 +281,14 @@ _condition_
_column_name_
: [identifier]({{ '/reference/statement.html#parsing' | relative_url }})

_delimiter_
: [string]({{ '/reference/value.html#string' | relative_url }})

_json_query_
: [JSON Query]({{ '/reference/json.html#query' | relative_url }})

Empty string is equivalent to "{}".

_json_file_
: [identifier]({{ '/reference/statement.html#parsing' | relative_url }})

A _json_file_ represents a json file path.
You can use absolute path or relative path from the directory specified by the ["--repository" option]({{ '/reference/command.html#options' | relative_url }}) as a json file path.

If a file name extension is ".json", you can omit it.

_csv_data_
: [string]({{ '/reference/value.html#string' | relative_url }})

_json_data_
: [string]({{ '/reference/value.html#string' | relative_url }})

_delimiter_
: [string]({{ '/reference/value.html#string' | relative_url }})

_delimiter_positions_
: [string]({{ '/reference/value.html#string' | relative_url }})

@@ -255,6 +305,17 @@ _no_header_
_without_null_
: [boolean]({{ '/reference/value.html#boolean' | relative_url }})

_url_identifier_
: [identifier]({{ '/reference/statement.html#parsing' | relative_url }})

A URL of the http or https scheme to refer to a resource.

_csv_data_
: [string]({{ '/reference/value.html#string' | relative_url }})

_json_data_
: [string]({{ '/reference/value.html#string' | relative_url }})

#### Special Tables
{: #special_tables}

6 changes: 6 additions & 0 deletions docs/changelog.md
@@ -5,6 +5,12 @@ title: Change Log - csvq

# Change Log

## Version 1.17.8

Released on Jul 24, 2022

- Allow URLs and Table identification functions to be used as table identifiers.

## Version 1.17.7

Released on Jul 3, 2022
19 changes: 15 additions & 4 deletions docs/index.md
@@ -5,21 +5,32 @@ title: csvq - SQL-like query language for csv

## Overview

csvq is a command line tool to operate CSV files.
Csvq is a command line tool to operate CSV files.
You can read, update, and delete CSV records with SQL-like queries.

You can also execute multiple operations sequentially in managed transactions by passing a procedure or using the interactive shell.
In the multiple operations, you can use variables, cursors, temporary tables, and other features.

## Latest Release

Version 1.17.7
: Released on Jul 3, 2022
Version 1.17.8
: Released on Jul 24, 2022

<a class="waves-effect waves-light btn" href="https://github.com/mithrandie/csvq/releases/tag/v1.17.7">
<a class="waves-effect waves-light btn" href="https://github.com/mithrandie/csvq/releases/tag/v1.17.8">
<i class="material-icons left">file_download</i>download
</a>

## Intended Use
Csvq is intended for one-time queries and for routine processing described in source files, on amounts of data that spreadsheet applications can handle.

It is not suitable for very large data, since all data is kept in memory while queries are executed.
There is no indexing or calculation order optimization, and execution is not fast because of the mechanisms included for updating data and supporting various other features.

However, it runs as a single executable binary, so you don't have to worry about troublesome dependencies during installation.
You can not only write and run your own queries, but also give the source files to your coworkers to run.

This tool may be useful for those who want to handle data easily without having to think about troublesome matters.

## Features

* CSV File Operation
8 changes: 4 additions & 4 deletions docs/sitemap.xml
@@ -6,7 +6,7 @@
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<url>
<loc>https://mithrandie.github.io/csvq/</loc>
<lastmod>2022-07-03T20:31:51+00:00</lastmod>
<lastmod>2022-07-24T14:25:04+00:00</lastmod>
</url>
<url>
<loc>https://mithrandie.github.io/csvq/reference.html</loc>
@@ -30,7 +30,7 @@
</url>
<url>
<loc>https://mithrandie.github.io/csvq/reference/select-query.html</loc>
<lastmod>2022-05-03T15:57:12+00:00</lastmod>
<lastmod>2022-07-24T14:09:22+00:00</lastmod>
</url>
<url>
<loc>https://mithrandie.github.io/csvq/reference/insert-query.html</loc>
@@ -102,7 +102,7 @@
</url>
<url>
<loc>https://mithrandie.github.io/csvq/reference/flag.html</loc>
<lastmod>2021-05-04T23:51:25+00:00</lastmod>
<lastmod>2022-07-09T07:27:32+00:00</lastmod>
</url>
<url>
<loc>https://mithrandie.github.io/csvq/reference/environment-variable.html</loc>
@@ -182,7 +182,7 @@
</url>
<url>
<loc>https://mithrandie.github.io/csvq/changelog.html</loc>
<lastmod>2022-07-03T20:31:51+00:00</lastmod>
<lastmod>2022-07-24T14:25:04+00:00</lastmod>
</url>
<url>
<loc>https://mithrandie.github.io/csvq/license.html</loc>
2 changes: 1 addition & 1 deletion lib/action/run.go
@@ -196,7 +196,7 @@ func showStats(ctx context.Context, proc *query.Processor, start time.Time) {
}
width = width + 1

w := query.NewObjectWriter(proc.Tx)
w := proc.Tx.CreateDocumentWriter()
w.WriteColor(" TotalTime:", option.LableEffect)
w.WriteSpaces(width - len(exectime))
w.WriteWithoutLineBreak(exectime + " seconds")