Release v1.17.8

- Allow URLs and Table identification functions to be used as table identifiers.
mithrandie · Jul 24, 2022 · 1fd08b0
2 parents bd009ac + 492d021
commit 1fd08b0
Showing 44 changed files with 4,341 additions and 2,896 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,11 @@
 # Change Log
 
+## Version 1.17.8
+
+Released on Jul 24, 2022
+
+- Allow URLs and Table identification functions to be used as table identifiers.
+
 ## Version 1.17.7
 
 Released on Jul 3, 2022

diff --git a/README.md b/README.md
@@ -6,7 +6,7 @@ SQL-like query language for csv
 [![codecov](https://codecov.io/gh/mithrandie/csvq/branch/master/graph/badge.svg)](https://codecov.io/gh/mithrandie/csvq)
 [![License: MIT](https://img.shields.io/badge/License-MIT-lightgrey.svg)](https://opensource.org/licenses/MIT)
 
-csvq is a command line tool to operate CSV files. 
+Csvq is a command line tool to operate CSV files. 
 You can read, update, delete CSV records with SQL-like query.
 
 You can also execute multiple operations sequentially in managed transactions by passing a procedure or using the interactive shell.
@@ -16,6 +16,17 @@ In the multiple operations, you can use variables, cursors, temporary tables, an
 [![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/mithrandie/csvq?color=%2320b2aa&label=GitHub%20Release&sort=semver)](https://github.com/mithrandie/csvq/releases/latest)
 [![GitHub tag (latest SemVer)](https://img.shields.io/github/v/tag/qittu/csvq-deb?color=%2320b2aa&label=Launchpad%20PPA)](https://launchpad.net/~mithrandie/+archive/ubuntu/csvq)
 
+## Intended Use
+Csvq is intended for one-time queries and for routine processing described in source files, on amounts of data that spreadsheet applications can handle.
+
+It is not suitable for handling very large data, since all data is kept in memory while queries are executed.
+There is no indexing, calculation order optimization, etc., and execution is not fast, because csvq includes mechanisms for updating data and handling various other features.
+
+However, it can be run with a single executable binary, and you don't have to worry about troublesome dependencies during installation.
+You can not only write and run your own queries, but also give the source files to your coworkers to run.
+
+This tool may be useful for those who want to handle data easily without having to think about troublesome matters.
+
 ## Features
 
 * CSV File Operation

diff --git a/docs/_posts/2006-01-02-flag.md b/docs/_posts/2006-01-02-flag.md
@@ -10,40 +10,41 @@ A flag is a representation of a [command option]({{ '/reference/command.html#opt
 
 ## Flags
 
-| name | type | description |
-| :- | :- | :- |
-| @@REPOSITORY             | string  | Directory path where files are located |
-| @@TIMEZONE               | string  | Default TimeZone |
-| @@DATETIME_FORMAT        | string  | Datetime Format to parse strings |
-| @@ANSI_QUOTES            | boolean | Use double quotation mark as identifier enclosure |
-| @@STRICT_EQUAL           | boolean | Compare strictly that two values are equal for DISTINCT, GROUP BY and ORDER BY |
-| @@WAIT_TIMEOUT           | float   | Limit of the waiting time in seconds to wait for locked files to be released |
-| @@IMPORT_FORMAT          | string  | Default format to load files |
-| @@DELIMITER              | string  | Field delimiter for CSV |
-| @@ALLOW_UNEVEN_FIELDS    | boolean | Allow loading CSV files with uneven field length |
-| @@DELIMITER_POSITIONS    | string  | Delimiter positions for Fixed-Length Format |
-| @@JSON_QUERY             | string  | Query for JSON data |
-| @@ENCODING               | string  | Character encoding |
-| @@NO_HEADER              | boolean | Import first line as a record |
-| @@WITHOUT_NULL           | boolean | Parse empty fields as empty strings |
-| @@STRIP_ENDING_LINE_BREAK | boolean | Strip line break from the end of files and query results |
-| @@FORMAT                 | string  | Format of query results |
-| @@WRITE_ENCODING         | string  | Character encoding of query results |
-| @@WRITE_DELIMITER        | string  | Field delimiter for query results in CSV |
-| @@WRITE_DELIMITER_POSITIONS | string  | Delimiter positions for query results in Fixed-Length Format |
-| @@WITHOUT_HEADER         | boolean | Write without the header line in query results |
-| @@LINE_BREAK             | string  | Line Break in query results |
-| @@ENCLOSE_ALL            | boolean | Enclose all string values in CSV |
-| @@JSON_ESCAPE            | string  | JSON escape type of query results |
-| @@PRETTY_PRINT           | boolean | Make JSON output easier to read in query results |
-| @@EAST_ASIAN_ENCODING    | boolean | Count ambiguous characters as fullwidth |
-| @@COUNT_DIACRITICAL_SIGN | boolean | Count diacritical signs as halfwidth |
-| @@COUNT_FORMAT_CODE      | boolean | Count format characters and zero-width spaces as halfwidth |
-| @@COLOR                  | boolean | Use ANSI color escape sequences |
-| @@QUIET                  | boolean | Suppress operation log output |
-| @@LIMIT_RECURSION        | integer | Maximum number of iterations for recursive queries |
-| @@CPU                    | integer | Hint for the number of cpu cores to be used |
-| @@STATS                  | boolean | Show execution time |
+| name                        | type    | description                                                                    |
+|:----------------------------|:--------|:-------------------------------------------------------------------------------|
+| @@REPOSITORY                | string  | Directory path where files are located                                         |
+| @@TIMEZONE                  | string  | Default TimeZone                                                               |
+| @@DATETIME_FORMAT           | string  | Datetime Format to parse strings                                               |
+| @@ANSI_QUOTES               | boolean | Use double quotation mark as identifier enclosure                              |
+| @@STRICT_EQUAL              | boolean | Compare strictly that two values are equal for DISTINCT, GROUP BY and ORDER BY |
+| @@WAIT_TIMEOUT              | float   | Limit of the waiting time in seconds to wait for locked files to be released   |
+| @@IMPORT_FORMAT             | string  | Default format to load files                                                   |
+| @@DELIMITER                 | string  | Field delimiter for CSV                                                        |
+| @@ALLOW_UNEVEN_FIELDS       | boolean | Allow loading CSV files with uneven field length                               |
+| @@DELIMITER_POSITIONS       | string  | Delimiter positions for Fixed-Length Format                                    |
+| @@JSON_QUERY                | string  | Query for JSON data                                                            |
+| @@ENCODING                  | string  | Character encoding                                                             |
+| @@NO_HEADER                 | boolean | Import first line as a record                                                  |
+| @@WITHOUT_NULL              | boolean | Parse empty fields as empty strings                                            |
+| @@STRIP_ENDING_LINE_BREAK   | boolean | Strip line break from the end of files and query results                       |
+| @@FORMAT                    | string  | Format of query results                                                        |
+| @@WRITE_ENCODING            | string  | Character encoding of query results                                            |
+| @@WRITE_DELIMITER           | string  | Field delimiter for query results in CSV                                       |
+| @@WRITE_DELIMITER_POSITIONS | string  | Delimiter positions for query results in Fixed-Length Format                   |
+| @@WITHOUT_HEADER            | boolean | Write without the header line in query results                                 |
+| @@LINE_BREAK                | string  | Line Break in query results                                                    |
+| @@ENCLOSE_ALL               | boolean | Enclose all string values in CSV                                               |
+| @@JSON_ESCAPE               | string  | JSON escape type of query results                                              |
+| @@PRETTY_PRINT              | boolean | Make JSON output easier to read in query results                               |
+| @@SCIENTIFIC_NOTATION       | boolean | Use Scientific Notation for large exponents in output                          |
+| @@EAST_ASIAN_ENCODING       | boolean | Count ambiguous characters as fullwidth                                        |
+| @@COUNT_DIACRITICAL_SIGN    | boolean | Count diacritical signs as halfwidth                                           |
+| @@COUNT_FORMAT_CODE         | boolean | Count format characters and zero-width spaces as halfwidth                     |
+| @@COLOR                     | boolean | Use ANSI color escape sequences                                                |
+| @@QUIET                     | boolean | Suppress operation log output                                                  |
+| @@LIMIT_RECURSION           | integer | Maximum number of iterations for recursive queries                             |
+| @@CPU                       | integer | Hint for the number of cpu cores to be used                                    |
+| @@STATS                     | boolean | Show execution time                                                            |
 
 
 ### SET FLAG

diff --git a/docs/_posts/2006-01-02-select-query.md b/docs/_posts/2006-01-02-select-query.md
@@ -128,12 +128,10 @@ table_entity
 
 table_identifier
   : table_name
+  | url
+  | table_identification_function
   | STDIN
 
-inline_table_identifier
-  : table_name
-  | url
-
 laterable_table
   : subquery
   | subquery alias
@@ -158,19 +156,29 @@ join_condition
   : ON condition
   | USING (column_name [, column_name, ...])
 
+table_identification_function
+  : FILE::(file_path)
+  | INLINE::(file_path)
+  | URL::(url_string)
+  | DATA::(data_string)
+
 table_object
   : CSV(delimiter, table_identifier [, encoding [, no_header [, without_null]]])
   | FIXED(delimiter_positions, table_identifier [, encoding [, no_header [, without_null]]])
   | JSON(json_query, table_identifier)
   | JSONL(json_query, table_identifier)
   | LTSV(table_identifier [, encoding [, without_null]])
 
-inline_table_object
+inline_table_object  -- Deprecated. Table identification functions can be used instead.
   : CSV_INLINE(delimiter, inline_table_identifier [, encoding [, no_header [, without_null]]])
   | CSV_INLINE(delimiter, csv_data)
   | JSON_INLINE(json_query, inline_table_identifier [, encoding [, no_header [, without_null]]])
   | JSON_INLINE(json_query, json_data)
 
+inline_table_identifier
+  : table_name
+  | url_identifier
+
 ```
 
 _table_name_
@@ -179,7 +187,7 @@ _table_name_
   A _table_name_ represents a file path, a [temporary table]({{ '/reference/temporary-table.html' | relative_url }}), or a [inline table]({{ '/reference/common-table-expression.html' | relative_url }}).
   You can use absolute path or relative path from the directory specified by the ["--repository" option]({{ '/reference/command.html#options' | relative_url }}) as a file path.
 
-  When the file name extension is ".csv", ".tsv", ".json" or ".txt", the format to be loaded is automatically determined by the file extension and you can omit it. 
+  When the file name extension is ".csv", ".tsv", ".json", ".jsonl" or ".txt", the format to be loaded is automatically determined by the file extension, and you can omit it. 
 
   ```sql
   FROM `user.csv`          -- Relative path
@@ -190,12 +198,68 @@ _table_name_
   The specifications of the command options are used as file attributes such as encoding to be loaded. 
   If you want to specify the different attributes for each file, you can use _table_object_ expressions for each file to load.
 
-  Once a file is loaded, then the data is cached and it can be loaded with only file name after that within the transaction.
+  Once a file is loaded, then the data is cached, and it can be loaded with only file name after that within the transaction.
 
 _url_
-: [identifier]({{ '/reference/statement.html#parsing' | relative_url }})
+: A string of characters representing a URL, starting with a scheme name and a colon.
 
-  A URL of the http or https scheme to refer to a resource.
+  "http", "https" and "file" schemes are available.
+
+  ```sql
+  https://example.com/files/data.csv       -- Remote resource downloaded using HTTP GET method
+  file:///C:/Users/yourname/files/data.csv -- Local file specified by absolute path
+  file:./data.csv                          -- Local file specified by relative path
+  ```
+
+  An inline table is created from remote resources.
+  The downloaded data is cached until the transaction ends.
+
+  The file format is automatically determined when the http response specifies the following content types.
+
+| MIME type        | Format |
+|:-----------------|:-------|
+| text/csv         | CSV    |
+| application/json | JSON   |
+
+_table_identification_function_
+: Function notation with a name followed by two colons.
+
+  - FILE::(file_path)
+
+    file_path: [string]({{ '/reference/value.html#string' | relative_url }})
+
+    This is the same as specifying a file using _table_name_.
+
+  - INLINE::(file_path)
+
+    file_path: [string]({{ '/reference/value.html#string' | relative_url }})
+
+    Files read by this function are not cached and cannot be updated.
+
+  - URL::(url_string)
+
+    url_string: [string]({{ '/reference/value.html#string' | relative_url }})
+
+    When specifying a resource using _url_, the path must be encoded, but this function does not require encoding.
+
+  - DATA::(data_string)
+
+    data_string: [string]({{ '/reference/value.html#string' | relative_url }})
+
+    This function creates an inline table from a string.
+
+  Example of use in a query:
+
+  ```sql
+  SELECT id,
+         tag_name,
+         (SELECT COUNT(*) FROM JSON('', DATA::(assets))) AS number_of_assets,
+         published_at
+    FROM https://api.github.com/repos/mithrandie/csvq/releases
+   WHERE prerelease = false
+   ORDER BY published_at DESC
+   LIMIT 10
+  ```
 
 _alias_
 : [identifier]({{ '/reference/statement.html#parsing' | relative_url }})
@@ -217,28 +281,14 @@ _condition_
 _column_name_
 : [identifier]({{ '/reference/statement.html#parsing' | relative_url }})
 
+_delimiter_  
+: [string]({{ '/reference/value.html#string' | relative_url }})
+
 _json_query_
 : [JSON Query]({{ '/reference/json.html#query' | relative_url }})
 
   Empty string is equivalent to "{}".
 
-_json_file_
-: [identifier]({{ '/reference/statement.html#parsing' | relative_url }})
-
-  A _json_file_ represents a json file path.
-  You can use absolute path or relative path from the directory specified by the ["--repository" option]({{ '/reference/command.html#options' | relative_url }}) as a json file path.
-
-  If a file name extension is ".json", you can omit it. 
-
-_csv_data_
-: [string]({{ '/reference/value.html#string' | relative_url }})
-
-_json_data_
-: [string]({{ '/reference/value.html#string' | relative_url }})
-
-_delimiter_  
-: [string]({{ '/reference/value.html#string' | relative_url }})
-
 _delimiter_positions_  
 : [string]({{ '/reference/value.html#string' | relative_url }})
 
@@ -255,6 +305,17 @@ _no_header_
 _without_null_
 : [boolean]({{ '/reference/value.html#boolean' | relative_url }})
 
+_url_identifier_
+: [identifier]({{ '/reference/statement.html#parsing' | relative_url }})
+
+  A URL of the http or https scheme to refer to a resource.
+
+_csv_data_
+: [string]({{ '/reference/value.html#string' | relative_url }})
+
+_json_data_
+: [string]({{ '/reference/value.html#string' | relative_url }})
+
 #### Special Tables
 {: #special_tables}
 

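The scheme list and the content-type table documented in `select-query.md` above can be illustrated with a minimal Go sketch. This is not csvq's implementation: `formatFromContentType` and `isSupportedScheme` are hypothetical helper names chosen for illustration, and the sketch relies only on the standard-library calls `mime.ParseMediaType` and `url.Parse`.

```go
package main

import (
	"fmt"
	"mime"
	"net/url"
)

// formatFromContentType maps an HTTP Content-Type header value to a format
// name, following the MIME type table in the documentation.
// Hypothetical helper; csvq's real logic may differ.
func formatFromContentType(header string) (string, bool) {
	// ParseMediaType strips parameters such as "; charset=utf-8".
	mediaType, _, err := mime.ParseMediaType(header)
	if err != nil {
		return "", false
	}
	switch mediaType {
	case "text/csv":
		return "CSV", true
	case "application/json":
		return "JSON", true
	}
	// Any other content type is not auto-detected in this sketch.
	return "", false
}

// isSupportedScheme reports whether raw starts with one of the schemes the
// documentation lists for url identifiers: http, https and file.
// Hypothetical helper; assumed behaviour, not taken from csvq's source.
func isSupportedScheme(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	switch u.Scheme {
	case "http", "https", "file":
		return true
	}
	return false
}

func main() {
	format, ok := formatFromContentType("text/csv; charset=utf-8")
	fmt.Println(format, ok)
	fmt.Println(isSupportedScheme("file:./data.csv"))
}
```

A lookup that returns a second boolean keeps "format unknown" distinct from an empty format name, so a caller could fall back to a default such as the import format flag when detection fails.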
diff --git a/docs/changelog.md b/docs/changelog.md
@@ -5,6 +5,12 @@ title: Change Log - csvq
 
 # Change Log
 
+## Version 1.17.8
+
+Released on Jul 24, 2022
+
+- Allow URLs and Table identification functions to be used as table identifiers.
+
 ## Version 1.17.7
 
 Released on Jul 3, 2022

diff --git a/docs/index.md b/docs/index.md
@@ -5,21 +5,32 @@ title: csvq - SQL-like query language for csv
 
 ## Overview
 
-csvq is a command line tool to operate CSV files. 
+Csvq is a command line tool to operate CSV files. 
 You can read, update, delete CSV records with SQL-like query.
 
 You can also execute multiple operations sequentially in managed transactions by passing a procedure or using the interactive shell.
 In the multiple operations, you can use variables, cursors, temporary tables, and other features. 
 
 ## Latest Release
 
-Version 1.17.7
-: Released on Jul 3, 2022
+Version 1.17.8
+: Released on Jul 24, 2022
 
-  <a class="waves-effect waves-light btn" href="https://github.com/mithrandie/csvq/releases/tag/v1.17.7">
+  <a class="waves-effect waves-light btn" href="https://github.com/mithrandie/csvq/releases/tag/v1.17.8">
     <i class="material-icons left">file_download</i>download
   </a>
 
+## Intended Use
+Csvq is intended for one-time queries and for routine processing described in source files, on amounts of data that spreadsheet applications can handle.
+
+It is not suitable for handling very large data, since all data is kept in memory while queries are executed.
+There is no indexing, calculation order optimization, etc., and execution is not fast, because csvq includes mechanisms for updating data and handling various other features.
+
+However, it can be run with a single executable binary, and you don't have to worry about troublesome dependencies during installation.
+You can not only write and run your own queries, but also give the source files to your coworkers to run.
+
+This tool may be useful for those who want to handle data easily without having to think about troublesome matters.
+
 ## Features
 
 * CSV File Operation

diff --git a/docs/sitemap.xml b/docs/sitemap.xml
@@ -6,7 +6,7 @@
             http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
     <url>
         <loc>https://mithrandie.github.io/csvq/</loc>
-        <lastmod>2022-07-03T20:31:51+00:00</lastmod>
+        <lastmod>2022-07-24T14:25:04+00:00</lastmod>
     </url>
     <url>
         <loc>https://mithrandie.github.io/csvq/reference.html</loc>
@@ -30,7 +30,7 @@
     </url>
     <url>
         <loc>https://mithrandie.github.io/csvq/reference/select-query.html</loc>
-        <lastmod>2022-05-03T15:57:12+00:00</lastmod>
+        <lastmod>2022-07-24T14:09:22+00:00</lastmod>
     </url>
     <url>
         <loc>https://mithrandie.github.io/csvq/reference/insert-query.html</loc>
@@ -102,7 +102,7 @@
     </url>
     <url>
         <loc>https://mithrandie.github.io/csvq/reference/flag.html</loc>
-        <lastmod>2021-05-04T23:51:25+00:00</lastmod>
+        <lastmod>2022-07-09T07:27:32+00:00</lastmod>
     </url>
     <url>
         <loc>https://mithrandie.github.io/csvq/reference/environment-variable.html</loc>
@@ -182,7 +182,7 @@
     </url>
     <url>
         <loc>https://mithrandie.github.io/csvq/changelog.html</loc>
-        <lastmod>2022-07-03T20:31:51+00:00</lastmod>
+        <lastmod>2022-07-24T14:25:04+00:00</lastmod>
     </url>
     <url>
         <loc>https://mithrandie.github.io/csvq/license.html</loc>

diff --git a/lib/action/run.go b/lib/action/run.go
@@ -196,7 +196,7 @@ func showStats(ctx context.Context, proc *query.Processor, start time.Time) {
 	}
 	width = width + 1
 
-	w := query.NewObjectWriter(proc.Tx)
+	w := proc.Tx.CreateDocumentWriter()
 	w.WriteColor(" TotalTime:", option.LableEffect)
 	w.WriteSpaces(width - len(exectime))
 	w.WriteWithoutLineBreak(exectime + " seconds")