[FIX] fix analyzer error in window function(apache#2039)

yangzhg · Nov 25, 2019 · b79cb65
1 parent c7d52af
commit b79cb65
Showing 17 changed files with 574 additions and 303 deletions.
diff --git a/be/src/exec/repeat_node.cpp b/be/src/exec/repeat_node.cpp
@@ -146,6 +146,7 @@ Status RepeatNode::get_repeated_batch(
 
         for(size_t slot_idx = 0; slot_idx < _grouping_list.size(); ++slot_idx) {
             int64_t val = _grouping_list[slot_idx][repeat_id_idx];
+            DCHECK_LT(slot_idx, _tuple_desc->slots().size()) << "TupleDescriptor: " << _tuple_desc->debug_string();
             const SlotDescriptor *slot_desc = _tuple_desc->slots()[slot_idx];
             tuple->set_not_null(slot_desc->null_indicator_offset());
             RawValue::write(&val, tuple, slot_desc, tuple_pool);

diff --git a/docs/documentation/cn/sql-reference/sql-statements/Data Manipulation/GROUP BY.md b/docs/documentation/cn/sql-reference/sql-statements/Data Manipulation/GROUP BY.md
@@ -0,0 +1,173 @@
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# GROUP BY
+
+## description
+
+  GROUP BY `GROUPING SETS` | `CUBE` | `ROLLUP` is an extension of the GROUP BY clause. It aggregates over several grouping sets within a single GROUP BY clause, and its result is equivalent to a UNION ALL of the corresponding GROUP BY queries.
+
+  A plain GROUP BY clause is the special case of GROUP BY GROUPING SETS with a single element.
+  For example, the GROUPING SETS statement:
+
+  ```
+  SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );
+  ```
+
+  returns the same result as:
+
+  ```
+  SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b
+  UNION ALL
+  SELECT a, null, SUM( c ) FROM tab1 GROUP BY a
+  UNION ALL
+  SELECT null, b, SUM( c ) FROM tab1 GROUP BY b
+  UNION ALL
+  SELECT null, null, SUM( c ) FROM tab1
+  ```
+
+  `GROUPING(expr)` indicates whether a column is aggregated in a result row: it returns 1 if the column is aggregated and 0 otherwise.
+
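+  `GROUPING_ID(expr  [ , expr [ , ... ] ])` is similar to GROUPING. It builds a bitmap over the listed columns, in the order given, where each bit is the GROUPING value of the corresponding column, and returns the decimal value of that bit vector.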
+
+### Syntax
+
+  ```
+  SELECT ...
+  FROM ...
+  [ ... ]
+  GROUP BY [
+      , ... |
+      GROUPING SETS [, ...] (  groupSet [ , groupSet [ , ... ] ] ) |
+      ROLLUP(expr  [ , expr [ , ... ] ]) |
+      expr  [ , expr [ , ... ] ] WITH ROLLUP |
+      CUBE(expr  [ , expr [ , ... ] ]) |
+      expr  [ , expr [ , ... ] ] WITH CUBE
+      ]
+  [ ... ]
+  ```
+
+### Parameters
+
+  `groupSet` is a set of columns, aliases, or expressions from the select list: `groupSet ::= { ( expr  [ , expr [ , ... ] ] )}`
+
+  `expr` is a column, alias, or expression from the select list.
+
+### Note
+
+  Doris supports two syntax styles, PostgreSQL-like and Hive-like. Examples of both follow.
+
+  PostgreSQL-like syntax:
+
+  ```
+  SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );
+  SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY ROLLUP(a,b,c)
+  SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY CUBE(a,b,c)
+  ```
+
+  Hive-like syntax:
+
+  ```
+  SELECT a, b, SUM( c ) FROM tab1 GROUP BY a,b GROUPING SETS ( (a, b), (a), (b), ( ) );
+  SELECT  a, b, c, SUM( d )  FROM tab1 GROUP BY  a,b,c WITH ROLLUP
+  SELECT  a, b, c, SUM( d )  FROM tab1 GROUP BY  a,b,c WITH CUBE
+  ```
+
+  `ROLLUP(a,b,c)` is equivalent to the following `GROUPING SETS` clause:
+
+  ```
+  GROUPING SETS (
+  (a,b,c),
+  ( a, b ),
+  ( a),
+  ( )
+  )
+  ```
+
+  `CUBE ( a, b, c )` is equivalent to the following `GROUPING SETS` clause:
+
+  ```
+  GROUPING SETS (
+  ( a, b, c ),
+  ( a, b ),
+  ( a,    c ),
+  ( a       ),
+  (    b, c ),
+  (    b    ),
+  (       c ),
+  (         )
+  )
+  ```
+
+## example
+
+  The following example uses real data:
+
+  ```
+  > SELECT * FROM t;
+  +------+------+------+
+  | k1   | k2   | k3   |
+  +------+------+------+
+  | a    | A    |    1 |
+  | a    | A    |    2 |
+  | a    | B    |    1 |
+  | a    | B    |    3 |
+  | b    | A    |    1 |
+  | b    | A    |    4 |
+  | b    | B    |    1 |
+  | b    | B    |    5 |
+  +------+------+------+
+  8 rows in set (0.01 sec)
+
+  > SELECT k1, k2, SUM(k3) FROM t GROUP BY GROUPING SETS ( (k1, k2), (k2), (k1), ( ) );
+  +------+------+-----------+
+  | k1   | k2   | sum(`k3`) |
+  +------+------+-----------+
+  | b    | B    |         6 |
+  | a    | B    |         4 |
+  | a    | A    |         3 |
+  | b    | A    |         5 |
+  | NULL | B    |        10 |
+  | NULL | A    |         8 |
+  | a    | NULL |         7 |
+  | b    | NULL |        11 |
+  | NULL | NULL |        18 |
+  +------+------+-----------+
+  9 rows in set (0.06 sec)
+
+  > SELECT k1, k2, GROUPING_ID(k1,k2), SUM(k3) FROM t GROUP BY GROUPING SETS ((k1, k2), (k1), (k2), ());
+  +------+------+--------------------+-----------+
+  | k1   | k2   | grouping_id(k1,k2) | sum(`k3`) |
+  +------+------+--------------------+-----------+
+  | a    | A    |                  0 |         3 |
+  | a    | B    |                  0 |         4 |
+  | a    | NULL |                  1 |         7 |
+  | b    | A    |                  0 |         5 |
+  | b    | B    |                  0 |         6 |
+  | b    | NULL |                  1 |        11 |
+  | NULL | A    |                  2 |         8 |
+  | NULL | B    |                  2 |        10 |
+  | NULL | NULL |                  3 |        18 |
+  +------+------+--------------------+-----------+
+  9 rows in set (0.02 sec)
+  ```
+
+## keyword
+
+  GROUP, GROUPING, GROUPING_ID, GROUPING_SETS, GROUPING SETS, CUBE, ROLLUP
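
The `ROLLUP` and `CUBE` expansions listed in the Note section can be sketched in a few lines of Python. This is only an illustration of the documented equivalence, not how Doris implements it; the function names `rollup` and `cube` are hypothetical helpers introduced here.

```python
from itertools import combinations


def rollup(cols):
    """Grouping sets for ROLLUP(cols): every prefix of cols, longest first,
    ending with the empty grouping set."""
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]


def cube(cols):
    """Grouping sets for CUBE(cols): every subset of cols, largest first,
    with columns kept in their original order inside each set."""
    sets = []
    for size in range(len(cols), -1, -1):
        sets.extend(combinations(cols, size))
    return sets


print(rollup(["a", "b", "c"]))
# [('a', 'b', 'c'), ('a', 'b'), ('a',), ()]
print(len(cube(["a", "b", "c"])))
# 8
```

`rollup` yields n + 1 grouping sets for n columns, while `cube` yields 2^n, which is why `CUBE` gets expensive quickly as columns are added.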
diff --git a/...ql-statements/Data Manipulation/insert.md → ...ql-statements/Data Manipulation/INSERT.md b/...ql-statements/Data Manipulation/insert.md → ...ql-statements/Data Manipulation/INSERT.md
@@ -18,7 +18,9 @@ under the License.
 -->
 
 # INSERT
+
 ## description
+
 ### Syntax
 
 ```
@@ -47,7 +49,7 @@ INSERT INTO table_name
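 > query: an ordinary query whose result is written to the target
 >
 > hint: indicators that control how `INSERT` executes. Both `streaming` and the default non-`streaming` mode execute the `INSERT` statement synchronously
->       In non-`streaming` mode, a label is returned once execution finishes so that the user can check the load status with `SHOW LOAD`
+> In non-`streaming` mode, a label is returned once execution finishes so that the user can check the load status with `SHOW LOAD`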
 
 ### Note
 

diff --git a/.../documentation/en/sql-reference/sql-statements/Data Manipulation/GROUP BY_EN.md b/.../documentation/en/sql-reference/sql-statements/Data Manipulation/GROUP BY_EN.md
@@ -0,0 +1,170 @@
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# GROUP BY
+
+## description
+
+  GROUP BY `GROUPING SETS` | `CUBE` | `ROLLUP` is an extension of the GROUP BY clause that lets you define multiple groupings in the same query. GROUPING SETS produces a single result set that is equivalent to a UNION ALL of the differently grouped rows.
+  For example, the following GROUPING SETS clause:
+
+  ```
+  SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );
+  ```
+
+  This statement is equivalent to:
+
+  ```
+  SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b
+  UNION ALL
+  SELECT a, null, SUM( c ) FROM tab1 GROUP BY a
+  UNION ALL
+  SELECT null, b, SUM( c ) FROM tab1 GROUP BY b
+  UNION ALL
+  SELECT null, null, SUM( c ) FROM tab1
+  ```
+
+  `GROUPING(expr)` indicates whether a specified column expression in a GROUP BY list is aggregated or not. GROUPING returns 1 for aggregated or 0 for not aggregated in the result set.
+
+  `GROUPING_ID(expr  [ , expr [ , ... ] ])` describes which of a list of expressions are aggregated in a row produced by a GROUP BY query. It returns the decimal value of the bit vector formed by concatenating the GROUPING values of its arguments, with the first argument as the most significant bit.
+
+### Syntax
+
+  ```
+  SELECT ...
+  FROM ...
+  [ ... ]
+  GROUP BY [
+      , ... |
+      GROUPING SETS [, ...] (  groupSet [ , groupSet [ , ... ] ] ) |
+      ROLLUP(expr  [ , expr [ , ... ] ]) |
+      expr  [ , expr [ , ... ] ] WITH ROLLUP |
+      CUBE(expr  [ , expr [ , ... ] ]) |
+      expr  [ , expr [ , ... ] ] WITH CUBE
+      ]
+  [ ... ]
+  ```
+
+### Parameters
+
+  `groupSet` is a set of expressions, columns, or their aliases that appear in the query block's SELECT list. `groupSet ::= { ( expr  [ , expr [ , ... ] ] )}`
+
+  `expr` is an expression, a column, or its alias that appears in the query block's SELECT list.
+
+### Note
+
+  Doris supports two syntax styles, PostgreSQL-like and Hive-like. For example:
+  PostgreSQL-like syntax:
+
+  ```
+  SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );
+  SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY ROLLUP(a,b,c)
+  SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY CUBE(a,b,c)
+  ```
+
+  Hive-like syntax:
+
+  ```
+  SELECT a, b, SUM( c ) FROM tab1 GROUP BY a,b GROUPING SETS ( (a, b), (a), (b), ( ) );
+  SELECT  a, b, c, SUM( d )  FROM tab1 GROUP BY  a,b,c WITH ROLLUP
+  SELECT  a, b, c, SUM( d )  FROM tab1 GROUP BY  a,b,c WITH CUBE
+  ```
+
+  `ROLLUP(a,b,c)` is equivalent to `GROUPING SETS` as follows:
+
+  ```
+  GROUPING SETS (
+  (a,b,c),
+  ( a, b ),
+  ( a),
+  ( )
+  )
+  ```
+
+  `CUBE ( a, b, c )` is equivalent to `GROUPING SETS` as follows:
+
+  ```
+  GROUPING SETS (
+  ( a, b, c ),
+  ( a, b ),
+  ( a,    c ),
+  ( a       ),
+  (    b, c ),
+  (    b    ),
+  (       c ),
+  (         )
+  )
+  ```
+
+## example
+
+  This is a simple example
+
+  ```
+  > SELECT * FROM t;
+  +------+------+------+
+  | k1   | k2   | k3   |
+  +------+------+------+
+  | a    | A    |    1 |
+  | a    | A    |    2 |
+  | a    | B    |    1 |
+  | a    | B    |    3 |
+  | b    | A    |    1 |
+  | b    | A    |    4 |
+  | b    | B    |    1 |
+  | b    | B    |    5 |
+  +------+------+------+
+  8 rows in set (0.01 sec)
+
+  > SELECT k1, k2, SUM(k3) FROM t GROUP BY GROUPING SETS ( (k1, k2), (k2), (k1), ( ) );
+  +------+------+-----------+
+  | k1   | k2   | sum(`k3`) |
+  +------+------+-----------+
+  | b    | B    |         6 |
+  | a    | B    |         4 |
+  | a    | A    |         3 |
+  | b    | A    |         5 |
+  | NULL | B    |        10 |
+  | NULL | A    |         8 |
+  | a    | NULL |         7 |
+  | b    | NULL |        11 |
+  | NULL | NULL |        18 |
+  +------+------+-----------+
+  9 rows in set (0.06 sec)
+
+  > SELECT k1, k2, GROUPING_ID(k1,k2), SUM(k3) FROM t GROUP BY GROUPING SETS ((k1, k2), (k1), (k2), ());
+  +------+------+--------------------+-----------+
+  | k1   | k2   | grouping_id(k1,k2) | sum(`k3`) |
+  +------+------+--------------------+-----------+
+  | a    | A    |                  0 |         3 |
+  | a    | B    |                  0 |         4 |
+  | a    | NULL |                  1 |         7 |
+  | b    | A    |                  0 |         5 |
+  | b    | B    |                  0 |         6 |
+  | b    | NULL |                  1 |        11 |
+  | NULL | A    |                  2 |         8 |
+  | NULL | B    |                  2 |        10 |
+  | NULL | NULL |                  3 |        18 |
+  +------+------+--------------------+-----------+
+  9 rows in set (0.02 sec)
+  ```
+
+## keyword
+
+  GROUP, GROUPING, GROUPING_ID, GROUPING_SETS, GROUPING SETS, CUBE, ROLLUP
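
To check the `GROUPING_ID` values in the example, the semantics described in this document can be modelled in Python. The sketch assumes the sample table `t` from the example; `grouping_id` and `grouping_sets_sum` are hypothetical helpers written for this illustration and say nothing about how the Doris planner or `RepeatNode` actually evaluates the query.

```python
from collections import defaultdict

# Sample table t from the example: (k1, k2, k3)
ROWS = [
    ("a", "A", 1), ("a", "A", 2), ("a", "B", 1), ("a", "B", 3),
    ("b", "A", 1), ("b", "A", 4), ("b", "B", 1), ("b", "B", 5),
]
COLUMNS = ("k1", "k2")


def grouping_id(grouping_set, columns=COLUMNS):
    """Concatenate the GROUPING() bits of the columns, first column as the
    most significant bit. A bit is 1 when the column is aggregated away,
    i.e. it is not part of the grouping set."""
    gid = 0
    for col in columns:
        gid = (gid << 1) | (0 if col in grouping_set else 1)
    return gid


def grouping_sets_sum(rows, grouping_sets):
    """Return (k1, k2, grouping_id, sum(k3)) tuples by running one GROUP BY
    pass per grouping set and concatenating the results (UNION ALL)."""
    result = []
    for gset in grouping_sets:
        totals = defaultdict(int)
        for k1, k2, k3 in rows:
            # Columns outside the grouping set are reported as NULL (None).
            key = (k1 if "k1" in gset else None, k2 if "k2" in gset else None)
            totals[key] += k3
        gid = grouping_id(gset)
        result.extend((k1, k2, gid, total) for (k1, k2), total in totals.items())
    return result


result = grouping_sets_sum(ROWS, [("k1", "k2"), ("k1",), ("k2",), ()])
for row in result:
    print(row)
```

The nine tuples printed correspond to the nine rows of the last query in the example, for instance `('a', None, 1, 7)` and `(None, None, 3, 18)`.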