Merge pull request #10 from bianhq/master
refine pixels-hive doc and fix pixels-load.
bianhq authored Jul 5, 2019
2 parents c8d1515 + bda853e commit cc9833b
Showing 6 changed files with 50 additions and 39 deletions.
@@ -38,6 +38,12 @@ public void add(byte value)
vector[writeIndex++] = value;
}

+ @Override
+ public void add(String value)
+ {
+ add(Byte.parseByte(value));
+ }
+
@Override
public void flatten(boolean selectedInUse, int[] sel, int size)
{
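The added `add(String value)` overload delegates to `add(byte)` after `Byte.parseByte`, so it throws `NumberFormatException` for non-numeric text and for values outside -128..127. The following is a minimal sketch of that behavior, not the project's class: `ByteVectorSketch`, its constructor, `size()`, and `get()` are hypothetical names introduced only for illustration.

```java
// Hypothetical stand-in for the column vector touched by this hunk.
class ByteVectorSketch {
    private final byte[] vector;
    private int writeIndex = 0;

    ByteVectorSketch(int capacity) {
        vector = new byte[capacity];
    }

    // Mirrors the existing add(byte) shown in the hunk context.
    void add(byte value) {
        vector[writeIndex++] = value;
    }

    // Mirrors the added overload: parse first, then store.
    // Byte.parseByte throws NumberFormatException before anything is written.
    void add(String value) {
        add(Byte.parseByte(value));
    }

    int size() {
        return writeIndex;
    }

    byte get(int index) {
        return vector[index];
    }
}
```

Because parsing happens before the write, a rejected string leaves the write index unchanged; callers that load text data may still want to catch `NumberFormatException` per row.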
58 changes: 22 additions & 36 deletions pixels-load/README.md
@@ -14,64 +14,50 @@ java -jar rainbow-benchmark-0.1.0-SNAPSHOT-full.jar --data_size=30720 --thread_n

`/text` is the directory in HDFS

## How to use pixels-load
## Build
different `LOAD` command, the same `DDL` command
- single thread
`pom.xml` change **mainClass** with 'cn.edu.ruc.iir.pixels.load.single.Main'
- multiple thread
`pom.xml` change **mainClass** with 'cn.edu.ruc.iir.pixels.load.multi.Main'

## Pixels consumer command line tool
1> Start `pixels-metadata` thread
## How to Use Pixels Load
- Start `pixels-metadata` thread
```
java -jar -Dio.netty.leakDetection.level=advanced -Drole=main pixels-damon-0.1.0-SNAPSHOT-full.jar metadata
```
2> Start `pixels-load` thread
- Start `pixels-load` thread
```
java -jar pixels-load-0.1.0-SNAPSHOT-full.jar
```
`Note` use `DDL -h` or `LOAD -h`, you can see the usages of the command
- DDL Command
```
DDL -s {schema_file} -d {db_name}
```
- LOAD Command *single thread*
```
LOAD -f {format} -o {original_data_path} -d {db_name} -t {table_name} -n {row_num} -r {row_regex}
Use `LOAD -h`, you can see the usages of the command
- Create table

pixels> LOAD -f pixels -o hdfs://dbiir01:9000/pixels/pixels/test_105/source -d pixels -t test_105 -n 300000 -r \t
Create table in presto:
```
- LOAD Command *multiple thread*
cd /home/iir/opt/presto-server-0.192
./bin/presto --server localhost:8080 --catalog pixels-presto --schema pixels
create table ...;
```
LOAD -f {format} -o {original_data_path} -d {db_name} -t {table_name} -n {row_num} -r {row_regex} -c {consumer_thread_num} -p {producer}
pixels> LOAD -f pixels -o hdfs://dbiir01:9000/pixels/pixels/test_105/source -d pixels -t test_105 -n 300000 -r \t -c 4 -p false
pixels> LOAD -f pixels -o hdfs://dbiir01:9000/pixels/pixels/test_105/source -d pixels -t test_105 -n 300000 -r \t -c 4
{producer} is optional, default false.

- LOAD *(single thread)*
```

## Presto Command
- execute query
LOAD -f {format} -o {original_data_path} -d {db_name} -t {table_name} -n {row_num} -r {row_regex}
```
cd /home/iir/opt/presto-server-0.192
./bin/presto --server localhost:8080 --catalog pixels-presto --schema pixels
show tables;
Example:
```

## Orc
- Use `hive` to create tables such as text, and insert data from text(like the following *.sql)
- text_ddl.sql, orc_ddl.sql
- load_ddl.sql
pixels> LOAD -f pixels -o hdfs://dbiir01:9000/pixels/pixels/test_105/source -d pixels -t test_105 -n 300000 -r \t
```
./bin/hive
- LOAD *(multiple thread)*
```

- presto
LOAD -f {format} -o {original_data_path} -d {db_name} -t {table_name} -n {row_num} -r {row_regex} -c {consumer_thread_num} -p {producer}
```
Example:
```
./bin/presto --server localhost:8080 --catalog hive --schema default
pixels> LOAD -f pixels -o hdfs://dbiir01:9000/pixels/pixels/test_105/source -d pixels -t test_105 -n 300000 -r \t -c 4 -p false
pixels> LOAD -f pixels -o hdfs://dbiir01:9000/pixels/pixels/test_105/source -d pixels -t test_105 -n 300000 -r \t -c 4
```
`producer` is optional, default false.
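To make the flag layout of the `LOAD` line concrete, here is a minimal sketch that splits such a line into flag/value pairs and applies the documented default of `false` for `-p`. It is an illustration only: `LoadArgsSketch` and `parse` are hypothetical names, and the hand-rolled loop is not how the tool itself parses arguments.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of pixels-load.
class LoadArgsSketch {
    // Splits "LOAD -f pixels -o ... -c 4" into flag -> value pairs.
    // Assumes every flag is followed by exactly one value.
    static Map<String, String> parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        Map<String, String> options = new HashMap<>();
        // Documented default: producer is optional and defaults to false.
        options.put("-p", "false");
        // tokens[0] is the command name (LOAD); flags start at index 1.
        for (int i = 1; i + 1 < tokens.length; i += 2) {
            options.put(tokens[i], tokens[i + 1]);
        }
        return options;
    }
}
```

With the second example line above (no `-p`), the sketch yields `4` for `-c` and the default `false` for `-p`.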

## Logs
## Where is the Log
Go to path `/home/iir/opt/presto-server-0.192/data/var/log/server.log`
17 changes: 17 additions & 0 deletions pixels-load/pom.xml
@@ -81,6 +81,23 @@
<optional>true</optional>
</dependency>

<!-- grpc -->
<dependency>
<groupId>io.grpc</groupId>
<artifactId>grpc-netty-shaded</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>io.grpc</groupId>
<artifactId>grpc-protobuf</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>io.grpc</groupId>
<artifactId>grpc-stub</artifactId>
<optional>true</optional>
</dependency>

<dependency>
<groupId>net.sourceforge.argparse4j</groupId>
<artifactId>argparse4j</artifactId>
@@ -40,7 +40,7 @@
* -p false [optional, default false]
* </p>
* <p>
- * LOAD -f orc -o hdfs://dbiir10:9000/pixels/pixels/test_105/source -d pixels -t test_105 -n 150000 -r \t -c 16
+ * LOAD -f orc -o hdfs://dbiir10:9000/pixels/pixels/test_105/source -d pixels -t test_105 -n 220000 -r \t -c 16
* -l hdfs://dbiir10:9000/pixels/pixels/test_105/v_0_order_orc/
* </p>
* [-l] is optimal, assign a path not the 'OrderPath' in db(Defined in Config.java)
@@ -148,7 +148,7 @@ public static void main(String args[])
int threadNum = Integer.valueOf(ns.getString("consumer_thread_num"));
boolean producer = ns.getBoolean("producer");

- BlockingQueue<Path> fileQueue = null;
+ BlockingQueue<Path> fileQueue;
ConfigFactory configFactory = ConfigFactory.Instance();
FSFactory fsFactory = FSFactory.Instance(configFactory.getProperty("hdfs.config.dir"));

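Dropping the `= null` initializer from `fileQueue` lets the Java compiler's definite-assignment check confirm that every code path assigns the queue before it is read. The sketch below illustrates that rule under stated assumptions: the branches that assign `fileQueue` are not visible in this hunk, so `DefiniteAssignmentSketch`, `chooseQueue`, and the two queue types are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration; the real branching logic is not shown in the diff.
class DefiniteAssignmentSketch {
    static BlockingQueue<String> chooseQueue(boolean producer, int capacity) {
        // No "= null" here: the compiler rejects any read of the variable
        // on a path where it has not been assigned.
        BlockingQueue<String> queue;
        if (producer) {
            queue = new ArrayBlockingQueue<>(capacity);
        } else {
            queue = new LinkedBlockingQueue<>();
        }
        // Removing either assignment above would make this line a compile error.
        return queue;
    }
}
```

With the `null` initializer, a forgotten branch would compile and surface later as a `NullPointerException`; without it, the mistake is caught at compile time.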
@@ -72,6 +72,8 @@ public void run()
conf.set("fs.file.impl", LocalFileSystem.class.getName());
FileSystem fs = FileSystem.get(URI.create(loadingDataPath), conf);
TypeDescription schema = TypeDescription.fromString(schemaStr);
+ System.out.println(schemaStr);
+ System.out.println(loadingDataPath);
VectorizedRowBatch rowBatch = schema.createRowBatch();
ColumnVector[] columnVectors = rowBatch.cols;

@@ -112,7 +112,7 @@ public static void main(String[] args)
}
}

- if (!command.equals("DDL") && !command.equals("LOAD"))
+ if (!command.equals("LOAD"))
{
System.out.println("Command error");
}
