docs: Move existing documentation into new Contributor Guide and add Getting Started section (apache#334)
andygrove authored and Steve Vaughan Jr committed Apr 29, 2024
1 parent a6088fc commit f474062
Showing 11 changed files with 344 additions and 140 deletions.
125 changes: 125 additions & 0 deletions .github/workflows/benchmark-tpch.yml
@@ -0,0 +1,125 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

name: TPC-H Correctness

concurrency:
  group: ${{ github.repository }}-${{ github.head_ref || github.sha }}-${{ github.workflow }}
  cancel-in-progress: true

on:
  push:
    paths-ignore:
      - "doc/**"
      - "docs/**"
      - "**.md"
  pull_request:
    paths-ignore:
      - "doc/**"
      - "docs/**"
      - "**.md"
  # manual trigger
  # https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow
  workflow_dispatch:

env:
  RUST_VERSION: nightly

jobs:
  prepare:
    name: Build native and prepare data
    runs-on: ubuntu-latest
    container:
      image: amd64/rust
      env:
        JAVA_VERSION: 11
    steps:
      - uses: actions/checkout@v4
      - name: Setup Rust & Java toolchain
        uses: ./.github/actions/setup-builder
        with:
          rust-version: ${{env.RUST_VERSION}}
          jdk-version: 11
      - name: Cache Maven dependencies
        uses: actions/cache@v4
        with:
          path: |
            ~/.m2/repository
            /root/.m2/repository
          key: ${{ runner.os }}-java-maven-${{ hashFiles('**/pom.xml') }}
          restore-keys: |
            ${{ runner.os }}-java-maven-
      - name: Cache TPC-H generated data
        id: cache-tpch-sf-1
        uses: actions/cache@v4
        with:
          path: ./tpch
          key: tpch-${{ hashFiles('.github/workflows/benchmark-tpch.yml') }}
      - name: Build Comet
        run: make release
      - name: Upload Comet native lib
        uses: actions/upload-artifact@v4
        with:
          name: libcomet-${{ github.run_id }}
          path: |
            core/target/release/libcomet.so
            core/target/release/libcomet.dylib
          retention-days: 1 # remove the artifact after 1 day, only valid for this workflow
          overwrite: true
      - name: Generate TPC-H (SF=1) table data
        if: steps.cache-tpch-sf-1.outputs.cache-hit != 'true'
        run: |
          cd spark && MAVEN_OPTS='-Xmx20g' ../mvnw exec:java -Dexec.mainClass="org.apache.spark.sql.GenTPCHData" -Dexec.classpathScope="test" -Dexec.cleanupDaemonThreads="false" -Dexec.args="--location `pwd`/.. --scaleFactor 1 --numPartitions 1 --overwrite"
          cd ..
  benchmark:
    name: Run TPCHQuerySuite
    runs-on: ubuntu-latest
    needs: [prepare]
    container:
      image: amd64/rust
    steps:
      - uses: actions/checkout@v4
      - name: Setup Rust & Java toolchain
        uses: ./.github/actions/setup-builder
        with:
          rust-version: ${{env.RUST_VERSION}}
          jdk-version: 11
      - name: Cache Maven dependencies
        uses: actions/cache@v4
        with:
          path: |
            ~/.m2/repository
            /root/.m2/repository
          key: ${{ runner.os }}-java-maven-${{ hashFiles('**/pom.xml') }}
          restore-keys: |
            ${{ runner.os }}-java-maven-
      - name: Restore TPC-H generated data
        id: cache-tpch-sf-1
        uses: actions/cache/restore@v4
        with:
          path: ./tpch
          key: tpch-${{ hashFiles('.github/workflows/benchmark-tpch.yml') }}
          fail-on-cache-miss: true # always cached: the prepare job generates the data if it does not already exist
      - name: Download Comet native lib
        uses: actions/download-artifact@v4
        with:
          name: libcomet-${{ github.run_id }}
          path: core/target/release
      - name: Run TPC-H queries
        run: |
          SPARK_HOME=`pwd` SPARK_TPCH_DATA=`pwd`/tpch/sf1_parquet ./mvnw -B -Prelease -Dsuites=org.apache.spark.sql.CometTPCHQuerySuite test
2 changes: 2 additions & 0 deletions .github/workflows/benchmark.yml
@@ -25,10 +25,12 @@ on:
  push:
    paths-ignore:
      - "doc/**"
      - "docs/**"
      - "**.md"
  pull_request:
    paths-ignore:
      - "doc/**"
      - "docs/**"
      - "**.md"
  # manual trigger
  # https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow
2 changes: 2 additions & 0 deletions .github/workflows/pr_build.yml
@@ -25,10 +25,12 @@ on:
  push:
    paths-ignore:
      - "doc/**"
      - "docs/**"
      - "**.md"
  pull_request:
    paths-ignore:
      - "doc/**"
      - "docs/**"
      - "**.md"
  # manual trigger
  # https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow
2 changes: 2 additions & 0 deletions .github/workflows/spark_sql_test.yml
@@ -25,10 +25,12 @@ on:
  push:
    paths-ignore:
      - "doc/**"
      - "docs/**"
      - "**.md"
  pull_request:
    paths-ignore:
      - "doc/**"
      - "docs/**"
      - "**.md"
  # manual trigger
  # https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow
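The `paths-ignore` change repeated across these workflows can be illustrated with a small sketch. Per the GitHub Actions documentation, a workflow with `paths-ignore` is skipped only when every changed file matches one of the patterns. The Python below is a simplified approximation of that matching, not GitHub's implementation: it handles only `*` and `**`, and the function names and the example file `docs/source/conf.py` are hypothetical.

```python
import re

# Ignore lists before and after this commit (patterns copied from the workflows).
OLD_IGNORE = ["doc/**", "**.md"]
NEW_IGNORE = ["doc/**", "docs/**", "**.md"]


def pattern_to_regex(pattern):
    """Translate a filter pattern to a regex (assumption: only '*' and '**' are special)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")     # '**' may cross directory separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # '*' stays within a single path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def workflow_runs(changed_files, ignore):
    """Return True if at least one changed file matches none of the ignore patterns."""
    regexes = [pattern_to_regex(p) for p in ignore]
    return any(
        not any(r.fullmatch(path) for r in regexes)
        for path in changed_files
    )


if __name__ == "__main__":
    docs_only = ["docs/source/conf.py"]  # hypothetical changed file
    print(workflow_runs(docs_only, OLD_IGNORE))  # True
    print(workflow_runs(docs_only, NEW_IGNORE))  # False
```

Under these assumptions, a change that touches only a non-Markdown file under `docs/` triggered the workflows before this commit and no longer does.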
109 changes: 0 additions & 109 deletions EXPRESSIONS.md

This file was deleted.

52 changes: 52 additions & 0 deletions docs/source/contributor-guide/contributing.md
@@ -0,0 +1,52 @@
<!---
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Contributing to Apache DataFusion Comet

We welcome contributions to Comet in many areas, and encourage new contributors to get involved.

Here are some areas where you can help:

- Testing Comet with existing Spark jobs and reporting any bugs or performance problems
- Contributing code to support Spark expressions, operators, and data types that are not currently supported
- Reviewing pull requests and helping to test new features for correctness and performance
- Improving documentation

## Finding issues to work on

We maintain a [list of good first issues](https://github.com/apache/datafusion-comet/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) on GitHub.

## Reporting issues

We use [GitHub issues](https://github.com/apache/datafusion-comet/issues) for bug reports and feature requests.

## Asking for help

The Comet project uses the same Slack and Discord channels as the main Apache DataFusion project. See details at
[Apache DataFusion Communications]. There are dedicated Comet channels in both Slack and Discord.

## Regular public meetings

The Comet contributors hold regular video calls where new and current contributors are welcome to ask questions and
coordinate on issues that they are working on.

See the [Apache DataFusion Comet community meeting] Google document for more information.

[Apache DataFusion Communications]: https://datafusion.apache.org/contributor-guide/communication.html
[Apache DataFusion Comet community meeting]: https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit?usp=sharing