This repository has been archived by the owner on Apr 11, 2024. It is now read-only.

Commit

Updated versions of Metabase, DuckDB. Added script to build Ubuntu based Metabase Docker image.
AlexR2D2 committed Oct 5, 2022
1 parent bcd50ff commit 8a9c6b4
Showing 8 changed files with 299 additions and 17 deletions.
30 changes: 20 additions & 10 deletions README.md
@@ -55,19 +55,19 @@ You require metabase to be installed alongside of your project
1. cd metabase-duckdb-driver/..
2. execute

```
git clone https://github.com/metabase/metabase
cd metabase
clojure -X:deps prep
cd modules/drivers
clojure -X:deps prep
cd ../../../metabase-duckdb-driver
```
```bash
git clone https://github.com/metabase/metabase
cd metabase
clojure -X:deps prep
cd modules/drivers
clojure -X:deps prep
cd ../../../metabase-duckdb-driver
```

### Build

1. modify :paths in deps.edn, make them absolute
2. `$ `clojure -X:build :project-dir "\"$(pwd)\""`
2. `clojure -X:build :project-dir "\"$(pwd)\""`

This will build a file called `target/duckdb.metabase-driver.jar`; copy this to your Metabase `./plugins` directory.

Expand All @@ -89,10 +89,20 @@ Because of feature of DuckDB allowing you [to run SQL queries directly on Parque

For example (somewhere in Metabase SQL Query editor):

```
```sql
-- DuckDB selected as source

SELECT originalTitle, startYear, genres, numVotes, averageRating from '/Users/you/movies/title.basics.parquet' x
JOIN (SELECT * from '/Users/you/movies/title.ratings.parquet') y ON x.tconst = y.tconst
ORDER BY averageRating * numVotes DESC
```

## Docker

Unfortunately, the DuckDB plugin doesn't work in the default Alpine-based Metabase Docker container due to glibc problems, but it does work in an Ubuntu-based Metabase Docker image. The Dockerfiles for such an image are in the `docker` folder of this project, and `build_image.sh` builds them. Make sure the Docker daemon is running on your host, then run:

```bash
./build_image.sh
```

After a while, this produces `metabase_duckdb`, an Ubuntu-based Metabase image with the DuckDB plugin installed. Run a container from this image and expose port 3000.
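
As a minimal sketch of that last step, assuming the image was built under the name `metabase_duckdb` (the value of `MB_DUCKDB_IMAGE_NAME` in `build_image.sh`): the container name and the detached `-d` flag below are illustrative choices, not something this project prescribes. The script only prints the command so it can be reviewed before it is run.

```bash
# Image name as set by MB_DUCKDB_IMAGE_NAME in build_image.sh.
IMAGE_NAME=metabase_duckdb
# Hypothetical container name; any unused name works.
CONTAINER_NAME=metabase_duckdb
# Host port to publish; the image exposes Metabase on port 3000.
HOST_PORT=3000

RUN_CMD="docker run -d --name ${CONTAINER_NAME} -p ${HOST_PORT}:3000 ${IMAGE_NAME}"

# Dry run: print the command. To start the container, run: eval "$RUN_CMD"
echo "$RUN_CMD"
```

Once the container is up, Metabase should be reachable on port 3000 of the host. Since `docker/Dockerfile` copies `target/duckdb.metabase-driver.jar` into the image, the driver jar most likely has to be built before `./build_image.sh` is run.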
24 changes: 24 additions & 0 deletions build_image.sh
@@ -0,0 +1,24 @@
# Metabase source code
MB_SRC_FOLDER=docker/metabase/source
MB_GIT_URL=https://github.com/metabase/metabase.git
MB_IMAGE_NAME=ubuntu_metabase
MB_DUCKDB_IMAGE_NAME=metabase_duckdb

# Clone metabase source code
if [ ! -d "$MB_SRC_FOLDER" ] ; then
git clone $MB_GIT_URL $MB_SRC_FOLDER
else
git -C $MB_SRC_FOLDER fetch
git -C $MB_SRC_FOLDER reset --hard HEAD
git -C $MB_SRC_FOLDER merge origin/master
fi

# Copy ubuntu based docker files/sh script into source code of Metabase
yes | cp -rf docker/metabase/Dockerfile $MB_SRC_FOLDER
yes | cp -rf docker/metabase/bin/docker/* $MB_SRC_FOLDER/bin/docker

# Build the Metabase Ubuntu based docker image
docker build -t $MB_IMAGE_NAME -f $MB_SRC_FOLDER/Dockerfile .

# Build the Metabase image with DuckDB plugin
docker build -t $MB_DUCKDB_IMAGE_NAME -f docker/Dockerfile .
12 changes: 6 additions & 6 deletions deps.edn
@@ -11,16 +11,16 @@
; replace also the version in metabase-plugin.yaml
metabase/metabase-core {
:git/url "https://github.com/metabase/metabase.git"
:git/tag "v0.43.0"
:git/sha "ee686fcfe5"
:git/tag "v1.44.3"
:git/sha "7d50282"
}
metabase/build-drivers {
:git/url "https://github.com/metabase/metabase.git"
:git/tag "v0.43.0"
:git/sha "ee686fcfe5"
:git/tag "v1.44.3"
:git/sha "7d50282"
:deps/root "bin/build-drivers"
}
org.duckdb/duckdb_jdbc {:mvn/version "0.4.0"}
org.duckdb/duckdb_jdbc {:mvn/version "0.5.1"}
}

; build the driver with `clojure -X:build :project-dir "\"$(pwd)\""`
@@ -33,7 +33,7 @@
}
; We don't want to include metabase nor clojure in the uber jar
:oss {:replace-deps {
org.duckdb/duckdb_jdbc {:mvn/version "0.4.0"}
org.duckdb/duckdb_jdbc {:mvn/version "0.5.1"}
}}
}
}
4 changes: 4 additions & 0 deletions docker/Dockerfile
@@ -0,0 +1,4 @@
FROM ubuntu_metabase

COPY target/duckdb.metabase-driver.jar /plugins/
RUN chmod 744 /plugins/duckdb.metabase-driver.jar
46 changes: 46 additions & 0 deletions docker/metabase/Dockerfile
@@ -0,0 +1,46 @@
###################
# STAGE 1: builder
###################

FROM metabase/ci:java-11-clj-1.11.0.1100.04-2022-build as builder

ARG MB_EDITION=oss

WORKDIR /home/circleci

COPY --chown=circleci . .
RUN INTERACTIVE=false CI=true MB_EDITION=$MB_EDITION bin/build

# ###################
# # STAGE 2: runner
# ###################

## Remember that this runner image needs to be the same as bin/docker/Dockerfile with the exception that this one grabs the
## jar from the previous stage rather than the local build
## we're not yet there to provide an ARM runner till https://github.com/adoptium/adoptium/issues/96 is ready

FROM --platform=linux/amd64 eclipse-temurin:11-jre as runner

ENV FC_LANG en-US LC_CTYPE en_US.UTF-8

# dependencies
RUN apt-get update && \
apt-get install -y bash fonts-dejavu curl ca-certificates-java && \
apt-get upgrade -y && \
rm -rf /var/cache/apt-get/* && \
mkdir -p /app/certs && \
curl https://s3.amazonaws.com/rds-downloads/rds-combined-ca-bundle.pem -o /app/certs/rds-combined-ca-bundle.pem && \
/opt/java/openjdk/bin/keytool -noprompt -import -trustcacerts -alias aws-rds -file /app/certs/rds-combined-ca-bundle.pem -keystore /etc/ssl/certs/java/cacerts -keypass changeit -storepass changeit && \
curl https://cacerts.digicert.com/DigiCertGlobalRootG2.crt.pem -o /app/certs/DigiCertGlobalRootG2.crt.pem && \
/opt/java/openjdk/bin/keytool -noprompt -import -trustcacerts -alias azure-cert -file /app/certs/DigiCertGlobalRootG2.crt.pem -keystore /etc/ssl/certs/java/cacerts -keypass changeit -storepass changeit && \
mkdir -p /plugins && chmod a+rwx /plugins

# add Metabase script and uberjar
COPY --from=builder /home/circleci/target/uberjar/metabase.jar /app/
COPY bin/docker/run_metabase.sh /app/

# expose our default runtime port
EXPOSE 3000

# run it
ENTRYPOINT ["/app/run_metabase.sh"]
25 changes: 25 additions & 0 deletions docker/metabase/bin/docker/Dockerfile
@@ -0,0 +1,25 @@
FROM --platform=linux/amd64 eclipse-temurin:11-jre as runner

ENV FC_LANG en-US LC_CTYPE en_US.UTF-8

# dependencies
RUN apt-get update && \
apt-get install -y bash fonts-dejavu curl ca-certificates-java && \
apt-get upgrade -y && \
rm -rf /var/cache/apt-get/* && \
mkdir -p /app/certs && \
curl https://s3.amazonaws.com/rds-downloads/rds-combined-ca-bundle.pem -o /app/certs/rds-combined-ca-bundle.pem && \
/opt/java/openjdk/bin/keytool -noprompt -import -trustcacerts -alias aws-rds -file /app/certs/rds-combined-ca-bundle.pem -keystore /etc/ssl/certs/java/cacerts -keypass changeit -storepass changeit && \
curl https://cacerts.digicert.com/DigiCertGlobalRootG2.crt.pem -o /app/certs/DigiCertGlobalRootG2.crt.pem && \
/opt/java/openjdk/bin/keytool -noprompt -import -trustcacerts -alias azure-cert -file /app/certs/DigiCertGlobalRootG2.crt.pem -keystore /etc/ssl/certs/java/cacerts -keypass changeit -storepass changeit && \
mkdir -p /plugins && chmod a+rwx /plugins

# add Metabase script and uberjar
COPY --from=builder /home/circleci/target/uberjar/metabase.jar /app/
COPY bin/docker/run_metabase.sh /app/

# expose our default runtime port
EXPOSE 3000

# run it
ENTRYPOINT ["/app/run_metabase.sh"]
173 changes: 173 additions & 0 deletions docker/metabase/bin/docker/run_metabase.sh
@@ -0,0 +1,173 @@
#!/bin/bash

# if nobody manually set a host to listen on then go with all available interfaces and host names
if [ -z "$MB_JETTY_HOST" ]; then
export MB_JETTY_HOST=0.0.0.0
fi

# Setup Java Options
JAVA_OPTS="${JAVA_OPTS} -XX:+IgnoreUnrecognizedVMOptions"
JAVA_OPTS="${JAVA_OPTS} -Dfile.encoding=UTF-8"
JAVA_OPTS="${JAVA_OPTS} -Dlogfile.path=target/log"
JAVA_OPTS="${JAVA_OPTS} -XX:+CrashOnOutOfMemoryError"
JAVA_OPTS="${JAVA_OPTS} -server"

if [ ! -z "$JAVA_TIMEZONE" ]; then
JAVA_OPTS="${JAVA_OPTS} -Duser.timezone=${JAVA_TIMEZONE}"
fi

# usage: file_env VAR [DEFAULT]
# ie: file_env 'XYZ_DB_PASSWORD' 'example'
# (will allow for "$XYZ_DB_PASSWORD_FILE" to fill in the value of
# "$XYZ_DB_PASSWORD" from a file, especially for Docker's secrets feature)
# taken from https://github.com/docker-library/postgres/blob/master/docker-entrypoint.sh
# This is the specific function that takes the env var which has a "_FILE" at the end and transforms that into a normal env var.
file_env() {
local var="$1"
local fileVar="${var}_FILE"
local def="${2:-}"
if [ "${!var:-}" ] && [ "${!fileVar:-}" ]; then
echo >&2 "error: both $var and $fileVar are set (but are exclusive)"
exit 1
fi
local val="$def"
if [ "${!var:-}" ]; then
val="${!var}"
elif [ "${!fileVar:-}" ]; then
val="$(< "${!fileVar}")"
fi
export "$var"="$val"
unset "$fileVar"
}

# Here we define which env vars are the ones that will be supported with a "_FILE" ending. We started with the ones that would contain sensitive data
docker_setup_env() {
file_env 'MB_DB_USER'
file_env 'MB_DB_PASS'
file_env 'MB_DB_CONNECTION_URI'
file_env 'MB_EMAIL_SMTP_PASSWORD'
file_env 'MB_EMAIL_SMTP_USERNAME'
file_env 'MB_LDAP_PASSWORD'
file_env 'MB_LDAP_BIND_DN'
}

# detect if the container is started as root or not
# if non-root, it's likely we run in a k8s environment with well maintained permissions
# if root, we need to check some permissions in order to exec metabase with a non-root user
# In that case, the container is run as root, metabase is run as a non-root user
# Also, we call the docker_setup_env function before Metabase starts so it takes the Docker secrets in case there are any
if [ $(id -u) -ne 0 ]; then
# Launch the application
# exec is here twice on purpose to ensure that metabase runs as PID 1 (the init process)
# and thus receives signals sent to the container. This allows it to shutdown cleanly on exit
docker_setup_env
exec /bin/sh -c "exec java $JAVA_OPTS -jar /app/metabase.jar $@"
else
# Avoid running metabase (or any server) as root where possible
# If you want to use specific user and group ids that match an existing
# account on the host, pass them in $MGID and $MUID when starting metabase
MGID=${MGID:-2000}
MUID=${MUID:-2000}
#
## create the group if it does not exist
## TODO: edit an existing group if MGID has changed
getent group metabase > /dev/null 2>&1
group_exists=$?
if [ $group_exists -ne 0 ]; then
addgroup --system --gid $MGID metabase
fi

# create the user if it does not exist
# TODO: edit an existing user if MUID has changed
id -u metabase > /dev/null 2>&1
user_exists=$?
if [[ $user_exists -ne 0 ]]; then
adduser --disabled-password --disabled-login --gecos "" --uid $MUID --ingroup metabase metabase
fi

db_file=${MB_DB_FILE:-/metabase.db}

# In order to run metabase as a non-root user in docker, we need to handle various
# cases where we were previously run as root and have an existing database that
# consists of a bunch of files, that are owned by root, sitting in a directory that
# may only be writable by root. It's not safe to simply change the ownership or
# permissions of an unknown directory that may be a volume mounted on the host, so
# we will need to detect this and make a place that is going to be safe to set
# permissions on.

# So first some preliminary checks:

# 1. Does this container have an existing H2 database file?
# 2. or an existing H2 database in its own directory,
# 3. or neither?


# is there a pre-existing files only database without a metabase specific directory?
if ls $db_file\.* > /dev/null 2>&1; then
db_exists=true
else
db_exists=false
fi
# is it an old style file
if [[ -d "$db_file" ]]; then
db_directory=true
else
db_directory=false
fi

# If the db exists, and it's just some files in a shared directory, we could do
# serious damage to people's home or /tmp directories if we were to set the
# permissions on that directory to allow metabase to create db-lock and db-part
# files there. To keep them safe we make a new directory with the same name and
# move the db file into the new directory. If we were passed the name of a
# directory rather than a specific file, then we are safe to set permissions on
# that directory so there is no need to move anything.

# an example file would look like /tmp/metabase.db/metabase.db.mv.db
new_db_dir=$(dirname $db_file)/$(basename $db_file)

if [[ $db_exists = "true" && ! $db_directory = "true" ]]; then
mkdir $new_db_dir
mv $db_file\.* $new_db_dir/
fi

# and for the new install case we create the directory
if [[ $db_exists = "false" && $db_directory = "false" ]]; then
mkdir $new_db_dir
fi

# the case where the DB exists and is a directory, there is nothing to do
# so nothing happens here. This will be the normal case.

# next we tell metabase to use the files we just moved into the directory
# or create the files in that directory if they don't exist.
docker_setup_env
export MB_DB_FILE=$new_db_dir/$(basename $db_file)

# TODO: print big scary warning if they are configuring an ephemeral instance

chown metabase:metabase $new_db_dir $new_db_dir/* 2>/dev/null # all that fussing makes this safe

# Ensure JAR file is world readable
chmod o+r /app/metabase.jar

# Initialize the Metabase db from H2 dump, if available
INITIAL_DB=$(ls /app/initial*.db 2> /dev/null | head -n 1)
if [ -f "${INITIAL_DB}" ]; then
echo "Initializing Metabase database from H2 database ${INITIAL_DB}..."
chmod o+r ${INITIAL_DB}
su metabase -s /bin/sh -c "exec java $JAVA_OPTS -jar /app/metabase.jar load-from-h2 ${INITIAL_DB%.mv.db} $@"

if [ $? -ne 0 ]; then
echo "Failed to initialize database from H2 database!"
exit 1
fi

echo "Done."
fi

# Launch the application
# exec is here twice on purpose to ensure that metabase runs as PID 1 (the init process)
# and thus receives signals sent to the container. This allows it to shutdown cleanly on exit
exec su metabase -s /bin/sh -c "exec java $JAVA_OPTS -jar /app/metabase.jar $@"
fi
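
The `_FILE` convention handled by `file_env` can be tried outside a container. The sketch below copies the function from the script above and feeds it a temporary file standing in for a Docker secret; the password value and the use of `mktemp` are assumptions made for illustration only.

```bash
#!/bin/bash
# file_env, copied from run_metabase.sh: fills VAR from the file named in VAR_FILE.
file_env() {
    local var="$1"
    local fileVar="${var}_FILE"
    local def="${2:-}"
    if [ "${!var:-}" ] && [ "${!fileVar:-}" ]; then
        echo >&2 "error: both $var and $fileVar are set (but are exclusive)"
        exit 1
    fi
    local val="$def"
    if [ "${!var:-}" ]; then
        val="${!var}"
    elif [ "${!fileVar:-}" ]; then
        val="$(< "${!fileVar}")"
    fi
    export "$var"="$val"
    unset "$fileVar"
}

# Stand-in for a Docker secret: a temporary file holding a made-up password.
secret_file="$(mktemp)"
printf 'example-password' > "$secret_file"

# Only the _FILE variant is set, as it would be when using secrets.
unset MB_DB_PASS
export MB_DB_PASS_FILE="$secret_file"

file_env 'MB_DB_PASS'

# MB_DB_PASS now holds the file's contents and MB_DB_PASS_FILE is unset.
echo "MB_DB_PASS=$MB_DB_PASS"
echo "MB_DB_PASS_FILE=${MB_DB_PASS_FILE:-<unset>}"

rm -f "$secret_file"
```

In a real deployment the file would typically be a mounted secret, and `MB_DB_PASS_FILE` would be passed to the container as an environment variable in place of `MB_DB_PASS`. Setting both makes the script exit with an error, as the function shows.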
2 changes: 1 addition & 1 deletion resources/metabase-plugin.yaml
@@ -1,6 +1,6 @@
info:
name: Metabase DuckDB Driver
version: 1.0.0-SNAPSHOT-0.1.1
version: 1.0.0-SNAPSHOT-0.1.2
description: Allows Metabase to connect to DuckDB databases.
contact-info:
name: Alexander Golubov