Docker
You need Docker Engine installed. If you don't have it, check its installation guide at https://docs.docker.com/engine/install/.
You should then have the docker-compose.yml and qa-catalogue files. You can get them in one of two ways:
Either: via cloning the repository:
git clone https://github.com/pkiraly/qa-catalogue.git
cd qa-catalogue
or
wget https://github.com/pkiraly/qa-catalogue/archive/refs/heads/main.zip
unzip main
cd qa-catalogue-main
Or: by downloading only the necessary files:
wget https://raw.githubusercontent.com/pkiraly/qa-catalogue/main/docker-compose.yml
mkdir docker
cd docker
wget https://raw.githubusercontent.com/pkiraly/qa-catalogue/main/docker/qa-catalogue
chmod +x qa-catalogue
cd ..
Docker has two main concepts: an image is a packaged version of the application (think of it as an installation package), and a container is an instance of the executable application. QA catalogue's container contains a full Ubuntu operating system and all the components the application needs: Java, PHP, R, Apache web server, Apache Solr, SQLite, the application itself, and a default configuration.
The following command downloads the image and creates a container with default values:
docker compose up -d
If you would like to modify the configuration, you have three options: a) global environment variables, b) local environment variables, c) storing the variables in a .env file.
a) global environment variables
export WEBPORT=9000
export CONTAINER=qa-catalogue
docker compose up -d
b) local environment variables
WEBPORT=9000 CONTAINER=qa-catalogue docker compose up -d
c) storing the variables in a .env file
The file should be named .env. Here is a sample .env file:
IMAGE="ghcr.io/pkiraly/qa-catalogue:main"
WEBPORT=9000
CONTAINER=qa-catalogue
Once you have saved it, you can run
docker compose up -d
It is also possible to reference a .env file explicitly with the --env-file option.
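As a minimal sketch of the --env-file option, the function below starts the stack with settings read from an alternative file. The function name start_with_env_file and the default file name qa-catalogue.env are illustrative assumptions, not names used by the project.

```shell
# Hypothetical helper: start the stack with settings read from a given env file.
# The default file name (qa-catalogue.env) is an assumption made for illustration.
start_with_env_file() {
    env_file="${1:-qa-catalogue.env}"
    if [ ! -f "$env_file" ]; then
        echo "env file not found: $env_file" >&2
        return 1
    fi
    # --env-file is a global option of docker compose, so it precedes "up".
    docker compose --env-file "$env_file" up -d
}
```

Called as start_with_env_file ./qa-catalogue.env, it refuses to run when the file is missing instead of silently falling back to default values.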
The WEBCONFIG variable contains the name of a directory that holds a configuration.cnf file, which the web application will use.
A sample web-config/configuration.cnf file:
default-tab=completeness
label=My custom Catalogue
url=https://my-catalogue.org
linkTemplate=https://my-catalogue.org/catalogue/{id}
language=de
Check the documentation of configuration parameters of QA catalogue UI.
The properties of the library are: label, url, schema, language and linkTemplate. The rest configure the behaviour of the application.
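The configuration directory can be prepared from the shell before the container is started. This is only a sketch: it assumes that WEBCONFIG accepts a path relative to the current directory (here ./web-config, matching the sample path above), so check the project's docker-compose.yml for the exact form it expects.

```shell
# Create the directory that WEBCONFIG will point to (the path is an assumption).
mkdir -p web-config

# Write the sample settings shown above into configuration.cnf.
cat > web-config/configuration.cnf <<'EOF'
default-tab=completeness
label=My custom Catalogue
url=https://my-catalogue.org
linkTemplate=https://my-catalogue.org/catalogue/{id}
language=de
EOF

# Make the directory name available to a later "docker compose up -d".
export WEBCONFIG=./web-config
```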
The INPUT
variable stores the directory where the bibliographic files take place. It should be inside your current directory, but it might be a linked directory. The default value is ./input
. In the following we suppose that you have a file ./input/rug01.backup.gz
, that contains bibliographic records in a gzipped alephseq format, and it has some MARC data elements defined locally in Gent university library.
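If the bibliographic files live elsewhere on disk, one way to make them appear inside the current directory is a symbolic link. The sketch below assumes that "linked directory" means a symbolic link; the function name link_input_dir is hypothetical, and whether Docker resolves the link on your platform should be verified.

```shell
# Hypothetical helper: expose an existing data directory as ./input,
# the default value of INPUT. $1 is the directory that holds the files.
link_input_dir() {
    data_dir="$1"
    if [ ! -d "$data_dir" ]; then
        echo "not a directory: $data_dir" >&2
        return 1
    fi
    # Do not overwrite an existing ./input (file, directory, or link).
    if [ -e input ] || [ -L input ]; then
        echo "./input already exists" >&2
        return 1
    fi
    ln -s "$data_dir" input
}
```

After link_input_dir /path/to/your/data, the file used in this guide should be visible as ./input/rug01.backup.gz.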
At the end of the process we will have the image and a running container. You can check these with the docker images -a and docker ps -a commands.
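The container check can also be scripted. The helper below is an illustrative sketch (the function name is an assumption). It uses the --filter and --format options of docker ps and falls back to qa-catalogue, the container name used in the examples above, when CONTAINER is unset.

```shell
# Succeeds only if a running container is named exactly $CONTAINER.
container_is_running() {
    name="${CONTAINER:-qa-catalogue}"
    # "docker ps" without -a lists running containers only. The name filter
    # matches substrings, so grep -x enforces an exact, literal match.
    docker ps --filter "name=$name" --format '{{.Names}}' | grep -qxF "$name"
}
```

For example: container_is_running && echo "container is up".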
You can reach the web interface (qa-catalogue-web) at http://localhost:80/ (or at another port, as configured with the environment variable WEBPORT).
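In a script, the address can be derived from WEBPORT. This is a small sketch that takes port 80 as the default, as stated above; the function name web_url is hypothetical.

```shell
# Print the URL of the web interface for the current WEBPORT setting.
web_url() {
    # Use WEBPORT when it is set and non-empty, otherwise port 80.
    echo "http://localhost:${WEBPORT:-80}/"
}
```

The result can be passed to a client, for example curl -I "$(web_url)", to see whether the web server responds.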
To update, stop the running container and pull the latest image, e.g.
docker pull ghcr.io/pkiraly/qa-catalogue:main
Then start a new container as described above.
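The whole update cycle can be wrapped in one function. This is a sketch under assumptions: it is run from the directory that holds docker-compose.yml, the function name update_qa_catalogue is hypothetical, and the image name is taken from the shell variable IMAGE if set, otherwise from the pull command above.

```shell
# Stop the running container, fetch the newer image, start a new container.
update_qa_catalogue() {
    image="${IMAGE:-ghcr.io/pkiraly/qa-catalogue:main}"
    # Each step runs only if the previous one succeeded.
    docker compose stop &&
    docker pull "$image" &&
    docker compose up -d
}
```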
Once we have the running container, we can run the analyses.
./docker/qa-catalogue \
--params "--marcVersion GENT --alephseq" \
--input-dir "" \
--mask "rug01.backup.gz" \
--catalogue gent \
completeness
The script uses a single Docker variable: CONTAINER. If you set it for the first docker command, use the same value here.
The command runs the completeness analysis of our input file. --params contains the catalogue-specific parameters; here we have two: marcVersion specifies the locally defined data elements, and alephseq specifies a specific serialization format. input-dir indicates that there is no extra subdirectory within the host's (the local machine's) input directory (which is mapped to /opt/qa-catalogue/marc/ within the container). mask is a file name pattern; if we have multiple files we can use Linux substitution characters such as "*" and ".". The last part, completeness, is the name of the analysis to run.
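To illustrate how a mask with a wildcard selects files, the sketch below applies a pattern to a few file names using the shell's own pattern matching. It assumes that the mask behaves like a shell glob; the function name matches_mask and the file names other than rug01.backup.gz are made up for the example.

```shell
# Return success when file name $2 matches pattern $1.
matches_mask() {
    case "$2" in
        $1) return 0 ;;   # unquoted, so $1 is interpreted as a pattern
        *)  return 1 ;;
    esac
}

# Try one mask against three hypothetical file names.
for name in rug01.backup.gz rug02.backup.gz notes.txt; do
    if matches_mask "rug*.backup.gz" "$name"; then
        echo "$name: selected"
    else
        echo "$name: skipped"
    fi
done
```

Quoting the mask on the command line, as in --mask "rug01.backup.gz" above, keeps the local shell from expanding the wildcard before the script sees it.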