Thales is a metadata record management Web application focusing on research data collections. It allows metadata records to be created and edited. The metadata records are published on an OAI-PMH machine-readable feed in the RIF-CS format for other systems to harvest.
Thales was developed for an Australian National Data Service (ANDS) funded metadata stores project. Currently, it is designed to be deployed as an internal system where people using the Web interface must have a login account. The only publicly accessible feature is the machine-readable feed. It was originally developed as a local metadata store for manually edited metadata records; the institutional metadata store that harvests these metadata records provided the publicly accessible view of the metadata records.
This guide describes how to install Thales for either development or production use. It also provides example instructions on setting up the software Thales depends upon.
It is intended for people who install the application.
The Thales User Guide describes how to use it to create and modify metadata records.
The Thales Admin Guide describes how to set up and manage Thales.
The Thales Development Guide describes the import/export file format.
Thales is a Ruby on Rails Web application. It can run on any platform that supports Ruby on Rails. By default, it is configured to use PostgreSQL.
Mandatory:
- Git
- PostgreSQL (or another Ruby on Rails supported database)
- Ruby 1.9.3 (Note: dependent Gems do not yet work with Ruby 2.0.)
- Bundler (or manage the gems manually)
Bundler will be used to install the required gems, such as: Ruby on Rails (3.2.11), nokogiri, pg and ruby-oai.
Optional:
See [Example platform setup] for instructions on using some Linux distributions as the platform.
-
Obtain a copy of the application:
git clone https://github.com/uq-eresearch/thales.git
Note: if deploying a production installation, consider carefully where the application will be installed, because the reverse proxy will need to access some of the files but usually runs as a different user from the owner of the files. For a production deployment, consider installing it somewhere like /opt, where the permissions can allow all users read access.
-
Change into the project directory:
cd thales
-
(Optional) If using RVM, create a project .rvmrc file:
rvm --rvmrc --create ruby-1.9.3@thales
-
Install gems:
bundle install
This will install Ruby on Rails, as well as the other required gems. If this is for a production-only installation, unnecessary gems can be excluded by using:
bundle install --without="development test"
-
Configure the connection between the application and PostgreSQL
There are different ways of doing this. These instructions will assume PostgreSQL is running on the same machine as the Web application and password authentication is used. Another common method is to use trust authentication, where the database username is the same as the operating system login username. See the PostgreSQL documentation on client authentication for more details.
a. Choose a database user name. These instructions use "thales" as the database user name, but any database user name can be used.
If the database user name is the same as the login user name, nothing more needs to be done, since by default PostgreSQL is set up to accept Unix socket connections from local users. If the database user name is different from the login user name, continue with the next step.
b. Edit the PostgreSQL client authentication configuration file. The default allows local access (i.e. via Unix-domain sockets) for all users to all databases via the peer authentication method (i.e. the operating system user name matches the database user name), but it will be edited to add password authentication.
sudoedit /var/lib/pgsql/data/pg_hba.conf
Add the following line _before_ the other "local" entries. It allows local access (i.e. via Unix-domain sockets) to all databases for the user "thales" if they can be authenticated using the md5 method (i.e. can present an MD5 digest of the correct password).
local all thales md5
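The edit to the client authentication file can also be scripted. The following is a minimal sketch and not part of Thales: the function name add_md5_rule is hypothetical, and it treats any line whose first field is "local" as a "local" entry. It writes the result to standard output so that it can be reviewed before it replaces the real file.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of a pg_hba.conf file with a
# password (md5) rule for one database user inserted before the first
# uncommented "local" entry. If the file has no "local" entry, the
# rule is appended at the end.
add_md5_rule() {
    hba_file=$1
    db_user=$2
    awk -v rule="local all $db_user md5" '
        !inserted && $1 == "local" { print rule; inserted = 1 }
        { print }
        END { if (!inserted) print rule }
    ' "$hba_file"
}

# Example: review the result before copying it over the real file.
# add_md5_rule /var/lib/pgsql/data/pg_hba.conf thales
```

Printing to standard output, rather than editing in place, keeps the original file untouched until the new content has been checked.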
-
Start or restart PostgreSQL.
If PostgreSQL is already running, restart it instead (use restart in place of start in the command below), so that it reloads the client authentication configuration file.
sudo service postgresql start
Please continue with the [Development installation] or [Production installation] instructions.
These steps continue on from the [Common to both development and production installations].
-
Create PostgreSQL user with the CREATEDB role.
$ sudo -u postgres psql
postgres=# CREATE USER thales WITH CREATEDB PASSWORD 'p@ssw0rd';
postgres=# \du
postgres=# \q
If using PostgreSQL _trust authentication_, the password can be omitted.
Note: The CREATEDB attribute allows the user to create new databases (and to drop the databases it owns). This can be a security risk, so consider whether it is suitable for your setup before using it. CREATEDB is needed if you want to run the RSpec tests and the db:create Rake task.
-
Edit configuration
Edit config/database.yml to set the username and password to the new PostgreSQL username and password.
vi config/database.yml
Do this for the development and test environments.
development:
  ...
  username: thales
  password: p@ssw0rd
  ...
test:
  ...
  username: thales
  password: p@ssw0rd
  ...
-
Create, define and populate the development database.
rake db:create
rake db:migrate
rake db:seed
-
Run the RSpec tests.
rake spec
The tests should run without producing any failures.
-
If using the Unicorn Web server for development testing, create the directory to hold the PID file (as specified in the Unicorn config file, config/unicorn.rb) and make sure it is writable by the Unicorn worker process. This is important because Unicorn will fail if the directory does not exist.
mkdir -p tmp/pids
If using the WEBrick Web server, this is not necessary because it will automatically create the temporary directories it needs.
-
Start the development Rails server (see [Starting the Rails server]).
The application can be accessed by visiting http://localhost:3000/ (replacing "localhost" with the correct hostname, if necessary.) The initial account has the user name of "root" and no password.
The OAI-PMH feed will be available at http://localhost:3000/oaipmh (replacing "localhost" with the correct hostname.)
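To check the feed from the command line, an OAI-PMH request can be sent with curl. The sketch below is illustrative only: oaipmh_url is a hypothetical helper, and Identify and ListMetadataFormats are standard OAI-PMH verbs rather than anything specific to Thales.

```shell
#!/bin/sh
# Hypothetical helper: build an OAI-PMH request URL from the base URL
# of the application and an OAI-PMH verb.
oaipmh_url() {
    base=${1%/}    # drop one trailing slash, if present
    printf '%s/oaipmh?verb=%s\n' "$base" "$2"
}

# Example usage against a running development server:
# curl -s "$(oaipmh_url http://localhost:3000 Identify)"
# curl -s "$(oaipmh_url http://localhost:3000 ListMetadataFormats)"
```

Each request should return an XML document if the feed is running.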
These steps continue on from the [Common to both development and production installations].
These steps assume the production Web application is run using the Unicorn HTTP server and nginx is used to provide TLS/SSL security.
-
Create PostgreSQL user and the production database.
$ sudo -u postgres psql
postgres=# CREATE USER thales WITH PASSWORD 'rte0vurt3spes6ryqu5mo9hy9pybi2ra';
postgres=# \du
postgres=# CREATE DATABASE thales OWNER thales;
postgres=# \l
postgres=# \q
If using PostgreSQL trust authentication, the password can be omitted.
Note: to improve security, this database user (unlike the one used in development) does not have the CREATEDB role.
-
Edit configuration
Edit config/database.yml to set the username and password to the new PostgreSQL username and password.
vi config/database.yml
Do this for the production environment.
production:
  ...
  username: thales
  password: rte0vurt3spes6ryqu5mo9hy9pybi2ra
  ...
-
Define and populate the database.
RAILS_ENV=production rake db:migrate
RAILS_ENV=production rake db:seed
-
Precompile the assets.
RAILS_ENV=production rake assets:precompile
-
Create a wrapper script so that Unicorn can be run in the correct RVM gemset. The second argument is the prefix used to name the wrapper script.
rvm wrapper `rvm current` thales unicorn
This will create a wrapper script called ~/.rvm/bin/thales_unicorn.
-
Edit the Unicorn configuration file. Set the following:
-
THALES_PROJ_DIR to where the sources have been installed;
-
pid to the location of the PID file (e.g. /var/run/thales/unicorn.pid);
-
user to the user that owns the worker processes.
vi config/unicorn.rb
Take note of the location of the PID file and user defined in this configuration file for the next step.
-
-
Create the directory to hold the PID file and make sure it is writable by the Unicorn process worker (both are specified in the Unicorn config file). This is important because Unicorn will fail if the directory does not exist.
mkdir -p tmp/pids
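If the PID file is outside the project directory (as in the /var/run/thales/unicorn.pid example), the directory to create is the parent of that file rather than tmp/pids. A minimal sketch, assuming a hypothetical helper name; the ownership change is left as a comment because it needs root and depends on the user set in config/unicorn.rb.

```shell
#!/bin/sh
# Hypothetical helper: create the parent directory of a PID file and
# print the directory's path.
make_pid_dir() {
    pid_dir=$(dirname "$1")
    mkdir -p "$pid_dir" && printf '%s\n' "$pid_dir"
}

# Example (run as root; "thales" is the assumed Unicorn worker user):
# chown thales "$(make_pid_dir /var/run/thales/unicorn.pid)"
```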
Continue with either the [Basic startup] or [Startup using Bluepill process monitor] steps.
This option uses a basic init.d script to start and stop the Unicorn HTTP server.
-
Configure the application to automatically start when the OS starts.
a. Copy the init.d script (basic) to the /etc/init.d directory.
sudo cp script/initd-thales-basic.sh /etc/init.d/thales
b. Edit it to set:
- PROJ_DIR to the directory where Thales has been installed;
- UNICORN to the wrapper script created in step 5; and
- PID_FILE to the path to the PID file in _config/unicorn.rb_
sudoedit /etc/init.d/thales
c. Register it.
sudo chkconfig thales on
-
Start the application.
sudo service thales start
The application will be running on port 30123 (unless you change it in the init.d script).
Attempting to access http://localhost:30123 from the host itself (since the firewall should be blocking external access to this port) should return an HTTP redirection to the (currently non-existent) HTTPS secured login page.
Other available commands are:
sudo service thales status
sudo service thales restart
sudo service thales stop
Continue with the [Reverse proxy server setup] steps.
This option uses the Bluepill process monitor to manage the Unicorn HTTP server.
-
Install Bluepill.
a. Install the Bluepill gem.
gem install bluepill
b. If needed, configure logging according to the Bluepill installation instructions.
-
Create a wrapper script so that Bluepill can be run in the correct RVM gemset. The second argument is the prefix used to name the wrapper script.
rvm wrapper `rvm current` thales bluepill
This will create a wrapper script called ~/.rvm/bin/thales_bluepill.
-
Configure Bluepill by editing the Thales Bluepill config file. Edit it to set:
-
USER to the user name to run the Unicorn processes as.
-
RAILS_ROOT to the directory where Thales is installed
-
UNICORN to the location of the Unicorn wrapper script.
vi config/bluepill.pill
-
-
Configure the application to automatically start when the OS starts.
a. Copy the init.d script (Bluepill) to the /etc/init.d directory.
sudo cp script/initd-thales-bluepill.sh /etc/init.d/thales
b. Edit it to set:
- BLUEPILL_BIN to the wrapper script for Bluepill created above.
- BLUEPILL_CONFIG to the location of the above Bluepill config file.
sudoedit /etc/init.d/thales
c. Register it.
sudo chkconfig thales on
-
Start the application.
sudo service thales start
The application will be running on port 30123 (unless it was changed in the init.d script).
Attempting to access http://localhost:30123 from the host itself (since the firewall should be blocking external access to this port) should return an HTTP redirection to the (currently non-existent) HTTPS secured login page.
Other available commands are:
sudo service thales status
sudo service thales restart
sudo service thales stop
Continue with the [Reverse proxy server setup] steps.
This section continues from either [Basic startup] or [Startup using Bluepill process monitor].
A reverse proxy server will be used to efficiently serve static content and the content from the Unicorn HTTP server. It will also be used to provide TLS/SSL security.
-
Install nginx.
In these steps, we will download and compile nginx, instead of using the distribution's release.
a. Download the sources from http://nginx.org
pushd ~
curl -O http://nginx.org/download/nginx-1.2.7.tar.gz
b. Unpack the sources.
tar xfz nginx-1.2.7.tar.gz
cd nginx-1.2.7
c. Ensure dependencies are installed.
Either:
i. If the distribution uses yum (e.g. Fedora and RHEL):
sudo yum install gcc pcre pcre-devel zlib zlib-devel openssl openssl-devel
ii. If the distribution uses apt-get (e.g. Ubuntu):
sudo apt-get install build-essential libpcre3-dev zlib1g zlib1g-dev openssl libssl-dev
d. Compile and install. There are many options, but the essential one is the SSL module.
./configure --help
./configure --with-http_ssl_module
make
sudo make install
-
Create a user account to run nginx.
sudo useradd --shell /sbin/nologin --home-dir /usr/local/nginx -c "Nginx server" nginx
The warning about the home directory already existing can be ignored.
-
Change file and directory permissions to allow the nginx user to read the precompiled asset files. How this is done will depend on where the application was installed. For example, if they were installed in the user's home directory:
namei -l /home/thales/thales/public/assets
chmod o+rx /home/thales
-
Configure nginx to proxy requests to the Unicorn HTTP server running on port 30123, and to serve both HTTP and HTTPS requests.
a. Obtain a TLS/SSL certificate for the site.
For testing, you could use a self-signed test certificate. See [Creating a self-signed test certificate] for one way of creating a test certificate.
b. Install the certificate and its (unencrypted) private key.
pushd ~/pki-credentials-for-my-domain
sudo cp tls.crt /usr/local/nginx/conf/tls.crt
sudo cp tls.key /usr/local/nginx/conf/tls.key
sudo chmod 444 /usr/local/nginx/conf/tls.crt
sudo chmod 400 /usr/local/nginx/conf/tls.key # keep private key secure!
popd
c. Configure nginx. You can use the supplied example configuration file as a starting point, but remember to optimize it for your setup.
popd # to return to thales source directory
sudo cp config/nginx.conf /usr/local/nginx/conf/nginx.conf
d. Start nginx
sudo /usr/local/nginx/sbin/nginx
e. Test the server by visiting http://localhost and https://localhost.
Please see the [Thales Administration guide](thales-admin-guide.md) for information on how to log in and set it up for use.
Note: a common problem is the HTML page appears, but without the CSS styling. This is usually a permissions problem: the nginx user does not have permissions to read the static asset files. Usually one of the parent directories does not have read and execute permissions for other users.
f. If necessary, modify the configuration and retest.
sudoedit /usr/local/nginx/conf/nginx.conf
sudo /usr/local/nginx/sbin/nginx -t
sudo /usr/local/nginx/sbin/nginx -s reload
g. When finished, manually stop the nginx server.
sudo /usr/local/nginx/sbin/nginx -s quit
-
Set up nginx to start automatically when the OS starts.
a. Copy the nginx init.d script to the /etc/init.d directory.
sudo cp script/initd-nginx.sh /etc/init.d/nginx
b. Edit it to set:
- NGINX_BIN to the nginx executable.
sudoedit /etc/init.d/nginx
c. Register it.
sudo chkconfig nginx on
Note: a production installation should manage the log files using logrotate, but its setup is beyond the scope of this documentation.
Thales will enforce the use of HTTPS for logins and user settings pages in the production environment (i.e. RAILS_ENV=production). That is, requests over HTTP to those pages are redirected to the equivalent HTTPS URL.
If you need to run in a production environment where HTTPS is not available, run it with DISABLE_HTTPS=1 in the environment to disable this.
These instructions describe how to use WEBrick or Unicorn as the development HTTP server.
WEBrick comes with Ruby 1.9.3 and is the standard Rails development server. It is not recommended for production use.
Unicorn is a fast HTTP server and its gem is included in the project. It can be used in production and in development.
The WEBrick HTTP server can be manually started in development mode using:
rails server -d
Or use the helper script:
script/server.sh start
The Unicorn HTTP server can be manually started in development mode using:
unicorn -D -c config/unicorn.rb --port 3000
Note: if the port number is not specified, the default is 8080.
The WEBrick HTTP server can be manually stopped using:
kill -s SIGINT processID
Where the processID can be found in the file tmp/pids/server.pid.
Or use the helper script:
script/server.sh stop
The Unicorn HTTP server can be manually stopped using:
kill -s SIGQUIT processID
Where the processID can be found by running ps -ef | grep "unicorn master".
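When a PID file is available (tmp/pids/server.pid for WEBrick, or the pid location set in config/unicorn.rb for Unicorn), the lookup and the kill can be combined. A minimal sketch; stop_by_pidfile is a hypothetical helper and is not the same as script/server.sh.

```shell
#!/bin/sh
# Hypothetical helper: send a signal to the process whose ID is stored
# in a PID file. Returns non-zero if the file cannot be read or the
# signal cannot be delivered.
stop_by_pidfile() {
    pid_file=$1
    signal=$2
    [ -r "$pid_file" ] || return 1
    kill -s "$signal" "$(cat "$pid_file")"
}

# Examples:
# stop_by_pidfile tmp/pids/server.pid INT     # WEBrick
# stop_by_pidfile tmp/pids/unicorn.pid QUIT   # Unicorn (path is an assumption)
```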
Run the helper script with -h (or --help) for more options.
script/server.sh --help
To run the server with TLS/SSL, the server certificate needs to be copied to config/pki/server.crt and the unencrypted private key copied to config/pki/server.key.
This section describes how to set up the required software on Linux.
These steps might be different for different Linux distributions and different configurations. These steps have been tested with a minimal install of:
- Fedora 18
- CentOS 6.3
- Scientific Linux 6.3
- Ubuntu 12.10
These steps use:
- Single-user installation of Ruby Version Manager (RVM) for Ruby
- PostgreSQL from the distribution.
These instructions use a non-root account with sudo access. By default, the configuration files and scripts assume the user name is "thales", but any username can be used.
-
Install packages needed by RVM
Either:
a. If the distribution uses yum (e.g. Fedora and RHEL):
sudo yum install postgresql-server \
  git \
  tar bzip2 make gcc gcc-c++ \
  zlib-devel openssl-devel \
  readline-devel libyaml-devel libffi-devel libxml2-devel libxslt-devel postgresql-devel
The first line is for the distribution's installation of PostgreSQL. The second line is for Git, used to obtain the application. The third line is required for RVM. The fourth line is required for Bundler (and must be installed before RVM compiles Ruby). The fifth line is required for other gems that the application needs.
b. If the distribution uses apt-get (e.g. Ubuntu):
sudo apt-get install \
  build-essential openssl libreadline6 libreadline6-dev curl git-core \
  zlib1g zlib1g-dev libssl-dev libyaml-dev libsqlite3-dev sqlite3 libxml2-dev libxslt-dev \
  autoconf libc6-dev libgdbm-dev ncurses-dev automake \
  libtool bison subversion pkg-config libffi-dev \
  libpq-dev
-
Install Ruby Version Manager and Ruby.
curl -L https://get.rvm.io | bash -s stable --ruby=1.9.3
. ~/.rvm/scripts/rvm
Thales has been tested with Ruby 1.9.3-p374.
As of 14 March 2013, it does not work with Ruby 2.0, because one of the Gems it uses (ruby-oai 0.0.9) does not work with Ruby 2.0.0.
-
Initialize PostgreSQL.
a. For PostgreSQL 9.x:
sudo postgresql-setup initdb
b. For PostgreSQL 8.x:
sudo service postgresql initdb
-
Set up PostgreSQL to start automatically when the OS starts.
sudo chkconfig postgresql on
-
Configure firewalls
Choose the TCP/IP port for the Web application and configure the firewall to allow access to that port. For development and testing, the default is port 3000. For production, you will want to use port 80 or 443.
a. FirewallD
For systems running FirewallD (e.g. Fedora 18).
Show the active zones and current settings for the public zone:
sudo firewall-cmd --get-active-zones
sudo firewall-cmd --list-all --zone=public
For development, allow use of port 3000:
sudo firewall-cmd --add-port=3000/tcp # Note: not permanent
sudo firewall-cmd --permanent --zone=public --add-port=3000/tcp
For production, allow use of the standard ports of 80 and 443:
sudo firewall-cmd --add-service=http # Note: not permanent
sudo firewall-cmd --permanent --zone=public --add-service=http
sudo firewall-cmd --add-service=https # Note: not permanent
sudo firewall-cmd --permanent --zone=public --add-service=https
b. iptables
For systems running iptables.
Edit the iptables configuration file:
sudoedit /etc/sysconfig/iptables
Add the following lines:
-A INPUT -m state --state NEW -m tcp -p tcp --dport 3000 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 443 -j ACCEPT
Restart iptables so the new settings are used:
sudo service iptables restart
These commands can be used to create a self-signed certificate for testing purposes:
openssl req -newkey rsa:2048 -nodes -keyout tls.key -out tls.csr
# Above command will prompt for extra information
openssl x509 -req -in tls.csr -signkey tls.key -days 90 -out tls.crt
rm tls.csr
chmod 444 tls.crt
chmod 400 tls.key
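For scripted test setups, the prompts can be avoided by giving the subject on the command line with the -subj option of openssl req. A minimal sketch for testing only, assuming a hypothetical helper name and a subject that contains just a common name:

```shell
#!/bin/sh
# Hypothetical helper: create tls.key and tls.crt in a directory for
# the given host name, without prompting. For testing only.
make_test_cert() {
    out_dir=$1
    host=$2
    openssl req -newkey rsa:2048 -nodes -subj "/CN=$host" \
        -keyout "$out_dir/tls.key" -out "$out_dir/tls.csr" 2>/dev/null &&
    openssl x509 -req -in "$out_dir/tls.csr" -signkey "$out_dir/tls.key" \
        -days 90 -out "$out_dir/tls.crt" 2>/dev/null &&
    rm "$out_dir/tls.csr" &&
    chmod 444 "$out_dir/tls.crt" &&
    chmod 400 "$out_dir/tls.key"
}

# Example: create the files in the current directory, then inspect them.
# make_test_cert . localhost
# openssl x509 -in tls.crt -noout -subject -dates
```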
Check that firewalls are not blocking access. This can be done by accessing the service from the host machine.
curl -s 'http://localhost'
curl -s 'http://localhost/oaipmh?verb=Identify'
curl -s 'http://localhost:30123'
curl -s 'http://localhost:30123/oaipmh?verb=Identify'
Check if the necessary services are running:
sudo service postgresql status
sudo service thales status
sudo service nginx status
Check the log files. These will be under the log subdirectory, unless their locations were changed during installation.
ERROR: Gem bundler is not installed, run 'gem install bundler' first
Attempting to run 'gem install bundler' (as suggested by the error message) will usually fail with another error that says "Loading command: install (LoadError) cannot load such file -- zlib".
The zlib-devel and/or openssl-devel packages were not installed when Ruby was compiled, so bundler was not installed. Install these packages and reinstall Ruby.
sudo yum install zlib-devel openssl-devel
rvm list
rvm uninstall ruby-1.9.3-p362 # using value obtained from 'rvm list'
rvm install ruby
Data directory is not empty!
PostgreSQL has already been initialized. You do not need to run sudo postgresql-setup initdb again.
PG::Error: ERROR: permission denied to create database
The PostgreSQL user does not have permission to create or drop databases. This error is usually encountered when trying to run Rake targets (e.g.: db:create, db:drop, db:setup, db:reset or spec) and the PostgreSQL user does not have the CREATEDB role.
Unicorn is constantly in the starting state
If running sudo service thales status shows that Unicorn is always "starting", check that the Unicorn PID file directory exists. If it cannot create the PID file, Unicorn will not start.
502 Bad Gateway
The Unicorn HTTP server is not running or nginx has not been correctly configured.
403 Forbidden
When accessing a static file, this could be caused by the reverse proxy or HTTP server not having sufficient permissions to read the file. Normally, the worker process is running as a different user (e.g. nginx) from the owner of the files (e.g. thales).
Check that the permissions on the file, and on all of its ancestor directories, allow the server's user to access it. Typically, the home directory for a user is only readable by that user.
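The check on ancestor directories can be scripted. A minimal sketch, in which list_blocked_dirs is a hypothetical helper; it only examines the permission bits for "other" users, so ACLs and group membership are ignored.

```shell
#!/bin/sh
# Hypothetical helper: print the given directory (or the directory
# containing the given file) and each of its ancestors that other
# users cannot traverse. Prints nothing if all are traversable.
list_blocked_dirs() {
    dir=$1
    [ -d "$dir" ] || dir=$(dirname "$dir")
    while :; do
        # Character 10 of the "ls -ld" mode string is the execute bit
        # for others: "x" or "t" means the directory can be traversed.
        case $(ls -ld "$dir" | cut -c10) in
            x|t) ;;
            *) printf '%s\n' "$dir" ;;
        esac
        case $dir in
            /|.) break ;;
        esac
        dir=$(dirname "$dir")
    done
}

# Example, using the installation path from the nginx permissions step:
# list_blocked_dirs /home/thales/thales/public/assets
```

Any directory it prints is a candidate for chmod o+rx, or a reason to move the application somewhere like /opt.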
HTML page shows, but without any CSS styling
The precompiled assets are not being served. See 403 Forbidden above.
For further information on Thales, see the Thales project on GitHub.