Installation guide on a single server
Global requirements
Architecture overview
DataAdmin API and UI: mainly for Data owners/administrators who will manage datasets and users’ access
Private Learning Gateway API: for Data scientists and analysts who will remotely run analyses on the sensitive data (without direct access)
Nginx as a proxy
Requirements
Sarus does not require a specific OS because it runs on Docker, but it needs the following software installed:
docker version 19.03.8 or higher
docker-compose version 1.27.x or higher. Note that many Linux distributions ship an older version.
To install the software, you need at least 10GB of disk space.
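The requirements above can be checked before installing. The following is a minimal sketch, not part of the Sarus installer: the function names are hypothetical, "1.27.x" is interpreted as a minimum of 1.27.0, and `sort -V` (version sort, available in GNU coreutils) is assumed to be present.

```shell
# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Succeed when the filesystem holding directory $1 has at least $2 GB free.
has_free_space_gb() {
  available_kb="$(df -Pk "$1" | awk 'NR==2 {print $4}')"
  [ "$available_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# Print one message per unmet requirement (hypothetical helper).
check_sarus_requirements() {
  docker_version="$(docker version --format '{{.Client.Version}}' 2>/dev/null)"
  compose_version="$(docker-compose version --short 2>/dev/null)"
  version_ge "$docker_version" 19.03.8 || echo "docker is missing or older than 19.03.8"
  version_ge "$compose_version" 1.27.0 || echo "docker-compose is missing or older than 1.27"
  has_free_space_gb . 10 || echo "less than 10GB free in the current directory"
}
```

Running `check_sarus_requirements` from the directory where you plan to install prints nothing when all three checks pass.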
Resources recommendations
To install and run Sarus mainly on numeric data, we suggest:
GCP: c2-standard-4 instance (approx. $0.2/h)
AWS: c5.2xlarge (approx. $0.4/h) or m5.xlarge (approx. $0.2/h), with a 100GB attached SSD and an AMI with a clean Ubuntu 20.04 OS.
Securing Sarus (forcing https)
This forces users to query Sarus over https instead of http, so that communication between the user (from the Admin UI or from the SDK) and Sarus is encrypted.
prepare a VM to install Sarus and make sure it will be accessible by its hostname over the network (your DNS has to be configured)
generate an SSL certificate tied to this hostname
copy this SSL certificate onto the VM (certificate chain and key)
install Sarus and configure the .env file with the path to the SSL certificate (see the instructions below)
check that encryption is enforced by trying to connect to the Sarus UI with http: you should be redirected to https.
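The last check can also be done from the command line. The sketch below assumes curl is installed; the function names are hypothetical, and `sarus.example.com` in the usage note is a placeholder hostname. It requests the plain-http URL without following redirects and verifies that the answer is a redirect to an https address.

```shell
# Succeed when an HTTP status ($1) and redirect target ($2) describe a redirect to https.
is_https_redirect() {
  case "$1" in
    301|302|303|307|308) ;;
    *) return 1 ;;
  esac
  case "$2" in
    https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Query http://<host>/ without following redirects and test the response.
check_https_enforced() {
  # %{http_code} and %{redirect_url} are curl write-out variables.
  result="$(curl -s -o /dev/null -w '%{http_code} %{redirect_url}' "http://$1/")"
  is_https_redirect "${result%% *}" "${result#* }"
}
```

For example, `check_https_enforced sarus.example.com && echo "https is enforced"` prints the message only when the server redirects http requests to https.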
Installation steps on a single server with Docker compose
You should have received an installer zip file with a secret token from your Sarus representative.
1. Install docker and docker-compose (the commands below install docker-compose 1.27.4)
sudo curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
2. Unzip the provided installer zip file
3. Create a .env file from the env.template
4. Configure the .env file
GCLOUD_KEY_FILE_PATH: the path on the host machine to the Google Cloud credentials file to use
FS_PATH: the path of the GCS bucket/directory you want to use as the storage backend. It should start with gs:// (for example gs://store-datasets-264717/my_dir)
If you want to reach the UI from outside the server, don’t forget to set the NGINX_HOST variable to the server's hostname or IP address.
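Before launching the installer, the .env file can be sanity-checked. This is a minimal sketch under stated assumptions: it assumes a GCS storage backend as described above, simple `KEY=value` lines in the file, and hypothetical helper names that are not part of the Sarus installer.

```shell
# Print the value assigned to variable $2 in the .env-style file $1
# (last assignment wins, surrounding double quotes are stripped).
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1 | sed 's/^"\(.*\)"$/\1/'
}

# Report problems with the variables described in this step.
validate_sarus_env() {
  env_file="$1"
  rc=0
  if [ -z "$(env_value "$env_file" GCLOUD_KEY_FILE_PATH)" ]; then
    echo "GCLOUD_KEY_FILE_PATH is not set"
    rc=1
  fi
  case "$(env_value "$env_file" FS_PATH)" in
    gs://?*) ;;
    *) echo "FS_PATH should start with gs://"; rc=1 ;;
  esac
  if [ -z "$(env_value "$env_file" NGINX_HOST)" ]; then
    echo "note: NGINX_HOST is not set, the UI may not be reachable from outside the server"
  fi
  return "$rc"
}
```

Calling `validate_sarus_env .env` returns a non-zero status when a required value is missing or when FS_PATH does not start with gs://.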
5. Launch the installer (installer.sh)
It will ask you for a password; enter the one provided by Sarus. It then pulls the Docker images and launches the containers, after which the app is available on port 80.
6. If you want to shut down or restart Sarus
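With a Docker Compose deployment, stopping and starting are usually done with the standard docker-compose subcommands. The sketch below is an assumption, not the documented Sarus procedure: it supposes that the installer leaves a docker-compose.yml in the installation directory and that the commands are run from there. The function names and the COMPOSE variable are hypothetical.

```shell
# COMPOSE can be overridden, e.g. COMPOSE="docker-compose -f /path/to/docker-compose.yml"
# (assumption: the compose file shipped by the installer is the one to use).
COMPOSE="${COMPOSE:-docker-compose}"

# Stop and remove the containers (named volumes are kept by default).
sarus_stop()    { $COMPOSE down; }

# Recreate and start the containers in the background.
sarus_start()   { $COMPOSE up -d; }

# Restart the running containers without recreating them.
sarus_restart() { $COMPOSE restart; }
```

Check the files shipped with your installer before relying on these commands.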
Network configuration
If you have provided SSL certificates using the SSL_DIR_PATH variable in the .env file:
Private Learning Gateway is on port 5000 with https enabled
DataAdmin API and UI are on port 443 with https enabled (with a redirect from port 80)
Otherwise:
Private Learning Gateway is on port 5000 without https
DataAdmin API and UI are on port 80 without https
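The rules above can be summarized in a small helper that prints the URLs to expect for a given host. This is an illustrative sketch: the function name is hypothetical, and the second argument stands for the value of SSL_DIR_PATH in the .env file (empty when no certificates are provided).

```shell
# Print the expected Sarus endpoints for host $1.
# $2 is the SSL_DIR_PATH value; an empty value means https is not enabled.
sarus_endpoints() {
  if [ -n "$2" ]; then
    echo "gateway=https://$1:5000"
    echo "admin=https://$1:443"
  else
    echo "gateway=http://$1:5000"
    echo "admin=http://$1:80"
  fi
}
```

For example, `sarus_endpoints sarus.example.com ""` lists the plain-http endpoints, where `sarus.example.com` is a placeholder hostname.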