Run Cresset calculations in the cloud with Cresset Engine Broker™ and ElastiCluster

In this post I will describe how to instantiate a cluster on Google Cloud (or any other cloud platform supported by ElastiCluster) and use it to run calculations with any Cresset desktop application, i.e., Forge, Spark or Flare, on any platform – Windows, Linux or macOS.

Configuring ElastiCluster is simple, and it is a one-off operation. Once this is done, you can spin up a Linux cluster in the cloud with a single command:

elasticluster start cebroker

Within a few minutes, the cluster will be up and running, and ready to run Cresset calculations via the Cresset Engine Broker.

Once your calculations are finished and you do not need the cluster anymore, you can switch it off with another single command:

elasticluster stop cebroker

How to

As I was new to Elasticluster and to Google Cloud, I followed the excellent Creating a Grid Engine Cluster with Elasticluster tutorial to understand how to enable the Google Cloud SDK, and generate credentials to grant ElastiCluster access to my Google Cloud account.

The rest of this blog post concentrates on the steps required to get this to work in a Linux bash shell. The same workflow runs smoothly in a macOS Terminal and in a Windows bash shell.

The process that I will follow is:

  1. Install ElastiCluster and its dependencies in a virtual environment
  2. Create an ElastiCluster configuration file
  3. Spin up the cluster and use it with Cresset desktop apps

Not all the steps are necessary if you have a VPC already set-up and running on Google Cloud, but this details how to start from scratch in the minimum time.

Install ElastiCluster and its dependencies in a virtual environment

Firstly, I suggest to follow the advice in the Creating a Grid Engine Cluster with Elasticluster tutorial  and install virtualenv, in order to create a separate Python environment for ElastiCluster:

paolo@cresset77 ~/blog$ pip install virtualenv
Collecting virtualenv
Downloading https://files.pythonhosted.org/packages/33/5d/314c760d4204f64e4a968275182b7751bd5c3249094757b39ba987dcfb5a/virtualenv-16.4.3-py2.py3-none-any.whl (2.0MB)
100% |################################| 2.0MB 7.7MB/s
Installing collected packages: virtualenv
Successfully installed virtualenv-16.4.3

Then, create and activate a new elasticluster virtual environment:

paolo@cresset77 ~/blog$ virtualenv elasticluster
No LICENSE.txt / LICENSE found in source
New python executable in /home/paolo/blog/elasticluster/bin/python2
Also creating executable in /home/paolo/blog/elasticluster/bin/python
Installing setuptools, pip, wheel...
done.
paolo@cresset77 ~/blog$ cd elasticluster/
paolo@cresset77 ~/blog/elasticluster$ . bin/activate
(elasticluster) paolo@cresset77 ~/blog/elasticluster$

Next, clone from GitLab the Cresset fork of the elasticluster repository which contains a few Cresset-specific Ansible playbooks required to automatically set up the Cresset Engine Broker and FieldEngine machinery on the cloud-hosted cluster:

(elasticluster) paolo@cresset77 ~/blog/elasticluster$ git clone git@gitlab.com:cresset-opensource/elasticluster.git src
Cloning into 'src'...
remote: Enumerating objects: 13997, done.
remote: Counting objects: 100% (13997/13997), done.
remote: Compressing objects: 100% (4820/4820), done.
remote: Total 13997 (delta 8383), reused 13960 (delta 8375)
Receiving objects: 100% (13997/13997), 5.23 MiB | 1.21 MiB/s, done.
Resolving deltas: 100% (8383/8383), done.

Finally, install elasticluster dependencies:

(elasticluster) paolo@cresset77 ~/blog/elasticluster$ cd src
(elasticluster) paolo@cresset77 ~/blog/elasticluster/src$ pip install -e .
Obtaining file:///home/paolo/blog/elasticluster/src
Collecting future (from elasticluster==1.3.dev9)
Downloading https://files.pythonhosted.org/packages/90/52/e20466b85000a181e1e144fd8305caf2cf475e2f9674e797b222f8105f5f/future-0.17.1.tar.gz (829kB)
100% |################################| 829kB 6.6MB/s
Requirement already satisfied: pip>=9.0.0 in /home/paolo/blog/elasticluster/lib/python2.7/site-packages (from elasticluster==1.3.dev9) (19.0.3)

[...]

Running setup.py develop for elasticluster
Successfully installed Babel-2.6.0 MarkupSafe-1.1.1 PrettyTable-0.7.2 PyCLI-2.0.3 PyJWT-1.7.1 PyYAML-5.1 adal-1.2.1 ansible-2.7.10 apache-libcloud-2.4.0 appdirs-1.4.3 asn1crypto-0.24.0 azure-common-1.1.18 azure-mgmt-compute-4.5.1 azure-mgmt-network-2.6.0 azure-mgmt-nspkg-3.0.2 azure-mgmt-resource-2.1.0 azure-nspkg-3.0.2 bcrypt-3.1.6 boto-2.49.0 cachetools-3.1.0 certifi-2019.3.9 cffi-1.12.2 chardet-3.0.4 click-7.0 cliff-2.14.1 cmd2-0.8.9 coloredlogs-10.0 contextlib2-0.5.5 cryptography-2.6.1 debtcollector-1.21.0 decorator-4.4.0 dogpile.cache-0.7.1 elasticluster enum34-1.1.6 funcsigs-1.0.2 functools32-3.2.3.post2 future-0.17.1 futures-3.2.0 google-api-python-client-1.7.8 google-auth-1.6.3 google-auth-httplib2-0.0.3 google-compute-engine-2.8.13 httplib2-0.12.1 humanfriendly-4.18 idna-2.8 ipaddress-1.0.22 iso8601-0.1.12 isodate-0.6.0 jinja2-2.10.1 jmespath-0.9.4 jsonpatch-1.23 jsonpointer-2.0 jsonschema-2.6.0 keystoneauth1-3.13.1 monotonic-1.5 msgpack-0.6.1 msrest-0.6.6 msrestazure-0.6.0 munch-2.3.2 netaddr-0.7.19 netifaces-0.10.9 oauth2client-4.1.3 oauthlib-3.0.1 openstacksdk-0.27.0 os-client-config-1.32.0 os-service-types-1.6.0 osc-lib-1.12.1 oslo.config-6.8.1 oslo.context-2.22.1 oslo.i18n-3.23.1 oslo.log-3.42.3 oslo.serialization-2.28.2 oslo.utils-3.40.3 paramiko-2.4.2 pathlib2-2.3.3 pbr-5.1.3 pyOpenSSL-19.0.0 pyasn1-0.4.5 pyasn1-modules-0.2.4 pycparser-2.19 pycrypto-2.6.1 pyinotify-0.9.6 pynacl-1.3.0 pyparsing-2.4.0 pyperclip-1.7.0 python-cinderclient-4.1.0 python-dateutil-2.8.0 python-gflags-3.1.2 python-glanceclient-2.16.0 python-keystoneclient-3.19.0 python-neutronclient-6.12.0 python-novaclient-9.1.2 pytz-2018.9 requests-2.21.0 requests-oauthlib-1.2.0 requestsexceptions-1.4.0 rfc3986-1.2.0 rsa-4.0 scandir-1.10.0 schema-0.7.0 secretstorage-2.3.1 simplejson-3.16.0 six-1.12.0 stevedore-1.30.1 subprocess32-3.5.3 typing-3.6.6 unicodecsv-0.14.1 uritemplate-3.0.0 urllib3-1.24.1 warlock-1.3.0 wcwidth-0.1.7 wrapt-1.11.1

Create an ElastiCluster configuration file

Open your favourite text editor and create a ~/.elasticluster/config file with the following content:

# slurm software to be configured by Ansible
[setup/ansible-slurm]
provider=ansible
# ***EDIT ME*** add cresset_flare only if you plan to run Flare calculations
frontend_groups=slurm_master,cresset_common,cresset_broker,cresset_flare
compute_groups=slurm_worker,cresset_common
# ***EDIT ME*** set the path to your Cresset license
global_var_cresset_license=~/path/to/your/cresset.lic

# Create a cloud provider (call it 'google-cloud')
[cloud/google-cloud]
provider=google
# ***EDIT ME*** enter your Google project ID here
gce_project_id= xxxx-yyyyyy-123456
# ***EDIT ME*** enter your Google client ID here
gce_client_id=12345678901-k3abcdefghi12jklmnop3ab4b12345ab.apps.googleusercontent.com
# ***EDIT ME*** enter your Google client secret here
gce_client_secret=ABCdeFg1HIj2_aBcDEfgH3ij

# Create a login (call it 'google-login')
[login/google-login]
image_user=cebroker
image_user_sudo=root
image_sudo=True
user_key_name=elasticluster
user_key_private=~/.ssh/google_compute_engine
user_key_public=~/.ssh/google_compute_engine.pub

# Bring all of the elements together to define a cluster called 'cebroker'
[cluster/cebroker]
cloud=google-cloud
login=google-login
setup=ansible-slurm
security_group=default
image_id=ubuntu-minimal-1804-bionic-v20190403
frontend_nodes=1
# ***EDIT ME*** set compute_nodes to the number of worker nodes that you
# wish to start
compute_nodes=2
image_userdata=
ssh_to=frontend

[cluster/cebroker/frontend]
flavor=n1-standard-2

[cluster/cebroker/compute]
# ***EDIT ME*** change 8 into 2, 4, 8, 16, 32, 64 depending on how many cores
# you wish on each worker node
flavor=n1-standard-8

The only bits that you will need to edit are those highlighted in red, i.e.:

  • The path to the Cresset license on your computer
  • Add cresset_flare if you are planning to run Flare calculations on the cluster
  • Your Google Cloud credentials
  • The number/type of nodes that you wish to use in your cluster

The example above uses a dual-core node to act as the head node (the Cresset Engine Broker does not need huge resources) and starts two 8-core nodes as worker nodes; you may well want to use more and beefier nodes, but this is meant as a small, inexpensive example.

At this stage, you are done with the configuration part. Please note that you only need to do this once; the only future action that you may want to carry out is changing the configuration to modify the type/number of your worker nodes.

Spin up the cluster and use it with Cresset desktop apps

This is the single command that you need to run, as advertised in the blog post headline:

(elasticluster) paolo@cresset77 ~/blog/elasticluster$ elasticluster start cebroker
Starting cluster `cebroker` with:
* 1 frontend nodes.
* 2 compute nodes.
(This may take a while...)

If you feel like having a coffee, this is a good time to brew one – bringing up the cluster will take a few minutes.

If everything works well, at the end of the process you should see:

Your cluster `cebroker` is ready!

Cluster name:     cebroker
Cluster template: cebroker
Default ssh to node: frontend001
- compute nodes: 2
- frontend nodes: 1

To login on the frontend node, run the command:

elasticluster ssh cebroker

To upload or download files to the cluster, use the command:

elasticluster sftp cebroker

Now ssh into your frontend node, forwarding port 9500 to localhost:

(elasticluster) paolo@cresset77 ~/blog/elasticluster$ elasticluster ssh cebroker -- -L9500:localhost:9500
Welcome to Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-1029-gcp x86_64)
 
* Documentation:  https://help.ubuntu.com
* Management:     https://landscape.canonical.com
* Support:        https://ubuntu.com/advantage

Get cloud support with Ubuntu Advantage Cloud Guest:
http://www.ubuntu.com/business/services/cloud
This system has been minimized by removing packages and content that are
not required on a system that users do not log into.

To restore this content, you can run the 'unminimize' command.

0 packages can be updated.
0 updates are security updates.

Last login: Mon Apr  8 21:28:16 2019 from 81.144.247.249

cebroker@frontend001:~$

Tunnelling port 9500 through ssh creates a secure, encrypted SSL connection to the Cresset Engine Broker running in the cloud. You probably won’t need this if you already have configured your VPC in Google Cloud.

You may easily verify that the Cresset Engine Broker is already running on the frontend node, waiting for connections from Cresset clients:

cebroker@frontend001:~$ systemctl status cresset_cebroker2
* cresset_cebroker2.service - server that allows Cresset products to start/connect to remote FieldEngines
Loaded: loaded (/etc/systemd/system/cresset_cebroker2.service; static; vendor preset: enabled)
Active: active (running) since Mon 2019-04-08 21:41:50 UTC; 8min ago
Process: 20917 ExecStop=/bin/bash /opt/cresset/CEBroker2/documentation/init.d/cresset_cebroker2.sh stop /var/run/CEBroker2.pid (code=exited, status=0/SUCCESS)
Process: 20930 ExecStart=/bin/bash /opt/cresset/CEBroker2/documentation/init.d/cresset_cebroker2.sh start /var/run/CEBroker2.pid (code=exited, status=0/SUCCESS)
Main PID: 20956 (CEBroker2.exe)
Tasks: 0 (limit: 4915)
CGroup: /system.slice/cresset_cebroker2.service
> 20956 /opt/cresset/CEBroker2/lib/CEBroker2.exe --pid /tmp/tmp.Y7Wvk135fM -p 9500 -P 9501 -s /opt/cresset/CEBroker2/documentation/examples/start_SLURM_engine.sh -m 16 -M 16 -v

Now, start a Forge client running on your local machine.

In the Processing Preferences, configure the Cresset Engine Broker to Host localhost, Port 9500, and press the ‘Test Connection’ button; you should see ‘Connection OK’:

Now, align a dataset against a reference. In addition to your local processes, you should see 16 remote processes running on your cloud-hosted cluster:

Using remote computing resources requires a Cresset Engine Broker license and a Remote FieldEngine license; please contact us if you do not currently have such licenses and wish to give this a try. And do not hesitate to contact our support team if you need help to get the above working for you.