Setting Up Your Kubernetes Cluster for Galileo Applications on Google Kubernetes Engine (GKE)

Welcome to your guide on configuring and deploying a Google Kubernetes Engine (GKE) environment optimized for Galileo applications. Galileo, tailored for dynamic and scalable deployments, requires a robust and adaptable infrastructure—qualities inherent to Kubernetes. This guide will navigate you through the preparatory steps involving Identity and Access Management (IAM) and the DNS setup crucial for integrating Galileo’s services.

Prerequisites

Before diving into the setup, ensure you have the following:

  • A Google Cloud account.

  • The Google Cloud SDK installed and initialized.

  • Kubernetes command-line tool (kubectl) installed.

  • Basic familiarity with GKE, IAM roles, and Kubernetes concepts.

Setting Up IAM

Identity and Access Management (IAM) plays a critical role in securing and granting the appropriate permissions for your Kubernetes cluster. Here’s how to configure IAM for your GKE environment:

  1. Create a Project: Sign in to your Google Cloud Console and create a new project for your Galileo application if you haven’t done so already.

  2. Set Up IAM Roles: Navigate to the IAM & Admin section in the Google Cloud Console. Here, assign the necessary roles to your Google Cloud account, ensuring you have rights for GKE administration. Essential roles include roles/container.admin (for managing clusters), roles/iam.serviceAccountUser (to use service accounts with your clusters), and any other roles specific to your operational needs.

  3. Configure Service Accounts: Create a service account dedicated to your GKE cluster to segregate duties and enhance security. Assign the service account the minimal roles necessary to operate your Galileo applications efficiently.
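The IAM steps above can also be scripted. The following is a minimal sketch that only builds the corresponding gcloud argument lists and prints them, without running anything. The project ID, user email, and service account name are hypothetical placeholders, and the two roles are the ones named in step 2; add any others your setup needs.

```python
import shlex

# Roles named in the IAM step above.
ADMIN_ROLES = ["roles/container.admin", "roles/iam.serviceAccountUser"]


def iam_commands(project_id, user_email, service_account_name):
    """Return gcloud commands (as argument lists) for the IAM steps above."""
    commands = []
    # One policy binding per role for the administering user.
    for role in ADMIN_ROLES:
        commands.append([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=user:{user_email}",
            f"--role={role}",
        ])
    # A dedicated service account for the cluster.
    commands.append([
        "gcloud", "iam", "service-accounts", "create", service_account_name,
        f"--project={project_id}",
        "--display-name=Galileo GKE service account",
    ])
    return commands


if __name__ == "__main__":
    # Hypothetical placeholder values; replace with your own.
    for cmd in iam_commands("my-galileo-project", "admin@example.com", "galileo-gke"):
        print(shlex.join(cmd))
```

Review the printed commands before running them in a shell where gcloud is authenticated against the intended project.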

Configuring DNS for Galileo

Your Galileo application requires four DNS endpoints for optimal functionality. These endpoints handle different aspects of the application’s operations and need to be properly set up:

  1. Acquire a Domain: If not already owned, purchase a domain name that will serve as the base URL for Galileo.

  2. Set Up DNS Records: Utilize your domain registrar’s DNS management tools to create four DNS A records pointing to the Galileo application’s operational endpoints. These records will route traffic correctly within your GKE environment.

See the Step 3: Customer DNS Configuration section for more details.

Deploying Your Cluster on GKE

With IAM configured and DNS set up, you’re now ready to deploy your Kubernetes cluster on GKE.

  1. Create the Cluster: Use the gcloud command-line tool to create your cluster. Ensure that it is configured with the correct machine type, node count, and other specifications suitable for your Galileo application needs.

  2. Deploy Galileo: With your cluster running, deploy your Galileo application. Employ kubectl to manage resources and deploy services necessary for your application.

  3. Verify Deployment: After deployment, verify that your Galileo application is running smoothly by checking the service status and ensuring that external endpoints are reachable.

**Total time for deployment:** 30-45 minutes

This deployment requires the use of Google Cloud’s CLI, **gcloud**. Please follow these instructions to install and set up gcloud for your GCP account.

Recommended Cluster Configuration

Galileo recommends the following Kubernetes deployment configuration. These details are captured in the bootstrap script Galileo provides.

| Configuration | Recommended Value |
| --- | --- |
| Nodes in the cluster’s core nodegroup | 4 (min), 5 (max), 4 (desired) |
| CPU per core node | 4 CPU |
| RAM per core node | 16 GiB RAM |
| Number of nodes in the cluster’s runners nodegroup | 1 (min), 5 (max), 1 (desired) |
| CPU per runner node | 8 CPU |
| RAM per runner node | 32 GiB RAM |
| Minimum volume size per node | 200 GiB |
| Required Kubernetes API version | 1.21 |
| Storage class | standard |

Step 0: Deploying the GKE Cluster

Run this script as instructed. If you have specialized tasks that require GPU processing, make sure CREATE_ML_NODE_POOL=true is set before running the script. If you have any questions, please reach out to a Galilean in the Slack channel Galileo shares with you and your team.

Step 1: Required Configuration Values

Customer-specific cluster values (e.g. domain name, Slack channel for notifications) are placed in a base64-encoded string and stored as a secret in GitHub. Galileo’s deployment automation reads this secret and uses it when templating a cluster’s resource files.

Mandatory fields the Galileo team requires:

| Mandatory Field | Description |
| --- | --- |
| GCP Account ID | The customer’s GCP Account ID that the customer will use for provisioning Galileo. |
| Customer GCP Project Name | The name of the GCP project the customer is using to provision Galileo. |
| Customer Service Account Address for Galileo | The service account address the customer has created for the Galileo deployment account to assume. |
| GKE Cluster Name | The GKE cluster name that Galileo will deploy the platform to. |
| Domain Name | The domain name the customer wishes to deploy the cluster under, e.g. google.com |
| GKE Cluster Region | The region of the cluster. |
| Root subdomain | e.g. “galileo” as in galileo.google.com |
| Trusted SSL Certificates (Optional) | By default, Galileo provisions Let’s Encrypt certificates. If you wish to use your own trusted SSL certificates, submit (1) a base64-encoded string of the full certificate chain, and (2) a separate base64-encoded string of the signing key. |
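To illustrate the base64 encoding mentioned at the start of this step, the sketch below serializes a set of values to JSON and base64-encodes the result. The key names and the JSON layout are assumptions made for this example only; the actual format is defined by the Galileo team.

```python
import base64
import json


def encode_config(values):
    """Serialize a dict to JSON and return it as a base64-encoded ASCII string."""
    raw = json.dumps(values, sort_keys=True).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_config(encoded):
    """Reverse encode_config, useful for checking what was submitted."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical keys and values, for illustration only.
    config = {
        "gke_cluster_name": "galileo",
        "gke_cluster_region": "us-central1",
        "domain_name": "example.com",
        "root_subdomain": "galileo",
    }
    encoded = encode_config(config)
    print(encoded)
    # Round-trip check: decoding returns the original values.
    assert decode_config(encoded) == config
```

The same base64.b64encode call applies to the optional certificate chain and signing key, each read from its file as bytes and encoded separately.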

Step 2: Access to Deployment Logs

As a customer, you have full access to the deployment logs in Google Cloud Storage, where you can view all configuration. You must provide a customer email address to be granted access to these logs.

Step 3: Customer DNS Configuration

Galileo has 4 main URLs (shown below). In order to make the URLs accessible across the company, you have to set the following DNS addresses in your DNS provider after the platform is deployed.

**Time taken:** 5-10 minutes (after the ingress endpoint / load balancer is provisioned)

| Service | URL |
| --- | --- |
| API | api.galileo.company.[com\|ai\|io…] |
| Data | data.galileo.company.[com\|ai\|io…] |
| UI | console.galileo.company.[com\|ai\|io…] |
| Grafana | grafana.galileo.company.[com\|ai\|io…] |
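The four hostnames follow one pattern: a service prefix, then the root subdomain, then the domain. The sketch below builds them from those two values and includes a simple lookup helper for checking the records once they exist. The function names are hypothetical, and "company.com" is a placeholder domain.

```python
import socket

# Service prefixes taken from the URL list above.
SERVICE_PREFIXES = {
    "API": "api",
    "Data": "data",
    "UI": "console",
    "Grafana": "grafana",
}


def galileo_hostnames(root_subdomain, domain):
    """Map each Galileo service to its hostname under root_subdomain.domain."""
    return {
        service: f"{prefix}.{root_subdomain}.{domain}"
        for service, prefix in SERVICE_PREFIXES.items()
    }


def resolves(hostname):
    """Return True if the hostname currently resolves to an IPv4 address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Placeholder values; substitute your own root subdomain and domain.
    for service, host in galileo_hostnames("galileo", "company.com").items():
        print(f"{service}: {host}")
```

After creating the DNS records, calling resolves() on each hostname gives a quick check that they have propagated; propagation time depends on your DNS provider.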

Step 4: Post-deployment health-checks

Set up Firewall Rule for Horizontal Pod Autoscaler

On GKE, only a few ports allow inbound traffic by default, which breaks the Horizontal Pod Autoscaler (HPA) setup. To confirm this, run kubectl -n galileo get hpa and check for unknown values. To fix it, follow the steps below:

  1. Go to the Firewall policies page in the GCP console and click CREATE FIREWALL RULE.
  2. Set Target tags to the network tags of the GCE VMs. You can find the tags on the GCE instance detail page.
  3. Set Source IPv4 ranges to the range that includes the cluster internal endpoint, which can be found under Cluster basics.
  4. Allow TCP port 6443.
  5. After creating the firewall rule, wait a few minutes, then rerun kubectl -n galileo get hpa to confirm that the unknown values are gone.
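The check before and after the firewall change can be automated. The sketch below assumes kubectl prints its usual table with a header row and shows missing metrics as <unknown> in the TARGETS column; the function names are hypothetical.

```python
import subprocess


def hpas_with_unknown_targets(hpa_output):
    """Return names of HPAs whose row contains an <unknown> metric value."""
    names = []
    for line in hpa_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields and "<unknown>" in line:
            names.append(fields[0])  # first column is the HPA name
    return names


def check_hpa(namespace="galileo"):
    """Run kubectl get hpa and return the HPAs that still show <unknown>."""
    result = subprocess.run(
        ["kubectl", "-n", namespace, "get", "hpa"],
        capture_output=True, text=True, check=True,
    )
    return hpas_with_unknown_targets(result.stdout)
```

Calling check_hpa() a few minutes after creating the firewall rule should return an empty list once metrics are flowing again.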

Creating a GPU-enabled Node Group

For specialized tasks that require GPU processing, such as machine learning workloads, Galileo supports the configuration of GPU-enabled node pools.

  1. Node Group Creation: Create a g2-standard-8 node group with name galileo-ml, min_size 1, max_size 5, and label galileo-node-type=galileo-ml.

  2. If the cluster has low usage and you want to save costs, you may instead choose a cheaper GPU option, such as n1-standard-8 with a T4 GPU. Note that this only saves costs when usage is too low to saturate one GPU; otherwise it can cost more. Do not choose this option if you use Protect, which requires low real-time latency.

  3. When this is done, please reach out to the Galileo team so that we can update the deployment config for you.

  4. For the Horizontal Pod Autoscaler to work on the GPU node group, update the cluster node auto-provisioning config to add a limit for the specified GPU type.
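As a rough sketch of step 1, the function below builds a gcloud command for the galileo-ml node pool and prints it without running it. The cluster name and region are placeholders, and the accelerator type string nvidia-l4 (for the L4 GPU that g2-standard-8 carries) is an assumption to verify against the GKE GPU documentation, along with any driver installation options. Note that with a regional cluster, gcloud applies the min and max node counts per zone.

```python
import shlex


def gpu_node_pool_command(cluster, region,
                          machine_type="g2-standard-8",
                          accelerator="nvidia-l4"):
    """Build a gcloud command for the galileo-ml node pool described above."""
    return [
        "gcloud", "container", "node-pools", "create", "galileo-ml",
        f"--cluster={cluster}",
        f"--region={region}",
        f"--machine-type={machine_type}",
        # Assumed accelerator type; one GPU per node.
        f"--accelerator=type={accelerator},count=1",
        "--enable-autoscaling",
        "--min-nodes=1",
        "--max-nodes=5",
        "--node-labels=galileo-node-type=galileo-ml",
    ]


if __name__ == "__main__":
    # Placeholder cluster name and region.
    print(shlex.join(gpu_node_pool_command("galileo", "us-central1")))
```

For the lower-cost option in step 2, the same function could be called with machine_type="n1-standard-8" and accelerator="nvidia-tesla-t4", again assuming those identifiers match what your project and region offer.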