Deploying Galileo - EKS
Setting Up Your Kubernetes Cluster with EKS, IAM, and Trust Policies for Galileo Applications
This guide walks through configuring and deploying an Amazon EKS (Elastic Kubernetes Service) environment to support Galileo applications. Galileo applications are designed to run on managed Kubernetes services such as EKS and GKE (Google Kubernetes Engine); this document covers EKS only, including the IAM (Identity and Access Management) roles and trust policies and the Galileo DNS endpoints that the deployment requires.
Prerequisites
Before you begin, ensure you have the following:
- An AWS account with administrative access
- kubectl installed on your local machine
- aws-cli version 2 installed and configured
- Basic knowledge of Kubernetes, AWS EKS, and IAM policies
The four steps below outline how to deploy Galileo onto an EKS environment.
Setting Up the EKS Cluster
- Create an EKS Cluster: Use the AWS Management Console or AWS CLI to create an EKS cluster in your preferred region. For the CLI, use the aws eks create-cluster command with the necessary parameters.
- Configure kubectl: Once your cluster is active, configure kubectl to communicate with your EKS cluster by running aws eks update-kubeconfig --region <region> --name <cluster_name>.
Configuring IAM Roles and Trust Policies
- Create IAM Roles for EKS: Navigate to the IAM console and create a new role. Select “EKS” as the trusted entity and attach policies that grant the permissions required to manage the cluster.
- Set Up Trust Policies: Edit the trust relationship of the IAM roles to allow the EKS service to assume these roles on behalf of your Kubernetes pods.
Integrating Galileo DNS Endpoints
- Determine Galileo DNS Endpoints: Identify the four DNS endpoints required by Galileo applications to function correctly. These typically include endpoints for database connections, API gateways, telemetry services, and external integrations.
- Configure DNS in Kubernetes: Use ConfigMaps or external-dns controllers in Kubernetes to route your applications to the identified Galileo DNS endpoints.
Deploying Galileo Applications
- Prepare Application Manifests: Ensure your Galileo application Kubernetes manifests are set up with the necessary configuration, including environment variables pointing to the Galileo DNS endpoints.
- Deploy Applications: Use kubectl apply to deploy your Galileo applications onto the EKS cluster. Monitor the deployment status to ensure they are running as expected.
This deployment requires the use of AWS CLI commands. If you only have cloud console access, follow the optional instructions below to get eksctl working with AWS CloudShell.
Step 0: (Optional) Deploying via AWS CloudShell
To use eksctl via CloudShell in the AWS console, open a CloudShell session and do the following:
# Create directory
mkdir -p $HOME/.local/bin
cd $HOME/.local/bin
# eksctl
curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl $HOME/.local/bin
The rest of the deployment can now be run from the CloudShell session. You can use vim to create and edit the required YAML and JSON files within the shell session.
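Before continuing, it can help to confirm that the shell can actually find the eksctl binary. The following is a minimal sketch, assuming the binary was placed in $HOME/.local/bin as in the commands above; path_contains is a hypothetical helper name used only for illustration.

```shell
# Return 0 if directory $1 is already an entry of the colon-separated list $2.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

BIN_DIR="$HOME/.local/bin"

# Append the install directory to PATH only if it is missing.
if ! path_contains "$BIN_DIR" "$PATH"; then
  PATH="$PATH:$BIN_DIR"
  export PATH
fi

# Report whether eksctl can now be resolved.
if command -v eksctl >/dev/null 2>&1; then
  eksctl version
else
  echo "eksctl not found on PATH; re-check the download step" >&2
fi
```

The PATH change applies only to the current session; adding the same export to your shell profile would make it persistent.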
Recommended Cluster Configuration
Galileo recommends the following Kubernetes deployment configuration:
Configuration | Recommended Value |
---|---|
Nodes in the cluster’s core nodegroup | 4 (min) 5 (max) 4 (desired) |
CPU per core node | 4 CPU |
RAM per core node | 16 GiB RAM |
Number of nodes in the cluster’s runners nodegroup | 1 (min) 5 (max) 1 (desired) |
CPU per runner node | 8 CPU |
RAM per runner node | 32 GiB RAM |
Minimum volume size per node | 200 GiB |
Required Kubernetes API version | 1.21 |
Storage class | gp2 |
Here’s an example EKS cluster configuration.
Step 1: Creating Roles and Policies for the Cluster
- Galileo IAM Policy: This policy is attached to the Galileo IAM Role. Add the following to a file called galileo-policy.json:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "eks:AccessKubernetesApi",
        "eks:DescribeCluster"
      ],
      "Resource": "arn:aws:eks:CLUSTER_REGION:ACCOUNT_ID:cluster/CLUSTER_NAME"
    }
  ]
}
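The Resource ARN in the policy contains three placeholders (CLUSTER_REGION, ACCOUNT_ID, CLUSTER_NAME) that need real values before the policy is created. One way to fill them in is sketched below; render_policy is a hypothetical helper, and the region, account ID, and cluster name shown are made-up example values.

```shell
# Hypothetical example values; replace with your own.
CLUSTER_REGION="us-west-2"
ACCOUNT_ID="123456789012"
CLUSTER_NAME="galileo-cluster"

# Read a policy document on stdin and print it with the placeholders replaced.
# $1 = region, $2 = account ID, $3 = cluster name
render_policy() {
  sed -e "s/CLUSTER_REGION/$1/" -e "s/ACCOUNT_ID/$2/" -e "s/CLUSTER_NAME/$3/"
}

# Preview the resulting ARN by rendering just the Resource line.
echo '"Resource": "arn:aws:eks:CLUSTER_REGION:ACCOUNT_ID:cluster/CLUSTER_NAME"' \
  | render_policy "$CLUSTER_REGION" "$ACCOUNT_ID" "$CLUSTER_NAME"
```

To render the whole file, the same function can be fed from the template, for example render_policy "$CLUSTER_REGION" "$ACCOUNT_ID" "$CLUSTER_NAME" < galileo-policy.json, with the output redirected to a new file.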
- Galileo IAM Trust Policy: This trust policy enables an external Galileo user to assume your Galileo IAM Role to deploy changes to your cluster securely. Add the following to a file called galileo-trust-policy.json:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::273352303610:role/GalileoConnect"
        ],
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
- Galileo IAM Role with Policy: The role should include only the Galileo IAM Policy described above. Create a file called create-galileo-role-and-policies.sh with the contents below, make it executable with chmod +x create-galileo-role-and-policies.sh, and run it from the same directory as the JSON files created in the steps above. The script uses jq, so make sure it is installed.
#!/bin/sh -ex
aws iam create-policy --policy-name Galileo --policy-document file://galileo-policy.json
aws iam create-role --role-name Galileo --assume-role-policy-document file://galileo-trust-policy.json
aws iam attach-role-policy --role-name Galileo --policy-arn $(aws iam list-policies | jq -r '.Policies[] | select (.PolicyName == "Galileo") | .Arn')
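The last line of the script looks up the policy ARN by listing every policy in the account. As an alternative sketch, the ARN of a customer-managed policy created without a custom path can be assembled directly from the account ID. This assumes the standard aws partition (GovCloud and China regions use a different partition prefix); policy_arn is a hypothetical helper, and the account ID below is a placeholder.

```shell
# Build the ARN of a customer-managed IAM policy that has the default path "/".
# $1 = AWS account ID, $2 = policy name
policy_arn() {
  printf 'arn:aws:iam::%s:policy/%s\n' "$1" "$2"
}

# With credentials configured, the account ID can be looked up with:
#   ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
ACCOUNT_ID="123456789012"   # hypothetical placeholder value

policy_arn "$ACCOUNT_ID" Galileo
# prints: arn:aws:iam::123456789012:policy/Galileo
```

The printed value can be passed to aws iam attach-role-policy via --policy-arn in place of the jq lookup.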
Step 2: Deploying the EKS Cluster
With the role and policies created, the cluster itself can be deployed in a single command using eksctl. Using the cluster template here, create a galileo-cluster.yaml file and edit the contents to replace CUSTOMER_NAME with your company name, for example galileo. Also check and update all availabilityZones values as appropriate.
With the yaml file saved, run the following command to deploy the cluster:
eksctl create cluster -f galileo-cluster.yaml
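Because the template relies on a manual find-and-replace, a quick check for leftover placeholders before running eksctl can catch a missed edit. This is a minimal sketch; check_placeholders is a hypothetical helper, and it only looks for the CUSTOMER_NAME marker mentioned above.

```shell
# Print any lines that still contain CUSTOMER_NAME and return 1 if one is found.
# $1 = path to the cluster config file
check_placeholders() {
  if grep -n 'CUSTOMER_NAME' "$1"; then
    echo "unreplaced placeholder in $1" >&2
    return 1
  fi
  return 0
}

# Run the check only when the file exists in the current directory.
if [ -f galileo-cluster.yaml ]; then
  if check_placeholders galileo-cluster.yaml; then
    echo "galileo-cluster.yaml has no CUSTOMER_NAME markers left"
  fi
fi
```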
Step 3: EKS IAM Identity Mapping
Mapping the Galileo IAM role to a Kubernetes identity ensures that only users who have access to this role can deploy changes to the cluster. Account owners can also make changes. Create the mapping with eksctl using the following command:
eksctl create iamidentitymapping \
  --cluster customer-cluster \
  --region your-region-id \
  --arn "arn:aws:iam::CUSTOMER-ACCOUNT-ID:role/Galileo" \
  --username galileo \
  --group system:masters
NOTE for the user: For connected clusters, Galileo applies changes from GitHub Actions. If you have specific network requirements, github.com should be allow-listed in your cluster’s ingress rules.
Step 4: Required Configuration Values
Customer-specific cluster values (e.g. domain name, Slack channel for notifications) are placed in a base64-encoded string and stored as a secret in GitHub, which Galileo’s deployment automation reads and uses when templating a cluster’s resource files.
Mandatory Field | Description |
---|---|
AWS Account ID | The Customer’s AWS Account ID that the customer will use for provisioning Galileo |
Galileo IAM Role Name | The AWS IAM Role name the customer has created for the galileo deployment account to assume. |
EKS Cluster Name | The EKS cluster name that Galileo will deploy the platform to. |
Domain Name | The domain under which the customer wishes to deploy the cluster, e.g. google.com |
Root subdomain | e.g. “galileo” as in galileo.google.com |
Trusted SSL Certificates (Optional) | By default, Galileo provisions Let’s Encrypt certificates. If you wish to use your own trusted SSL certificates, submit 1. a base64 encoded string of the full certificate chain, and 2. a separate base64 encoded string of the signing key. |
AWS Access Key ID and Secret Access Key for Internal S3 Uploads (Optional) | If you would like to export data into an S3 bucket of your choice, let us know the access key and secret key of the account that can make those upload calls. |
NOTE for the user: Let Galileo know before deployment whether you’d like to use Let’s Encrypt or your own certificate.
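If you supply your own certificates, the chain and the signing key each need to be base64 encoded. A possible way to produce those strings is sketched below. It assumes a single-line string without line wraps is wanted, which the text above does not specify; encode_file and the .pem file names are hypothetical.

```shell
# Encode a file as one base64 line; stripping newlines keeps the output
# consistent between GNU and BSD versions of base64.
# $1 = path to the file to encode
encode_file() {
  base64 < "$1" | tr -d '\n'
}

# Intended use, with hypothetical file names:
#   encode_file fullchain.pem > fullchain.b64
#   encode_file privkey.pem   > privkey.b64

# Small self-contained demonstration on a temporary file.
demo=$(mktemp)
printf 'example' > "$demo"
encode_file "$demo"; echo
# prints: ZXhhbXBsZQ==
rm -f "$demo"
```

The signing key is sensitive, so the encoded output should be handled with the same care as the key itself.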
Step 5: Access to Deployment Logs
As a customer, you have full access to the deployment logs in Google Cloud Storage, where you can view all configuration. A customer email address must be provided to be granted access to these logs.
Step 6: Customer DNS Configuration
Galileo has 4 main URLs (shown below). To make the URLs accessible across the company, set the following DNS records in your DNS provider after the platform is deployed.
Time taken: 5-10 minutes (after the ingress endpoint / load balancer has been provisioned)
Service | URL |
---|---|
API | api.galileo.company.[com|ai|io…] |
UI | console.galileo.company.[com|ai|io…] |
Grafana | grafana.galileo.company.[com|ai|io…] |
Each URL must be entered in your DNS management system as a CNAME record that points to the ELB address. You can find this address by listing the Kubernetes ingresses that the platform has provisioned.
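As a rough illustration of the records involved, the sketch below prints one CNAME line for each hostname in the table. The helper name print_cname_records, the example.com domain, and the load balancer address are all hypothetical; the real address comes from your cluster.

```shell
# Print one CNAME line per Galileo hostname listed in the table above.
# $1 = root subdomain, $2 = domain, $3 = load balancer address
print_cname_records() {
  for svc in api console grafana; do
    printf '%s.%s.%s CNAME %s\n' "$svc" "$1" "$2" "$3"
  done
}

# The load balancer address appears in the ADDRESS column of:
#   kubectl get ingress --all-namespaces
# Hypothetical values for illustration:
print_cname_records galileo example.com my-elb-123.us-west-2.elb.amazonaws.com
```

Once the records have propagated, a lookup such as dig +short CNAME api.galileo.example.com should return the load balancer address.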
Step 7: Post-deployment health-checks
GPU Enabled Nodes
For specialized tasks that require GPU processing, such as machine learning workloads, Galileo supports the configuration of GPU-enabled node pools. Here’s how you can set up and manage a node pool with GPU-enabled nodes using eksctl, a command line tool for creating and managing Kubernetes clusters on Amazon EKS.
Creating a GPU-enabled Node Pool
- Node Pool Creation: Use eksctl to create a node pool with an Amazon Machine Image (AMI) that supports GPUs. This example uses g6.2xlarge instances and specifies a GPU-compatible AMI.
eksctl create nodegroup --cluster your-cluster-name --name galileo-ml --node-type g6.2xlarge --nodes-min 1 --nodes-max 5 --node-ami ami-0656ebce2c7921ec0 --node-labels "galileo-node-type=galileo-ml" --region your-region-id
In this command, replace your-cluster-name and your-region-id with your specific details. The --node-ami option specifies the exact AMI that supports CUDA and GPU workloads.
- If the cluster has low usage and you want to save costs, you may instead choose a cheaper GPU instance type such as g4dn.2xlarge. This saves money only when usage is too low to saturate one GPU; otherwise it can cost more. Do not choose this option if you use Protect, which requires low real-time latency.
Using Managed RDS Postgres DB server
To use a managed RDS Postgres DB server, create an RDS Aurora cluster directly in the AWS console, then create a Kubernetes Secret and ConfigMap so that the Galileo app can use them to connect to the DB server.
Creating RDS Aurora cluster
- Go to AWS Console > RDS Service and create an RDS subnet group.
  - Select the VPC in which the EKS cluster is running.
  - Select AZs A and B and the respective private subnets.
- Next, create an RDS Aurora Postgres cluster. The recommended configuration is listed below. General fields such as cluster name, username, and password can be entered according to cloud best practice.
Field | Recommended Value |
---|---|
Engine Version | 16.x |
DB Instance class | db.t3.medium |
VPC | EKS cluster VPC ID |
DB Subnet Group | Select subnet group created in step 1 |
Security Group ID | Select Primary EKS cluster SG |
Enable Encryption | true |
- Create the Kubernetes Secret and ConfigMap: Add the following to a file called galileo-rds-details.yaml and replace every ${...} marker with the appropriate value. Then run kubectl apply -f galileo-rds-details.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: galileo
---
apiVersion: v1
kind: Secret
metadata:
  name: postgres
  namespace: galileo
type: Opaque
# stringData accepts plain-text values; the data field would require base64-encoded values.
stringData:
  GALILEO_POSTGRES_USER: "${db_username}"
  GALILEO_POSTGRES_PASSWORD: "${db_master_password}"
  GALILEO_POSTGRES_REPLICA_PASSWORD: "${db_master_password}"
  GALILEO_DATABASE_URL_WRITE: "postgresql+psycopg2://${db_username}:${db_master_password}@${db_endpoint}/${database_name}"
  GALILEO_DATABASE_URL_READ: "postgresql+psycopg2://${db_username}:${db_master_password}@${db_endpoint}/${database_name}"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources
  namespace: galileo
  labels:
    app: grafana
data:
  datasources.yaml: |
    apiVersion: 1
    datasources:
      - access: proxy
        isDefault: true
        name: prometheus
        type: prometheus
        url: "http://prometheus.galileo.svc.cluster.local:9090"
        version: 1
      - name: postgres
        type: postgres
        url: "${db_endpoint}"
        database: ${database_name}
        user: ${db_username}
        secureJsonData:
          password: ${db_master_password}
        jsonData:
          sslmode: "disable"
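The two database URL values in the Secret follow one pattern, so it can be convenient to assemble them once and paste the result into the manifest. The sketch below shows that pattern; build_db_url and the sample values are hypothetical, and it assumes the password contains no characters that would need percent-encoding inside a URL (such as @, : or /).

```shell
# Assemble a connection URL in the format used by GALILEO_DATABASE_URL_WRITE
# and GALILEO_DATABASE_URL_READ.
# $1 = user, $2 = password, $3 = endpoint (host or host:port), $4 = database name
build_db_url() {
  printf 'postgresql+psycopg2://%s:%s@%s/%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical sample values for illustration only.
build_db_url galileo_user s3cretpass db.example.internal:5432 galileo
# prints: postgresql+psycopg2://galileo_user:s3cretpass@db.example.internal:5432/galileo
```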