Log Analytics and Monitoring for Google Kubernetes Engine (GKE) with Hipster Shop

This blog will guide you through setting up log analytics, monitoring, and alerting for a Google Kubernetes Engine (GKE) cluster, using the Hipster Shop observability demo application as a reference.

1. Introduction to Kubernetes Monitoring and Log Analytics

Monitoring and log analytics in Kubernetes are essential for ensuring:

  • Performance: Detecting latency issues, CPU bottlenecks (e.g. throttling), and slowdowns.
  • Reliability: Catching errors and failures before they impact end-users.
  • Security: Identifying security anomalies like unauthorized access.

Effective monitoring and logging provide insights into application health, inter-service communication, and resource usage, enabling faster troubleshooting and optimization.

2. Why Use Hipster Shop as a Demo?

Hipster Shop is a microservices-based application designed by Google, simulating an e-commerce platform. It consists of several services (e.g., emailservice, checkoutservice, cartservice, etc.), making it an ideal reference for:

  • Microservices Monitoring: Hipster Shop’s architecture mirrors real-world applications with interdependent services.
  • Load Testing: The demo includes a loadgenerator to simulate realistic traffic.
  • Dynatrace Integration: The demo supports full observability with tools like Dynatrace to monitor, analyze, and troubleshoot services.

3. Setting Up Google Kubernetes Engine (GKE)

Prerequisite:

Ensure the following tools are installed on your machine:

  • kubectl
  • gcloud
  • git
  • helm
  • jq

Step 1: Create a GKE Cluster

Start by creating a GKE cluster via the Google Cloud Console or CLI:

gcloud container clusters create hipster-shop-cluster \
    --zone us-west1-a \
    --num-nodes 2 \
    --machine-type e2-standard-8

This sets up a two-node Kubernetes cluster; the e2-standard-8 machine type provides enough CPU and memory to run all of the Hipster Shop microservices.
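Before kubectl can talk to the new cluster, fetch its credentials so your local kubeconfig points at it (the cluster name and zone match the create command above):

gcloud container clusters get-credentials hipster-shop-cluster \
    --zone us-west1-a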

4. Deploying Hipster Shop on GKE

Clone the Hipster Shop Repository:

git clone https://github.com/GoogleCloudPlatform/microservices-demo.git
cd microservices-demo
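This walkthrough deploys everything into a dedicated hipster-shop namespace. The upstream manifests do not create it, so create it first (this namespace is an assumption of this setup; you can also deploy into the default namespace and drop the -n flag from the commands below):

kubectl create namespace hipster-shop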

Deploy Hipster Shop to GKE:

kubectl apply -f ./release/kubernetes-manifests.yaml -n hipster-shop

This command deploys each microservice, along with its Kubernetes objects such as Services and Deployments, into the hipster-shop namespace of the GKE cluster.

Verify the pods are ready:

kubectl get pods -n hipster-shop

After a few minutes, you should see all pods in the Running state.

NAME                                     READY   STATUS    RESTARTS   AGE
adservice-86d5fd9c56-tprp6               1/1     Running   0          75s
cartservice-6bb7677d47-lv25q             1/1     Running   0          78s
checkoutservice-64fd47cdfb-ldm8w         1/1     Running   0          81s
currencyservice-8587877797-q24bg         1/1     Running   0          77s
emailservice-78d967474f-bdl2k            1/1     Running   0          82s
frontend-84fb499859-f66dw                1/1     Running   0          79s
paymentservice-c5fb8bfb6-6cvs8           1/1     Running   0          79s
productcatalogservice-64cffd9d98-7vdhl   1/1     Running   0          78s
recommendationservice-549cc495d-285w5    1/1     Running   0          80s
redis-cart-867cd85fd4-l6zkm              2/2     Running   0          76s
shippingservice-685bc78f7b-jjg9z         1/1     Running   0          77s

Expose the Frontend:

helm upgrade --install ingress-nginx ingress-nginx \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace ingress-nginx --create-namespace

This command installs the NGINX Ingress Controller, which provisions a load balancer to expose the frontend of the Hipster Shop application.
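The controller by itself does not route any traffic; it also needs an Ingress resource pointing at the Hipster Shop frontend. The manifest below is a minimal sketch, assuming the frontend Service from the upstream manifests is named frontend and listens on port 80 (the resource name frontend-ingress is arbitrary):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: frontend-ingress
  namespace: hipster-shop
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 80

Apply it with kubectl apply -f frontend-ingress.yaml.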

Get the URL to your deployed Hipster Shop:

kubectl get svc ingress-nginx-controller -n ingress-nginx -ojson | jq -j '.status.loadBalancer.ingress[].ip'

5. Implementing Log Analytics in GKE

There are two ways you can enable Log Analytics to get insights into your GKE cluster. One way is to upgrade an existing bucket. The other is to create a new log bucket with Log Analytics enabled.

Step 1: Create a new log bucket

You can configure Cloud Logging to create a new log bucket with Log Analytics enabled.

  1. Navigate to Logging, then click Logs Storage in the Cloud Console
  2. Click CREATE LOG BUCKET at the top and name the log bucket hipstershop_log
  3. Check both Upgrade to use Log Analytics and Create a new BigQuery dataset that links to this bucket
  4. Type in a dataset name like hipstershop_dataset
  5. Click Create bucket to create the log bucket
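If you prefer the CLI, the same bucket can be created with gcloud; this is a sketch assuming the bucket lives in the global location (the linked BigQuery dataset can still be added from the Console as described above):

gcloud logging buckets create hipstershop_log \
    --location=global \
    --enable-analytics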

Step 2: Write to the new log bucket

To write logs from our GKE cluster to the new log bucket, you need to create a log sink: specify the log destination and set up a log filter that captures the GKE cluster's container logs.

Navigate to Logging, then click Logs Explorer in the Cloud Console, enable Show query at the top right, and add the following filter:

resource.type="k8s_container"

Click on Actions and then Create sink.

  • Name the sink as hipstershop_sink
  • Select Logging bucket as the sink service, then choose the hipstershop_log bucket
  • Click CREATE SINK
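The equivalent sink can also be created from the CLI. A sketch, assuming your project ID is PROJECT_ID and the bucket was created in the global location:

gcloud logging sinks create hipstershop_sink \
    logging.googleapis.com/projects/PROJECT_ID/locations/global/buckets/hipstershop_log \
    --log-filter='resource.type="k8s_container"'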

Step 3: Perform analytics on logs

Now we can run our own queries in Log Analytics against the logs stored in the hipstershop_log bucket. In the Log views list, select hipstershop_log._AllLogs, and then select Query. The Query pane is populated with a default query that references the selected log view.

To find the min, max and average latency (in ms) for the frontend service:

SELECT
  hour,
  MIN(took_ms) AS min,
  MAX(took_ms) AS max,
  AVG(took_ms) AS avg
FROM (
  SELECT
    FORMAT_TIMESTAMP("%H", timestamp) AS hour,
    CAST(JSON_VALUE(json_payload, '$."http.resp.took_ms"') AS INT64) AS took_ms
  FROM `hipster-shop-440423.global.hipstershop_log._AllLogs`
  WHERE
    json_payload IS NOT NULL
    AND SEARCH(labels, "frontend")
    AND JSON_VALUE(json_payload.message) = "request complete"
  ORDER BY took_ms DESC, timestamp ASC
)
GROUP BY 1

To find all sessions with shopping cart checkout:

SELECT
  JSON_VALUE(json_payload.session) AS session_id,
  COUNT(*) AS count
FROM `hipster-shop-440423.global.hipstershop_log._AllLogs`
WHERE
  JSON_VALUE(json_payload['http.req.method']) = "POST"
  AND JSON_VALUE(json_payload['http.req.path']) = "/cart/checkout"
GROUP BY
  JSON_VALUE(json_payload.session)

6. Monitoring your GKE cluster health with Dashboards and Alerts

To effectively monitor your GKE cluster and observe how it handles the load generated by the Hipster Shop demo application, you can set up a custom Google Cloud Monitoring dashboard. Displaying key metrics such as CPU request utilization or memory limit utilization gives you insight into resource usage. For instance, we want to monitor CPU utilization because if it consistently nears or exceeds 100%, the cluster may be at risk of CPU throttling, which can degrade application performance.

Step 1: Create a Custom Dashboard to monitor your pods

  1. Go to the Google Cloud Console and open Monitoring
  2. Select Metrics Explorer from the left menu and click on Select a metric
  3. Select Kubernetes Container > Popular Metrics > CPU request utilization
  4. Click Apply
  5. Now click the Save Chart button and name the dashboard as K8s Container Dashboard
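If you prefer to define the chart as a query rather than clicking through the metric picker, the MQL editor in Metrics Explorer accepts a query along these lines; a sketch, assuming the Hipster Shop pods run in the hipster-shop namespace:

fetch k8s_container
| metric 'kubernetes.io/container/cpu/request_utilization'
| filter resource.namespace_name == 'hipster-shop'
| group_by 1m, [value_request_utilization_mean: mean(value.request_utilization)]
| every 1m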

Step 2: Create an Alert to Identify Incidents

We will create an alert policy to detect high CPU utilization among the containers and send notifications when the CPU usage exceeds a specific threshold, indicating potential resource constraints. Then we can use the dashboard to identify and respond to the incident.

  1. In the Cloud Console, navigate to Monitoring > Alerting
  2. Click + Create Policy
  3. Click on Select a metric dropdown
  4. Uncheck the Active option
  5. Click on Kubernetes Container > Container > CPU request utilization, then click Apply
  6. Set Rolling windows to 1 min, click Next
  7. Set Threshold position to Above Threshold
  8. Choose Threshold and define a Threshold value (e.g., 0.8 for 80% utilization)
  9. Under Advanced Options, set Retest window to 5 min. By using a retest window, you can reduce the number of unnecessary alerts that might occur due to temporary anomalies in metrics. Click Next
  10. Configure Notification Channels – Choose how you’d like to be notified (e.g., email, Slack, SMS), name the alert policy and then click Create Policy
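The same policy can also be created programmatically. Below is a minimal sketch of a policy file for gcloud alpha monitoring policies create, mirroring the threshold (0.8), 1-minute alignment, and 5-minute retest window chosen above; the display names and the cpu-alert-policy.json filename are placeholders, and notification channels still need to be attached separately:

{
  "displayName": "GKE container CPU request utilization",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "CPU request utilization above 80%",
      "conditionThreshold": {
        "filter": "resource.type = \"k8s_container\" AND metric.type = \"kubernetes.io/container/cpu/request_utilization\"",
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0.8,
        "duration": "300s",
        "aggregations": [
          {
            "alignmentPeriod": "60s",
            "perSeriesAligner": "ALIGN_MEAN"
          }
        ]
      }
    }
  ]
}

gcloud alpha monitoring policies create --policy-from-file=cpu-alert-policy.json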

Step 3: Monitor and Respond

With the dashboard and alert policy in place, you can now:

  • View real-time insights on CPU usage and active users.
  • Receive alerts when CPU usage exceeds the set threshold, allowing you to investigate and address potential issues before they impact the application’s performance.

7. Load Testing and Stress Analysis

We will use the loadgenerator included in the Hipster Shop demo application to simulate some user traffic.

Update Kubernetes Configuration for the Loadgenerator

In k8s-manifest.yaml, you can adjust how the loadgenerator runs in your cluster:

  • Concurrency and Duration: Define how many users and how long they should interact with the app.
  • Resource Limits: Adjust resource requests/limits to control how much CPU and memory the loadgenerator uses. An example CronJob manifest is shown below.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: loadgeneratorservice
  labels:
    app.kubernetes.io/component: loadgenerator
    app.kubernetes.io/part-of: hipster-shop
    app.kubernetes.io/name: loadgenerator
spec:
  schedule: "*/15 * * * *"  # Runs every 15 minutes
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: loadgenerator
              image: loadgenerator-image
              args: ["-u", "50", "-d", "10m"]

  • Schedule: "*/15 * * * *" triggers the CronJob every 15 minutes.
  • Users: The number of virtual users (-u 50) is set to 50. For heavier load, increase this value.
  • Duration: Set the duration (-d 10m) in the args to 10 minutes to allow a short break between runs. This prevents overlap between consecutive jobs and allows each job to finish before the next one starts.

After configuring, you can deploy the modified loadgenerator to your GKE cluster:

kubectl apply -f k8s-manifest.yaml -n hipster-shop
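To confirm the CronJob exists and to watch the jobs it spawns every 15 minutes (the name below assumes the loadgeneratorservice manifest shown earlier):

kubectl get cronjob loadgeneratorservice -n hipster-shop
kubectl get jobs -n hipster-shop --watch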

This applies load to the web store, and after a while we will start receiving alert notifications. We can open the incident summary and respond by clicking Acknowledge Incident. The incident status now shows Acknowledged, but that doesn't resolve the problem; in reality, you still need to fix the root cause.

8. Conclusion

Monitoring and log analytics are essential for managing complex microservices applications in Kubernetes. Using Google Cloud Log Analytics and Monitoring, you can achieve comprehensive observability for your GKE cluster and the containerized applications running on it. But monitoring doesn't stop at metrics; actionable insights from logs and traces help you detect and resolve issues faster, keeping your applications running smoothly.

Thank you for reading my blog, and I hope you found it helpful!