How to monitor and optimize Google Cloud costs
According to a Gartner report, more than 70% of companies have migrated at least some of their workloads to the public cloud, and this momentum is only expected to increase. Enterprises prefer cloud technologies for the several advantages they offer – ease of installation, easy management, and cost effectiveness, to name a few. Most companies today are moving to cloud-native solutions to reduce infrastructure maintenance effort and cost. However, cloud data management and optimizing the cost of cloud computing remain an ongoing challenge. If billing is not monitored and running services are not kept in check, a business can end up incurring unnecessary and inefficient costs.
There are many ways to optimize the cost of various cloud services. In this blog, we discuss a use case where we helped a customer optimize their Google Cloud Platform (GCP) costs.
Best practices to optimize GCP cost
Running operations on the cloud offers benefits such as scaling up and down to meet demand and reduced operational expenditure. One can adopt the following best practices to keep GCP cloud costs in check.
- Exploring billing and cost management tools to get visibility and insights into costs incurred.
- Spending only on resources that are needed and getting rid of resources that are no longer used
- Optimizing cloud storage costs and performance by paying close attention to storage utilization and configurations
- Leveraging cloud-based data warehouses and other centralized storage systems
This blog looks at Sigmoid’s approach that helped a client optimize their GCP cloud costs by monitoring them on a regular basis and setting up an alerting system with the help of Grafana and a simple Python script, using the following steps:
- Extracting billing data
- Setting up Grafana dashboards with BigQuery dataset
- Setting up alerts on billing data
1. Extracting billing data
The first step was to understand the spending trends, which required billing data. This was facilitated by enabling GCP billing export to BigQuery using the GCP Billing Export tool. The process included the following steps:
- Enabling export of Cloud Billing data to BigQuery. The export tool pulls detailed Google Cloud billing data (such as usage, cost estimates, and pricing data) into a BigQuery dataset that is specified while enabling the export. The billing data in BigQuery can then be used for detailed analysis
- To avoid a mismatch between the GCP billing console and the BigQuery data, the timezone of the exported timestamps (usage_start_time, usage_end_time) should be adjusted, since they are stored in UTC
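The timezone adjustment above can be handled in the query itself. The sketch below builds a daily cost query over the export table; the table name and timezone are placeholders, not the client's actual values.

```python
# Placeholder billing export table -- substitute your own project/dataset/table.
BILLING_TABLE = "my-project.billing.gcp_billing_export_v1_XXXXXX"  # hypothetical

def daily_cost_query(table: str, tz: str = "Asia/Kolkata") -> str:
    """Build a SQL query that totals cost per day in the given timezone.

    usage_start_time is stored in UTC in the export, so it is converted
    to the billing timezone before truncating to a date, which keeps the
    results consistent with the GCP billing console.
    """
    return f"""
    SELECT
      DATE(usage_start_time, "{tz}") AS usage_date,
      ROUND(SUM(cost), 2) AS total_cost
    FROM `{table}`
    GROUP BY usage_date
    ORDER BY usage_date
    """

# The query can then be run with the google-cloud-bigquery client:
# from google.cloud import bigquery
# rows = bigquery.Client().query(daily_cost_query(BILLING_TABLE)).result()
```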
2. Setting up Grafana dashboards with the BigQuery dataset
After exporting the data to BigQuery, the next step was to visualize the billing data in dashboards using Grafana. This helped the customer understand the billing trends more accurately. It included the following steps:
- Installing the BigQuery data source plugin in Grafana
- Setting the timezone accordingly to get accurate cost data
- Setting up dashboards to break down cost by project, team, or tags. Below is a sample query for cost per GCP service
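A cost-per-service panel can be driven by a query along these lines (the table name is a placeholder; service.description and cost are standard fields in the billing export schema):

```python
# Sample query string for a "cost per GCP service" Grafana panel.
# Replace the table name with your own billing export table.
COST_PER_SERVICE_SQL = """
SELECT
  service.description AS service,
  ROUND(SUM(cost), 2) AS total_cost
FROM `my-project.billing.gcp_billing_export_v1_XXXXXX`
GROUP BY service
ORDER BY total_cost DESC
"""
```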
Below are some of the sample dashboards we created:
- i. Dashboards by weekly trends
- ii. Dashboard with visualization
The dashboard created above had the following features:
- It showcases the overall cost for the selected filter
- One can filter the dashboard by project, services, or any custom tags (to filter by tag, one has to unnest the labels and assign them to new fields; if a tag is not present, it can be set to No-Data or any custom value, since Grafana ignores null values)
- It can show the weekly cost trend by project, service, or tags
- It displays the total cost incurred in a given time period by project, service, or tags
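The tag filtering described above relies on the fact that labels in the billing export are an array of key/value structs, so they must be unnested. A sketch of such a query, with a fallback value so untagged rows are not dropped as nulls (the tag key and fallback are illustrative):

```python
def cost_by_tag_query(table: str, tag_key: str, fallback: str = "No-Data") -> str:
    """Build a SQL query that totals cost per value of a given label ("tag").

    Labels are stored as ARRAY<STRUCT<key, value>> in the export, so the
    label value is pulled out with UNNEST; rows without the label get the
    fallback value instead of NULL, which Grafana would otherwise ignore.
    """
    return f"""
    SELECT
      IFNULL(
        (SELECT l.value FROM UNNEST(labels) AS l WHERE l.key = "{tag_key}"),
        "{fallback}") AS tag,
      ROUND(SUM(cost), 2) AS total_cost
    FROM `{table}`
    GROUP BY tag
    ORDER BY total_cost DESC
    """
```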
3. Setting up alerts on billing data
After creating the dashboards, the next task was to set up alerts on the billing dataset. Alerts provided the customer with daily billing updates and helped monitor resources and their cost consumption at a granular level (by service, project, tags, etc.). To accomplish this, we developed a Python script with the help of the Python BigQuery and SendGrid modules. We further created a Kubernetes CronJob to schedule this job on a daily, weekly, and monthly basis, as described below:
- i. Daily alerts
- It fetches weekly data and creates a table
- It highlights cells with cost above the threshold level
- It highlights rows if today’s cost is greater than the cost on the same day of the previous week
Fig: Sample screenshot of daily alerts received over email
- ii. Weekly and monthly cost alerts
- It compares the cost of the current week/month with that of the previous week/month
- It highlights cells with cost above the threshold level
- It highlights rows if today’s cost is greater than the cost on the same day of the previous week
Fig: Sample screenshot of weekly and monthly alerts received over email
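A condensed sketch of the alert job is below. The threshold, colors, email addresses, and helper names are illustrative, not the client's actual values; the highlighting rules mirror the ones described above (cells above a cost threshold, and rows where today's cost exceeds the same day last week).

```python
THRESHOLD = 100.0  # hypothetical daily cost threshold in the billing currency

def cell_style(cost: float, threshold: float = THRESHOLD) -> str:
    """Return an inline style that highlights cells above the threshold."""
    return "background:#f8d7da" if cost > threshold else ""

def row_style(today: float, same_day_last_week: float) -> str:
    """Highlight the whole row when today's cost exceeds last week's."""
    return "background:#fff3cd" if today > same_day_last_week else ""

def build_html_table(rows):
    """Render (service, today_cost, last_week_cost) tuples as an HTML table."""
    body = ""
    for service, today, last_week in rows:
        body += (
            f'<tr style="{row_style(today, last_week)}">'
            f"<td>{service}</td>"
            f'<td style="{cell_style(today)}">{today:.2f}</td>'
            f"<td>{last_week:.2f}</td></tr>"
        )
    return ("<table><tr><th>Service</th><th>Today</th>"
            "<th>Same day last week</th></tr>" + body + "</table>")

# The rows would come from BigQuery and the mail would go out via SendGrid,
# roughly as follows (requires the google-cloud-bigquery and sendgrid packages):
#
#   from google.cloud import bigquery
#   from sendgrid import SendGridAPIClient
#   from sendgrid.helpers.mail import Mail
#
#   rows = [(r.service, r.today_cost, r.last_week_cost)
#           for r in bigquery.Client().query(DAILY_COST_SQL).result()]
#   mail = Mail(from_email="alerts@example.com", to_emails="team@example.com",
#               subject="Daily GCP cost report",
#               html_content=build_html_table(rows))
#   SendGridAPIClient("SENDGRID_API_KEY").send(mail)
```

Scheduling this as a Kubernetes CronJob then only requires packaging the script in a container and setting the desired schedule (e.g. a daily, weekly, or monthly cron expression).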
Strong DataOps practices are critical not only to ensure that the data infrastructure is highly available and performant but also to reduce ongoing operational costs. This project gave the client daily cloud cost updates and also provided them with dashboards to perform detailed analysis. With the help of this implementation, we also identified over-provisioned resources in the system. The solution helped the customer optimize their infrastructure and resulted in a greater than 50% reduction in monthly costs.
Gitesh Shinde is a Software Engineer at Sigmoid. He is a tech enthusiast learning every day about DevOps, DataOps and MLOps. He loves to dig into software engineering problems and solve them with modern technologies.