Amplifying the business value of analytics with a cloud data warehouse
Cloud data warehouses give companies a scalable data infrastructure and the ability to handle high-performance analytics workloads. The pandemic has further driven many companies to migrate from traditional enterprise data warehouses to the cloud in order to manage sudden changes in analytics requirements.
Sigmoid conducted a webinar to have an in-depth discussion with data science heads of major companies to understand their journey of transitioning to the cloud, best practices for efficiently operating data & analytics in the cloud, benefits of moving to the cloud, and more.
The panel consisted of:
-Nelson Chieh (Head of Data, Advanced Analytics & Martech, Natura & Co.)
-Iván Herrero Bartolomé (Chief Data Officer, Intercorp)
-Juan Carlos Oré (Chief Data Officer, Credicorp Capital), and
-Anush Kumar (Chief Growth Officer, LATAM Sigmoid)
The panel was moderated by Kathleen L D Maley (Member, Analytics Expert Network, International Institute of Analytics). The detailed profiles of the speakers can be found here.
Moderator: What are the key reasons for organizations to migrate to the cloud from on-premise?
Nelson: One of the main reasons to move our on-premises data to the cloud is to be able to deploy more sophisticated AI models. Cloud solutions give us more flexibility, scalability, and integration tools. We can access up-to-date solutions without worrying about data governance and data quality. The cloud also makes infrastructure management easier and is cost-effective. As a result, we can now bring our solutions to market faster.
Ivan: One of the key reasons is flexibility. Secondly, it allows us to access more computing power than we can in an on-prem architecture. Thirdly, everything is faster when you work in the cloud. Overall, it drives value for your business. You do not have to worry about the hardware required for on-prem or about keeping the firewall running smoothly. For instance, we are currently using the serverless components of GCP, and we do not have to spend any time on challenges such as optimizing functionality, performance, or servers. These are some of the biggest differences you find in the cloud.
Moderator: What challenges were solved by migrating to the cloud and what problems remained unaddressed?
Ivan: We have achieved what we were expecting to solve with the cloud. We are now processing billions of rows of data seamlessly and efficiently. That’s something we couldn’t do in our previous on-premise architecture. Apart from flexibility, we gained the computing power to run compute-intensive workloads in a cost-effective way. We are doing things faster than ever, adding value to the business. The cloud solves the technical side of the equation. But we have to realize that there is a business side to it too. One challenge is that moving to the cloud does not solve the problems related to siloed data. Moving to the cloud is not something magical. There’s a lot of effort you have to put in, in parallel, to solve the other part of the equation.
Anush: With on-premises data, working with machine learning models came with challenges such as low computing and processing power, which are addressed by moving to the cloud. The use of cloud data warehouses such as Snowflake, Amazon Redshift, and BigQuery has helped businesses handle large volumes of data. However, enterprises still deal with data silos, and there is still a lot of mismatch in the data. We are striving to solve this challenge.
Juan: I completely agree with Ivan and Anush. Simply moving to the cloud is not going to solve the challenges that existed on-prem such as quality of data or data silos. Rather than just focusing on moving to the cloud, one should focus on addressing the core challenges of the business. The first step is to govern the data, and then, focus on the challenges of the business.
Moderator: What were the primary challenges you faced in establishing cloud-enabled data analytics? And how did you overcome them?
Nelson: I think the common misunderstanding about moving to the cloud is that it automatically brings a data-driven culture, and that you can magically deploy machine learning algorithms faster. Until all your data has been moved to the cloud platform, you still have to go back and forth with the on-prem data. Our mindset should be that of adopting a cloud-first mentality and not a cloud-only approach. So, cultural change is very important.
Juan: I agree with Nelson’s point on changing people's mindset. We also have to educate the security team. When you mention the cloud to the security team, they are scared that the data is going to be available to everybody. You have to convince them that the data is going to be more secure than it was on-prem. You have to pay attention to security policies.
Moderator: Moving on to the cost, were you able to monetize your data and analytics and get the ROI that you were expecting after migrating to the cloud?
Juan: There are many ways to move data to the cloud, such as by domain, by department, by source, or by country, among others. I prefer moving data by use case. We selected a few main pain points from the business to see which areas were giving us the most results, and accordingly selected those business cases to move to the cloud. It might take years to move from on-prem to the cloud. It is like building with Lego, piece by piece. Following this strategy, we were able to get the ROI that we were looking for.
Ivan: Setting an expected ROI is difficult. Suppose you move all your data to the cloud: if you are not scaling the business or affecting it in any way, you cannot expect an ROI. Simply moving data from an on-prem architecture to the cloud will not bring the desired business outcome. So, I would probably rephrase the question as: are our analytics strategy and our analytics efforts bringing tangible benefits to our business? And the answer is absolutely yes. Are they showing an ROI? Absolutely, yes. I would like to emphasize that you shouldn’t expect an ROI just by moving data to the cloud. Very often, companies misunderstand this point.
Moderator: Do you believe that the performance, scalability, and integration capabilities are different in different cloud offerings?
Anush: We generally run benchmarks to understand performance, scalability, and integration. With benchmarking, we try to understand the volume of data that we are dealing with, the level of complexity, and more. Once the benchmarking is done, we can draw conclusions such as which kinds of data should be moved to the cloud and which need to be kept on-prem. The choice of cloud provider stems from what you are trying to achieve with the data.
Ivan: All three big cloud providers (Microsoft Azure, Amazon Web Services, and Google Cloud Platform) offer scalability and performance. The question here should be how hard it is for users to keep their cloud infrastructure scaling efficiently without the need for highly specialized skills. That should be the key point to assess when evaluating different cloud providers, rather than the number of systems, the system architecture, and so on. One should be able to focus on using the components rather than worrying about the tech behind them.
Moderator: What learnings or guidance do you have regarding solving the problem of siloed data? After moving to the cloud, what should be done, and what should not be done? Any insights on why moving to the cloud is so difficult?
Anush: This is a problem that most clients face. For instance, a client’s sales team decided to use Salesforce, thinking that the same data could be useful for the marketing team. However, there can be a lot of mismatches and duplication in the data. At Sigmoid, our data engineering team follows best practices to make sure that duplication is quickly removed. We use technology and tools to streamline data. While this has solved the problem to a great extent, we are still working on bringing best practices to deal with data silos.
Nelson: The first point to keep in mind while dealing with siloed data is how the IT team can deliver solutions for business people. The second point is knowledge management. Many siloed data sets come without the business rules needed to understand the solution built on them. Recognizing siloed data at the beginning of the data migration process can be valuable. This data should then be shared in a secure way with the data user community. Our strategy is to first catalog the siloed data. We then bring this data into our production environment so that the IT team can do the reverse mapping and documentation. We then rebuild the whole solution using the correct technology on the correct data source. It is not only a matter of data owners' willingness to isolate or secure data; it is a data governance issue. It helps to take a user-first approach.
Moderator: Do you think that enterprises could have been better prepared to manage uncertainties, like COVID-19, faster if they had already moved to the cloud? Do you think that moving to the cloud now will help them set up for quicker response to crises?
Ivan: I wouldn’t want to simply say, "yes, of course." It depends on the amount of data that each company has. In our case, we moved to the cloud a few months before COVID-19 hit us. It proved to be very useful. When the pandemic hit, the architecture that we had built enabled us to understand how consumer behavior was changing. We were able to make decisions accordingly and keep serving our customers. Moving to the cloud gave us the granularity, the kinds of analyses, and the speed of interaction with the data that we needed. We wouldn’t have been able to achieve that in the previous architecture. So in our case, it absolutely helped.
Anush: Working with some of the companies in the CPG and e-commerce industries, we found that real-time data processing was minimal before the pandemic. That changed drastically during the pandemic. Our engineering team is now processing terabytes of data generated by these companies, especially in the e-commerce industry, to support big business decisions.
Nelson: Our business model is direct selling, and the pandemic directly impacted our customer count. So, going digital was not an option but a necessity if we wanted to survive. But just deciding to go digital is not enough; we had to reorient our decisions and accelerate a data-driven culture. It had to be done quickly, with no time to train models. Moving to the cloud helped us make this transition quickly.
Moderator: What would you do differently with respect to leveraging the cloud for data and analytics, if you had the chance to do it over?
Anush: The first recommendation would be to analyze whether the client needs the whole ecosystem to move to the cloud or whether there are priority use cases that need to be moved first. The second point would be to identify the right cloud service provider for their needs. For instance, a client may be using Amazon Redshift, but there is a possibility of moving to Snowflake. We try to provide benchmarking to really understand the business problem and provide the right recommendations.
Ivan: My suggestion would be: don’t base your KPIs on volume, such as how much data you are using or how many users are engaging with the solutions.
Juan: Spend more time on the change management process. It is very important because, by moving to the cloud, you are implementing a new way of working. If you don’t have it sorted out, it is going to be hard to adopt the changes.
About the Author
Srishti is Content Marketing Manager at Sigmoid with a background in tech journalism. She has extensively covered the data science and AI space in the past and is passionate about the technologies defining it.