Big Data: A Gear Up for Cab’s Big Boys?
Gone are the days when people had to stand and wait for a taxi to pick them up and drop them at their destination. Now it is at their fingertips: a cab is, well and truly, only a touch away. No more hollering or wasting your time. But a lot goes on behind the scenes to get you that cab. Do the aggregators have problems covering so many cities, countries and continents? Yes, they do. They have problems galore, and one set of them can be solved by Big Data and its analysis.
The global cab aggregator space is seeing cutthroat competition. New players find it increasingly hard to differentiate themselves from their peers. Since the inception of these services, customer retention and a good ride experience have been of the utmost importance. With abundant data at their disposal, senior management needs to consider solutions that uncover actionable insights and deliver on customer expectations. It is no longer just a game of marketing, as simply competing through offers and promotional codes won’t always result in revenue. The key is simple: real-time exploratory data analytics.
Ever wondered why demand in a particular location has dropped? Is it because of a lack of cab supply? Poor driver performance in that area? Is surge pricing taking its toll? Or could it be due to political unrest or an untoward incident in the neighbourhood? In today’s data boom, you need to leverage real-time events to track and map the operational efficiency of your service.
Fig: Process workflow involved in booking a cab
Given the scale and time constraints involved, your analytics platform should be capable of handling hundreds of terabytes of data within a few seconds. Managing real-time data from different sources and providing immediate visibility into critical business metrics is a constant challenge for businesses. Big Data tooling makes this manageable. Most cab aggregators use data lakes to store the terabytes of data generated every day, and then use Spark and Hadoop to make sense of it. The data comes from a range of sources, such as SOA database tables, schemaless data stores and the event messaging system Apache Kafka.
On the product side, the cab aggregator’s data team usually does it all. The predictive models that power the ride-sharing service, from driver ETAs, fare estimates and surge prices to the heat maps that help drivers position themselves, are built by the data team from analysis of the data in the data lakes. Through statistical and historical data analysis, cab-hailing companies aim to create a positive user experience. To improve that experience further, they use data-science-driven insights to implement new business models in real time.
The service also relies on a detailed rating system – users can rate drivers, and vice versa – to build up trust and allow both parties to make informed decisions about who they want to share a car with.
Drivers in particular have to be very conscious of keeping their standards high – a leaked internal document showed that those whose score falls below a certain threshold face being “fired” and not offered any more work. They have another metric to worry about, too – their “acceptance rate”. This is the proportion of job offers they accept out of all those they receive. Drivers were told they should aim to keep this above 80%, in order to provide a consistently available service to passengers.
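As a rough illustration of the acceptance-rate metric, here is a minimal Python sketch. The `DriverStats` record, its field names and the helper functions are hypothetical, invented for this example; only the 80% target comes from the text above. It assumes the rate is accepted jobs divided by all jobs offered (accepted plus declined), which is one plausible reading rather than any aggregator’s actual formula.

```python
from dataclasses import dataclass
from typing import Optional

# Target taken from the article: drivers were told to stay above 80%.
ACCEPTANCE_TARGET = 0.80


@dataclass
class DriverStats:
    """Hypothetical per-driver counters; not a real aggregator schema."""
    driver_id: str
    accepted: int
    declined: int


def acceptance_rate(stats: DriverStats) -> Optional[float]:
    """Share of offered jobs the driver accepted, or None if none were offered."""
    offered = stats.accepted + stats.declined
    if offered == 0:
        return None
    return stats.accepted / offered


def below_target(stats: DriverStats, target: float = ACCEPTANCE_TARGET) -> bool:
    """True when the driver has a known rate that falls under the target."""
    rate = acceptance_rate(stats)
    return rate is not None and rate < target
```

Returning `None` for drivers with no offers avoids a division by zero and keeps new drivers from being flagged before they have any history.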
Reliability can be measured by:
- the number of bookings cancelled
- time taken to book the service
- wait time
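The three reliability measures listed above could be summarised from booking records along these lines. This is a sketch under stated assumptions: the `Booking` record and its fields are hypothetical, times are assumed to be in seconds, and wait time is averaged over completed rides only.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Booking:
    """Hypothetical booking record; field names are illustrative only."""
    booking_seconds: float         # time the rider spent making the booking
    wait_seconds: Optional[float]  # time until pickup; None if cancelled
    cancelled: bool


def reliability_summary(bookings: list) -> dict:
    """Summarise cancellations, booking time and wait time for a set of bookings."""
    if not bookings:
        raise ValueError("at least one booking is required")
    completed = [b for b in bookings if not b.cancelled]
    cancelled_count = len(bookings) - len(completed)
    return {
        "cancelled": cancelled_count,
        "cancellation_rate": cancelled_count / len(bookings),
        "avg_booking_seconds": mean(b.booking_seconds for b in bookings),
        # Wait time only exists for rides that actually happened.
        "avg_wait_seconds": (
            mean(b.wait_seconds for b in completed) if completed else None
        ),
    }
```

In practice these figures would be computed per city or per time window so that a drop in reliability can be traced to a specific area.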
Responsiveness can be measured by the ability to serve unscheduled requests, the availability of capacity and the lead time to confirmation.
To sum up, cab aggregators have progressed from a simple mobile application with a clean UI to one backed by an equally potent backend and support service. Their users generate several billion data points, which the aggregators now analyse in order to improve the customer experience, reduce the manpower needed and make transactions safer.