Algorithmic retailing – demystifying the data myths for AI adoption

According to a Precedence Research study, rapid AI adoption is projected to take the AI-in-retail market to $54.92 billion by 2033.

While retailers are at various stages of their AI journeys to improve processes and performance, misconceptions about data and AI adoption abound. Understanding these myths, and the potential mitigations, is important for each retailer's journey.

Drawing on my experience with multiple retailers across the globe, in this article I will dive into some of the key data myths prevailing across the retail industry and the ways to mitigate them, so that retailers can make better decisions through AI adoption.

Myths about limited data

‘I do not have enough data or systems in place; hence, my data may be far from AI ready’

Myth: Many retailers, especially smaller ones, complain that they do not have enough data to leverage AI/ML models.

While it may be true that they have no enterprise data warehouse (EDW) to capture historical data, and may not buy much syndicated data, retailers would do well to realise that even the availability of basic data, such as transaction data, is good enough to start with.

A caveat, however, is that this data should be available for a sufficiently long period.

Mitigation: My experience with multiple retailers has shown that autoregression is one of the most important predictors of future sales.

The availability of basic transaction data for a certain period can be a starting point for applying machine learning. The following analyses can be performed with transaction data alone (a forecasting sketch follows this list):

·       Sales forecasting from sales history

·       Affinity/halo analysis from basket data

·       Price elasticity from price and sales
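As a minimal illustration of the first item, the sketch below fits a simple autoregressive model using lagged weekly sales as features. The file and column names ('weekly_sales.csv', 'week', 'units') are assumptions, not any specific retailer's schema.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Weekly sales history; file and column names are hypothetical.
sales = pd.read_csv("weekly_sales.csv").sort_values("week")

def add_lags(df, n_lags=4):
    """Add lagged-sales columns so that past weeks predict the next one."""
    out = df.copy()
    for k in range(1, n_lags + 1):
        out[f"lag_{k}"] = out["units"].shift(k)
    return out.dropna()

train = add_lags(sales)
X = train[[f"lag_{k}" for k in range(1, 5)]]
y = train["units"]
model = LinearRegression().fit(X, y)

# Features for next week are simply the four most recent observed weeks.
latest = pd.DataFrame([sales["units"].tail(4).iloc[::-1].tolist()],
                      columns=X.columns)
print(f"Next-week forecast: {model.predict(latest)[0]:.0f} units")
```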

In addition, other data, such as item attributes, inventory, weather, demographics, and competition, would enable a wider set of algorithms and answer a broader set of business questions.

Nor does it take an investment as big as an EDW to apply AI. To start, simple scripts written in Python or SQL are sufficient to extract data from transaction systems, and this can even be the beginning of an investment in maintaining historical data.
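A minimal extraction sketch along those lines, assuming a hypothetical SQLite transaction store; the database, table, and column names are placeholders, not a real PoS schema:

```python
import sqlite3
import pandas as pd

# Connect to the transaction store (hypothetical database name).
conn = sqlite3.connect("pos_transactions.db")
query = """
    SELECT item_id,
           DATE(txn_timestamp) AS sale_date,
           SUM(quantity)       AS units,
           SUM(net_amount)     AS sales_value
    FROM transactions
    GROUP BY item_id, DATE(txn_timestamp)
"""
daily_sales = pd.read_sql(query, conn)

# Writing each extract to a flat file is already a rudimentary history store.
daily_sales.to_csv("sales_history.csv", index=False)
```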

Myths about data quality

‘I have the historical data with all parameters in an EDW; my data is really clean and AI ready’

Myth: Many large retailers with an EDW in place claim to capture every possible data parameter.

In many cases, however, there are significant gaps in the actual data. Most retailers do not yet maintain perpetual inventory history or link returns to sale transactions. And the way data is currently tracked at the point of sale might differ from what the corresponding algorithms require.

Despite the maintenance of an EDW, the data might not be clean, because quality and upkeep degrade over time.

For example, price maintenance and promotion tracking are the parameters where inconsistency is most commonly found. In many cases, even basic data has needed substantial cleaning and transformation before algorithms could be applied to it.

Mitigation: Collaborate with the teams that manage the data to get to the required data points. At many retailers, demographics data might not be readily available to merchandisers but can be obtained from store operations or marketing teams.

In some cases, the current system might have to be configured to maintain history for additional data points. For example, to predict future demand effectively, the returns history corresponding to sales is also necessary.
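As a small illustration, the sketch below nets returns against gross sales so that models are trained on net demand; all file and column names are assumed:

```python
import pandas as pd

# Hypothetical weekly aggregates: gross sales and returns per item and week.
sales = pd.read_csv("weekly_sales.csv")      # item_id, week, units_sold
returns = pd.read_csv("weekly_returns.csv")  # item_id, week, units_returned

demand = sales.merge(returns, on=["item_id", "week"], how="left")
demand["units_returned"] = demand["units_returned"].fillna(0)
# Net demand, not gross sales, is usually what forecasting models should see.
demand["net_units"] = demand["units_sold"] - demand["units_returned"]
```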

In addition, an AI transformation requires a comprehensive set of data validation rules defined from a business perspective: for example, rules to identify outliers in price, cost, and other data points where large variation is not expected.
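One common way to express such a rule is an interquartile-range band per item. The sketch below is illustrative, with assumed file and column names:

```python
import pandas as pd

def flag_outliers(df, col, k=1.5):
    """Mark rows whose value falls outside the per-item interquartile band."""
    q1 = df.groupby("item_id")[col].transform(lambda s: s.quantile(0.25))
    q3 = df.groupby("item_id")[col].transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    return (df[col] < q1 - k * iqr) | (df[col] > q3 + k * iqr)

prices = pd.read_csv("price_history.csv")   # hypothetical extract
prices["price_outlier"] = flag_outliers(prices, "price")
prices["cost_outlier"] = flag_outliers(prices, "cost")
# Outliers go to a review queue rather than straight into the models.
review_queue = prices[prices["price_outlier"] | prices["cost_outlier"]]
```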

The general modus operandi for cleansing data is to combine various techniques while working with the relevant stakeholders. Where the retailer's teams cannot help, one approach is to treat the unclean data as missing and apply imputation techniques such as MICE (multiple imputation by chained equations) or simple averages.
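A minimal sketch of that imputation approach, using scikit-learn's IterativeImputer (a MICE-style chained-equations imputer) and a deliberately simple "unclean" rule; file and column names are assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

prices = pd.read_csv("price_history.csv")             # hypothetical extract
features = prices[["price", "cost", "units"]].copy()  # assumed numeric columns

# Treat values flagged as unclean (here, non-positive prices) as missing,
# then let the chained-equations imputer reconstruct them.
features.loc[features["price"] <= 0, "price"] = np.nan
prices[["price", "cost", "units"]] = IterativeImputer(
    random_state=0).fit_transform(features)
```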


Myths about problem solving

‘We hold an enormous amount of data; hence, good predictions are guaranteed and small data errors do not matter’

Myth: Certain retailers have had systems in place for a long time and therefore hold large volumes of historical data.

In many such cases, the assumption is that AI models are guaranteed to work well and that data errors will not matter much given the volume of data available. The thinking is that all the available data can simply be put through the models, which will then produce magical recommendations.

Mitigation: While it is true that a good amount of history is essential for AI models, it does not guarantee good results. Even with voluminous data, small errors make a big difference, because most machine learning models consider every data point and are therefore influenced by outliers.

If historical data is not sufficiently indicative of the future, models will not work. In such cases, hand-designing features becomes very important. In fact, in retail, given the nature and quality of the data typically available, hand-designed features are, in my experience, the single most important contributor to reliable predictions.

Hence, the value of applying domain context to feature engineering cannot be emphasised enough.

Myths about adoption

‘AI algorithms are considered a black box; will my business users adopt them?’

Myth: Most ML algorithms, especially those that leverage deep neural networks, function like a ‘black box’.

Business users are seldom comfortable using AI they find difficult to understand, and so the business community is averse to adopting it.

Mitigation: While explainable AI remains an active research topic, existing techniques and approximations can be leveraged to improve the user experience and help adoption.

Techniques such as variable importance (VarImp), SHAP values, and neural network sensitivity analysis are well established, and leveraging them can significantly improve user confidence. In my experience, a simple sales decomposition chart, built by running a stepwise regression on the VarImp output, helps tremendously.
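The stepwise-regression decomposition described above is not reproduced here, but a closely related and widely available technique is permutation importance, a model-agnostic analogue of VarImp. The sketch below ranks hypothetical demand drivers for a business user; all file and column names are assumptions:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

data = pd.read_csv("model_inputs.csv")   # hypothetical modelling table
X = data[["price", "promo_flag", "holiday_flag", "lag_1_units"]]
y = data["units"]

model = GradientBoostingRegressor(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Rank drivers so a business user can see what moves the forecast.
for name, score in sorted(zip(X.columns, result.importances_mean),
                          key=lambda t: -t[1]):
    print(f"{name:>15}: {score:.3f}")
```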

Similarly, user-friendly provisioning for sensitivity analysis through what-if simulations, which lets the user change multiple parameters and see the impact on KPIs, also builds confidence. It is essential to simplify the entire process so that the model feels almost like a white box.
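A what-if simulation can be sketched as follows, reusing the hypothetical model and feature frame X from the previous sketch: vary one parameter, hold the rest fixed, and report the predicted KPI.

```python
# Reuses the hypothetical `model` and feature frame `X` from the sketch above.
baseline = X.tail(1).copy()              # most recent observed state

for pct in (-10, -5, 0, 5, 10):
    scenario = baseline.copy()
    scenario["price"] *= 1 + pct / 100   # simulate a price change
    units = model.predict(scenario)[0]
    print(f"price {pct:+d}% -> predicted units: {units:.0f}")
```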

Setting data strategy is pivotal for AI adoption

While AI is everywhere, sufficient caution and planning need to be exercised in implementing AI and in preparing the data to which the algorithms are applied.

Many Fortune 500 retailers have successfully implemented AI to improve business operations in space optimisation, assortment optimisation, and pricing decisions. Each of these involves thorough planning, formulation, cleaning, and transformation of the data to make it AI ready and drive business benefits.

About the author:

Arun Rasika Karunakaran has worked with more than 15 retailers across the US, Australia, and the UK.

She has spearheaded multiple business consulting engagements in merchandising for varied retail segments - general merchandise, fashion, fresh, pharma, home improvement, pet specialty, and office supplies.

She has strong expertise in applying technology for solving retail business problems leveraging AI-ML platforms. Her focus is on embedding sustainability in retail operations - assortment planning, store space planning, omnichannel strategies, item analytics, retail product pricing, and promotions.

She has co-authored a patent for hyper-localisation of retail assortments and optimisation of operations across the value chain. Rasika now heads presales for TCS’ AI powered retail strategic intelligence platform, TCS Optumera, in North America.