· Mon Apr 02, 2018 ·

How to always buy your clothes on sale!

Main topic: Time series

Why buy at full price when you can wait a little and buy on sale? This is a method for making better-informed purchase decisions through data science, based on a study of sales patterns in Singapore since 2014.
If you have ever felt terrible after buying an item only to see it discounted the next day, this article is for you. If you are always on the hunt for better deals, I hope to be of some help here. This is the story of how shopping at H&M made me start my first ever data science project so that I could always buy my clothes on sale.

1. The plot

A week before going to Shanghai on a business trip, I needed to buy a sweatshirt, for it was a cold February in China. Living in Singapore, the only time you are not in a t-shirt is while watching movies in the blasting air conditioning of the theaters. Buying that sweatshirt at H&M, I felt I had made a good deal for something I would then rarely use. Then came the day before the trip, and I was wandering through the malls. Sales everywhere. And the sweatshirt I had bought a few days before was now 50% off.

I felt like I wasted my money. If only I had waited a bit longer... Bothered by this bad experience, I started thinking about solutions. There might be a better way.

How can I delay my purchase for a few more days when there is a strong chance of a sale? What can I use to find out when the right time to buy is? How can I get alerted to the next sale?

2. Using Google Trends to identify past sales periods

These questions haunted me for a week. How could I get this done? Where could I find the data? Not sure the clothing companies would share this with me. Should I collect the information directly from the stores? Can I ask the shop managers to help me?
Chinese New Year, Hari Raya, Deepavali and Black Friday are well-known sales periods, but I never keep track of these celebrations in advance and I never know when the sales actually start.

My first move was to check on Google Trends and play with what I could find there.

Weekly trends in searches related to 'sales' in Singapore from 2014 to 2017

The chart above clearly helps to identify, for each year, the usual big summer sales and Black Friday periods. We can see when each one started and when it ended. But in Singapore, smaller sales are scattered throughout the year, and those are the ones I am more interested in finding. The first thing we need to do is to extract the seasonality, as the smaller sales might then become more apparent. Plotting the results in a heatmap, this is what we get:

Heatmap of the weekly seasonality in 'sales' related searches on Google in Singapore

Not too convincing. The heatmap does not change the fact that the mid-year and year-end sales still dwarf the other sales periods. One solution, then, might be to take the yearly seasonality and find its local maxima.

Identifying local maxima in 'sales' related searches in Singapore

Here, the sales related to the yearly festivities (Chinese New Year, Hari Raya and Deepavali) are clearly highlighted.
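The seasonality-then-local-maxima idea can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function names are hypothetical, the sample values are made up rather than taken from Google Trends, and a peak is simply defined as a week whose average is strictly higher than both of its neighbours.

```python
from collections import defaultdict
from datetime import date


def weekly_seasonality(observations):
    """Average the search index per ISO week number across all years."""
    buckets = defaultdict(list)
    for day, value in observations:
        buckets[day.isocalendar()[1]].append(value)
    return {week: sum(vals) / len(vals) for week, vals in sorted(buckets.items())}


def local_maxima(profile):
    """Return the weeks whose value is strictly higher than both neighbours."""
    weeks = list(profile)
    peaks = []
    for prev, cur, nxt in zip(weeks, weeks[1:], weeks[2:]):
        if profile[cur] > profile[prev] and profile[cur] > profile[nxt]:
            peaks.append(cur)
    return peaks


# Made-up weekly values for illustration only (not real Google Trends data).
sample = [
    (date(2016, 1, 4), 40), (date(2016, 1, 11), 55), (date(2016, 1, 18), 42),
    (date(2017, 1, 2), 38), (date(2017, 1, 9), 61), (date(2017, 1, 16), 44),
]
profile = weekly_seasonality(sample)
print(local_maxima(profile))
```

With real data, a minimum height or a minimum distance between peaks would probably be needed to keep noise from showing up as a sale.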

What we have done so far is to build a model that can easily identify past sales periods based on sales-related searches on Google. Now, how can this model help detect future sales, in particular when stores run flash sales? Let's try to make this work so that we get alerted in advance. The reasoning is that if sales-related searches increase more than forecast, sales are likely under way in Singapore.

3. Forecasting for future detection

Based on the same data set, we are going to produce a yearly forecast that takes into account past and future celebration dates. I played with Facebook's Prophet library, which is rather easy to use and well suited to forecasting with holiday effects.

Forecast for 2018 on 'sales' search in Singapore.

Looking at this first chart of the yearly forecast, we can say that 'sales' related searches on Google will likely not be as popular as in 2017. GSS (the Great Singapore Sale) and Black Friday should not bring as much online traffic.

Smoothed weekly forecast (grey) versus actual daily search (red).

On this second chart, we compare the smoothed forecast (heavy grey line) and its upper and lower limits (thin grey lines) with the actual search trends on Google for the first three months of 2018. There is a big drop on the first day of Chinese New Year (16-02-2018), while before and after it the actual trend crosses the upper limit of the forecast, which would mean that there are sales in Singapore.

From the historical data we would indeed expect sales in week 4 and in week 10 (see above). This methodology of comparing 'actual versus forecast' can work as a trigger for sales detection. This has been confirmed by H&M tweets earlier this year.
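As a rough sketch of the 'actual versus forecast' trigger, the snippet below flags the weeks where actual search interest rises above the upper limit of a forecast. It does not use Prophet: the band here is a deliberately naive stand-in (mean of past years plus or minus a multiple of the standard deviation), and the function names, the width parameter and all the numbers are assumptions made for illustration.

```python
from statistics import mean, stdev


def forecast_band(history, width=2.0):
    """Naive per-week forecast: mean of past years, plus/minus width * stdev."""
    band = {}
    for week, values in history.items():
        centre = mean(values)
        spread = stdev(values) if len(values) > 1 else 0.0
        band[week] = (centre - width * spread, centre, centre + width * spread)
    return band


def sales_alerts(actual, band):
    """Return the weeks where actual search interest exceeds the upper limit."""
    return [week for week, value in sorted(actual.items())
            if week in band and value > band[week][2]]


# Made-up numbers for illustration only: week number -> values from past years.
history = {4: [50, 54, 52], 5: [40, 42, 41], 10: [60, 58, 62]}
actual_2018 = {4: 70, 5: 41, 10: 61}

print(sales_alerts(actual_2018, forecast_band(history)))
```

In practice the lower, central and upper values would come from the forecasting library rather than from this simple average, but the comparison step stays the same.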

4. Using social media for better detection

The method above, based purely on Google searches, works but might not be the most robust way of finding out about sales at our favorite brands. To improve our alerting system, we could use text analysis on social media posts, as in the two post examples shown earlier. Each brand will likely communicate about its events. The posts might even give us the discount percentage that we can expect.

We can use the Twitter API to look for all the posts from @hmsingapore related to 'deals', 'offers' and 'sales'. By combining the Google search methodology with the tracking of brands' posts on social media, we get something like this:

Detection of peaks in sales-related search (squares) versus communication on social media by H&M for their sales.

It now becomes obvious that Google Trends might not be the best source of data for detecting sales periods in advance. Sales-related queries generally peak after the start date of the sales as communicated by the brands.

As we can see, between GSS and Black Friday H&M advertises a lot of smaller promotions until year end. In that case, it would be better to monitor the brands' accounts and get alerted in time.
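The text analysis on brand posts could start with something as simple as a keyword filter. The sketch below assumes the posts have already been downloaded and are available as (date, text) pairs; that structure, the function name and the example posts are all hypothetical, and no call to the Twitter API is shown. It also tries to pull out a discount percentage when one is mentioned.

```python
import re
from datetime import date

# Keywords taken from the article; the post structure below is hypothetical.
KEYWORDS = ("deal", "offer", "sale")
PERCENT = re.compile(r"(\d{1,3})\s*%")


def find_sale_posts(posts):
    """Keep posts mentioning a sale keyword and extract any discount percentage."""
    hits = []
    for posted_on, text in posts:
        lowered = text.lower()
        if any(word in lowered for word in KEYWORDS):
            match = PERCENT.search(text)
            discount = int(match.group(1)) if match else None
            hits.append((posted_on, discount, text))
    return hits


# Invented example posts, not real tweets.
posts = [
    (date(2018, 1, 25), "Mid-season SALE starts today: up to 50% off!"),
    (date(2018, 2, 1), "New spring collection now in stores."),
    (date(2018, 3, 8), "Members get exclusive offers this weekend."),
]
for posted_on, discount, text in find_sale_posts(posts):
    print(posted_on, discount, text)
```

Plain substring matching is crude (it would also match a word like "wholesale"), so a real alerting system would likely need a stricter pattern or a short list of exclusions.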

5. Conclusion: Sales and brand correlation

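If we want to know when to buy clothes on sale, there are two main things we can do: watch for sales-related searches rising above the forecast, and monitor the brands' social media accounts for sale announcements.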

To go a little further with what we have done and seen so far, I would like to try to understand which brands benefit the most from sales. Which brands gain the most traffic from online consumers when the sales are happening? The idea here is to again use Google Trends as a proxy, looking at the correlation between brand searches and sales searches.

Correlation matrix between brand search and sales search in Singapore between 2014 and 2017.

It is quite apparent that the sportswear brands are largely uncorrelated with sales, while Burberry and Coach have an even stronger correlation than H&M and Topshop. Interest in these brands tends to be higher when prices are going down. It might be a sign that the target audience of the luxury brands waits for the right moment to make its purchase and is more likely to be price-sensitive.
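For readers who want to reproduce this kind of comparison, here is a minimal sketch of the correlation between a 'sales' search series and each brand's search series. It assumes the Pearson coefficient as the correlation measure, which the article does not specify; the brand names and weekly values are invented placeholders, not the Google Trends data behind the matrix above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant series has zero variance and would divide by zero here.
    return cov / sqrt(var_x * var_y)


# Invented weekly search indices for illustration only.
sales_search = [30, 45, 80, 50, 35]
brand_search = {
    "brand_a": [20, 30, 55, 33, 24],  # rises and falls with sales searches
    "brand_b": [40, 42, 40, 38, 40],  # barely moves with sales searches
}
for brand, series in brand_search.items():
    print(brand, round(pearson(sales_search, series), 2))
```

A value close to 1 means interest in the brand rises and falls with interest in sales, while a value close to 0 means the two move independently.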