Predictive models can tell you whether a prospect will buy, how much they will spend, how long they will stay, and even who they might refer you to for follow-up sales. In fact, you can try to predict almost any data item you keep about your leads, customers, or even your entire business ecosystem. Good models can do this with amazing accuracy, above 90%; bad models can actually be worse than returning random results. Our intuition combined with typical CRM data often produces a horrible model, leaving us thinking that the whole machine learning hype is just that: hype. Often, however, perspective is to blame for inaccurate models. You have to shift your mind from the specific to the general. Remember: it is better to be roughly right than precisely wrong.
If you have a budget for a predictive effort, most of it should go to dealing with your data and making it useful for prediction.
For example, in the leads table you have fields for address, gender, age, related campaign, product of interest, lead stage in the sales pipeline, etc. The missing value you might want to predict is the likelihood of purchase.
Now, naively learning from the past data will often lead the algorithm to give the worst possible predictions. It will mark all current customers as great leads when in fact they have already purchased and might not need to buy additional products (when a product is resold to current customers, the customers are already known and no predictive model is needed to find them). It will indicate that the best product to sell them is the one they already bought, and it will mark all the prospects who have not bought yet as bad targets for future efforts.
Furthermore, the data describing the past is so specific that it only matches past events and is not likely to repeat exactly in a useful way. Many algorithms will make better-than-random estimates for familiar data, but for unknown data they simply guess, which is dangerous because people may rely on a random guess just because it came out of a "model".
So your CRM data can kill you both ways: it can predict the exact opposite of what you are looking for in the future, and it can be so specific that it will never match any future event and will simply guess.
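One way to avoid the first trap is to build the training set only from leads whose outcome is already decided, excluding current customers and still-open leads. A minimal sketch in Python with pandas; the table and column names here are hypothetical, not from any real CRM schema:

```python
import pandas as pd

# Hypothetical leads table; columns and values are purely illustrative.
leads = pd.DataFrame({
    "lead_id":  [1, 2, 3, 4, 5],
    "stage":    ["won", "lost", "open", "won", "open"],
    "age":      [34, 51, 28, 45, 39],
    "campaign": ["email", "ads", "email", "referral", "ads"],
})

# Train only on leads whose outcome is already known ("won" or "lost").
# Open leads and existing customers are excluded so the model does not
# simply relearn "people who already bought are great leads".
resolved = leads[leads["stage"].isin(["won", "lost"])].copy()
resolved["purchased"] = (resolved["stage"] == "won").astype(int)

features = resolved[["age", "campaign"]]
labels = resolved["purchased"]
```

The key point is that the label is derived from a closed outcome, while the feature columns should reflect what was known before the purchase decision.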
To find data aspects that are genuinely useful for prediction, many modeling tools let you divide the data into a training sample and a test sample. The training sample is used to train the model and create the prediction engine; the test sample is then used to check whether the rules generated can be applied to new data the model has not seen before, and to measure the level of success (since we have the actual past results for both samples, we can check how well the model did on each).
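With scikit-learn, a basic split might look like this; the 25:75 ratio, the toy feature matrix, and the labels are illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy feature matrix and purchase labels; real data would come from the CRM.
X = np.arange(40).reshape(20, 2)   # 20 leads, 2 numeric features
y = np.array([0, 1] * 10)          # did the lead purchase?

# Hold out 25% of the rows as the test sample; stratify keeps the
# buy/no-buy ratio roughly equal in both samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
```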
In many cases, this works mostly in theory, because the default test sample is just a random subset of points drawn from the same data set. If a very specific data item happens to appear in both samples, it will be treated as generic even when it is not.
For example, if a lead was contacted twice, and one contact record was placed in the test sample while the other was placed in the training sample, then a rule that takes into account the specific address of this lead will be validated by the test and included as a rule; i.e., "if a lead comes from apartment 2/b, it's a good lead."
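One common defense is to split by lead rather than by record, so all records of the same lead land on the same side of the split. A sketch with scikit-learn's GroupShuffleSplit; the lead IDs and data are made up:

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Each contact record belongs to a lead; lead 7 was contacted twice.
lead_ids = np.array([7, 7, 8, 9, 10, 11, 12, 13])
X = np.arange(16).reshape(8, 2)
y = np.array([1, 1, 0, 1, 0, 0, 1, 0])

# Split so that no lead_id is shared between training and test samples.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=lead_ids))

train_leads = set(lead_ids[train_idx])
test_leads = set(lead_ids[test_idx])
assert train_leads.isdisjoint(test_leads)  # no lead leaks across the split
```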
Decisions such as how to divide data to test and training can make the difference when assessing the predictive accuracy of a model.
In the old days, we used to sit for months and come up with business rules and various logical classifications. We could not cover as much ground as a learning model can cover automatically today, but we could be sure of the quality and generality of the rules we generated.
Take the address field as an example. It is so specific that in most cases it does not help to imply anything about the future. If you split the address into separate fields such as floor level, zip area, street, city, state, and country, you end up with a number of variables, some specific to that user and some more generic, and this may yield a better prediction; for example, people from the 5th floor buy more than people from the 1st floor. Sometimes it is useful to add groups of values: for example, group all the people on the 1st floor, then all the people on floors 2-5, then all the people from higher floors. This lets the model learn from cases where leads come from floors 10, 15, 22, and 31, treating them as the same floor category, and use that learning to predict the conversion for a lead from the 17th floor. The same applies to age groups, residence type, neighborhood, etc.
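This grouping idea can be sketched with pandas' `cut` function; the floor numbers and bin edges below are just an illustration:

```python
import pandas as pd

# Hypothetical floor numbers extracted from lead addresses.
floors = pd.Series([1, 3, 10, 15, 22, 31, 17])

# Bucket specific floor numbers into coarser, more general categories:
# floor 1, floors 2-5, and everything above.
floor_group = pd.cut(
    floors,
    bins=[0, 1, 5, 1000],
    labels=["ground", "low", "high"],
)
```

A lead from the 17th floor now falls into the same "high" category the model already learned from floors 10, 15, 22, and 31.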
To create an amazing predictive model, follow these 10 steps:
- Gather as much data as makes sense. Even small amounts of data can be useful, but more is better as long as you can process it in a reasonable time with your available IT resources.
- Add more fields that are a more general form of the specific data, or data items that are specific but recur across cases, such as residence type, floor number, or the area code from the phone number. You can also add values specific to that day but not directly related, such as weather, traffic, holiday, weekend, or stock market performance.
- Decide on a way to treat missing values and values that are very rare. You can, for example, create a new category for missing values, fill them with 0, or fill them with the average value for the category, whatever seems to work best.
- Divide your data into training and test sets. The split can be 25:75, 50:50, two separate data sets from different years, or whatever seems to work best. You need enough samples to create a good model, and you need enough test data to check the model before you rely on it.
- Decide which fields you want to predict. This is not obvious; if you think about it, there might be value in estimating various parameters, not just the probability to close but also the best price to offer, the best time to call, the main reason to purchase, lifetime value, satisfaction score, etc.
- If you have notes fields, word tags, or free text you want to add to the model because certain keywords may prove predictive, you can first run a separate algorithm to process the text and generate a text-based prediction score, then feed that score into the general algorithm as another parameter. Text usually needs its own preprocessing step before it can be used alongside numeric values when creating the model.
- The same goes for images, or anything that can be represented as an image to better capture its essence.
- Think about whether you want the model to be biased towards false negatives or false positives. The way the model scores its own success can have a big impact on the results. Consider what is better in your case: marking 10 good leads as bad, or marking 10 bad leads as good. It really depends on your setting; with medical examination results the answer may be very different than with commercial leads. It also depends on the resources you must spend to qualify each lead.
- Choose a tool; there are many that can take your data and create a model. Some tools assume no technical skills, while others are programming libraries you have to work with to run and later fine-tune your model. Some tools help select the most useful fields, some show the model's decision tree, and some just create good models with little explanation of the internal workings. You can look at Salford Predictive Modeler (commercial), Weka (open source), Amazon Machine Learning, H2O, Keras, TensorFlow, scikit-learn, Google Cloud Machine Learning, Azure ML Studio and similar offerings from IBM, or many others.
- When you have had your fun, get an expert and do it again with them so that you actually have something you can use to make money. As it turns out, experience plays a key role in machine learning and deep learning projects.
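The core of the steps above (handle missing values, split the data, bias the error trade-off, train, and check on unseen data) can be sketched end to end with scikit-learn. Everything here, from the synthetic data to the choice of logistic regression, is an illustrative assumption, not a recommendation:

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Toy lead features (e.g. age, floor number, campaign score); ~10% missing.
X = rng.normal(size=(200, 3))
X[rng.random(X.shape) < 0.1] = np.nan
y = (np.nan_to_num(X[:, 0]) > 0).astype(int)   # synthetic "purchased" label

# Step: divide into training and test samples (here 75:25).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Steps: fill missing values with the column mean, then fit a model whose
# class_weight option nudges the false positive / false negative trade-off.
model = make_pipeline(
    SimpleImputer(strategy="mean"),
    LogisticRegression(class_weight="balanced"),
)
model.fit(X_train, y_train)

# Step: check the model on data it has never seen before.
acc = accuracy_score(y_test, model.predict(X_test))
```

Swapping the imputation strategy, the split ratio, or the estimator is exactly the kind of experimentation the steps above describe; the pipeline structure keeps those decisions in one place.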