Machine Learning and AI

Getting the mandate to integrate automation, machine learning, and AI at your firm can be quite daunting. How are you meant to implement a strategy that effectively drives change within your organization when there seems to be limitless potential but no clear starting line?

Often executives are given the task of managing big data with no tangible end goal. Harnessing big data presents several challenges, beginning with the software and hardware required to handle and store all of it. Another hurdle that many encounter is not just analyzing the data set, but adapting to its size.

Big data is not just a long data set, but also a wide one. An example would be a client database kept in a spreadsheet. Each client has their own row, so if there are lots of clients the data set becomes very long. Each client also has a column for every variable attributed to them, so collecting as much data as possible on every client typically makes the data set very wide as well. It is in the presence of wide data sets that machine learning tools make the best use of data.
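To make the long versus wide distinction concrete, here is a minimal Python sketch. The client records, the field names, and the `shape` helper are all hypothetical and chosen only for illustration: each dictionary stands in for one spreadsheet row, and each key stands in for one column.

```python
# Hypothetical client records: one dict per row, one key per column.
clients = [
    {"client_id": 1, "region": "East", "tenure_years": 4, "annual_spend": 12000.0},
    {"client_id": 2, "region": "West", "tenure_years": 1, "annual_spend": 3500.0},
    {"client_id": 3, "region": "East", "tenure_years": 7, "annual_spend": 20400.0},
]


def shape(rows):
    """Return (number of rows, number of distinct columns)."""
    columns = set()
    for row in rows:
        columns.update(row)  # collect every variable name seen in any row
    return len(rows), len(columns)


# Tracking one more variable per client adds a column, not a row:
# the data set gets wider while its length stays the same.
for client, opened in zip(clients, [True, False, True]):
    client["opened_last_email"] = opened

n_rows, n_cols = shape(clients)
print(f"{n_rows} clients (length) x {n_cols} variables (width)")
```

Adding clients grows the first number; adding variables per client grows the second.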

Benefits of Machine Learning
  • Predictions, Not Causality
    • The most common application of machine learning tools is to make predictions that can be applied to common business problems
      • Forecasting long-term customer loyalty
      • Making recommendations on the best customer acquisition methods
    • Machine learning assists in situations where a decision depends on a large range of variables that together drive an outcome, and where that outcome can ultimately be used to validate the prediction

  • Separating the Signal from the Noise
    • Feature Extraction
      • The process of figuring out what variables the model will use
    • Regularization
      • The way to split the difference between a flexible model and a conservative model, usually by adding a “penalty for complexity” that forces the model to stay simple
    • Cross-Validation
      • Used when you want to be sure that your model is making good predictions: it tests the model efficiently by using the data you already have to simulate an out-of-sample test of prediction accuracy
      • The most important thing is that the model never sees the test-set outcomes until after it is built; this ensures that the test set is truly “held out” data. If this partition is not kept clear, you can overestimate how good the model really is
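
The regularization and cross-validation ideas above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not a production recipe: it uses a single input variable, a ridge-style (squared) penalty as one common form of "penalty for complexity", and synthetic data. All function names, the penalty values, and the data are hypothetical.

```python
import random


def fit_ridge_1d(xs, ys, penalty):
    """Fit y = w * x with a ridge (L2) penalty on w.

    Minimizes sum((y - w*x)^2) + penalty * w^2, whose closed-form
    solution is w = sum(x*y) / (sum(x*x) + penalty).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def mean_squared_error(w, xs, ys):
    """Average squared prediction error of the model y = w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_indices(n, k, seed=0):
    """Shuffle the row indices 0..n-1 and deal them into k disjoint folds."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of rows")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_mse(xs, ys, penalty, k=5, seed=0):
    """Average held-out error over k folds.

    Each fold is scored by a model fitted only on the other folds,
    so the model never sees the outcomes it is tested on.
    """
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        w = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], penalty)
        scores.append(
            mean_squared_error(w, [xs[i] for i in held_out], [ys[i] for i in held_out])
        )
    return sum(scores) / len(scores)


# Synthetic data (an assumption for this sketch): true slope 2 plus noise.
rng = random.Random(42)
xs = [rng.uniform(-1.0, 1.0) for _ in range(40)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

# Compare a few penalty strengths by their held-out error.
for penalty in (0.0, 1.0, 10.0, 100.0):
    print(f"penalty={penalty:>5}: CV error={cross_validated_mse(xs, ys, penalty):.3f}")
```

A larger penalty pulls the fitted coefficient toward zero, which makes the model more conservative. Comparing the held-out errors across penalty values is one way to choose how conservative to be without touching a separate final test set.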

The most important thing to remember when using machine learning is not to confuse a prediction model with a causal model. In prediction problems, causality is not the priority; rather, it is the ability to identify patterns and predict outcomes in a specific environment. A model that works for one data set will not always work for another.

This post isn’t meant to warn the reader about overreliance on the abilities of machines, but rather to identify where human engagement and intuition are still needed to process and synthesize the information. The alternative is to rely purely on human judgement, which can also bring bias and errors. The best managers are learning that by blending both approaches they are able to make better decisions, while recognizing that there is always the possibility of a better way.