Raul Chedrese

Machine Learning from Scratch in Elixir

It’s an exciting time for machine learning. What gets me most excited, though, is the ability to easily leverage machine learning for building product features. The Elixir community has recently made big strides here: tools like Nx, Axon, and Bumblebee make it easy to integrate machine learning into Elixir applications. If you’re anything like me, though, you may find it difficult to use these tools because you lack intuition about how they work under the hood. Let’s explore machine learning in Elixir without these tools.


So how do you “do” machine learning? The first thing to understand is that machine learning is all about making predictions. We might be trying to predict the price of a stock tomorrow, whether an image contains a cat, or which set of pixels should represent a sentence. All machine learning follows the same three steps: Predict, Compare, and Learn.

To start making predictions we need to create, or otherwise acquire, a machine learning model and then pass it input data to make predictions. This is our first machine learning model:

input * weight = prediction

Simple, right? You can think of input here as the information we know, weight as the information the model has learned, and prediction as the information we want to know.

Let’s turn this model into Elixir code.
defmodule Model do
  # Multiply the input by the learned weight to get a prediction.
  def run(input, weight) do
    input * weight
  end
end
Let’s make some predictions! What should we predict? Let’s try to predict how much money a movie will make based on its budget. Our input will be the budget, $100,000, and the weight we use will be 2.5. We’ll talk much more about how to pick a weight later.
Model.run(100_000, 2.5)
Our model tells us that a film with a $100,000 budget will make $250,000! Is this a good prediction? Probably not, but how do we know?
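
As a quick illustration, the same one-line model can be run over several budgets at once. This is only a sketch: the `BudgetModel` module name and the list of budgets are made up for the example, and the weight of 2.5 is still just a guess.

```elixir
# Hypothetical module: a copy of the one-line model so this snippet runs on its own.
defmodule BudgetModel do
  def run(input, weight), do: input * weight
end

# Made-up budgets, in dollars.
budgets = [100_000, 200_000, 1_000_000]
weight = 2.5

# One prediction per budget, all using the same weight.
predictions = Enum.map(budgets, fn budget -> BudgetModel.run(budget, weight) end)

Enum.zip(budgets, predictions)
|> Enum.each(fn {budget, prediction} ->
  IO.puts("budget: #{budget}, predicted gross: #{prediction}")
end)
```

Every prediction is simply 2.5 times the budget, which shows how much rides on choosing a good weight.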


It’s critical that we have a way to calculate how accurate our predictions are. If we don’t know how wrong our model is we can’t know how to improve it. To assess how accurate our model is we run it for an input that we already know the correct output for.

Let’s look up a movie that actually had a $200,000 budget and compare its worldwide gross to what our model predicted. According to IMDB, Mad Max, released in 1980, had a budget of $200,000 and made a total of $99,750,000.
input = 200_000
weight = 2.5
known_result = 99_750_000

prediction = Model.run(input, weight)
delta = prediction - known_result
error = (prediction - known_result) ** 2

IO.puts("Delta: #{delta}")
IO.puts("Error: #{error}")
Delta: -9.925e7
Error: 9.8505625e15
The delta tells us how far off our prediction is. We also calculate an error value. We will need the error value when we talk about training. For now it’s enough to know that there exist multiple ways to calculate error. The one we are using here is the squared error, which becomes the “mean squared error” when averaged over many examples.

Clearly our model isn’t very good if it produces such a high error value. We need to adjust our weight value based on what we just learned from this example.
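
To make the comparison step a little more concrete, here is a small sketch of the error calculation as functions. The `ErrorSketch` module and its function names are assumptions made for this example, not part of any library. Squaring the delta makes the error non-negative, so undershooting and overshooting by the same amount count equally.

```elixir
# Hypothetical helper module; the names are assumptions for this sketch.
defmodule ErrorSketch do
  # Signed difference between the prediction and the known result.
  def delta(prediction, known_result), do: prediction - known_result

  # Squared error for a single example: never negative.
  def squared_error(prediction, known_result) do
    delta(prediction, known_result) ** 2
  end

  # Mean squared error over a list of {prediction, known_result} pairs.
  def mean_squared_error(pairs) do
    total =
      pairs
      |> Enum.map(fn {prediction, known} -> squared_error(prediction, known) end)
      |> Enum.sum()

    total / length(pairs)
  end
end

# Undershooting and overshooting by the same amount give the same error.
IO.puts(ErrorSketch.squared_error(500_000.0, 99_750_000))
IO.puts(ErrorSketch.squared_error(199_000_000.0, 99_750_000))
```

With a single example, as in this post, the mean squared error and the squared error are the same number.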


The learning part of “Machine Learning” is all about updating our weight based on the error we calculate from making a prediction. Our model “learns” by making predictions and updating the weight until our error gets to 0. As with calculating error, there exist multiple ways to calculate the amount to change the weight. The method we will use here is called gradient descent, and it’s much simpler than it sounds.

First we’ll calculate the amount to change the weight, weight_delta, and then subtract that amount from our current weight. The weight delta is calculated by multiplying the delta (the difference between our prediction and the known value) by the input. Multiplying by the input “scales” the delta, so the size of the update depends on how strongly the input influenced the prediction. On its own, the delta could be a very large number even when the input is a small number.
weight_delta = delta * input
updated_weight = weight - weight_delta
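
One way to see where `delta * input` comes from is to differentiate the squared error with respect to the weight. The symbols below are shorthand introduced just for this sketch: x is the input, w is the weight, and y is the known result.

```latex
\begin{aligned}
E(w) &= (x w - y)^2 \\
\frac{dE}{dw} &= 2\,(x w - y)\,x = 2 \cdot \text{delta} \cdot \text{input}
\end{aligned}
```

The constant factor of 2 only changes the size of the step, not its direction, so it is dropped here. Subtracting this quantity from the weight moves it in the direction that reduces the error.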
We’ll add one more complication to our weight calculation, alpha. We’ll multiply our weight_delta by alpha. alpha is an arbitrary number that we can tweak to control how quickly our model learns. We can increase alpha to make our model learn faster and decrease it to prevent our learning algorithm from overshooting.
alpha = 1
weight_delta = delta * input
updated_weight = weight - weight_delta * alpha
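
To get a feel for overshooting, this sketch takes a single update step on the Mad Max example with two values of alpha. The `step` function is a helper invented for this illustration, and the smaller alpha is simply a value that happens to suit numbers of this size.

```elixir
# Values from the Mad Max example.
input = 200_000
weight = 2.5
known_result = 99_750_000

delta = input * weight - known_result
weight_delta = delta * input

# Hypothetical helper: one update step for a given alpha.
step = fn alpha -> weight - weight_delta * alpha end

for alpha <- [1, 0.00000000001] do
  IO.puts("alpha: #{alpha}, updated_weight: #{step.(alpha)}")
end
```

With alpha = 1, a single step sends the weight to roughly 2 × 10^13, far beyond the 498.75 that would reproduce the known result (99,750,000 / 200,000). With the tiny alpha, the weight moves to about 201, a step in the right direction that does not overshoot.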
Here is the training function that combines all three steps to generate a new weight for our model.
defmodule ModelTrainer do
  # Function head declaring the default for current_prediction.
  def train(weight, input, goal_prediction, current_prediction \\ nil)

  # Base case: the last prediction exactly matched the goal, so we stop.
  def train(weight, _input, goal_prediction, goal_prediction) do
    IO.puts("Training Complete, final weight: #{weight}")
  end

  def train(weight, input, goal_prediction, current_prediction) do
    IO.puts(
      "weight: #{weight}, current_prediction: #{current_prediction}, goal_prediction: #{goal_prediction}"
    )

    # Predict
    alpha = 0.00000000001
    prediction = Model.run(input, weight)

    # Compare
    delta = prediction - goal_prediction
    _error = delta ** 2.0

    # Learn
    weight_delta = delta * input
    updated_weight = weight - weight_delta * alpha
    train(updated_weight, input, goal_prediction, prediction)
  end
end
input = 200_000.0
starting_weight = 1.0
known_result = 99_750_000.0

ModelTrainer.train(starting_weight, input, known_result)
weight: 1.0, current_prediction: , goal_prediction: 9.975e7
weight: 200.1, current_prediction: 2.0e5, goal_prediction: 9.975e7
weight: 319.56, current_prediction: 4.002e7, goal_prediction: 9.975e7
weight: 498.74999999999994, current_prediction: 99749999.99999997, goal_prediction: 9.975e7
weight: 498.75, current_prediction: 99749999.99999999, goal_prediction: 9.975e7
Training Complete, final weight: 498.75
If we run this, it will print out the learned weight, 498.75. We can verify this by plugging it back into our model.
Model.run(200_000, 498.75)
Our model learned to predict the correct value! This machine learning model has some glaring limitations, but hopefully it has given you some insight into how machine learning models work at a basic level. One of the biggest issues with our model is that it doesn’t support multiple inputs or outputs. Our ModelTrainer also only trains on a single example. Useful machine learning models are trained on many examples. We’ll address these issues and more in a future post.
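
One more limitation worth a quick sketch: ModelTrainer only stops when a prediction exactly equals the goal, which is a fragile thing to ask of floating point numbers. A variation, under assumptions of my own, is to stop once the prediction is within some tolerance of the goal, or after a maximum number of steps. The `TolerantTrainer` module, the tolerance of 1.0 (one dollar), and the 1,000 step limit are all invented for this example.

```elixir
# Hypothetical module; names, tolerance, and step limit are assumptions for this sketch.
defmodule TolerantTrainer do
  @alpha 0.00000000001

  # Stop when the prediction is within `tolerance` of the goal,
  # or when the step budget runs out.
  def train(weight, input, goal, tolerance \\ 1.0, steps_left \\ 1_000)

  def train(weight, _input, _goal, _tolerance, 0), do: {:max_steps_reached, weight}

  def train(weight, input, goal, tolerance, steps_left) do
    # Predict
    prediction = input * weight

    # Compare
    delta = prediction - goal

    if abs(delta) <= tolerance do
      {:ok, weight}
    else
      # Learn
      updated_weight = weight - delta * input * @alpha
      train(updated_weight, input, goal, tolerance, steps_left - 1)
    end
  end
end

{status, weight} = TolerantTrainer.train(1.0, 200_000.0, 99_750_000.0)
IO.puts("#{status}: #{weight}")
```

Returning a tagged tuple instead of printing also makes the result easier to use from other code.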