Machine Learning from Scratch in Elixir
April 8, 2023
It’s an exciting time for machine learning. What gets me most excited, though, is the ability to easily leverage machine learning for building product features. The Elixir community has recently made big strides in making this easy. Tools like Nx, Axon, and Bumblebee make it straightforward to integrate machine learning into Elixir applications. If you’re anything like me, though, you may find it difficult to use these tools because you lack intuition about how they work under the hood. Let’s explore machine learning in Elixir without these tools.
Predict
So how do you “do” machine learning? The first thing to understand is that machine learning is all about making predictions. We might be trying to predict the price of a stock tomorrow, whether an image contains a cat, or which set of pixels should represent a sentence. The kind of machine learning we’ll look at here follows the same three steps: Predict, Compare, and Learn.
To start making predictions we need to create, or otherwise acquire, a machine learning model and then pass it input data. This is our first machine learning model:
input * weight = prediction
Simple, right? You can think of input here as the information we know, weight as the information the model has learned, and prediction as the information we want to know.
Let’s turn this model into Elixir code.
defmodule Model do
  def run(input, weight) do
    input * weight
  end
end
Let’s make some predictions! What should we predict? Let’s try to predict how much money a movie will make based on its budget. Our input will be the budget, $100,000, and the weight we use will be 2.5. We’ll talk much more about how to pick a weight later.
Model.run(100_000, 2.5)
250000.0
Our model tells us that a film with a $100,000 budget will make $250,000! Is this a good prediction? Probably not, but how do we know?
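Because the model is a single multiplication, the same weight can be applied to any number of inputs. The sketch below runs it over a few budgets; the budget list is made up for illustration, and Model is repeated only so the snippet runs on its own.

```elixir
# Same model as in the post, repeated so this snippet is self-contained.
defmodule Model do
  def run(input, weight), do: input * weight
end

# Hypothetical budgets, chosen only for illustration.
budgets = [100_000, 200_000, 1_000_000]
weight = 2.5

# One prediction per budget, all sharing the same weight.
predictions = Enum.map(budgets, fn budget -> Model.run(budget, weight) end)

budgets
|> Enum.zip(predictions)
|> Enum.each(fn {budget, prediction} ->
  IO.puts("budget: #{budget}, predicted gross: #{prediction}")
end)
```

Every prediction is just the budget multiplied by 2.5, which is why the quality of the weight matters so much.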
Compare
It’s critical that we have a way to calculate how accurate our predictions are. If we don’t know how wrong our model is, we can’t know how to improve it. To assess how accurate our model is, we run it on an input for which we already know the correct output.
Let’s look up a movie that actually had a $200,000 budget and compare its worldwide gross to what our model predicts. According to IMDb, Mad Max, released in 1980, had a budget of $200,000 and made a total of $99,750,000.
input = 200_000
weight = 2.5
known_result = 99_750_000
prediction = Model.run(input, weight)
delta = prediction - known_result
error = (prediction - known_result) ** 2
IO.puts("Delta: #{delta}")
IO.puts("Error: #{error}")
Delta: -9.925e7
Error: 9.8505625e15
The delta tells us how far off our prediction is. We also calculate an error value. We will need the error value when we talk about training. For now it’s enough to know that there are multiple ways to calculate error, and the one we are using here is the squared error, which is what “mean squared error” reduces to when there is only one example.
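To make "multiple ways to calculate error" concrete, here is a minimal sketch of three common measures. The module and function names (ErrorMeasures and friends) are hypothetical and not part of the post's code.

```elixir
defmodule ErrorMeasures do
  # Squared error for one prediction: the calculation used in this post.
  def squared_error(prediction, known_result) do
    (prediction - known_result) ** 2
  end

  # Absolute error: an alternative that does not square the difference.
  def absolute_error(prediction, known_result) do
    abs(prediction - known_result)
  end

  # Mean squared error: the average squared error over a non-empty list
  # of {prediction, known_result} pairs.
  def mean_squared_error(pairs) do
    total =
      pairs
      |> Enum.map(fn {prediction, known} -> squared_error(prediction, known) end)
      |> Enum.sum()

    total / length(pairs)
  end
end

prediction = 200_000 * 2.5
IO.puts("Squared error: #{ErrorMeasures.squared_error(prediction, 99_750_000)}")
IO.puts("Absolute error: #{ErrorMeasures.absolute_error(prediction, 99_750_000)}")
```

Squaring keeps the error positive and punishes large misses far more than small ones. With a single example, the mean squared error and the squared error are the same number.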
Clearly our model isn’t very good if it produces such a high error value. We need to adjust our weight value based on what we just learned from this example.
Learn
The learning part of “Machine Learning” is all about updating our weight based on the error we calculate from making a prediction. Our model “learns” by making predictions and updating the weight until our error gets to 0. As with calculating error, there are multiple ways to calculate how much to change the weight. The method we will use here is called gradient descent, and it’s much simpler than it sounds.
First we’ll calculate the amount to change the weight, weight_delta, and then subtract that amount from our current weight. The weight delta is calculated by multiplying the delta, the difference between our prediction and the known value, by the input. This “scales” the delta by the input, so the same delta produces a smaller update when the input is small. The delta could be a very large number even when the input is a small number.
weight_delta = delta * input
updated_weight = weight - weight_delta
19850000000002.5
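One way to see why multiplying delta by input is a sensible update: for the squared error (input * weight - known_result) ** 2, the slope with respect to the weight is 2 * delta * input, so delta * input points in the same direction as that slope and differs only by a constant factor of 2. The sketch below checks this numerically. GradientCheck and the step size h are assumptions made for illustration, not part of the post's code.

```elixir
defmodule GradientCheck do
  # Squared error of the one-weight model for a single example.
  def error(weight, input, known_result) do
    (input * weight - known_result) ** 2
  end

  # Slope of the error with respect to the weight, derived by hand:
  # d/dw (input * w - known_result)^2 = 2 * delta * input.
  def analytic_slope(weight, input, known_result) do
    delta = input * weight - known_result
    2 * delta * input
  end

  # The same slope estimated with a central difference.
  # `h` is an arbitrary small step chosen for this sketch.
  def numeric_slope(weight, input, known_result, h \\ 1.0e-3) do
    up = error(weight + h, input, known_result)
    down = error(weight - h, input, known_result)
    (up - down) / (2 * h)
  end
end

input = 200_000.0
weight = 2.5
known_result = 99_750_000.0

analytic = GradientCheck.analytic_slope(weight, input, known_result)
numeric = GradientCheck.numeric_slope(weight, input, known_result)
IO.puts("analytic slope: #{analytic}, numeric slope: #{numeric}")
```

The two values should agree closely. The slope is negative here, which means increasing the weight reduces the error, and subtracting weight_delta does exactly that.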
We’ll add one more complication to our weight calculation, alpha. We’ll multiply our weight_delta by alpha. Here alpha, often called the learning rate, is an arbitrary number that we can tweak to change how fast our model learns. We can increase alpha to make our model learn faster and decrease it to prevent our learning algorithm from overshooting.
alpha = 1
weight_delta = delta * input
updated_weight = weight - weight_delta * alpha
19850000000002.5
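The effect of alpha is easier to see by applying the update several times in a row. This is a minimal sketch under assumed values: AlphaDemo is a hypothetical module, and the two alpha values and the step count of 20 are picked only to show one run that converges and one that overshoots.

```elixir
defmodule AlphaDemo do
  # One update using the rule from the post:
  # weight - (prediction - goal) * input * alpha.
  def step(weight, input, goal, alpha) do
    delta = input * weight - goal
    weight - delta * input * alpha
  end

  # Apply `steps` updates in a row and return the final weight.
  def run(weight, input, goal, alpha, steps) do
    Enum.reduce(1..steps, weight, fn _step, current ->
      step(current, input, goal, alpha)
    end)
  end
end

input = 200_000.0
goal = 99_750_000.0

# Illustrative alpha values, not recommendations.
small = AlphaDemo.run(1.0, input, goal, 1.0e-11, 20)
large = AlphaDemo.run(1.0, input, goal, 1.0e-10, 20)

IO.puts("alpha 1.0e-11, weight after 20 steps: #{small}")
IO.puts("alpha 1.0e-10, weight after 20 steps: #{large}")
```

With this input, each update multiplies the distance to the ideal weight by 1 - alpha * input * input. That factor is 0.6 for the smaller alpha, so the weight closes in on 498.75, and -3 for the larger one, so every step overshoots further than the last.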
Here is the training function that combines all three steps to generate a new weight for our model.
defmodule ModelTrainer do
  # Function head: no prediction exists before the first pass, so it defaults to nil.
  def train(weight, input, goal_prediction, current_prediction \\ nil)
  # Stop once the previous prediction exactly matches the goal.
  def train(weight, _input, goal_prediction, goal_prediction) do
    IO.puts("Training Complete, final weight: #{weight}")
  end
  def train(weight, input, goal_prediction, current_prediction) do
    IO.puts(
      "weight: #{weight}, current_prediction: #{current_prediction}, goal_prediction: #{goal_prediction}"
    )
    # Learning rate: kept tiny because the input is so large
    alpha = 0.00000000001
    # Predict
    prediction = Model.run(input, weight)
    # Compare
    delta = prediction - goal_prediction
    _error = delta ** 2.0
    # Learn
    weight_delta = delta * input
    updated_weight = weight - weight_delta * alpha
    train(updated_weight, input, goal_prediction, prediction)
  end
end
input = 200_000.0
starting_weight = 1.0
known_result = 99_750_000.0
ModelTrainer.train(starting_weight, input, known_result)
weight: 1.0, current_prediction: , goal_prediction: 9.975e7
weight: 200.1, current_prediction: 2.0e5, goal_prediction: 9.975e7
weight: 319.56, current_prediction: 4.002e7, goal_prediction: 9.975e7
...
weight: 498.74999999999994, current_prediction: 99749999.99999997, goal_prediction: 9.975e7
weight: 498.75, current_prediction: 99749999.99999999, goal_prediction: 9.975e7
Training Complete, final weight: 498.75
If we run this, it will print out the learned weight, 498.75. We can verify this by plugging it back into our model.
Model.run(200_000, 498.75)
99750000.0
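ModelTrainer stops only when a prediction equals the goal exactly, which depends on the floating-point arithmetic landing on that exact value. A more defensive variant stops when the prediction is close enough or after a fixed number of updates. The sketch below is one way to do that; TolerantTrainer, its option names, and the default tolerance and max_steps are all assumptions made for illustration.

```elixir
defmodule TolerantTrainer do
  # Hypothetical variant of ModelTrainer. Returns {weight, steps_taken}.
  # Stops when the prediction is within `tolerance` of the goal or after
  # `max_steps` updates, whichever comes first.
  def train(weight, input, goal, opts \\ []) do
    alpha = Keyword.get(opts, :alpha, 1.0e-11)
    tolerance = Keyword.get(opts, :tolerance, 1.0e-3)
    max_steps = Keyword.get(opts, :max_steps, 1_000)
    loop(weight, input, goal, alpha, tolerance, max_steps, 0)
  end

  defp loop(weight, input, goal, alpha, tolerance, max_steps, step) do
    # Predict (same calculation as Model.run/2, inlined to stay self-contained)
    prediction = input * weight
    # Compare
    delta = prediction - goal

    if abs(delta) <= tolerance or step >= max_steps do
      {weight, step}
    else
      # Learn
      updated_weight = weight - delta * input * alpha
      loop(updated_weight, input, goal, alpha, tolerance, max_steps, step + 1)
    end
  end
end

{weight, steps} = TolerantTrainer.train(1.0, 200_000.0, 99_750_000.0)
IO.puts("weight: #{weight}, steps: #{steps}")
```

Returning the weight instead of only printing it also makes the result easy to pass straight back into the model.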
Our model learned to predict the correct value! This machine learning model has some glaring limitations, but hopefully it has given you some insight into how machine learning models work at a basic level. One of the biggest issues with our model is that it doesn’t support multiple inputs or outputs. Our ModelTrainer also only trains on a single example. Useful machine learning models are trained on many examples. We’ll address these issues and more in a future post.