Softmax

Here is a super-short introduction to Softmax. Softmax is an important function in Machine Learning that converts a list of numbers into a list of probabilities. To illustrate, let us say you are deciding on a city to spend your summer vacation in, and you have three options: London, Paris, and Tokyo. Based on various factors like weather, food and transportation, you have assigned a “score” to each city.

London: 4
Paris: 2
Tokyo: 3

By using the softmax function, you can compute the probabilities of having a “good” vacation. First, calculate the exponential (e^x) of each score. Next, normalize the numbers by dividing each exponential by the sum of all exponentials.

London: e^4 / (e^4 + e^2 + e^3) ≈ 0.67
Paris: e^2 / (e^4 + e^2 + e^3) ≈ 0.09
Tokyo: e^3 / (e^4 + e^2 + e^3) ≈ 0.24

In this example, the probability of having a “good” vacation in London is about 0.67, or 67%. The three probabilities add up to 1.
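
The two steps described above (exponentiate, then normalize) can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library; the function name `softmax` and the variable names are chosen here for the example and are not from any particular library.

```python
import math

def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    # Step 1: take the exponential (e^x) of each score.
    exps = [math.exp(s) for s in scores]
    # Step 2: divide each exponential by the sum of all exponentials.
    total = sum(exps)
    return [e / total for e in exps]

cities = ["London", "Paris", "Tokyo"]
scores = [4, 2, 3]

for city, p in zip(cities, softmax(scores)):
    print(f"{city}: {p:.2f}")
# London: 0.67
# Paris: 0.09
# Tokyo: 0.24
```

For very large scores, `math.exp` can overflow. A common remedy is to subtract the largest score from every score before exponentiating, which leaves the resulting probabilities unchanged.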