
**Cosine Similarity** is a measure of the similarity between two vectors of an inner product space.

For two vectors, A and B, the Cosine Similarity is calculated as:

**Cosine Similarity** = $\dfrac{\sum_{i} A_{i}B_{i}}{\sqrt{\sum_{i} A_{i}^{2}}\,\sqrt{\sum_{i} B_{i}^{2}}}$
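To make the formula concrete, the sketch below evaluates each piece of it (the sum of products in the numerator and the two square-root terms in the denominator) using only Python's standard library. The vectors `A` and `B` and the helper name `cosine_similarity_manual` are illustrative choices, not part of the original tutorial, and the helper assumes both vectors have the same length.

```python
import math

def cosine_similarity_manual(A, B):
    """Evaluate the formula term by term (illustrative helper, not a library function)."""
    # numerator: sum of A_i * B_i (assumes A and B have equal length)
    numerator = sum(a_i * b_i for a_i, b_i in zip(A, B))
    # denominator: sqrt(sum of A_i^2) * sqrt(sum of B_i^2)
    norm_A = math.sqrt(sum(a_i ** 2 for a_i in A))
    norm_B = math.sqrt(sum(b_i ** 2 for b_i in B))
    return numerator / (norm_A * norm_B)

# small example vectors chosen so the result is easy to check by hand
A = [1, 2, 3]
B = [4, 5, 6]

# numerator   = 1*4 + 2*5 + 3*6 = 32
# denominator = sqrt(14) * sqrt(77) = sqrt(1078)
result = cosine_similarity_manual(A, B)
print(round(result, 4))  # 0.9746
```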

This tutorial explains how to calculate the Cosine Similarity between vectors in Python using functions from the NumPy library.

**Cosine Similarity Between Two Vectors in Python**

The following code shows how to calculate the Cosine Similarity between two arrays in Python:

    from numpy import dot
    from numpy.linalg import norm

    #define arrays
    a = [23, 34, 44, 45, 42, 27, 33, 34]
    b = [17, 18, 22, 26, 26, 29, 31, 30]

    #calculate Cosine Similarity
    cos_sim = dot(a, b)/(norm(a)*norm(b))

    cos_sim

    0.965195008357566

The Cosine Similarity between the two arrays turns out to be **0.965195**.

Note that this method will work on two arrays of any length:

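    import numpy as np
    from numpy import dot
    from numpy.linalg import norm

    #define arrays of 100 random integers between 0 and 9
    a = np.random.randint(10, size=100)
    b = np.random.randint(10, size=100)

    #calculate Cosine Similarity
    cos_sim = dot(a, b)/(norm(a)*norm(b))

    cos_sim

    0.7340201613960431

Because the arrays are generated randomly, the exact value will differ from run to run.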
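If a repeatable result is needed, one option is to draw the random arrays from a seeded generator. The sketch below uses NumPy's `default_rng`. The seed value `0` is an arbitrary assumption made for illustration, and no expected output is shown because it depends on the generator.

```python
import numpy as np
from numpy import dot
from numpy.linalg import norm

# seeded generator so the same arrays are produced on every run
# (the seed 0 is an arbitrary choice for this sketch)
rng = np.random.default_rng(0)

# define arrays of 100 random integers between 0 and 9
a = rng.integers(0, 10, size=100)
b = rng.integers(0, 10, size=100)

# calculate Cosine Similarity
cos_sim = dot(a, b)/(norm(a)*norm(b))
print(cos_sim)
```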

However, it only works if the two arrays are of equal length:

    import numpy as np
    from numpy import dot
    from numpy.linalg import norm

    #define arrays
    a = np.random.randint(10, size=90) #length=90
    b = np.random.randint(10, size=100) #length=100

    #calculate Cosine Similarity
    cos_sim = dot(a, b)/(norm(a)*norm(b))

    cos_sim

    ValueError: shapes (90,) and (100,) not aligned: 90 (dim 0) != 100 (dim 0)
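One way to handle the equal-length requirement is to wrap the calculation in a small function that checks its inputs first. The following is a minimal sketch rather than a definitive implementation: the function name `cosine_similarity` and its error messages are hypothetical, and the zero-vector check is an added assumption (the formula divides by zero when either vector has length zero).

```python
import numpy as np
from numpy import dot
from numpy.linalg import norm

def cosine_similarity(a, b):
    """Hypothetical helper: Cosine Similarity with explicit input checks."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim != 1 or b.ndim != 1:
        raise ValueError("both inputs must be one-dimensional")
    if a.shape != b.shape:
        raise ValueError(f"arrays must have equal length, got {a.size} and {b.size}")
    denom = norm(a) * norm(b)
    if denom == 0:
        # the formula is undefined when either vector is all zeros
        raise ValueError("Cosine Similarity is undefined for a zero vector")
    return dot(a, b) / denom

#define arrays (same values as the first example in this post)
a = [23, 34, 44, 45, 42, 27, 33, 34]
b = [17, 18, 22, 26, 26, 29, 31, 30]
print(round(cosine_similarity(a, b), 6))  # 0.965195

#unequal lengths now produce a descriptive error
try:
    cosine_similarity(a, b[:5])
except ValueError as err:
    print(err)  # arrays must have equal length, got 8 and 5
```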

**Notes**

**1.** There are multiple ways to calculate the Cosine Similarity in Python, but as this Stack Overflow thread explains, the NumPy method shown in this post turns out to be the fastest.

**2.** Refer to this Wikipedia page to learn more details about Cosine Similarity.
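To check the speed comparison mentioned in the first note on your own machine, a rough timing sketch with the standard `timeit` module might look like the following. The function names, the array size of 1,000, and the repetition count are assumptions made for illustration. Timings depend on hardware and array size, so no expected output is shown.

```python
import math
import timeit

import numpy as np
from numpy import dot
from numpy.linalg import norm

def cos_sim_numpy(a, b):
    # the method used in this post
    return dot(a, b)/(norm(a)*norm(b))

def cos_sim_pure_python(a, b):
    # hypothetical baseline that uses only the standard library
    num = sum(x * y for x, y in zip(a, b))
    return num / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# define two arrays of 1,000 random floats (seed chosen arbitrarily)
rng = np.random.default_rng(0)
a = rng.random(1000)
b = rng.random(1000)
a_list, b_list = a.tolist(), b.tolist()

# both approaches should agree up to floating-point rounding
assert math.isclose(cos_sim_numpy(a, b), cos_sim_pure_python(a_list, b_list), rel_tol=1e-9)

# time 1,000 calls of each approach
t_numpy = timeit.timeit(lambda: cos_sim_numpy(a, b), number=1000)
t_python = timeit.timeit(lambda: cos_sim_pure_python(a_list, b_list), number=1000)
print(f"NumPy: {t_numpy:.4f} s, pure Python: {t_python:.4f} s")
```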