It learns to represent the meaning of each word in a vocabulary as a vector.
You need to provide the number of dimensions so the layer can represent the similarity between words in space: with 2 dimensions the words sit on a plane, with 3 in a 3D space.
With PyTorch you use
nn.Embedding(vocabulary_size, dimensions)
It is a layer of the neural network we create; its input is the index of each word in the vocabulary, and it acts as a trainable lookup table from index to vector.
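A minimal sketch of the lookup, assuming a hypothetical vocabulary of 10 words embedded in 3 dimensions:

```python
import torch
import torch.nn as nn

vocabulary_size = 10  # hypothetical number of words in the vocabulary
dimensions = 3        # hypothetical embedding size

embedding = nn.Embedding(vocabulary_size, dimensions)

# The input is a tensor of word indices, not the words themselves.
word_indices = torch.tensor([1, 4, 7])
vectors = embedding(word_indices)

print(vectors.shape)  # one vector of 3 dimensions per input word
```

The weights start random; the vectors only become meaningful once the layer is trained as part of the network.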
If we plot the learned embeddings after training, we will see that words with similar meanings form clusters.