Today Yann LeCun tweeted about a blog post by Alexander Rush proposing that we use named dimensions in tensors. I think this is a good idea in its own right, because it abstracts away the complexity of dealing with tensor dimensions directly.
Most people are probably familiar with the usual layout conventions: in a tensor containing images, dimension 0 is the batch, and the remaining dimensions hold the channels, height and width — (batch, channels, height, width) in PyTorch, and (batch, height, width, channels) by default in TensorFlow. The same happens for the most common tensors in NLP: in a padded batch, dim 0 is the batch size, dim 1 the maximum sequence length, and dim 2 the hidden (vector) dimension. So it makes sense to address tensor dimensions by name instead of by index, and to hide the batch dimension, which appears everywhere.
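To make those conventions concrete, here is a small sketch in NumPy (the shapes and names below are illustrative, not from the original post):

```python
import numpy as np

# A hypothetical batch of 32 RGB images, 64x64 pixels.
# TensorFlow's default layout is "channels last": (batch, height, width, channels)
images_nhwc = np.zeros((32, 64, 64, 3))

# PyTorch's convention is "channels first": (batch, channels, height, width)
images_nchw = images_nhwc.transpose(0, 3, 1, 2)
print(images_nchw.shape)  # (32, 3, 64, 64)

# A padded NLP batch: (batch, max_seq_len, hidden_dim)
padded_batch = np.zeros((32, 50, 768))
```

The point of named dimensions is that code like `transpose(0, 3, 1, 2)` — where you have to remember what each integer means — could instead refer to `height`, `width` and `channels` by name.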
I found this blog post by Alex Riley very useful for understanding how einsum in particular, and Einstein notation in general, work. This Stack Overflow answer also contains lots of good examples.
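As a quick taste of the notation, here are a few einsum examples in NumPy (the same subscript strings work with `torch.einsum` and `tf.einsum`); repeated indices are summed over, and indices that appear on the right-hand side of `->` are kept:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)

# Matrix multiplication: c[i, k] = sum over j of a[i, j] * b[j, k]
c = np.einsum('ij,jk->ik', a, b)

# Trace: the repeated index i is summed, nothing is kept
t = np.einsum('ii->', np.eye(3))  # 3.0

# Batched matrix multiply: the batch index n is carried through untouched
x = np.random.rand(5, 2, 3)
y = np.random.rand(5, 3, 4)
z = np.einsum('nij,njk->nik', x, y)
```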
Before this morning I had no idea einsum existed, but now that I do, I can see its potential, and I thought someone else might find it interesting. I'm also curious to see whether named tensors become a thing in the near future. They would definitely lower the entry barrier for those who want to develop more complex DL models, and improve software quality around tensor manipulation.