Transformer and attention

by Yee Wei Law - Sunday, 2 February 2025, 3:53 PM
 

References

[BCB15] D. Bahdanau, K. Cho, and Y. Bengio, Neural machine translation by jointly learning to align and translate, in ICLR, 2015. https://doi.org/10.48550/arXiv.1409.0473.
[BB24] C. M. Bishop and H. Bishop, Deep Learning: Foundations and Concepts, Springer Cham, 2024. https://doi.org/10.1007/978-3-031-45468-4.
[VSP+17] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, Attention is all you need, in Advances in Neural Information Processing Systems (I. Guyon, U. von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, eds.), vol. 30, Curran Associates, Inc., 2017. https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.