Adam is yet another stochastic gradient descent technique. Building on Adadelta and RMSProp, it fixes the shortcomings of Adagrad by keeping two running averages in its calculation: one of the gradient and one of the squared gradient.
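As a rough sketch of that idea (not the exact code from the video; the function name `adam_update` and the hyperparameter defaults here are just the commonly used ones), a single Adam step could look like this:

```python
import numpy as np

def adam_update(params, grads, m, v, t,
                lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step on `params` given `grads` (step counter t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grads        # running average of the gradient
    v = beta2 * v + (1 - beta2) * grads**2     # running average of the squared gradient
    m_hat = m / (1 - beta1**t)                 # bias-corrected first moment
    v_hat = v / (1 - beta2**t)                 # bias-corrected second moment
    params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, m, v

# Toy usage: minimize f(x) = x^2 starting from x = 5
x = np.array([5.0])
m = np.zeros_like(x)
v = np.zeros_like(x)
for t in range(1, 1001):
    grad = 2 * x                               # gradient of x^2
    x, m, v = adam_update(x, grad, m, v, t)
print(x)                                       # ends up close to 0
```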
## Credit
Check out this blog post for a more detailed explanation of gradient descent methods: #adam
The music is taken from YouTube Music!
## Table of Contents
Introduction: 0:00
Theory: 0:21
Python Implementation: 3:49
Conclusion: 12:04
Here is an explanation of Adam from the blog post mentioned above.
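For reference, the standard Adam update (as given in the original Adam paper; the blog post's notation may differ slightly) is:

```latex
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 \\
\hat{m}_t &= \frac{m_t}{1 - \beta_1^t}, \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^t} \\
\theta_t &= \theta_{t-1} - \frac{\eta\, \hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
\end{aligned}
```

Here \(g_t\) is the gradient at step \(t\), \(\theta\) the parameters, \(\eta\) the learning rate, \(\beta_1\) and \(\beta_2\) the decay rates of the two running averages, and \(\epsilon\) a small constant for numerical stability.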