Neural Network Learns to Generate Voice (RNN/LSTM)

[VOLUME WARNING] This is what happens when you throw raw audio (which happens to be a cute voice) into a neural network and then tell it to spit out what it’s learned. This is a recursive neural network (LSTM type) with 3 layers of 680 neurons each, trying to find patterns in audio and reproduce them as well as it can. It’s not a particularly big network considering the complexity and size of the data, mostly due to computing constraints, which makes me even more impressed with what it managed to do. The
Back to Top