ift6266tr.wordpress.com
About this blog | Speech synthesis experiments
https://ift6266tr.wordpress.com/2014/01/28/about-this-blog
This blog will be a logbook of experiments for the speech synthesis project carried out as part of Yoshua Bengio’s course at Université de Montréal. Project details and course material. January 28, 2014.
ift6266tr.wordpress.com
First experiment | Speech synthesis experiments
https://ift6266tr.wordpress.com/2014/02/27/first-experiment
As I said in my previous post, a long time ago I tried what Hubert did: a small neural network with one hidden layer, a tanh activation function and a linear output, with mean squared error as the loss function. So I trained this model on the first sentences in the TIMIT dataset (speaker FCJF0), dividing the acoustic samples into frames of 240 samples plus one more as the target. The goal is still to predict each acoustic sample from the previous ones! I’…
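A minimal numpy sketch of the kind of model described in this post, assuming a hidden width, initialization and SGD details that the post does not specify: one hidden tanh layer, a linear output unit, and a mean-squared-error loss, predicting the next acoustic sample from the previous 240.

```python
# Sketch of the model described in the post: 240 previous samples -> 1 next sample.
# Hidden size, initialization range and the SGD details are assumptions.
import numpy as np

rng = np.random.RandomState(0)
n_in, n_hidden = 240, 100                      # hidden width assumed

W1 = rng.uniform(-0.1, 0.1, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.uniform(-0.1, 0.1, (n_hidden, 1));    b2 = np.zeros(1)

def forward(x):
    """x: (batch, 240) frames of previous samples -> hidden activations and predictions."""
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

def sgd_step(x, y, lr=0.001):
    """One SGD step on the loss 0.5 * mean((pred - y)**2); y has shape (batch, 1)."""
    global W1, b1, W2, b2
    h, pred = forward(x)
    err = (pred - y) / len(x)                  # gradient of the MSE w.r.t. the prediction
    dW2, db2 = h.T @ err, err.sum(0)
    dh = (err @ W2.T) * (1.0 - h ** 2)         # backprop through tanh
    dW1, db1 = x.T @ dh, dh.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```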
ift6266tr.wordpress.com
First understanding | Speech synthesis experiments
https://ift6266tr.wordpress.com/2014/02/10/first-understanding
I had never studied speech signals at all and didn’t know what a signal is made of. I had never played with sounds or music; I don’t even play music! What is the format of a sequence? What do we need to match with what? What is in a .wav file? What do we do with it? What are our inputs? As you can understand, I was completely lost and couldn’t focus on what I had to do to move forward, despite the very good explanations provided by… February 10, 2014.
ift6266tr.wordpress.com
TIMIT – FCJF0 Comparisons | Speech synthesis experiments
https://ift6266tr.wordpress.com/2014/03/20/timit-fcjf0-comparisons/comment-page-1
TIMIT – FCJF0 Comparisons. This post will serve as a comparison of the different posts I saw on the data of the first speaker (… As we need a standard way to compare our results, I will also divide by the standard deviation (of the whole dataset, which is 560). We also need to “rescale” our MSE by multiplying it by this standard deviation. I tried to match William’s results. That’s why a standardized way to compare our work is essential. Provided a solution …
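A small sketch of the comparison recipe described above, following the post’s wording: divide the raw samples by the standard deviation of the whole dataset (560), compute the MSE in that standardized scale, then multiply the MSE by the same standard deviation when reporting it. Variable names here are illustrative, not taken from the post.

```python
# Sketch of the standardized-comparison recipe from the post; the value 560
# is the dataset standard deviation it quotes, other names are illustrative.
import numpy as np

DATASET_STD = 560.0

def standardize(samples):
    """Put raw acoustic samples into the common scale used for comparison."""
    return np.asarray(samples, dtype=float) / DATASET_STD

def reported_mse(pred_std, target_std):
    """MSE in the standardized scale, 'rescaled' by the std as the post describes."""
    mse = np.mean((np.asarray(pred_std) - np.asarray(target_std)) ** 2)
    return mse * DATASET_STD
```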
ift6266tr.wordpress.com
February | 2014 | Speech synthesis experiments
https://ift6266tr.wordpress.com/2014/02
Monthly Archives: February 2014. As I said in my previous post, a long time ago I tried what Hubert did: a small neural network with one hidden layer, a tanh activation function and a linear output, with mean squared error as the loss function. This leads to 382 721 frames, or examples, to train on. I divided them as follows: 80% for the training set, 10% each for the validation and test sets. Learning rate: 0.001. Some acoustic samples generated by the model: … I’ll try Theano, mayb…
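A sketch of how the (frame, target) examples and the 80/10/10 split mentioned above could be built from a waveform; the one-sample stride of the sliding window is my assumption, not something stated in the post.

```python
# Build (frame, target) pairs from a waveform and split them 80/10/10,
# as described in the post. The unit stride of the window is assumed.
import numpy as np

def make_frames(wav, frame_len=240):
    """Each 240-sample window is an input; the sample right after it is the target."""
    X = np.stack([wav[i:i + frame_len] for i in range(len(wav) - frame_len)])
    y = wav[frame_len:]
    return X, y

def split_80_10_10(X, y):
    n = len(X)
    n_tr, n_va = int(0.8 * n), int(0.1 * n)
    return ((X[:n_tr], y[:n_tr]),                             # train (80%)
            (X[n_tr:n_tr + n_va], y[n_tr:n_tr + n_va]),       # valid (10%)
            (X[n_tr + n_va:], y[n_tr + n_va:]))               # test  (10%)
```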
ift6266tr.wordpress.com
March | 2014 | Speech synthesis experiments
https://ift6266tr.wordpress.com/2014/03
Monthly Archives: March 2014. TIMIT – FCJF0 Comparisons. This post will serve as a comparison of the different posts I saw on the data of the first speaker (… As we need a standard way to compare our results, I will also divide by the standard deviation (of the whole dataset, which is 560). We also need to “rescale” our MSE by multiplying it by this standard deviation. I tried to match William’s results. Learning rate: 0.01. Provided a solution for this…
ift6266tr.wordpress.com
Sparse-coded TIMIT | Speech synthesis experiments
https://ift6266tr.wordpress.com/2014/05/01/sparse-coded-timit
The plan, at this point, is to use a gammatone dictionary and sparse coding to train more efficiently on the TIMIT dataset. Gammatone functions are filters applied to frequencies; the goal is to keep the frequencies processed by our ears. Joao explains this very well on his blog. The idea is to use this sparse-coded version as the input of a spike-and-slab RBM, segmented into frames of 160 samples and together with the previous, current and next phones. I kept the parameters use…
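A hedged sketch of the pipeline this post outlines: build a dictionary of gammatone filters and sparse-code 160-sample frames against it. The centre frequencies, filter order and bandwidth, and the use of scikit-learn’s OMP-based SparseCoder are assumptions, not the exact setup used in the post.

```python
# Gammatone dictionary + sparse coding of 160-sample TIMIT frames (sketch).
# Filterbank parameters and the sparse solver are assumptions.
import numpy as np
from sklearn.decomposition import SparseCoder

FS = 16000                                   # TIMIT sampling rate
FRAME_LEN = 160                              # 10 ms frames, as in the post
t = np.arange(FRAME_LEN) / FS

def gammatone(fc, order=4, bandwidth=100.0):
    """Gammatone impulse response: t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), unit norm."""
    g = t ** (order - 1) * np.exp(-2 * np.pi * bandwidth * t) * np.cos(2 * np.pi * fc * t)
    return g / np.linalg.norm(g)

centre_freqs = np.geomspace(100.0, 7000.0, 64)            # assumed filterbank layout
dictionary = np.stack([gammatone(fc) for fc in centre_freqs])

coder = SparseCoder(dictionary=dictionary,
                    transform_algorithm='omp',
                    transform_n_nonzero_coefs=10)

frames = np.random.randn(8, FRAME_LEN)       # stand-in for real 160-sample TIMIT frames
codes = coder.transform(frames)              # sparse codes of shape (8, 64)
```

These sparse codes would then play the role of the input to the spike-and-slab RBM mentioned in the post.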
ift6266tr.wordpress.com
Thomas Rohée | Speech synthesis experiments
https://ift6266tr.wordpress.com/author/mennerve
Author Archives: Thomas Rohée. The plan, at this point, is to use a gammatone dictionary and sparse coding to train more efficiently on the TIMIT dataset. Gammatone functions are filters applied to frequencies; the goal is to keep the frequencies processed by our ears. Joao explains this very well on his blog. The idea is to use this sparse-coded version as the input of a spike-and-slab RBM, segmented into frames of 160 samples and together with the previous, current and next phones…
ift6266h14.wordpress.com
Mar13 | Representation Learning - ift6266h14
https://ift6266h14.wordpress.com/2014/03/11/mar13/comment-page-1
Representation Learning – ift6266h14. Yoshua Bengio’s graduate class on representation learning and deep learning. Posted by Yoshua Bengio. Please study the following material in preparation for the March 13th class: … of the book draft on Deep Learning (the directory includes a bibliography) by Y. Bengio, I. Goodfellow and A. Courville. Please put up your questions below, as replies to this post. 33 thoughts on “Mar13”…