Are there pre-trained voice or audio weights for VGG or Inception?
- I want to use VGG16 (or VGG19) for a voice clustering task.
- I read some articles that suggest using VGG (16 or 19) to build the embedding vectors for the clustering algorithm.
- The process is to convert the wav file into MFCCs or a plot (amplitude vs. time) and use this as input to the VGG model.
- I tried it out with VGG19 (and weights='imagenet').
- I got bad results, and I assume this is because I'm using VGG with the wrong weights (ImageNet weights, which were trained on images).
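A minimal sketch of the pipeline described above, under stated assumptions (not necessarily the exact code that was run). The helper names `to_vgg_input` and `embed_wav`, the 16 kHz sample rate, the 40 MFCC coefficients, the nearest-neighbour resize to 224x224, and the average pooling of the last convolutional block are all illustrative choices. It assumes librosa and TensorFlow/Keras are installed.

```python
import numpy as np


def to_vgg_input(features, size=224):
    """Turn a 2-D feature matrix (e.g. MFCC: n_mfcc x frames) into a
    (size, size, 3) float array scaled to the 0..255 range."""
    features = np.asarray(features, dtype=np.float32)
    lo, hi = features.min(), features.max()
    # Min-max scale to 0..255 (epsilon avoids division by zero on constant input).
    scaled = (features - lo) / (hi - lo + 1e-8) * 255.0
    # Nearest-neighbour resize by picking evenly spaced row/column indices.
    rows = np.linspace(0, scaled.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, scaled.shape[1] - 1, size).round().astype(int)
    resized = scaled[np.ix_(rows, cols)]
    # VGG expects 3 channels, so the single channel is repeated.
    return np.repeat(resized[:, :, None], 3, axis=2)


def embed_wav(path, weights="imagenet"):
    """Hypothetical helper: wav file -> MFCC -> VGG19 embedding (512 values)."""
    # Heavy dependencies are imported here so the helper above stays NumPy-only.
    import librosa
    from tensorflow.keras.applications import VGG19
    from tensorflow.keras.applications.vgg19 import preprocess_input

    y, sr = librosa.load(path, sr=16000)                # assumed sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40)  # assumed 40 coefficients
    batch = preprocess_input(to_vgg_input(mfcc)[None, ...])
    # include_top=False drops the ImageNet classifier; pooling="avg" yields one vector.
    model = VGG19(weights=weights, include_top=False, pooling="avg")
    return model.predict(batch)[0]
```

In this sketch the embeddings returned by `embed_wav` for each file would then be stacked and passed to the clustering algorithm. Note that the ImageNet weights were learned on natural images, so the MFCC "image" is far from the data the network was trained on.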
So:
- Are there any pre-trained audio/voice weights for VGG?
- If not, are there other pre-trained audio/voice models?
Topic vgg16 transfer-learning inception feature-engineering deep-learning
Category Data Science