7088_Spoken_LID_CNN

Language Identification (LID) of spoken audio using Convolutional Neural Networks (CNNs) on Mel-Spectrograms of the audio clips.
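As a rough illustration of what the Mel scale in a Mel-Spectrogram does, the sketch below converts between Hz and mels and computes evenly spaced band centres. It uses the common HTK-style formula; the function names, the 40-band count and the 0 to 8000 Hz range are assumptions for illustration and are not taken from this repository's code.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels, f_min, f_max):
    """Return n_mels centre frequencies (Hz), evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


if __name__ == "__main__":
    # Hypothetical settings: 40 bands between 0 Hz and 8 kHz.
    for centre in mel_band_centers(40, 0.0, 8000.0)[:5]:
        print(round(centre, 1))
```

The centres are packed closely at low frequencies and spread out at high frequencies, which is why a Mel-Spectrogram gives the CNN finer resolution where speech carries most of its detail.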

The original raw dataset was pruned to 2000 audio clips per language because the full data is too large to upload. Download the full datasets from: https://commonvoice.mozilla.org/en/datasets
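If you download the full datasets and want a subset of the same size, a minimal sketch of pruning to a fixed number of clips per language is given below. It assumes random sampling with a fixed seed; this README does not state how the original pruning was done, and `prune_clips` and its parameters are hypothetical names.

```python
import random


def prune_clips(clips_by_language, limit=2000, seed=0):
    """Keep at most `limit` clips per language.

    clips_by_language maps a language code to a list of clip file names.
    Sampling is random but reproducible for a given seed (an assumption;
    the original selection method is not documented here).
    """
    rng = random.Random(seed)
    pruned = {}
    for language, clips in clips_by_language.items():
        clips = sorted(clips)  # stable order before sampling
        if len(clips) > limit:
            clips = sorted(rng.sample(clips, limit))
        pruned[language] = clips
    return pruned


if __name__ == "__main__":
    # Made-up file names, for demonstration only.
    data = {"en": ["en_%d.mp3" % i for i in range(5000)]}
    print(len(prune_clips(data)["en"]))
```

Languages with fewer clips than the limit are kept whole, so the result can still be unbalanced if a language has under 2000 clips.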

The models are not included because they are about 700 MB in size. Find them at: https://livecoventryac-my.sharepoint.com/:f:/g/personal/shawc15_uni_coventry_ac_uk/Es74pTfbWIpBjVkXBASkYe0BB-Bid47fntVuYAPMLKyXFw?e=kNiJqP (only Coventry University Outlook accounts have access). Add the models to the "7088_Spoken_LID_CNN/models/" directory.

The FFMPEG executable files, used for converting MP3 files to WAV files, are not included in the commit. Find them at: https://ffmpeg.org/ Put the executables (ffmpeg.exe, ffprobe.exe and ffplay.exe) in the "7088_Spoken_LID_CNN/datasets/" and "7088_Spoken_LID_CNN/prototyping/outsider_dataset/" directories. Alternatively, convert the MP3 audio files to WAV format by any other method.
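For reference, a minimal sketch of batch-converting MP3 files to WAV by calling the ffmpeg command line from Python is shown below. The helper names, the 16 kHz sample rate and the mono output are assumptions for illustration; the conversion code in this repository may use different settings.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, ffmpeg="ffmpeg", sample_rate=16000):
    """Build the ffmpeg argument list for one MP3 -> WAV conversion.

    -y overwrites existing output, -ar sets the sample rate and
    -ac 1 downmixes to mono (rate and channel count are assumed values).
    """
    return [ffmpeg, "-y", "-i", str(src),
            "-ar", str(sample_rate), "-ac", "1", str(dst)]


def convert_folder(folder, ffmpeg="ffmpeg"):
    """Convert every .mp3 in `folder` to a .wav file next to it."""
    converted = []
    for src in sorted(Path(folder).glob("*.mp3")):
        dst = src.with_suffix(".wav")
        # check=True raises CalledProcessError if ffmpeg fails.
        subprocess.run(build_ffmpeg_command(src, dst, ffmpeg), check=True)
        converted.append(dst)
    return converted


if __name__ == "__main__":
    print(build_ffmpeg_command("clip.mp3", "clip.wav"))
```

On Windows, pass the path to the local executable, for example `convert_folder("datasets", ffmpeg="datasets/ffmpeg.exe")`, if ffmpeg is not on the system PATH.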

This repository is submitted in conjunction with the project report. Refer to the report and Appendix 1 if anything is unclear.