# 7088_Spoken_LID_CNN
Language Identification (LID) of spoken audio using Convolutional Neural Networks (CNNs) on Mel-Spectrograms of the audio clips.
-
**The original raw dataset was pruned to 2000 audio clips per language.** Download the full datasets from: https://commonvoice.mozilla.org/en/datasets
**The curated datasets have been pruned because of their large upload size.**
**The models are not included because they are ~700MB.**
Find the models at: https://livecoventryac-my.sharepoint.com/:f:/g/personal/shawc15_uni_coventry_ac_uk/Es74pTfbWIpBjVkXBASkYe0BB-Bid47fntVuYAPMLKyXFw?e=kNiJqP
(Only Coventry University Outlook accounts have access.)
Add the models to the "_7088_Spoken_LID_CNN/models/_" directory.
**The FFMPEG executable files, used for converting MP3 files to WAV files, are not included in the commit. Find them at: https://ffmpeg.org/**
Put the FFMPEG executable files (ffmpeg.exe, ffprobe.exe & ffplay.exe) in the "_7088_Spoken_LID_CNN/datasets/_" & "_7088_Spoken_LID_CNN/prototyping/outsider_dataset/_" directories. Alternatively, convert the MP3 audio files to WAV format by another method.
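
The MP3-to-WAV conversion step can be sketched roughly as below. This is a minimal illustration, not the script used in this repository. The function names `build_ffmpeg_command` and `convert_directory` are hypothetical. The sketch assumes that `ffmpeg` is either on the system PATH or that the path to `ffmpeg.exe` is passed in as `ffmpeg_exe`.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp3_path, ffmpeg_exe="ffmpeg"):
    """Return (command, wav_path) for converting one MP3 file to WAV.

    Hypothetical helper: the output file sits next to the input file
    and differs only in its .wav extension.
    """
    mp3_path = Path(mp3_path)
    wav_path = mp3_path.with_suffix(".wav")
    # -y overwrites an existing output file; -i names the input file.
    command = [str(ffmpeg_exe), "-y", "-i", str(mp3_path), str(wav_path)]
    return command, wav_path


def convert_directory(directory, ffmpeg_exe="ffmpeg"):
    """Convert every .mp3 file directly inside `directory`; return the WAV paths."""
    converted = []
    for mp3_path in sorted(Path(directory).glob("*.mp3")):
        command, wav_path = build_ffmpeg_command(mp3_path, ffmpeg_exe)
        # check=True raises CalledProcessError if ffmpeg exits with an error.
        subprocess.run(command, check=True)
        converted.append(wav_path)
    return converted
```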
-
This repository is submitted in conjunction with the project report. Refer to the report and Appendix 1 if any confusion arises.