# 7088_Spoken_LID_CNN
Language Identification (LID) of spoken audio using Convolutional Neural Networks (CNNs) on Mel-Spectrograms of the audio clips.
**The original raw dataset has been pruned to 2000 audio clips per language.** Download the full datasets from:
**The curated datasets have also been pruned because of their large upload size.**
**The models are not included because they are ~700MB in size.**
Find the models at:
(Only Coventry University Outlook accounts have access.)
Add the models to the "_7088_Spoken_LID_CNN/models/_" directory.
**The FFmpeg executable files, used for converting MP3 files to WAV files, are not included in the commit. Find them at:**
Place these FFmpeg executables (ffmpeg.exe, ffprobe.exe and ffplay.exe) in the "_7088_Spoken_LID_CNN/datasets/_" and "_7088_Spoken_LID_CNN/prototyping/outsider_dataset/_" directories. Alternatively, convert the MP3 audio files to WAV format by any other method.
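The conversion step could be scripted roughly as follows. This is a minimal sketch, not the code used in this repository: the function names `build_ffmpeg_command` and `convert_directory` are hypothetical, and it assumes that an FFmpeg executable is reachable either on the system `PATH` or at the path passed in as `ffmpeg`. It relies only on FFmpeg's standard `-i` (input file) and `-y` (overwrite output) options.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp3_path, wav_path, ffmpeg="ffmpeg"):
    """Return the argument list that converts one MP3 file to WAV.

    `ffmpeg` is the executable name or path (assumption: it is on PATH,
    or it points at the ffmpeg.exe placed in the datasets directory).
    """
    # -y overwrites an existing output file; -i names the input file.
    return [str(ffmpeg), "-y", "-i", str(mp3_path), str(wav_path)]


def convert_directory(src_dir, dst_dir, ffmpeg="ffmpeg"):
    """Convert every *.mp3 file in src_dir to a .wav file in dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for mp3_path in sorted(src_dir.glob("*.mp3")):
        wav_path = dst_dir / (mp3_path.stem + ".wav")
        # check=True raises CalledProcessError if FFmpeg reports a failure.
        subprocess.run(build_ffmpeg_command(mp3_path, wav_path, ffmpeg), check=True)
        converted.append(wav_path)
    return converted
```

For example, `convert_directory("clips_mp3", "clips_wav", ffmpeg="ffmpeg.exe")` would convert every MP3 file in a folder named `clips_mp3`; both folder names are placeholders and should be replaced with the actual dataset paths.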
This repository is submitted in conjunction with the project report. Refer to the report and Appendix 1 if anything is unclear.