# 7088_Spoken_LID_CNN
Language Identification (LID) of spoken audio using Convolutional Neural Networks (CNNs) applied to Mel spectrograms of the audio clips.
-
**The original raw dataset was pruned to 2000 audio clips per language.** Download the full datasets from: https://commonvoice.mozilla.org/en/datasets
**The curated datasets have also been pruned because of their large upload size.**
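
For anyone rebuilding a similar subset from the full download, the pruning step could be sketched as below. This is only an illustration: the README does not say how the 2000 clips per language were chosen, so the seeded random sampling, the function name `prune_clips`, and its parameters are assumptions rather than the project's actual procedure.

```python
import random


def prune_clips(clip_paths, max_clips=2000, seed=0):
    """Return at most max_clips paths, chosen at random but reproducibly.

    Assumption: random sampling with a fixed seed. The original project
    may have selected its 2000 clips per language differently.
    """
    paths = sorted(clip_paths)  # fixed order so the seed gives a stable result
    if len(paths) <= max_clips:
        return paths
    rng = random.Random(seed)
    return sorted(rng.sample(paths, max_clips))
```

Calling `prune_clips` once per language, with that language's list of clip file names, would give one capped subset per language.
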
**The models are not included because they are ~700MB.**
Find the models at: https://livecoventryac-my.sharepoint.com/:f:/g/personal/shawc15_uni_coventry_ac_uk/Es74pTfbWIpBjVkXBASkYe0BB-Bid47fntVuYAPMLKyXFw?e=kNiJqP
(Only Coventry University Outlook accounts have access.)
Add the models to the "_7088_Spoken_LID_CNN/models/_" directory.
**The FFMPEG executable files, used for converting MP3 files to WAV files, are not included in the commit. Find them at: https://ffmpeg.org/**
Put these FFMPEG executable files (ffmpeg.exe, ffprobe.exe & ffplay.exe) in the "_7088_Spoken_LID_CNN/datasets/_" & "_7088_Spoken_LID_CNN/prototyping/outsider_dataset/_" directories, or find another way to convert the MP3 audio files to WAV format.
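
As a rough illustration of the MP3-to-WAV step, the conversion could be scripted along the following lines. This is a minimal sketch, not the repository's own conversion code: the helper names `build_ffmpeg_command` and `convert_directory` are hypothetical, and it assumes an `ffmpeg` executable is reachable either on the PATH or at the path passed in.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp3_path, wav_path, ffmpeg="ffmpeg"):
    """Return the argument list for converting one MP3 file to WAV."""
    # -y overwrites an existing output file; -i names the input file.
    return [str(ffmpeg), "-y", "-i", str(mp3_path), str(wav_path)]


def convert_directory(src_dir, dst_dir, ffmpeg="ffmpeg"):
    """Convert every .mp3 in src_dir to a .wav of the same name in dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for mp3_path in sorted(src_dir.glob("*.mp3")):
        wav_path = dst_dir / (mp3_path.stem + ".wav")
        # check=True raises CalledProcessError if ffmpeg exits with an error.
        subprocess.run(build_ffmpeg_command(mp3_path, wav_path, ffmpeg), check=True)
        converted.append(wav_path)
    return converted
```

The `ffmpeg` argument can point at a local copy of ffmpeg.exe, for example one placed in the datasets directory as described above.
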
-
This repository is submitted in conjunction with the project report. Refer to the report and Appendix 1 if any confusion arises.