This project is Part B of the coursework for the module Web Applications and AI (7051CEM) at Coventry University. The goal of this part is to develop a classification model to predict the event type of new event listings based on their attributes (e.g., location, date, description). The model is trained using the WEKA Java API, specifically employing the J48 decision tree algorithm (an implementation of the C4.5 algorithm).
- Event Type Prediction: Predicts the type of an event (e.g., Conference, Wedding, Workshop, Party) based on its attributes.
- Data Preprocessing:
  - Converts date attributes to numeric values (days since epoch) for compatibility with the J48 algorithm.
  - Transforms textual descriptions into word vectors using WEKA's `StringToWordVector` filter.
  - Removes irrelevant attributes (e.g., location) that are not useful for classification.
- Model Training and Evaluation:
  - Trains a J48 decision tree classifier on a dataset of event listings.
  - Evaluates the model using 10-fold cross-validation to assess its performance.
- Prediction: Provides predictions for new event listings with attributes like location, date, and description.
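
As a rough illustration of the 10-fold cross-validation mentioned above, the following plain-Java sketch shuffles row indices and deals them into folds; each fold serves once as the test set while the remaining rows are used for training. It does not use WEKA and is not the project's code: the class name `FoldSketch`, the method `split`, and the fixed seed are assumptions made for this example, and the sketch leaves out the stratification by class that WEKA's own cross-validation can apply.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: illustrates k-fold partitioning only, without WEKA.
final class FoldSketch {

    // Shuffles the row indices 0..rowCount-1 and deals them round-robin into k folds.
    static List<List<Integer>> split(int rowCount, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int pos = 0; pos < indices.size(); pos++) {
            folds.get(pos % k).add(indices.get(pos));
        }
        return folds;
    }

    public static void main(String[] args) {
        // 100 rows and 10 folds, matching the dataset size and fold count in this README.
        List<List<Integer>> folds = split(100, 10, 1L);
        for (int f = 0; f < folds.size(); f++) {
            int testSize = folds.get(f).size();
            System.out.println("Fold " + f + ": test rows = " + testSize
                    + ", training rows = " + (100 - testSize));
        }
    }
}
```

With 100 rows and 10 folds, every round tests on 10 rows and trains on the other 90, so each row is used for testing exactly once.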
- Java: Core programming language for implementing the classifier.
- WEKA Java API: Used for machine learning tasks, including data preprocessing, model training, and evaluation.
  - `J48`: Decision tree classifier (C4.5 algorithm).
  - `StringToWordVector`: For converting text descriptions into numeric features.
  - `DateToNumeric`: For converting date attributes into numeric values.
  - `Remove`: For filtering out irrelevant attributes.
- Dataset Format: ARFF (Attribute-Relation File Format) compatible with WEKA.
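
To make the two conversions listed above more concrete (dates to numbers, descriptions to word features), here is a minimal plain-Java sketch of the underlying idea. It is not WEKA's implementation and not the project's code: the class `PreprocessingSketch` and its methods are hypothetical, and the real `StringToWordVector` filter has its own tokenisation and weighting options that this sketch ignores.

```java
import java.time.LocalDate;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: plain-Java illustration of the two conversions, not WEKA's filters.
final class PreprocessingSketch {

    // Parses an ISO yyyy-MM-dd date and returns the number of days since 1970-01-01.
    static long toEpochDays(String isoDate) {
        return LocalDate.parse(isoDate).toEpochDay();
    }

    // Lower-cases the text, splits on non-letter characters, and counts each word.
    static Map<String, Integer> wordCounts(String description) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : description.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(toEpochDays("2025-03-15"));            // 20162
        System.out.println(wordCounts("Private family wedding")); // {family=1, private=1, wedding=1}
    }
}
```

Once both fields are numeric, a decision tree can split on them, for example on a date threshold or on whether a word such as "wedding" occurs in the description.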
The dataset used for training the model is an extended version of the sample data provided in the assignment brief. The original dataset included 9 events across four event types: Conference, Wedding, Workshop, and Party. To ensure effective training and testing, the dataset was expanded to 100 rows with a balanced distribution across all event types (25 events per type).
Event Type | Event Name | Event Features |
---|---|---|
Conference | Tech Innovations | Location: Edinburgh, Date: 2025-06-10, Description: Showcase of tech trends |
Wedding | Sarah & Tom’s Wedding | Location: Manchester, Date: 2025-03-15, Description: Private family wedding |
Workshop | Java Programming Workshop | Location: Coventry, Date: 2025-01-28, Description: Learn Java Basics |
Party | New Year Bash | Location: Birmingham, Date: 2025-12-31, Description: Celebrate the New Year |
- Total Rows: 100
- Event Types: Conference, Wedding, Workshop, Party (25 rows each)
- Attributes:
  - `location`: City where the event takes place (e.g., Manchester, London).
  - `date`: Event date in `yyyy-MM-dd` format (e.g., 2025-03-15).
  - `description`: A brief description of the event (e.g., "Private family wedding").
  - `eventType`: The class label (Conference, Wedding, Workshop, Party).
- Generation Method: Additional data was generated manually to ensure variety in locations, dates, and descriptions, while maintaining realism (e.g., weddings often have descriptions like "Garden wedding celebration," while workshops include terms like "Intro to" or "Basics").
The dataset is stored in the `events.arff` file, located in the resources directory.
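
For readers unfamiliar with ARFF, the sketch below prints what a header and one data row for the four attributes described above might look like. It is an assumption-laden illustration, not a copy of the actual `events.arff`: the relation name, the attribute types (`string`, `date`, nominal), and the helper class `ArffSketch` are all chosen for the example.

```java
import java.util.List;

// Hypothetical helper: prints an ARFF layout for the four attributes described above.
final class ArffSketch {

    // Header lines; the attribute types chosen here are assumptions, not taken from events.arff.
    static List<String> header() {
        return List.of(
                "@relation events",
                "",
                "@attribute location string",
                "@attribute date date \"yyyy-MM-dd\"",
                "@attribute description string",
                "@attribute eventType {Conference,Wedding,Workshop,Party}",
                "",
                "@data");
    }

    // Wraps a value in single quotes, escaping backslashes and single quotes inside it.
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // Formats one data row in the attribute order declared by header().
    static String row(String location, String date, String description, String eventType) {
        return String.join(",", quote(location), quote(date), quote(description), eventType);
    }

    public static void main(String[] args) {
        header().forEach(System.out::println);
        System.out.println(row("Manchester", "2025-03-15", "Private family wedding", "Wedding"));
    }
}
```

Declaring `eventType` as a nominal attribute with the four labels is what lets a classifier treat it as the class to predict; whether `location` is declared as a string or a nominal attribute is a design choice this sketch does not settle.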