thammireda/WekaAI

WekaAI

Event Type Prediction Using WEKA Java API

Overview

This project is Part B of the coursework for the module Web Applications and AI (7051CEM) at Coventry University. The goal of this part is to develop a classification model to predict the event type of new event listings based on their attributes (e.g., location, date, description). The model is trained using the WEKA Java API, specifically employing the J48 decision tree algorithm (an implementation of the C4.5 algorithm).

Features

  • Event Type Prediction: Predicts the type of an event (e.g., Conference, Wedding, Workshop, Party) based on its attributes.
  • Data Preprocessing:
    • Converts date attributes to numeric values (days since epoch) for compatibility with the J48 algorithm.
    • Transforms textual descriptions into word vectors using WEKA's StringToWordVector filter.
    • Removes irrelevant attributes (e.g., location) that are not useful for classification.
  • Model Training and Evaluation:
    • Trains a J48 decision tree classifier on a dataset of event listings.
    • Evaluates the model using 10-fold cross-validation to assess its performance.
  • Prediction: Classifies new event listings described by attributes such as location, date, and description.
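
As a rough illustration of the two preprocessing conversions listed above, the plain-Java sketch below turns a yyyy-MM-dd date into days since the epoch and a description into per-word counts. It does not use the WEKA filters themselves: the names PreprocessingSketch, toEpochDays, and wordCounts are hypothetical, and the simple tokenizer is an assumption that only approximates what StringToWordVector does with its own tokenizer and options.

```java
import java.time.LocalDate;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, for illustration only (not part of the project or of WEKA).
final class PreprocessingSketch {

    // Number of days between 1970-01-01 and the given ISO date (yyyy-MM-dd).
    static long toEpochDays(String isoDate) {
        return LocalDate.parse(isoDate).toEpochDay();
    }

    // Lower-cases the text, splits on runs of non-letters, and counts each word.
    // Assumption: a deliberately naive tokenizer, not WEKA's.
    static Map<String, Integer> wordCounts(String description) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : description.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(toEpochDays("2025-03-15"));            // prints 20162
        System.out.println(wordCounts("Private family wedding")); // prints {family=1, private=1, wedding=1}
    }
}
```

Each distinct word becomes one numeric feature, which is why a decision tree can then split on terms such as "wedding" or "basics".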

Technologies Used

  • Java: Core programming language for implementing the classifier.
  • WEKA Java API: Used for machine learning tasks, including data preprocessing, model training, and evaluation.
    • J48: Decision tree classifier (C4.5 algorithm).
    • StringToWordVector: For converting text descriptions into numeric features.
    • DateToNumeric: For converting date attributes into numeric values.
    • Remove: For filtering out irrelevant attributes.
  • Dataset Format: ARFF (Attribute-Relation File Format) compatible with WEKA.

Dataset

The dataset used for training the model is an extended version of the sample data provided in the assignment brief. The original dataset included nine events across four event types: Conference, Wedding, Workshop, and Party. Because nine rows are too few to train and cross-validate a classifier reliably, the dataset was expanded to 100 rows with a balanced distribution across all event types (25 events per type).
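
To make concrete what 10-fold cross-validation means for a dataset of this size, the sketch below computes how many rows land in each test fold and how many remain for training. It is plain arithmetic under the assumption of an even split; FoldSizeSketch and foldSizes are hypothetical names, and the code does not call WEKA.

```java
// Hypothetical helper class, for illustration only.
final class FoldSizeSketch {

    // Sizes of k folds when n rows are divided as evenly as possible;
    // the first (n % k) folds receive one extra row.
    static int[] foldSizes(int n, int k) {
        int[] sizes = new int[k];
        for (int i = 0; i < k; i++) {
            sizes[i] = n / k + (i < n % k ? 1 : 0);
        }
        return sizes;
    }

    public static void main(String[] args) {
        int rows = 100;
        int folds = 10;
        int[] sizes = foldSizes(rows, folds);
        for (int i = 0; i < folds; i++) {
            // With 100 rows and 10 folds, every round trains on 90 rows and tests on 10.
            System.out.printf("fold %d: train on %d rows, test on %d rows%n",
                    i + 1, rows - sizes[i], sizes[i]);
        }
    }
}
```

With 25 events per type and test folds of 10 rows, the classes cannot be split exactly evenly within a fold. WEKA's cross-validation typically stratifies the folds for a nominal class, so each test fold would be expected to hold two or three events of each type.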

Sample Data (Original)

| Event Type | Event Name | Event Features |
| --- | --- | --- |
| Conference | Tech Innovations | Location: Edinburgh, Date: 2025-06-10, Description: Showcase of tech trends |
| Wedding | Sarah & Tom’s Wedding | Location: Manchester, Date: 2025-03-15, Description: Private family wedding |
| Workshop | Java Programming Workshop | Location: Coventry, Date: 2025-01-28, Description: Learn Java Basics |
| Party | New Year Bash | Location: Birmingham, Date: 2025-12-31, Description: Celebrate the New Year |

Dataset Expansion

  • Total Rows: 100
  • Event Types: Conference, Wedding, Workshop, Party (25 rows each)
  • Attributes:
    • location: City where the event takes place (e.g., Manchester, London).
    • date: Event date in yyyy-MM-dd format (e.g., 2025-03-15).
    • description: A brief description of the event (e.g., "Private family wedding").
    • eventType: The class label (Conference, Wedding, Workshop, Party).
  • Generation Method: Additional data was generated manually to ensure variety in locations, dates, and descriptions, while maintaining realism (e.g., weddings often have descriptions like "Garden wedding celebration," while workshops include terms like "Intro to" or "Basics").

The dataset is stored in the events.arff file, located in the resources directory.
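
For readers unfamiliar with ARFF, the sketch below prints what a file with the four attributes described above might look like, using two rows from the original sample data. The actual contents of events.arff are not reproduced here. The relation name, the choice of string type for location and description, and the class name ArffSketch are all assumptions made for illustration.

```java
// Hypothetical helper class, for illustration only: builds a small ARFF
// document as text. The real events.arff may declare its attributes differently.
final class ArffSketch {

    static String sampleArff() {
        return String.join("\n",
                "@relation events",                                           // assumed relation name
                "",
                "@attribute location string",                                 // assumed type
                "@attribute date date \"yyyy-MM-dd\"",                        // ARFF date with explicit format
                "@attribute description string",                              // assumed type
                "@attribute eventType {Conference,Wedding,Workshop,Party}",   // nominal class label
                "",
                "@data",
                "Manchester,2025-03-15,'Private family wedding',Wedding",
                "Coventry,2025-01-28,'Learn Java Basics',Workshop");
    }

    public static void main(String[] args) {
        System.out.println(sampleArff());
    }
}
```

Values that contain spaces, such as the descriptions, are quoted, and the class label is declared as a nominal attribute so that a classifier like J48 can treat it as the prediction target.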
