Welcome to the AI course projects repository. This collection showcases a variety of key concepts and techniques in artificial intelligence. Each project delves into a specific area of AI, providing hands-on experience and practical applications. These projects are designed to help you understand and implement both foundational and advanced AI algorithms.
Below are brief summaries of each project included in this repository:
Project 1: Genetic Algorithms
Introduction: Genetic algorithms are inspired by natural selection. They use mating (crossover), mutation, and selection to solve optimization problems by iteratively improving candidate solutions according to their fitness.
Problem Description: Solve the Knapsack Problem, deciding which items (snacks) to take on a picnic under constraints of weight, value, and diversity. The goal is to maximize the total value while adhering to these constraints.
Implementation Steps:
- Define Genes and Chromosomes: Chromosomes represent potential solutions, composed of genes that encode decisions about which snacks to take and in what quantities.
- Generate Initial Population: Create an initial set of random chromosomes (solutions).
- Fitness Function: Define a fitness function to evaluate how well each chromosome satisfies the problem constraints and objectives.
- Crossover and Mutation: Implement crossover (combining pairs of chromosomes) and mutation (randomly altering genes) to create new chromosomes for the next generation.
- Genetic Algorithm Execution: Run the genetic algorithm through multiple generations, selecting the best chromosomes to form new populations, aiming to improve the solutions iteratively.
- Evaluate Results: Test the algorithm with different inputs and tune its parameters (such as population size and mutation rate) to improve solution quality and convergence speed.
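The steps above can be sketched in a few dozen lines. This is a minimal illustration and not the course solution: the snack list, the weight limit, the binary take-or-leave genes (the project also encodes quantities and a diversity constraint), and the truncation-selection scheme are all assumptions made for the example.

```python
import random

# Hypothetical snack data: (name, weight, value). Not the course dataset.
SNACKS = [("chips", 3, 4), ("juice", 4, 5), ("cake", 5, 8), ("nuts", 2, 3), ("fruit", 3, 5)]
MAX_WEIGHT = 10  # assumed weight limit


def fitness(chromosome):
    """Total value of the selected snacks, or 0 if the weight limit is exceeded."""
    weight = sum(w for gene, (_, w, _) in zip(chromosome, SNACKS) if gene)
    value = sum(v for gene, (_, _, v) in zip(chromosome, SNACKS) if gene)
    return value if weight <= MAX_WEIGHT else 0


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: prefix from one parent, suffix from the other."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, rate, rng):
    """Flip each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]


def run_ga(pop_size=30, generations=50, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Initial population: random binary chromosomes, one gene per snack.
    population = [[rng.randint(0, 1) for _ in SNACKS] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half as parents (truncation selection).
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), mutation_rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


best = run_ga()
print(best, fitness(best))
```

Because the parents survive unchanged into the next generation, the best fitness never decreases. Setting infeasible chromosomes to zero fitness is the simplest way to handle constraints; a graded penalty is a common alternative.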
Some of the Questions:
- Impact of small or large initial populations.
- Effect of increasing population size per generation on accuracy and speed.
- Comparison of crossover and mutation operations.
- Strategies to expedite reaching a solution.
- Issues when chromosomes stop evolving and how to address them.
- Solutions for situations where the problem has no feasible answer.
Project 2: Hidden Markov Models (HMM)
Introduction: Hidden Markov Models (HMM) are powerful tools for modeling time-series data and pattern recognition, particularly in dynamic systems with uncertainty. They are widely used in speech recognition.
Problem Description: Develop a speech recognition system for digits using HMM, utilizing a provided dataset of spoken digits from six speakers.
Implementation Steps:
- Data Preprocessing and Feature Extraction: Preprocess the audio data to improve its quality and segment it. Extract features such as Mel-frequency cepstral coefficients (MFCCs) from the audio samples.
- Understanding HMM: Define states, observations, and transition/emission probabilities to model the system's behavior.
- HMM Implementation:
- With Libraries: Use libraries like `hmmlearn` to build and train the HMM on the dataset.
- From Scratch: Implement the HMM algorithm manually, including methods for state likelihood, the Expectation-Maximization (EM) step, training, and scoring.
- Evaluation and Analysis: Use metrics like F1 score, recall, precision, and accuracy to evaluate the model's performance. Generate confusion matrices to analyze performance.
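The scoring part of the from-scratch route can be illustrated with the forward algorithm. The sketch below is a simplification under stated assumptions: it uses discrete observation symbols, whereas MFCC features are continuous and would need a continuous emission model. The two toy models and their labels (`digit_a`, `digit_b`) are hypothetical. Recognition then amounts to scoring a sequence under one model per digit and picking the most likely one.

```python
import math


def forward_log_likelihood(obs, start_prob, trans_prob, emit_prob):
    """Scaled forward algorithm: returns log P(obs | model) for a discrete HMM."""
    n_states = len(start_prob)
    # Initialisation: alpha_0(i) = pi_i * b_i(o_0)
    alpha = [start_prob[i] * emit_prob[i][obs[0]] for i in range(n_states)]
    log_likelihood = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: alpha_t(j) = b_j(o_t) * sum_i alpha_{t-1}(i) * a_ij
            alpha = [
                emit_prob[j][obs[t]] * sum(alpha[i] * trans_prob[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        # Rescale to avoid underflow and accumulate the log of the scale factors.
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        alpha = [a / scale for a in alpha]
        log_likelihood += math.log(scale)
    return log_likelihood


def classify(obs, models):
    """Return the label of the model that assigns `obs` the highest likelihood."""
    return max(models, key=lambda label: forward_log_likelihood(obs, *models[label]))


# Hypothetical two-state models over two observation symbols (0 and 1).
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
models = {
    "digit_a": (start, trans, [[0.9, 0.1], [0.8, 0.2]]),  # mostly emits symbol 0
    "digit_b": (start, trans, [[0.1, 0.9], [0.2, 0.8]]),  # mostly emits symbol 1
}
print(classify([0, 0, 1, 0], models))
```

Working with scaled probabilities and summing log scale factors keeps long sequences from underflowing, which matters for audio, where one utterance yields many frames.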
Some of the Questions:
- Utility of segmentation for this dataset.
- Detailed study of feature extraction techniques and their interrelationships.
- Robustness and sensitivity of MFCC features.
- Advantages and limitations of using MFCCs.
- Reasons for frame overlap in MFCC calculation.
- Reasons for using only the first 12-13 MFCC coefficients. ...
Project 3: Clustering
Introduction: Clustering groups objects by their inherent similarities in order to discover natural groupings within the data. The technique is useful for applications such as customer segmentation, image categorization, anomaly detection, and recommendation systems.
Problem Description: Cluster images of different flower species using clustering algorithms to group them accurately based on their features.
Implementation Steps:
- Data Preprocessing and Feature Extraction: Use the pre-trained VGG16 Convolutional Neural Network to extract features from flower images.
- Clustering Methods:
- K-Means: Choose an appropriate K value based on the number of flower categories and cluster the feature vectors.
- DBSCAN: Cluster the feature vectors using density-based clustering.
- Dimensionality Reduction: Use PCA to reduce the dimensionality of the feature vectors for visualization and comparison.
- Evaluation and Analysis: Use homogeneity and silhouette scores to evaluate the clustering results. Compare the performance of K-Means and DBSCAN.
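To make the K-Means and silhouette steps concrete, here is a small standard-library sketch. It is an illustration under assumptions, not the project pipeline: the 2-D points stand in for the much higher-dimensional VGG16 feature vectors, and the functions are hand-written stand-ins for what a library such as scikit-learn would normally provide.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b), where a is the mean distance to the point's
    own cluster and b is the mean distance to the nearest other cluster."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], [])
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster, or only one cluster overall
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical feature vectors: two well-separated groups.
features = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = kmeans(features, k=2)
print(labels, round(silhouette_score(features, labels), 3))
```

The silhouette score needs no ground-truth labels, while homogeneity does. This is why the two metrics complement each other when comparing K-Means with DBSCAN.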
Some of the Questions:
- Reasons for feature extraction over raw pixel reading.
- Summary of three feature extraction techniques from images.
- Preprocessing steps for preparing images for the model.
- Comparison of K-Means and DBSCAN, including their pros and cons.
- Method used to determine the optimal K value in K-Means.
- Comparison of clustering results from K-Means and DBSCAN.
- Explanation and function of PCA for dimensionality reduction.
- Calculation and significance of silhouette and homogeneity scores.
- Reporting and analysis of clustering performance using these metrics.
- Suggestions for improving model performance. ...
Project 4: Machine Learning
Introduction: Machine learning models make predictions from data. This project focuses on predicting house prices in Boston using several machine learning techniques.
Problem Description: Predict the prices of houses in Boston based on features such as crime rate, number of rooms, and distance to employment centers.
Implementation Steps:
- Data Familiarization: Understand the dataset, including the types of features and their significance. Perform basic statistical analysis to identify distributions and outliers.
- Data Preprocessing:
- Handling Missing Values: Implement techniques like mean imputation, median imputation, or removal of missing data.
- Feature Scaling: Apply normalization or standardization to numerical features.
- Categorical Features: Encode categorical features using methods like one-hot encoding or label encoding.
- Model Training:
- Linear Regression: Train a linear regression model and evaluate using metrics like Mean Squared Error (MSE) and R² score.
- Decision Trees and Random Forests: Train decision tree and random forest models. Compare their performance against linear regression.
- Model Evaluation and Tuning: Use techniques like grid search or random search for hyperparameter tuning. Implement k-fold cross-validation to ensure robustness.
- Advanced Techniques (Optional): Explore ensemble methods and feature engineering to enhance model performance.
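The training and validation steps can be sketched without any libraries. This is a minimal example under assumptions: it fits a single feature (number of rooms) with ordinary least squares, the data points are made up for illustration, and the folds are contiguous and unshuffled. The real project would use the full feature set and a library implementation.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(y_true, y_pred):
    """Mean Squared Error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def k_fold_mse(xs, ys, k=3):
    """Average held-out MSE over k contiguous folds.
    Samples left over when len(xs) is not divisible by k are only used for training."""
    fold_size = len(xs) // k
    errors = []
    for fold in range(k):
        lo, hi = fold * fold_size, (fold + 1) * fold_size
        train_x, train_y = xs[:lo] + xs[hi:], ys[:lo] + ys[hi:]
        slope, intercept = fit_simple_linear(train_x, train_y)
        preds = [slope * x + intercept for x in xs[lo:hi]]
        errors.append(mse(ys[lo:hi], preds))
    return sum(errors) / k


# Made-up data: average rooms per dwelling and price (illustrative units).
rooms = [4.0, 7.0, 5.5, 8.0, 5.0, 6.5, 6.0, 7.5, 4.5]
prices = [15.0, 30.5, 21.0, 38.0, 19.5, 27.0, 24.0, 34.0, 16.5]

slope, intercept = fit_simple_linear(rooms, prices)
predictions = [slope * x + intercept for x in rooms]
print("train MSE:", mse(prices, predictions))
print("train R^2:", r2_score(prices, predictions))
print("3-fold CV MSE:", k_fold_mse(rooms, prices, k=3))
```

The cross-validated error is usually higher than the training error because each fold is scored on data the model did not see. That gap is what k-fold validation is meant to expose.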
Some of the Questions:
- Methods to handle missing values and their impact.
- Importance and methods of feature scaling.
- Differences between categorical and numerical features and their preprocessing techniques.
- Comparison of linear regression, decision trees, and random forests.
- Explanation and implementation of hyperparameter tuning.
- Evaluation metrics used for regression models and their significance.
- Strategies for model validation and ensuring robustness.
- Benefits and challenges of ensemble methods.
- Steps and importance of feature engineering.
- Reasons for using cross-validation and its effect on model performance. ...
Project 5: Convolutional Neural Networks (CNN)
Introduction: Convolutional Neural Networks (CNNs) are deep learning models specialized for grid-structured data such as images. They exploit spatial hierarchies in the data to perform tasks like classification, detection, and segmentation.
Problem Description: CNNs address the challenge of recognizing and interpreting patterns in visual data, including tasks such as identifying objects in an image, distinguishing between different scenes, and segmenting parts of an image for further analysis.
Implementation Steps:
- Convolutional Layers: Apply convolution operations to detect features like edges, textures, and patterns.
- Pooling Layers: Reduce the spatial dimensions through operations like max pooling or average pooling.
- Fully Connected Layers: Use dense layers to perform high-level reasoning and classification.
- Activation Functions: Introduce non-linearities using activation functions like ReLU.
- Training Process: Train the CNN using a large dataset, optimizing parameters through backpropagation and gradient descent.
- Evaluation and Fine-Tuning: Evaluate the model on validation data, fine-tune hyperparameters, and iterate to improve performance. Use techniques like dropout and data augmentation.
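The first three building blocks (convolution, activation, pooling) can be shown on a tiny example. This is an illustrative sketch in plain Python with a hand-picked edge kernel and a made-up 5x5 image. A real CNN learns its kernels during training and would be built with a deep learning framework.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation (the operation deep learning libraries call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise ReLU: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    rows, cols = len(feature_map) // size, len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j] for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Made-up 5x5 image with a dark-to-bright vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel that responds to dark-to-bright transitions from left to right.
edge_kernel = [[-1, 1], [-1, 1]]

pooled = max_pool2d(relu(conv2d(image, edge_kernel)))
print(pooled)  # [[2, 0], [2, 0]]
```

The 4x4 feature map is non-zero only where the edge sits, and pooling shrinks it to 2x2 while keeping that response. This is the sense in which pooling reduces spatial dimensions without discarding the detected feature.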
Some of the Questions:
- Impact of different types of convolutional layers (e.g., standard, depthwise separable) on model performance.
- Effects of varying pooling layer parameters on the model's ability to generalize.
- Influence of architecture depth (number of layers) on the CNN's accuracy and training time.
- Best practices for selecting and applying activation functions in CNNs.
- Impact of different optimization techniques (e.g., Adam, SGD) on training and final model performance.
- Strategies to mitigate overfitting in CNNs, especially with smaller datasets. ...
Project 6: Reinforcement Learning
Introduction: In reinforcement learning, an agent explores and interacts with its environment to gather knowledge and maximize its total reward. This project focuses on designing an AI opponent for the classic Snake game.
Problem Description: Task 1: Snake (Nostalgia). Design an AI opponent for a two-player version of the Snake game. The goal is to grow by eating apples and defeat the opponent. The game ends when one snake's head collides with the other snake's body or with itself, or when both heads collide, with the longer snake winning.
Implementation Steps:
- Game Rules Adaptation: Adapt the classic Snake game rules for two-player mode.
- Q-Learning Agent Training: Train your agent using the Q-learning method to maximize rewards over time.
- Observation Space Reduction: Reduce the large observation space by defining features that describe the environment, such as coordinates of the apple and the opponent snake's head.
- Define State and Action Space: Define the state space based on the reduced observation space and the action space as possible moves (up, down, left, right).
- Exploration and Exploitation: Implement strategies such as Decay Epsilon to balance exploration and exploitation during training.
- Hyperparameter Tuning and Model Saving: Experiment with different iterations and hyperparameters. Save the trained models and plot the reward earned by the model for each episode.
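The Q-learning update and the decaying-epsilon schedule can be sketched on a deliberately tiny problem. Everything about the environment here is an assumption made for illustration: a one-dimensional corridor with a fixed apple, two actions instead of four, no opponent, and made-up reward values. The reduced state (signed offset from head to apple) mirrors the idea of describing the environment with a few features instead of the full board.

```python
import random
from collections import defaultdict

ACTIONS = ["left", "right"]
LINE_LENGTH = 6  # toy corridor with cells 0..5 (hypothetical, not the course game)
APPLE = 5        # fixed apple position


def step(position, action):
    """Move the head one cell; reward +1 for reaching the apple, small penalty otherwise."""
    position = max(0, min(LINE_LENGTH - 1, position + (1 if action == "right" else -1)))
    done = position == APPLE
    return position, (1.0 if done else -0.01), done


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=1.0, epsilon_min=0.05, decay=0.99, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) -> estimated return
    for _ in range(episodes):
        position, done = 0, False
        for _ in range(100):  # cap on episode length
            state = APPLE - position  # reduced observation: signed offset to the apple
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            position, reward, done = step(position, action)
            next_state = APPLE - position
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted best next value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            if done:
                break
        epsilon = max(epsilon_min, epsilon * decay)  # decay exploration after each episode
    return q


q_table = train()
for offset in range(1, LINE_LENGTH):
    print(offset, max(ACTIONS, key=lambda a: q_table[(offset, a)]))
```

Early episodes act almost randomly, which fills in the table. As epsilon shrinks, the agent increasingly follows its learned values. In the real game the state would also need features such as the opponent's head position.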
Competition: Snake (Scoring). Evaluate the trained snake models by having them compete against each other in a knockout league, with the winner of each match determined by the best of 101 games.
Some of the Questions:
- Impact of reducing the observation space on the agent's performance.
- Effects of different distance metrics (Euclidean, Manhattan) on the Q-learning algorithm.
- Influence of the Decay Epsilon strategy on exploration-exploitation balance.
- Best practices for defining state and action spaces in a large observation environment.
- Impact of different hyperparameters on training time and performance.
- Strategies to prevent the agent from overfitting during training. ...