A comprehensive, educational implementation of various autoencoder architectures for image processing. This project demonstrates dimensionality reduction, image reconstruction, denoising, similarity search, and image morphing using deep learning.
- START HERE - Get started in 5 minutes with quick commands and examples
- PROJECT SUMMARY - Comprehensive guide covering concepts, architecture, and upgrade details
This project implements three types of autoencoders with increasing complexity:
- PCA Autoencoder - Linear dimensionality reduction similar to Principal Component Analysis
- Convolutional Autoencoder - Deep CNN-based encoder-decoder for better image reconstruction
- Denoising Autoencoder - Learns robust features by reconstructing clean images from noisy inputs
Additionally, the project includes:
- Image Retrieval - Find similar images using learned latent representations
- Image Morphing - Smooth interpolation between images in latent space
Input → Flatten → Dense(code_size) → Dense(original_size) → Reshape → Output
A simple linear autoencoder that compresses images into a low-dimensional code, similar to PCA but learned through backpropagation.
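The link to PCA can be made concrete without training anything. The sketch below is a plain NumPy illustration, not the project's Keras model: it builds a linear encoder and decoder from the top principal directions, which span the subspace a linear autoencoder trained with mean squared error tends toward. The function names, the random stand-in data, and the mean-centering step are assumptions made for this example.

```python
import numpy as np

def fit_linear_codec(images, code_size):
    """Return (mean, components) for the top `code_size` principal directions."""
    flat = images.reshape(len(images), -1)           # Flatten
    mean = flat.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:code_size]

def encode(images, mean, components):
    """Project flattened, centered images onto the components (the 'code')."""
    flat = images.reshape(len(images), -1)
    return (flat - mean) @ components.T              # shape: (n, code_size)

def decode(codes, mean, components, image_shape):
    """Map codes back to pixel space and restore the image shape."""
    flat = codes @ components + mean
    return flat.reshape((-1,) + tuple(image_shape))  # Reshape

rng = np.random.default_rng(0)
images = rng.random((200, 8, 8, 3))                  # stand-in data, not LFW
mean, components = fit_linear_codec(images, code_size=32)
codes = encode(images, mean, components)
recon = decode(codes, mean, components, images.shape[1:])
print(codes.shape, recon.shape)
```

The learned Dense layers of a trained linear autoencoder need not equal these components exactly; they only need to span a similar subspace.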
Encoder: Conv2D + MaxPool (×4) → Flatten → Dense(code_size)
Decoder: Dense → Reshape → Conv2DTranspose (×4)
Uses convolutional layers to learn hierarchical features and transpose convolutions to reconstruct images.
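To see how four pooling stages interact with small images, it can help to trace the feature-map size through the encoder. The helper below is a hypothetical illustration. It assumes 2×2 pooling with stride 2 and 'same'-padded convolutions that keep the spatial size; the project's actual layer settings may differ.

```python
def encoder_spatial_sizes(image_size, n_blocks=4, pool=2):
    """Side length of the feature map after each Conv2D + MaxPool block.

    Assumes 'same'-padded convolutions (size unchanged) and
    non-overlapping pooling, so each block divides the side by `pool`.
    """
    sizes = [image_size]
    for _ in range(n_blocks):
        sizes.append(sizes[-1] // pool)
    return sizes

print(encoder_spatial_sizes(32))  # [32, 16, 8, 4, 2]
```

Under these assumptions a 32×32 input reaches a 2×2 feature map before the Flatten step, and the four Conv2DTranspose layers in the decoder would each need to double the size to return to 32×32.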
Same architecture as convolutional autoencoder, but trained with corrupted inputs and clean targets to learn robust, noise-invariant features.
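The corruption step can be sketched in a few lines of NumPy. This is a minimal example under stated assumptions: zero-mean Gaussian noise, pixel values already scaled to a small range, and no clipping after the noise is added. The function name `add_gaussian_noise` is hypothetical and may not match what utils/noise.py provides.

```python
import numpy as np

def add_gaussian_noise(images, sigma=0.1, seed=None):
    """Return a copy of `images` with zero-mean Gaussian noise of std `sigma` added."""
    rng = np.random.default_rng(seed)
    return images + rng.normal(loc=0.0, scale=sigma, size=images.shape)

clean = np.full((4, 32, 32, 3), 0.5)       # stand-in batch of clean images
noisy = add_gaussian_noise(clean, sigma=0.1, seed=0)

# Training pairs for a denoising autoencoder: noisy input, clean target.
inputs, targets = noisy, clean
print(inputs.shape, float(np.std(noisy - clean)))
```

The model never sees the clean image as input, so it cannot simply copy pixels through; it has to learn structure that survives the noise.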
- Python 3.8 or higher
- pip package manager
- Clone the repository
git clone https://github.com/yourusername/Autoencoders-Decoders-using-Keras.git
cd Autoencoders-Decoders-using-Keras
- Create a virtual environment (recommended)
# Windows
python -m venv venv
venv\Scripts\activate
# Linux/Mac
python3 -m venv venv
source venv/bin/activate
- Install dependencies
pip install -r requirements.txt
The project provides a unified command-line interface through main.py:
Train a simple linear autoencoder (quick, good for understanding basics):
python main.py --mode pca --epochs 15 --code-size 32
Train a deep CNN autoencoder (best reconstruction quality):
python main.py --mode convolutional --epochs 25 --code-size 32 --batch-size 32
Train an autoencoder to remove noise from images:
python main.py --mode denoising --epochs 25 --code-size 512 --noise-sigma 0.1
Find similar images using learned representations:
python main.py --mode retrieval --model-path saved_models --n-queries 5 --n-neighbors 5
Create smooth transitions between images:
python main.py --mode morphing --model-path saved_models --n-pairs 5 --n-steps 7
| Argument | Type | Default | Description |
|---|---|---|---|
| --mode | str | required | Operation mode: pca, convolutional, denoising, retrieval, morphing |
| --image-size | int | 32 | Image dimensions (width/height) |
| --test-size | float | 0.1 | Fraction of data for testing |
| Argument | Type | Default | Description |
|---|---|---|---|
| --code-size | int | 32 | Latent code dimension (use 512 for denoising) |
| --epochs | int | 25 | Number of training epochs |
| --batch-size | int | 32 | Training batch size |
| Argument | Type | Default | Description |
|---|---|---|---|
| --noise-sigma | float | 0.1 | Gaussian noise standard deviation |
| Argument | Type | Default | Description |
|---|---|---|---|
| --n-queries | int | 3 | Number of query images |
| --n-neighbors | int | 5 | Number of similar images to retrieve |
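Retrieval in latent space reduces to a nearest-neighbor search over encoder outputs. The sketch below assumes Euclidean distance and a brute-force search in NumPy; the project may use a different metric or a library index, and `nearest_neighbors` is a hypothetical name. Random vectors stand in for real codes.

```python
import numpy as np

def nearest_neighbors(query_code, codes, n_neighbors=5):
    """Indices and distances of the `n_neighbors` codes closest to `query_code`."""
    distances = np.linalg.norm(codes - query_code, axis=1)
    order = np.argsort(distances)[:n_neighbors]
    return order, distances[order]

rng = np.random.default_rng(0)
codes = rng.normal(size=(100, 32))          # stand-in for encoder outputs
idx, dist = nearest_neighbors(codes[7], codes, n_neighbors=5)
print(idx, dist)  # the query itself ranks first, at distance 0
```

When the query image is part of the searched set, its own index comes back first, so a caller would typically skip that entry or request one extra neighbor.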
| Argument | Type | Default | Description |
|---|---|---|---|
| --n-pairs | int | 5 | Number of image pairs to morph |
| --n-steps | int | 7 | Interpolation steps |
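Morphing can be sketched as linear interpolation between two latent codes, with each intermediate code then passed through the decoder. The example below covers only the interpolation step and assumes the endpoints are included in the sequence; `interpolate_codes` is a hypothetical helper, and the project's own blending scheme may differ.

```python
import numpy as np

def interpolate_codes(code_a, code_b, n_steps=7):
    """Linearly blend two latent codes; returns shape (n_steps, code_size)."""
    alphas = np.linspace(0.0, 1.0, n_steps)[:, None]   # 0 = code_a, 1 = code_b
    return (1.0 - alphas) * code_a + alphas * code_b

code_a = np.zeros(32)                  # stand-in for encoder(image_a)
code_b = np.ones(32)                   # stand-in for encoder(image_b)
path = interpolate_codes(code_a, code_b, n_steps=7)
print(path.shape)  # (7, 32)
```

Decoding every row of `path` would give one frame per step, starting at a reconstruction of the first image and ending at a reconstruction of the second.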
| Argument | Type | Default | Description |
|---|---|---|---|
| --save-dir | str | saved_models | Directory to save trained models |
| --checkpoint-dir | str | checkpoints | Directory for training checkpoints |
| --model-path | str | saved_models | Path to load trained models |
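The documented options map naturally onto Python's argparse module. The parser below is a hypothetical reconstruction from the flags, types, and defaults listed above; it is not copied from main.py, and details such as the `choices` restriction on `--mode` are assumptions.

```python
import argparse

MODES = ["pca", "convolutional", "denoising", "retrieval", "morphing"]

def build_parser():
    """Hypothetical parser mirroring the documented flags and defaults."""
    p = argparse.ArgumentParser(description="Autoencoder training and demos")
    p.add_argument("--mode", required=True, choices=MODES)
    p.add_argument("--image-size", type=int, default=32)
    p.add_argument("--test-size", type=float, default=0.1)
    p.add_argument("--code-size", type=int, default=32)
    p.add_argument("--epochs", type=int, default=25)
    p.add_argument("--batch-size", type=int, default=32)
    p.add_argument("--noise-sigma", type=float, default=0.1)
    p.add_argument("--n-queries", type=int, default=3)
    p.add_argument("--n-neighbors", type=int, default=5)
    p.add_argument("--n-pairs", type=int, default=5)
    p.add_argument("--n-steps", type=int, default=7)
    p.add_argument("--save-dir", default="saved_models")
    p.add_argument("--checkpoint-dir", default="checkpoints")
    p.add_argument("--model-path", default="saved_models")
    return p

args = build_parser().parse_args(["--mode", "denoising", "--code-size", "512"])
print(args.mode, args.code_size, args.noise_sigma)  # denoising 512 0.1
```

argparse converts dashes in flag names to underscores, so `--code-size` is read back as `args.code_size`.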
Autoencoders-Decoders-using-Keras/
│
├── main.py                          # Main entry point with CLI
│
├── models/                          # Model architectures
│   ├── __init__.py
│   ├── pca_autoencoder.py
│   ├── convolutional_autoencoder.py
│   └── denoising_autoencoder.py
│
├── utils/                           # Utility functions
│   ├── __init__.py
│   ├── data_loader.py               # Dataset loading and preprocessing
│   ├── visualization.py             # Plotting and visualization
│   └── noise.py                     # Noise generation utilities
│
├── image_retrieval.py               # Similarity search implementation
├── image_morphing.py                # Image interpolation
│
├── requirements.txt                 # Python dependencies
└── README.md                        # This file
An autoencoder is a neural network that learns to compress data into a lower-dimensional representation (encoding) and then reconstruct the original data from this compressed form (decoding).
Original → [Encoder] → Compressed Code → [Decoder] → Reconstruction
- Dimensionality Reduction - Compress high-dimensional data
- Feature Learning - Learn meaningful representations automatically
- Denoising - Remove noise while preserving important features
- Anomaly Detection - Identify unusual patterns
- Generative Modeling - Create new, similar data
- Latent space - The compressed representation learned by the encoder. Points close together in latent space represent similar images.
- Reconstruction loss - The difference between input and output (typically Mean Squared Error). Lower loss means better reconstruction.
- Bottleneck - The smallest layer (latent code), which forces the network to learn efficient representations.
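Mean squared error as a reconstruction loss is simple to compute by hand. The snippet below is a small NumPy illustration with made-up pixel values; `reconstruction_mse` is a hypothetical helper, not a function from this project.

```python
import numpy as np

def reconstruction_mse(original, reconstruction):
    """Mean of squared per-pixel differences between two arrays of equal shape."""
    return float(np.mean((original - reconstruction) ** 2))

original = np.array([[0.0, 0.5], [1.0, 0.25]])
reconstruction = np.array([[0.0, 0.5], [0.5, 0.25]])
print(reconstruction_mse(original, reconstruction))  # 0.0625
```

Only one of the four pixels differs, by 0.5, so the loss is 0.25 / 4 = 0.0625. A perfect reconstruction gives 0.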
After training, you'll find:
- Training curves - Loss over epochs showing model improvement
- Reconstructions - Original vs reconstructed images showing quality
- Denoising examples - Original → Noisy → Denoised progression
- Similar images - Query image with nearest neighbors
- Morphing sequences - Smooth transitions between image pairs
All results are saved in the results/ directory.
The project uses the Labeled Faces in the Wild (LFW) dataset, which contains face images. The dataset is automatically downloaded via scikit-learn. If unavailable, synthetic data is generated for demonstration.
- Start with PCA - Quick training to verify setup
- Use GPU - Significantly faster for convolutional models
- Monitor loss - Should decrease steadily during training
- Adjust code size - Smaller = more compression, larger = better quality
- Early stopping - Training stops automatically if no improvement
Models are automatically saved during training. Best models are kept based on validation loss.
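The stopping and best-model behavior can be illustrated with a patience-based rule in plain Python. This is a sketch of the general technique, not the project's code: the `patience` value and the function name are assumptions. In Keras this behavior is commonly provided by the `EarlyStopping` and `ModelCheckpoint` callbacks, though whether the project uses them is not stated here.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a patience-based stopping rule.

    Training stops once `patience` epochs have passed without the
    validation loss improving on its best value so far.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.50]
print(early_stopping_epoch(losses, patience=3))  # (5, 2)
```

In this made-up loss curve, training stops at epoch 5 and the weights from epoch 2 would be the ones kept. The later improvement at epoch 6 is never reached, which is the usual trade-off of a short patience.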
To use your own images, modify utils/data_loader.py:
import numpy as np

def load_custom_dataset(image_dir, img_size=32):
    # Load your images here, resized to img_size x img_size
    images = []
    # ... your loading code ...
    return np.array(images)
Experiment with different configurations:
# Larger latent code for better quality
python main.py --mode convolutional --code-size 128 --epochs 50
# Stronger denoising
python main.py --mode denoising --noise-sigma 0.2 --code-size 512
Trained models are saved in saved_models/:
- encoder.weights.h5 - Encoder weights
- decoder.weights.h5 - Decoder weights
- Start with PCA (models/pca_autoencoder.py) - Simple linear transformations
- Move to CNN (models/convolutional_autoencoder.py) - Hierarchical features
- Explore denoising (models/denoising_autoencoder.py) - Robust learning
- Encoder-Decoder architecture - Symmetric compression and reconstruction
- Transpose convolution - Upsampling in the decoder
- Latent space - Learned feature representations
- Transfer learning - Encoder features useful for other tasks
Contributions are welcome! Areas for improvement:
- Additional autoencoder variants (VAE, β-VAE)
- More noise types (salt-and-pepper, blur)
- Different architectures (ResNet-based, U-Net)
- Additional applications (style transfer, inpainting)
This project is licensed under the MIT License - see the LICENSE file for details.
- Original dataset: Labeled Faces in the Wild
- Inspired by classical computer vision and deep learning research
- Built with TensorFlow/Keras
For questions or suggestions, please open an issue on GitHub.
Happy Learning!
This project is designed to be educational, demonstrating key concepts in autoencoders while maintaining professional code quality and industry standards.