philippefutureboy/files-integrity-check

Files Integrity Check Tool

This tool scans files from a given root path to detect potential corruption or integrity issues. It supports multiple file types, including Git repositories, images, PDFs, documents, spreadsheets, archives, text files, scripts, and media files.

Features

  • Multi-threaded for efficiency
  • Supports rerunning failed checks from a previous results CSV
  • Real-time CSV logging
  • Safe handling of file paths in CSV
  • Automatic detection of file types and associated integrity checks
  • Code is provided as-is: no tests are implemented, and there is no guarantee that it works on any platform other than macOS
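
The multi-threading and real-time CSV logging features can be pictured with a short sketch. It is illustrative only and does not reproduce the tool's code: the names check_file and scan, the column names, and the "file is fully readable" check are all assumptions. Worker threads run the checks, while the main thread writes and flushes one row per finished file so the CSV can be followed while the scan is still running.

```python
import csv
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def check_file(path):
    """Hypothetical check: the file can be read from start to end."""
    try:
        with open(path, "rb") as handle:
            while handle.read(1 << 20):
                pass
        return "ok", ""
    except OSError as exc:
        return "error", str(exc)


def scan(root, outpath, max_workers=4):
    """Check every file under root and append one CSV row per result."""
    paths = [
        os.path.join(dirpath, name)
        for dirpath, _dirs, names in os.walk(root)
        for name in names
    ]
    with open(outpath, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "status", "detail"])
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {pool.submit(check_file, p): p for p in paths}
            for future in as_completed(futures):
                status, detail = future.result()
                # csv.writer quotes commas, quotes and newlines in paths.
                writer.writerow([futures[future], status, detail])
                out.flush()  # push the row to the file right away
    return len(paths)
```

Writing rows from a single thread avoids the need for a lock around the CSV writer.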

Usage

Running the Program

python3 -m files_integrity_check --path <root_path> --outpath <output_csv> [--ignore <patterns>] [--rerun <previous_csv>]

Arguments

  • --path: Root path to check (required)
  • --outpath: Path to the output CSV file (required)
  • --ignore: Space-separated list of patterns to ignore (optional)
  • --rerun: Path to a previous CSV file to rerun failed checks (optional)
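
For readers who want to see how these four arguments fit together, here is a minimal argparse sketch. It mirrors the flags listed above but is not the tool's actual parser: build_parser is a hypothetical name, and the defaults for the optional flags are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(prog="files_integrity_check")
    parser.add_argument("--path", required=True, help="Root path to check")
    parser.add_argument("--outpath", required=True, help="Path to the output CSV file")
    # nargs="*" collects every space-separated pattern into one list.
    parser.add_argument("--ignore", nargs="*", default=[], help="Patterns to ignore")
    parser.add_argument("--rerun", default=None, help="Previous CSV whose failed checks are rerun")
    return parser


args = build_parser().parse_args(
    ["--path", "/mnt/data", "--outpath", "results.csv", "--ignore", "*.tmp", "*.log"]
)
print(args.ignore)  # ['*.tmp', '*.log']
```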

Example Usage

python3 -m files_integrity_check --path /mnt/data --outpath results.csv --ignore '*.tmp' '*.log'
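
The single quotes around each pattern keep the shell from expanding the wildcard before the program sees it. How the tool matches patterns internally is not documented here; the sketch below assumes shell-style globs tested against the file name, and is_ignored is a hypothetical helper.

```python
import fnmatch
import os


def is_ignored(path, patterns):
    """Return True when the file name matches any glob pattern (assumed semantics)."""
    name = os.path.basename(path)
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


patterns = ["*.tmp", "*.log"]
print(is_ignored("/mnt/data/cache/build.tmp", patterns))  # True
print(is_ignored("/mnt/data/report.pdf", patterns))       # False
```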

Installation

Setting Up the Python Environment

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Installing External Dependencies

Depending on the file types you want to check, you may need some external programs installed on your system. Below are the main dependencies and how to install them.

Ubuntu/Debian

sudo apt update
sudo apt install ffmpeg git npm nodejs make bash fish zsh inkscape
sudo npm install -g typescript  # provides the tsc command

macOS

brew update
brew install ffmpeg git node make bash fish zsh inkscape
npm install -g typescript  # provides the tsc command; npm ships with node

Windows

Supported File Types

| File type | Extensions | Supported | Python dependencies | External dependencies |
| --- | --- | --- | --- | --- |
| Archives | .zip, .gzip, .gz, .tar, .tgz | Yes | None | None |
| Binary Files | * | Yes | pefile (Windows) | otool (macOS), readelf (Linux) |
| Code Files | .py, .js, .jsx, .ts, .tsx | Yes | esprima, fonttools, PyYAML, lxml | git, npm, node, tsc |
| CSV/TSV | .csv, .tsv | Yes | None | None |
| Documents | .docx, .xlsx, .odt, .ods, .odp, .doc, .xls, .xlsm, .xlsb | Yes | python-docx, openpyxl, odfpy | None |
| Environment and Config | .ini, .cfg, .conf, .env, .env.*, .yml, .yaml, Makefile, .toml | Yes | PyYAML | None |
| Fonts | .ttf, .otf | Yes | fonttools | None |
| Git Repos | .git | Yes | None | git |
| HTML | .html, .htm | Yes | beautifulsoup4 | None |
| Images | .jpg, .jpeg, .png, .gif, .bmp, .tiff, .ico, .webp, .psd, .svg, .ai, .eps | Yes | Pillow | Inkscape (.ai, .eps only) |
| JSON | .json | Yes | None | None |
| Markdown | .md | Yes | None | None |
| Media | .mp3, .wav, .flac, .midi, .mp4, .mov, .avi, .mkv, .webm, .ogg, .opus, .m4a, .3gp, .aac, .aiff, .mpg, .mpeg, .wmv | Yes | None | FFmpeg |
| Notebooks | .ipynb | Yes | nbformat | None |
| PDF | .pdf | Yes | PyPDF2 | None |
| Shell Scripts | .sh, .bash, .zsh, .fish | Yes | None | bash, fish, zsh |
| SQL | .sql | Yes | None | None |
| Templates | .jinja, .tmpl, .j2 | Yes | jinja2 | None |
| XML | .xml | Yes | lxml | None |
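
One way to picture the automatic detection of file types is a dispatch table keyed by extension. The sketch below is a hypothetical illustration limited to two standard-library checks (JSON parsing and ZIP CRC verification); the names CHECKERS, check_json, check_zip and check, and the status strings, are assumptions rather than the tool's actual interface.

```python
import json
import os
import zipfile


def check_json(path):
    with open(path, "r", encoding="utf-8") as handle:
        json.load(handle)  # raises ValueError on malformed JSON


def check_zip(path):
    with zipfile.ZipFile(path) as archive:
        bad_member = archive.testzip()  # first member with a bad CRC, or None
    if bad_member is not None:
        raise ValueError(f"corrupt member: {bad_member}")


# Hypothetical mapping from extension to integrity check.
CHECKERS = {".json": check_json, ".zip": check_zip}


def check(path):
    """Return (status, detail) for path based on its extension."""
    extension = os.path.splitext(path)[1].lower()
    checker = CHECKERS.get(extension)
    if checker is None:
        return "unsupported", extension
    try:
        checker(path)
    except (OSError, ValueError, zipfile.BadZipFile) as exc:
        return "corrupt", str(exc)
    return "ok", ""
```

Adding a file type then amounts to writing one function and registering its extensions in the mapping.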

Unsupported File Types

| File type | Extensions | Supported | Python dependencies | External dependencies |
| --- | --- | --- | --- | --- |
| 3D Model | .obj, .fbx, .stl, .dae, .gltf, .glb, .ply | No | None | None |
| 3D Project | .blend, .max, .ma, .mb, .c4d | No | None | Blender, Autodesk Maya, 3ds Max, Cinema 4D |
| ActionScript | .as | No | None | None |
| Ada | .adb, .ads, .ada | No | None | None |
| Audio Project | .als, .flp, .ptx, .rpp | No | None | Ableton Live, FL Studio, Pro Tools, Reaper |
| C | .c, .h | No | None | None |
| C++ | .cpp, .cc, .cxx, .hpp, .hh, .hxx | No | None | None |
| C# | .cs | No | None | None |
| CAD | .dwg, .dxf, .step, .stp, .iges, .igs | No | None | AutoCAD, FreeCAD |
| Clojure | .clj, .cljs, .cljc | No | None | None |
| CoffeeScript | .coffee | No | None | None |
| Dart | .dart | No | None | None |
| Digital Audio Workstation (DAW) | .logicx, .aif, .band | No | None | Logic Pro, GarageBand |
| Elixir | .ex, .exs | No | None | None |
| Erlang | .erl, .hrl | No | None | None |
| Game Engine Project | .unity, .uproject, .godot | No | None | Unity, Unreal Engine, Godot |
| Go | .go | No | None | None |
| Gradle | .gradle | No | None | None |
| Groovy | .groovy | No | None | None |
| Haskell | .hs | No | None | None |
| Java | .java | No | None | None |
| Julia | .jl | No | None | None |
| Kotlin | .kt, .kts | No | None | None |
| Liquid | .liquid | No | None | None |
| Lisp | .lisp, .lsp | No | None | None |
| Lua | .lua | No | None | None |
| Motion Graphics/Animation | .aep, .aepx, .prproj, .drp | No | None | Adobe After Effects, Premiere Pro, DaVinci Resolve |
| Mustache | .mustache | No | None | None |
| Objective-C | .m | No | None | None |
| Objective-C++ | .mm | No | None | None |
| OCaml | .ml, .mli | No | None | None |
| Pascal | .pas | No | None | None |
| Perl | .pl, .pm | No | None | None |
| PowerShell | .ps1 | No | None | None |
| R | .r, .R | No | None | None |
| R Markdown | .rmd | No | None | None |
| reStructuredText | .rst | No | None | None |
| Scala | .scala | No | None | None |
| Svelte | .svelte | No | None | None |
| Swift | .swift | No | None | None |
| TeX/LaTeX | .tex, .ltx, .sty | No | None | None |
| Vector Graphics | .ai, .eps, .svg | No | None | Adobe Illustrator, Inkscape |
| Video Editing | .veg, .prproj, .drp | No | None | Sony Vegas, Adobe Premiere Pro, DaVinci Resolve |
| Visual Basic | .vb | No | None | None |
| VFX Project | .nk, .hip | No | None | Nuke, Houdini |
| Vue | .vue | No | None | None |
| Web Design Project | .xd, .fig, .sketch | No | None | Adobe XD, Figma, Sketch |

License

This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
