A Python package to find license expressions and copyright statements in a codebase.
Based on Google LicenseClassifier V2, GoLicense-Classifier (or glc for short) focuses on performance without compromising on accuracy.
Note: This package currently supports only Linux. Work is in progress for Windows and macOS.
Installing GoLicense-Classifier is as simple as

    pip install golicense-classifier

Or, you can build the package from source as

    git clone https://github.com/AvishrantsSh/GoLicense-Classifier.git
    make dev
    make package

To get started, import the `LicenseClassifier` class from the module as

    from LicenseClassifier.classifier import LicenseClassifier

Note: Work on copyright statement detection is still in beta. Expect some issues, mostly with binary files.
The class comes bundled with some handy functions, each suited for a different task.
- `scan_directory`

  This method recursively walks through a directory and finds license expressions and copyright statements. It returns a dictionary with the keys `header` and `files`.

      classifier = LicenseClassifier()
      res = classifier.scan_directory('PATH_TO_DIR')

  - `max_size`: Maximum file size in MB. The default is 10 MB. Set `max_size < 0` to ignore size constraints.
  - `use_buffer` (Experimental): Set to `True` to use buffered file scanning. `max_size` is then used as the buffer size.
  - `use_scancode_mapping`: Set to `True` if you want to use Scancode license key mappings. The default is `True`.
- `scan_file`

  This method finds license expressions and copyright statements in a single file.

      classifier = LicenseClassifier()
      res = classifier.scan_file('PATH_TO_FILE')

  - `max_size`: Maximum file size in MB. The default is 10 MB. Set `max_size < 0` to ignore size constraints.
  - `use_buffer` (Experimental): Set to `True` to use buffered file scanning. `max_size` is then used as the buffer size.
  - `use_scancode_mapping`: Set to `True` if you want to use Scancode license key mappings. The default is `True`.
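
The options listed above can be combined with the documented return value, as in the minimal sketch below. It rests on two assumptions that are not confirmed here: that `max_size` is accepted as a keyword argument, and that the `files` entry of the `scan_directory` result is a sized collection with one entry per scanned file. The helper names `scan_without_size_limit` and `summarize` are hypothetical and are not part of the package.

```python
def scan_without_size_limit(classifier, directory):
    """Scan a directory with the size constraint disabled.

    Assumption: max_size can be passed by keyword; a negative value
    disables the size limit, as described in the parameter list.
    """
    return classifier.scan_directory(directory, max_size=-1)


def summarize(result):
    """Return the top-level keys of a scan result and its file count.

    Assumption: result["files"] supports len(), one entry per file.
    """
    return {
        "keys": sorted(result),
        "file_count": len(result["files"]),
    }
```

With the package installed, this might be used as `summarize(scan_without_size_limit(LicenseClassifier(), 'PATH_TO_DIR'))`. Taking the classifier as an argument keeps the helpers independent of the import, so they can be tried with a stand-in object on platforms the package does not yet support.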
You can set a custom threshold for scanning that best suits your needs. Simply change the `threshold` parameter during object creation as

    classifier = LicenseClassifier(threshold=0.9)

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
To get started, read the Contributing Guide.
- Google LicenseClassifier V2 https://github.com/google/licenseclassifier/
- Ctypes Shared Library Code https://github.com/AvishrantsSh/Ctypes_LicenseClassifier
- Ctypes https://docs.python.org/3/library/ctypes.html