Merged
24 commits
40 changes: 40 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
+name: CI
+
+on:
+  push:
+    branches:
+      - "main"
+  pull_request:
+    types: [opened, synchronize, reopened]
+
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  test:
+    name: Test
+    runs-on: ubuntu-latest
+
+    strategy:
+      fail-fast: true
+      matrix:
+        python: [3.12, 3.13]
+
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+
+      - name: Install the latest version of uv and set the python version
+        uses: astral-sh/setup-uv@v6.7.0
+        with:
+          enable-cache: true
+          python-version: ${{ matrix.python }}
+
+      - name: Pull dependencies
+        run: uv sync --all-extras --all-groups
+
+      - name: Execute tests
+        run: uv run pytest
35 changes: 35 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,35 @@
+name: Lint
+
+on:
+  push:
+    paths:
+      - '**.py'
+      - '.github/workflows/lint.yml'
+
+permissions:
+  contents: write
+
+jobs:
+  lint:
+    name: Lint Shell scripts
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+
+      - uses: astral-sh/setup-uv@v6.7.0
+
+      - name: Ruff lint
+        run: uv run ruff check --exit-zero .
+
+      - name: Ruff format
+        run: uv run ruff format .
+
+      - name: Commit changes
+        uses: stefanzweifel/git-auto-commit-action@v5
+        with:
+          commit_message: Fix styling
+
+
69 changes: 69 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,69 @@
+name: Publish on PyPI
+
+on:
+  push:
+    branches:
+      - "main"
+  release:
+    types: [created]
+
+permissions:
+  contents: read
+
+jobs:
+
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 1
+
+      - name: Install the latest version of uv and set the python version
+        uses: astral-sh/setup-uv@v6.7.0
+        with:
+          python-version: "3.12"
+
+      - name: Build wheel
+        run: |
+          uv build
+
+      # Smoke test the build for packaging errors
+      - name: Smoke test (wheel)
+        run: uv run --isolated --no-project --with dist/*.whl tests/smoke_test.py
+      - name: Smoke test (source distribution)
+        run: uv run --isolated --no-project --with dist/*.tar.gz tests/smoke_test.py
+
+      - name: Upload build for publishing
+        uses: actions/upload-artifact@v4
+        with:
+          name: parxy_release
+          if-no-files-found: error
+          retention-days: 1
+          path: dist/*
+
+  pypi:
+    name: Upload release to PyPI
+    runs-on: ubuntu-latest
+    needs: build
+    environment: pypi
+    permissions:
+      id-token: write
+    steps:
+      - name: Install the latest version of uv and set the python version
+        uses: astral-sh/setup-uv@v6.7.0
+        if: github.event_name == 'release'
+        with:
+          enable-cache: false
+          ignore-empty-workdir: true
+          python-version: "3.12"
+
+      - name: Download build
+        uses: actions/download-artifact@v4
+        with:
+          name: parxy_release
+          path: dist
+
+      - name: Publish package distributions to PyPI
+        if: github.event_name == 'release'
+        run: uv publish
8 changes: 4 additions & 4 deletions README.md
@@ -1,5 +1,5 @@
-[![CI](https://github.com/OneOffTech/parxy/actions/workflows/ci.yml/badge.svg)](https://github.com/OneOffTech/parxy/actions/workflows/ci.yml) [![Build Docker Image](https://github.com/OneOffTech/parxy/actions/workflows/docker.yml/badge.svg)](https://github.com/OneOffTech/parxy/actions/workflows/docker.yml)
-[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
+![pypi](https://img.shields.io/pypi/v/parxy.svg)
+[![Pydantic v2](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/pydantic/pydantic/main/docs/badge/v2.json)](https://docs.pydantic.dev/latest/contributing/#badges) [![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv) [![CI](https://github.com/OneOffTech/parxy/actions/workflows/ci.yml/badge.svg)](https://github.com/OneOffTech/parxy/actions/workflows/ci.yml)

 # OneOffTech Parxy

@@ -36,13 +36,13 @@ Parxy is available as a standalone command line and a library. The quickest way
 Use with minimal footprint (fewer drivers supported):

 ```bash
-uvx --from "git+https://github.com/oneofftech/parxy.git" parxy --help
+uvx parxy --help
 ```

 Use all supported drivers:

 ```bash
-uvx --from "git+https://github.com/oneofftech/parxy.git[all]" parxy --help
+uvx parxy[all] --help
 ```

 See [Supported services](#supported-services) for the list of included drivers and their extras for the installation.
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,5 +1,5 @@
 [project]
-name = "parxy-core"
+name = "parxy"
 version = "0.1.0"
 description = "Parxy document processing gateway"
 readme = "README.md"
4 changes: 2 additions & 2 deletions src/__main__.py
@@ -1,4 +1,4 @@
 from parxy_cli import cli

-if __name__ == "__main__":
-    cli()
+if __name__ == '__main__':
+    cli()
3 changes: 1 addition & 2 deletions src/parxy_cli/cli.py
@@ -8,7 +8,6 @@
 import typer
 from rich import print
 from rich.console import Console
-from rich.table import Table

 from parxy_core.facade import Parxy

@@ -93,7 +92,7 @@ def parse(
     # Process each file
     for file_path in files:
         try:
-            console.print(f'----')
+            console.print('----')

             # Parse the document
             doc = Parxy.parse(
2 changes: 1 addition & 1 deletion src/parxy_core/drivers/abstract_driver.py
@@ -121,7 +121,7 @@ def parse(

         except Exception as ex:
             self._logger.error(
-                f'Error while parsing file',
+                'Error while parsing file',
                 file,
                 self.__class__.__name__,
                 exc_info=True,
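
The hunk above passes `file` and the class name as extra positional arguments to `self._logger.error`. In the standard `logging` module, such arguments are merged into the message with %-style formatting, so the message needs one placeholder per argument for them to appear in the output. A minimal sketch of that behaviour, not the project's code: the logger name, the message wording, and the file and driver names are assumptions made for illustration.

```python
import io
import logging

# Hypothetical logger wired to an in-memory stream so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))

logger = logging.getLogger('parxy.sketch')  # assumed name, not from the project
logger.setLevel(logging.ERROR)
logger.propagate = False
logger.addHandler(handler)

# Each positional argument fills one %s placeholder; formatting is deferred
# until the record is actually emitted.
logger.error(
    'Error while parsing file %s with driver %s',
    'report.pdf',  # stands in for `file`
    'PdfActDriver',  # stands in for self.__class__.__name__
)

logged = stream.getvalue()
```

Deferring the formatting to `logging` keeps the call cheap when the record is filtered out, which is the usual reason to prefer placeholders over an f-string here.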
7 changes: 3 additions & 4 deletions src/parxy_core/drivers/llamaparse.py
@@ -1,6 +1,5 @@
 import io
-from typing import TYPE_CHECKING, Dict, Any, Optional, Union
-from logging import Logger
+from typing import TYPE_CHECKING

 # Type hints that will be available at runtime when llama_cloud_services is installed
 if TYPE_CHECKING:
@@ -51,7 +50,7 @@ def _initialize_driver(self):
         except ImportError as e:
             raise ImportError(
                 'LlamaParse dependencies not installed. '
-                "Install with 'pip install parxy-core[llama]'"
+                "Install with 'pip install parxy[llama]'"
             ) from e

         self.__client = LlamaParse(**self._config)
@@ -123,7 +122,7 @@ def _handle(
             # For all other errors, raise as parsing exception
             raise ParsingException(str(ex), self.__class__) from ex

-        if not res.error is None:
+        if res.error is not None:
             raise ParsingException(
                 res.error, self.__class__, res.model_dump(exclude={'file_name'})
             )
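
The rewrite from `not res.error is None` to `res.error is not None` in the hunk above is purely stylistic: `not` binds more loosely than `is`, so both spellings evaluate to the same result. A small sketch to illustrate, where `Result` is a hypothetical stand-in for the parser response object and not a class from the project:

```python
class Result:
    """Hypothetical stand-in for a parser response carrying an optional error."""

    def __init__(self, error=None):
        self.error = error


def has_error_old(res):
    # Parsed as `not (res.error is None)` because `not` has lower precedence than `is`.
    return not res.error is None


def has_error_new(res):
    # Same truth table, written with the dedicated `is not` operator.
    return res.error is not None


samples = [Result(), Result(''), Result('quota exceeded')]
old_results = [has_error_old(r) for r in samples]
new_results = [has_error_new(r) for r in samples]
```

Note that an empty-string error still counts as "not None" under both forms, which differs from a plain truthiness check such as `if res.error:`.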
6 changes: 2 additions & 4 deletions src/parxy_core/drivers/llmwhisperer.py
@@ -2,9 +2,7 @@

 import validators

-from typing import TYPE_CHECKING, Dict, Any
-
-from logging import Logger
+from typing import TYPE_CHECKING


 # Type hints that will be available at runtime when llm whisperer is installed
@@ -50,7 +48,7 @@ def _initialize_driver(self):
         except ImportError as e:
             raise ImportError(
                 'LlmWhisperer dependencies not installed. '
-                "Install with 'pip install parxy-core[llmwhisperer]'"
+                "Install with 'pip install parxy[llmwhisperer]'"
             ) from e

         self.__client = LLMWhispererClientV2(**self._config)
3 changes: 1 addition & 2 deletions src/parxy_core/drivers/pdfact.py
@@ -1,11 +1,10 @@
 import io
-from typing import Optional, Dict, Any
+from typing import Optional

 import requests
 import validators

 from urllib.parse import urljoin
-from logging import Logger

 from parxy_core.drivers import Driver
 from parxy_core.models import (
2 changes: 1 addition & 1 deletion src/parxy_core/drivers/unstructured_local.py
@@ -43,7 +43,7 @@ def _initialize_driver(self):
         except ImportError as e:
             raise ImportError(
                 'Unstructured dependencies not installed. '
-                "Install with 'pip install parxy-core[unstructured_local]'"
+                "Install with 'pip install parxy[unstructured_local]'"
             ) from e

     def _handle(
12 changes: 9 additions & 3 deletions src/parxy_core/exceptions/__init__.py
@@ -1,4 +1,10 @@
-from parxy_core.exceptions.authentication_exception import AuthenticationException as AuthenticationException
+from parxy_core.exceptions.authentication_exception import (
+    AuthenticationException as AuthenticationException,
+)
 from parxy_core.exceptions.parsing_exception import ParsingException as ParsingException
-from parxy_core.exceptions.file_not_found_exception import FileNotFoundException as FileNotFoundException
-from parxy_core.exceptions.unsupported_format_exception import UnsupportedFormatException as UnsupportedFormatException
+from parxy_core.exceptions.file_not_found_exception import (
+    FileNotFoundException as FileNotFoundException,
+)
+from parxy_core.exceptions.unsupported_format_exception import (
+    UnsupportedFormatException as UnsupportedFormatException,
+)
8 changes: 4 additions & 4 deletions src/parxy_core/exceptions/authentication_exception.py
@@ -1,6 +1,6 @@
 class AuthenticationException(Exception):
     """Exception raised for authentication errors.

     This exception should be raised when authentication fails with external services
     or APIs, such as invalid API keys, expired tokens, or insufficient permissions.

@@ -51,7 +51,7 @@ def __str__(self) -> str:
         str
             Formatted error message including service name and details
         """
-        base_message = f"Authentication failed for {self.service}: {self.message}"
+        base_message = f'Authentication failed for {self.service}: {self.message}'
         if self.details:
-            return f"{base_message}\nDetails: {self.details}"
-        return base_message
+            return f'{base_message}\nDetails: {self.details}'
+        return base_message
8 changes: 4 additions & 4 deletions src/parxy_core/exceptions/file_not_found_exception.py
@@ -1,6 +1,6 @@
 class FileNotFoundException(FileNotFoundError):
     """Exception raised for file not found errors.

     This exception is raised when a file cannot be accessed for parsing.

     Attributes
@@ -50,7 +50,7 @@ def __str__(self) -> str:
         str
             Formatted error message including service name and details
         """
-        base_message = f"Parsing failed for {self.service}: {self.message}"
+        base_message = f'Parsing failed for {self.service}: {self.message}'
         if self.details:
-            return f"{base_message}\nDetails: {self.details}"
-        return base_message
+            return f'{base_message}\nDetails: {self.details}'
+        return base_message
8 changes: 4 additions & 4 deletions src/parxy_core/exceptions/parsing_exception.py
@@ -1,6 +1,6 @@
 class ParsingException(Exception):
     """Exception raised for parsing errors.

     This exception is raised when parsing document fails.

     Attributes
@@ -50,7 +50,7 @@ def __str__(self) -> str:
         str
             Formatted error message including service name and details
         """
-        base_message = f"Parsing failed for {self.service}: {self.message}"
+        base_message = f'Parsing failed for {self.service}: {self.message}'
         if self.details:
-            return f"{base_message}\nDetails: {self.details}"
-        return base_message
+            return f'{base_message}\nDetails: {self.details}'
+        return base_message
8 changes: 4 additions & 4 deletions src/parxy_core/exceptions/unsupported_format_exception.py
@@ -1,6 +1,6 @@
 class UnsupportedFormatException(Exception):
     """Exception raised for file format not supported.

     This exception is raised when a file is of a format not supported by the parsing service.

     Attributes
@@ -50,7 +50,7 @@ def __str__(self) -> str:
         str
             Formatted error message including service name and details
         """
-        base_message = f"Unsupported format for {self.service}: {self.message}"
+        base_message = f'Unsupported format for {self.service}: {self.message}'
         if self.details:
-            return f"{base_message}\nDetails: {self.details}"
-        return base_message
+            return f'{base_message}\nDetails: {self.details}'
+        return base_message
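
The four exception classes share the same `__str__` shape: a base message built from `service` and `message`, with `details` appended on a second line when present. The sketch below shows how that formatting behaves. It is a standalone approximation: the class name `SketchParsingException` and the constructor signature are assumptions (the constructors are not visible in this diff), and a plain string is used for `service` for readability.

```python
class SketchParsingException(Exception):
    """Hypothetical exception mirroring the __str__ logic shown in the diff."""

    def __init__(self, message, service, details=None):
        # Assumed constructor: the real signature is not part of this diff.
        super().__init__(message)
        self.message = message
        self.service = service
        self.details = details

    def __str__(self) -> str:
        base_message = f'Parsing failed for {self.service}: {self.message}'
        if self.details:
            # Details are only appended when they are truthy (non-empty).
            return f'{base_message}\nDetails: {self.details}'
        return base_message


plain = str(SketchParsingException('timeout', 'LlamaParse'))
detailed = str(SketchParsingException('timeout', 'LlamaParse', {'job_id': 'abc'}))
```

Because the check is `if self.details:`, an empty dict or empty string is treated the same as no details at all.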