doc!: Deprecate LayoutDict
dhdaines committed Dec 30, 2024
1 parent 1e2f0a7 commit 20db2b1
Showing 2 changed files with 14 additions and 4 deletions.
7 changes: 3 additions & 4 deletions TODO.md
@@ -5,18 +5,17 @@
- [x] reimplement on top of ContentObject

## PLAYA 0.2.x
- [ ] deprecate LayoutDict
- [ ] add parallel extraction of pages
- [ ] Fix ToUnicode CMaps for CID fonts (file bug against pdfminer)
- [ ] `decode_text` is remarkably slow
- [ ] `render_char` and `render_string` are also quite slow
- [ ] add something inbetween `chars` and full bbox for TextObject
(what do you actually need for heuristic or model-based
extraction? probably just `adv`?)
- [ ] remove the rest of the meaningless abuses of `cast`
- [ ] document how to transform bbox attributes on StructElement,
Destination, etc (but you should just use "default" space)
- [ ] deprecate LayoutDict

## PLAYA 0.3 and beyond
## PLAYA 1.0
- [ ] make the structure tree lazy
- [ ] support ExtGState (submit PR to pdfminer)
- [ ] better API for document outline, destinations, links, etc
11 changes: 11 additions & 0 deletions playa/page.py
@@ -478,6 +478,11 @@ class GraphicState:
class LayoutDict(TypedDict, total=False):
"""Dictionary-based layout objects.
!!! danger Deprecated
This interface is deprecated and has been moved to
[PAVÉS](https://github.com/dhdaines/paves). It will be
removed in PLAYA 0.3.
These are somewhat like the `T_obj` dictionaries returned by
pdfplumber. The type of coordinates returned are determined by
the `space` argument passed to `Page`. By default, `(0, 0)` is
@@ -555,6 +560,7 @@ class LayoutDict(TypedDict, total=False):
`None` if irrelevant/forbidden,
srcsize: Source dimensions of image in pixels.
bits: Number of bits per channel of image.
"""

object_type: str
@@ -1208,6 +1214,11 @@ def begin_tag(self, tag: PDFObject, props: Dict[str, PDFObject]) -> None:
class PageInterpreter(BaseInterpreter):
"""Processor for the content of a PDF page
!!! danger Deprecated
This interface is deprecated and has been moved to
[PAVÉS](https://github.com/dhdaines/paves). It will be
removed in PLAYA 0.3.
Reference: PDF Reference, Appendix A, Operator Summary
"""
