Commit ea1c939

julianprester, geritwagner, github-actions[bot], github-actions, and pre-commit-ci[bot] authored
genai package (#545)
* Move prescreen inclusion criterion input to ops prescreen * update and rename workflows Signed-off-by: Gerit Wagner <[email protected]> * Update README.md * crossref: catch Exception * refactor: pylint messages * Run Update documentation weekly to avoid many PRs * Update documentation (#548) * Update documentation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * europe_pmc: catch ValueError in lock.release() * Use posix paths for platform independence (#544) * Convert all paths for docker to posix * PRISMA: as_posix() * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * OCRMyPDF: as_posix() * fix prisma: path unlink() --------- Co-authored-by: Julian Prester <[email protected]> Co-authored-by: Gerit Wagner <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Gerit Wagner <[email protected]> * colrev project installation (making internal packages optional) (#530) * colrev project installation / make internal packages optional * drop optional extras from colrev * update * update gh-workflow Signed-off-by: Gerit Wagner <[email protected]> * format and docs * update upgrade * extract colrev-internal-package discovery * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * install packages after --add and init * add note on colrev install . to docs --------- Signed-off-by: Gerit Wagner <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * docs: fix path * fix docker test * docs: update path * docker tests: remove intermediate containers * package_manager: packages do not necessarily start with "colrev." 
* update dependencies (#550) Co-authored-by: Poetry updater <[email protected]> * paper_md: stop container * fix import error: local_index.builder * add todo * do not build paper in silent mode * Reduce dependencies and switch to pydantic (#551) * move dependencies to arxiv and dedupe * pin numpy<2.0 * add bib-dedupe * switch to pydantic * switch to pydantic * update * sources. use relative filenames * update docs * fix mypy * crossref: update printout * docs: drop asciinema of package --init Signed-off-by:t Gerit Wagner <[email protected]> * docs: add note on search udpates * cli: add instructions * upgrade: fix path-names in registry * tei_parser: set defaults * testing/fixes * Export instead of print * Remove instructor dependency * Split prompt into system and user * align screening output with prescreen file export * move packages asciinema to comments * add command how to verify git credentials * fixes * update dependencies (#553) Co-authored-by: Poetry updater <[email protected]> * [pre-commit.ci] pre-commit autoupdate (#556) updates: - [github.com/astral-sh/ruff-pre-commit: v0.6.2 → v0.6.7](https://github.com/astral-sh/ruff-pre-commit/compare/v0.6.2...v0.6.7) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * prep polish: reset original state * crossref: raise ServiceNotAvailableException in crossref_query() * update set_prepared in record.run_quality_model() * update sync Signed-off-by: Gerit Wagner <[email protected]> * update validation * fix long line * no name-format defect for abbreviated names * record.remove_field_provenance_note(): also remove IGNORE:note * record.change_entrytype(): run_quality_model() with set_prepared=True * fixes * temporarily remove genai * install all-internal-packages for devcontainer (pylint) * fix naming conventions * fix naming conventions * fix arxiv: pyproject.toml * Relax prep (#529) * has_fatal_quality_defects() * create package ref_check * record_test: ignore mypy errors * add 
ref_check as a default package * remove record notes * reorder imports * update validation * update poetry.lock * sync with main commit 520a71356c5767cfba5a133e788826680d70377d Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:28:13 2024 +0200 fixes commit 227bf715ae4c39b88d4fceb20353d6cac3787c7a Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:21:47 2024 +0200 record.change_entrytype(): run_quality_model() with set_prepared=True commit ba6ba8be481266c12cb6d0551ad08e4ea34b98ce Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:19:14 2024 +0200 record.remove_field_provenance_note(): also remove IGNORE:note commit 9f1c22814d5d34bfa7cf03b90b39d71e961ec597 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:18:37 2024 +0200 no name-format defect for abbreviated names commit 8df75d4dc2f846eda85ce4589b4a0388cacde0f0 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:18:02 2024 +0200 fix long line commit bbe831ddbc141da318c2c356a2f4b1ff61a31825 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:17:47 2024 +0200 update validation commit 12beaf5fdd4d874bc798d229896498850060da5a Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:03:27 2024 +0200 update sync Signed-off-by: Gerit Wagner <[email protected]> commit 520192c39ea21904d828fa6c4d863a5859c1a2d9 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:01:14 2024 +0200 update set_prepared in record.run_quality_model() commit ae108e57f14be0d20c4a275b40ed470457b1e262 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:00:42 2024 +0200 crossref: raise ServiceNotAvailableException in crossref_query() commit 9d882f338fb47a9fb200b98130c6504b46ef7709 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 15:58:17 2024 +0200 prep polish: reset original state commit ddea08ff0f0ffe79a2b40e560548f5ea9d1709c3 Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Tue Sep 24 06:16:49 2024 +0200 [pre-commit.ci] 
pre-commit autoupdate (#556) updates: - [github.com/astral-sh/ruff-pre-commit: v0.6.2 → v0.6.7](https://github.com/astral-sh/ruff-pre-commit/compare/v0.6.2...v0.6.7) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> commit 77c055e610f179fc76a0006f5ad11391d979111b Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Date: Tue Sep 24 06:16:39 2024 +0200 update dependencies (#553) Co-authored-by: Poetry updater <[email protected]> commit d045a1da20b5dac6562aac34668c58666fb90e5b Author: Gerit Wagner <[email protected]> Date: Mon Sep 23 08:35:53 2024 +0200 fixes commit a112c204f219ad180f2471126fd685efa0f66a6d Author: Carlo <[email protected]> Date: Fri Sep 20 07:38:10 2024 +0200 add command how to verify git credentials commit 81ebfb4c6c2202191b28c9bb590dbf79509d30f8 Author: Gerit Wagner <[email protected]> Date: Thu Sep 19 19:25:34 2024 +0200 move packages asciinema to comments commit 700c80508455046f2407aa493b36ba2760ccbed1 Author: Gerit Wagner <[email protected]> Date: Thu Sep 19 07:50:11 2024 +0200 testing/fixes commit 260566119990226e07a1485bbead1c73900560a8 Author: Gerit Wagner <[email protected]> Date: Thu Sep 19 06:21:49 2024 +0200 tei_parser: set defaults commit cd4214116c90e70c369edb38eb3a2e8999a97b29 Author: Gerit Wagner <[email protected]> Date: Mon Sep 16 11:24:19 2024 +0200 upgrade: fix path-names in registry commit 36bfd051989681c992b4743963df1a5053924884 Author: Gerit Wagner <[email protected]> Date: Mon Sep 16 11:08:04 2024 +0200 cli: add instructions commit ecbad8fc09c548ff1e9507269b6bc09a2ce26769 Author: Gerit Wagner <[email protected]> Date: Sun Sep 15 10:51:48 2024 +0200 docs: add note on search udpates commit aa1b6766a03a02151c4838422995bc7fe2581374 Author: Gerit Wagner <[email protected]> Date: Sat Sep 14 12:00:44 2024 +0200 docs: drop asciinema of package --init Signed-off-by:t Gerit Wagner <[email protected]> commit e791cc50bc8fd25ef209e3bef77342f31efe4e6f Author: Gerit 
Wagner <[email protected]> Date: Sat Sep 14 09:41:21 2024 +0200 crossref: update printout commit ffd96287d953fd440dc26380ea3117ad32ce47ee Author: Gerit Wagner <[email protected]> Date: Fri Sep 13 14:02:14 2024 +0200 Reduce dependencies and switch to pydantic (#551) * move dependencies to arxiv and dedupe * pin numpy<2.0 * add bib-dedupe * switch to pydantic * switch to pydantic * update * sources. use relative filenames * update docs * fix mypy commit 1b84a37edd4767c96d46f17e765aa1d98a99a9d8 Author: Gerit Wagner <[email protected]> Date: Fri Sep 13 10:13:18 2024 +0200 do not build paper in silent mode commit 910dffac5f95df8e02b0efaf11c0a017de306f78 Author: Gerit Wagner <[email protected]> Date: Fri Sep 13 08:05:21 2024 +0200 add todo commit 7277c719f08c49274724ceb0f4af0b14c906f189 Author: Gerit Wagner <[email protected]> Date: Fri Sep 13 08:05:10 2024 +0200 fix import error: local_index.builder commit 026df759a150167b472fdd554683b8b491eb8105 Author: Gerit Wagner <[email protected]> Date: Wed Sep 11 07:09:15 2024 +0200 paper_md: stop container commit dc6ed4676acaf5a17eb9f5e8024a0ab848ae455a Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Date: Wed Sep 11 06:19:15 2024 +0200 update dependencies (#550) Co-authored-by: Poetry updater <[email protected]> commit a4d12eccffc938a5c91ab117cf35e24120f45965 Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 20:40:56 2024 +0200 package_manager: packages do not necessarily start with "colrev." 
commit 5d23d210d85c3ff0e16ae274d28690f0019e87d1 Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 14:36:08 2024 +0200 docker tests: remove intermediate containers commit 5142f4db641252e2bd667752cf3a9d7ae54e7cfa Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 14:09:21 2024 +0200 docs: update path commit 2d6c2c28bcb803b7777adc0be79e41e665af05fe Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 14:08:34 2024 +0200 fix docker test commit 8c8f1370abf2dcbb5ab659c658f4411d46250425 Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 13:42:36 2024 +0200 docs: fix path commit 7b8c31ed98f2b8e104b5f6e942e90f951160b280 Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 09:23:19 2024 +0200 colrev project installation (making internal packages optional) (#530) * colrev project installation / make internal packages optional * drop optional extras from colrev * update * update gh-workflow Signed-off-by: Gerit Wagner <[email protected]> * format and docs * update upgrade * extract colrev-internal-package discovery * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * install packages after --add and init * add note on colrev install . 
to docs --------- Signed-off-by: Gerit Wagner <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> commit bea78ac507f362feb3b6126d01342a0d7f61ebec Author: Julian Prester <[email protected]> Date: Tue Sep 10 16:42:30 2024 +1000 Use posix paths for platform independence (#544) * Convert all paths for docker to posix * PRISMA: as_posix() * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * OCRMyPDF: as_posix() * fix prisma: path unlink() --------- Co-authored-by: Julian Prester <[email protected]> Co-authored-by: Gerit Wagner <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Gerit Wagner <[email protected]> commit cd944e1db31a4365a225672f7b082be95db5efef Author: Gerit Wagner <[email protected]> Date: Mon Sep 9 09:19:49 2024 +0200 europe_pmc: catch ValueError in lock.release() commit 685260be988edbb47d725cbfac298075ab46e011 Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Date: Sun Sep 8 14:44:24 2024 +0200 Update documentation (#548) * Update documentation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> commit 046c09ae0ff7f9a8f910e90a03c5585295797469 Author: Gerit Wagner <[email protected]> Date: Sun Sep 8 09:52:40 2024 +0200 Run Update documentation weekly to avoid many PRs commit 8fcb7e347ba1e29f26785d682e5c346be91a5237 Author: Gerit Wagner <[email protected]> Date: Sun Sep 8 09:52:15 2024 +0200 refactor: pylint messages commit 3ea50f6a377989081a06752a5a76492763385975 Author: Gerit Wagner <[email protected]> Date: Sun Sep 8 09:21:37 2024 +0200 crossref: catch Exception commit 163b5e10d88399fa88998ac04d63578e37dd51e3 Author: Gerit Wagner <[email 
protected]> Date: Sun Sep 8 09:19:55 2024 +0200 Update README.md commit 8f2460eb54e931d82d6b680107f05fd7736ff5a1 Author: Gerit Wagner <[email protected]> Date: Sun Sep 8 09:12:58 2024 +0200 update and rename workflows Signed-off-by: Gerit Wagner <[email protected]> commit c8bafc81c457e6007d33eb85635998eceec534bf Author: Gerit Wagner <[email protected]> Date: Fri Sep 6 08:26:07 2024 +0200 data endpoint. add and commit commit b460db7230858c9cced3592e7d8c2ed759e437ec Author: Gerit Wagner <[email protected]> Date: Fri Sep 6 08:25:48 2024 +0200 init: check Docker available commit 85ea667aa9571a118ef6688e86a78e64d77f0641 Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Date: Wed Sep 4 08:58:41 2024 +0200 update dependencies (#539) Co-authored-by: Poetry updater <[email protected]> commit 69d3eff5daf2df1a37816fc062eae73e6c11eecd Author: Gerit Wagner <[email protected]> Date: Wed Sep 4 07:41:42 2024 +0200 update relink_pdfs Signed-off-by: Gerit Wagner <[email protected]> commit fbd4d780c275fbdf04e409a3d7f0c70843f400e2 Author: Gerit Wagner <[email protected]> Date: Tue Sep 3 09:11:58 2024 +0200 add note on the order of pre/screening packages commit 09aa2173829f03b177c60f38b76acf4fcc5d8649 Author: Gerit Wagner <[email protected]> Date: Tue Sep 3 08:20:41 2024 +0200 update pdf text extraction commit 6588b6867806312c24bb3bfd7d6eb0fc80860747 Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:26:21 2024 +0200 europe_pmc: refactor commit 9877da003ef1085b4222c1d17bc0d364a00161c2 Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:25:31 2024 +0200 crossref: refactor commit 7ba35df3e0ecfe204b3c639fd062e086a06e080d Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:24:35 2024 +0200 update print output commit dad30adb357ac0f38c171d1a5b9997e3a275e576 Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:24:05 2024 +0200 RecordNotInIndexException: ID mandatory commit 45deaa89b222308c093505af150a392b97868db7 Author: Gerit 
Wagner <[email protected]> Date: Mon Sep 2 18:22:44 2024 +0200 pdf-get: fix symlinks after renamed dirs commit 91d5552929b026d1cd6fb0fe9bf11bf716b8a698 Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:22:25 2024 +0200 load: warn on non-standardized fields (instead of raising exception) commit 829e036c061b1fc2903785500387924894525ff4 Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:20:57 2024 +0200 paper_md: fix path commit 5b4079e6cc0a66877a0f82c15d9683a65632423e Author: Gerit Wagner <[email protected]> Date: Sat Aug 31 18:59:57 2024 +0200 sort SearchTypes commit 84d7901208ed067e40443f2ec55bfd5a3a712a8f Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Date: Sat Aug 31 16:20:02 2024 +0200 Update documentation (#528) Co-authored-by: github-actions <[email protected]> * add instruction to cli-validation * files_dir: stricter quality control * merge main commit d9a1a8a4c5377e63846390e2a63f7bbe0bc18c5e Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 17:33:22 2024 +0200 fix arxiv: pyproject.tomlÄ commit 38552d1e5ad970d094d87c35b42f1f661d084ed1 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 17:33:06 2024 +0200 fix naming conventions commit 6831e5b65d4b608056ecab0ae428897d0eddd9b7 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 17:07:16 2024 +0200 fix naming conventions commit a4aefd01ebcd9365336e78c6efba34fe3426bf32 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 17:00:42 2024 +0200 install all-internal-packages for devcontainer (pylint) commit 532332997178ab99bc23c78e22dbf0acb27f425d Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:57:18 2024 +0200 temporarily remove genai commit 520a71356c5767cfba5a133e788826680d70377d Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:28:13 2024 +0200 fixes commit 227bf715ae4c39b88d4fceb20353d6cac3787c7a Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:21:47 2024 +0200 record.change_entrytype(): 
run_quality_model() with set_prepared=True commit ba6ba8be481266c12cb6d0551ad08e4ea34b98ce Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:19:14 2024 +0200 record.remove_field_provenance_note(): also remove IGNORE:note commit 9f1c22814d5d34bfa7cf03b90b39d71e961ec597 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:18:37 2024 +0200 no name-format defect for abbreviated names commit 8df75d4dc2f846eda85ce4589b4a0388cacde0f0 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:18:02 2024 +0200 fix long line commit bbe831ddbc141da318c2c356a2f4b1ff61a31825 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:17:47 2024 +0200 update validation commit 12beaf5fdd4d874bc798d229896498850060da5a Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:03:27 2024 +0200 update sync Signed-off-by: Gerit Wagner <[email protected]> commit 520192c39ea21904d828fa6c4d863a5859c1a2d9 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:01:14 2024 +0200 update set_prepared in record.run_quality_model() commit ae108e57f14be0d20c4a275b40ed470457b1e262 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 16:00:42 2024 +0200 crossref: raise ServiceNotAvailableException in crossref_query() commit 9d882f338fb47a9fb200b98130c6504b46ef7709 Author: Gerit Wagner <[email protected]> Date: Sat Sep 28 15:58:17 2024 +0200 prep polish: reset original state commit ddea08ff0f0ffe79a2b40e560548f5ea9d1709c3 Author: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Tue Sep 24 06:16:49 2024 +0200 [pre-commit.ci] pre-commit autoupdate (#556) updates: - [github.com/astral-sh/ruff-pre-commit: v0.6.2 → v0.6.7](https://github.com/astral-sh/ruff-pre-commit/compare/v0.6.2...v0.6.7) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> commit 77c055e610f179fc76a0006f5ad11391d979111b Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Date: Tue Sep 24 06:16:39 
2024 +0200 update dependencies (#553) Co-authored-by: Poetry updater <[email protected]> commit d045a1da20b5dac6562aac34668c58666fb90e5b Author: Gerit Wagner <[email protected]> Date: Mon Sep 23 08:35:53 2024 +0200 fixes commit a112c204f219ad180f2471126fd685efa0f66a6d Author: Carlo <[email protected]> Date: Fri Sep 20 07:38:10 2024 +0200 add command how to verify git credentials commit 81ebfb4c6c2202191b28c9bb590dbf79509d30f8 Author: Gerit Wagner <[email protected]> Date: Thu Sep 19 19:25:34 2024 +0200 move packages asciinema to comments commit 700c80508455046f2407aa493b36ba2760ccbed1 Author: Gerit Wagner <[email protected]> Date: Thu Sep 19 07:50:11 2024 +0200 testing/fixes commit 260566119990226e07a1485bbead1c73900560a8 Author: Gerit Wagner <[email protected]> Date: Thu Sep 19 06:21:49 2024 +0200 tei_parser: set defaults commit cd4214116c90e70c369edb38eb3a2e8999a97b29 Author: Gerit Wagner <[email protected]> Date: Mon Sep 16 11:24:19 2024 +0200 upgrade: fix path-names in registry commit 36bfd051989681c992b4743963df1a5053924884 Author: Gerit Wagner <[email protected]> Date: Mon Sep 16 11:08:04 2024 +0200 cli: add instructions commit ecbad8fc09c548ff1e9507269b6bc09a2ce26769 Author: Gerit Wagner <[email protected]> Date: Sun Sep 15 10:51:48 2024 +0200 docs: add note on search udpates commit aa1b6766a03a02151c4838422995bc7fe2581374 Author: Gerit Wagner <[email protected]> Date: Sat Sep 14 12:00:44 2024 +0200 docs: drop asciinema of package --init Signed-off-by:t Gerit Wagner <[email protected]> commit e791cc50bc8fd25ef209e3bef77342f31efe4e6f Author: Gerit Wagner <[email protected]> Date: Sat Sep 14 09:41:21 2024 +0200 crossref: update printout commit ffd96287d953fd440dc26380ea3117ad32ce47ee Author: Gerit Wagner <[email protected]> Date: Fri Sep 13 14:02:14 2024 +0200 Reduce dependencies and switch to pydantic (#551) * move dependencies to arxiv and dedupe * pin numpy<2.0 * add bib-dedupe * switch to pydantic * switch to pydantic * update * sources. 
use relative filenames * update docs * fix mypy commit 1b84a37edd4767c96d46f17e765aa1d98a99a9d8 Author: Gerit Wagner <[email protected]> Date: Fri Sep 13 10:13:18 2024 +0200 do not build paper in silent mode commit 910dffac5f95df8e02b0efaf11c0a017de306f78 Author: Gerit Wagner <[email protected]> Date: Fri Sep 13 08:05:21 2024 +0200 add todo commit 7277c719f08c49274724ceb0f4af0b14c906f189 Author: Gerit Wagner <[email protected]> Date: Fri Sep 13 08:05:10 2024 +0200 fix import error: local_index.builder commit 026df759a150167b472fdd554683b8b491eb8105 Author: Gerit Wagner <[email protected]> Date: Wed Sep 11 07:09:15 2024 +0200 paper_md: stop container commit dc6ed4676acaf5a17eb9f5e8024a0ab848ae455a Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Date: Wed Sep 11 06:19:15 2024 +0200 update dependencies (#550) Co-authored-by: Poetry updater <[email protected]> commit a4d12eccffc938a5c91ab117cf35e24120f45965 Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 20:40:56 2024 +0200 package_manager: packages do not necessarily start with "colrev." 
commit 5d23d210d85c3ff0e16ae274d28690f0019e87d1 Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 14:36:08 2024 +0200 docker tests: remove intermediate containers commit 5142f4db641252e2bd667752cf3a9d7ae54e7cfa Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 14:09:21 2024 +0200 docs: update path commit 2d6c2c28bcb803b7777adc0be79e41e665af05fe Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 14:08:34 2024 +0200 fix docker test commit 8c8f1370abf2dcbb5ab659c658f4411d46250425 Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 13:42:36 2024 +0200 docs: fix path commit 7b8c31ed98f2b8e104b5f6e942e90f951160b280 Author: Gerit Wagner <[email protected]> Date: Tue Sep 10 09:23:19 2024 +0200 colrev project installation (making internal packages optional) (#530) * colrev project installation / make internal packages optional * drop optional extras from colrev * update * update gh-workflow Signed-off-by: Gerit Wagner <[email protected]> * format and docs * update upgrade * extract colrev-internal-package discovery * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * install packages after --add and init * add note on colrev install . 
to docs --------- Signed-off-by: Gerit Wagner <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> commit bea78ac507f362feb3b6126d01342a0d7f61ebec Author: Julian Prester <[email protected]> Date: Tue Sep 10 16:42:30 2024 +1000 Use posix paths for platform independence (#544) * Convert all paths for docker to posix * PRISMA: as_posix() * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * OCRMyPDF: as_posix() * fix prisma: path unlink() --------- Co-authored-by: Julian Prester <[email protected]> Co-authored-by: Gerit Wagner <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Gerit Wagner <[email protected]> commit cd944e1db31a4365a225672f7b082be95db5efef Author: Gerit Wagner <[email protected]> Date: Mon Sep 9 09:19:49 2024 +0200 europe_pmc: catch ValueError in lock.release() commit 685260be988edbb47d725cbfac298075ab46e011 Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Date: Sun Sep 8 14:44:24 2024 +0200 Update documentation (#548) * Update documentation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> commit 046c09ae0ff7f9a8f910e90a03c5585295797469 Author: Gerit Wagner <[email protected]> Date: Sun Sep 8 09:52:40 2024 +0200 Run Update documentation weekly to avoid many PRs commit 8fcb7e347ba1e29f26785d682e5c346be91a5237 Author: Gerit Wagner <[email protected]> Date: Sun Sep 8 09:52:15 2024 +0200 refactor: pylint messages commit 3ea50f6a377989081a06752a5a76492763385975 Author: Gerit Wagner <[email protected]> Date: Sun Sep 8 09:21:37 2024 +0200 crossref: catch Exception commit 163b5e10d88399fa88998ac04d63578e37dd51e3 Author: Gerit Wagner <[email 
protected]> Date: Sun Sep 8 09:19:55 2024 +0200 Update README.md commit 8f2460eb54e931d82d6b680107f05fd7736ff5a1 Author: Gerit Wagner <[email protected]> Date: Sun Sep 8 09:12:58 2024 +0200 update and rename workflows Signed-off-by: Gerit Wagner <[email protected]> commit c8bafc81c457e6007d33eb85635998eceec534bf Author: Gerit Wagner <[email protected]> Date: Fri Sep 6 08:26:07 2024 +0200 data endpoint. add and commit commit b460db7230858c9cced3592e7d8c2ed759e437ec Author: Gerit Wagner <[email protected]> Date: Fri Sep 6 08:25:48 2024 +0200 init: check Docker available commit 85ea667aa9571a118ef6688e86a78e64d77f0641 Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Date: Wed Sep 4 08:58:41 2024 +0200 update dependencies (#539) Co-authored-by: Poetry updater <[email protected]> commit 69d3eff5daf2df1a37816fc062eae73e6c11eecd Author: Gerit Wagner <[email protected]> Date: Wed Sep 4 07:41:42 2024 +0200 update relink_pdfs Signed-off-by: Gerit Wagner <[email protected]> commit fbd4d780c275fbdf04e409a3d7f0c70843f400e2 Author: Gerit Wagner <[email protected]> Date: Tue Sep 3 09:11:58 2024 +0200 add note on the order of pre/screening packages commit 09aa2173829f03b177c60f38b76acf4fcc5d8649 Author: Gerit Wagner <[email protected]> Date: Tue Sep 3 08:20:41 2024 +0200 update pdf text extraction commit 6588b6867806312c24bb3bfd7d6eb0fc80860747 Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:26:21 2024 +0200 europe_pmc: refactor commit 9877da003ef1085b4222c1d17bc0d364a00161c2 Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:25:31 2024 +0200 crossref: refactor commit 7ba35df3e0ecfe204b3c639fd062e086a06e080d Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:24:35 2024 +0200 update print output commit dad30adb357ac0f38c171d1a5b9997e3a275e576 Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:24:05 2024 +0200 RecordNotInIndexException: ID mandatory commit 45deaa89b222308c093505af150a392b97868db7 Author: Gerit 
Wagner <[email protected]> Date: Mon Sep 2 18:22:44 2024 +0200 pdf-get: fix symlinks after renamed dirs commit 91d5552929b026d1cd6fb0fe9bf11bf716b8a698 Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:22:25 2024 +0200 load: warn on non-standardized fields (instead of raising exception) commit 829e036c061b1fc2903785500387924894525ff4 Author: Gerit Wagner <[email protected]> Date: Mon Sep 2 18:20:57 2024 +0200 paper_md: fix path commit 5b4079e6cc0a66877a0f82c15d9683a65632423e Author: Gerit Wagner <[email protected]> Date: Sat Aug 31 18:59:57 2024 +0200 sort SearchTypes commit 84d7901208ed067e40443f2ec55bfd5a3a712a8f Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Date: Sat Aug 31 16:20:02 2024 +0200 Update documentation (#528) Co-authored-by: github-actions <[email protected]> * update upgrade of gh-actions * doi_org: use re instead of bs4 * record.has_fatal_quality_defects(): catch doi/numbers Signed-off-by: Gerit Wagner <[email protected]> * update GROBID * fix tei_parser * search_api_feed: make _add_record_to_feed public and remove redundant _get_prev_feed_record * colrev.files_dir: update rerun * fix linter messages: colrev.files_dir * update grobid-0.8.1 tests * update dependencies (#560) Co-authored-by: Poetry updater <[email protected]> * Update documentation (#561) Co-authored-by: github-actions <[email protected]> * colrev.github: support topic field, use inquirer * update dependencies (#562) Co-authored-by: Poetry updater <[email protected]> * update links/docs * colrev.files_dir: catch exception * update docs * colrev.arxiv: fix feedparser dep * colrev.files_dir: ignore pylint * release 0.13.0 * update doi in CITATION.cff * colrev.files_dir: catch connection error * add comparison * update README.md * update overview * update readme * update comparison * update docs * update docs/index * udpate docs/index * colrev-packages: if direct-url does not start with file://, it is not local * colrev-packages: shallow 
clone * init: install package before instantiating * update overview * add python venv instructions to getting started page * install setuptools with --break-system-packages * drop click_completion, update zope.interface * fix pylint warnings * codespaces: py3.12 * colrev init: add not to install internal_packages * temporarily deactivate is_installed() * add "colrev install all-packages" to getting started install steps * fix previous wrong colrev package command * set python version for workflow/tests * fix pylint warnings * package-manager: import pypi packages containing colrev in package-name * add bs4 dependency for docs * update dependencies * deps: add mypy to dev * fix: package name validation * fix: do not call install() in init * feat: create two commits upon init allows users to see one commit with the settings and data packages separate from the commit with the general file setup (e.g., gitignore, LICENSE) * deps: update bib-dedupe and Python (3.10-3.12) - fixes vulnerability in setuptools Squashed commit of the following: commit 0001f84e84f6d874851b2e0936854b54bfd2057a Author: Gerit Wagner <[email protected]> Date: Tue Oct 22 15:52:28 2024 +0200 add note on Python3.13 commit c3e79f9112217beff85718ed1f6e3922831077d0 Author: Gerit Wagner <[email protected]> Date: Tue Oct 22 15:08:31 2024 +0200 tests: specify python version for pipx install poetry commit 9745ae0f4bb06a6081f03dd32a167677f99ba914 Author: Gerit Wagner <[email protected]> Date: Mon Oct 21 09:08:58 2024 +0200 tests: drop python 3.13. 
full test commit 306f023b7b64a5c60f1a83dd19fa4437cf8ac482 Author: Gerit Wagner <[email protected]> Date: Mon Oct 21 07:35:40 2024 +0200 deps: update lxml commit 0c0a3596371704809d12fd0ee6c1150651cf1005 Author: Gerit Wagner <[email protected]> Date: Mon Oct 21 06:52:57 2024 +0200 deps: update bib-dedupe (Python, numpy) * docs: reduce badges * add installation of diverse internal packages * restructure installation of CoLRev and pre-commit hooks * update * dev: remove setup.md from devcontainer * docs: fix typo * update dependencies * [pre-commit.ci] pre-commit autoupdate updates: - [github.com/pre-commit/pre-commit-hooks: v4.6.0 → v5.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v4.6.0...v5.0.0) - [github.com/psf/black-pre-commit-mirror: 24.8.0 → 24.10.0](https://github.com/psf/black-pre-commit-mirror/compare/24.8.0...24.10.0) - [github.com/asottile/reorder-python-imports: v3.13.0 → v3.14.0](https://github.com/asottile/reorder-python-imports/compare/v3.13.0...v3.14.0) - [github.com/asottile/pyupgrade: v3.17.0 → v3.19.0](https://github.com/asottile/pyupgrade/compare/v3.17.0...v3.19.0) - [github.com/pre-commit/mirrors-mypy: v1.11.2 → v1.13.0](https://github.com/pre-commit/mirrors-mypy/compare/v1.11.2...v1.13.0) - [github.com/astral-sh/ruff-pre-commit: v0.6.7 → v0.7.1](https://github.com/astral-sh/ruff-pre-commit/compare/v0.6.7...v0.7.1) * deps: add importlib_metadata for dash * deps: poetry updates manually (not as a cronjob) * crossref,dblp: update formatting * docs: include colrev-scidb * docs: update colrev-scidb * Update documentation (#573) * Update documentation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update documentation (#575) * Update documentation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see 
https://pre-commit.ci --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * fix: entering author details in package --init * ui: add note for colrev package --init * fix: crossref - use cursor method for large queries * rename test files * fix: package_manager init built-in * release 0.13.1 * docs: SearchSource * version: drop pre-python3.8 * crossref: add get_dois() * format * deps: update pre-commit to prevent InvalidManifest error * search: catch ModuleNotFound * update synergy-datasets * release 0.13.2 * update mypy-python version * deps: update to bib-dedupe (silenced pandas warnings) * fix: europe_pmc empty-list * update pre-commit hooks * remove pylint flags * synergy: extract method * crossref: rename attribute * update docs * extract colrev.sync to separate PyPI package * extract hooks-update to colrev-sync * docs: remove hooks.update * fix: scope-prescreen optional with None * add colrev convert (cli) * update gh-action workflows * gh-actions: update deploy * update record_id_setter (for colrev convert) * colrev.plos package (#594) * Package PLOS created * Creating SearchSource Plos * Update plos_search_source.py * Fixing upload of SearchSource Plos * Skeleton of package plos * Fixing pushing to git * A few errors fixed * Main branch update with run error fixed * Fixing push * Fixing push 2 * Created file record_transformer and sttarted method json_to_record * Implemented some methods for the json_to_record * Record_transformer_class * Get variables from PLOS result * merged other branch, record transformer class completed * Add _prep_plos_record method for PLOS record preparation * Implemented method scope_excluded * merge branches to present * Fixed errors during meeting * Added save line in run_api_search * The format of the numbers and authors has changed. The abstract needs to be reviewed. 
* load implemented * unit_test * Some errors fixed * Documentation. Fixed errors from unit_test. Fixed error YEAR * Inputs and prints removed * Merged prepare branch into this branch * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * First test passed. (review method _item_to_record) * Unit-Test completed * Deleted useless files * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update format of publication_date filter * remove logging settings * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixed pre-commit warnings * Module search deleted * undo unrelated changes * undo change * remove article_type (not returned in item) * reinclude pylint/crossref * copy README for import to CoLRev docs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * drop TOC * remove prep endpoint * import colrev.plos into colrev docs * update contributors --------- Co-authored-by: julialopezmarti <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Gerit Wagner <[email protected]> * prospero searchsource (#586) * new branch prospero-serachsource created, prospero package initialised and installed * created python file for functionalities, defined class properties * trivial changes added to prospero.py * Add requirements.txt for dependency management * Add Selenium setup, test script, and ChromeDriver * Add working load and add_endpoint methods for ProsperoSearchSource * search_func extract meta data, not tested * expand search_func * add test file * extract the content from columns * add into search method amount of articles from prospero * add into search method amount of articles from prospero * Update Prospero search functionality and tests * allow user input for search bar * save records into bib 
file * Fix indexing issues and improve BibTeX export in Prospero search * change variable names for better clarity, driver navigates between pages * Update ProsperoSearchSource with improved load methods and configurations * seperate smaller methods for better clarity * updates files * Fix add_endpoint method in ProsperoSearchSource for unique filenames and API compatibility * add run_api_search method * change naming for better clarity * rearrange import block for better clarity * change file name for better clarity * add heuristic method * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * small fixes and saving a basic bib file after blocks getting deleted! * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * load language, authors into bib file, fix repeated loading error * add run_api_search method * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * readjust page counter * add trivial changes * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update prospero.py: Fixed KeyError, improved search functionality, and removed unwanted lines * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove source_identifier property * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change year field * change prospero_status * add documentation in README file * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update documentation * update documentation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * poetry add selenium * add selenium dependencies * add search_parameters * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see 
https://pre-commit.ci * update search_parameters * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove bibtexparser dependency and rely on feed.save * added names of all members in toml file * Removed unnecessary files: .vscode/settings.json and requirements.txt * minimize console output, delete old test files * delete old selenium test files * Fix some flake8/mypy issues in ProsperoSearchSource * use review_manager.logger * fix several pylint issues * fix pylint * FIX None-Working Code * Fix: Resolve pre-commit warnings in prospero.py * add pylint disable * add pylint disable for prospero.py * Fix: Resolve pre-commit warnings in get_record_info.py * copy README for colrev docs * revise code * refactor * refactor: extract api module * update docs * move URL_PREFIX to class attribute * update contributors --------- Co-authored-by: ammar-uni <[email protected]> Co-authored-by: OlhaKomashevska <[email protected]> Co-authored-by: komashevska <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Gerit Wagner <[email protected]> Co-authored-by: Gerit Wagner <[email protected]> * Update documentation (#603) * Update documentation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * fix: ui-cli: detect SearchSource * fix: search/missing query * update docs: repo_name * deps: drop selenium for colrev core * replace pkg_resources with importlib (#605) * replace pkg_resources with importlib * fix locate_file * Replace pybtex (#606) * integrate bib-parser * revisions * bib-fixes: optional * fix: dataset.load_records_dict() with empty records * print details for NotImplementedError * refactor Logger (bib, name_formatter, ...) 
* reformat * refactor bib loader * integrate header-only * loader: print non-unique IDs * drop unnecessary code * update docs * refactor * update coverage-badge * deps: remove importlib_metadata * [pre-commit.ci] pre-commit autoupdate (#607) * [pre-commit.ci] pre-commit autoupdate updates: - [github.com/psf/black-pre-commit-mirror: 24.10.0 → 25.1.0](https://github.com/psf/black-pre-commit-mirror/compare/24.10.0...25.1.0) - [github.com/asottile/pyupgrade: v3.19.0 → v3.19.1](https://github.com/asottile/pyupgrade/compare/v3.19.0...v3.19.1) - [github.com/pre-commit/mirrors-mypy: v1.13.0 → v1.15.0](https://github.com/pre-commit/mirrors-mypy/compare/v1.13.0...v1.15.0) - [github.com/astral-sh/ruff-pre-commit: v0.7.1 → v0.9.6](https://github.com/astral-sh/ruff-pre-commit/compare/v0.7.1...v0.9.6) * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * update docs * fix pylint warnings in cli * consistently update settings in add_package_to_settings * fix pylint warning in plos * update scope_prescreen and docs * docs: update files_dir * update docs * update cli handling * fix: loader accept empty fields * update * Replace zope-interfaces by abstract base classes (abc) (#610) * review-types: zope to abc * use baseclasses instead of zope interfaces * fix mypy/pylint errors * fix method * notes * fixes * code cleanup * rename interface to base-classes * update documentation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * test package --init --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * fix: drop repoze.sphinx.autointerface from docs/conf * update docs * zope cleanup * poetry to uv (#611) * poetry to uv * fix installation * uv: create and use * set paths for uv venv * test * update pyproject.toml * update * updates * 
uv-install * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update * install together * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restrict python version * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * select package manager for instalation Signed-off-by: Gerit Wagner <[email protected]> * update * update installation * update * update * update * Update pyproject.toml * update pyproject.tomls * update pyproject tomls * update authors * add src again * use tool.hatch.build.targets.wheel Signed-off-by: Gerit Wagner <[email protected]> * update wheel * updates for uv * update pyproject tomls * update * package_manager: do not reinstall * fix pyproject.tomls * do not reinstall * update * update * update * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update coverage * package_manager: update installed-packages * update workflows * drop poetry update * update docs-workflows * update publish-pypi workflow * fix tests * tests * update * tests * update * install dev dependencies * update * update * update * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update * update * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update * update * update * update * update * update * update * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update * update * update * update * update * update * udpate * update * update * update * update * update * update * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update --------- Signed-off-by: 
Gerit Wagner <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * docs: run sphinx in uv * fix pylint warning in scope-prescreen * revise package_manager * remove comment * remove note * update pyproject tomls * update publishing workflow * fix pyproject.tomls/dependencies * release 0.14.0 * update doi, release checklist * update docs * update docs * update README/release-checklist * fix link * update nr extensions * update PLOS api/docs * update covert: ris * update ris writer * update load-utils: load_df * fixes * temporarily remove genai * Update documentation (#575) * Update documentation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Export instead of print * fixes * temporarily remove genai * Update documentation (#575) * Update documentation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: github-actions <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Update pyproject.toml * prescreen: use input() in package instead of operation * fix * fix * fix * switch from zope-interface to ABC * udpate package * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Gerit Wagner <[email protected]> Co-authored-by: Julian Prester <[email protected]> Co-authored-by: Gerit Wagner <[email protected]> Co-authored-by: Gerit Wagner <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Poetry 
updater <[email protected]> Co-authored-by: Carlo <[email protected]> Co-authored-by: olgagirona <[email protected]> Co-authored-by: julialopezmarti <[email protected]> Co-authored-by: trathienphuc-tran <[email protected]> Co-authored-by: ammar-uni <[email protected]> Co-authored-by: OlhaKomashevska <[email protected]> Co-authored-by: komashevska <[email protected]>
1 parent 1ef0cca commit ea1c939

File tree

4 files changed

+196
-15
lines changed

colrev/packages/colrev_cli_prescreen/src/prescreen_cli.py

Lines changed: 0 additions & 15 deletions
@@ -76,21 +76,6 @@ def _fun_cli_prescreen(
         stat_len: int,
         padding: int,
     ) -> bool:
-        if self.review_manager.settings.prescreen.explanation == "":
-            print(
-                f"\n{Colors.ORANGE}Provide a short explanation of the prescreen{Colors.END} "
-                "(why should particular papers be included?):"
-            )
-            print(
-                'Example objective: "Include papers that focus on digital technology."'
-            )
-            self.review_manager.settings.prescreen.explanation = input("")
-            self.review_manager.save_settings()
-        else:
-            print("\nIn the prescreen, the following process is followed:\n")
-            print(" " + self.review_manager.settings.prescreen.explanation)
-            print()
-
         self.review_manager.logger.debug("Start prescreen")
 
         if 0 == stat_len:
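The block removed from the CLI prescreen asks for the inclusion criterion once, saves it, and reuses it on later runs; the same block reappears in the GenAI prescreen added by this commit. A minimal sketch of that ask-once-then-reuse pattern follows. It is not CoLRev code: `resolve_explanation`, the plain `settings` dict, and the injectable `ask` callable are hypothetical stand-ins for `review_manager.settings.prescreen.explanation`, `save_settings()`, and `input()`.

```python
from typing import Callable


def resolve_explanation(settings: dict, ask: Callable[[str], str] = input) -> str:
    """Return the stored inclusion criterion, asking for it only when it is empty."""
    if settings.get("explanation", "") == "":
        # First run: ask once and store the answer (the real code then saves the settings).
        settings["explanation"] = ask(
            "Provide a short explanation of the prescreen "
            "(why should particular papers be included?): "
        )
    return settings["explanation"]


# Non-interactive usage: a fake `ask` replaces input(), e.g. in tests or CI.
settings = {"explanation": ""}
first = resolve_explanation(settings, ask=lambda _prompt: "Include papers on digital technology.")
# Second call reuses the stored value, so `ask` is never invoked.
second = resolve_explanation(settings, ask=lambda _prompt: "never used")
print(first == second)
```

Passing the prompt function as a parameter is one possible way to keep such code usable where no terminal is attached, since a bare `input()` call blocks or raises `EOFError` there.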

colrev/packages/genai/README.md

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+## Summary
+
+Gen-AI package.
+
+## prescreen
+
+docs...
+
+## Links
+
+...
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+[project]
+name = "colrev.genai"
+description = "CoLRev package for GenAI"
+version = "0.1.0"
+license = "MIT"
+authors = [
+    { name = "Julian Prester", email = "[email protected]" },
+    { name = "Gerit Wagner", email = "[email protected]" }
+]
+requires-python = ">=3.8, <4"
+dependencies = [
+    "litellm>=1.37.0",
+    "pydantic>=2.7.1",
+]
+
+[project.urls]
+repository ="https://github.com/CoLRev-Environment/colrev/tree/main/colrev/packages/genai"
+
+[tool.hatch.build.targets.wheel]
+packages = ["src"]
+
+[tool.colrev]
+colrev_doc_description = "GenAI"
+colrev_doc_link = "README.md"
+search_types = []
+
+[project.entry-points.colrev]
+prescreen = "colrev.packages.genai.src.genai_prescreen:GenAIPrescreen"
+screen = "colrev.packages.genai.src.genai_screen:GenAIScreen"
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
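The `[project.entry-points.colrev]` table maps an endpoint type to a `module:attribute` string. How CoLRev's package manager consumes these entries is not shown in this diff, so the following is only a standard-library illustration of how such strings are split. `ENTRY_POINTS` and `describe_entry_points` are hypothetical names; the two values are copied from the table above.

```python
from importlib.metadata import EntryPoint

# Entry-point strings copied from the [project.entry-points.colrev] table.
ENTRY_POINTS = {
    "prescreen": "colrev.packages.genai.src.genai_prescreen:GenAIPrescreen",
    "screen": "colrev.packages.genai.src.genai_screen:GenAIScreen",
}


def describe_entry_points(entries, group="colrev"):
    """Split each "module:attr" value into (name, module path, class name)."""
    described = []
    for name, value in entries.items():
        entry_point = EntryPoint(name=name, value=value, group=group)
        # .module and .attr are parsed from the string; nothing is imported here.
        described.append((entry_point.name, entry_point.module, entry_point.attr))
    return described


for name, module, attr in describe_entry_points(ENTRY_POINTS):
    print(f"{name}: class {attr} in module {module}")
```

For an installed distribution, `importlib.metadata.entry_points(group="colrev")` (Python 3.10+) returns the registered entries, and calling `.load()` on one imports the module and returns the named class.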
Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
+#! /usr/bin/env python
+"""Prescreen based on GenAI"""
+from __future__ import annotations
+
+import csv
+from pathlib import Path
+from typing import ClassVar
+
+import pandas as pd
+from litellm import completion
+from pydantic import BaseModel
+from pydantic import Field
+
+import colrev.package_manager.package_base_classes as base_classes
+import colrev.package_manager.package_manager
+import colrev.package_manager.package_settings
+import colrev.record.record
+from colrev.constants import Colors
+from colrev.constants import RecordState
+
+
+# pylint: disable=too-few-public-methods
+# pylint: disable=duplicate-code
+
+
+class PreScreenDecision(BaseModel):
+    """
+    Class for a prescreen
+    """
+
+    SYSTEM_PROMPT: ClassVar[str] = (
+        "You are an expert screener of scientific literature. "
+        "You are tasked with identifying relevant articles for a literature review. "
+        "You are provided with the metadata of an article and are asked to determine "
+        "whether the article should be included in the review based on an inclusion criterion."
+    )
+    included: bool = Field(
+        description="Whether the article should be included in the review "
+        + "based on the inclusion criterion."
+    )
+    explanation: str = Field(description="Explanation of the inclusion decision.")
+
+
+class GenAIPrescreen(base_classes.PrescreenPackageBaseClass):
+    """GenAI-based prescreen"""
+
+    ci_supported: bool = Field(default=True)
+    export_todos_only: bool = True
+
+    class GenAIPrescreenSettings(
+        colrev.package_manager.package_settings.DefaultSettings, BaseModel
+    ):
+        """Settings for GenAIPrescreen"""
+
+        # pylint: disable=invalid-name
+        # pylint: disable=too-many-instance-attributes
+
+        endpoint: str
+        model: str = "gpt-4o-mini"
+
+    settings_class = GenAIPrescreenSettings
+
+    def __init__(
+        self,
+        *,
+        prescreen_operation: colrev.ops.prescreen.Prescreen,
+        settings: dict,
+    ) -> None:
+        self.review_manager = prescreen_operation.review_manager
+        self.settings = self.settings_class(**settings)
+        self.prescreen_decision_explanation_path = (
+            self.review_manager.paths.prescreen
+            / Path("prescreen_decision_explanation.csv")
+        )
+
+    # pylint: disable=unused-argument
+    def run_prescreen(
+        self,
+        records: dict,
+        split: list,
+    ) -> dict:
+        """Prescreen records based on GenAI"""
+
+        if self.review_manager.settings.prescreen.explanation == "":
+            print(
+                f"\n{Colors.ORANGE}Provide a short explanation of the prescreen{Colors.END} "
+                "(why should particular papers be included?):"
+            )
+            print(
+                'Example objective: "Include papers that focus on digital technology."'
+            )
+            self.review_manager.settings.prescreen.explanation = input("")
+            self.review_manager.save_settings()
+        else:
+            print("\nIn the prescreen, the following process is followed:\n")
+            print(" " + self.review_manager.settings.prescreen.explanation)
+            print()
+
+        # API key needs to be set as an environment variable
+        inclusion_criterion = self.review_manager.settings.prescreen.explanation
+
+        screening_decisions = []
+
+        for record_dict in records.values():
+            record = colrev.record.record.Record(record_dict)
+            response = completion(
+                model=self.settings.model,
+                max_tokens=1024,
+                messages=[
+                    {
+                        "role": "user",
+                        "content": f"{PreScreenDecision.SYSTEM_PROMPT}\n\n"
+                        + f"INCLUSION CRITERION:\n\n{inclusion_criterion}\n\n"
+                        + f"METADATA:\n\n{record}",
+                    }
+                ],
+                response_format=PreScreenDecision,
+            )
+            prescreen_decision = PreScreenDecision.model_validate_json(
+                response.choices[0].message.content
+            )
+            if prescreen_decision.included:
+                record.set_status(RecordState.rev_prescreen_included)
+            else:
+                record.set_status(RecordState.rev_prescreen_excluded)
+
+            screening_decisions.append(
+                {
+                    "Record": record.get_data()["ID"],
+                    "Inclusion/Exclusion Decision": (
+                        "Included" if prescreen_decision.included else "Excluded"
+                    ),
+                    "Explanation": prescreen_decision.explanation,
+                }
+            )
+
+        self.review_manager.paths.prescreen.mkdir(parents=True, exist_ok=True)
+        screening_decisions_df = pd.DataFrame(screening_decisions)
+        screening_decisions_df.to_csv(
+            self.prescreen_decision_explanation_path, index=False, quoting=csv.QUOTE_ALL
+        )
+        self.review_manager.logger.info(
+            f"Exported prescreening decisions to {self.prescreen_decision_explanation_path}"
+        )
+
+        self.review_manager.dataset.save_records_dict(records)
+        self.review_manager.dataset.create_commit(
+            msg="Pre-screen (GenAI)",
+            manual_author=False,
+        )
+
+        return records
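The last two steps of `run_prescreen` parse the model's JSON reply into a decision and export one CSV row per record with every field quoted. The sketch below reproduces only those two steps with the standard library so it runs without `litellm`, `pydantic`, `pandas`, or an API key. It swaps in a dataclass for the pydantic `PreScreenDecision` model and `csv.DictWriter` for `DataFrame.to_csv`; `Decision`, `parse_decision`, `decisions_to_csv`, the sample reply, and the record ID `Smith2020` are all hypothetical.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass
class Decision:
    """Stdlib stand-in for the pydantic PreScreenDecision model."""

    included: bool
    explanation: str


def parse_decision(raw_json: str) -> Decision:
    """Parse a JSON reply and check the two expected fields and their types."""
    payload = json.loads(raw_json)
    included = payload["included"]
    explanation = payload["explanation"]
    if not isinstance(included, bool) or not isinstance(explanation, str):
        raise ValueError(f"Unexpected decision payload: {payload!r}")
    return Decision(included=included, explanation=explanation)


def decisions_to_csv(rows: list) -> str:
    """Write decision rows as CSV text with every field quoted."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["Record", "Inclusion/Exclusion Decision", "Explanation"],
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical reply; in the commit it comes from response.choices[0].message.content.
sample_reply = '{"included": false, "explanation": "Not about digital technology, off-topic."}'
decision = parse_decision(sample_reply)
rows = [
    {
        "Record": "Smith2020",  # hypothetical record ID
        "Inclusion/Exclusion Decision": "Included" if decision.included else "Excluded",
        "Explanation": decision.explanation,
    }
]
print(decisions_to_csv(rows))
```

Quoting every field, as the commit does with `csv.QUOTE_ALL`, keeps free-text explanations that contain commas in a single column.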
