@omkarrr2533
Contributor

@omkarrr2533 omkarrr2533 commented Nov 30, 2025

Closes #14451

This PR adds support for parsing arXiv HTML URLs (e.g., https://arxiv.org/html/2511.01348v2). The ARXIV_PREFIX regex pattern has been updated to include html alongside the existing abs and pdf patterns, and corresponding test cases have been added.
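To illustrate the effect of the pattern change described above, here is a minimal sketch that checks the PR's example URL against the old and new prefix strings. The two prefix strings are copied from the diff in this PR; the class name `ArxivPrefixSketch` and the helper `startsWithPrefix` are hypothetical and are not part of JabRef.

```java
import java.util.regex.Pattern;

public class ArxivPrefixSketch {
    // Prefix strings copied verbatim from the diff in this PR.
    static final String OLD_PREFIX = "http(s)?://arxiv.org/(abs|pdf)/|arxiv|arXiv";
    static final String NEW_PREFIX = "http(s)?://arxiv.org/(abs|pdf|html)/|arxiv|arXiv";

    // Hypothetical helper (not JabRef API): true if the input begins with the prefix.
    // The prefix is wrapped in a group so that '^' applies to every alternative.
    static boolean startsWithPrefix(String prefix, String input) {
        return Pattern.compile("^(" + prefix + ")").matcher(input).find();
    }

    public static void main(String[] args) {
        String url = "https://arxiv.org/html/2511.01348v2";
        System.out.println("old: " + startsWithPrefix(OLD_PREFIX, url)); // prints "old: false"
        System.out.println("new: " + startsWithPrefix(NEW_PREFIX, url)); // prints "new: true"
    }
}
```

Existing `abs` and `pdf` URLs are matched by both patterns, so the change only widens what is accepted.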

Steps to test

  1. Open JabRef
  2. Click on "Add entry using..."
  3. Paste an arXiv HTML URL: https://arxiv.org/html/2511.01348v2
  4. Verify that "Enter Identifier" is selected (not "Choose Entry Type")
  5. Verify the entry is created successfully

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • I manually tested my changes in running JabRef (always required)
  • I added JUnit tests for changes (if applicable)
  • [/] I added screenshots in the PR description (if change is visible to the user)
  • [/] I described the change in CHANGELOG.md in a way that is understandable for the average user (if change is visible to the user)
  • [/] I checked the user documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request updating file(s) in https://github.com/JabRef/user-documentation/tree/main/en.

 

- Added 'html' to ARXIV_PREFIX regex pattern to recognize arxiv.org/html/ URLs
- Added test cases for HTML URLs with HTTP/HTTPS and with/without version numbers

Fixes JabRef#14451
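The four test combinations mentioned above (HTTP/HTTPS, with and without a version number) could be sketched roughly as follows. This is not the JUnit code from the PR: the class `ArxivHtmlUrlCases`, the helpers `parseId` and `parseVersion`, and the simplified identifier/version tail of the pattern are assumptions made for illustration. Only the prefix string and the example identifier are taken from this PR.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ArxivHtmlUrlCases {
    // Prefix copied from the PR diff.
    static final String ARXIV_PREFIX = "http(s)?://arxiv.org/(abs|pdf|html)/|arxiv|arXiv";
    // Assumption: a simplified tail for new-style identifiers with an optional version.
    static final String ID_TAIL = "(?<id>\\d{4}\\.\\d{4,5})(?:v(?<version>\\d+))?";
    static final Pattern URL_PATTERN = Pattern.compile("(?:" + ARXIV_PREFIX + ")" + ID_TAIL);

    // Hypothetical helper: the identifier without version, if the whole input matches.
    static Optional<String> parseId(String input) {
        Matcher m = URL_PATTERN.matcher(input.trim());
        return m.matches() ? Optional.of(m.group("id")) : Optional.empty();
    }

    // Hypothetical helper: the version number, if the input matches and carries one.
    static Optional<String> parseVersion(String input) {
        Matcher m = URL_PATTERN.matcher(input.trim());
        return m.matches() ? Optional.ofNullable(m.group("version")) : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed URL variants covering both schemes, with and without a version.
        List<String> urls = List.of(
                "https://arxiv.org/html/2511.01348v2",
                "http://arxiv.org/html/2511.01348v2",
                "https://arxiv.org/html/2511.01348",
                "http://arxiv.org/html/2511.01348");
        for (String url : urls) {
            System.out.println(url + " -> id=" + parseId(url) + ", version=" + parseVersion(url));
        }
    }
}
```

In a JUnit test suite, the same table of inputs would typically be fed to a parameterized test rather than a `main` loop.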
@github-actions
Contributor

Hey @omkarrr2533!

Thank you for contributing to JabRef! Your help is truly appreciated ❤️

We have automated checks in place, based on which you will soon get feedback if any of them are failing. In a while, maintainers will also review your contribution. Once that happens, you can go through their comments in the "Files changed" tab and act on them, or reply to the conversation if you have further inputs.

Please re-check our contribution guide in case of any other doubts related to our contribution workflow.

<JavaCodeStyleSettings>
<option name="DO_NOT_WRAP_AFTER_SINGLE_ANNOTATION" value="true" />
<option name="SPACE_INSIDE_ONE_LINE_ENUM_BRACES" value="true" />
<option name="LAYOUT_ON_DEMAND_IMPORT_FROM_SAME_PACKAGE_FIRST" value="false" />
Member

revert changes here

@koppor
Member

koppor commented Nov 30, 2025

@omkarrr2533 Why no CHANGELOG.md update? Was it too hard for you?

private static final Logger LOGGER = LoggerFactory.getLogger(ArXivIdentifier.class);

- private static final String ARXIV_PREFIX = "http(s)?://arxiv.org/(abs|pdf)/|arxiv|arXiv";
+ private static final String ARXIV_PREFIX = "http(s)?://arxiv.org/(abs|pdf|html)/|arxiv|arXiv";
Member

Not sure about alphabetical ordering?! I changed it.

Contributor Author

Actually, that's my first contribution; I tried my best.

Member

You did very well. You had a quick reaction time and kept communicating with the maintainers.
If something is addressed or modified by us, it's mostly something simple that we fix quickly in the web interface to get the PR ready to be merged. Nothing to be concerned about.

We would love to see more from you. Thank you!

Member

Just make sure you carefully read the contribution guidelines and stay truthful in what you code and what you write.

Contributor Author

Sure.

@github-actions github-actions bot added the "status: changes-required" label Nov 30, 2025
@jabref-machine
Collaborator

You ticked that you modified CHANGELOG.md, but no new entry was found there.

If you made changes that are visible to the user, please add a brief description along with the issue number to the CHANGELOG.md file. If you did not, please replace the cross ([x]) by a slash ([/]) to indicate that no CHANGELOG.md entry is necessary. More details can be found in our Developer Documentation about the changelog.

@calixtus calixtus added this pull request to the merge queue Nov 30, 2025
Merged via the queue into JabRef:main with commit bc80a34 Nov 30, 2025
50 of 51 checks passed
Siedlerchr added a commit to EricW123/jabref that referenced this pull request Dec 1, 2025
…ue676

* upstream/main: (227 commits)
  Adapt welcome message (JabRef#14487)
  Add message when closing a PR
  Add collection of "all" AI features (JabRef#14438)
  Trigger conflict-detection on push on main (JabRef#14479)
  Add unassigned_comment on comment issue
  New Crowdin updates (JabRef#14483)
  Tweak labels also at merge conflicts
  Merge --remove-label and --add-label
  Remove SmartGroup and refactor groups factory (JabRef#14398)
  more debug
  Support html when parsing arXiv identifiers (JabRef#14474)
  Add debug and fix run
  Remove "ready-for-review" if changes are required
  Have label move as last step of comment
  Add pr number to output
  change files to file(s) (JabRef#14465)
  Add CDS archive (JabRef#14476)
  Fix adapting labels (JabRef#14477)
  Chore(deps): Bump jablib/src/main/resources/csl-styles (JabRef#14468)
  Chore(deps): Bump net.bytebuddy:byte-buddy in /versions (JabRef#14472)
  ...
@omkarrr2533 omkarrr2533 deleted the fix-arxiv-html=import branch December 2, 2025 03:47
Siedlerchr added a commit to tejjgv/jabref that referenced this pull request Dec 2, 2025
* upstream/main: (102 commits)
Labels

first contrib
status: changes-required (Pull requests that are not yet complete)

Development

Successfully merging this pull request may close these issues.

Support html when parsing arXiv identifiers

5 participants