Skip to content

Allow build_unit_cell to return atom site labels #38

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Mar 12, 2025

Conversation

janbridley
Copy link
Collaborator

@janbridley janbridley commented Mar 11, 2025

Description

  • Added the additional_columns arg to cif.build_unit_cell to allow for replication of element symbols (and other data from the array).
  • Ensured consistent ordering of unit cell positions by sorting in order of symmetry operations
  • Fixed type hinting in a few docstrings

TODOs

  • Allow for additional data return
  • Update return types in docstring
  • Add tests for new functionality
  • Verify this meets use cases

NOTE

parsnip ignores fractional occupancies while building unit cells. This is different than ASE, which always selects the most dominant species. The position data is unaffected, but users extracting additional columns should be aware that the selection of a particular species is not guaranteed.

Motivation and Context

As requested in #37, it would be useful to (optionally) replicate additional data alongside the atomic positions. This functionality has been added with the additional_columns arg in build_unit_cell, which allows returns associated data for each replicate atom. The current code returns an array for the positions and the additional data: while I considered a structured array, doing so makes indexing columns nontrivial.

Multi-array return (this PR)

(array([['Si', '0.01013'],
       ['Si', '0.01013'],
       ...
       ['O', '0.00127'],
       ['O', '0.00127'],
       ['O', '0.00127']], dtype='<U7'), 
array([[0.    , 0.1522, 0.25  ],
       [0.    , 0.8478, 0.75  ],
       ...
       [0.2336, 0.1245, 0.5814],
       [0.7664, 0.8755, 0.4186],
       [0.2664, 0.3755, 0.0814]]))

Structured array (alternate option)

[('Si', '0.01013', 0.    , 0.1522, 0.25  )
 ('Si', '0.01013', 0.    , 0.8478, 0.75  )
 ...
 ('O', '0.00127', 0.2336, 0.1245, 0.5814)
 ('O', '0.00127', 0.7664, 0.8755, 0.4186)
 ('O', '0.00127', 0.2664, 0.3755, 0.0814)]

Resolves #37

Types of Changes

  • Documentation update
  • Bug fix
  • New feature
  • Breaking change

Checklist:

  • I am familiar with the Development Guidelines
  • The changes introduced by this pull request are covered by existing or newly introduced tests.
  • I have updated the changelog and added my name to the credits.

@janbridley
Copy link
Collaborator Author

@cajfisher Please let me know if this fulfills your use case! Add additional_columns="_atom_site_label" to your calls to build_unit_cell to get an auxiliary array of the labels. By default, additional_columns is None and the method functions as it did previously.

@janbridley janbridley marked this pull request as ready for review March 12, 2025 14:58
@janbridley janbridley merged commit fa67199 into main Mar 12, 2025
12 checks passed
@janbridley janbridley deleted the feature/atom-labels branch March 13, 2025 03:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Full set of atom labels when building unit cell?
1 participant