This module requires the third-party module drupal/country
If you're using composer add ilri/cgspace_importer to your composer.json file under the repositories section.
Example:
{
"repositories": {
"ilri/cgspace_importer": {
"type": "vcs",
"url": "[email protected]:ilri/dspace-drupal-module.git"
}
}
}If you correctly added the repository as described above you can now run in your shell:
composer require ilri/cgspace_importerIf everything has been done correctly, the latest version of the module is downloaded and added to your installation.
You can now enable the module with Drush:
drush en cgspace_importeror directly from the admin/modules page.
Once the module is installed you need to configure it at the general settings page admin/config/cgspace/settings/general:
-
Endpoint: the CGSpace endpoint
Ex: https://dspace7test.ilri.org -
Importer ID: Your identifier for statistical purposes that will be added at any Request to CGSpace
Ex: CGIAR -
CGSpace discover query Page Size: The CGSpace API Page size for the queries, you can leave it at its default value or tune it depending on your system performance.
default: 100 -
Chunk size of nodes to be processed at once: The amount of nodes to be processed at each batch run for commands using the --all option, you can leave it at its default value or tune it depending on your system performance.
default: 50 -
DEBUG mode: If checked response times for any CGSpace request are logged. It could be used to measure performance.
default: disabledDo NOT USE on Production!
If you want to import everything from CGSpace you may skip this step; otherwise, you need to first select the Communities and then the Collections you want to import:
- Communities:
admin/config/cgspace/settings/communities - Collections:
admin/config/cgspace/settings/collections
Last but not least you need to configure the Processors: admin/config/cgspace/settings/processors
where you can decide to use the module embedded vocabularies or map with terms on your existing installation vocabularies.
Once everything is configured you need to run the first import that will sync CGSpace items with your Drupal nodes. Currently, only a Drush command is provided:
cgspace_importer:create
and you can run it using the --all option that will import all CGSpace items in the Sitemap Index (ignoring your Communities and Collections settings) or without that option that will import through the CGSpace API the items on Communities and Collections you selected.
-
Import all CGSpace items on Sitemap Index:
drush cgspace_importer:create --all
-
Import CGSpace items in Communities and Collections you configured using the CGSpace API:
drush cgspace_importer:createThis module provides the following Drush commands:
-
cgspace_importer:create [--all]
Create Publication nodes by generating the list through the CGSpace API according to communities and collections selected on the configuration page.
-
--all
if --all option is specified the list of Publication nodes is generated from CGSpace sitemap index
-
-
cgspace_importer:update [YYYY-MM-DD] [--all]
Update or create Publication nodes by generating the list through the CGSpace API according to communities and collections selected on the configuration page.
If a date argument is specified, the list of updated or added items is generated accordingly
-
--all
if --all option is specified the list of Publication nodes is generated through CGSpace API without taking care of collections and communities selected on the configuration page.
-
-
cgspace_importer:delete [--all]
Delete Publication nodes comparing with the list of items generated through the CGSpace API according to communities and collections selected on the configuration page.
-
--all
if --all option is specified the Publication nodes are compared to the list of items generated from the CGSpace sitemap index.
-
-
Import all CGSpace publications starting from CGSpace Sitemap Index
drush cgspace_importer:create --all
-
Import all CGSpace publications starting from configured Communities and Collections
drush cgspace_importer:create
-
Update all publications nodes using CGSpace Sitemap Index (ignoring the Communities and Collections configuration)
drush cgspace_importer:update --all
-
Update publications nodes using CGSpace API according to configured Communities and Collections.
drush cgspace_importer:update
-
Update from a specific date (Ex: January, 1st 2020)
drush cgspace_importer:update 2020-01-01
Tip: if you use a very old date as an argument, the behavior is similar to running create. Use carefully!
-
Remove Publication nodes using CGSpace API according to configured Communities and Collections.
drush cgspace_importer:delete
Tip: if you don't select any collections in the configuration you can use this command to remove all publication nodes from your Drupal installation.
-
Clean all publication nodes using the Sitemap Index.
drush cgspace_importer:delete --all
You can manage the update functionalities using your system cron with the drush cgspace_importer:update command
in addition you can also trigger it manually through the CGSpace Sync page admin/content/cgspace.
IMPORTANT: we suggest to run the update functionality:
- once per day in the case of full import with --all option
- once per week in case of partial import with configured Communities and Collections
Since we have some limitations from the current CGSpace API we're able to get added/updated items from the drush cgspace_importer:update command but not the removed ones.
This means that you need to run periodically the delete drush cgspace_importer:delete command as well.
IMPORTANT: we suggest to run the delete functionality:
- once per day in the case of full import with --all option
- once per week in case of partial import with configured Communities and Collections