Open-Orange-Button/Orange-Button-ProductRegistry

Orange Button Product Registry

Installation for development

If you are new to Django, please read the official Django tutorial first. It explains much of the Product Registry's implementation.

Installing project dependencies

  1. Clone this repository and cd into it.

  2. Install the uv Python package manager:

    curl -LsSf https://astral.sh/uv/install.sh | sh
  3. Install jq, the command-line JSON processor.

  4. Install the project's Python dependencies.

    uv sync --dev
  5. Activate the Python virtual environment.

    source .venv/bin/activate
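
Before moving on, it can help to confirm that the command-line tools from the steps above are actually on your PATH. The following is a minimal sketch, not part of the repository; the helper name `missing_tools` is hypothetical, and the tool list assumes only `uv` and `jq` are needed.

```python
import shutil

# Tools the installation steps above rely on (assumption: no others are needed).
REQUIRED_TOOLS = ("uv", "jq")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

If a tool is reported missing right after installation, opening a new terminal window is often enough for the updated PATH to take effect.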

Building a local database

This project uses SQLite as the database for local development. To build a local SQLite Product Registry database,

  1. Update the Django migrations to match the Django model definitions. The output "No changes detected" means that they are already up to date.

    python manage.py makemigrations
  2. Apply the migrations to the database to create all the tables.

    python manage.py migrate
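
To check that the migrations created the tables, you can list them straight from the SQLite file. This is a small sketch using only the Python standard library; it assumes the database file is `db.sqlite3` in the repository root (adjust the path if your settings differ), and the helper name `list_tables` is hypothetical.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the sorted names of all tables in the SQLite file at `db_path`."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Assumption: the local database lives at db.sqlite3 in the repository root.
    db_file = Path("db.sqlite3")
    if db_file.exists():
        for table in list_tables(db_file):
            print(table)
    else:
        print(f"{db_file} not found; run the migrate step first.")
```

After a successful `migrate`, the list should include `django_migrations` along with the project's own tables.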

Now that the database is built, we will populate it using Jupyter Notebooks.

  1. In another terminal window, start a Jupyter Notebook server by running

    uv run jupyter lab .
  2. The Django app ob_taxonomy defines tables for storing the Orange Button Taxonomy as database metadata that is referenced frequently in the Product Registry's code. Populate these tables by running the notebook ob_taxonomy/upload_taxonomy.ipynb (note that jq must be installed).

  3. In server/data_upsert, there are multiple Jupyter Notebooks for cleaning and uploading CEC data. Run each of these notebooks to output cleaned CEC data:

    • For ProdBattery, run the notebook server/data_upsert/ProdBattery/clean.ipynb.
    • For ProdModule, run the notebook server/data_upsert/ProdModule/clean.ipynb.
  4. With the cleaned data, upload the CEC data.

    • For ProdBattery, run the notebook server/data_upsert/ProdBattery/upsert_no_orm.ipynb.
    • For ProdModule, run the notebook server/data_upsert/ProdModule/upsert_no_orm.ipynb.
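
The notebooks above need to run in a fixed order: taxonomy first, then cleaning, then upserting. If you prefer not to click through them in Jupyter Lab, one option is to execute them headlessly with `jupyter nbconvert`. The sketch below only prints the commands in that order. It has not been tested against these notebooks, and it assumes `nbconvert` is available in the project environment and that each notebook runs without manual input.

```python
import shlex

# Notebook paths from the steps above, in the order they are described.
NOTEBOOKS = [
    "ob_taxonomy/upload_taxonomy.ipynb",
    "server/data_upsert/ProdBattery/clean.ipynb",
    "server/data_upsert/ProdModule/clean.ipynb",
    "server/data_upsert/ProdBattery/upsert_no_orm.ipynb",
    "server/data_upsert/ProdModule/upsert_no_orm.ipynb",
]


def nbconvert_command(notebook):
    """Build the argument list that executes one notebook in place."""
    return [
        "uv", "run", "jupyter", "nbconvert",
        "--to", "notebook", "--execute", "--inplace",
        notebook,
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be reviewed first.
    for notebook in NOTEBOOKS:
        print(shlex.join(nbconvert_command(notebook)))
```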

Running a development server

Start a development server by running

python manage.py runserver

Deployment

This project is set up to be deployed on Amazon (AWS) Elastic Container Service (ECS) with a MySQL database, where "Container" refers to a Docker container. Configuring AWS deployment infrastructure through the AWS developer console is tedious and difficult to do consistently, so this project uses Terraform to automate the configuration instead.

Infrastructure configuration with Terraform

These are roughly the steps to follow. Depending on what infrastructure already exists (e.g., HTTPS certificates for the domain), you may need to do additional steps.

  1. In terraform/variables.tf, edit the variables service-name, service-name-alphanumeric, and service-domain-name.
  2. In terraform/bastion.tf, edit the aws_security_group called bastion_sg by writing your IP address in the ingress cidr_block. This "Bastion" EC2 server is needed later for connecting to the database to upload data.
  3. In terraform/database.tf, edit the password field of the aws_db_instance called mysql.
  4. cd into terraform.
  5. Create an SSH key pair for connecting to the "Bastion" EC2 instance later: ssh-keygen -f bastion_key.
  6. Run terraform init.
  7. Use the AWS command-line interface to log into AWS by running aws login.
  8. Run terraform plan to see what infrastructure will be configured (additions, changes, and deletions).
  9. Run terraform apply to perform the infrastructure configuration.
  10. Note the final outputs of terraform apply. They include the following:
    • An Amazon Elastic Container Registry (ECR) address to which we will push a Docker image of this project.
    • The IP address of the "Bastion" EC2 instance we will use as a proxy to connect to the database.
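
For step 2, the ingress rule expects CIDR notation rather than a bare IP address. A single host is written with the full prefix length (`/32` for IPv4). The sketch below shows one way to produce that string with Python's standard `ipaddress` module. The function name `single_host_cidr` is hypothetical, and `203.0.113.7` is a documentation-only example address, not a real one.

```python
import ipaddress


def single_host_cidr(ip):
    """Return CIDR notation that covers exactly one address.

    Raises ValueError if `ip` is not a valid IPv4 or IPv6 address.
    """
    address = ipaddress.ip_address(ip)
    # max_prefixlen is 32 for IPv4 and 128 for IPv6.
    return f"{address}/{address.max_prefixlen}"


if __name__ == "__main__":
    # Replace the example address with your own public IP.
    print(single_host_cidr("203.0.113.7"))
```

Validating the address before pasting it into `terraform/bastion.tf` catches typos that would otherwise only surface during `terraform plan` or `terraform apply`.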

Pushing a Docker image to Amazon Elastic Container Registry (ECR)

  1. Rename product_registry/settings_deployment.py to product_registry/settings.py and replace the appropriate values (e.g., SECRET_KEY, ALLOWED_HOSTS, etc.).

  2. Build the Docker image.

    docker build -t django-ecs .  # django-ecs is an arbitrary name for the image
  3. Debug the image. With Django's HTTPS redirection turned off, try

    docker run --rm -p 8000:8000 --name django-test django-ecs --bind 0.0.0.0:8000

    and then go to http://127.0.0.1:8000 in your browser.

  4. Log Docker into AWS.

    aws ecr get-login-password --region us-<REGION>-1 | docker login --username AWS --password-stdin <AWS_ACCOUNT_NUMBER>.dkr.ecr.us-<REGION>-1.amazonaws.com

    Note

    If you used sudo docker to build the image, you must also use sudo docker here.

  5. Tag the Docker image.

    docker tag django-ecs:latest <AWS_ACCOUNT_NUMBER>.dkr.ecr.us-<REGION>-1.amazonaws.com/<ECR_REPO_NAME>:latest
  6. Push the Docker image to AWS ECR.

    docker push <AWS_ACCOUNT_NUMBER>.dkr.ecr.us-<REGION>-1.amazonaws.com/<ECR_REPO_NAME>:latest
  7. Look at the load balancer page in the AWS developer console to find the ECS task and check whether the task instances are running. Look at the logs in AWS CloudWatch to debug.

  8. Once an ECS task instance is running, it will automatically create the tables in the database.

  9. If you are building an image to replace the current one in production, AWS ECS will not automatically use the newly built image if it has the same tag (e.g., "latest") as the old image. To get AWS ECS to use the new image, navigate to the ECS service following breadcrumbs like

    Amazon Elastic Container Service > Clusters > ob-product-registry-2026-02-cluster > Services > ob-product-registry-2026-02-service > Health
    

    In the top right, there should be a button labeled "Update service". Click the dropdown next to it and select "Force new deployment".
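
The registry address used in steps 4 to 6 follows one pattern, and the forced redeployment in step 9 can also be requested from the AWS command-line interface with `aws ecs update-service --force-new-deployment`. The sketch below assembles these commands from their parts so the same values are not retyped by hand. The function names are hypothetical, `123456789012` and `example-repo` are placeholder values, and the region is passed in full (e.g., `us-east-1`). The commands are printed, not executed.

```python
import shlex


def ecr_registry(account, region):
    """Return the ECR registry host for an AWS account and region."""
    return f"{account}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account, region, repo, tag="latest"):
    """Return the full image reference used by `docker tag` and `docker push`."""
    return f"{ecr_registry(account, region)}/{repo}:{tag}"


def push_commands(local_image, account, region, repo, tag="latest"):
    """Return the docker tag and docker push commands as argument lists."""
    uri = ecr_image_uri(account, region, repo, tag)
    return [
        ["docker", "tag", local_image, uri],
        ["docker", "push", uri],
    ]


def force_new_deployment_command(cluster, service):
    """Return the AWS CLI command that redeploys a service with its current settings."""
    return [
        "aws", "ecs", "update-service",
        "--cluster", cluster,
        "--service", service,
        "--force-new-deployment",
    ]


if __name__ == "__main__":
    # Placeholder account number and repository name; substitute your own.
    for command in push_commands(
        "django-ecs:latest", "123456789012", "us-east-1", "example-repo"
    ):
        print(shlex.join(command))
    # Cluster and service names as they appear in the breadcrumbs above.
    print(shlex.join(force_new_deployment_command(
        "ob-product-registry-2026-02-cluster",
        "ob-product-registry-2026-02-service",
    )))
```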

Connecting and uploading data to the database

The database is in a private subnet of the virtual private cloud (VPC) we created, so we cannot connect to it directly to upload data. Instead, we use a temporary "Bastion" EC2 instance as a proxy. Run

ssh -i bastion_key -L 3307:<DATABASE_URL>:3306 ec2-user@<BASTION_EC2_IP>

where we intentionally linked port 3307 of our local machine to port 3306 of the database. You can find the DATABASE_URL in the AWS Relational Database Service section of the AWS developer console.
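
A quick way to confirm that the tunnel is listening is to try opening a TCP connection to the forwarded local port. The sketch below uses only the standard library, and the helper name `port_is_open` is hypothetical. A successful connection shows only that the local end of the SSH forward is accepting connections. It does not prove that the database behind it is reachable.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 3307 is the local port forwarded by the ssh command above.
    if port_is_open("127.0.0.1", 3307):
        print("Local port 3307 is accepting connections.")
    else:
        print("Nothing is listening on local port 3307; is the SSH tunnel running?")
```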

Now, we can connect to the database by running

mysql -h 127.0.0.1 -P 3307 -u admin -p

and entering the password in terraform/database.tf.

To upload data into the database, one successful technique is as follows:

  1. Build a local SQLite database containing all the data to be uploaded to the remote production MySQL database.

  2. Use DuckDB to connect to both the local SQLite database and the remote database.

    attach 'db.sqlite3' as lds (type sqlite);
    attach 'host=localhost user=admin password=<DATABASE_PASSWORD> database=OBProductRegistry port=3307' as rds (type mysql);
  3. In DuckDB, select data from the local database to upsert into the remote database. Here is an example of inserting ProdBattery and ProdModule data:

    begin transaction;
    insert into rds.server_dcinput select * from lds.server_dcinput;
    insert into rds.server_dcoutput select * from lds.server_dcoutput;
    insert into rds.server_dimension select * from lds.server_dimension;
    insert into rds.server_product select * from lds.server_product;
    insert into rds.server_prodbattery select * from lds.server_prodbattery;
    insert into rds.server_entity select * from lds.server_entity;
    insert into rds.server_certificationagency select * from lds.server_certificationagency;
    insert into rds.server_checksum select * from lds.server_checksum;
    insert into rds.server_firmware select * from lds.server_firmware;
    insert into rds.server_prodcertification select * from lds.server_prodcertification;
    insert into rds.server_product_ProdCertifications select * from lds.server_product_ProdCertifications;
    insert into rds.server_prodcell select * from lds.server_prodcell;
    insert into rds.server_prodglazing select * from lds.server_prodglazing;
    insert into rds.server_moduleelectrating select * from lds.server_moduleelectrating;
    insert into rds.server_prodmodule select * from lds.server_prodmodule;
    insert into rds.server_prodmodule_ModuleElectRatings select * from lds.server_prodmodule_ModuleElectRatings;
    insert into rds.server_sourcecountry select * from lds.server_sourcecountry;
    insert into rds.server_product_SourceCountries select * from lds.server_product_SourceCountries;
    commit;
  4. Once all the data is uploaded, destroy the "Bastion" EC2 instance to save costs. In terraform/bastion.tf, comment out everything except the aws_security_group named bastion_sg. In terraform/security_groups.tf, remove aws_security_group.bastion_sg.id from the security_groups of ingress of the aws_security_group named rds_sg. Run terraform apply to remove the security group from the database and destroy the "Bastion" EC2 instance. Next, in terraform/bastion.tf, comment out the aws_security_group named bastion_sg. Rerun terraform apply to destroy the Bastion's security group.
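
The insert statements in step 3 all follow the same pattern, so they can be generated from an ordered list of table names instead of being typed by hand. The sketch below reproduces the script from step 3 and adds a per-table row-count comparison that can be run in DuckDB after the commit. The function names are hypothetical, and the table order is copied as-is from the example above.

```python
# Table names in the same order as the example in step 3.
TABLES = [
    "server_dcinput",
    "server_dcoutput",
    "server_dimension",
    "server_product",
    "server_prodbattery",
    "server_entity",
    "server_certificationagency",
    "server_checksum",
    "server_firmware",
    "server_prodcertification",
    "server_product_ProdCertifications",
    "server_prodcell",
    "server_prodglazing",
    "server_moduleelectrating",
    "server_prodmodule",
    "server_prodmodule_ModuleElectRatings",
    "server_sourcecountry",
    "server_product_SourceCountries",
]


def copy_script(tables, source="lds", target="rds"):
    """Return one transaction that copies every table from `source` to `target`."""
    lines = ["begin transaction;"]
    lines += [
        f"insert into {target}.{table} select * from {source}.{table};"
        for table in tables
    ]
    lines.append("commit;")
    return "\n".join(lines)


def count_check(table, source="lds", target="rds"):
    """Return a query that shows the row counts of `table` on both sides."""
    return (
        f"select (select count(*) from {source}.{table}) as local_rows, "
        f"(select count(*) from {target}.{table}) as remote_rows;"
    )


if __name__ == "__main__":
    print(copy_script(TABLES))
    for table in TABLES:
        print(count_check(table))
```

Matching row counts are a coarse check only. They show that every row arrived, not that every value is identical.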

Serving the website from the domain

In your domain provider's DNS settings, add a CNAME record that points the domain to the load balancer's URL.