Command Line Interface Documentation

Please double-check that you've followed the installation instructions in the root README; the procedure has changed recently. This document assumes that you have run `pip install -e .` as instructed.

Table of Contents

  • Pipeline Types
  • ACME
  • HELM
  • FAME

Each autonomy system is broken into four pipelines. The intended use case for each pipeline type is described below. The remainder of this document describes the detailed usage for each pipeline type in each autonomy system.

| Pipeline Type | Use Case |
| --- | --- |
| Flight Pipeline | Flight pipelines implement the autonomy run onboard the instrument. This can be considered flight software. |
| Ground Pipeline | Ground pipelines implement standard ground processing run after flight pipeline data has been downlinked. This can be considered ground data system software. The flight pipeline must be run before the ground pipeline. |
| Analysis Pipeline | Analysis pipelines provide the capabilities of flight and ground pipelines, plus additional evaluation tools used by machine learning SMEs to assess the health of the overall autonomy system. |
| Simulation Pipeline | Simulation pipelines are a standalone capability used to generate synthetic instrument data for testing purposes. |

ACME

ACME_flight_pipeline

For a first-time run on raw files (this will generate .pickle files for future use):

$ ACME_flight_pipeline --data "path/to/files/*.raw" --outdir specify/directory

For pickle files (Scan all .pickle files in directory):

$ ACME_flight_pipeline --data "path/to/files/*.pickle" --outdir specify/directory

For reprocessing the database:

$ ACME_flight_pipeline --reprocess_dir "labdata/toplevel/" --reprocess_version vX.y --reprocess

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 `--data` | Glob of files to be processed | None |
| 💡 `--outdir` | Output directory path | None |
| `--masses` | Path to file containing known masses | `cli/configs/compounds.yml` |
| `--params` | Path to config file for Analyzer | `cli/configs/acme_config.yml` |
| `--sue_weights` | Path to weights for Science Utility Estimate | `cli/configs/acme_sue_weights.yml` |
| `--dd_weights` | Path to weights for Diversity Descriptor | `cli/configs/acme_dd_weights.yml` |
| `--log_name` | Filename for the pipeline log | `ACME_flight_pipeline.log` |
| `--log_folder` | Folder path to store logs | `cli/logs` |
| `--kill_file` | Pipeline will halt if this file is found | `ACME_flight_kill_file` |
| `--cores` | Number of processor cores to utilize | 7 |
| `--priority_bin` | Downlink priority bin (lower number is higher priority) for generated products | 0 |
| `--manifest_metadata` | Manifest metadata (YAML string); takes precedence over file entries | None |
| `--manifest_metadata_file` | Manifest metadata file (YAML) | None |
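The documented precedence between --manifest_metadata and --manifest_metadata_file (string entries override file entries) amounts to a dictionary merge. A minimal sketch in Python; the function name and the example keys are illustrative, not part of the ACME CLI:

```python
def merge_manifest_metadata(file_entries, cli_entries):
    """Merge manifest metadata: entries from --manifest_metadata (cli_entries)
    override entries from --manifest_metadata_file (file_entries)."""
    merged = dict(file_entries)   # start from the file's entries
    merged.update(cli_entries)    # CLI string entries take precedence
    return merged

# Hypothetical example: both sources define 'operator'; the CLI value wins.
file_entries = {"operator": "lab", "instrument": "ACME"}
cli_entries = {"operator": "flight"}
print(merge_manifest_metadata(file_entries, cli_entries))
# → {'operator': 'flight', 'instrument': 'ACME'}
```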

ACME_ground_pipeline

For a first-time run on raw files (this will generate .pickle files for future use):

$ ACME_ground_pipeline --data "path/to/files/*.raw" --outdir specify/directory

For pickle files (Scan all .pickle files in directory):

$ ACME_ground_pipeline --data "path/to/files/*.pickle" --outdir specify/directory

For reprocessing the database:

$ ACME_ground_pipeline --reprocess_dir "labdata/toplevel/" --reprocess_version vX.y --reprocess

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 `--data` | Glob of files to be processed | None |
| 💡 `--outdir` | Output directory path | None |
| `--masses` | Path to file containing known masses | `cli/configs/compounds.yml` |
| `--params` | Path to config file for Analyzer | `cli/configs/acme_config.yml` |
| `--sue_weights` | Path to weights for Science Utility Estimate | `cli/configs/acme_sue_weights.yml` |
| `--dd_weights` | Path to weights for Diversity Descriptor | `cli/configs/acme_dd_weights.yml` |
| `--log_name` | Filename for the pipeline log | `ACME_ground_pipeline.log` |
| `--log_folder` | Folder path to store logs | `cli/logs` |
| 🔽 `--noplots` | Disables plotting output | None |
| 🔽 `--noexcel` | Disables Excel file output | None |
| 🔼 `--debug_plots` | Enables per-peak plots for debugging purposes | None |
| 💡 🔽 `--space_mode` | Only output science products; equivalent to `--noplots --noexcel` | None |
| `--cores` | Number of processor cores to utilize | 7 |
| `--saveheatmapdata` | Save heatmap as data file in addition to image | None |
| `--priority_bin` | Downlink priority bin (lower number is higher priority) for generated products | 0 |
| `--manifest_metadata` | Manifest metadata (YAML string); takes precedence over file entries | None |
| `--manifest_metadata_file` | Manifest metadata file (YAML) | None |
| 💡 `--knowntraces` | Process only known masses specified in `configs/compounds.yml` | None |

ACME_analysis_pipeline

For a first-time run on raw files (this will generate .pickle files for future use):

$ ACME_analysis_pipeline --data "path/to/files/*.raw" --outdir specify/directory

For pickle files (Scan all .pickle files in directory):

$ ACME_analysis_pipeline --data "path/to/files/*.pickle" --outdir specify/directory

For reprocessing the database:

$ ACME_analysis_pipeline --reprocess_dir "labdata/toplevel/" --reprocess_version vX.y --reprocess

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 `--data` | Glob of files to be processed | None |
| 💡 `--outdir` | Output directory path | None |
| `--masses` | Path to file containing known masses | `cli/configs/compounds.yml` |
| `--params` | Path to config file for Analyzer | `cli/configs/acme_config.yml` |
| `--sue_weights` | Path to weights for Science Utility Estimate | `cli/configs/acme_sue_weights.yml` |
| `--dd_weights` | Path to weights for Diversity Descriptor | `cli/configs/acme_dd_weights.yml` |
| `--log_name` | Filename for the pipeline log | `ACME_analysis_pipeline.log` |
| `--log_folder` | Folder path to store logs | `cli/logs` |
| 🔽 `--noplots` | Disables plotting output | None |
| 🔽 `--noexcel` | Disables Excel file output | None |
| 🔼 `--debug_plots` | Enables per-peak plots for debugging purposes | None |
| 💡 🔽 `--space_mode` | Only output science products; equivalent to `--noplots --noexcel` | None |
| `--cores` | Number of processor cores to utilize | 7 |
| `--saveheatmapdata` | Save heatmap as data file in addition to image | None |
| `--priority_bin` | Downlink priority bin (lower number is higher priority) for generated products | 0 |
| `--manifest_metadata` | Manifest metadata (YAML string); takes precedence over file entries | None |
| `--manifest_metadata_file` | Manifest metadata file (YAML) | None |
| 💡 `--knowntraces` | Process only known masses specified in `configs/compounds.yml` | None |

ACME_simulator

Simulates raw ACME samples to debug and better understand the ACME analyzer. This simulator was used to generate the Silver and Golden datasets at data_OWLS/ACME/...; the config files for those datasets are saved in those folders as well.

Arguments

| Argument flag | Description | Default Value |
| --- | --- | --- |
| `--params` | Path to config file | `cli/configs/acme_sim_params.yml` |
| `--out_dir` | Path to output directory | None |
| `--n_runs` | Number of simulation runs | 10 |
| `--log_name` | Filename for pipeline log | `ACME_simulator.log` |
| `--log_folder` | Folder path to store logs | `cli/logs` |

ACME_evaluation

ACME Evaluation measures the performance of ACME on simulator data and hand-labeled lab data. There are two versions of this script:

  • ACME_evaluation.py is the original evaluation script. It calculates the output precision and the label recall: the fraction of output peaks that match any label, and the fraction of labels that match any output, respectively. Note that an F1 score cannot be calculated because the two rates are computed over different populations.
  • ACME_evaluation_strict.py is the stricter evaluation script. It enforces a one-to-one match between output and labeled peaks, marking duplicate detections as false positives. This allows the script to calculate a formal precision, recall, and F1 score. This script is recommended.
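The strict one-to-one matching can be sketched as a greedy assignment within the mass and time thresholds. This is an illustration of the scoring idea only, not the actual ACME_evaluation_strict.py implementation:

```python
def strict_match(outputs, labels, mass_threshold=30, time_threshold=30):
    """Greedy one-to-one matching of output peaks to labeled peaks.
    Peaks are (mass_idx, time_idx) tuples. Returns precision, recall, F1."""
    unmatched_labels = set(range(len(labels)))
    true_pos = 0
    for om, ot in outputs:
        # find an unmatched label within both thresholds
        match = next((i for i in unmatched_labels
                      if abs(labels[i][0] - om) <= mass_threshold
                      and abs(labels[i][1] - ot) <= time_threshold), None)
        if match is not None:
            unmatched_labels.remove(match)  # each label matches at most once
            true_pos += 1
        # else: duplicate or spurious detection counts as a false positive
    precision = true_pos / len(outputs) if outputs else 0.0
    recall = true_pos / len(labels) if labels else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if true_pos else 0.0
    return precision, recall, f1

# Two detections of the same labeled peak: the second is a false positive.
print(strict_match([(100, 50), (101, 51)], [(100, 50)]))
# → (0.5, 1.0, 0.6666666666666666)
```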

Arguments

| Argument flag | Description | Default Value |
| --- | --- | --- |
| `acme_outputs` | Required; glob of peak outputs from the analyzer | None |
| `acme_labels` | Required; glob of peak labels | None |
| `--hand_labels` | Expect hand labels (as opposed to simulator labels) | None |
| `--mass_threshold` | Max distance (in mass indices) between matching peaks | 30 |
| `--time_threshold` | Max distance (in time indices) between matching peaks | 30 |
| `--ambiguous` | Consider peaks labeled as ambiguous to be true peaks | None |
| `--log_name` | Filename for pipeline log | `ACME_evaluation.log` |
| `--log_folder` | Folder path to store logs | `cli/logs` |

HELM

HELM_flight_pipeline

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 `--config` | Filepath of configuration file | `cli/configs/helm_config.yml` |
| `--experiments` | Glob string pattern of experiment directories to process | None |
| `--steps` | Steps of the pipeline to run; see below for descriptions | None |
| `--batch_outdir` | Output directory for batch-level results | None |
| 💡 🔽 `--use_existing` | Attempt to reuse previous processing output for any steps listed here; see below for options | None |
| 💡 🔽 `--space_mode` | Only output space products; skips most plots | None |
| 💡 `--cores` | Number of processor cores to utilize | 7 |
| `--note` | String to be appended to output directory name | None |
| `--log_name` | Filename for the pipeline log | `HELM_flight_pipeline.log` |
| `--log_folder` | Folder path to store logs | `cli/logs` |
| `--kill_file` | Pipeline will halt if this file is found | `HELM_flight_kill_file` |
| `--priority_bin` | Downlink priority bin (lower number is higher priority) for generated products | 0 |
| `--manifest_metadata` | Manifest metadata (YAML string); takes precedence over file entries | None |
| `--manifest_metadata_file` | Manifest metadata file (YAML) | None |
| `--predict_model` | Path to ML model for motility classification | `cli/models/classifier_labtrain_v02.pickle` |

Steps

This table lists all steps available and indicates which steps can be used with the --use_existing flag. Steps are listed in typical order of use.

| Step Name | Description | `--use_existing` |
| --- | --- | --- |
| preproc | Lowers the resolution from 2048x2048 to 1024x1024 for analysis | TRUE |
| validate | Generates data validation products, including videos and MHIs | TRUE |
| tracker | Track particles in the experiment | TRUE |
| features | Extract features from detected tracks | FALSE |
| predict | Predict motility of all tracks with the classification model | FALSE |
| asdp | Generate ASDP products, including a visualization video | FALSE |
| manifest | Generate file manifest for JEWEL | FALSE |

Most steps depend on output from all previous steps. This table lists step prerequisites.

| Step Name | Prerequisite Steps | Other Reqs |
| --- | --- | --- |
| preproc | N/A | N/A |
| validate | preproc | N/A |
| tracker | preproc, validate | N/A |
| features | preproc, validate, tracker | track_evaluation output (optional) |
| predict | preproc, validate, tracker, features | Pretrained Model |
| asdp | preproc, validate, tracker, features, predict | Track Labels (optional) |
| manifest | N/A | Various validate, tracker, predict, and asdp products (optional) |
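Since most steps depend on all previous steps, a requested step list can be expanded to include its prerequisites. A sketch of that expansion, with the dependency dictionary transcribed from the table above (the helper itself is illustrative, not the pipeline's actual code):

```python
# Prerequisite steps transcribed from the flight-pipeline table above.
PREREQS = {
    "preproc": [],
    "validate": ["preproc"],
    "tracker": ["preproc", "validate"],
    "features": ["preproc", "validate", "tracker"],
    "predict": ["preproc", "validate", "tracker", "features"],
    "asdp": ["preproc", "validate", "tracker", "features", "predict"],
    "manifest": [],
}

def expand_steps(requested):
    """Return the requested steps plus all their prerequisites, in pipeline order."""
    needed = set()
    for step in requested:
        needed.update(PREREQS[step])
        needed.add(step)
    order = ["preproc", "validate", "tracker", "features", "predict", "asdp", "manifest"]
    return [s for s in order if s in needed]

print(expand_steps(["predict"]))
# → ['preproc', 'validate', 'tracker', 'features', 'predict']
```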

There are also pipelines available for common combinations of steps.

| Pipeline Name | Description | Steps |
| --- | --- | --- |
| pipeline_train | Pipeline to train the motility classifier | preproc, validate, tracker, track_evaluation, features, train |
| pipeline_predict | Pipeline to predict motility | preproc, validate, tracker, features, predict |
| pipeline_products | Pipeline to generate all products | preproc, validate, tracker, features, predict, asdp, manifest |
| pipeline_space | Pipeline to generate space-mode products; disables most plotting, especially in validate | preproc, validate, tracker, features, predict, asdp, manifest |

Valid Experiments

An experiment is defined by a unique directory. To be considered valid, an experiment must satisfy the following:

  • Contain the subdirectory Holograms/
  • Have at least 50 valid holograms in said subdirectory. Valid holograms are:
    • Images with extension .tif
    • Images with resolution 2048x2048
    • These values can be configured in the config.
  • The enumerated names of the images are expected to be consecutive.
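The validity rules above can be checked up front with a short script. The sketch below uses only the Python standard library and checks the directory layout, frame count, and consecutive numbering; the resolution check is omitted because it requires an image reader:

```python
import re
from pathlib import Path

def is_valid_experiment(exp_dir, min_frames=50):
    """Check an experiment directory against the rules above: a Holograms/
    subdirectory holding at least `min_frames` .tif images whose numeric
    names are consecutive."""
    holo_dir = Path(exp_dir) / "Holograms"
    if not holo_dir.is_dir():
        return False
    # pull the trailing number out of each .tif filename
    numbers = sorted(
        int(m.group(1))
        for f in holo_dir.glob("*.tif")
        if (m := re.search(r"(\d+)\.tif$", f.name))
    )
    if len(numbers) < min_frames:
        return False
    # enumerated names must be consecutive
    return numbers == list(range(numbers[0], numbers[0] + len(numbers)))
```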

Common Usage Examples

A brief reminder that these examples assume you have followed the installation instructions.

Make sure to specify desired configuration parameters in helm_config.yml before executing the pipeline. These use src/cli/configs/helm_config.yml by default.

Validate your Experiment Data

HELM_flight_pipeline \
--experiments "my/experiments/glob/wildcard_*_string" \
--batch_outdir my/experiments/batch_directory \
--steps preproc validate

Note how, by adding a wildcard, you can process multiple experiments at once.

Generate Particle Tracks

HELM_flight_pipeline \
--experiments my/experiments/glob/string \
--batch_outdir my/experiments/batch_directory \
--steps preproc validate tracker \
--use_existing preproc validate

Note how, by specifying --use_existing, the pipeline will use existing preproc and validate step output if they already exist.

Validate, Track Particles, Predict Motility, and Generate Visualization

HELM_flight_pipeline \
--experiments my/experiments/glob/string \
--batch_outdir my/experiments/batch_directory \
--steps pipeline_products \
--use_existing preproc validate tracker

Note that --config and --predict_model can also be specified, but we're just using the default values here.

Tracker outputs

In the output folder, the following subfolders are created:

/plots: plots of all tracks, with each track drawn in a distinct color

/tracks: a subfolder per case, with .track files giving particle coordinates

/configs: the configuration file used for the case

/train classifier: an empty folder, which is needed by train_model.py

Classifier outputs

In the output folder, under /train classifier, you will find the following:

track motility.csv: a log file listing each track with its motility label

yourclassifier.pickle: the trained classifier in pickle form

plots/: ROC curve, confusion matrix, and feature-importance plots (the last only when running Zaki's classifier)
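Since the classifier is stored as an ordinary pickle file, it can be loaded back with the standard pickle module. A minimal sketch (the scikit-learn-style predict() call in the comment is an assumption about the model API):

```python
import pickle

def load_classifier(path):
    """Load a trained classifier saved as a pickle file, e.g.
    'train classifier/yourclassifier.pickle'."""
    with open(path, "rb") as f:
        return pickle.load(f)

# model = load_classifier("train classifier/yourclassifier.pickle")
# A scikit-learn-style model would then expose e.g. model.predict(features).
```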

HELM_ground_pipeline

⚠️ The flight pipeline must be run before the ground pipeline.

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 `--config` | Filepath of configuration file | `cli/configs/helm_config.yml` |
| `--experiments` | Glob string pattern of experiment directories to process | None |
| `--steps` | Steps of the pipeline to run; see below for descriptions | None |
| `--batch_outdir` | Output directory for batch-level results | None |
| 💡 🔽 `--use_existing` | Attempt to reuse previous processing output for any steps listed here; see below for options | None |
| 💡 `--cores` | Number of processor cores to utilize | 7 |
| `--note` | String to be appended to output directory name | None |
| `--log_name` | Filename for the pipeline log | `HELM_ground_pipeline.log` |
| `--log_folder` | Folder path to store logs | `cli/logs` |
| `--priority_bin` | Downlink priority bin (lower number is higher priority) for generated products | 0 |
| `--manifest_metadata` | Manifest metadata (YAML string); takes precedence over file entries | None |
| `--manifest_metadata_file` | Manifest metadata file (YAML) | None |

Steps

This table lists all steps available and indicates which steps can be used with the --use_existing flag. Steps are listed in typical order of use.

| Step Name | Description | `--use_existing` |
| --- | --- | --- |
| validate | Generates data validation products, including videos and MHIs | TRUE |
| asdp | Generate ASDP products, including a visualization video | FALSE |

Valid Experiments

An experiment is defined by a unique directory. To be considered valid, an experiment must satisfy the following:

  • Contain the subdirectory Holograms/
  • Have at least 50 valid holograms in said subdirectory. Valid holograms are:
    • Images with extension .tif
    • Images with resolution 2048x2048
    • These values can be configured in the config.
  • The enumerated names of the images are expected to be consecutive.

Common Usage Examples

A brief reminder that these examples assume you have followed the installation instructions.

Make sure to specify desired configuration parameters in helm_config.yml before executing the pipeline. These use src/cli/configs/helm_config.yml by default.

HELM_analysis_pipeline

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 `--config` | Filepath of configuration file | `cli/configs/helm_config.yml` |
| `--experiments` | Glob string pattern of experiment directories to process | None |
| `--steps` | Steps of the pipeline to run; see below for descriptions | None |
| `--batch_outdir` | Output directory for batch-level results | None |
| 💡 🔽 `--use_existing` | Attempt to reuse previous processing output for any steps listed here; see below for options | None |
| 💡 🔽 `--space_mode` | Only output space products; skips most plots | None |
| 💡 `--cores` | Number of processor cores to utilize | 7 |
| `--note` | String to be appended to output directory name | None |
| `--log_name` | Filename for the pipeline log | `HELM_analysis_pipeline.log` |
| `--log_folder` | Folder path to store logs | `cli/logs` |
| `--priority_bin` | Downlink priority bin (lower number is higher priority) for generated products | 0 |
| `--manifest_metadata` | Manifest metadata (YAML string); takes precedence over file entries | None |
| `--manifest_metadata_file` | Manifest metadata file (YAML) | None |
| `--train_feats` | Only use tracks with labels for model training | None |
| `--predict_model` | Path to ML model for motility classification | `cli/models/classifier_labtrain_v02.pickle` |
| `--toga_config` | Override config filepath for TOGA optimization | None |

Steps

This table lists all steps available and indicates which steps can be used with the --use_existing flag. Steps are listed in typical order of use.

| Step Name | Description | `--use_existing` |
| --- | --- | --- |
| preproc | Lowers the resolution from 2048x2048 to 1024x1024 for analysis | TRUE |
| validate | Generates data validation products, including videos and MHIs | TRUE |
| tracker | Track particles in the experiment | TRUE |
| point_evaluation | Using track labels, measure point accuracy of the tracker | TRUE |
| track_evaluation | Using track labels, measure track accuracy of the tracker | TRUE |
| features | Extract features from detected tracks | FALSE |
| train | Train the motility classification model | FALSE |
| predict | Predict motility of all tracks with the classification model | FALSE |
| asdp | Generate ASDP products, including a visualization video | FALSE |
| manifest | Generate file manifest for JEWEL | FALSE |

Most steps depend on output from all previous steps. This table lists step prerequisites.

| Step Name | Prerequisite Steps | Other Reqs |
| --- | --- | --- |
| preproc | N/A | N/A |
| validate | preproc | N/A |
| tracker | preproc, validate | N/A |
| point_evaluation | preproc, validate, tracker | Track Labels |
| track_evaluation | preproc, validate, tracker | Track Labels |
| features | preproc, validate, tracker | track_evaluation output (optional) |
| train | preproc, validate, tracker, track_evaluation, features | Track Labels |
| predict | preproc, validate, tracker, features | Pretrained Model |
| asdp | preproc, validate, tracker, features, predict | Track Labels (optional) |
| manifest | N/A | Various validate, tracker, predict, and asdp products (optional) |

There are also pipelines available for common combinations of steps.

| Pipeline Name | Description | Steps |
| --- | --- | --- |
| pipeline_train | Pipeline to train the motility classifier | preproc, validate, tracker, track_evaluation, features, train |
| pipeline_predict | Pipeline to predict motility | preproc, validate, tracker, features, predict |
| pipeline_tracker_eval | Pipeline to evaluate tracker performance | preproc, validate, tracker, point_evaluation, track_evaluation |
| pipeline_products | Pipeline to generate all products | preproc, validate, tracker, point_evaluation, track_evaluation, features, predict, asdp, manifest |
| pipeline_space | Pipeline to generate space-mode products; disables most plotting, especially in validate | preproc, validate, tracker, features, predict, asdp, manifest |

Valid Experiments

An experiment is defined by a unique directory. To be considered valid, an experiment must satisfy the following:

  • Contain the subdirectory raw/
  • Have at least 50 valid holograms in said subdirectory. Valid holograms are:
    • Images with extension .tif
    • Images with resolution 2048x2048
    • These values can be configured in the config.
  • The enumerated names of the images are expected to be consecutive.

Common Usage Examples

A brief reminder that these examples assume you have followed the installation instructions.

Make sure to specify desired configuration parameters in helm_config.yml before executing the pipeline. These use src/cli/configs/helm_config.yml by default.

Validate your Experiment Data

HELM_analysis_pipeline \
--experiments "my/experiments/glob/wildcard_*_string" \
--batch_outdir my/experiments/batch_directory \
--steps preproc validate

Note how, by adding a wildcard, you can process multiple experiments at once.

Generate Particle Tracks

HELM_analysis_pipeline \
--experiments my/experiments/glob/string \
--batch_outdir my/experiments/batch_directory \
--steps preproc validate tracker \
--use_existing preproc validate

Note how, by specifying --use_existing, the pipeline will use existing preproc and validate step output if they already exist.

Train a motility model

Use the pipeline_train step bundle to run the tracker, evaluation, feature generator, and training steps. The --use_existing flag will skip any steps that were previously computed:

HELM_analysis_pipeline \
--experiments my/experiments/glob/string \
--batch_outdir my/experiments/batch_directory \
--steps pipeline_train \
--train_feats \
--use_existing preproc validate

Validate, Track Particles, Predict Motility, and Generate Visualization

HELM_analysis_pipeline \
--experiments my/experiments/glob/string \
--batch_outdir my/experiments/batch_directory \
--steps pipeline_products \
--use_existing preproc validate tracker

Note that --config and --predict_model can also be specified, but we're just using the default values here.

Tracker outputs

In the output folder, the following subfolders are created:

/plots: plots of all tracks, with each track drawn in a distinct color

/tracks: a subfolder per case, with .track files giving particle coordinates

/configs: the configuration file used for the case

/train classifier: an empty folder, which is needed by train_model.py

Classifier outputs

In the output folder, under /train classifier, you will find the following:

track motility.csv: a log file listing each track with its motility label

yourclassifier.pickle: the trained classifier in pickle form

plots/: ROC curve, confusion matrix, and feature-importance plots (the last only when running Zaki's classifier)

HELM_simulator

The HELM simulator generates synthetic DHM data. It supports sensitivity and sanity checks that probe the performance of HELM at edge cases and operational limits (such as particle speed). The tool is broken into two major steps:

  1. simulate particle tracks (specifying position, brightness, velocity through time)
  2. simulate DHM images from tracks (creating 2D tif hologram images using noise and tracks)
HELM_simulator [required-args] [optional-args]
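Step 1, simulating a particle track through time, can be sketched as a damped random walk on velocity. This is purely illustrative and not the simulator's actual model or parameters:

```python
import random

def simulate_track(n_frames=100, momentum=0.9, sigma=0.5, start=(512.0, 512.0)):
    """Simulate one 2D particle track: at each time step the velocity is
    damped by `momentum` and perturbed by Gaussian noise of scale `sigma`,
    then integrated into position. Returns a list of (x, y) positions."""
    x, y = start
    vx = vy = 0.0
    track = [(x, y)]
    for _ in range(n_frames - 1):
        vx = momentum * vx + random.gauss(0.0, sigma)
        vy = momentum * vy + random.gauss(0.0, sigma)
        x, y = x + vx, y + vy
        track.append((x, y))
    return track

track = simulate_track(n_frames=10)
print(len(track))  # → 10
```

Wider noise and higher momentum yield straighter, faster ("motile-looking") tracks, which matches the guidance on helm_simulator_config_v1.yml below.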

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 `--configs` | Configuration parameters for synthetic data | `configs/helm_simulator_config.yml` |
| 💡 `--n_exp` | Number of experiments to generate with config(s) | 1 |
| `--sim_outdir` | Directory to save the synthetic data to | None |
| `--log_name` | Filename for the pipeline log | `HELM_simulator.log` |
| `--log_folder` | Folder path to store logs | `cli/logs` |

Common Usage Examples

# Single config
HELM_simulator \
--configs src/cli/configs/helm_simulator_config_v2.yml \
--n_exp 2 \
--sim_outdir <local_output_dir>

# Multiple configs
HELM_simulator \
--configs "src/cli/configs/sim_config_v*.yml" \
--n_exp 2 \
--sim_outdir <local_output_dir>

Config options

Within the configuration file, you can set items like

  • image parameters (e.g., resolution, chamber size, noise characteristics, etc.)
  • experiment parameters (number of motile/non-motile particles, length of recording, drift, etc.)
  • particle parameters (e.g., shape/size/brightness of particle, movement distribution, etc.)

There are two pre-baked configurations to choose from. They differ slightly in how they generate motile track dynamics.

  1. helm_simulator_config_v1.yml This configuration generates all tracks (motile and non-motile) by making random perturbations to a track's velocity at each time step. Motile tracks are best generated by assigning a wide movement distribution and high momentum. This approach is simple to use, but creates tracks that are less realistic than in helm_simulator_config_v2.yml.

  2. helm_simulator_config_v2.yml This configuration (the default) uses vector autoregression (VAR) models to simulate more realistic motile tracks. Non-motile tracks are simulated with the same random-perturbation approach as in helm_simulator_config_v1.yml. VAR models are multivariate autoregression models; models fitted to real particle tracks are used to generate the synthetic ones.
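The VAR approach in helm_simulator_config_v2.yml can be illustrated with a minimal VAR(1) model on velocity: each new velocity is a linear function of the previous one plus Gaussian noise. The coefficient matrix below is made up for illustration; the real models are fitted statsmodels files stored with the simulator:

```python
import random

# Illustrative VAR(1) coefficients for (vx, vy); NOT a fitted model.
A = [[0.8, 0.1],
     [-0.1, 0.8]]

def var1_track(n_frames=100, noise=0.3, start=(0.0, 0.0)):
    """Generate positions from a VAR(1) process on velocity:
    v_t = A v_{t-1} + e_t, with e_t Gaussian noise."""
    x, y = start
    vx = vy = 0.0
    positions = [(x, y)]
    for _ in range(n_frames - 1):
        # both components are updated from the previous (vx, vy)
        vx, vy = (A[0][0] * vx + A[0][1] * vy + random.gauss(0.0, noise),
                  A[1][0] * vx + A[1][1] * vy + random.gauss(0.0, noise))
        x, y = x + vx, y + vy
        positions.append((x, y))
    return positions
```

Because each velocity depends linearly on the previous one, a matrix fitted to real tracks reproduces their correlation structure, which is what makes these tracks look more realistic than independent perturbations.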

In the config, you must specify a VAR model file that was calibrated using the statsmodels package. The VAR model files are stored in helm_dhm/simulator/var_models, and an example of fitting a VAR model to data is in src/research/wronk/simulation_dynamics. See the statsmodels documentation for general information on VAR models.


FAME

FAME_flight_pipeline

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 `--config` | Filepath of configuration file | `cli/configs/fame_config.yml` |
| `--experiments` | Glob string pattern of experiment directories to process | None |
| `--steps` | Steps of the pipeline to run; see below for descriptions | None |
| `--batch_outdir` | Output directory for batch-level results | None |
| 💡 🔽 `--use_existing` | Attempt to reuse previous processing output for any steps listed here; see below for options | None |
| 💡 🔽 `--space_mode` | Only output space products; skips most plots | None |
| 💡 `--cores` | Number of processor cores to utilize | 7 |
| `--note` | String to be appended to output directory name | None |
| `--log_name` | Filename for the pipeline log | `FAME_flight_pipeline.log` |
| `--log_folder` | Folder path to store logs | `cli/logs` |
| `--kill_file` | Pipeline will halt if this file is found | `FAME_flight_kill_file` |
| `--priority_bin` | Downlink priority bin (lower number is higher priority) for generated products | 0 |
| `--manifest_metadata` | Manifest metadata (YAML string); takes precedence over file entries | None |
| `--manifest_metadata_file` | Manifest metadata file (YAML) | None |
| `--predict_model` | Path to ML model for motility classification | `cli/models/classifier_labtrain_v02.pickle` |

Steps

This table lists all steps available and indicates which steps can be used with the --use_existing flag. Steps are listed in typical order of use.

| Step Name | Description | `--use_existing` |
| --- | --- | --- |
| preproc | Lowers the resolution from 2048x2048 to 1024x1024 for analysis | TRUE |
| validate | Generates data validation products, including videos and MHIs | TRUE |
| tracker | Track particles in the experiment | TRUE |
| features | Extract features from detected tracks | FALSE |
| train | Train the motility classification model | FALSE |
| predict | Predict motility of all tracks with the classification model | FALSE |
| asdp | Generate ASDP products, including a visualization video | FALSE |
| manifest | Generate file manifest for JEWEL | FALSE |

Most steps depend on output from all previous steps. This table lists step prerequisites.

| Step Name | Prerequisite Steps | Other Reqs |
| --- | --- | --- |
| preproc | N/A | N/A |
| validate | preproc | N/A |
| tracker | preproc, validate | N/A |
| features | preproc, validate, tracker | track_evaluation output (optional) |
| train | preproc, validate, tracker, track_evaluation, features | Track Labels |
| predict | preproc, validate, tracker, features | Pretrained Model |
| asdp | preproc, validate, tracker, features, predict | Track Labels (optional) |
| manifest | N/A | Various validate, tracker, predict, and asdp products (optional) |

There are also pipelines available for common combinations of steps.

| Pipeline Name | Description | Steps |
| --- | --- | --- |
| pipeline_train | Pipeline to train the motility classifier | preproc, validate, tracker, features, train |
| pipeline_predict | Pipeline to predict motility | preproc, validate, tracker, features, predict |
| pipeline_products | Pipeline to generate all products | preproc, validate, tracker, point_evaluation, features, predict, asdp, manifest |
| pipeline_space | Pipeline to generate space-mode products; disables most plotting, especially in validate | preproc, validate, tracker, features, predict, asdp, manifest |

Valid Experiments

An experiment is defined by a unique directory. To be considered valid, an experiment must satisfy the following:

  • Contain the subdirectory Holograms/
  • Have at least 50 valid frames in said subdirectory. Valid frames are:
    • Images with extension .tif
    • Images with resolution 2048x2048
    • These values can be configured in the config.
  • The enumerated names of the images are expected to be consecutive.

Common Usage Examples

A brief reminder that these examples assume you have followed the installation instructions.

Make sure to specify desired configuration parameters in fame_config.yml before executing the pipeline. These use src/cli/configs/fame_config.yml by default.

Validate your Experiment Data

FAME_flight_pipeline \
--experiments "my/experiments/glob/wildcard_*_string" \
--batch_outdir my/experiments/batch_directory \
--steps preproc validate

Note how, by adding a wildcard, you can process multiple experiments at once.

Generate Particle Tracks

FAME_flight_pipeline \
--experiments my/experiments/glob/string \
--batch_outdir my/experiments/batch_directory \
--steps preproc validate tracker \
--use_existing preproc validate

Note how, by specifying --use_existing, the pipeline will use existing preproc and validate step output if they already exist.

Validate, Track Particles, Predict Motility, and Generate Visualization

FAME_flight_pipeline \
--experiments my/experiments/glob/string \
--batch_outdir my/experiments/batch_directory \
--steps pipeline_products \
--use_existing preproc validate tracker

Note that --config and --predict_model can also be specified, but we're just using the default values here.

Tracker outputs

In the output folder, the following subfolders are created:

/plots: plots of all tracks, with each track drawn in a distinct color

/tracks: a subfolder per case, with .track files giving particle coordinates

/configs: the configuration file used for the case

/train classifier: an empty folder, which is needed by train_model.py

Classifier outputs

In the output folder, under /train classifier, you will find the following:

track motility.csv: a log file listing each track with its motility label

yourclassifier.pickle: the trained classifier in pickle form

plots/: ROC curve, confusion matrix, and feature-importance plots (the last only when running Zaki's classifier)

FAME_ground_pipeline

⚠️ The flight pipeline must be run before the ground pipeline.

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 --config | Filepath of configuration file. | cli/configs/fame_config.yml |
| --experiments | Glob string pattern of experiment directories to process. | None |
| --steps | Steps of the pipeline to run. See below for a description of steps. | None |
| --batch_outdir | Output directory for batch-level results. | None |
| 💡 🔽 --use_existing | Attempt to reuse previous processing output for any steps defined here. See description below for options. | None |
| 💡 🔽 --space_mode | Only output space products. Skips most plots. | None |
| 💡 --cores | Number of processor cores to utilize. | 7 |
| --note | String to be appended to output directory name. | None |
| --log_name | Filename for the pipeline log. | FAME_ground_pipeline.log |
| --log_folder | Folder path to store logs. | cli/logs |
| --priority_bin | Downlink priority bin (lower number is higher priority) for generated products. | 0 |
| --manifest_metadata | Manifest metadata (YAML string); takes precedence over file entries. | None |
| --manifest_metadata_file | Manifest metadata file (YAML). | None |

Steps

This table lists all available steps and indicates which steps can be used with the --use_existing flag. Steps are listed in typical order of use.

| Step Name | Description | --use_existing |
| --- | --- | --- |
| validate | Generates data validation products, including videos and MHIs. | TRUE |
| asdp | Generate ASDP products, including a visualization video. | FALSE |

Valid Experiments

An experiment is defined by a unique directory. To be considered valid, an experiment must satisfy the following:

  • Contain the subdirectory Holograms/
  • Have at least 50 valid frames in that subdirectory. Valid frames are:
    • images with the extension .tif
    • images with a resolution of 2048x2048
    • Both values can be configured in the configuration file.
  • Image filenames are expected to be numbered consecutively.
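The checks above can be sketched in Python. This is an illustration only, not the pipeline's actual validator (the function name and details are assumptions); the resolution check, which requires reading each image, is omitted here:

```python
import os
import re

def is_valid_experiment(exp_dir, min_frames=50):
    """Illustrative experiment validity check: a Holograms/ subdirectory,
    enough .tif frames, and consecutively numbered filenames."""
    holo_dir = os.path.join(exp_dir, "Holograms")
    if not os.path.isdir(holo_dir):
        return False

    frames = sorted(f for f in os.listdir(holo_dir) if f.endswith(".tif"))
    if len(frames) < min_frames:
        return False

    # Filenames are expected to be numbered consecutively, e.g. 00001.tif
    numbers = []
    for f in frames:
        match = re.search(r"(\d+)", f)
        if match is None:
            return False
        numbers.append(int(match.group(1)))
    numbers.sort()
    return numbers == list(range(numbers[0], numbers[0] + len(numbers)))
```

The actual pipeline additionally checks the image resolution against the configured value.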

Common Usage Examples

A brief reminder that these examples assume you have followed the installation instructions.

Make sure to specify the desired configuration parameters in fame_config.yml before executing the pipeline. By default, the pipeline uses src/cli/configs/fame_config.yml.

Validate your Experiment Data

```
FAME_ground_pipeline \
    --experiments "my/experiments/glob/wildcard_*_string" \
    --batch_outdir my/experiments/batch_directory \
    --steps preproc validate
```

Note how, by adding a wildcard, you can process multiple experiments at once.

Validate, Track Particles, Predict Motility, and Generate Visualization

```
FAME_ground_pipeline \
    --experiments my/experiments/glob/string \
    --batch_outdir my/experiments/batch_directory \
    --steps pipeline_products \
    --use_existing preproc validate tracker
```

Note that --config and --predict_model can also be specified, but we're just using the default values here.

FAME_analysis_pipeline

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 --config | Filepath of configuration file. | cli/configs/fame_config.yml |
| --experiments | Glob string pattern of experiment directories to process. | None |
| --steps | Steps of the pipeline to run. See below for a description of steps. | None |
| --batch_outdir | Output directory for batch-level results. | None |
| 💡 🔽 --use_existing | Attempt to reuse previous processing output for any steps defined here. See description below for options. | None |
| 💡 🔽 --space_mode | Only output space products. Skips most plots. | None |
| 💡 --cores | Number of processor cores to utilize. | 7 |
| --note | String to be appended to output directory name. | None |
| --log_name | Filename for the pipeline log. | FAME_analysis_pipeline.log |
| --log_folder | Folder path to store logs. | cli/logs |
| --priority_bin | Downlink priority bin (lower number is higher priority) for generated products. | 0 |
| --manifest_metadata | Manifest metadata (YAML string); takes precedence over file entries. | None |
| --manifest_metadata_file | Manifest metadata file (YAML). | None |
| --train_feats | Only uses tracks with labels for model training. | None |
| --predict_model | Path to ML model for motility classification. | cli/models/classifier_labtrain_v02.pickle |
| --toga_config | Override config filepath for TOGA optimization. | None |

Steps

This table lists all available steps and indicates which steps can be used with the --use_existing flag. Steps are listed in typical order of use.

| Step Name | Description | --use_existing |
| --- | --- | --- |
| preproc | Lowers the resolution from 2048x2048 to 1024x1024 for analysis. | TRUE |
| validate | Generates data validation products, including videos and MHIs. | TRUE |
| tracker | Track particles in the experiment. | TRUE |
| point_evaluation | Using track labels, measure point accuracy of the tracker. | TRUE |
| track_evaluation | Using track labels, measure track accuracy of the tracker. | TRUE |
| features | Extract features from detected tracks. | FALSE |
| train | Train the motility classification model. | FALSE |
| predict | Predict motility of all tracks with classification model. | FALSE |
| asdp | Generate ASDP products, including a visualization video. | FALSE |
| manifest | Generate file manifest for JEWEL. | FALSE |

Most steps depend on output from all previous steps. This table lists step prerequisites.

| Step Name | Prerequisite Steps | Other Reqs |
| --- | --- | --- |
| preproc | N/A | N/A |
| validate | preproc | N/A |
| tracker | preproc, validate | N/A |
| point_evaluation | preproc, validate, tracker | Track Labels |
| track_evaluation | preproc, validate, tracker | Track Labels |
| features | preproc, validate, tracker | track_evaluation (optional) |
| train | preproc, validate, tracker, track_evaluation, features | Track Labels |
| predict | preproc, validate, tracker, features | Pretrained Model |
| asdp | preproc, validate, tracker, features, predict | Track Labels (optional) |
| manifest | N/A | Various validate, tracker, predict, asdp products (optional) |
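The prerequisite relationships above form a small dependency graph that must be resolved before steps run. As a rough illustration only (the PREREQS mapping and expand_steps function are hypothetical names based on the table, not the pipeline's actual code), requested steps can be expanded so that prerequisites always run first:

```python
# Prerequisites as listed in the table above.
# track_evaluation is only an optional input to features, and manifest's
# inputs are optional, so neither forces extra steps to run here.
PREREQS = {
    "preproc": [],
    "validate": ["preproc"],
    "tracker": ["preproc", "validate"],
    "point_evaluation": ["preproc", "validate", "tracker"],
    "track_evaluation": ["preproc", "validate", "tracker"],
    "features": ["preproc", "validate", "tracker"],
    "train": ["preproc", "validate", "tracker", "track_evaluation", "features"],
    "predict": ["preproc", "validate", "tracker", "features"],
    "asdp": ["preproc", "validate", "tracker", "features", "predict"],
    "manifest": [],
}

def expand_steps(requested):
    """Return the requested steps plus their prerequisites, in run order."""
    ordered = []
    def visit(step):
        for dep in PREREQS[step]:
            visit(dep)
        if step not in ordered:
            ordered.append(step)
    for step in requested:
        visit(step)
    return ordered
```

For example, requesting only predict expands to preproc, validate, tracker, features, predict, which matches the pipeline_predict bundle below.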

There are also pipelines available for common combinations of steps.

| Pipeline Name | Description | Steps |
| --- | --- | --- |
| pipeline_train | Pipeline to train the motility classifier. | preproc, validate, tracker, track_evaluation, features, train |
| pipeline_predict | Pipeline to predict motility. | preproc, validate, tracker, features, predict |
| pipeline_tracker_eval | Pipeline to evaluate tracker performance. | preproc, validate, tracker, point_evaluation, track_evaluation |
| pipeline_products | Pipeline to generate all products. | preproc, validate, tracker, point_evaluation, track_evaluation, features, predict, asdp, manifest |
| pipeline_space | Pipeline to generate space-mode products. Disables most plotting, especially in validate. | preproc, validate, tracker, features, predict, asdp, manifest |

Valid Experiments

An experiment is defined by a unique directory. To be considered valid, an experiment must satisfy the following:

  • Contain the subdirectory Holograms/
  • Have at least 50 valid frames in that subdirectory. Valid frames are:
    • images with the extension .tif
    • images with a resolution of 2048x2048
    • Both values can be configured in the configuration file.
  • Image filenames are expected to be numbered consecutively.

Common Usage Examples

A brief reminder that these examples assume you have followed the installation instructions.

Make sure to specify the desired configuration parameters in fame_config.yml before executing the pipeline. By default, the pipeline uses src/cli/configs/fame_config.yml.

Validate your Experiment Data

```
FAME_analysis_pipeline \
    --experiments "my/experiments/glob/wildcard_*_string" \
    --batch_outdir my/experiments/batch_directory \
    --steps preproc validate
```

Note how, by adding a wildcard, you can process multiple experiments at once.

Generate Particle Tracks

```
FAME_analysis_pipeline \
    --experiments my/experiments/glob/string \
    --batch_outdir my/experiments/batch_directory \
    --steps preproc validate tracker \
    --use_existing preproc validate
```

Note how, by specifying --use_existing, the pipeline will use existing preproc and validate step output if they already exist.

Train a motility model

Use the pipeline_train step bundle to run the tracker, evaluation, feature generator, and training steps. The --use_existing flag will skip any steps that were previously computed:

```
FAME_analysis_pipeline \
    --experiments my/experiments/glob/string \
    --batch_outdir my/experiments/batch_directory \
    --steps pipeline_train \
    --train_feats \
    --use_existing preproc validate
```

Validate, Track Particles, Predict Motility, and Generate Visualization

```
FAME_analysis_pipeline \
    --experiments my/experiments/glob/string \
    --batch_outdir my/experiments/batch_directory \
    --steps pipeline_products \
    --use_existing preproc validate tracker
```

Note that --config and --predict_model can also be specified, but we're just using the default values here.

Tracker outputs

In the output folder, the following subfolders will be created:

/plots: plots of all the tracks, with each track drawn in a distinct color

/tracks: subfolders for each case, with .track files giving the coordinates

/configs: the configuration file for the case

/train classifier: an empty folder, which is required by train_model.py

Classifier outputs

In the output folder, under /train classifier, you will see the following:

track motility.csv: a log file listing each track with its label

yourclassifier.pickle: the trained classifier in pickle form

plots/: ROC curve, confusion matrix, and feature importance (the last only when running Zaki's classifier)


HIRAILS

HIRAILS_flight_pipeline

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 --config | Filepath of configuration file. | cli/configs/hirails_config.yml |
| --experiments | Glob string pattern of experiment directories to process. | None |
| --steps | Steps of the pipeline to run. See below for a description of steps. | None |
| --batch_outdir | Output directory for batch-level results. | None |
| 💡 🔽 --use_existing | Attempt to reuse previous processing output for any steps defined here. See description below for options. | None |
| 💡 🔽 --space_mode | Only output space products. Skips most plots. | None |
| 💡 --cores | Number of processor cores to utilize. | 7 |
| --note | String to be appended to output directory name. | None |
| --log_name | Filename for the pipeline log. | HIRAILS_flight_pipeline.log |
| --log_folder | Folder path to store logs. | cli/logs |
| --priority_bin | Downlink priority bin (lower number is higher priority) for generated products. | 0 |
| --manifest_metadata | Manifest metadata (YAML string); takes precedence over file entries. | None |
| --manifest_metadata_file | Manifest metadata file (YAML). | None |

Steps

This table lists all available steps and indicates which steps can be used with the --use_existing flag. Steps are listed in typical order of use.

| Step Name | Description | --use_existing |
| --- | --- | --- |
| tracker | Identify particles in the experiment. | TRUE |
| asdp | Generate ASDP products. | FALSE |
| manifest | Generate file manifest for JEWEL. | FALSE |

Most steps depend on output from all previous steps. This table lists step prerequisites.

| Step Name | Prerequisite Steps | Other Reqs |
| --- | --- | --- |
| tracker | N/A | N/A |
| asdp | tracker | Track Labels |
| manifest | N/A | asdp products |

Valid Experiments

An experiment is defined by a unique directory. To be considered valid, an experiment must satisfy the following:

  • Contain the subdirectory Holograms/
  • Have at least 1 valid frame in that subdirectory. Valid frames are:
    • images with the extension .tif
    • images with a resolution of 2200x3208x3
    • Both values can be configured in the configuration file.

Common Usage Examples

A brief reminder that these examples assume you have followed the installation instructions.

Make sure to specify the desired configuration parameters in hirails_config.yml before executing the pipeline. By default, the pipeline uses src/cli/configs/hirails_config.yml.

Process Experiment Data

```
HIRAILS_flight_pipeline \
    --experiments "my/experiments/glob/wildcard_*_string" \
    --batch_outdir my/experiments/batch_directory \
    --steps tracker asdp manifest
```

Note how, by adding a wildcard, you can process multiple experiments at once.

HIRAILS_ground_pipeline

⚠️ The flight pipeline must be run before the ground pipeline.

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 --config | Filepath of configuration file. | cli/configs/hirails_config.yml |
| --experiments | Glob string pattern of experiment directories to process. | None |
| --steps | Steps of the pipeline to run. See below for a description of steps. | None |
| --batch_outdir | Output directory for batch-level results. | None |
| 💡 🔽 --use_existing | Attempt to reuse previous processing output for any steps defined here. See description below for options. | None |
| 💡 🔽 --space_mode | Only output space products. Skips most plots. | None |
| 💡 --cores | Number of processor cores to utilize. | 7 |
| --note | String to be appended to output directory name. | None |
| --log_name | Filename for the pipeline log. | HIRAILS_ground_pipeline.log |
| --log_folder | Folder path to store logs. | cli/logs |
| --priority_bin | Downlink priority bin (lower number is higher priority) for generated products. | 0 |
| --manifest_metadata | Manifest metadata (YAML string); takes precedence over file entries. | None |
| --manifest_metadata_file | Manifest metadata file (YAML). | None |

Steps

This table lists all available steps and indicates which steps can be used with the --use_existing flag. Steps are listed in typical order of use.

| Step Name | Description | --use_existing |
| --- | --- | --- |
| tracker | Identify particles in the experiment. | TRUE |
| asdp | Generate ASDP products. | FALSE |
| manifest | Generate file manifest for JEWEL. | FALSE |

Most steps depend on output from all previous steps. This table lists step prerequisites.

| Step Name | Prerequisite Steps | Other Reqs |
| --- | --- | --- |
| tracker | N/A | N/A |
| asdp | tracker | Track Labels |
| manifest | N/A | asdp products |

Valid Experiments

An experiment is defined by a unique directory. To be considered valid, an experiment must satisfy the following:

  • Contain the subdirectory Holograms/
  • Have at least 1 valid frame in that subdirectory. Valid frames are:
    • images with the extension .tif
    • images with a resolution of 2200x3208x3
    • Both values can be configured in the configuration file.

Common Usage Examples

A brief reminder that these examples assume you have followed the installation instructions.

Make sure to specify the desired configuration parameters in hirails_config.yml before executing the pipeline. By default, the pipeline uses src/cli/configs/hirails_config.yml.

Process Experiment Data

```
HIRAILS_ground_pipeline \
    --experiments "my/experiments/glob/wildcard_*_string" \
    --batch_outdir my/experiments/batch_directory \
    --steps tracker asdp manifest
```

Note how, by adding a wildcard, you can process multiple experiments at once.

HIRAILS_analysis_pipeline

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| 💡 --config | Filepath of configuration file. | cli/configs/hirails_config.yml |
| --experiments | Glob string pattern of experiment directories to process. | None |
| --steps | Steps of the pipeline to run. See below for a description of steps. | None |
| --batch_outdir | Output directory for batch-level results. | None |
| 💡 🔽 --use_existing | Attempt to reuse previous processing output for any steps defined here. See description below for options. | None |
| 💡 🔽 --space_mode | Only output space products. Skips most plots. | None |
| 💡 --cores | Number of processor cores to utilize. | 7 |
| --note | String to be appended to output directory name. | None |
| --log_name | Filename for the pipeline log. | HIRAILS_analysis_pipeline.log |
| --log_folder | Folder path to store logs. | cli/logs |
| --priority_bin | Downlink priority bin (lower number is higher priority) for generated products. | 0 |
| --manifest_metadata | Manifest metadata (YAML string); takes precedence over file entries. | None |
| --manifest_metadata_file | Manifest metadata file (YAML). | None |

Steps

This table lists all available steps and indicates which steps can be used with the --use_existing flag. Steps are listed in typical order of use.

| Step Name | Description | --use_existing |
| --- | --- | --- |
| tracker | Identify particles in the experiment. | TRUE |
| asdp | Generate ASDP products. | FALSE |
| manifest | Generate file manifest for JEWEL. | FALSE |

Most steps depend on output from all previous steps. This table lists step prerequisites.

| Step Name | Prerequisite Steps | Other Reqs |
| --- | --- | --- |
| tracker | N/A | N/A |
| asdp | tracker | Track Labels |
| manifest | N/A | asdp products |

Valid Experiments

An experiment is defined by a unique directory. To be considered valid, an experiment must satisfy the following:

  • Contain the subdirectory Holograms/
  • Have at least 1 valid frame in that subdirectory. Valid frames are:
    • images with the extension .tif
    • images with a resolution of 2200x3208x3
    • Both values can be configured in the configuration file.

Common Usage Examples

A brief reminder that these examples assume you have followed the installation instructions.

Make sure to specify the desired configuration parameters in hirails_config.yml before executing the pipeline. By default, the pipeline uses src/cli/configs/hirails_config.yml.

Process Experiment Data

```
HIRAILS_analysis_pipeline \
    --experiments "my/experiments/glob/wildcard_*_string" \
    --batch_outdir my/experiments/batch_directory \
    --steps tracker asdp manifest
```

Note how, by adding a wildcard, you can process multiple experiments at once.


CSM

CSM_flight_pipeline

Arguments

This table lists all arguments available. They are annotated with emoji flags to indicate the following:

  • ✅ Required
  • 🔼 Increased Runtime
  • 🔽 Decreased Runtime
  • 💡 Useful, often used
  • ❗ Warning, requires deliberate use
| Argument flag | Description | Default Value |
| --- | --- | --- |
| compath | Filepath of COM CSV file. | None |
| 💡 --config | Filepath of configuration file. | cli/configs/csm_config.yml |
| --log_name | Filename for the pipeline log. | CSM_flight_pipeline.log |
| --log_folder | Folder path to store logs. | cli/logs |

Valid COM CSV Files

Valid COM CSV files are generated by /owls-bus/owls-bus-fprime/blob/devel/util/log_processor.py, and contain the bp_CSM.EC_Conductivity channel name.
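A quick way to sanity-check a COM CSV is to scan it for the required channel name. This is only an illustration (the exact CSV layout produced by log_processor.py is not documented here, so the check below simply looks for the channel name anywhere in the file):

```python
def has_ec_conductivity(csv_path):
    """Return True if the COM CSV mentions the required channel name."""
    with open(csv_path) as f:
        return any("bp_CSM.EC_Conductivity" in line for line in f)
```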

Common Usage Examples

Process CSM COM output

```
CSM_flight_pipeline CSM_2_1639618003_239055.csv
```

JEWEL

JEWEL generates an ordering for downlinking ASDPs contained within an ASDP Database (ASDP DB), given a user-specified configuration file.

```
JEWEL [required-args] [optional-args]
```

| Argument flag | Description | Default |
| --- | --- | --- |
| dbfile | The path to the ASDP DB file. | None |
| outputfile | The path to the output CSV file where the ordered data products will be written. | None |
| --config | Path to a config (.yml) file. | cli/configs/jewel_default.yml |
| --log_name | Filename for the pipeline log. | JEWEL.log |
| --log_folder | Folder path to store logs. | cli/logs |

Auxiliary Scripts

Below are several scripts used to support JEWEL by managing the ASDP DB. The first updates the contents of the ASDP DB. The first time the script is invoked, an ASDP DB is initialized and populated; during subsequent invocations, the DB is updated with any new ASDPs that have been generated.

```
update_asdp_db [required-args] [optional-args]
```

| Argument flag | Description | Default |
| --- | --- | --- |
| rootdirs | A list of root directories for each of the ASDP results (e.g., for HELM or ACME); each directory should correspond to a single ASDP. | None |
| dbfile | The path to where the DB file will be stored (currently CSV format). | None |
| --log_name | Filename for the pipeline log. | update_asdp_db.log |
| --log_folder | Folder path to store logs. | cli/logs |
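The create-or-update behavior can be sketched as follows. This is a hedged illustration only: the real ASDP DB schema is not documented here, so the two-column layout (asdp_dir, downlink_status) is an assumption, not the actual format.

```python
import csv
import os

def update_asdp_db(rootdirs, dbfile):
    """Create the DB if missing, then add any ASDP directories not yet present.
    Returns the list of newly added directories."""
    existing = set()
    if os.path.exists(dbfile):
        with open(dbfile, newline="") as f:
            existing = {row["asdp_dir"] for row in csv.DictReader(f)}

    new_dirs = [d for d in rootdirs if d not in existing]
    write_header = not os.path.exists(dbfile)
    with open(dbfile, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["asdp_dir", "downlink_status"])
        if write_header:
            writer.writeheader()  # first invocation initializes the DB
        for d in new_dirs:
            # new ASDPs start out untransmitted
            writer.writerow({"asdp_dir": d, "downlink_status": "untransmitted"})
    return new_dirs
```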

The next script simulates a downlink for testing JEWEL with ground in the loop. It traverses the ordering produced by JEWEL, "downlinks" untransmitted ASDPs, and marks them as transmitted within the ASDP DB.

```
simulate_downlink [required-args] [optional-args]
```

| Argument | Description | Default |
| --- | --- | --- |
| dbfile | The path to the ASDP DB file. | None |
| orderfile | The path to the ASDP ordering file produced by JEWEL. | None |
| datavolume | The simulated downlink data volume in bytes. | None |
| -d/--downlinkdir | Simulate downlink of ASDP files by copying them to this directory. If None, files are still marked as downlinked. | None |
| --log_name | Filename for the pipeline log. | simulate_downlink.log |
| --log_folder | Folder path to store logs. | cli/logs |
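The simulation described above amounts to a greedy walk of the JEWEL ordering under a byte budget. A rough sketch, assuming each ordering entry carries an ID, a size, and a downlink status (the field names and the stop-when-full behavior are illustrative assumptions, not the script's actual logic):

```python
def simulate_downlink(ordering, data_volume):
    """Walk the JEWEL ordering, 'downlinking' untransmitted ASDPs until the
    data volume (in bytes) is exhausted. A negative volume means unlimited.
    Returns the IDs that were downlinked this session."""
    downlinked = []
    remaining = data_volume
    for entry in ordering:
        if entry["status"] == "transmitted":
            continue  # already downlinked in an earlier session
        if remaining >= 0 and entry["size_bytes"] > remaining:
            break  # next product does not fit in this session's budget
        entry["status"] = "transmitted"
        downlinked.append(entry["asdp_id"])
        if remaining >= 0:
            remaining -= entry["size_bytes"]
    return downlinked
```

Passing a negative volume mirrors the `--datavolume -1` usage in the examples below, where everything is downlinked.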

The next script is used to manually set the downlink status of individual ASDPs. This can be invoked during the downlink process, or via a ground command to manually reset the downlink status of an item that was transmitted but not received, for example.

```
set_downlink_status [required-args] [optional-args]
```

| Argument | Description | Default |
| --- | --- | --- |
| dbfile | The path to the ASDP DB file. | None |
| asdpid | Integer ASDP identifier. | None |
| status | The new downlink status, either "transmitted" or "untransmitted". | None |
| --log_name | Filename for the pipeline log. | set_downlink_status.log |
| --log_folder | Folder path to store logs. | cli/logs |

Finally, the last script plots the results of a downlink session, showing the JEWEL ordering, visualizing SUEs and DDs, and providing summary and detailed visualizations for each ASDP. Currently, the script only supports complete downlinks (not partial downlinks) performed with the simulate_downlink script.

```
JEWEL_plot_downlink [required-args] [optional-args]
```

| Argument | Description | Default |
| --- | --- | --- |
| sessiondir | The downlink session directory to which files have been copied via the simulate_downlink command. | None |
| outputdir | The output directory that will contain visualization files, with the main index.html file at the root. | None |

SUE Options

For the HELM and FAME pipelines, there are several methods for calculating science utility estimates (SUEs):

  • sum_confidence: add track motility confidence values within an observation. Statistically, this is equivalent to the expected number of motile tracks within the observation. This method uses a max_sum parameter, which is used to normalize the total sum between 0 and 1.

  • topk_confidence: computes the probability that at least one motile track is present within the top k most confident tracks. The parameter k can be adjusted.

  • intensity: uses an intensity-based SUE calculation to be used specifically with FAME data. To compute a single intensity across the entire observation, there is a track_statistic parameter (either "minimum", "maximum", "median", or "percentile") that summarizes the intensity across each track. If the "percentile" option is used, there is a percentile parameter under track_statistic_params to specify the percentile. Likewise, there is an observation_statistic parameter to summarize the intensity across tracks. Finally, there are weights for red, green, blue, or grayscale channels that can be used to combine intensities across channels. The final intensity is normalized between 0 and 1.
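The first two methods can be written down directly. A minimal sketch, assuming each track contributes a motility confidence in [0, 1] and that confidences are treated as independent probabilities (the function names are illustrative; the parameter names follow the descriptions above):

```python
def sum_confidence_sue(confidences, max_sum):
    """Expected number of motile tracks, normalized to [0, 1] by max_sum."""
    return min(sum(confidences) / max_sum, 1.0)

def topk_confidence_sue(confidences, k):
    """Probability that at least one of the top-k most confident tracks is
    motile, treating the confidences as independent probabilities."""
    top_k = sorted(confidences, reverse=True)[:k]
    p_none_motile = 1.0
    for p in top_k:
        p_none_motile *= (1.0 - p)
    return 1.0 - p_none_motile
```

For example, with two top tracks at 0.5 confidence each, topk_confidence gives 1 - 0.5 * 0.5 = 0.75.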

Common Usage Examples

The following example shows how JEWEL and associated tools can be used as part of the OWLS Autonomy pipeline. First, update the ASDP DB using:

```
$ update_asdp_db --rootdirs [path to ASDP root directories] --dbfile asdpdb.csv
```

The asdpdb.csv file will be created (or updated, if it already exists) to contain the specified ASDPs. This process should be performed before any downlink after new ASDPs are generated. At any time, JEWEL can be invoked using the following command:

```
$ JEWEL --dbfile asdpdb.csv --outputfile jewel_ordering.csv -c jewel_config.yml
```

The ASDPs ordered for downlink will be placed in jewel_ordering.csv. To simulate downlink of the ASDPs, use:

```
$ mkdir downlink
$ simulate_downlink --dbfile asdpdb.csv --orderfile jewel_ordering.csv --datavolume -1 -d downlink
```

This will create a downlink session directory containing the ASDPs. To plot and view the results for this session, use:

```
$ JEWEL_plot_downlink downlink/20220421T154736/ downlink/20220421T154736/visualization
$ open downlink/20220421T154736/visualization/index.html
```

TOGA

Running TOGA on HELM

Generic docs for installing TOGA are here: https://github-fn.jpl.nasa.gov/MLIA/TOGA/blob/master/README.md

```
git clone https://github-fn.jpl.nasa.gov/MLIA/TOGA.git
cd TOGA
conda env create -f envs/toga36-env.yml
source activate toga36
python setup.py develop
```

Using python setup.py develop (rather than pip install .) is critical here: the develop install runs the package from this directory instead of copying files into the pip library.

cli/TOGA_wrapper.py is the main interface between HELM and TOGA. This script reads in a TOGA generated config file along with the experiment directory to run on. It then calls HELM_pipeline via subprocess and reports back to TOGA via a generated metrics.csv file.

In addition to the usual TOGA parameters, a subset of point and/or track evaluation metrics, metrics_names, must be specified in the config (as a list). Each of these metrics is aggregated over the experiments via a simple mean. When multiple metrics are given, TOGA treats all but one as "fixed axes": TOGA does not optimize over fixed axes; instead, it searches for top solutions according to the single "non-fixed" axis across the full spectrum of the others. See the banana-sink example on the TOGA side for a simple multi-dimensional problem.
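The reporting contract between the wrapper and TOGA is small: average each named metric over all experiments, then write a metrics.csv for TOGA to read. A sketch of that aggregation (the exact column layout used by TOGA_wrapper.py is an assumption here; consult the wrapper for the real format):

```python
import csv

def write_metrics(per_experiment_metrics, metric_names, out_path):
    """per_experiment_metrics: one dict per experiment, mapping metric
    name -> value. Each metric is averaged over experiments and written
    as a single CSV row for TOGA to consume."""
    means = {
        name: sum(m[name] for m in per_experiment_metrics) / len(per_experiment_metrics)
        for name in metric_names
    }
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=metric_names)
        writer.writeheader()
        writer.writerow(means)
    return means
```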

Steps on the TOGA side

Since we are committed to maintaining TOGA as a project-agnostic tool, a few configuration tweaks specific to running TOGA on HELM are needed. These are mostly handled via TOGA configuration files, discussed below.

After cloning TOGA, all HELM configuration files are in TOGA/test/run_configurations/HELM/. Each of run_settings.yml, gene_performance_metrics.yml, and genetic_algorithm_settings.yml should be copied to TOGA/toga/config/ to override default TOGA configuration.

Furthermore, the following items in run_settings.yml need to be updated:

  • gene_template should be the absolute path to TOGA/test/run_configurations/HELM/helm_config.yml (the only config not copied to the TOGA/toga/config/ folder)
  • work_dir -> base_dir should name a working directory
  • command -> cmd should have the absolute path to TOGA_wrapper.py
  • command -> static_args should name a valid experiment dir

IMPORTANT: These configs must then be copied to the TOGA environment's lib directory to take effect (see https://github-fn.jpl.nasa.gov/MLIA/TOGA/issues/5, now closed, for details).

Lastly, the environment variable PYTHON needs to be set (in the TOGA virtual environment) to the python executable running HELM. This is the version of python TOGA_wrapper.py will use to run the helm pipeline.

```
which python
export PYTHON="absolute/path/to/conda/python"
```

TODO: HELM should have its own virtual environment to avoid needing the environment variable.

Thomas's debugging tips

  • Important: On MLIA machines, after updating TOGA you may need to copy config files to wherever TOGA is installed before calling toga_server or toga_client. For code changes (inserted prints, etc.), rerun pip install on TOGA. It is easy to get into a state where your working repo does not match the installed version that the toga_server/toga_client commands reference.

  • On starting the TOGA client, does the HELM pipeline run? That is, does the client start printing HELM output?

    • Yes? Continue below
    • No? Look in TOGA's experiment directory (work_dir -> base_dir specified in run_settings). Are there any YAML files in the random_config subdir?
      • Yes? TOGA should be able to call HELM. Put prints in TOGA_wrapper.py on the HELM side.
      • No? TOGA is failing to generate configs. Ensure the HELM config YAML specifies genes correctly (properly indented, with a type and range for each). Put prints in population.py -> create_individual() and mutate(). TODO: the "bool" type is broken; use int in range [0, 1] instead. Issue logged on the TOGA side.
  • Try running with a single worker on a single experiment (so runtime is short). Upon finishing the HELM script, does the client print "{'individual': [uuid], 'status': 'successfully stored'}"?

    • Yes? The client seems to be running correctly. If configs do not appear in the best TOGA experiment subdir, the server may not yet have updated (it prints "running serialization" when this happens), or it may need to be restarted for config changes to take effect.
    • No? If it looks like the HELM run was cut short, the timeout in run_settings.yml may be set too low. Otherwise, the client is probably failing to parse the output metrics after the call to HELM finishes. Prints can be placed at the bottom of TOGA_wrapper.py. Double-check that the metric_names key in helm_config.yml matches those in gene_performance_metrics.yml.

Jake's debugging tips

  • Any warnings that the shell is not configured for conda can be safely ignored.
  • The TOGA client will likely not quit on Ctrl-C; run it inside screen and use Ctrl-A, K, then Y to terminate.
  • Don't tell TOGA to use conda in the run settings on analysis or paralysis