
Visualization

Nick B edited this page Oct 20, 2021 · 3 revisions

If you are visualizing data from a benchmark application (i.e., you used the 'comm_pattern_analysis.sh' script to generate data), we provide two methods to visualize the kernel distance data that ANACIN-X produces for any of the three provided benchmark applications. We recommend using Jupyter Notebooks if you can open them from your machine. If you can't, we provide 2 command line Python tools that: (1) create a .png file showing the relationship between kernel distance and percentage of message non-determinism (the kdts visualization), and (2) create a bar chart of the callstack functions that had the highest impact on kernel distance during an application's runtime (the callstack visualization). Read the Benchmark Application Data Visualization section below for details.

In either case, we provide the user with sample KDTS data within the ANACIN-X project under the sub-directory 'sample_kdts'. The user has the option to visualize the provided sample results or to visualize their own generated results. We describe each visualization method in more detail below.

If you are visualizing data from an external application (i.e., you did not use the 'comm_pattern_analysis.sh' script to generate data), we provide 2 command line Python tools that: (1) create a .png file showing the relationship between kernel distance and slices of an application execution (the kdts visualization), and (2) create a bar chart of the callstack functions that had the highest impact on kernel distance during an application's runtime (the callstack visualization). Read the External Application Data Visualization section below for details.

Benchmark Application Data Visualization

Method 1 - Jupyter

If you can use Jupyter to visualize the data for the project and you generated data from one of the three provided benchmark applications, input the following command from the machine where you ran your copy of ANACIN-X:

jupyter notebook

Within Jupyter, find the file titled visualization.ipynb in the same directory as the script you used to produce your data (that is, the root of your project).

Open this notebook and follow the instructions within it to visualize the kernel distance data. Once you complete the steps in the notebook, you will also find a .png file in the directory you ran the notebook from.

Method 2 - Command Line Visualization

If you can't use Jupyter to visualize the data, then we recommend using the command line python tool to generate the png images. This will take a few key steps:

If you are generating a kdts visualization for your own data, you will first need to identify the inputs to the visualization script. These can be found in 2 ways.

  1. The first way to find the inputs is to look at the last 7 lines printed to standard out from your run. There you can find:
  • The path to your kernel distance (KDTS) data file.
  • The communication pattern used.
  • The path to your kernel config JSON file.
  • The three settings which define message non-determinism percentages: 'Starting Non-determinism Percentage', 'Non-determinism Percentage Step Size', and 'Ending Non-determinism Percentage'.
  • The percentage of topological non-determinism. (For unstructured mesh visualizations.)
  2. The second way to find the inputs is to navigate to your output directory in the following way.
  • Navigate to the directory that your output was stored in from your run and save the full path including the file name of the json file stored there.
  • Open the run_config.txt file stored in that directory and save the name of the communication pattern as listed in the first line item.
  • Find your non-determinism percentage settings on the three lines titled 'Starting Non-determinism Percentage', 'Non-determinism Percentage Step Size', and 'Ending Non-determinism Percentage'.
  • If you ran unstructured mesh, your topological non-determinism percentage can also be found in the run_config.txt file under the line 'Topological Non-determinism Percentage'. Then exit the run_config.txt file.
  • Follow the output directory structure down, according to the inputs you configured your run with, until you find a file titled 'kdts.pkl'. Save the full path to this 'kdts.pkl' file so that it can be used when calling the visualization script.
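
The second way of gathering inputs can be partly automated. The following is a minimal sketch, not part of ANACIN-X: the helper names are hypothetical, and it assumes that each setting in run_config.txt is written as a 'Label: value' line, which may not match the actual file layout. Only the labels themselves are taken from the steps above.

```python
from pathlib import Path

# Labels quoted in the steps above. The "Label: value" line layout is an
# ASSUMPTION about run_config.txt, not something confirmed by ANACIN-X.
ND_LABELS = {
    "Starting Non-determinism Percentage": "nd_start",
    "Non-determinism Percentage Step Size": "nd_step",
    "Ending Non-determinism Percentage": "nd_end",
    "Topological Non-determinism Percentage": "nd_topological",
}


def read_nd_settings(run_config_text):
    """Collect the non-determinism settings from run_config.txt-style text."""
    settings = {}
    for line in run_config_text.splitlines():
        label, sep, value = line.partition(":")
        key = ND_LABELS.get(label.strip())
        if sep and key:
            # Values are kept as strings so they can be passed to the CLI as-is.
            settings[key] = value.strip()
    return settings


def find_kdts_files(output_dir):
    """Return the absolute path of every kdts.pkl found under output_dir."""
    return sorted(p.resolve() for p in Path(output_dir).rglob("kdts.pkl"))
```

If `find_kdts_files` returns more than one path, pick the one whose parent directories match the settings of the run you want to plot.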

Once you've gathered the needed inputs, return to the project directory where you submitted jobs from. From there, input the following command:

python3 anacin-x/event_graph_analysis/visualization/make_message_nd_plot.py KDTS CP KP Ou NDPl NDPs NDPh [--nd_neighbor_fraction NDPt]

Positional Arguments
KDTS  Kernel distance time series file
CP    The type of communication pattern benchmark used (i.e., message_race, amg2013, or unstructured_mesh)
KP    Graph kernel policy file used to generate data
Ou    Output file to store the visualization in, excluding the file type (e.g., output_file or kdts_visualization)
NDPl  Lowest percentage of message non-determinism used in decimal format
NDPs  The message non-determinism percent step size in decimal format
NDPh  Highest percent of message non-determinism used in decimal format

Options:
--nd_neighbor_fraction NDPt  Used only for visualizations of unstructured mesh data to identify the topological non-determinism percent in decimal format
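
If you prefer to assemble this command from a script, the sketch below builds the argument list in the order documented above. The function name `nd_plot_command` is hypothetical and not part of ANACIN-X. The script path, the positional order, and the `--nd_neighbor_fraction` option are taken from the command shown above.

```python
import shlex

SCRIPT = "anacin-x/event_graph_analysis/visualization/make_message_nd_plot.py"


def nd_plot_command(kdts, pattern, kernel_policy, output,
                    nd_low, nd_step, nd_high, nd_neighbor_fraction=None):
    """Build the make_message_nd_plot.py command as an argument list."""
    cmd = [
        "python3", SCRIPT,
        str(kdts),           # KDTS: kernel distance time series file
        pattern,             # CP: message_race, amg2013, or unstructured_mesh
        str(kernel_policy),  # KP: graph kernel policy file
        str(output),         # Ou: output file name without the file type
        str(nd_low),         # NDPl
        str(nd_step),        # NDPs
        str(nd_high),        # NDPh
    ]
    if nd_neighbor_fraction is not None:
        # Only meaningful for unstructured mesh data.
        cmd += ["--nd_neighbor_fraction", str(nd_neighbor_fraction)]
    return cmd


def as_shell_line(cmd):
    """Render the argument list as a single, safely quoted shell line."""
    return shlex.join(cmd)
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the project directory, or printed with `as_shell_line` and pasted into a terminal.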

If you are generating a kdts visualization from provided sample kdts data, then do the following steps:

  1. Get the full path to the root of your ANACIN-X project.
  2. Get the full path to the provided sample kdts file you're using. (Found within the 'sample_kdts/' directory)
  3. Get the name of the communication pattern used from the name of the kdts file you're using. (The file name should be of the form samp__kdts_.pkl)
  4. Then use the following command from the root project directory to generate a .png visualization for the provided data.
python3 anacin-x/event_graph_analysis/visualization/make_message_nd_plot.py KDTS CP KP Ou 0.0 0.1 1.0

Positional Arguments
KDTS  Kernel distance time series file
CP    The type of communication pattern benchmark used (i.e., message_race, amg2013, or unstructured_mesh)
KP    Graph kernel policy file used to generate data (i.e., anacin-x/event_graph_analysis/graph_kernel_policies/wlst_5iters_logical_timestamp_label.json)
Ou    Output file to store the visualization in, excluding the file type (e.g., output_file or kdts_visualization)

This produces a png file that visualizes the relationship between kernel distance and percentage of message non-determinism in your communication pattern. If you gave an absolute path as the output file, the png is placed there. Otherwise, it is placed in the working directory. There is no need to include the file type (.png) at the end of your output file name, as it is attached automatically.
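
To make the output location rule concrete, here is a small sketch that predicts where the image should appear for a given Ou argument. It only restates the behavior described above. It is not the script's own code, and the function name `expected_png_path` is hypothetical.

```python
from pathlib import Path


def expected_png_path(output_arg, working_dir):
    """Predict the png location for an Ou argument, per the rule above."""
    base = Path(output_arg)
    if not base.is_absolute():
        # Relative names are resolved against the working directory.
        base = Path(working_dir) / base
    # Append ".png" rather than replacing a suffix, so a name such as
    # "plot.v2" keeps its full text.
    return base.with_name(base.name + ".png")
```

This can be handy for checking that the file exists after a run, or for building the source path of a copy command.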

If you are generating a callstack visualization, you will need to input the following sequence of commands from within your project directory.

Note, if you are creating visualizations from sample data provided within ANACIN-X, you will not be able to use the following steps to generate a callstack visualization. If you wish to generate a callstack visualization, you will need to generate your own data.

Important: In the second command below, the file apps/comm_pattern_generator/build/comm_pattern_generator is the executable for your project. Make sure that the KDTS data you pass to the commands below was generated by that same executable!

Note also that the second command, which runs the script 'callstack_analysis.py', may take a few minutes to complete:

python3 anacin-x/event_graph_analysis/anomaly_detection.py KDTS anacin-x/event_graph_analysis/anomaly_detection_policies/all.json -o flagged_indices.pkl

Positional Arguments:
KDTS  Kernel distance time series file

python3 anacin-x/event_graph_analysis/callstack_analysis.py FI KDTS apps/comm_pattern_generator/build/comm_pattern_generator

Positional Arguments:
FI    Flagged indices pickle file produced by 'anomaly_detection.py' (will have the same path as kdts.pkl file)
KDTS  Kernel distance time series file

python3 anacin-x/event_graph_analysis/visualization/visualize_callstack_report.py AR --plot_type="bar_chart"

Positional Arguments:
AR    Anomaly report file with same path as kdts.pkl file (i.e., 'non_anomaly_report_for_policy_all.txt')

If the above commands completed with no errors, you will find a file titled 'callstack_distribution.png' within your project directory that will visualize the relative normalized frequencies of callstacks within your communication pattern. In particular, it will visualize the relative likelihood that each callstack function has an impact on non-determinism in your communication pattern.
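
The three callstack commands above can also be generated together. The sketch below is a hypothetical helper, not part of ANACIN-X. It assumes, based on the argument descriptions above, that 'flagged_indices.pkl' and 'non_anomaly_report_for_policy_all.txt' end up in the same directory as your kdts.pkl file. Adjust the paths if your run places them elsewhere.

```python
from pathlib import Path

EGA = "anacin-x/event_graph_analysis"


def callstack_commands(kdts, executable):
    """Return the three callstack visualization commands, in run order."""
    kdts = Path(kdts)
    # ASSUMPTION: both intermediate files sit next to kdts.pkl.
    flagged = kdts.parent / "flagged_indices.pkl"
    report = kdts.parent / "non_anomaly_report_for_policy_all.txt"
    return [
        # 1. Flag anomalous indices in the kernel distance time series.
        ["python3", f"{EGA}/anomaly_detection.py", str(kdts),
         f"{EGA}/anomaly_detection_policies/all.json",
         "-o", "flagged_indices.pkl"],
        # 2. Analyze callstacks; this step may take a few minutes.
        ["python3", f"{EGA}/callstack_analysis.py",
         str(flagged), str(kdts), str(executable)],
        # 3. Draw the bar chart from the anomaly report.
        ["python3", f"{EGA}/visualization/visualize_callstack_report.py",
         str(report), "--plot_type=bar_chart"],
    ]
```

Running each list in order with `subprocess.run(cmd, check=True)` from the project directory stops at the first failing step, so a later command never runs on missing input.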

If you're doing your work and producing visualizations on a remote machine, remember to copy your png image(s) to your local machine using a tool like scp to view the image.

External Application Data Visualization

If you're visualizing data from an external application (i.e., not one of the provided three benchmark applications), use the following collection of commands to visualize data.

If you're generating a kdts visualization, run the following command from within the directory 'anacin-x/event_graph_analysis'.

python3 visualization/visualize_kernel_distance_time_series.py KDTS --plot_type=box

Positional Arguments
KDTS  Kernel distance time series file (Can be found in the directory that stores all your run directories.)

This will create the visualization file titled 'kernel_distance_time_series.png' within the 'anacin-x/event_graph_analysis' directory.

If you're visualizing callstack data, do so with the following commands from within the root directory of your project.

python3 anacin-x/event_graph_analysis/anomaly_detection.py KDTS anacin-x/event_graph_analysis/anomaly_detection_policies/all.json -o flagged_indices.pkl

Positional Arguments:
KDTS  Kernel distance time series file

python3 anacin-x/event_graph_analysis/callstack_analysis.py FI KDTS E

Positional Arguments:
FI    Flagged indices pickle file produced by 'anomaly_detection.py' (will have the same path as kdts.pkl file)
KDTS  Kernel distance time series file
E     Executable file that was traced

python3 anacin-x/event_graph_analysis/visualization/visualize_callstack_report.py AR --plot_type="bar_chart"

Positional Arguments:
AR    Anomaly report file with same path as kdts.pkl file (i.e., 'non_anomaly_report_for_policy_all.txt')

If the above commands completed with no errors, you will find a file titled 'callstack_distribution.png' within your project directory that will visualize the relative normalized frequencies of callstacks within your communication pattern. In particular, it will visualize the relative likelihood that each callstack function has an impact on non-determinism in your communication pattern.

If you're doing your work and producing visualizations on a remote machine, remember to copy your png image(s) to your local machine using a tool like scp to view the image.
