diff --git a/documentation/tutorial/jupyter_notebook_step_by_step_illustration.html b/documentation/tutorial/jupyter_notebook_step_by_step_illustration.html
deleted file mode 100644
index f8519db..0000000
--- a/documentation/tutorial/jupyter_notebook_step_by_step_illustration.html
+++ /dev/null
@@ -1,14170 +0,0 @@

jupyter_notebook_step_by_step_illustration

AI4eLIFE: Easing local image feature extraction using AI.

A data-centric AI for tumor segmentation from [18F]FDG-PET images.


This Jupyter notebook explains how to run the proposed data-centric AI and illustrates how to visualize and interpret the results.

In [1]:
# import important libraries
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# helper python script shipped with the repository
import visualization as view
In [2]:
import warnings
warnings.filterwarnings('ignore')

Step 1: Prepare the data as described in the readme.md file.

For this example, the input data are inside the folder E:/data/input/, and the output data will be saved under the E:/data/output folder.


Let us examine the data structure.

In [3]:
# check the folder structure of the main folder for the input and output data.
# print the folders, subfolders, and files under a given path "main_folder_path".
def list_folders_files(main_folder_path):
    for root_dir, dirs, files in os.walk(main_folder_path):
        folder_level = root_dir.replace(main_folder_path, '').count(os.sep)
        indentation = "-" * 4 * folder_level  # one tab-like step per folder level
        print(f'{indentation}{os.path.basename(root_dir)}')

        sub_indentation = "-" * 4 * (folder_level + 1)

        for get_file in files:
            print(f'{sub_indentation}{get_file}')

main_directory = "E:/data"
list_folders_files(main_directory)
data
----input
--------patient_A
------------gt
----------------gt.nii
------------pet
----------------pet.nii.gz
--------patient_B
------------gt
----------------gt.nii
------------pet
----------------pet.nii.gz
--------patient_C
------------gt
----------------gt.nii
------------pet
----------------pet.nii.gz
--------patient_D
------------ct
----------------ct.nii.gz
------------gt
----------------gt.nii
------------pet
----------------pet.nii.gz
----output

There are two main folders, input and output. The input folder contains the structured data, one subfolder per patient.
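The expected input layout (one folder per patient, each with a pet subfolder and, when available, gt and ct subfolders) can be scaffolded programmatically before copying the NIfTI files in. A minimal sketch using only the standard library; the root path `example_input` is a hypothetical placeholder and the patient IDs are the examples used above:

```python
import os

def scaffold_input(root, patient_ids, modalities=("pet", "gt")):
    """Create root/<patient>/<modality>/ folders for each patient."""
    for pid in patient_ids:
        for modality in modalities:
            os.makedirs(os.path.join(root, pid, modality), exist_ok=True)

# example: two patients with PET and ground-truth subfolders
scaffold_input("example_input", ["patient_A", "patient_B"])
print(sorted(os.listdir("example_input")))  # ['patient_A', 'patient_B']
```

After scaffolding, each pet.nii.gz and gt.nii file is copied into its patient's subfolder to match the tree shown above.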

Step 2: Download/Clone the repository

In [5]:
# Let us clone the repository into a folder named "E:/step-by-step-test"
# create the directory (exist_ok=True avoids an error if it already exists)
os.makedirs("E:/step-by-step-test", exist_ok=True)
In [6]:
# change working directory
os.chdir("E:/step-by-step-test/")
print(f"You are at {os.getcwd()} directory !")
You are at E:\step-by-step-test directory !
In [7]:
# clone (or download) the repository into the created folder:
!git clone https://github.com/KibromBerihu/ai4elife.git

Step 3: Install the package using either the virtual environment or the Docker-based approach.

Kindly refer to the readme.md file for detailed installation instructions.


Change to the cloned ai4elife folder: os.chdir("path/to/cloned/ai4elife/")

In [8]:
os.chdir('E:/step-by-step-test/ai4elife/')

If you chose option one (installation using a virtual environment), activate the virtual environment: conda activate myenv

Step 4: Run the following command

Make sure you activate the virtual environment if you chose virtual-environment-based testing.

In [16]:
# Example
!python test_env.py --input_dir "E:/data/input/" --output_dir "E:/data/output/"

Run the following if you choose Docker-based testing:


Option 1: run_docker_image.bat E:/data/input/ E:/data/output DockerName Latest ai_1


Option 2: docker run -it --rm --name ai_1 -v E:/data/input/:/input -v E:/data/output/:/output DockerName:Latest
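When looping over several datasets, the Option 2 command can be assembled programmatically instead of typed by hand. A small sketch that rebuilds exactly the invocation above (DockerName:Latest and ai_1 are the same placeholders used in this tutorial):

```python
def docker_run_command(input_dir, output_dir, image="DockerName:Latest", name="ai_1"):
    """Build the `docker run` argument list from Option 2 above."""
    return [
        "docker", "run", "-it", "--rm",
        "--name", name,
        "-v", f"{input_dir}:/input",
        "-v", f"{output_dir}:/output",
        image,
    ]

# pass this list to subprocess.run(...), or print it for copy/paste
print(" ".join(docker_run_command("E:/data/input/", "E:/data/output/")))
```

Building the argument list (rather than one shell string) avoids quoting problems when paths contain spaces.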


Step 5: Result visualization and interpretation: examine the output folder

In [10]:
# Let us check the output directory.
main_directory = "E:/data/output/"
list_folders_files(main_directory)
----surrogate_ground_truth.csv
----surrogate_predicted.csv
data_default_3d_dir_
----patient_A
--------ground_truth.nii
--------pet.nii
----patient_B
--------ground_truth.nii
--------pet.nii
----patient_C
--------ground_truth.nii
--------pet.nii
----patient_D
--------ground_truth.nii
--------pet.nii
data_default_MIP_dir
----patient_A
--------ground_truth_coronal.nii
--------ground_truth_sagittal.nii
--------pet_coronal.nii
--------pet_sagittal.nii
----patient_B
--------ground_truth_coronal.nii
--------ground_truth_sagittal.nii
--------pet_coronal.nii
--------pet_sagittal.nii
----patient_C
--------ground_truth_coronal.nii
--------ground_truth_sagittal.nii
--------pet_coronal.nii
--------pet_sagittal.nii
----patient_D
--------ground_truth_coronal.nii
--------ground_truth_sagittal.nii
--------pet_coronal.nii
--------pet_sagittal.nii
predicted_data
----patient_A
--------patient_A_ground_truth.nii
--------patient_A_pet.nii
--------patient_A_predicted.nii
----patient_B
--------patient_B_ground_truth.nii
--------patient_B_pet.nii
--------patient_B_predicted.nii
----patient_C
--------patient_C_ground_truth.nii
--------patient_C_pet.nii
--------patient_C_predicted.nii
----patient_D
--------patient_D_ground_truth.nii
--------patient_D_pet.nii
--------patient_D_predicted.nii
1. Given N patients in the input directory, the system produces N resized (4x4x4 voxel size) and cropped (128x128x256 resolution) 3D volumes under the folder data_default_3d_dir_.

2. It generates the corresponding maximum intensity projections (MIPs), saved under the folder data_default_MIP_dir.

   mip.png

3. The LBFNet uses the generated MIP images under the folder data_default_MIP_dir to predict the lymphoma tumor regions.

4. The predicted images are saved under the folder predicted_data.

5. Two CSV files are generated:

   - The first, surrogate_ground_truth.csv, contains surrogate biomarker features calculated from the ground truth segmentations (lymphoma tumor regions delineated by an expert), if available. If no ground truth is provided along with the PET images, all values in this file will be zero.

   - The second, surrogate_predicted.csv, contains the surrogate biomarkers predicted by the AI algorithm.
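The MIPs described above are simple maximum projections of the 3D volume. A minimal numpy sketch, assuming a resized volume of shape (128, 128, 256); the axis conventions here are illustrative and may differ from the ones used inside ai4elife:

```python
import numpy as np

def mip_pair(volume):
    """Return (sagittal, coronal) maximum intensity projections of a 3D volume.

    volume: array of shape (x, y, z); each MIP collapses one in-plane axis.
    """
    sagittal = volume.max(axis=0)  # collapse the left-right axis -> (y, z)
    coronal = volume.max(axis=1)   # collapse the front-back axis -> (x, z)
    return sagittal, coronal

volume = np.random.rand(128, 128, 256)
sagittal, coronal = mip_pair(volume)
print(sagittal.shape, coronal.shape)  # (128, 256) (128, 256)
```

Each 2D projection keeps the brightest voxel along the collapsed axis, which is why hot tumor regions remain visible in the MIP images.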

Step 6: CSV file interpretation

In [11]:
# read the csv files
df_predicted = pd.read_csv("E:/data/output/surrogate_predicted.csv", encoding='latin1')
df_predicted.head()
Out[11]:
   PID        sTMTV_sagittal  sTMTV_coronal  sTMTV_(mm²)  Sagittal_xy  Sagittal_z  Coronal_xy  Coronal_z  sDmax_(mm)  sDmax_(mm)_euclidean    X    Y    Z
0  patient_A             737            768      24080.0         27.0        51.0        28.0       51.0       628.0            327.780414  4.0  4.0  4.0
1  patient_B             685            788      23568.0         22.0        75.0        29.0       75.0       804.0            448.553230  4.0  4.0  4.0
2  patient_C             396            438      13344.0         29.0       105.0        59.0      106.0      1196.0            652.171757  4.0  4.0  4.0
3  patient_D             851            997      29568.0         38.0        98.0        61.0       96.0      1172.0            619.483656  4.0  4.0  4.0
In [12]:
# Simple statistical description of the data.
df_predicted.describe()
Out[12]:
       sTMTV_sagittal  sTMTV_coronal   sTMTV_(mm²)  Sagittal_xy  Sagittal_z  Coronal_xy   Coronal_z   sDmax_(mm)  sDmax_(mm)_euclidean    X    Y    Z
count        4.000000       4.000000      4.000000     4.000000     4.00000     4.00000    4.000000     4.000000              4.000000  4.0  4.0  4.0
mean       667.250000     747.750000  22640.000000    29.000000    82.25000    44.25000   82.000000   950.000000            511.997264  4.0  4.0  4.0
std        193.667025     231.012806   6766.278741     6.683313    24.45915    18.20943   24.372115   279.761803            151.836761  0.0  0.0  0.0
min        396.000000     438.000000  13344.000000    22.000000    51.00000    28.00000   51.000000   628.000000            327.780414  4.0  4.0  4.0
25%        612.750000     685.500000  21012.000000    25.750000    69.00000    28.75000   69.000000   760.000000            418.360026  4.0  4.0  4.0
50%        711.000000     778.000000  23824.000000    28.000000    86.50000    44.00000   85.500000   988.000000            534.018443  4.0  4.0  4.0
75%        765.500000     840.250000  25452.000000    31.250000    99.75000    59.50000   98.500000  1178.000000            627.655681  4.0  4.0  4.0
max        851.000000     997.000000  29568.000000    38.000000   105.00000    61.00000  106.000000  1196.000000            652.171757  4.0  4.0  4.0
-

Note: PID: unique patient identifier; sTMTV: surrogate total metabolic tumor volume in pixels. The suffixes _coronal and _sagittal indicate that the values were computed from the coronal and sagittal views, respectively. Sagittal_xy and Coronal_xy give the dissemination along the width of the 2D sagittal and coronal PET MIP images, while Sagittal_z and Coronal_z give the dissemination along the height. The following figure illustrates how the dissemination is computed from the coronal view.
dissemination_profile.png


X, Y, and Z are the voxel spacings of the images.
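The columns are internally consistent, which gives a useful sanity check: in the predicted table, sTMTV_(mm²) matches (sTMTV_sagittal + sTMTV_coronal) × (X·Y pixel area), sDmax_(mm) matches the sum of the four dissemination extents times the spacing, and sDmax_(mm)_euclidean matches the Euclidean norm of those extents times the spacing (to display precision). These relationships are inferred from the numbers in the table, not from the package documentation; a minimal check on the patient_A row:

```python
import math

# patient_A row, copied from surrogate_predicted.csv above
stmtv_sag, stmtv_cor = 737, 768
extents = [27.0, 51.0, 28.0, 51.0]  # Sagittal_xy, Sagittal_z, Coronal_xy, Coronal_z
spacing = 4.0                        # X = Y = Z = 4.0

stmtv_mm2 = (stmtv_sag + stmtv_cor) * spacing * spacing
sdmax_mm = sum(extents) * spacing
sdmax_euclidean = math.hypot(*extents) * spacing  # hypot takes >2 args in Python 3.8+

print(stmtv_mm2, sdmax_mm, sdmax_euclidean)  # 24080.0, 628.0, and about 327.78
```

The same arithmetic reproduces the other patients' rows, so the spacing columns really do convert the pixel-based measurements into millimeters.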

In [13]:
# Visualize the distribution of the computed biomarkers.
plot_biomarker = ["sTMTV_(mm²)", "sDmax_(mm)"]
for biomarker in plot_biomarker:
    df_predicted[biomarker].hist()
    plt.title(f"Histogram of {biomarker}")
    plt.show()

Step 7: Visualization of the PET images and their segmentation results

In [14]:
# data path
path = "E:/data/output/predicted_data/"
view.read_predicted_images(path)
Number of cases: 4
(2, 128, 256, 1)

 Image ID: 	 %s patient_A

 Image ID: 	 %s patient_A

(2, 128, 256, 1)

 Image ID: 	 %s patient_B

 Image ID: 	 %s patient_B

(2, 128, 256, 1)

 Image ID: 	 %s patient_C

 Image ID: 	 %s patient_C

(2, 128, 256, 1)

 Image ID: 	 %s patient_D

 Image ID: 	 %s patient_D

Note that if no ground truth is provided, only the predicted images will be displayed.
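Beyond visual inspection, the predicted masks in predicted_data can be compared to the ground truth quantitatively; the Dice coefficient is a common choice for segmentation overlap. A minimal numpy sketch on binary masks (reading the .nii files themselves would additionally require a NIfTI reader such as nibabel, not shown here):

```python
import numpy as np

def dice(ground_truth, predicted, eps=1e-8):
    """Dice overlap between two binary masks (1 = tumor, 0 = background)."""
    gt = np.asarray(ground_truth, dtype=bool)
    pr = np.asarray(predicted, dtype=bool)
    intersection = np.logical_and(gt, pr).sum()
    return 2.0 * intersection / (gt.sum() + pr.sum() + eps)

# toy example: two overlapping rectangular "tumors" on a 128x256 MIP grid
gt = np.zeros((128, 256), dtype=bool); gt[40:60, 100:140] = True
pr = np.zeros((128, 256), dtype=bool); pr[45:60, 100:140] = True
print(round(dice(gt, pr), 3))  # 0.857
```

A Dice of 1.0 means perfect agreement and 0.0 means no overlap; when no ground truth is available (all-zero surrogate_ground_truth.csv), this comparison is not meaningful.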


Good job!
