Data Preparation

Image Import

The use of a database is a prerequisite for developing an ML model in PAI. Please refer to the PMOD Basic Functionality User Guide for instructions on how to create and use databases. In the example below, a database called BraTS was created and the data from the MICCAI BraTS Challenge was imported.

Image Association

A training sample consists of one or several image series, and the segmentation reference result from which the neural network should learn. All of these images need to be associated in the database so that when a single image is referenced all related images are identified.

To associate the images, select a subject in the Subjects list and then all series to be associated in the Series list. From the option menu indicated below select Associate Images, which brings up a dialog window confirming association of the selected series.


To identify which image series is the reference segment map, select it in the list, then the TAG column, and in the menu that appears select the SEGMENT entry. If more than one input image is required for the segmentation, it is important that they always appear in the same order in the association list. Please use the arrow buttons to the right of the list for shifting the position of a selected element.
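
The two requirements above (a series tagged SEGMENT as reference, and the same input order in every sample) can be expressed as a small consistency check. The following is only a sketch of the idea in Python, not part of PMOD: the Series class, the check_associations function and the series descriptions are hypothetical, and it assumes each sample has exactly one reference segment map.

```python
from dataclasses import dataclass

SEGMENT_TAG = "SEGMENT"  # tag name taken from the text above


@dataclass(frozen=True)
class Series:
    description: str  # hypothetical series description, e.g. "T1"
    tag: str = ""     # "SEGMENT" marks the reference segment map


def input_order(association):
    """Return the descriptions of the input series in list order."""
    return [s.description for s in association if s.tag != SEGMENT_TAG]


def check_associations(samples):
    """Return a list of problems found across all samples.

    samples maps a subject name to its ordered association list.
    The first sample defines the expected input order.
    """
    problems = []
    expected = None
    for subject, association in samples.items():
        n_segments = sum(s.tag == SEGMENT_TAG for s in association)
        if n_segments != 1:  # assumption: one reference per sample
            problems.append(
                f"{subject}: expected 1 SEGMENT series, found {n_segments}")
        order = input_order(association)
        if expected is None:
            expected = order
        elif order != expected:
            problems.append(
                f"{subject}: input order {order} differs from {expected}")
    return problems
```

An empty result means that every sample has one reference and lists its input series in the same order as the first sample.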

Existing associations can be checked by selecting one of the image series and activating the button indicated below:


Adding a Descriptive Variable for Training (Project Description)

We strongly recommend adding a descriptive label to the series used for training by defining the Project. This description will be used to check that new data used for Prediction has the same content as that used for Training.

If a difference in the Project description (or number of studies) in Training/Prediction is detected, warning messages based on the following structure will be returned:


Data Cropping

Another part of the data preparation consists of reducing the data volume to the relevant portion. In the brain segmentation example the image should be restricted to the brain. This process can be included in the training set definition by creating a VOI that will serve as the cropping box and associating it with the input data using the same tools as image association.


To achieve this, open the input image, create a suitable VOI such as a box, position it properly and save it to the database. Then select the input image in the Series list on the DB Load page, followed by Associate VOI from the same menu where the images were associated.


In the dialog window that opens, select the saved VOI and activate Set Selected.
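
To illustrate what a box-shaped cropping VOI does to the data volume, the sketch below crops a small volume stored as nested Python lists. It is a conceptual example only and does not reflect how PMOD applies the VOI internally; the function name crop_volume, the [z][y][x] index order and the half-open box convention are assumptions made for this illustration.

```python
def crop_volume(volume, box):
    """Crop a volume given as nested lists indexed [z][y][x].

    box is ((z0, z1), (y0, y1), (x0, x1)); each range includes its
    start index and excludes its end index. Ranges are clamped to
    the extent of the volume.
    """
    (z0, z1), (y0, y1), (x0, x1) = box
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    z0, z1 = max(z0, 0), min(z1, nz)
    y0, y1 = max(y0, 0), min(y1, ny)
    x0, x1 = max(x0, 0), min(x1, nx)
    return [[row[x0:x1] for row in plane[y0:y1]]
            for plane in volume[z0:z1]]


# Toy 4 x 4 x 4 volume whose voxel value encodes its own position.
volume = [[[100 * z + 10 * y + x for x in range(4)]
           for y in range(4)] for z in range(4)]
cropped = crop_volume(volume, ((1, 3), (0, 2), (2, 10)))
```

Here the box keeps 2 x 2 x 2 of the original 4 x 4 x 4 voxels, which is the kind of data reduction the cropping step is meant to achieve.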

Automatic Association Creation

The ML training process requires the preparation of a large number of samples. To make this process easier, a mechanism for the automatic association of the images is available. It uses the Incoming Folder method: a folder defined in the DICOM configuration is regularly checked for data to be imported into the database. The import takes into account information prepared in a CSV file that must also be located in the incoming folder. The structure of such a CSV file is illustrated below:


The label defined in the Project column is assigned to the imported image series. Once imported, Associate Images Automatically can be used to generate the associations. Note that in the example, four images in each sample are used as input for the segmentation. To establish a consistent order, numbers are used in the labels.
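
The role of the numbers in the labels can be sketched as follows. This Python example is an illustration under stated assumptions and is not PMOD code: the label format (a leading number followed by a modality name, such as "1_T1") and the helper names label_rank and order_inputs are hypothetical.

```python
import re
from collections import defaultdict


def label_rank(label):
    """Return the leading integer of a label; labels without one sort last."""
    match = re.match(r"\s*(\d+)", label)
    return int(match.group(1)) if match else float("inf")


def order_inputs(rows):
    """Group (subject, label) pairs by subject and order them by label number."""
    by_subject = defaultdict(list)
    for subject, label in rows:
        by_subject[subject].append(label)
    return {subject: sorted(labels, key=label_rank)
            for subject, labels in by_subject.items()}


# Hypothetical import result: series arrive in arbitrary order.
rows = [
    ("subject_1", "3_T2"), ("subject_1", "1_T1"),
    ("subject_1", "4_FLAIR"), ("subject_1", "2_T1ce"),
    ("subject_2", "2_T1ce"), ("subject_2", "1_T1"),
]
ordered = order_inputs(rows)
```

Sorting on the parsed integer rather than on the label text keeps the order correct when more than nine inputs are used, since a plain text sort would place "10_..." before "2_...".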