Description
Parallel coordinates plots are an interactive way to display the data of a table where each row is represented by a line taking different values on multiple parallel axes. A colour scheme can be applied for further clarity.
Application
Complex data sets can be hard to understand. Looking at two or three variables at the same time is manageable, but understanding the relationships between 5 or 10 variables is much harder.
The parallel coordinates plot makes it possible to visualise such data intuitively. Its interactivity also lets you select specific regions for each variable. This function is mainly used to find non-trivial relationships in a data set and to filter data interactively.
How to use
This function requires a table with two or more numerical columns to work properly.
After selecting the desired table in the “Data” field, numerical columns from this table should be selected in the “Columns” field. These columns will be the parallel axes along which the lines are plotted.
Selecting a non-numerical column at this stage yields an error, as such columns are not suitable for display in parallel coordinates.
It is also possible to colour the lines by one of the numerical columns to get extra information in the final plot. This column does not need to be one of the columns selected as axes.
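The way rows become lines can be sketched in a few lines of plain Python. This is only an illustration of the general technique, not the function's actual implementation: the helper `to_polylines`, the per-axis min/max scaling, and the sample values are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, float]


def to_polylines(rows: Sequence[Row], axes: Sequence[str]) -> List[List[Tuple[int, float]]]:
    """Map each row to one polyline: a vertex (axis index, scaled height) per axis."""
    # Assumption: each axis is scaled independently between its own min and max.
    bounds = {a: (min(r[a] for r in rows), max(r[a] for r in rows)) for a in axes}
    lines = []
    for row in rows:
        line = []
        for i, a in enumerate(axes):
            lo, hi = bounds[a]
            # A constant column has no spread, so place its vertex mid-axis.
            height = 0.5 if hi == lo else (row[a] - lo) / (hi - lo)
            line.append((i, height))
        lines.append(line)
    return lines


# Made-up rows for illustration only; the column names follow the article.
rows = [
    {"alpha": 0.0, "windspeed": 20.0, "CD_mean": 0.30},
    {"alpha": 5.0, "windspeed": 30.0, "CD_mean": 0.40},
    {"alpha": 10.0, "windspeed": 40.0, "CD_mean": 0.35},
]

# "CD_mean" drives the colour here without being one of the axes.
axes = ["alpha", "windspeed"]
polylines = to_polylines(rows, axes)
colour_values = [r["CD_mean"] for r in rows]
```

Each entry of `polylines` is one row of the table drawn across the selected axes, and `colour_values` holds the value a colour scale would be applied to.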
The figures below show the result you would get with the configuration shown above. Each coloured line represents a row from the table. In the left figure, all lines are shown.
To filter lines along an axis, click and drag along that axis to draw a range. A range can be moved by clicking and dragging inside it, or resized by clicking and dragging its ends. It is also possible to create multiple ranges along the same axis.
In the right figure, filters were applied, selecting high and low values of alpha, high values of the windspeed, and low values of CD_mean.
The function can be used as an exploration tool to visualise relationships, but the filtered data can also be saved and reused later in the notebook by ticking the Filter data box below the graph.
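The effect of the range filters can be reproduced outside the plot with a short sketch. It assumes the usual brushing behaviour for parallel coordinates (a row is kept when, on every filtered axis, it falls inside at least one of the ranges drawn on that axis); the article does not state this rule explicitly. The helper `apply_filters`, the range bounds, and the rows are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, float]
Range = Tuple[float, float]


def apply_filters(rows: Sequence[Row], ranges: Dict[str, List[Range]]) -> List[Row]:
    """Keep rows lying inside at least one range on every filtered axis."""
    kept = []
    for row in rows:
        # Assumption: ranges on one axis combine with OR, axes combine with AND.
        if all(
            any(lo <= row[axis] <= hi for lo, hi in axis_ranges)
            for axis, axis_ranges in ranges.items()
        ):
            kept.append(row)
    return kept


# Made-up rows for illustration only; the column names follow the article.
rows = [
    {"alpha": 1.0, "windspeed": 45.0, "CD_mean": 0.31},
    {"alpha": 9.5, "windspeed": 48.0, "CD_mean": 0.30},
    {"alpha": 5.0, "windspeed": 47.0, "CD_mean": 0.29},  # alpha outside both ranges
    {"alpha": 0.5, "windspeed": 25.0, "CD_mean": 0.30},  # windspeed too low
    {"alpha": 9.0, "windspeed": 46.0, "CD_mean": 0.45},  # CD_mean too high
]

filters = {
    "alpha": [(0.0, 2.0), (8.0, 10.0)],  # two ranges: low and high values
    "windspeed": [(40.0, 50.0)],         # high values
    "CD_mean": [(0.25, 0.32)],           # low values
}
filtered = apply_filters(rows, filters)
```

With these sample values, only the first two rows remain in `filtered`, which corresponds to the lines that would stay highlighted in the plot.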
Examples
Here is an example of how to get insight from parallel coordinates (using the dataset used in the data exploration tutorial):
The data set contains simulation results calculating the drag and lift of a spoiler, based on geometrical parameters. An optimal spoiler needs to have a low drag (low CD_mean) and a good downforce (low CL_mean).
By selecting ranges of low values for both CD_mean and CL_mean (see last two axes on the right), the graph clearly shows that optimal designs must have a high width and a high chord (see first two axes on the left), but that the angle of attack alpha can take any value as long as it is not extreme (see the third axis).
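The same reading of the plot can be checked numerically: keep the rows with low CD_mean and CL_mean, then look at the span of the remaining columns. The sketch below is a hedged illustration only. The helper `surviving_ranges`, the thresholds, and every value in `rows` are invented for the example and are not taken from the tutorial data set.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, float]


def surviving_ranges(
    rows: Sequence[Row],
    upper_bounds: Dict[str, float],
    report: Sequence[str],
) -> Dict[str, Tuple[float, float]]:
    """Keep rows at or below every upper bound, then return (min, max) per reported column."""
    kept: List[Row] = [
        r for r in rows if all(r[c] <= limit for c, limit in upper_bounds.items())
    ]
    if not kept:
        return {}
    return {c: (min(r[c] for r in kept), max(r[c] for r in kept)) for c in report}


# Invented values for illustration only; they are NOT from the tutorial data set.
rows = [
    {"width": 1.8, "chord": 0.40, "alpha": 4.0, "CD_mean": 0.20, "CL_mean": -1.2},
    {"width": 1.9, "chord": 0.38, "alpha": 7.0, "CD_mean": 0.22, "CL_mean": -1.1},
    {"width": 1.0, "chord": 0.20, "alpha": 5.0, "CD_mean": 0.35, "CL_mean": -0.4},
    {"width": 1.2, "chord": 0.25, "alpha": 12.0, "CD_mean": 0.50, "CL_mean": -0.9},
]

# Hypothetical thresholds standing in for the ranges drawn on the last two axes.
summary = surviving_ranges(
    rows,
    upper_bounds={"CD_mean": 0.25, "CL_mean": -1.0},
    report=["width", "chord", "alpha"],
)
```

A narrow span in `summary` for a column suggests that the column constrains the selected designs, while a span close to the full range of the column suggests it matters little within that selection.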