Wide-to-long

Modified on Mon, 19 Dec, 2022 at 6:18 PM

Description

This manipulator transforms a wide table into a long table.


Application

Engineering data is recorded in various different ways, whether it comes from a test bench, sensors or even simulations. To use this data to explore results and train models, you will need a consistent format. In some occasions, it means changing the shape of the table to make it easier for a model to learn from it. To learn more about long and wide tables see our FAQ article What is a "long" and a "wide" table?


How to use

After choosing the dataset which to transform from wide to long format you need to fill three fields.

  • Columns
  • Variable name
  • Value name

Consider this example of a wide table:

Each test (defined by the ID column) considers a time series with six time steps. The values of the measurement (for example a pressure) is stored in the columns Value_Number. These column labels consist of the name of the variable which is stored in the column (in the example just called Value) plus the value of the time at which the value was captured. Value_2 therefore translates to Value at t=2s.

We now want to have one column Time and a separate column Value in which each row just represents a single point in time. To achieve that for the example above you need to fill the three fields above like this:

Columns
That are the columns which form the series, that is all columns which contain Value. For the example above you need to assign all six columns (Value_0, Value_1, … Value_5) to this field.
Variable name
Variable here is the variable on which the series depends. That is, the variable which is hidden in the column labels. In our case that is Time. That is we are defining the name of the variable on which the series depends.
Value name
Here we specify the name of the new column containing the values. In our example above we would use Value. If you stick to wide table labels as shown above in the example this is typically the first part of the the column labels.

As you might want to keep your original dataset in the wide format you can tick Save output under different name and change the result of this step under another name.


More on this step

If you do all steps a described above, your result looks like this:

The Time column is not what we are looking for. Instead of the time stamps we now got the former column labels. To get to our final result you can use Quick Columns and Specify Data Formats.

  1. Create a new Quick Columns step and select the dataset result of Wide-to-long. Put in a name for the new column (e.g. Time numerical), select Operation: Split and the Column: Time. The Separator is _ (underscore) and as Index use -1 (which tells the function to use the last element which is our time stamp.
  2. Next apply Specify Data Formats to this new column and change the data type to Float (float64).

These steps will give you the intended result with correct data formats: 

If you need to do the conversion the other way round you can use the step Long-to-wide.

Was this article helpful?

That’s Great!

Thank you for your feedback

Sorry! We couldn't be helpful

Thank you for your feedback

Let us know how can we improve this article!

Select at least one of the reasons
CAPTCHA verification is required.

Feedback sent

We appreciate your effort and will try to fix the article