Description
Custom Code enables users to write and run their own Python code in order to transform a tabular or 3D data set.
Application
Even though the platform offers numerous functions to transform tables, users might sometimes want to transform their data in a way so specific that no general solution is available. This function allows such users to manipulate their data with total freedom and flexibility.
The function can be used, for example, to manually modify the values of a table, to restructure the table, or to create new columns in a complex way that cannot be handled by the Quick Columns function.
How to use
Knowledge of Python programming is required for this step.
- Select the tabular data on which you want to apply the Python code.
- Write the code in the code section.
Here are a few important points:
- The data is stored in a pandas dataframe.
- If the data is a tabular dataset, the dataframe is stored in a variable called df (abbreviation of "dataframe"), regardless of the name of the selected dataset.
- If the data is a 3D dataset, it is stored in a variable called polydata (see "Custom Code on 3D data" below for more details on how to manipulate 3D data).
- Each column of the dataset can be accessed by df["Column Name"]. You need to know their exact names! Use the dataset preview or the Columns function to display the names.
- The dataset outputted by the step is the value of df after the last line of the code. Other variables will not be exposed. Make sure you apply your changes to df (see the last example below and the minimal sketch after this list).
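As a minimal sketch of a complete step, assuming the selected tabular dataset has a numeric column named width expressed in millimetres (a hypothetical column used only for illustration):
# convert the column to metres, then rename it; the final value of df is the step output
df['width'] = df['width'] / 1000
df = df.rename(columns={'width': 'width_m'})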
After your code is written, you have the following options before running the step:
- Enable Deep Copy.
- Add a description to your code that will be visible from the notebook (e.g. "data filtered in my specific way"). If no description is added, the step will just print the custom code (without any line breaks!). Especially if the code becomes longer, adding a description is highly recommended.
- Enable Save output under a different name if the step should not overwrite the initial dataset. If this option is not enabled, the dataset will be overwritten.
Warning: this function gives the user a lot of freedom, but that freedom comes at a cost. Make sure your code is correct before running it. If the code contains an error, the platform will only report that an error occurred, without indicating what the error is. You can then edit the step, modify your code, and re-run the step.
Available Python packages
Note that numpy is already imported as np and pandas is already imported as pd. The following list shows all libraries that are whitelisted, which means that you can import them if you need to:
collections, datetime, itertools, keras, numpy, pandas, pyvista, re, scipy, shapely, sklearn, src
You are not able to import other libraries. If you want more details about pre-imported libraries, contact us.
Package import behavior
Due to the design of the custom code environment, packages have to be imported locally within each function. Using lambda requires loading the library inside a function as well. Therefore, if you try to use a lambda function as in the following code (which replaces a leading "Rain" in a column name by "Sun"):
import re
df = df.rename(columns=lambda x: re.sub('^Rain', 'Sun', x))
you will probably encounter an error such as "Error executing custom code: name 're' is not defined." Instead, you should move the code into a subroutine and import the package re in that local context:
def replace_string(x):
    import re
    return re.sub('^Rain', 'Sun', x)

df = df.rename(columns=replace_string)
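The same pattern applies to the other whitelisted libraries. As a hedged sketch, assuming the table has a column named date_string holding values such as "2023-01-15" (a hypothetical column used only for illustration), the datetime module can be imported inside a helper function in the same way:
def parse_date(value):
    # datetime is imported locally, as described above
    import datetime
    return datetime.datetime.strptime(value, "%Y-%m-%d")

df['date'] = df['date_string'].apply(parse_date)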
Examples
Examples working on tabular data
df['score'] = df['drag']*2/(1-df['lift']*1000)
This code will create a new column score that is calculated from the two columns drag and lift.

df['drag'] = df['drag']*1000-1
This will simply modify the existing column drag according to the operation written.

df = df.loc[df['drag'] > 2]
This will keep the rows for which the value of the drag column is higher than 2. If you want to update your table, make sure to use df = … at the beginning. In the example above, if you only type df.loc[df['drag'] > 2], the table will not be changed.

df['time'] = df.ID.apply(lambda st: int(st[4:]))/100
This will create a column called time by removing the first 4 characters of the column ID, converting the result into an integer, and then dividing it by 100. For instance, if the column ID has the value Step136, the time column would have the numerical value 1.36.

df['critical'] = np.where(df['drag'] < 0.75, 0, 1)
This will create a new column called critical which is 0 if the drag column is lower than 0.75, and 1 otherwise.
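The Application section above also mentions restructuring the table; as a hedged sketch of what that could look like, assuming the table has a categorical column named group in addition to the numeric column drag (the group column is a hypothetical name used only for illustration):
# aggregate the table so that it contains one row per group, with the mean drag value
df = df.groupby('group', as_index=False)['drag'].mean()
As in the filtering example above, the result must be assigned back to df, otherwise the table is not changed.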
Examples working on 3D data
polydata.point_arrays['NewField'] = polydata.points[:, 0]
This code will create a new field called NewField that is assigned the value of the X coordinate at each point (see below for more details on custom code for 3D data).

tempVariable = polydata.points[:, 0].copy()
polydata.points[:, 0] = polydata.points[:, 1]
polydata.points[:, 1] = tempVariable
This code will swap the X and Y coordinates (the .copy() ensures tempVariable keeps the original X values instead of being a view of the points array).

polydata = polydata.triangulate()
This code will transform the mesh to triangular faces (which are easier to manipulate than quadrilateral faces).
Custom Code on 3D data
Here are more details on how the step works for 3D data.
As explained above, the 3D object is stored in the variable called polydata. It contains two different elements:
- Points, which only contain the coordinates:
  - X: polydata.points[:, 0]
  - Y: polydata.points[:, 1]
  - Z: polydata.points[:, 2]
- Point_arrays, which contain the fields attached to the 3D object, for example:
  - polydata.point_arrays['PressureField']
  - polydata.point_arrays['StressField']
  - polydata.point_arrays['TemperatureField']
  - …
It is possible to edit or create new fields, and it is also possible to edit the coordinates X, Y, Z (see last two examples above).
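As a short sketch of how these elements can be combined, assuming the mesh carries a point field named PressureField expressed in Pa (a hypothetical field name used only for illustration):
# create a new field from the coordinates: distance of each point to the origin
polydata.point_arrays['DistanceToOrigin'] = np.linalg.norm(polydata.points, axis=1)
# rescale an existing field from Pa to kPa
polydata.point_arrays['PressureField_kPa'] = polydata.point_arrays['PressureField'] / 1000
# shift the whole mesh by 10 units along Z
polydata.points[:, 2] = polydata.points[:, 2] + 10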