Specify Data Types

Modified on Mon, 02 Jan 2023 at 11:43 AM

Description

With this function you can change the data type of columns in your dataset.


Application

The importers determine and assign the data type of every column, and this usually works well (you can check the data types with the Data Types function). If an importer assigned the wrong data type, you can use this function to set it correctly. A wrong assignment typically points to an inconsistency in the data (e.g. one row of a numerical column contains a string, which results in the Categorical data type for the entire column). You should therefore check the data and fix the underlying issue before changing the data type.
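To illustrate how a single inconsistent value affects a whole column, here is a minimal sketch in plain pandas, run outside the platform. The column name and values are made up for illustration, and the importers may infer types differently in detail.

```python
import pandas as pd

# Hypothetical example data: one row of a numerical column holds a string.
df = pd.DataFrame({"Temperature": [20.5, 21.0, "n/a", 19.8]})

# The stray string makes pandas store the whole column as object (Categorical).
print(df["Temperature"].dtype)

# Try to read every entry as a number; entries that cannot be parsed become NaN.
as_numbers = pd.to_numeric(df["Temperature"], errors="coerce")

# Rows that held a value originally but failed to parse are the ones to fix.
bad_rows = df[as_numbers.isna() & df["Temperature"].notna()]
print(bad_rows)
```

Once the offending rows are corrected or removed at the source, changing the column to Float should be straightforward.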

If you create a new column with Quick Columns and its data type is not inferred correctly, you can also use this function to set the correct data type.


How to use

  • Create the step and assign a tabular dataset to it in the field Data.
  • In the field Columns to change select all columns for which you want to change the data type.
  • For each selected column a new field appears in which you can set the data type for that column. The current data type of the column is preset in the field.
  • Use the new fields to set the required data type for each selected column.
  • Once you have finished the settings, click Apply to execute the step.

Available Data Types

Data type | Description | Python equivalent
Float (float64) | Floating point number. Use this for any numerical data. | float
Boolean (bool) | Columns with this data type can hold only one of two values (True/False, 0/1). | bool
Categorical (object) | Use this for any non-numerical columns. Typically, that would be Strings (i.e. text). | object
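For readers who know pandas, the three data types above roughly correspond to the following conversions. This is only a sketch in plain pandas with made-up column names. It is not the platform's internal implementation.

```python
import pandas as pd

# Hypothetical dataset in which every column has an unsuitable data type.
df = pd.DataFrame({
    "Pressure": ["1.5", "2.0", "3.25"],  # numbers stored as text
    "Valid": [1, 0, 1],                  # 0/1 flags stored as integers
    "BatchId": [101, 102, 103],          # identifiers, not quantities
})

# Map each column to one of the three data types offered by this function.
converted = df.astype({
    "Pressure": "float64",  # Float
    "Valid": "bool",        # Boolean
    "BatchId": "object",    # Categorical
})
print(converted.dtypes)
```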

More on this step

The tabular models on the platform convert any numerical input or output to Float. Therefore, this function doesn’t offer the data type Integer as an option for numerical values. If you need to convert a column to Integer (e.g. for other downstream transformations) you can do so with Quick Columns' custom code operation by applying the following code:

df["ColumnName"].astype(int)

You could either create a copy of the original column under a new name (and with changed data type) or overwrite the existing column.
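The two options can be sketched in plain pandas as follows. The name ColumnName_int is a hypothetical placeholder, and how Quick Columns stores the result of the custom code operation is not shown here.

```python
import pandas as pd

# Hypothetical dataset with a Float column that holds whole numbers.
df = pd.DataFrame({"ColumnName": [1.0, 2.0, 3.0]})

# Option 1: keep the original column and store the Integer copy under a new name.
df["ColumnName_int"] = df["ColumnName"].astype(int)

# Option 2: overwrite the existing column with its Integer version.
df["ColumnName"] = df["ColumnName"].astype(int)

print(df.dtypes)
```

Note that in pandas, astype(int) raises an error if the column contains missing values, so those would need to be handled first.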

Converting a column to Integer is not possible with the Custom code function, because that function preserves the data types of all columns involved.
