Auto-sampling for large data sets
If your data set has more than 10,000 rows, your browser takes longer to generate plots, which slows down the whole notebook. To avoid this, whenever a data set exceeds 10,000 rows the platform automatically selects 10,000 points at random to display. This improves performance and also makes the plot easier to read, since more than 10,000 points are hard to visualise and understand. If you really need to see all the data and don't mind slowing down the platform, you can tick the option to show all data points.
To keep plots consistent, the same random seed is applied throughout the platform, so a given data set always shows the same sample of points, even in different notebooks.
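The behaviour described above can be illustrated with a minimal Python sketch. It is not the platform's actual implementation: the function name `sample_for_plot`, the `show_all` flag and the seed value are hypothetical, and only the 10,000-row threshold comes from this article.

```python
import random

MAX_POINTS = 10_000  # threshold stated in this article
SEED = 42            # hypothetical value; the platform's real seed is not documented here


def sample_for_plot(rows, show_all=False, max_points=MAX_POINTS, seed=SEED):
    """Return the rows to plot.

    All rows are returned when the data set is small enough or when the
    user asks to show every point; otherwise a seeded random sample of
    max_points rows is returned.
    """
    rows = list(rows)
    if show_all or len(rows) <= max_points:
        return rows
    # A local generator with a fixed seed gives the same sample on every
    # call without touching the global random state.
    rng = random.Random(seed)
    return rng.sample(rows, max_points)


if __name__ == "__main__":
    data = list(range(25_000))
    first = sample_for_plot(data)
    second = sample_for_plot(data)
    print(len(first))       # number of points that would be plotted
    print(first == second)  # same seed, same data, same sample
```

Because the generator is re-created from the same seed on every call, two plots built from the same data set receive identical points, which is the consistency property described above.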