FAQ: How do FTP batch size and page size work together to impact performance?

Thanks to @davidgollom for his help on this question (originally posted on Fine-tune integrator.io for optimal data throughput). This post is brought to you by our awesome PM, @viliandyleonardo. Dave reported that setting his batch size to 20 greatly improved performance.

Q: Should I set batch size higher for an FTP export when processing small files? Does the batch size impact the page size?

A:

  • The FTP export Page size defines the number of records that you want IO to group for processing and sending downstream to be imported into their destination.
  • The FTP export Batch size specifies the number of files you want IO to export from the FTP site in the same session without having to spend time logging out and logging back in to the server for each file.
  • FTP batch size does not impact page size, but the two settings work together to control the integration's overall performance.
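To make the distinction concrete, here is a minimal sketch in Python. It is only an illustration, not how IO is implemented: the function name `export_plan` and the file names are hypothetical, and it assumes that pages are built per file. It counts how many FTP sessions and how many pages a given set of files would produce.

```python
import math
from typing import Dict, Tuple


def export_plan(files: Dict[str, int], batch_size: int, page_size: int) -> Tuple[int, int]:
    """Return (sessions, pages) for a hypothetical FTP export.

    files maps a file name to the number of records it contains.
    Assumption: batch size groups FILES per login session, while
    page size groups RECORDS, and pages do not span files.
    """
    if batch_size < 1 or page_size < 1:
        raise ValueError("batch_size and page_size must be at least 1")
    # One session (login/logout) per group of up to batch_size files.
    sessions = math.ceil(len(files) / batch_size)
    # Each file is split into pages of up to page_size records.
    pages = sum(math.ceil(records / page_size) for records in files.values())
    return sessions, pages


# Hypothetical files and record counts, for illustration only.
sample = {"a.csv": 50, "b.csv": 120, "c.csv": 5}
print(export_plan(sample, batch_size=2, page_size=50))
print(export_plan(sample, batch_size=20, page_size=50))
```

In this model, raising the batch size lowers the number of sessions but leaves the page count unchanged, which mirrors the point that batch size does not impact page size.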

If you expect to export many large files, setting a very high batch size might cause network timeout errors. However, if you expect many small files, a higher batch size is advisable so that IO does not process one small file at a time, which leads to a more performant integration. A batch size of 20 is a good starting point. The new error management (to be released) has a chart that plots average processing time/success event over time, and you can use this chart to analyze your flow's performance and tweak the batch size as you see fit.
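The trade-off can be sketched with a rough back-of-the-envelope model. Everything here is an assumption for illustration: the function names, the 2-second login overhead, and the per-file transfer times are made-up numbers, not measurements from IO. The model also assumes a session stays open until every file in its batch has transferred.

```python
import math


def estimated_total_seconds(file_count: int, batch_size: int,
                            login_overhead_s: float, per_file_s: float) -> float:
    """Total time = one login overhead per session + transfer time for every file."""
    sessions = math.ceil(file_count / batch_size)
    return sessions * login_overhead_s + file_count * per_file_s


def longest_session_seconds(file_count: int, batch_size: int,
                            login_overhead_s: float, per_file_s: float) -> float:
    """How long the fullest session stays open (a proxy for timeout risk)."""
    files_in_session = min(file_count, batch_size)
    return login_overhead_s + files_in_session * per_file_s


# Many small files (hypothetical: 100 files, 0.5 s each, 2 s to log in).
print(estimated_total_seconds(100, 1, 2.0, 0.5))    # one file per session
print(estimated_total_seconds(100, 20, 2.0, 0.5))   # twenty files per session

# Many large files (hypothetical: 120 s each): sessions get very long.
print(longest_session_seconds(100, 20, 2.0, 120.0))
```

Under these assumed numbers, the login overhead dominates for small files, so a larger batch size cuts the total time sharply. For large files the saving is small, while each session stays open far longer, which is where timeouts become more likely.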

The Concurrency setting on the FTP connection provides another control for tuning your FTP export performance. Setting this field allows you to open multiple connections to the same FTP site and export multiple files from it at the same time. Note that the FTP connection concurrency setting only works for an export step; it does not apply to an import step.
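As a generic illustration of what a concurrency limit means, the sketch below fetches several files in parallel while capping the number of simultaneous connections. It does not use IO or a real FTP server: `export_files` and `FakeFtp` are hypothetical stand-ins, and the fake server simply sleeps briefly and records how many fetches ran at once.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def export_files(names: List[str], fetch: Callable[[str], bytes],
                 concurrency: int) -> List[bytes]:
    """Fetch every file, using at most `concurrency` workers at a time."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() returns results in the same order as `names`.
        return list(pool.map(fetch, names))


class FakeFtp:
    """Stand-in for a server: sleeps briefly and tracks peak parallelism."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def fetch(self, name: str) -> bytes:
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.01)  # pretend to transfer the file
        with self._lock:
            self._active -= 1
        return name.encode()


server = FakeFtp()
contents = export_files(["a.csv", "b.csv", "c.csv", "d.csv"], server.fetch, concurrency=2)
print(len(contents), "files fetched; peak parallel connections:", server.peak)
```

The peak never exceeds the configured limit, which is the behavior a concurrency setting is meant to guarantee.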