As you can see here, only 3,000 orders were inserted into Snowflake, and only 1 page was processed. This means the page contained only 3,000 orders instead of the expected 12,000. Is there a cap on page size here? I first tried this with a page size of 60 and then 120, and got the same number of orders per page both times.
Yes, each page is limited to 5MB in size. We automatically cap the number of records on a page when this size limit is reached, and route the remaining records to the next page of data.
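To make that capping behavior concrete, here's a rough Python sketch of the idea (this is not the actual platform code, just an illustration): a page fills up until either the configured record count or the 5MB byte limit would be exceeded, and the overflow rolls into the next page. With records that each bundle 100 orders, the byte cap kicks in well before the record-count cap, which is why raising the page size setting from 60 to 120 changed nothing.

```python
import json

PAGE_SIZE_LIMIT_BYTES = 5 * 1024 * 1024  # the 5MB per-page cap described above

def paginate(records, max_records_per_page):
    """Split records into pages, honoring both the configured record count
    and the 5MB byte cap (hypothetical logic, for illustration only)."""
    pages, current, current_bytes = [], [], 0
    for record in records:
        record_bytes = len(json.dumps(record).encode("utf-8"))
        # Close the current page if adding this record would exceed either cap.
        if current and (len(current) >= max_records_per_page
                        or current_bytes + record_bytes > PAGE_SIZE_LIMIT_BYTES):
            pages.append(current)
            current, current_bytes = [], 0
        current.append(record)
        current_bytes += record_bytes
    if current:
        pages.append(current)
    return pages
```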
In general, we don't recommend maxing out pages in terms of MBs; the 5MB limit is meant to be an upper bound you should never need to hit (it mainly helps us size and scale the servers in our backend).
There is no performance advantage to having 10 vs. 50 pages of data in a typical flow. Your biggest lever for increasing data throughput is the "Concurrency level" field on the connection resource: setting the concurrency on your Snowflake connection to 10 or 20 will drastically speed things up. For your specific setup outlined above, I highly recommend setting the page size field to 10 at most. Since each of your records contains 100 orders, each page will then carry 1,000 orders, which is a perfect per-request size for Snowflake data loading IMO.
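To illustrate why concurrency is the bigger lever, here's a minimal Python sketch using a hypothetical `load_page_into_snowflake` function standing in for one insert request. With a concurrency level of 10, ten pages are in flight at once, so total wall time scales with pages divided by concurrency rather than with the raw page count:

```python
from concurrent.futures import ThreadPoolExecutor

CONCURRENCY_LEVEL = 10  # mirrors the "Concurrency level" field on the connection

def load_page_into_snowflake(page):
    """Hypothetical stand-in for one INSERT request carrying one page of records."""
    ...

def load_all_pages(pages):
    # With 10 workers, 10 pages are in flight at once, so wall time is
    # roughly (number of pages / concurrency level) * time per request.
    with ThreadPoolExecutor(max_workers=CONCURRENCY_LEVEL) as pool:
        list(pool.map(load_page_into_snowflake, pages))
```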
I was wondering: if, instead of doing "one to many" in my Snowflake import bubble, I specify the path to the individual records in the Amazon source export, would this reduce my page size? This might be a small tweak that helps with the overall performance.
Yeah, I definitely recommend doing exactly that. Once you make that change, set 'Page size' on your Amazon source export to 100 and test the performance, then try 1000 and test again, and pick a final number between 100 and 1000 based on the results of your perf tests. If you don't have time to run the perf tests, just go with 100; that should definitely be fast enough to get started.
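For anyone following along, here's a rough sketch of what specifying the record path accomplishes. It assumes the nested array in the Amazon payload is called `orders` (that name is just a guess; substitute the real path): the export emits individual orders as records, so 'Page size' counts orders directly and page byte sizes stay small and predictable.

```python
def flatten_orders(source_records, path="orders"):
    """Yield each nested order as its own record. The field name 'orders'
    is an assumption about the Amazon payload shape, for illustration only."""
    for record in source_records:
        for order in record.get(path, []):
            yield order

# Once individual orders are the records, a 'Page size' of 100 means exactly
# 100 orders per page, instead of 100 bundles of 100 orders each.
```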