Understanding the “sample” Parameter in API Datasource – What Does It Do?

Hello Team, I am using the open source BI tool Helical Insight and trying to use an API as a data source.

While configuring an API as a datasource in Helical Insight 6.1 GA, I noticed a parameter called sample mentioned towards the end of the configuration (e.g., "sample": "10").

The blog post I am following to connect to the API: https://www.helicalinsight.com/connect-and-use-an-api-as-a-data-source-in-helical-insight-5-0/

Could you please explain:

  • What is the purpose of this parameter?
  • How does it affect the datasource or schema generation?

Thanks,
Vema.

Hello,

The sample parameter in an API datasource defines the number of records fetched as a sample for schema detection.

What it does:

When Helical Insight connects to an API datasource, it needs to understand the structure (schema) of the incoming JSON data — such as:

  • Field names
  • Data types
  • Nested structure

Instead of scanning the entire dataset, it uses a sample set of records to infer this schema.

How to use it:

{
...
"sample": "10"
...
}

This means:

  • The system will fetch 10 records from the API
  • Based on those records, it will automatically determine the schema
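
To illustrate the idea, here is a minimal Python sketch of sample-based schema inference. It is not Helical Insight's actual implementation; the function names (flatten, infer_schema) and the example records are hypothetical, and it assumes the API response has already been parsed into a list of JSON objects.

```python
def flatten(record, prefix=""):
    """Yield (dotted_path, value) pairs for a possibly nested JSON object."""
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects so their fields get their own paths.
            yield from flatten(value, path)
        else:
            yield path, value


def infer_schema(records, sample="10"):
    """Map each field path to the set of type names seen in the sampled records.

    `sample` is accepted as a string to mirror the "sample": "10" setting;
    how the real product parses it is an assumption here.
    """
    limit = int(sample)
    schema = {}
    for record in records[:limit]:
        for path, value in flatten(record):
            schema.setdefault(path, set()).add(type(value).__name__)
    return schema


# Hypothetical API response, already parsed from JSON.
records = [
    {"id": 1, "name": "Asha", "address": {"city": "Pune"}},
    {"id": 2, "name": "Ravi", "address": {"city": "Hyderabad"}, "score": 7.5},
]

schema = infer_schema(records, sample="10")
for field, types in sorted(schema.items()):
    print(field, sorted(types))
```

Only the first `sample` records are inspected, so the field names, data types, and nested structure in the result reflect just those records.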

Why it is useful:

  • Improves performance by avoiding a full data scan
  • Speeds up metadata creation
  • Captures a representative structure, provided the sampled records are typical of the full response

Important Note:

  • If the sample size is too small, some fields may be missed
  • If it’s too large, it may slow down the schema detection process
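
The first point can be shown with a small Python sketch. This is an illustration under assumed data, not the product's code: sampled_fields is a hypothetical helper, and the records are made up so that one field appears only in the 12th record.

```python
def sampled_fields(records, sample):
    """Return the top-level field names seen in the first `sample` records."""
    fields = set()
    for record in records[:sample]:
        fields.update(record.keys())
    return fields


# Hypothetical data: 20 records, with an extra field only in the 12th one.
records = [{"id": i, "name": f"user{i}"} for i in range(20)]
records[11]["discount"] = 0.15

small = sampled_fields(records, 10)   # sample stops before the 12th record
large = sampled_fields(records, 50)   # sample covers all 20 records

print("discount" in small)
print("discount" in large)
```

With a sample of 10 the discount field never enters the schema, while a sample of 50 picks it up at the cost of reading more records.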

Recommendation:

Use a moderate value like 10–50, depending on how varied your API response is.

This parameter is especially helpful when dealing with dynamic or complex JSON APIs, where structure may vary across records.

Thank You,
Helical Insight.