When working with datasets that contain missing values, attempting calculations on these incomplete rows can result in undefined outputs.
Why use a fallback value?
To keep calculations from returning undefined results, you can set a fallback value for missing data.
Setting a fallback value allows calculations to proceed even when data is missing, by substituting a predefined value in place of the missing data. This can be particularly useful in maintaining continuity in your analysis. However, it's important to use this feature thoughtfully.
- Context Matters: Replacing missing values can lead to misleading results if the replacement value does not accurately represent the missing data. Always consider the specific context of your dataset before applying a fallback value.
- Data Integrity: Ensure that substituting missing values does not compromise the integrity of your analysis. Incorrect replacement can result in skewed or inaccurate insights.
- Consult Guidance: If you're unsure how to handle missing data, consult your data governance team or refer to established guidelines for your particular use case.
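To illustrate why the choice of fallback matters, here is a minimal Python sketch. It is not how the Explore Data tool is implemented. The function name `mean_with_fallback` and the sample readings are hypothetical, and missing values are represented as `None` purely for illustration.

```python
from statistics import mean


def mean_with_fallback(values, fallback):
    """Average the values, substituting `fallback` for each missing (None) entry."""
    filled = [fallback if v is None else v for v in values]
    return mean(filled)


# Hypothetical sample: two observed readings and two missing ones.
readings = [10.0, None, 14.0, None]

# A fallback of 0 pulls the average well below the observed readings.
with_zero = mean_with_fallback(readings, 0.0)

# A fallback close to the observed readings leaves the average unchanged.
with_twelve = mean_with_fallback(readings, 12.0)

# For comparison: the average of the observed readings only.
observed_only = mean(v for v in readings if v is not None)

print(with_zero, with_twelve, observed_only)
```

In this sample, a fallback of 0 halves the average relative to the observed readings, while a fallback of 12 leaves it unchanged. Which value is appropriate depends on what a missing entry actually means in your dataset.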
Setting a fallback value
To set a fallback value for missing data, follow these steps:
- Within the Explore Data tool, click 'Calculation', then select 'Manage missing data fallback'
- Toggle on the 'Enable missing data fallback' option
- Enter the missing data fallback value you wish to use
- Click 'Save' to apply the fallback value
What to expect
Once you’ve set a fallback value, any calculations that encounter missing data will use the fallback value you provided. This will allow the calculation to return a result, even when the dataset contains gaps.
Important Note: Setting a fallback value does not alter the underlying raw data or how it is displayed. The data table will continue to show missing values in their original form. However, the calculations derived from this data will incorporate the fallback value.
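The behavior described above can be sketched in Python as follows. This is an illustrative model only, not the tool's actual implementation. The `row_totals` function, the `price` and `quantity` fields, and the use of `None` for missing values are all assumptions made for the example.

```python
def row_totals(rows, fallback=None):
    """Compute price * quantity per row without modifying `rows`.

    If `fallback` is given, it stands in for any missing (None) input.
    Without a fallback, a row with a missing input yields None.
    """
    totals = []
    for row in rows:
        price = row["price"]
        quantity = row["quantity"]
        if fallback is not None:
            price = fallback if price is None else price
            quantity = fallback if quantity is None else quantity
        if price is None or quantity is None:
            totals.append(None)  # undefined result: an input is missing
        else:
            totals.append(price * quantity)
    return totals


# Hypothetical data: the second row has a missing quantity.
rows = [
    {"price": 2.5, "quantity": 4},
    {"price": 3.0, "quantity": None},
]

without_fallback = row_totals(rows)           # second result is undefined
with_fallback = row_totals(rows, fallback=1)  # second result uses the fallback

print(without_fallback, with_fallback)
print(rows[1]["quantity"])  # the raw row still holds the missing value
```

The substitution happens only on local copies of the values inside the calculation, so the source rows keep their missing entries, mirroring how the data table continues to show missing values in their original form.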
In the source dropdown, a disclaimer will appear to inform users that a fallback value has been applied and specify what that value is. This ensures transparency in how the data is being processed.
Use this feature judiciously to maintain the accuracy and reliability of your analysis.