Validating and uploading your data files

Ensure your data is clean, accurate, and ready for analysis by applying validations during file uploads. This guide explains how to add validations, handle upload errors, and manage column mismatches.

When uploading Excel or CSV files to your dataset, you can add specific validations to each column to ensure data consistency and accuracy. Follow these steps to validate and upload your data:

Adding Column Validations

  1. Start Validation: In the "Validation" step of the upload process, click "Add Validation".

  2. Select a Column: Choose the column you want to validate.

  3. Allow Empty Cells: Specify whether the column can contain empty cells.

  4. Choose a Validation Type:

    • Number: Requires the data to be numerical.
    • Whole Number: Requires the data to be a whole number.
    • List: Provide a list of acceptable values (one per line) that the data must match.
    • Regular Expression: Choose a template from the dropdown or input a custom regular expression for specific formatting rules.
  5. Finalise Validation: Click "Add Validation" to save your column validation.
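
The four validation types can be approximated locally if you want to pre-check values before uploading. The following is a minimal Python sketch, not the product's actual implementation: the function name `is_valid`, the type labels, and the exact matching rules (for example, trimming whitespace and requiring the whole value to match the regular expression) are assumptions made for illustration.

```python
import math
import re


def is_valid(value, kind, allow_empty=False, allowed=None, pattern=None):
    """Return True if one cell value passes the chosen validation type.

    Hypothetical helper: the type names and matching rules are assumptions.
    """
    text = "" if value is None else str(value).strip()

    # Empty cells pass only when the column allows them.
    if text == "":
        return allow_empty

    if kind == "number":
        try:
            return math.isfinite(float(text))
        except ValueError:
            return False

    if kind == "whole_number":
        return re.fullmatch(r"[+-]?\d+", text) is not None

    if kind == "list":
        # 'allowed' holds the acceptable values, one per entry.
        return text in (allowed or [])

    if kind == "regex":
        # Assumes the whole value must match the expression.
        return re.fullmatch(pattern, text) is not None

    raise ValueError(f"unknown validation type: {kind}")
```

For example, `is_valid("4.2", "whole_number")` returns `False`, while `is_valid("", "number", allow_empty=True)` returns `True`.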

Uploading Data with Validations

  1. After setting your validations, proceed to the final step of the upload and click "UPLOAD DATA".
  2. If the upload contains validation errors:
    • 5 Rows or Fewer: If 5 or fewer rows contain errors, you'll receive an immediate notification of the errors.
    • More Than 5 Rows: If more than 5 rows contain errors, click "See details" to review a summary of the errors in a popup window, then make the necessary corrections.
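
To catch errors before they appear in the upload step, you could scan a CSV file locally first. The sketch below uses Python's standard `csv` module; the function name `find_invalid_rows`, the `rules` mapping, and the sample data are hypothetical and only illustrate the idea of collecting failing rows.

```python
import csv
import io


def find_invalid_rows(csv_text, rules):
    """Return (row_number, column, value) for every cell that fails its rule.

    'rules' maps a column name to a function returning True for valid values.
    Row numbers start at 2 because row 1 is the header.
    """
    errors = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):
        for column, check in rules.items():
            value = row.get(column) or ""
            if not check(value):
                errors.append((row_number, column, value))
    return errors


# Sample data (made up): the "age" column must be a whole number.
sample = "name,age\nAda,36\nBob,abc\nCy,\n"
rules = {"age": lambda v: v.strip().isdigit()}

errors = find_invalid_rows(sample, rules)
bad_rows = {row_number for row_number, _, _ in errors}
print(f"{len(bad_rows)} row(s) with errors")
```

Here rows 3 and 4 fail the check, so the script reports 2 rows with errors.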

Managing Column Mismatches

When uploading new Excel or CSV data to an existing dataset, the system will flag any column mismatches. This ensures your new data aligns with the current dataset structure.

Types of Column Mismatches

  1. Missing Column: A column in the original dataset is missing from the uploaded data file.
  2. New Column Detected: The uploaded data includes a new column not present in the original dataset.
  3. Column Order Mismatch: The column order in the new upload does not match the current dataset’s order.

Hover over the info icon for detailed information on each mismatch.
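
The three mismatch types can be illustrated by comparing two lists of column headers. This is a minimal Python sketch under stated assumptions: the function name `column_mismatches` is hypothetical, and it assumes column order is compared only among columns present in both files, which may differ from how the system decides.

```python
def column_mismatches(existing, uploaded):
    """Compare the dataset's column headers with the uploaded file's headers."""
    # Missing Column: in the original dataset but not in the upload.
    missing = [c for c in existing if c not in uploaded]

    # New Column Detected: in the upload but not in the original dataset.
    new = [c for c in uploaded if c not in existing]

    # Column Order Mismatch: shared columns appear in a different order
    # (assumption: only shared columns are compared).
    shared_existing = [c for c in existing if c in uploaded]
    shared_uploaded = [c for c in uploaded if c in existing]
    order_mismatch = shared_existing != shared_uploaded

    return {"missing": missing, "new": new, "order_mismatch": order_mismatch}


result = column_mismatches(["id", "name", "age"], ["name", "id", "email"])
```

In this example, `age` is reported as missing, `email` as new, and the order is flagged because `id` and `name` are swapped.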

Overwriting Existing Data

To replace the current dataset structure with the structure of the new upload:

  1. Select "Overwrite" and check the "Force upload" option.
  2. Confirm the action in the popup window.
  3. Click "UPLOAD DATA".

Note: Overwriting will permanently replace the existing dataset with the new data structure. Ensure that all uploaded data is accurate before proceeding.