Should I put my Variable in the table rows or columns?

Choosing whether to put a Variable in rows or columns does not change what data the Schema collects. The choice only affects how the data entry table is laid out, so that it is more intuitive for contributors to enter data.

As an example, imagine we are collecting data for these Variables and Categories:

  • Measured quantity
    • Births
  • State or Territory
    • New South Wales
    • Victoria
  • Year
    • 2020
    • 2021
    • 2022

If we put all 3 Variables into columns, then we would have a data entry table that looks something like this:

KB_DataSchema_MultiIndexTable_SingleRow

This is not very intuitive for data entry: there is only 1 row, and every combination of Categories (1 × 2 × 3 = 6 in this example) gets its own column to enter data against.
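
To see where those columns come from, here is a minimal Python sketch that lists every Category combination for the example above. It is only an illustration: the `variables` dictionary and the `column_headers` name are hypothetical and are not part of the Schema tool itself.

```python
from itertools import product

# Variables and their Categories, taken from the example above.
# This dictionary is a hypothetical stand-in for the Schema definition.
variables = {
    "Measured quantity": ["Births"],
    "State or Territory": ["New South Wales", "Victoria"],
    "Year": [2020, 2021, 2022],
}

# With every Variable in columns, each combination of Categories
# becomes its own column header.
column_headers = list(product(*variables.values()))

for header in column_headers:
    print(" / ".join(str(category) for category in header))

print(f"{len(column_headers)} columns, 1 row")
```

Adding another Category to any Variable multiplies the number of columns, which is why this layout becomes hard to read quickly.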

A good rule of thumb is to put Variables with only 1 or a few Categories into the columns, and Variables with many Categories into the rows.

Since Year has the most Categories, we will try putting it into rows. This is how the data entry table looks after doing so:

KB_DataSchema_MultiIndexTable_MultiRow

Now we have a separate row for each of the Year Variable's Categories, which is simpler for our contributors to work with.
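
The effect of moving one Variable into rows can be sketched in Python as follows. This is an illustration under stated assumptions, not how the Schema tool is implemented: `build_layout`, `variables`, and `grid` are hypothetical names introduced for this example.

```python
from itertools import product

# Variables and their Categories, taken from the example above.
# This dictionary is a hypothetical stand-in for the Schema definition.
variables = {
    "Measured quantity": ["Births"],
    "State or Territory": ["New South Wales", "Victoria"],
    "Year": [2020, 2021, 2022],
}


def build_layout(variables, row_variable):
    """Return (row_labels, column_labels) with one Variable placed in rows."""
    # Each Category of the chosen Variable becomes one row.
    row_labels = list(variables[row_variable])
    # Every combination of the remaining Variables' Categories becomes one column.
    remaining = [cats for name, cats in variables.items() if name != row_variable]
    column_labels = list(product(*remaining))
    return row_labels, column_labels


row_labels, column_labels = build_layout(variables, "Year")

# An empty data entry grid: one cell per (row, column) pair.
grid = {(row, col): None for row in row_labels for col in column_labels}

print(f"{len(row_labels)} rows x {len(column_labels)} columns = {len(grid)} cells")
```

The number of cells is the same as when every Variable is in columns, which reflects the point made at the start: the layout changes, but the data being collected does not.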