Linking datasets creates a logical connection between them, allowing charts to visualize and be filtered by data spread across multiple datasets. In this article, we will cover:

How to link and unlink datasets

Datasets must be linked through a common column. In our example, the datasets are linked through the customer ID, since this ID occurs in both datasets. Sets without a common column can’t be linked. The values of that common column must be exactly the same in both datasets including spaces and capital letters. This is important when you work with customer names for example. The column names may differ between datasets, since those are not used as values in the charts.

  • Linking
    • Navigate to the master dataset and open the linking modal by clicking on the "links" button.
    • Select a second dataset to use as the lookup dataset
      • The "N -> 1" button can be clicked to switch which dataset is in the master and lookup position.
    • Click and drag the common column from one dataset to the other
  • Unlinking
    • Navigate to either dataset and open the linking modal by clicking on the "links" button.
    • Click on the "=" sign in the link and select "Remove link"

What does MANY to ONE mean?

There is an important rule to remember when linking datasets: MANY (N) to ONE (1). This rule refers to the direction of the link, meaning you always link from one dataset (Master table) to another (Lookup table).

Links between two datasets are done using the LOOKUP principle, which means Luzmo looks up a specified value in another dataset. A join therefore occurs when the specified value of a field in dataset A is found in another field from Dataset B.

What is a lookup join?

In customer facing analytics, the LOOKUP join is intuitive to end-users. The master dataset uses the common column to add extra context from the lookup table. The system stops at the moment it finds a match.

Requirements for Linking Datasets:

  1. The two datasets you want to link must have a common column (e.g. you can link Customer ID and Customer ID and not Customer ID and Customer Name.)

  2. The common column must be of the same data type (linking hierarchies with numeric fields will not work!).

  3. Make sure that your data is only linked from one side to another. Don't make loops.

  4. Always link from MANY to ONE, not the other way around. A ONE to ONE link is a subset of the MANY to ONE link, and therefore is also possible.

  5. You cannot use a derived column as the linking column.

  6. Linking datasets from different data sources will cause the queries to not get pushed down, potentially resulting in slower loading times.

How links are joined using SQL

Linking is accomplished differently depending on whether the master or the lookup table is being queried:

  1. The lookup join retrieves, for every row of A, exactly zero or one rows from B that match the row A. This does not directly correspond to a SQL join type.
SELECT
  ...
FROM
  A
  LEFT JOIN (
		SELECT DISTINCT ON (B.join_key_1[, ...]) * FROM B
  ) B
    ON A.join_key_1 = B.join_key_1 [AND ...]
  1. The exists join retrieves only rows from A for which at least one matching row from B is found.
SELECT
  ...
FROM
  A
WHERE
  EXISTS (SELECT * FROM B WHERE A.join_key_1 = B.join_key_1 [AND...])

Linked filtering

Linked filtering is when filters on one dataset are propegated through links to other datasets, and vice versa.

How to control lookup filtering

  • Lookup filtering is enabled by default for new dashboards, ensuring that linked datasets filter each other.
  • If you do not want linked filtering, you can find the toggle for linked filtering in the advanced settings section of your dashboard settings.
  • Disabling Lookup filtering means you can still visualize data across different datasets, but when a chart only has data from one dataset, it will only listen to filters from that specific dataset.

Chaining linked datasets

You can link two datasets, and you can even make a whole chain of linked datasets. It is called a chain since the direction must always be the same, from many (N) to one (1).

However, to obtain linked filtering in your dashboard, the data must be linked directly. There is no problem when you want to apply this to datasets A and B, or B and C. However, if you want to filter on set C and affect set A, dataset B must also be involved in the chart. That is because set B makes the link between set A and set C.

We can also use this logic for example when creating a map. Suppose we want to visualize the Sales volume in the world. Our third dataset will be the public one about the geography of all countries, named "World Countries". Since the Sales dataset and our Geography datasets don’t have a common column and can’t be linked, we need to use the Customer dataset to make the link between them. In the Customer set is a column called 'Country' which can be linked with the column ‘Formal name’ in our geography dataset. Our three datasets will be linked as follows:

Need more information?

Do you still have questions? Let us know how we can help.
Send us feedback!