Unaccounted for numbers displaying with data

I am using the following SELECT statement to pull data to display in a table:

SELECT color, COUNT(color) AS poolcolors
FROM combined_shipped_2024
WHERE
  ({{ !variable20.value }} OR brand = {{ variable20.value }}) AND (({{ !variable21.value }} OR EXTRACT(YEAR FROM shipped::date) = {{ variable21.value }}) AND ({{ !variable22.value }} OR EXTRACT(MONTH FROM shipped::date) = {{ variable22.value }}))
GROUP BY color
ORDER BY poolcolors DESC
LIMIT 20

For some reason, the data is showing additional numbers that I can not account for. Here is a screenshot:

I have verified that 317 is the correct number for Graphite, however, I also have another Graphite showing in the table with the number 6. Same problem with Caribbean Sparkle. Any ideas what would be causing that?

Hello @tomm,

Interesting issue, that is odd that there are three instances of 'graphite' showing up :thinking:

It looks like you set up the SQL query correctly, my guess is that the issue might stem from variations in the color values, such as differences in case, whitespace, or special characters.

To get around this, we can test out using some SQL sanitizers to normalize the color names and see if that returns the correct results.

SELECT 
  TRIM(LOWER(color)) AS normalized_color, 
  COUNT(*) AS poolcolors
FROM combined_shipped_2024
WHERE
  ({{ !variable20.value }} OR brand = {{ variable20.value }}) 
  AND (({{ !variable21.value }} OR EXTRACT(YEAR FROM shipped::date) = {{ variable21.value }}) 
  AND ({{ !variable22.value }} OR EXTRACT(MONTH FROM shipped::date) = {{ variable22.value }}))
GROUP BY normalized_color
ORDER BY poolcolors DESC
LIMIT 20;

Let me know if this works, LOWER will make sure the colors are lowercased and TRIM will ensure that any white space is removed. Hopefully it fixes the discrepancies :crossed_fingers:

Thanks for the reply @Jack_T. I tried what you suggested and I got the same results, except everything was lowercase.

No problem, that is odd that the repeats still appear.

Maybe we need to refactor the WHERE section of the query. But first let's verify that the data for color is clean.

Can you try

SELECT DISTINCT TRIM(LOWER(color)) AS normalized_color, color
FROM combined_shipped_2024
WHERE
  ({{ !variable20.value }} OR brand = {{ variable20.value }}) 
  AND (({{ !variable21.value }} OR EXTRACT(YEAR FROM shipped::date) = {{ variable21.value }}) 
  AND ({{ !variable22.value }} OR EXTRACT(MONTH FROM shipped::date) = {{ variable22.value }}));

Compare normalized_color and color to see if there are still unclean values.

Then, lets try changing the WHERE clause,

WHERE 
  ({{ !variable20.value }} OR brand = {{ variable20.value }}) 
  AND ({{ !variable21.value }} OR EXTRACT(YEAR FROM shipped::date) = {{ variable21.value }}) 
  AND ({{ !variable22.value }} OR EXTRACT(MONTH FROM shipped::date) = {{ variable22.value }})

An then to finish it off we can reapply the normalization in both SELECT and GROUP BY

SELECT 
  TRIM(LOWER(color)) AS normalized_color, 
  COUNT(*) AS poolcolors
FROM combined_shipped_2024
WHERE
  ({{ !variable20.value }} OR brand = {{ variable20.value }}) 
  AND ({{ !variable21.value }} OR EXTRACT(YEAR FROM shipped::date) = {{ variable21.value }}) 
  AND ({{ !variable22.value }} OR EXTRACT(MONTH FROM shipped::date) = {{ variable22.value }})
GROUP BY TRIM(LOWER(color))
ORDER BY poolcolors DESC
LIMIT 20;

Hopefully this works! :crossed_fingers: