I am attempting to write a small loop in SQL (I am using postgres) that increments a date and adds that date plus how many times it has looped into a new table I have created. However, I am getting a syntax error - but this same query works on other platforms.
CREATE TABLE month_counter (
month_start TIMESTAMP,
counter_value INTEGER
);
DO $$
DECLARE
current_timestamp TIMESTAMP := '2024-01-01 00:00:00';
end_timestamp TIMESTAMP := '2024-12-31 23:59:59';
counter INTEGER := 1;
BEGIN
-- Start the loop
WHILE current_timestamp <= end_timestamp LOOP
-- Insert into the table
INSERT INTO month_counter (month_start, counter_value)
VALUES (current_timestamp, counter);
-- Increment the counter
counter := counter + 1;
-- Increment the timestamp by 1 month
current_timestamp := current_timestamp + INTERVAL '1 month';
END LOOP;
END $$;
I keep getting the syntax error on the
-- Increment the timestamp by 1 month
current_timestamp := current_timestamp + INTERVAL '1 month';
and I am not sure what I am doing wrong - and the error isnt descriptive - does anyone have any ideas?
Please?
I believe it's because current_timestamp
is a reserved word in Postgres and your code is trying to redefine it.
I think there's a simpler way to do this anyway with a FOR loop.
CREATE TABLE month_counter (
month_start TIMESTAMP,
counter_value INTEGER
);
DO $$
DECLARE
starting_timestamp TIMESTAMP := '2024-01-01 00:00:00';
end_timestamp TIMESTAMP := '2024-12-31 23:59:59';
BEGIN
-- Start the loop
FOR i in 1..DATE_PART('month', end_timestamp) - DATE_PART('month', starting_timestamp) LOOP
-- Insert into the table
INSERT INTO month_counter (month_start, counter_value)
VALUES ((starting_timestamp + interval '1 month' * (i - 1) ), i);
END LOOP;
END $$;
I'm curious, what's the use case for the table?
1 Like
OMG, @kschirrmacher you are my hero. I didnt realize it was a reserved word - I beat my head against this all yesterday thinking I did something else wrong.... thank you so much.
Here is an example of my use case - and please tell me if there is a better way to do this:
Let's say I have a animal shelter. I have records of when a dog, cat or bird enter the shelter and when they leave they shelter.
I want to create a chart that shows over an arbitrary time period (past 6, 12, 18 months) how many of each the shelter has.
My thought here was to create a new table that only has the information I want based on the arbitrary time frame and use that data to create the chart.
Does that make sense?
I think I follow! You're on the right track with a date spine but you don't necessarily need a table for it. You can generate it from a generated series of numbers like so:
SELECT
'2024-01-01'::datetime + interval '1 month' * generate_series(0, ((DATE_PART('year', getdate ()) - DATE_PART('year', '2024-01-01'::date)) * 12 + (DATE_PART('month', getdate ()) - DATE_PART('month', '2024-01-01'::date)))::int) AS month;
This means that you don't have to maintain or store this table and running a new iteration with a different beginning and end date only takes a few ms.
Assuming you have a table shelter_stays
that looks like:
(Species ID would probably be an attribute on the visitor
record here but easier to illustrate like this)
with month_spine as (
SELECT
-- our starting date for the date spine
'2024-01-01'::date
-- the interval at which we want to increment the date for each row of this table
+ interval '1 month'
-- Multiplying the interval by an incrementing series of integers, starting at zero
* generate_series(0,
-- Calculate the upper end of the series, starting with the number of years between two dates multiplied by 12 to convert to months
((DATE_PART('year', current_date) - DATE_PART('year', '2024-01-01'::date)) * 12
-- plus the number of months between the two dates, irrespective of the year
+ (DATE_PART('month', current_date) - DATE_PART('month', '2024-01-01'::date)))::int) AS month;
)
select
-- selecting month and species here. Any additional columns selected and grouped by will mean that your count is specific to each row where all of the grouped attributes are the same
month,
species_id,
-- distinct to only count a unique stay once per month when the stay starts and end in the same month
count(distinct id)
from shelter_stays
inner join month_spine on
-- joining on either of these gives us two rows per matching condition
date_trunc('month', shelter_days.visit_start_date) = month_spine.month
or date_trunc('month', shelter_days.visit_end_date) = month_spine.month
--grouping to make our count function work
group by 1, 2
From there you can add additional joins and groupings as needed. If you have multiple shelters to query add the shelter ID in the select and group by to split the count out by the attribute as well.
Okay - so I am replying to say "thank you!" so... THANK YOU.
I dont fully grok all of this yet - I am new to SQL (but have been an engineer for many years).
I am going to go play with it, and I'll get back to you - with a better "thanks" or more questions
1 Like
It's funky at first but hopefully you learn to love it. You'll be a better engineer with data that pops out in the right shape/scope the first time, too.
I'll add a few comments to the earlier snippet in case that helps.
Happy to answer any other questions that you come up with!
Edit: Fixed up a few Redshift-specific functions to be Postgres friendly in the example.
1 Like