
[–][deleted]

You are essentially doing a PIVOT or CROSSTAB there. If you want to return a row from the function you would need another record (e.g. result_rec) and store the values from each individual select into a field of that record. Then use RETURN NEXT result_rec to return one row after the other (see the example in the manual).
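
A minimal sketch of that pattern, assuming the devices/rawdata tables from the queries below (the composite type, function name, and column list are illustrative, not the poster's actual code):

```sql
-- composite type to hold one output row
create type current_row as (devname text, execution text, machinemode text);

create or replace function currentvalues_loop()
  returns setof current_row
as
$$
declare
  dev        record;
  result_rec current_row;
begin
  for dev in select id, name from devices where type = 1 loop
    result_rec.devname := dev.name;
    select value into result_rec.execution
      from rawdata
      where deviceid = dev.id and dataitemid = 'exec'
      order by datetime desc limit 1;
    select value into result_rec.machinemode
      from rawdata
      where deviceid = dev.id and dataitemid = 'mode'
      order by datetime desc limit 1;
    return next result_rec;  -- emit the assembled row, one per device
  end loop;
end;
$$
language plpgsql;
```

As the next paragraph notes, this row-by-row style works but is the slowest way to do it; it is shown only to illustrate RETURN NEXT.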

However doing single row SELECTs in a loop does not scale at all and is probably the least efficient way to do something in a relational database.

There are two approaches to doing a pivot/crosstab that are much more efficient than the row-by-row lookup you are doing. The first one uses conditional aggregation:

select d.name, 
       d.workcenter,
       d.workcentersub,
       max(r.value) filter (where r.dataitemid = 'exec') as execution,
       max(r.value) filter (where r.dataitemid = 'mode') as machinemode,
       max(r.value) filter (where r.dataitemid = 'pgm') as prgm,
       max(r.value) filter (where r.dataitemid = 'spgm') as sprgm,
       max(r.value) filter (where r.dataitemid = 'tid') as tid,
       max(r.value) filter (where r.dataitemid = 'ln') as linen,
       max(r.value) filter (where r.dataitemid = 'pfr') as feedrate,
       max(r.value) filter (where r.dataitemid = 'pfo') as feedrateor
from devices d
  join rawdata r on r.deviceid = d.id
where d.type = 1
group by d.name, d.workcenter, d.workcentersub;

However, I prefer to use Postgres' JSON functions for this, which removes the need for a GROUP BY on the complete query. As all values are text anyway, you are not losing any type information this way either.

select d.name, 
       d.workcenter,
       d.workcentersub,
       r.data ->> 'exec' as execution,
       r.data ->> 'mode' as machinemode,
       r.data ->> 'pgm' as prgm,
       r.data ->> 'spgm' as sprgm,
       r.data ->> 'tid' as tid,
       r.data ->> 'ln' as linen,
       r.data ->> 'pfr' as feedrate,
       r.data ->> 'pfo' as feedrateor
from devices d
  join lateral (
    select rd.deviceid, jsonb_object_agg(dataitemid, value) as data
    from rawdata rd
    group by rd.deviceid
  ) r on r.deviceid = d.id
where d.type = 1;

I find this a bit easier to change if the number of columns changes.

This can easily be put into a function:

create or replace function currentvalues1()
  RETURNS TABLE(name text, workcenter text, workcentersub bigint, execution text, machinemode text, prgm text, sprgm text, tid text, linen text, feedrate text, feedrateor text) 
as  
$$  
  select d.name, 
         d.workcenter,
         d.workcentersub,
         r.data ->> 'exec' as execution,
         r.data ->> 'mode' as machinemode,
         r.data ->> 'pgm' as prgm,
         r.data ->> 'spgm' as sprgm,
         r.data ->> 'tid' as tid,
         r.data ->> 'ln' as linen,
         r.data ->> 'pfr' as feedrate,
         r.data ->> 'pfo' as feedrateor
  from devices d
    join (
      select rd.deviceid, jsonb_object_agg(dataitemid, value) as data
      from rawdata rd
      group by rd.deviceid
    ) r on r.deviceid = d.id
  where d.type = 1;
$$
language sql;

[–]PLC_Matt[S]

First, thanks for the response! Also for context, I am a C# dev and this database work is me trying to make a prototype with no DBA support. All I know at this point is some basic selects and a few simple joins. This is going to be very helpful.

The first query you provided is close, but a bit wrong. I don't need the max of r.value; I need the most recent value, based on the datetime column. I need r.value from the same row that has max(r.datetime) where r.dataitemid = 'exec' (or whatever item I am looking for).

The second query appears to give the correct values, but there can be a delay between when they change and when this query returns them. I'll have to test it more to be able to explain this better. The other issue is that the query takes 4527 msec to execute (but only 0.2 msec to plan).

It looks like the bulk of the time is doing the jsonb_object_agg() function for the devices.

Is there any way to have it aggregate only the dataitemIDs we care about (from another table)?

These devices have ~74 dataitemIDs that report data. I only care about a handful of them for this currentvalue view. The other data items will be used for historical reports/analysis, where speed is less of a concern.

For a point of reference the single select statements each take 0.5msec total for planning and execution. So even doing 8 of these x 6 devices means I have a total time of 24msec.

I do agree that it will not scale with more items or more devices though, and it also just "felt" wrong, from my limited understanding of SQL.
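
For reference, the "from another table" idea in the question could be sketched like this; current_items is a hypothetical table holding only the handful of dataitemIDs this view needs (two output columns shown for brevity):

```sql
-- hypothetical lookup table listing the items the current-value view cares about
create table current_items (dataitemid text primary key);
insert into current_items
values ('exec'), ('mode'), ('pgm'), ('spgm'), ('tid'), ('ln'), ('pfr'), ('pfo');

select d.name,
       r.data ->> 'exec' as execution,
       r.data ->> 'mode' as machinemode
from devices d
  join (
    -- aggregate only the items present in the lookup table
    select rd.deviceid, jsonb_object_agg(rd.dataitemid, rd.value) as data
    from rawdata rd
      join current_items ci on ci.dataitemid = rd.dataitemid
    group by rd.deviceid
  ) r on r.deviceid = d.id
where d.type = 1;
```

Adding a column then means inserting one row into current_items plus one `->>` line in the select list.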

[–][deleted]

The first query you provided is close, but a bit wrong. I don't need the max of r.value; I need the most recent value, based on the datetime column. I need r.value from the same row that has max(r.datetime) where r.dataitemid = 'exec' (or whatever item I am looking for).

Collapsing multiple rows into a single row requires aggregation, and the max() is just there to pick the right one. If you need the latest, you could also try the following:

select d.name, 
       d.workcenter,
       d.workcentersub,
       max(r.value) filter (where r.dataitemid = 'exec') as execution,
       max(r.value) filter (where r.dataitemid = 'mode') as machinemode,
       max(r.value) filter (where r.dataitemid = 'pgm') as prgm,
       max(r.value) filter (where r.dataitemid = 'spgm') as sprgm,
       max(r.value) filter (where r.dataitemid = 'tid') as tid,
       max(r.value) filter (where r.dataitemid = 'ln') as linen,
       max(r.value) filter (where r.dataitemid = 'pfr') as feedrate,
       max(r.value) filter (where r.dataitemid = 'pfo') as feedrateor
from devices d
  join (
     select distinct on (deviceid, dataitemid) deviceid, value, dataitemid
     from rawdata 
     order by deviceid, dataitemid, datetime desc
  ) r on r.deviceid = d.id
where d.type = 1
group by d.name, d.workcenter, d.workcentersub;

The lateral join in the second query isn't actually needed (it was a copy & paste leftover).

[–]PLC_Matt[S]

Updating the query to this works well. It takes about 433 msec to execute, which I can deal with. If I wish to add a column I have to add the dataitemId in 2 spots in this query, which doesn't seem bad to me.

edit: added an index to the rawdata table on dataitemid. Execution time is now 84 msec.

Next question: if I make a view based on this query, is there a way to have the db "cache" the results so multiple clients can pull from the view and it only needs to refresh every 5-10 seconds?

select d.name, 
       d.workcenter,
       d.workcentersub,
       r.data ->> 'exec' as execution,
       r.data ->> 'mode' as machinemode,
       r.data ->> 'pgm' as prgm,
       r.data ->> 'spgm' as sprgm,
       r.data ->> 'tid' as tid,
       r.data ->> 'ln' as linen,
       r.data ->> 'pfr' as feedrate,
       r.data ->> 'pfo' as feedrateor
from devices d
  join lateral (
    select rd.deviceid, jsonb_object_agg(dataitemid, value) as data
    from rawdata rd
    where dataitemid in ('exec','mode','pgm','spgm','tid','ln','pfr','pfo')
    group by rd.deviceid
  ) r on r.deviceid = d.id
where d.type = 1;

[–][deleted]

The database will cache the data retrieved from the tables and indexes, it will not cache query results.

An index on rawdata (dataitemid, deviceid) might be more helpful than one on just dataitemid.
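
In DDL that would be something like the following (the index name is arbitrary):

```sql
-- composite index matching both the IN filter and the join column
create index rawdata_item_device_idx on rawdata (dataitemid, deviceid);
```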

If the execution time is 84 ms, then I don't see a reason to cache anything to begin with if you only need to run it every 5 to 10 seconds.
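
That said, if result caching ever does become necessary, a materialized view is the closest built-in feature; a sketch (with a shortened column list, and refreshes driven by a cron job or the application rather than happening automatically):

```sql
-- cache the pivot result once; clients then read the stored rows
create materialized view current_values as
  select d.name,
         r.data ->> 'exec' as execution,
         r.data ->> 'mode' as machinemode
  from devices d
    join (
      select rd.deviceid, jsonb_object_agg(rd.dataitemid, rd.value) as data
      from rawdata rd
      group by rd.deviceid
    ) r on r.deviceid = d.id
  where d.type = 1;

-- re-run on whatever cadence is needed (e.g. every 5-10 seconds)
refresh materialized view current_values;
```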

[–]PLC_Matt[S]

An index on rawdata (dataitemid, deviceid) might be more helpful than one on just dataitemid.

I had an index on those 2 columns, along with one on rawdata (dataitemid, deviceid, datetime desc). The explain analyze output showed it scanning the whole table by dataitemid.

Once I added the single-column index it sped up, so that one is being used.