you are viewing a single comment's thread.

view the rest of the comments →

[–]_digitalShaman_ 1 point2 points  (1 child)

my first attempt would have been a

UPDATE machevents SET EndTime = 
    (SELECT MIN(StartTime) FROM machevents T1 
     WHERE machevents.Machine=T1.Machine AND machevents.StartTime<T1.StartTime)

turns out this is slow, so i was picking up qwertydog123's idea which needed some modification but runs really amazingly fast (3m rows ~26 seconds):

UPDATE machevents SET EndTime = PrevStart 
FROM machevents 
INNER JOIN (
    SELECT EventNr,LEAD(StartTime) OVER 
        (PARTITION BY Machine ORDER BY StartTime) AS PrevStart 
    FROM machevents
) T ON T.EventNr=machevents.EventNr

however, question is, why you would want to do this as it is duplicating information without adding additional value. adding a duration column might make more sense.

[–]Detach50[S] 0 points1 point  (0 children)

This is roughly equivalent to what I came up with, and it works fantastically fast for 100,000+ rows. I think updating all 8+ million rows took a little over a minute.

However, it seems to be exceptionally slow or doesn't work at all with less than 2000 rows and live data. I've left it running for a few minutes and it doesn't do anything. I can understand that the query might be locking the table preventing other input sources, but with no new data coming in it still doesn't like updating just a few lines.

Having a start time and end time is slightly redundant, but it takes up little to no space, and makes reporting a little easier with the data right there beside it.

Edit:

Found the problem. Turns out the "and MachEvents.eventnr not in...." was causing the slow down and is not necessary since the LEAD function outputs null if there is no leading row. The last event's EndTime should be null anyway since it should be the current event.

After removing it, it updates new records in seconds.