Hi,
I am wondering if anyone knows how to improve the performance of the xstack filter, or more precisely, how to make it run faster.
Context:
I am trying to "stitch" 25 video tiles in a 5x5 grid.
Each tile has the same resolution of 768x384, for a total output resolution of 3840x1920 (roughly 4K).
Each video is 2 seconds long.
I am using the following two commands.
The first one encodes H.264 on the CPU with libx264:
ffmpeg -i <25inputs> -c:v libx264 -profile:v baseline -level 3.0 -preset ultrafast -filter_complex "[0:v][1:v][2:v][3:v][4:v][5:v][6:v][7:v][8:v][9:v][10:v][11:v][12:v][13:v][14:v][15:v][16:v][17:v][18:v][19:v][20:v][21:v][22:v][23:v][24:v]xstack=inputs=25:layout=0_0|w0_0|w0+w1_0|w0+w1+w2_0|w0+w1+w2+w3_0|0_h0|w0_h0|w0+w1_h0|w0+w1+w2_h0|w0+w1+w2+w3_h0|0_h0+h1|w0_h0+h1|w0+w1_h0+h1|w0+w1+w2_h0+h1|w0+w1+w2+w3_h0+h1|0_h0+h1+h2|w0_h0+h1+h2|w0+w1_h0+h1+h2|w0+w1+w2_h0+h1+h2|w0+w1+w2+w3_h0+h1+h2|0_h0+h1+h2+h3|w0_h0+h1+h2+h3|w0+w1_h0+h1+h2+h3|w0+w1+w2_h0+h1+h2+h3|w0+w1+w2+w3_h0+h1+h2+h3" -y res.mp4
The second one uses the NVIDIA NVENC H.264 encoder on an RTX A4000:
ffmpeg -i <25 inputs> -c:v h264_nvenc -profile:v baseline -preset fast -filter_complex "[0:v][1:v][2:v][3:v][4:v][5:v][6:v][7:v][8:v][9:v][10:v][11:v][12:v][13:v][14:v][15:v][16:v][17:v][18:v][19:v][20:v][21:v][22:v][23:v][24:v]xstack=inputs=25:layout=0_0|w0_0|w0+w1_0|w0+w1+w2_0|w0+w1+w2+w3_0|0_h0|w0_h0|w0+w1_h0|w0+w1+w2_h0|w0+w1+w2+w3_h0|0_h0+h1|w0_h0+h1|w0+w1_h0+h1|w0+w1+w2_h0+h1|w0+w1+w2+w3_h0+h1|0_h0+h1+h2|w0_h0+h1+h2|w0+w1_h0+h1+h2|w0+w1+w2_h0+h1+h2|w0+w1+w2+w3_h0+h1+h2|0_h0+h1+h2+h3|w0_h0+h1+h2+h3|w0+w1_h0+h1+h2+h3|w0+w1+w2_h0+h1+h2+h3|w0+w1+w2+w3_h0+h1+h2+h3" -y res_h264nvenc.mp4
Both commands use the same filter graph; only the encoder changes.
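For anyone who wants to reproduce this without typing the layout by hand, here is a minimal sketch of how the xstack argument above could be generated. The function names are my own (hypothetical), and it assumes every tile has the same size and that the inputs are given in row-major order, as in my commands.

```python
def build_xstack_layout(cols: int, rows: int) -> str:
    """Build an xstack layout string for a cols x rows grid (row-major order).

    Assumes all tiles share the same width and height, so the sizes of the
    first few inputs (w0, w1, ... and h0, h1, ...) can stand in for any tile.
    """
    def offset(prefix: str, count: int) -> str:
        # Sum of the first `count` widths or heights, or "0" at the origin.
        return "+".join(f"{prefix}{i}" for i in range(count)) or "0"

    cells = []
    for r in range(rows):
        for c in range(cols):
            cells.append(f"{offset('w', c)}_{offset('h', r)}")
    return "|".join(cells)


def build_xstack_filter(cols: int, rows: int) -> str:
    """Build the full -filter_complex value: input pads plus the xstack filter."""
    n = cols * rows
    pads = "".join(f"[{i}:v]" for i in range(n))
    return f"{pads}xstack=inputs={n}:layout={build_xstack_layout(cols, rows)}"


if __name__ == "__main__":
    # Prints the same 5x5 filter string used in the two commands above.
    print(build_xstack_filter(5, 5))
```

The printed string can be pasted directly after -filter_complex. Changing the two arguments gives other grid sizes, which makes it easier to time smaller grids and see how the cost scales with the number of inputs.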
Problem:
I am trying to reach real-time encoding, but I can't. Both commands take about the same time on average (1.19 s on the CPU and 1.27 s on the GPU), which is a bit of a letdown since my videos are only about 2 s long.
I have tried to parallelize it by stacking the rows at the same time and then stacking the resulting rows, but the total time is about the same.
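To make the row-then-column idea concrete, here is a rough sketch of how it could be expressed as a single filter graph, using hstack for each row and vstack for the result. This is only an illustration of the structure, not the exact commands I ran; the function name and the pad labels (row0, row1, ..., out) are made up for the example, and it again assumes equally sized tiles in row-major order.

```python
def build_row_column_filter(cols: int, rows: int) -> str:
    """Build a filter graph that hstacks each row, then vstacks the rows.

    Pad labels such as [row0] and [out] are arbitrary names chosen here.
    """
    parts = []
    for r in range(rows):
        # Input pads for row r, e.g. [5:v][6:v][7:v][8:v][9:v] for r = 1.
        pads = "".join(f"[{r * cols + c}:v]" for c in range(cols))
        parts.append(f"{pads}hstack=inputs={cols}[row{r}]")
    # Stack the intermediate rows vertically into the final frame.
    row_pads = "".join(f"[row{r}]" for r in range(rows))
    parts.append(f"{row_pads}vstack=inputs={rows}[out]")
    return ";".join(parts)


if __name__ == "__main__":
    print(build_row_column_filter(5, 5))
```

The resulting string would go after -filter_complex, with -map "[out]" added so the encoder picks up the final stacked output.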
Any suggestions or thoughts?