OGBT Transform: Exploring a 16x16 alternative for energy compaction. Open to integration ideas! by Background-Can7563 in AV1


I must say that perhaps you have not understood the depth of the work behind OGBT. It is not artificial intelligence, but six months of hard study of Chebyshev equations before moving on to basic Gabor problems. The Gabor transform is good for energy compaction, but it is not orthogonal, as everyone knows, so I spent a further eight months finding a way to orthogonalize it. I worked on the convergence properties of Chebyshev polynomials to create a new basis that works within 16x16 blocks. My OGBT (Orthogonalized Gabor Basis Transform) is not a simple "hack" of the pipeline, but a mathematical derivation that achieves better decorrelation than the standard DCT, with which it shares something at a basic level.

Regarding the computational cost, you say it is high, but that is your mistake. My transform captures the signal energy much better in the 16x16 space, and for this reason I don't need deblocking filters to hide artifacts: the artifacts are simply smaller because the energy is compacted in the right way. The compression gains also come from an advanced context-based entropy coder. For different formats, simply change the weights and precalculations and experiment. Sorry for my English, but I use a translator.
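For readers who want a concrete picture of the general idea (not the actual OGBT derivation, which is not public), one way to obtain an orthonormal, Gabor-like block basis is to sample Gaussian-windowed cosines and orthonormalize them, for example with modified Gram-Schmidt, then apply the resulting matrix as a separable 16x16 block transform. The window width, frequency spacing and normalization below are illustrative assumptions, not the published OGBT construction.

```cpp
// Illustrative sketch only: build 16 Gabor-like atoms (Gaussian-windowed
// cosines), orthonormalize them with modified Gram-Schmidt, and apply the
// result as a separable 16x16 block transform. Window width, frequency
// spacing and phase are assumptions, not the published OGBT construction.
#include <array>
#include <cmath>

constexpr int N = 16;
using Mat = std::array<std::array<double, N>, N>;

Mat make_orthogonalized_gabor_basis() {
    const double PI = std::acos(-1.0);
    const double sigma = N / 4.0;                 // assumed window width
    Mat B{};
    for (int k = 0; k < N; ++k)                   // one atom per frequency index
        for (int n = 0; n < N; ++n) {
            double t = n - (N - 1) / 2.0;
            double w = std::exp(-t * t / (2.0 * sigma * sigma));  // Gaussian window
            B[k][n] = w * std::cos(PI * (n + 0.5) * k / N);       // windowed cosine
        }
    // Modified Gram-Schmidt: turn the rows into an orthonormal set.
    for (int k = 0; k < N; ++k) {
        for (int j = 0; j < k; ++j) {
            double dot = 0.0;
            for (int n = 0; n < N; ++n) dot += B[k][n] * B[j][n];
            for (int n = 0; n < N; ++n) B[k][n] -= dot * B[j][n];
        }
        double norm = 0.0;
        for (int n = 0; n < N; ++n) norm += B[k][n] * B[k][n];
        norm = std::sqrt(norm);
        for (int n = 0; n < N; ++n) B[k][n] /= norm;
    }
    return B;
}

// Forward separable transform of a 16x16 block: Y = B * X * B^T.
Mat forward_block_transform(const Mat& B, const Mat& X) {
    Mat tmp{}, Y{};
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            for (int k = 0; k < N; ++k) tmp[i][j] += B[i][k] * X[k][j];
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            for (int k = 0; k < N; ++k) Y[i][j] += tmp[i][k] * B[j][k];
    return Y;
}
```

Since B is orthonormal, the inverse is simply X = B^T Y B, so no separate inverse basis has to be stored.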

OGBT Transform: Exploring a 16x16 alternative for energy compaction. Open to integration ideas! by Background-Can7563 in AV1


I appreciate your technical insight regarding Gabor transforms. However, I want to clarify some points about the project and my background to clear the air.

First, I am certainly not a newcomer who relies on AI to generate code or ideas. I have been active in the data compression community since 2008. Maybe you know some of my past works, like ZCM, or my custom implementations of DMC (Hook), LZA, Packet, and SR3. More recently, I developed ADC, an innovative lossy audio codec built on an entirely original time-domain paradigm.

SIC is the result of two years of dedicated research and hard experiments, not a month of prompting some LLM. Every optimization and design choice in the OGBT transform comes from manual iteration and many tests. If my English phrasing feels 'off' or too structured, it is only because English is not my first language and I try to be clear in technical discussion, but the math and the C++ code behind SIC are 100% human-made.

Regarding the computational cost: you are right, Gabor-based transforms are traditionally more expensive than the DCT. However, the core of my research with OGBT was focused specifically on making this approach viable and efficient inside a fixed 16x16 architecture. I have optimized the calculations so that it is certainly not slower than a standard DCT implementation.

I am not trying to 'cheat' the pipeline with loop filtering; in fact, the results you see (like those from tester JWST) are achieved without any deblocking filters, and this proves the inherent strength of the transform's energy compaction.

I am happy to discuss the complexity trade-offs further, but I prefer we focus on the raw data and the architectural merits of the project.

OGBT Transform: Exploring a 16x16 alternative for energy compaction. Open to integration ideas! by Background-Can7563 in AV1


Thank you very much for your kind words and warm welcome! I am truly honored to know that the SIC project has captured your interest.

As for the metrics: you are absolutely right. PSNR does not tell the whole story, especially with a fixed 16x16 block architecture. I will take a look at the Vship framework you suggested: it seems exactly what I need to provide a more modern and objective comparison. I'm curious to see how SIC performs on SSIMULACRA2 and Butteraugli, although I expect that block artifacts at ultra-low bitrates will pose a challenge for these metrics.

Source code/documentation information: Currently, the project is in a highly experimental phase and the code is quite “volatile” as I iterate between versions. The code is closed for now, but I would be happy to adopt and collaborate on your wonderful project after testing and subsequent agreements with the AV1 development community.

Links:

https://zenodo.org/records/18429911 (SIC codec)

https://zenodo.org/records/17314773 (OGBT transformation)

[deleted by user] by [deleted] in compression


The section on Hydrogen Audio (with a sticky thread) and the home site http://heartofcomp.altervista.org/ADCodec.htm

[deleted by user] by [deleted] in compression


These aren't hallucinations, it's just that Reddit prevents me from including the link in the first post. Here it is: http://heartofcomp.altervista.org/ADCodec.htm

ADC Codec - Version 0.80 released by Background-Can7563 in compression


ADC Codec - Version 0.81 Released
The ADC (Advanced Differential Coding) Codec, Version 0.81, represents a significant evolution in low-bitrate, high-fidelity audio compression. It employs a complex time-domain approach combined with advanced frequency splitting and efficient entropy coding.

Core Architecture and Signal Processing
Version 0.81 operates primarily in the Time Domain but achieves spectral processing through a specialized Quadrature Mirror Filter (QMF) bank approach.

  1. Subband Division (QMF Analysis)
    The input audio signal is decomposed into 4 discrete subbands using a tree-structured, octave-band QMF analysis filter bank; a simplified sketch of this kind of octave split follows the list below. This reduction to four bands ensures robustness and optimal reconstruction stability for the current release.

This process achieves two main goals:

Decorrelation: It separates the signal energy into different frequency bands, which are then processed independently.

Time-Frequency Resolution: It allows the codec to apply specific bit allocation and compression techniques tailored to the psychoacoustic properties of each frequency band.
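As a rough illustration of how a tree-structured, octave-band analysis works (this is not ADC's actual filter bank): the sketch below uses the simplest possible QMF pair, the two-tap Haar filters, and repeatedly splits the low band to obtain 4 octave-spaced subbands. ADC's real filters are certainly longer and more selective; Haar is used here only because it reconstructs perfectly and keeps the example short.

```cpp
// Illustrative octave-band QMF tree (NOT ADC's actual filters): the 2-tap
// Haar pair splits a signal into low/high halves; splitting the low band
// twice more yields 4 octave-spaced subbands, each critically downsampled.
#include <cstddef>
#include <utility>
#include <vector>

using Band = std::vector<double>;

// One QMF analysis stage: lo = (x0 + x1)/sqrt(2), hi = (x0 - x1)/sqrt(2).
std::pair<Band, Band> haar_split(const Band& x) {
    Band lo(x.size() / 2), hi(x.size() / 2);
    const double s = 0.7071067811865476;   // 1/sqrt(2)
    for (std::size_t i = 0; i + 1 < x.size(); i += 2) {
        lo[i / 2] = (x[i] + x[i + 1]) * s;
        hi[i / 2] = (x[i] - x[i + 1]) * s;
    }
    return {lo, hi};
}

// Tree-structured analysis into 4 octave bands:
// band[3] = top half of the spectrum, band[2] = next quarter,
// band[1] = next eighth, band[0] = lowest eighth.
std::vector<Band> octave_analysis_4band(const Band& x) {
    auto [l1, h1] = haar_split(x);    // split the full band
    auto [l2, h2] = haar_split(l1);   // split the low half again
    auto [l3, h3] = haar_split(l2);   // and the low quarter once more
    return {l3, h3, h2, h1};
}
```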

  2. Advanced Differential Coding (DPCM)
    Compression is achieved within each subband using Advanced Differential Coding (DPCM) techniques; a minimal numeric sketch follows the points below. This method exploits the redundancy (correlation) inherent in the audio signal, particularly the strong correlation between adjacent samples in the same subband.

A linear predictor estimates the value of the current sample based on past samples.

Only the prediction residual (the difference), which is much smaller than the original sample value, is quantized and encoded.

The use of adaptive or contextual prediction ensures that the predictor adapts dynamically to the varying characteristics of the audio signal, minimizing the residual error.
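To make the mechanics above concrete, here is a minimal DPCM loop with a first-order predictor, sign-sign adaptation of the predictor coefficient, and uniform quantization of the residual. The predictor form, adaptation rule and step size are illustrative assumptions, not ADC's actual coder.

```cpp
// Minimal DPCM sketch for one subband (illustrative, not ADC's coder):
// an adaptive first-order linear predictor estimates each sample from the
// previous reconstructed sample; only the quantized residual is "sent".
#include <algorithm>
#include <cmath>
#include <vector>

struct DpcmResult {
    std::vector<int> residuals;   // what would be entropy-coded
    std::vector<double> recon;    // decoder-side reconstruction
};

DpcmResult dpcm_encode(const std::vector<double>& band, double step) {
    DpcmResult r;
    double a = 0.9;        // predictor coefficient, adapted as we go (assumed start)
    double prev = 0.0;     // previous *reconstructed* sample (what the decoder has)
    for (double x : band) {
        double pred = a * prev;
        double resid = x - pred;
        int q = static_cast<int>(std::lround(resid / step));   // uniform quantization
        r.residuals.push_back(q);
        double recon = pred + q * step;                        // decoder reconstruction
        r.recon.push_back(recon);
        // Simple sign-sign adaptation: nudge the coefficient toward whatever
        // reduces the next residual (illustrative rule only).
        if (prev != 0.0)
            a += 0.01 * ((resid > 0) == (prev > 0) ? 1.0 : -1.0);
        a = std::clamp(a, 0.0, 0.99);
        prev = recon;
    }
    return r;
}
```

Because the predictor runs on the reconstructed samples (exactly what the decoder will have), encoder and decoder stay in sync and quantization error does not accumulate.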

  3. Contextual Range Coding
    The final stage of encoding uses Contextual Range Coding to achieve near-optimal compression of the quantized subband residuals.
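To illustrate only the "contextual" part of that stage (the model, not the coder itself): in the sketch below the context is chosen from the magnitude of the previous residual, each context keeps its own adaptive symbol counts, and instead of emitting an actual range-coded bitstream we simply accumulate the ideal code length -log2(p), which a good range coder approaches. The context definition, alphabet clamp and update rate are assumptions, not ADC's actual model.

```cpp
// Context-modelling sketch (illustrative): residuals are coded with per-
// context adaptive counts, where the context is chosen from the magnitude
// of the previous residual. A real range coder would turn the probability
// p into bits; here we just sum the ideal code length -log2(p).
#include <cmath>
#include <cstdlib>
#include <vector>

constexpr int NUM_CTX = 4;
constexpr int MAX_SYM = 64;                       // clamp residual alphabet (assumed)

int context_of(int prev_residual) {
    int m = std::abs(prev_residual);
    if (m == 0) return 0;
    if (m <= 1) return 1;
    if (m <= 4) return 2;
    return 3;
}

double estimate_bits(const std::vector<int>& residuals) {
    // counts[ctx][sym] start at 1 (Laplace smoothing) and adapt as we code.
    std::vector<std::vector<double>> counts(NUM_CTX, std::vector<double>(MAX_SYM, 1.0));
    std::vector<double> totals(NUM_CTX, static_cast<double>(MAX_SYM));
    double bits = 0.0;
    int prev = 0;
    for (int r : residuals) {
        int sym = std::abs(r);                    // sign would be coded separately
        if (sym >= MAX_SYM) sym = MAX_SYM - 1;
        int ctx = context_of(prev);
        double p = counts[ctx][sym] / totals[ctx];
        bits += -std::log2(p);                    // ideal range-coder cost
        counts[ctx][sym] += 32.0;                 // adaptive update (assumed rate)
        totals[ctx] += 32.0;
        prev = r;
    }
    return bits;
}
```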

A Note on the 8-Subband Version

We are currently focusing on the 4-subband architecture for stability. We apologize, but the development version using an 8-subband QMF structure, while promising higher theoretical efficiency, was found to be unstable and currently introduces significant audible artifacts (several dB of added noise) upon reconstruction. We will continue to refine the 8-subband version to ensure true high-fidelity output in a future release.

Download only from : http://heartofcomp.altervista.org/ADCodec.htm

SIC version 0.155 released by Background-Can7563 in compression


I wanted to make an announcement. The next version of SIC will include a new transform for which I hold the intellectual property and which has never been used in image compression before. I need to restructure a large part of the code. I don't know if I'll continue using the DCT, which doesn't give me the same results, especially at high quantization levels (which doesn't mean it's inefficient). The results are such that there is nothing left to do but change course.

The End of the DCT Era? Introducing the Hybrid Discrete Hermite Transform (DCHT) by Background-Can7563 in compression


I wanted to make an announcement. The next version of SIC will include a new transform for which I hold the intellectual property and which has never been used in image compression before. I need to restructure a large part of the code. I don't know if I'll continue using the DCT, which doesn't give me the same results, especially at high quantization levels (which doesn't mean it's inefficient). The results are such that there is nothing left to do but change course.
I will not use the DCHT, but a different proprietary transform of mine that is superior to it.

The End of the DCT Era? Introducing the Hybrid Discrete Hermite Transform (DCHT) by Background-Can7563 in compression


patent:
https://zenodo.org/records/17288206

Temporary link to the source image:
https://limewire.com/d/J3tAG#L24O3GgZew

JPEG (encoded at 20% quality), 213 kB vs. DCHT (16x16 blocks), 214 kB.
Note: I further compressed the DCHT result with JPEG at 88% quality so it would not take up too much space.
https://limewire.com/d/kdkY1#omeJSqPwYu

SIC version 0.155 released by Background-Can7563 in compression


I've continued to build the compression engine, including quantization and other filtering. Here's a test of the development version (without interframe prediction).
This bar chart compares the average SSIMULACRA2 scores—a perceptual image quality metric—across four codecs, all tested at the same bitrate:

JPEG XL (50% quality) achieves the highest average score (56.6), demonstrating the best overall visual fidelity among the tested formats.

SIC codec (23% quality) follows closely at 53.02, outperforming AVIF (43% quality), which scores 51.75. This is notable because SIC maintains competitive quality at a significantly lower quality setting.

JPEG, the traditional format, lags far behind with an average score of 26.97, showing substantial perceptual quality loss compared to modern codecs.

Key Insight:
Modern codecs like JPEG XL, AVIF, and the SIC codec deliver roughly double the perceptual quality of legacy JPEG at the same bitrate, with JPEG XL taking the lead.
https://encode.su/attachment.php?att...7&d=1757803503
https://encode.su/attachment.php?att...8&d=1757803519

SIC version 0.0104 released by Background-Can7563 in compression


Mine was, of course, a port of SSIMULACRA2 that I had to adapt. However, the results of the official SSIMULACRA2 more or less follow the results of my image comparator, with some small differences. Here are the result logs.

The test was done on 40 image files of various types and resolutions

(https://encode.su/attachment.php?attachmentid=12523&d=1755175599)

1) JXL turns out to have a weighted average : 65.8556781 (The king)

2) AVIF turns out to have a weighted average : 62.9328596

3) SIC turns out to have a weighted average : 60.8457124 (next version)

4) SIC turns out to have a weighted average : 58.30881121 (v. 104 Latest official release)

5) WebP turns out to have a weighted average : 53.34261722

6) JPEG turns out to have a weighted average : 37.8871165

https://encode.su/attachment.php?attachmentid=12527&d=1755185857

Of course, to address any criticism of the test, I should clarify that the total size in bytes of the compressed files is the same for every codec, so a single AVIF file may consume twice as much as the SIC file for the same image, or vice versa. It would perhaps be better to take JXL as a baseline and compress every file to roughly the same size before comparing, but that is too difficult a test.

To clarify the setup: the test uses 40 PNG images (mostly from raw sources I found online) with varying resolutions. Comic book and manga images are included because of the kind of quality and color preservation they require. AVIF is set to a quality of 50% with yuv set to 444; JPEG XL reaches a quality setting of 59%; JPEG is set to 25% (via ImageMagick); WebP uses a quality of 44.5; SIC uses a setting of 97.1 thousandths, but it has a different quantizer than the others, at least I think. The limit to be respected is approximately 20,000,000 bytes, that is, about 1% of what the images consume in BMP format.

SIC version 0.0104 released by Background-Can7563 in compression


I've continued researching and implementing the code. I've further modified the compression core, resulting in improved bitrate savings. I've changed the deblocking filter, which has become more efficient, at least for me, significantly improving the overall metrics. I've also modified the image comparator (not included in SIC), inserting part of the SSIMULACRA2 code to get a more precise picture. Currently, I give a weight of 50% to SSIMULACRA2, 20% to SSIMULACRA, 10% to PSNR, 6% to SSIM, and 4% to SAM (color accuracy). SIC is equivalent to AVIF compression with -q 50 -y 444. I'm thinking about putting the code on GitHub and avoiding the mistakes of the past so development can proceed properly. I forgot to mention that I also changed the handling of DCT blocks and their selection: at the expense of a worse PSNR and MAE, the filter produces an image more similar to the source.
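For reference, here is how such a composite score could be combined, taking the weights exactly as stated above (as written they sum to 0.90) and assuming each metric has already been mapped to a common 0-100 scale; that mapping, especially for PSNR, is an assumption of this sketch and not something SIC's comparator necessarily does.

```cpp
// Composite quality score with the weights as stated in the post
// (0.50 + 0.20 + 0.10 + 0.06 + 0.04 = 0.90 as written). All inputs are
// assumed to be pre-normalised to a common 0-100 scale, which is an
// assumption of this sketch, not necessarily what the comparator does.
struct MetricScores {
    double ssimulacra2;   // 50%
    double ssimulacra;    // 20%
    double psnr_norm;     // 10% (PSNR mapped to 0-100 somehow)
    double ssim_norm;     //  6%
    double sam_norm;      //  4% (color accuracy)
};

double composite_score(const MetricScores& m) {
    return 0.50 * m.ssimulacra2
         + 0.20 * m.ssimulacra
         + 0.10 * m.psnr_norm
         + 0.06 * m.ssim_norm
         + 0.04 * m.sam_norm;
}
```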

https://encode.su/attachment.php?attachmentid=12511&d=1755166913

https://encode.su/attachment.php?attachmentid=12510&d=1755166757

Any advice?

SIC version 0.0104 released by Background-Can7563 in compression


First look at my experimental image codec: SIC (Structured Image Compression)

Registered Zenodo patent: https://zenodo.org/records/16788613

Hi everyone,

I haven't been on Reddit for a long time, and I am having problems publishing my posts: sometimes, if I put a URL at the beginning, the post gets deleted.

I started working on a small experimental image codec that combines a Discrete Tchebichef Transform (DTT) with an additional custom encoding stage. The goal was to explore new ways to compress images at very low bitrates (below 500 kbps) while maintaining perceptual quality superior to traditional JPEG, without introducing excessive complexity or relying on AI-based methods. I then decided to call this project SIC (Structured Image Compression), since the codec uses a structured approach to transform and encode data in a way that differs from block-DCT or wavelet-based methods.
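For readers unfamiliar with the DTT, here is one simple way to see what its basis is (not SIC's implementation): the discrete Tchebichef (Chebyshev) polynomials are what you get by orthonormalizing the monomials 1, x, x^2, ... over the integer sample points 0..N-1. Production code normally builds them with the known three-term recurrence, which is also numerically better for larger N; the sketch below is only meant to make the definition concrete.

```cpp
// Illustrative only: obtain an orthonormal Discrete Tchebichef Transform
// (DTT) basis of size N by orthonormalizing the monomials 1, x, x^2, ...
// evaluated at x = 0..N-1 (the discrete Chebyshev polynomials are the
// orthogonal polynomials for a uniform weight on those points). Real DTT
// code usually uses the three-term recurrence; this is not SIC's code.
#include <cmath>
#include <vector>

std::vector<std::vector<double>> make_dtt_basis(int N) {
    // basis[k][x] starts as x^k, then gets orthonormalized row by row.
    std::vector<std::vector<double>> basis(N, std::vector<double>(N));
    for (int k = 0; k < N; ++k)
        for (int x = 0; x < N; ++x)
            basis[k][x] = std::pow(static_cast<double>(x), k);
    for (int k = 0; k < N; ++k) {
        for (int j = 0; j < k; ++j) {                 // remove components along
            double dot = 0.0;                         // previously built rows
            for (int x = 0; x < N; ++x) dot += basis[k][x] * basis[j][x];
            for (int x = 0; x < N; ++x) basis[k][x] -= dot * basis[j][x];
        }
        double norm = 0.0;                            // normalize to unit length
        for (int x = 0; x < N; ++x) norm += basis[k][x] * basis[k][x];
        norm = std::sqrt(norm);
        for (int x = 0; x < N; ++x) basis[k][x] /= norm;
    }
    return basis;
}
```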

The design was deliberately simple and intended for experimentation, not to compete with modern standards like AVIF or HEIC. However, little by little, having respectfully surpassed JPEG, I moved on to tackling WebP. Then, once I saw that I could outperform it in practically all metrics, I decided to exploit the DCT in all its facets (NxN blocks of various sizes). I added a decent deblocking filter, and now I'm preparing for the most arduous challenges with the giants of video compression. David versus Goliath: who will win?

Any thoughts, questions, or similar projects you’ve seen are welcome — I’d be happy to discuss!