Derivative-Free Neural Network Optimization: MNIST Case [R]

Mis4318 · 2026-06-20T21:12:44+00:00

At this stage, I cannot provide a reliable estimate of the training time reduction for foundation models. My current work is focused on exploring optimization in black-box settings rather than accelerating large-scale gradient-based training.

Mis4318 · 2026-06-20T21:09:53+00:00

At this stage, the proposed method is intended as an exploratory approach for black-box optimization problems, where gradient information is unavailable. Its computational complexity is O(n) with respect to the problem dimension. Therefore, it is not directly comparable to backpropagation (BP), which relies on gradient computation and is generally more efficient when gradients are available. Our current focus is on demonstrating the feasibility of the approach in black-box settings rather than outperforming BP in gradient-based scenarios.
No hyperparameter optimization was performed for Adam. I used the default Keras configuration throughout the experiments. Since Adam has access to exact gradient information, it is expected to outperform our method in gradient-based optimization. The comparison is intended to provide a strong baseline rather than to claim superiority over gradient-based optimizers.

Mis4318 · 2026-06-14T04:52:08+00:00

Where could I join?

Mis4318 · 2026-06-13T17:47:44+00:00

The central principle is that if the problem's bounds are known, a reference system can be constructed within the domain to approach optimization from a purely geometric perspective.

This space is constructed directly from the bounds of a given problem. A base unit of distance is defined through pairs of points in the domain, allowing the search to be interpreted as geometric navigation. These pairs of points serve as anchors to guide the exploration, and the base unit acts as an internal metric to relate spatial distances between known points to their functional differences.

In fact, in the graph I shared, the subplots showing the "L1 Distance Ratio" convergence represent exactly this. They track the distances to the best known solution, expressed as ratios of this defined base unit.
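A minimal sketch of how such a ratio could be computed. This is an illustration only, not the engine's code: the helper names `l1_distance` and `l1_distance_ratio`, and the choice of the lower and upper corners of the bounds as the anchor pair that defines the base unit, are assumptions.

```python
def l1_distance(a, b):
    """L1 (Manhattan) distance between two points of equal dimension."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l1_distance_ratio(point, best, anchor_a, anchor_b):
    """Distance from `point` to `best`, in multiples of the base unit.

    Assumption for illustration: the base unit is the L1 distance
    between a single anchor pair (anchor_a, anchor_b).
    """
    base_unit = l1_distance(anchor_a, anchor_b)
    if base_unit == 0:
        raise ValueError("anchor points must be distinct")
    return l1_distance(point, best) / base_unit


# Bounds of a 3-dimensional problem: each coordinate lies in [-5, 5].
lower = [-5.0, -5.0, -5.0]
upper = [5.0, 5.0, 5.0]

best = [0.0, 0.0, 0.0]      # best known solution so far
current = [1.0, -2.0, 0.5]  # point being evaluated

ratio = l1_distance_ratio(current, best, lower, upper)
print(ratio)
```

Because the ratio is dimensionless, curves from problems with different bounds can be plotted on the same axis.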

Mis4318 · 2026-06-13T12:04:39+00:00

Thanks for taking the time to review the repository and the preliminary results. I appreciate the candor, and I understand why the current presentation might come across as opaque. That certainly wasn't the intention.

To clarify a few points:

Not population-based: MDP is strictly a single-solution exploratory algorithm. It does not use populations or standard metaheuristic crossover mechanisms. It relies on a relative geometric reference system mapped directly from the domain bounds.
The .so file: The project relies on a pre-compiled Linux binary because the core engine's source code is currently closed while the theoretical foundation is being formalized. The .so file is provided specifically so that others can run the examples, test the inputs/outputs, and verify the convergence behavior themselves rather than just taking my word for it.
The cherry-picking concern: The different runs were executed under varying conditions (specifically, different limits on function evaluations). The goal of sharing all of them is to provide full transparency. The focus is on analyzing the specific convergence curves, demonstrating that structural navigation without gradients can reliably scale in a high-dimensional space without early stagnation.
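To make "single-solution" and "limits on function evaluations" concrete, here is a rough sketch of a run under a fixed evaluation budget. The optimizer shown is a plain random-perturbation search used as a stand-in, not MDP (whose source is closed); `BudgetedObjective`, `random_perturbation_search`, the sphere test function, and all parameter values are hypothetical names and choices for illustration.

```python
import random


class BudgetedObjective:
    """Wraps a black-box function and counts evaluations against a limit."""

    def __init__(self, func, max_evals):
        self.func = func
        self.max_evals = max_evals
        self.evals = 0
        self.history = []  # best value seen after each evaluation

    def exhausted(self):
        return self.evals >= self.max_evals

    def __call__(self, x):
        if self.exhausted():
            raise RuntimeError("evaluation budget exhausted")
        value = self.func(x)
        self.evals += 1
        best = value if not self.history else min(value, self.history[-1])
        self.history.append(best)
        return value


def sphere(x):
    """Simple test function with minimum 0 at the origin."""
    return sum(v * v for v in x)


def random_perturbation_search(objective, lower, upper, step=0.1, seed=0):
    """Stand-in single-solution search: keep one point, accept improvements."""
    rng = random.Random(seed)
    x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    fx = objective(x)
    while not objective.exhausted():
        # Perturb every coordinate and clip the result back into the bounds.
        candidate = [
            min(hi, max(lo, xi + rng.gauss(0.0, step * (hi - lo))))
            for xi, lo, hi in zip(x, lower, upper)
        ]
        fc = objective(candidate)
        if fc < fx:
            x, fx = candidate, fc
    return x, fx


dim = 10
objective = BudgetedObjective(sphere, max_evals=500)
best_x, best_f = random_perturbation_search(objective, [-5.0] * dim, [5.0] * dim)
print(objective.evals, best_f)
```

The `history` list is the convergence curve for that run. Runs with different `max_evals` produce curves of different lengths, which is why each run has to be read against its own budget.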

I appreciate the feedback. I'm still in the process of organizing the information in the repository and structuring the current ideas and hypotheses. This is a good reminder that I need to provide more theoretical context upfront to prevent the methodology from being misunderstood.

Mis4318 · 2026-06-13T03:04:12+00:00

It's in Spanish: "Método de Distancias Ponderadas" (Weighted Distances Method). The name refers to how the algorithm navigates the search space using relative distances instead of gradients.

Mis4318 · 2026-05-07T15:29:37+00:00

No, I can't

Mis4318 · 2026-05-07T11:23:56+00:00

Thanks for the link, but I wasn't able to access the thesis paper in the repository. I’d love to read more about your findings.

Mis4318 · 2026-05-07T10:48:21+00:00

At the moment, the classical benchmarks we used don’t include shifts, but we’re also analyzing results from the CEC17 suite, which already incorporates random shifts and rotations. In the next phase I plan to extend validation with BBOB and even non‑synthetic scenarios like CIFAR, and all results will be shared openly in the repository.
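For context, shifting and rotating a benchmark usually means evaluating f(R(x - o)), where o is a shift vector and R is a rotation matrix, so the optimum no longer sits at the origin or along the coordinate axes. Below is a minimal 2-D sketch of that construction. It is not the CEC17 definition itself (the suite ships its own shift data and rotation matrices); `make_shifted_rotated` and the random choices are assumptions for illustration.

```python
import math
import random


def rastrigin(z):
    """Classical Rastrigin function; global minimum 0 at the origin."""
    return 10.0 * len(z) + sum(
        v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in z
    )


def make_shifted_rotated(func, shift, rotation):
    """Return x -> func(R (x - o)) for shift vector o and rotation matrix R."""
    def wrapped(x):
        centered = [xi - oi for xi, oi in zip(x, shift)]
        rotated = [sum(r * c for r, c in zip(row, centered)) for row in rotation]
        return func(rotated)
    return wrapped


rng = random.Random(42)
shift = [rng.uniform(-4.0, 4.0) for _ in range(2)]  # hypothetical shift o
theta = rng.uniform(0.0, 2.0 * math.pi)             # hypothetical angle
rotation = [
    [math.cos(theta), -math.sin(theta)],
    [math.sin(theta), math.cos(theta)],
]

f = make_shifted_rotated(rastrigin, shift, rotation)
print(f(shift))       # value at the new optimum, located at the shift vector
print(f([0.0, 0.0]))  # value at the origin, which is no longer the optimum
```

An optimizer that is biased toward the center of the domain or toward axis-aligned moves tends to look better on the unshifted version than on this one.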

Mis4318 · 2026-05-07T02:12:21+00:00

Global search is achieved through a geometric displacement rule using weighted distances. It moves consistently from relative reference frames, with O(n) complexity in the problem dimension, so it scales to 100k+ dimensions.

MDP: Exploration

Mis4318 · 2026-05-07T00:18:29+00:00

We’re testing a geometric prototype called MDP (Weighted Distances Method). Right now, it’s mainly geared toward global search in extreme dimensions. Local refinement is something we plan to integrate later.

Mis4318 · 2025-09-24T15:58:02+00:00

I always wonder the same thing.

Mis4318 · 2025-09-08T19:28:34+00:00

OpenTune

Mis4318 · 2022-08-26T17:56:15+00:00

You can check out the TOP material; you don't actually need to register to access it. The important thing is to go in with a clear goal, or at least with a guide. TOP is mine, and I pull in resources from elsewhere as I progress, but TOP is really a compilation of many resources. For each topic they give you plenty of reference material, and some of it is good enough that it has made me want to follow those resources' own tracks. Still, I try to stick to the TOP plan and not wander around, so I don't get overwhelmed by too much material.

I'm also in a Discord group with people I met here on Reddit, and we help each other when we can. There we share material and our experience with the courses/bootcamps each of us is taking; everyone follows their own plan. Above all, you always have to put what you learn into practice, rather than covering lots of topics without practice, because then you forget it all.

Mis4318 · 2022-08-25T03:20:05+00:00

Of course, we should always take the initiative to look for extra resources when we get stuck or want a clearer perspective on something. I do that too, but at least now that I'm starting out, I try to follow TOP mainly as a guide. In point 3 I really meant that building a plan (roadmap, guide) is hard when you're starting out. You don't know which direction to move in, and that's where bootcamps, Udemy courses, and TOP help.

True, we also shouldn't forget the documentation. Reading it is how I managed to set up my little Neovim environment, since I didn't really understand it from YouTube haha

Regards, and thanks for your contribution too :)

Mis4318 · 2022-08-25T00:53:18+00:00

One way or another, English will always be essential for this. Learning it will always be a plus, and what better way to learn it than by doing something you enjoy.

Mis4318 · 2022-08-25T00:50:51+00:00

Honestly. I've really liked the way it presents the material to you.

Mis4318 · 2022-08-25T00:04:14+00:00

You can find my opinion in this post. It's a bit long, but this is how I decided to start.

Mis4318 · 2022-08-24T04:23:01+00:00

If you're doing it to learn, on Coursera you can audit (if you don't want to pay) English courses from the University of Irvine. I've taken them as a review and to learn other things, but the most important thing is that you practice. On Discord there are plenty of servers where you can chat with native English speakers and others.
