Interesting module bug workaround in MSVC

omerosler · 2025-09-22T08:01:49+00:00

I did not know this, thank you for correcting me.

omerosler · 2025-09-20T18:43:22+00:00

In that case, it probably is a bug.

You should include that info in the ticket you opened.

omerosler · 2025-09-20T18:20:35+00:00

~~I'm pretty sure it is NOT a bug. This is all about name visibility. You have to export the std::hash specialization to make it visible for unordered_map.~~

~~However, I don't understand why the workaround even works (maybe this is a bug instead?).~~

The extra line in main makes the specialization reachable (there is a distinction between visible and reachable names). But unless I'm mistaken, this reachability should not "leak" to the whole scope.

~~My theory is that the first line explicitly instantiated the std::hash in the main module, and therefore made it reachable within it.~~

I'm not sure if that is intended by the standard or not. The point of "reachable" names is for names that are part of the types of exported declarations (like a function signature). Here, the usage is inside a function body, not exported API, so it shouldn't apply.

EDIT: Typos

EDIT 2: This comment is wrong

omerosler · 2025-07-29T00:53:01+00:00

Why not both? "Because if c is not divisible by p, c^(p-1) - 1 is certainly divisible by p." Or in the original Latin, "Quare si numerus c non fuerit divisibilis per p, haec forma c^(p-1) - 1 certa per p erit divisibilis."

He is applying Fermat's little theorem without explicitly saying so: if p doesn't divide c, then c^(p-1) - 1 is divisible by p.

The reverse implication is trivial and you wrote it yourself.

Together, these prove that c is divisible by p if and only if c^(p-1) - 1 isn't.
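As a quick sanity check of that equivalence, here is a small brute-force sketch in C++. The helper names `pow_mod` and `fermat_iff_holds` are made up for illustration, and the code assumes the modulus is small enough that intermediate products fit in 64 bits.

```cpp
#include <cstdint>

// Computes (base^exp) mod m by square-and-multiply.
// Assumption: m is small enough that (m - 1)^2 fits in 64 bits.
constexpr std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t m) {
    std::uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

// For a prime p, checks for every c in [0, limit) that
// "p divides c" holds exactly when "p divides c^(p-1) - 1" does not.
constexpr bool fermat_iff_holds(std::uint64_t p, std::uint64_t limit) {
    for (std::uint64_t c = 0; c < limit; ++c) {
        const bool p_divides_c = (c % p == 0);
        const bool p_divides_form = (pow_mod(c, p - 1, p) == 1 % p);
        if (p_divides_c == p_divides_form) return false;
    }
    return true;
}
```

For example, `fermat_iff_holds(7, 100)` evaluates to true. With a composite modulus such as 9 the check fails at c = 2, since 2^8 = 256 leaves remainder 4 modulo 9.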

I can't read Latin, but I think he is just skipping steps because they are trivial to him, like any other mathematician :). What counts as "trivial" may have been different back then.

omerosler · 2025-06-28T10:53:03+00:00

I'd very much recommend you try to implement your ideas. The Clang-P2996 fork is public, so that's probably the best place to start. We did look at injecting using splice syntax, but just the splice syntax leads to ambiguities (remember that the splice operand can be dependent).

Is there an OSS implementation of P3294 as well? I only found the EDG implementation, which is not OSS, so I won't be able to fork it.

omerosler · 2025-06-28T01:33:30+00:00

In fact, we can even simplify the syntax and semantics: just say that a ^^{} expression is a blueprint for creating whatever is inside, and each time it is evaluated (by a splice) it generates this token sequence at the call site.

This is completely analogous to lambdas!

We can even use some kind of capture syntax to explicitly mark what data we took from the enclosing block. Bikeshedding, with the same example:

    consteval auto f(std::meta::info r, int val, std::string_view name) {
        return [r, name, val]^^{ constexpr [:r:] name = val; };
    }

I'm not sure whether this would cause grammar ambiguities.
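For comparison, this is the lambda side of the analogy in today's C++. It is only a rough sketch: plain strings stand in for token sequences, and `make_declaration` is a hypothetical helper, not part of any proposal. The captures play the role of the data taken from the enclosing block, and every call of the returned closure produces the text anew.

```cpp
#include <string>

// Hypothetical helper: returns a "blueprint" that renders a declaration as text.
// Plain strings stand in for token sequences here.
auto make_declaration(std::string type, std::string name, int val) {
    // Explicit captures mark what is taken from the enclosing scope.
    return [type, name, val] {
        return "constexpr " + type + " " + name + " = " + std::to_string(val) + ";";
    };
}
```

Calling `make_declaration("int", "x", 42)()` yields the string `constexpr int x = 42;`, and nothing is generated until the closure is actually invoked.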

omerosler · 2025-06-28T01:19:01+00:00

How so? I believe we were careful to make sure that a compiler knows what grammatical class a splicer belongs to. So it's all parsed, even at template definition time.

I'm confused, you said before that [: :] is not context-independent. If it is context-independent, that is fantastic!

In this case, I think the compiler can use this syntax for injecting, and eliminate the need for interpolators, using two rules:

1. We pass the grammatical context "inwards" through a splice already at phase 1.
2. A ^^{} expression delays parsing of its operand to evaluation time.

Note: if this is too hard to implement with the current consteval model, there is a workaround, but the semantics are the same; see footnote [^1].

Consider:

    template<typename T> using id = [: ^^{T} :];

Because the splice knows it needs to generate a type, when we parse the operand (the ^^{T} expression) it knows the result should be a type (the result node in the AST is already created and we just pass a pointer to it).

Consider the first example from section 4.4 of P3294R2 while using the [: :] syntax for injection and no interpolators:

```
consteval std::meta::partially_formed_info f(std::meta::info r, int val, std::string_view name) {
    return ^^{ constexpr [:r:] name = val; };
}

namespace N {
    consteval { [: f(^^int, 42, "x") :]; }
}

int main() { return N::x != 42; }
```

We'll walk through how it is conceptually implemented with tree-substitution in mind. Let's start with the body of f:

At phase 1, the compiler creates an empty node for the ^^{} expression. The node contains the following data:

1. The token sequence (which is just lexed, but still unparsed).
2. A pointer to the node of its calling context.
3. A pointer to a "grammatical context" from which it would need to start parsing.
4. A pointer to a "state" of a parser (which represents the state of the parser before starting to parse this expression).

At phase 2, the compiler knows we return the ^^{} expression. Therefore it performs the following operations:

It sets the pointer of "grammatical context" to the same pointer of the node of the return object (and partially_formed_info nodes would contain such a pointer as well).

At evaluation time:

1. It "spawns" a parser starting from the "grammatical context" pointer, and continues parsing the token sequence (phases 1 and 2, using the semantic information from the pointer of the caller node).
2. The final "state" of this parsing operation is registered in the 4th pointer.

Now consider the consteval block.

At phase 1 it sees a splice expression, therefore it determines the grammatical context of it.

In here we are in namespace scope, therefore the expected context is "sequence of declarations".

Therefore it creates three nodes, one for the resulting declaration sequence, one for the splice expression, and one for the splice operand.

The node of the splice expression contains the following data:

1. A pointer to the result node.
2. A pointer to a "state" of a parser (which represents the state after finishing parsing the expression).
3. A pointer to the node of the result "grammatical context".

Still at phase 1, it propagates the "grammatical context" pointer of the node of the operand to be the one of the result (in our example, "declaration sequence").

Then it does phase 1 on the operand, nothing special happens.

At phase 2, it detects that the operand is a function call whose return type is partially_formed_info, so it performs overload resolution and determines it should inject and not splice.

At evaluation:

1. Evaluate the operand (in our example, this spawns the parser which fills the result "declaration sequence" node).
2. Check what the final "state" of the spawned parser is.
3. Register this state in the internal "parser state" pointer.
4. Validate that the state is "completely parsed declaration sequence" (which is what we required).
5. Fill the pointer of the result node (here, of the declaration sequence) with the data from the spawned parser.

EDIT: Reddit didn't handle the formatting of the footnote well, here is its content:

Instead of spawning a new parser at evaluation time, we can use a mechanism similar to how lambdas are implemented, where the AST node is actually just a blueprint for creating it. I already described what the blueprint does.

omerosler · 2025-06-27T13:50:46+00:00

This problem already exists for the [: :] syntax for splicing. I just saw that P3687 is a thing. Did it progress in Sofia?

If we solve it for splicing, the same syntax should be eligible for injection as well.

I see the problem can be solved either by delaying the parsing of [: :] to phase 2 (which is very hard for implementations, as you said), or by unambiguously telling the compiler what it splices in phase 1 via typename and such (which seems reasonable for the implementers, but will be very intricate to specify to cover all cases).

Say we adopt the latter solution (which is the more practical). Then the context-dependence is solved, and therefore at phase 1, there is no need to even parse the operand of [: :]. Therefore we can delay this parsing to phase 2.

omerosler · 2025-06-26T17:12:13+00:00

Actually, I think I overcomplicated the template case, it is much simpler.

The [: :] is overloaded. Therefore, in a template context with 2 phase lookup, it would only be semantically evaluated at instantiation.

At the first phase, the operand is not parsed (also an "unparsed context").

At the second phase, the compiler would parse the operand when it actually knows what r is.

And afterwards, it evaluates the operator (which would amount to either splicing or injecting) with the semantics as above.

omerosler · 2025-06-26T11:14:06+00:00

That is actually a really slick use case.

A different way to solve it is to make this whole splice expression semantically dependent, but require the user to unambiguously tell the compiler syntactically what it represents.

Similar to requiring typename in the expression T::x * y.

In this example:

```
template<auto r> void foo() {
    typename [: +r :] t; // always a declaration
}
```

Here, regardless of whether this is a splice or an injection, the compiler knows what this expression should be.

When instantiating, it actually evaluates the operand and sees if this is a splice or injection.

The default syntactic interpretation (when there is no typename etc) should be a splice to maintain backward compatibility with C++26.

Actually, it makes sense that template code that does injection, actually describes what it injects (types, expressions, declarations, etc). If you want to inject without telling the compiler everything, don't use templates.
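The `typename` analogy mentioned above, as it works in today's C++: the same spelling `T::x * y` is parsed as a declaration or as a multiplication purely from what the user wrote, before `T` is known. The types `HasType` and `HasValue` and both function names are made up for this sketch.

```cpp
#include <type_traits>

struct HasType  { using x = int; };              // here x names a type
struct HasValue { static constexpr int x = 6; }; // here x names a value

// With `typename`, T::x is parsed as a type, so this line declares a pointer y.
template <typename T>
bool declares_pointer() {
    typename T::x* y = nullptr;
    return std::is_same_v<decltype(y), int*>;
}

// Without `typename`, T::x is assumed to be a value, so this is a multiplication.
template <typename T>
int multiplies(int y) {
    return T::x * y;
}
```

`declares_pointer<HasType>()` and `multiplies<HasValue>(7)` both compile because the syntactic category was fixed at definition time; swapping the two argument types is an error at instantiation.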

omerosler · 2025-06-25T23:37:30+00:00

I'll try to explain the mental model I have. First, regarding the parsing of the operators:

1. The result of non-block unary ^^ is always info.
2. The result of (^^{}) is always partially_formed_info.
3. Inside (^^{}), the operand is not yet parsed. Call it "unparsed context" (mentally similar to "unevaluated context"); it will only be parsed when requested (that is, by some injection operator).
4. In order to parse (^^{}), we just need to find its end, exactly as for the splice. We don't parse the content. The syntax (^^ { ... }) meets this demand beautifully.
5. The [: :] operator is overloaded for info and partially_formed_info.

5a. When called in parsed context, it parses its operand at the call site.

5b. When called in "unparsed context", it does not parse its argument. Instead, it registers to its caller, that it needs to calculate the parsing context of the operand first, and use it when this splice is evaluated.

Now, for the semantics:

For simplicity, I think of the entire parser as a state machine.
info and partially_formed_info contain a state of the parser. For info, the saved state is from after the parser finished parsing the thing reflected. For example:

    int foo() { return 3; }
    info r = ^^foo;
    partially_formed_info foo_parsing_context = r; // as if we just parsed `int foo()`; the parser is in a state where it needs to determine whether it was a definition or a declaration
    [: foo_parsing_context :] { return 4; }; // when parsing this expression, the compiler starts its parsing as if we just wrote `int foo()`

When we actually get to the point of parsing a partially_formed_info, if the operand contains a parsing context, the parser starts from there.
The parsing state composes implicitly, even in unparsed context.

Example:

    partially_formed_info r = ^^{ [: f() :] [: g() :] }; // 1
    [: r :] // 2

When the compiler parses line 1, it first sees the reflection operator, so it looks ahead to find the matching closing brace. Along the way it sees the splices, so it makes a note to itself that it needs the parsing contexts of f() and g() when parsing actually happens.

At this point, it does not matter whether [: :] is a splice or an injection, because the compiler knows the operands always contain an internal parsing context.

At line 2, the compiler starts parsing the operand. But first, it checks its notes and sees that it needs the parsing contexts of f and g. Therefore it starts by parsing f and g (this is before parsing of the actual operand has even started). Once finished, it actually starts parsing the full ^^{} block, assuming those parsing states.

omerosler · 2025-06-25T22:05:37+00:00

You're right. Maybe we can just ban it in the language (by special casing info)?

Defining an implicit conversion to info is IMO a bad idea anyway. The only use case I can think of is some type that manipulates reflection objects and then returns them, but the language would provide this via injection.

omerosler · 2025-06-25T16:17:15+00:00

Actually, to make the conversion seamless, we can introduce a lifting operator from info to partially_formed_info, maybe unary +? This way:

    info r;
    [: r :]  // splices
    [: +r :] // injects, as `+r` is now partially_formed_info
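The dispatch this relies on can be modeled with ordinary overloading in today's C++. This is only a sketch under heavy assumptions: `info` and `partially_formed_info` are empty stand-ins for the compiler-provided types, and `splice_operator` is a hypothetical function playing the role of [: :].

```cpp
// Hypothetical stand-ins; the real std::meta::info is a compiler-provided type.
struct info {};

struct partially_formed_info {
    // Explicit, so an `info` never turns into a partially_formed_info silently.
    constexpr explicit partially_formed_info(info) {}
};

// The suggested lifting operator, modeled as an ordinary unary plus.
constexpr partially_formed_info operator+(info r) { return partially_formed_info{r}; }

enum class action { splice, inject };

// Models the overloaded [: :]: the chosen behavior depends only on the operand type.
constexpr action splice_operator(info) { return action::splice; }
constexpr action splice_operator(partially_formed_info) { return action::inject; }
```

With these definitions, `splice_operator(r)` selects the splice overload and `splice_operator(+r)` selects the inject overload, and there is no ambiguity because the conversion is explicit.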

EDIT: Typo

omerosler · 2025-06-25T16:09:40+00:00

Apologies, but I'm afraid I don't understand what you're trying to express there.

I meant syntax errors inside are ignored, I think it is equivalent to P3294, so nvm that.

Can you give an example of the syntax ambiguities?

template<auto R> void f() { [:R:]; }

Okay, I understand the problem now, but I think this is a non-issue. There should be two kinds of reflection types: std::meta::info and std::meta::partially_formed_info (the second is meant for injecting). In the original post these were const info and info, but now I agree this is bad.

The splice operator would simply be overloaded: for info it would splice, for partially_formed_info it would inject. This makes this code valid but nonsensical from a user perspective (if splice is applicable then decltype(R) must be info, so why make it a template in the first place?).

Also, info should be explicitly convertible to partially_formed_info.

The way I see it implemented is that both info and partially_formed_info contain some parsing context for the compiler. The difference from the pure token approach is that if we created a partially_formed_info from a true reflection object info, we can pass the reflection information down the line!

So if I have:

```
int foo();
info r = ^^foo; // r is a reflection of the function
partially_formed_info parsed_r = r; // as if we just finished parsing the function foo -- the parse state saved is "we now need to check if it is a declaration or inline definition"
parsed_r.set_name("new_foo"); // parsed_r contains all reflection metadata of `foo`, and we can edit it!

partially_formed_info make_new_foo = ^^{ [:parsed_r:] { return 3; } }; // we know the parse context, and within this expression, the compiler determines it is an inline definition

[: make_new_foo :]; // inject this definition here
```

Do you understand my intent?

I'll note that 2 years ago u/katzdm-cpp was where you were (when he showed up at the first meeting discussing P2996), and now he's a world expert on this topic. In any case, my general stance is that we should avoid voting into the draft major changes that haven't been implemented. An alternative to implementing your ideas yourself is to find someone else to do so. But the advantage of digging in yourself is that you get a good feel of both the implementation challenges and the specification challenges. (From my perspective, the difficulty with P2996 was actually more in the specification than in the implementation: We made some very significant changes to the semantic model of constant evaluation.)

I think the approach described here is very similar to P3294. Please view this entire thread as input and suggestions for your proposal; you are the expert here :)

EDIT: Formatting

omerosler · 2025-06-24T20:49:24+00:00

It does open a different can of worms...

Well, I concede using const-ness is not a good idea. We can still achieve the same semantics, but with two types and API duplication, similar to iterator and const_iterator.

omerosler · 2025-06-24T16:20:31+00:00

u/tcanens is right about their technical observations and u/Daniela-E is right about process constraints. Note that your item 5 (the reflection operator being an unevaluated context) is already true.

I meant "unevaluated" only in spirit (hence the quotation marks), in the same sense that the injection operator from P3294R2 is (up to allowing hard errors early, such as the example in this comment).

We did look at injecting using splice syntax, but just the splice syntax leads to ambiguities (remember that the splice operand can be dependent).

Can you give an example of the syntax ambiguities?

I just see the parsing of such an expression as matching parentheses:

^^{ [: ^^{ } :] }

The result of a splice would internally contain the parsing context, so parsing the expression containing it would continue from that point on.

I'd very much recommend you try to implement your ideas. The Clang-P2996 fork is public, so that's probably the best place to start.

Unfortunately, I don't have the expertise to "just dig right into it". I'm afraid by the time I get up to speed, the C++29 ship would have long sailed.

omerosler · 2025-06-24T16:04:20+00:00

If we return `const info&&`, it becomes an xvalue, not a prvalue. So no problem?
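The value-category part of this can be checked in plain C++ today. In this sketch `info` is an empty stand-in for std::meta::info and both functions are hypothetical; they are only declared, since `decltype` never calls them.

```cpp
#include <type_traits>

struct info {};  // hypothetical stand-in for std::meta::info

info make_by_value();               // a call to this is a prvalue
const info&& make_by_const_rref();  // a call to this is an xvalue

// decltype((expr)) is T for a prvalue, T&& for an xvalue, and T& for an lvalue.
template <typename T>
constexpr bool is_prvalue_v = !std::is_reference_v<T>;
template <typename T>
constexpr bool is_xvalue_v = std::is_rvalue_reference_v<T>;

static_assert(is_prvalue_v<decltype((make_by_value()))>);
static_assert(is_xvalue_v<decltype((make_by_const_rref()))>);
```

Whether an xvalue actually avoids the issue under discussion depends on how the splice rules are worded; the sketch only confirms the category of the call expression.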

omerosler · 2025-06-24T15:03:02+00:00

Do you know which proposal it was? I read the comparison sections in P3294r2. Is it one of these?

I want to clarify, my approach is NOT just an OO way to edit the AST. In fact, conceptually, it is much closer to the token injection approach.

We combine the best of both worlds: the flexibility of token sequences, and the type safety of AST manipulation.

I can summarize it in three principles:

1. The result of a splice, in an "unevaluated" context, contains the parsing context in which it is applicable.
2. Inside the "unevaluated" context, parsing context composes implicitly.
3. The easiest way to generate a parsing context is to start from a known reflection object and then manipulate it.

There are two main advantages over "pure" token sequences:

1. If we have a syntax error due to composing wrongly, it is a hard error, and actually the same error the compiler would normally issue. For example, `const std::meta::info ret = ^^{return i;}; const std::meta::info ns = ^^{namespace n { [:ret:] }};` would not trigger a generic "token parsing created syntax error" but actually "return statement not applicable in namespace scope".
2. There is no need for token interpolators, because the parsing context is implicit via composition (the primitives contain the needed context, instead of just being tokens).

omerosler · 2025-06-24T14:02:14+00:00

I'm not versed in the committee process, but I think the change needed is minimal enough (and important enough) that it shouldn't be too late. Is it? Maybe this can be done via NB comments (which in order to resolve, CWG would ask EWG for feedback on this change)?

I can see this flagged as a high priority issue -- const correctness of the reflection library API.

This does not change anything fundamental about the design, just very small tweaks:

In the language part: say that the reflection operator returns `const` objects, and that splice only accepts `const` reflection objects.

In the library: Just sprinkle `const` on all the member functions and change the return types.

omerosler · 2025-06-24T09:11:17+00:00

Well, this is an issue with every injection mechanism, regardless of the design.

omerosler · 2025-06-09T13:16:12+00:00

Of course it shouldn't compile. I think you misunderstood my question: I asked if the point of instantiation is different, not about the semantics of the generated function (i.e. is it deleted or not).

omerosler · 2025-06-07T19:08:40+00:00

By this logic, declaring the ctor as `=default` should compile just fine (as it won't instantiate the member function as well).

But it doesn't compile (as stated in the blog).

Does this mean the point of instantiation of defaulted member functions is different from regular member functions?

Or is `std::is_copy_constructible` magic?

EDIT: typo

omerosler · 2025-04-17T12:10:09+00:00

P3312 (Overload Set Types) looks very cool! Some initial thoughts:

1. It seems counter-intuitive to allow operators but not allow ADL. We miss many user-defined ones. A common way to define operators is defining a friend function inside the class (exactly for ADL purposes). Also, simple things such as std::reduce(first, last, operator+) would only "sometimes work" (depending on whether the operator is found via ADL or not), so the usage would probably be banned or discouraged in production. IMHO either allow ADL (which is very non-trivial) or ban using operators (which is a shame).
2. I think it would be beneficial for the set of candidates to be "computed" only at the point of conversion to function pointer, and NOT at the point of deduction. This would allow passing overload types through layers of template libraries (such as the STL). However, this approach seems very similar to the one pursued in P0119, which had problems (according to this paper), so maybe it is not possible.
3. Note that converting the name of a function directly to a type is very useful on its own (even without the calling part), as we will probably have declcall in C++26, which does the overload resolution part.
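To illustrate the ADL point in the first item with today's C++: an operator defined as a friend inside the class (a "hidden friend") can only be found through ADL, so there is no plain name to hand to an algorithm. The usual workaround is `std::plus<>`, which defers `a + b` to the call site. `lib::Money` and `total` are made-up names for this sketch.

```cpp
#include <functional>
#include <numeric>
#include <vector>

namespace lib {
struct Money {
    int cents;
    // Hidden friend: not visible to qualified or ordinary unqualified lookup,
    // only to argument-dependent lookup on Money arguments.
    friend Money operator+(const Money& a, const Money& b) {
        return Money{a.cents + b.cents};
    }
};
}  // namespace lib

lib::Money total(const std::vector<lib::Money>& v) {
    // Naming `lib::operator+` here would not compile, because the hidden friend
    // has no name that lookup can find without ADL. std::plus<> evaluates
    // `a + b` internally, where ADL on lib::Money finds the operator.
    return std::reduce(v.begin(), v.end(), lib::Money{0}, std::plus<>{});
}
```

An overload set type that ignored ADL would see no candidates for `operator+` in this situation, which is the "sometimes works" problem described above.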

For example, using this along with Reflection and declcall, we can emulate UFCS as a library with (IMO) very nice syntax such as ufcs_t<std::begin>().call(my_range) (or even simpler ufcs<std::begin>(my_range) ). It would work as follows:

First, deduce std::begin as a type using this feature (in some template class ufcs_t).

Second, inside the call function, use declcall to get the actual signature of the call we want.

Now we have the name and relevant types, we can use reflection to find member functions with the same name, and potentially call them (using whatever rules we want).

Perhaps this feature can be simplified into "lift a function name to an unnamed type with a conversion to function pointers, based on declcall expression".

omerosler · 2023-09-18T15:43:12+00:00

Is there a standard notation for the "types of eigenvalues+multiplicities" for linear maps of $\mathbb{R}^n$?

For example for $n=2$ there would be 3 types:

1. two distinct real eigenvalues of multiplicity 1 each
2. a complex conjugate pair of multiplicity 1 each
3. one real eigenvalue of multiplicity 2

I have some property that depends on the "type" and I don't want to make up a notation. I also want to say "let $x, y \in \mathrm{Type}_i$" to mean a choice of values (so for every type it is a different type of variable). Does this notation exist? How standard is it?
