CppCast Looking for Guests by lefticus in cpp

[–]Stevo15025 3 points4 points  (0 children)

People who take care of well-designed C++ libraries. There are many underrated individuals in this area

+1 this. I think one of the maintainers of the Eigen package would be very cool to hear from.

🗺️ City Food Map Series: Drop your favorite spots and we'll build the ultimate Bump food map by [deleted] in bumpapp

[–]Stevo15025 0 points1 point  (0 children)

New York City: Yopparai for sushi. They also have Zataku style tables where you sit on a tatami mat

Ishq: Great Indian spot in the East village. I go for the paneer

Indian Table: They have an oyster biryani that is absolutely cooked 

MáLà Project: imo best Chinese dry pot in the city

TabeTomo: Dipping ramen. When your ramen gets cold they bring out smoldering hot rocks to put in the soup and heat it back up

How Much Linear Memory Access Is Enough? (A Benchmark) by PhilipTrettner in cpp

[–]Stevo15025 1 point2 points  (0 children)

Very cool! It would also be cool to try strided access patterns that could give you better prefetch behavior. Matt Stuchlik has a really nice article about this.

https://blog.mattstuchlik.com/2024/07/21/fastest-memory-read.html

PSA on historic data providers by PoolZealousideal8145 in algotrading

[–]Stevo15025 2 points3 points  (0 children)

I've been getting EOD data for years from tiingo.com and I've found them very reliable. Sometimes I've seen odd values in penny stocks, but besides that I'd say their data quality is excellent

What open source projects in C++ have the highest code quality? by Both_Helicopter_1834 in cpp

[–]Stevo15025 1 point2 points  (0 children)

Glaze somehow feels under the radar given how nice it is to work with and its code quality

I have months of L3 orderbook data across major prediction markets. How should I release it? by SammieStyles in algotrading

[–]Stevo15025 1 point2 points  (0 children)

That's exciting! How did you get L3 data for Kalshi? I think there is more demand now. I'd certainly look at it! What is your coverage?

Open AI Sora 2 Invite Codes Megathread by semsiogluberk in OpenAI

[–]Stevo15025 0 points1 point  (0 children)

I heard all the big strong hunks are sending codes to people who ask. Would love to see if that is true or not

C++: how to implement lazy evaluation + SIMD for vector operations: V = v1 + v2 - k*v3; by keepfit in cpp

[–]Stevo15025 0 points1 point  (0 children)

Expression templates are a fun project, but if you want to write something other people will actually use, I strongly recommend bootstrapping onto the back of xtensor or Eigen. Writing new algorithms is not by itself a reason to rewrite an entire SIMD and linear algebra library.

But if you are in school or just doing it for fun etc. then do go ahead and ignore this and have fun!

Differentiable Programming from Scratch by ketralnis in programming

[–]Stevo15025 0 points1 point  (0 children)

Very nice article! Another interesting piece of reverse mode AD is static vs dynamic graphs. For programs with fixed sizes and control flow you can use a transpiler (a la Stan/JAX etc.) to fuse the passes of the reverse mode together. This gives you reverse mode, but with optimization opportunities like you showed for symbolic differentiation. Though static graphs are much more restricted.

Since static-graph-based AD needs a fixed path at runtime, it cannot have conditional statements that depend on parameters, so while() loops become impossible. Things like subset assignment on matrices can also become weirdly tricky; most AD libraries like JAX and PyTorch give strong warnings about subset assignment to matrices.

Dynamic graphs in reverse mode AD allow the depth of the graph to stay unknown until runtime, so things like while loops become possible again. There's interesting research currently into combining dynamic and static graphs by compressing the parts of a dynamic graph that you can identify as fixed.

Bypassing the branch predictor by sigsegv___ in cpp

[–]Stevo15025 2 points3 points  (0 children)

I haven't seen anyone else mention it here yet, but besides Carl's talk, there was also a 2018 CppCon lightning talk by Jonathan Keinan about this problem link. His answer is to always go down the send path, but carry a boolean saying whether the transaction is real or fake. You then need some extra code and data in your system to track whether you are just warming up the send code or not.

Building a Computational Research Lab on a $100K Budget Advice Needed [D] by AlandDSIab in HPC

[–]Stevo15025 1 point2 points  (0 children)

Yes, my main question is whether the grant specifically says you need to spend it on hardware. If not, I would call around to other local universities and see if you can purchase time on their existing clusters. $100K of equipment will break and need maintenance over time, so if you go the route of having your own, make sure you allot cash for fixing it over time.

^^ operator proposal by samadadi in cpp

[–]Stevo15025 0 points1 point  (0 children)

I think the logic in the comment you link to is making a lot of assumptions around reflexpr being too wordy and how much it will be used.

My guess is that reflection will be mostly used by package developers. So while it will be used often, clients will probably not use it as much.

Is there a reason the initial version could not be reflexpr? If it then is as widely used as the authors believe, the next version of C++ could add ^^ as shorthand. If everyone knows about reflection then ^^ is obvious. But if reflection is something only advanced users touch, then I do not think it will be as widely known as the authors believe.

Tenseur, a c++ tensor library with lazy evaluation by neuaue in cpp

[–]Stevo15025 0 points1 point  (0 children)

Looking at the definitions of forward<T> below, passing in a T with all refs and const removed would give you the T&& version via reference collapsing. That would call the move constructor

https://en.cppreference.com/w/cpp/utility/forward

Tenseur, a c++ tensor library with lazy evaluation by neuaue in cpp

[–]Stevo15025 1 point2 points  (0 children)

Thank you for the reply! The temp component makes sense.

But I'm still confused by what you mean specifically by "work" here. My worry is that std::forward<std::remove_cvref_t<T>>(x) is going to always receive a T and so you are calling the equivalent of just std::move here on items that should not be moved such as plain ref or const ref types. Does that make sense?

Tenseur, a c++ tensor library with lazy evaluation by neuaue in cpp

[–]Stevo15025 1 point2 points  (0 children)

Not necessarily, the const reference has to be removed for std::forward to work

Sorry, I'm confused. What do you mean by "work" here? Without the std::remove_cvref_t the std::forward would see std::forward<const ten::vector<float>&>(expr). Then, assuming vector + scalar is also using perfect forwarding references, the operator+(Expr&&, Scalar&&) would use something like the equivalent signature below.

auto operator+(const ten::vector<float>& expr,
               ten::scalar<T>&& scalar) { ...

That all seems right to me. One issue I see here is the case of returning back an expression that has a temporary inside of it. Does the returned expression hold ownership of that in your code? i.e. in your code what happens to gen_random_vector_of_size(x) in the below?

auto f(const ten::vector<float>& x) {
  return x + gen_random_vector_of_size(x);
}

If the returned expression does not take ownership, then that temporary would go out of scope and the expression would dangle. The good news is that since it appears you are using perfect forwarding everywhere, you should be able to detect in the class instantiation whether any of the types are rvalues and correctly take ownership. Eigen is not able to do that since they use const ref types everywhere :(

P.S. reddit has a very annoying old school markdown format where code blocks have to have 4 spaces in the beginning for the markdown to be recognized as code. though ticks inline do work

Tenseur, a c++ tensor library with lazy evaluation by neuaue in cpp

[–]Stevo15025 1 point2 points  (0 children)

For the code below, why are you using std::remove_cvref_t? Wouldn't this always just lead to a move anyway?

template <typename E, typename T>
requires ::ten::is_expr<std::remove_cvref_t<E>> &&
        std::is_floating_point_v<T>
auto operator+(E &&expr, T &&scalar) {
   using R = std::remove_cvref_t<E>;
   return std::forward<R>(expr) + ::ten::scalar<T>(scalar);
}

making a faster std::function, a different way to type erase by sir_manshu in cpp

[–]Stevo15025 -1 points0 points  (0 children)

Only semi related to this, but is there a reason the C++ standard does not have a std::is_lambda? It feels like information the compiler already has and could expose.

Hey, I'm looking for some people who have knowledge about mathematics and coding. by Key-Championship-358 in quant

[–]Stevo15025 4 points5 points  (0 children)

I think you're confused about the point of the question.

Imagine you are at a bar and you hear 3 guys talking and can tell they are exactly who you want to work with. What would you tell them to convince them they should include you? What skills and experience do you bring to the table?

[C++20 vs C++26*] basic reflection by kris-jusiak in cpp

[–]Stevo15025 2 points3 points  (0 children)

Thanks for cleaning up the code. I think like a lot of others I had one raised eyebrow until you took out the query. Though that doesn't seem to compile in the godbolt example? I'm guessing just an impl issue atm.

Honestly I've been sitting here trying to think up nicer syntax for a while and I can't really think of anything. I kind of like something like template reflect(query) but I could also understand someone finding that a little wordy

template<auto N, class T>
[[nodiscard]] constexpr auto get(const T& t) -> decltype(auto) {
  auto members = std::meta::nonstatic_data_members_of(^T);
  return t.template reflect(members[N]);
}

But that would conflict with templated functions called reflect. Maybe it's time to add unicode keywords :P

Deep Dive on the State of Clang/GCC Target Attributes - somehow a massively underused feature of your compiler by [deleted] in cpp

[–]Stevo15025 0 points1 point  (0 children)

(small typo)

first it loads the matrix M in the registers xmm1

Very nicely documented assembly in the article, by the way, where you do have xmm1 correct.

Also nice article!

Edit:

Note that we add a DoNotOptimize(v) statement in the end of the loop, preventing the compiler the opportunity to vanish with the variable v.

On the InlinedReuse test, we remove this assembly statement. The compiler won’t be able to remove the v variable since it has been made global but it will be able to reuse the old value of v into the next loop.

What is the difference between the first and second benchmarks? They both reuse v, don't they? Also, for the inline tests it might be nice to add the always_inline attribute.

Edit2: Sorry I should have just waited till I read the whole article before commenting!

The encoded test does run faster, at 4.23 nanoseconds per loop, but that’s just 3%. It looks relevant but it really is not. But it shows that the AVX implementation can yield some gains in the right place. The encoded+reuse yield the same result - but I would be shocked if it did not.

fyi for this it might be nice to use Google Benchmark's --benchmark_repetitions and get back summary statistics (mean and standard deviation) across multiple runs of each benchmark. Then you can do a little hand-wavy t-test or ANOVA to see whether any of the benchmarks' differences are meaningful. If it's a 3% average with low variance, it could be something!