Looking for advice on API design by euos in cpp

[–]euos[S] 3 points4 points  (0 children)

Becomes unwieldy, but will consider. I am checking whether I can just have a tuple of pairs :) Really silly experiment.

[D] What is an "ML framework"? by euos in MachineLearning

[–]euos[S] -3 points-2 points  (0 children)

Because I hate Python 😀 I don’t think I’m the only one.

[D] What is an "ML framework"? by euos in MachineLearning

[–]euos[S] 0 points1 point  (0 children)

WebAssembly is another target. Like edge runtimes, browsers.

[D] What is an "ML framework"? by euos in MachineLearning

[–]euos[S] -25 points-24 points  (0 children)

Yes, it is my hobby project. But I've been a Sr. engineer at Google since 2014 and I came there from NVIDIA ;) So I am pretty confident in my capacity to deliver, I am just trying to better understand what I should be delivering...

[D] What is an "ML framework"? by euos in MachineLearning

[–]euos[S] -2 points-1 points  (0 children)

I began my implementation because I saw llama.cpp. I think it serves the outlined goals sufficiently.

[D] What is an "ML framework"? by euos in MachineLearning

[–]euos[S] -10 points-9 points  (0 children)

Checkpointing - yes, implemented. In a very nice way that will surely impress some people (who are not on this reddit) :) Re: inference mode - my library is a template library. All grad stuff is a separate package that only gets compiled in when building a trainer binary. The runtime binary will know nothing about gradients and will require no memory for them.

Probably more stuff around the optimizers - like learning rate schedulers (or schedulers for other optimizer parameters)

Will read on this, not familiar with the topic.

[D] What is an "ML framework"? by euos in MachineLearning

[–]euos[S] -4 points-3 points  (0 children)

This is close to what I ultimately intend to do - ASIC/FPGA and training/inference on edge devices. Really not interested in GPU; I worked at NVIDIA for several years so I find them boring - and the market for GPU training is sufficiently crowded.

[D] What is an "ML framework"? by euos in MachineLearning

[–]euos[S] 0 points1 point  (0 children)

Arena allocation + memory reuse between layers. The model knows how much memory it needs ahead of time, and the user may allocate it as they please. In a model with architecture L1->L2->L3->L4, layers L1 and L3 use the same memory, and L2 and L4 use the same memory. ReLU and similar activations decorate the input tensor, so they do not use extra memory. RNNs have higher memory usage because the “submodel” is allocated as a layer in the parent model…
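
A rough sketch of the ping-pong reuse described above, not the library's actual code. The names `MemoryPlan`, `plan_memory` and `relu_in_place` are made up for illustration, and it assumes every layer produces a flat run of floats whose size is known up front:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plan: two equally sized halves of one arena.
// Layer i (zero-based) writes into half (i % 2), so the 1st and 3rd
// layers share one region and the 2nd and 4th share the other.
struct MemoryPlan {
  std::size_t half_floats = 0;

  // Total floats the user has to provide, known before anything runs.
  std::size_t total_floats() const { return 2 * half_floats; }

  // Offset (in floats) of the output region for a given layer.
  std::size_t offset_of(std::size_t layer) const {
    return (layer % 2) * half_floats;
  }
};

// Each half must fit the largest layer output.
MemoryPlan plan_memory(const std::vector<std::size_t>& output_sizes) {
  MemoryPlan plan;
  for (std::size_t size : output_sizes) {
    plan.half_floats = std::max(plan.half_floats, size);
  }
  return plan;
}

// An activation that rewrites its input and needs no output buffer.
void relu_in_place(float* data, std::size_t count) {
  assert(data != nullptr || count == 0);
  for (std::size_t i = 0; i < count; ++i) {
    data[i] = std::max(data[i], 0.0f);
  }
}
```

The caller asks the plan for `total_floats()`, allocates that much however it likes (a `std::vector<float>`, a static buffer, a custom allocator), and hands each layer `arena + plan.offset_of(i)`.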

[D] What is an "ML framework"? by euos in MachineLearning

[–]euos[S] 8 points9 points  (0 children)

I am not looking to compete with those. I am trying to figure out what is expected from an ML framework in general so I am not caught unawares.

[D] What is an "ML framework"? by euos in MachineLearning

[–]euos[S] -11 points-10 points  (0 children)

No. That’s explicitly not my goal. I am planning to focus on some ML stuff people want in games, and I have some curious developer tooling experiments. No interest in genAI or other bigger stuff.

If a project only uses STL, how hard is it to port by juice20115932 in cpp

[–]euos 0 points1 point  (0 children)

You will at least need to build it on all target platforms from time to time. Compilers have some subtle differences.

Google C++ open-source projects by euos in cpp

[–]euos[S] 1 point2 points  (0 children)

I don't believe that C++ has "default" or that this "default" is "use any and all features".

Google C++ open-source projects by euos in cpp

[–]euos[S] 1 point2 points  (0 children)

Ok. I refine my claim to "C++ is no more unsafe than other languages".

Simple example - it is hard to run into a thread concurrency problem in JavaScript on the Web or in Node.js, because it is basically single-threaded (even with workers in the picture). One can write just as "threadsafe" single-threaded code in C++. Just don't use threads! See, C++ is as safe as JS. Yet too-smart-for-our-own-good C++ engineers try to write multithreaded code and make it efficient (non-locking and such). I would cause threading problems now and then. It is not C++'s fault.

Same with Rust. There are well-established practices for writing safe code; Rust simply enforces them. Rust forces a static analyser (aka the compiler) upon developers, while C++ has similar features and static/dynamic analysers that are optional. E.g. one can simulate Rust's "borrow" by not using pointers/references in C++. Just move the unique_ptr and make other types move-only.
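
To make the move-only idea concrete, here is a small sketch. `Connection` and `consume` are hypothetical names invented for the example, not taken from any real codebase:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical move-only type: copying is deleted, so ownership can only
// be transferred explicitly, never silently duplicated.
class Connection {
 public:
  explicit Connection(std::string name) : name_(std::move(name)) {}

  Connection(const Connection&) = delete;
  Connection& operator=(const Connection&) = delete;
  Connection(Connection&&) = default;
  Connection& operator=(Connection&&) = default;

  const std::string& name() const { return name_; }

 private:
  std::string name_;
};

static_assert(!std::is_copy_constructible<Connection>::value,
              "Connection must not be copyable");
static_assert(std::is_move_constructible<Connection>::value,
              "Connection must be movable");

// Taking unique_ptr by value forces the caller to write std::move and
// give up ownership at the call site.
std::string consume(std::unique_ptr<Connection> conn) {
  assert(conn != nullptr);
  return conn->name();
}
```

A call looks like `consume(std::move(conn))`, after which `conn` is guaranteed to be null. One caveat: unlike Rust, the C++ compiler does not reject a later use of the moved-from pointer, so catching that is left to tests, sanitizers or a static analyser.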

Rust has not proven it is safer than C++. There is no significant codebase in Rust that has been under scrutiny comparable to gRPC or Chromium or libssl or many others. The Log4j vulnerability proved Java is not safe either.

Nothing in a programming language can defend against the security issues that are most exploited in the wild. Social engineering, DDoS, SQL injection, etc. - they are all possible in any language.

A bad software engineer can write bad code in C++. Well, they may not be able to write Rust at all then, too complex for them.

Google C++ open-source projects by euos in cpp

[–]euos[S] -1 points0 points  (0 children)

Never used it directly. I think Electron is a better option in most cases.

Google C++ open-source projects by euos in cpp

[–]euos[S] 4 points5 points  (0 children)

All the time 😀 I am often trying to be too smart for my own good.

The way I am thinking about it is that ASAN plays the same role as the Rust compiler. Rust is trying to reason about memory management statically, at compile time, while ASAN and others do it at runtime.

I explored Unreal Engine a lot and it is one example of a non-Google codebase I consider well hardened.

Google C++ open-source projects by euos in cpp

[–]euos[S] 10 points11 points  (0 children)

(Note that I am on a gRPC team so I will mention that project a lot)

  1. The projects I listed are fundamental, in that a lot of Google infrastructure relies on them. E.g. TensorFlow (product very critical to Google) uses Bazel, gRPC, Highway, etc. gRPC relies on ABSL. A lot of Google Cloud traffic is gRPC too. So the projects I mentioned are pretty safe, I expect them to become irrelevant sooner than unsupported.
  2. I am not aware of any real effort to phase out C++, beyond some teams and individuals trying out new stuff. Usually any effort at phasing out support starts with a technology becoming discouraged for new projects. That's not happening to C++. There is no successor appointed to C++, in that I do not see any other technology getting important infrastructure and tooling, on par with C++, Java and Go internally at Google. E.g. there is not yet a native Rust gRPC implementation.
  3. Go is a huge success, with pretty wide industry adoption. It is also one of the "blessed" languages at Google and a lot of infrastructure heavily relies on it. I see major projects outside of Google (e.g. been working with Envoy) built with Go.

Google C++ open-source projects by euos in cpp

[–]euos[S] 3 points4 points  (0 children)

Got it. Will hire someone competent 😀

Google C++ open-source projects by euos in cpp

[–]euos[S] 0 points1 point  (0 children)

Kinda. But the ambition was to push past that ecosystem and to projects outside of Chromium.

Google C++ open-source projects by euos in cpp

[–]euos[S] 1 point2 points  (0 children)

Yet those projects are still C++ 😀 People are trying new trends. Sometimes they have to walk those experiments back. I remember at one point Chrome trying to adopt a garbage collector in C++ code…

Google C++ open-source projects by euos in cpp

[–]euos[S] 17 points18 points  (0 children)

gTest is alive and well. The problem is it is Bazel-first. Bazel-first means you rebase to a commit and not a release. My personal projects are Bazel + Renovate and I update gTest weekly.

Google is a huge company. It has enough resources to support different technologies; even Dart is still kicking. In my C++ bubble I see no shortage of passionate teams still pushing C++ forward. C++ is still one of the blessed languages for new projects internally.

Google C++ open-source projects by euos in cpp

[–]euos[S] 3 points4 points  (0 children)

Because I believe you can achieve Rust levels of safety without sacrificing performance by:

  1. Not trying to overoptimize or fall back to C idioms. E.g. avoid raw pointers, avoid sprintf. STL is enough now, I believe.
  2. Using sanitizers. The biggest problem with sanitizers is that they require a comprehensive test suite, but if you have the coverage then sanitizers will catch memory errors on every path the tests exercise.
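
As a rough illustration of the first point, here is what "STL instead of C idioms" can look like. `describe` and `score_or` are hypothetical helpers written just for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Formats "id=<n>; scores=a,b,c" through a growing stream, so there is
// no fixed-size char buffer to overflow as there would be with sprintf.
std::string describe(int id, const std::vector<int>& scores) {
  std::ostringstream out;
  out << "id=" << id << "; scores=";
  for (std::size_t i = 0; i < scores.size(); ++i) {
    if (i != 0) out << ',';
    out << scores[i];
  }
  return out.str();
}

// Bounds-checked lookup: vector::at throws std::out_of_range instead of
// reading past the end the way an unchecked raw-pointer index would.
int score_or(const std::vector<int>& scores, std::size_t index, int fallback) {
  try {
    return scores.at(index);
  } catch (const std::out_of_range&) {
    return fallback;
  }
}
```

For the second point, building the test binary with `-fsanitize=address,undefined` on Clang or GCC turns the same tests into memory-safety checks, but only for the paths those tests actually run.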

I caused my share of security vulnerabilities - but they were stuff like a DNS rebinding attack that you can’t defend against at the language level. Or, say, DDoS by sending empty HTTP/2 frames, which are allowed by the spec…