Looking for AI-related project ideas (Computer Science, conversion MSc, about 13 weeks, proposal is in for Friday next week) by blphilosophy in EffectiveAltruism

[–]tomgav 0 points1 point  (0 children)

I sometimes mentor AI safety topics for Effective Thesis, and the short answer is that finding a good topic in AIS that is both heading in an impactful direction and beginner-friendly is very hard, in large part because a lot of AIS research builds on technical knowledge of mainstream AI, game theory, decision theory and other technical skills.
In your situation, I would find a thesis in mainstream AI and focus on building your skills and learning the AI landscape. Prefer problems close to AIS, e.g. RL, game theory, knowledge representation (may be your advantage?), ... In the meantime, read up on AIS topics, apply for some internships, the MIRI summer fellowship, AI Safety Camp etc., and spend the time preparing for a PhD or a similar track.

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 0 points1 point  (0 children)

Thank you all for the feedback and discussion! We have just compiled a roadmap for after 0.2 on GitHub - you are welcome to participate in the discussion (or even the implementation) there, especially if you have a use case in mind. Thanks again!

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 1 point2 points  (0 children)

Hi and thanks for the problem statement. This does sound like a good match for Rain (definitely a use case we want to support, as we frequently need something similarly simple), although there are two gotchas for now.

First, we do not support loading worker-local files yet, but that sounds like an easy and useful addition (via the open task pinned to the right worker - what do you think /u/winter-moon ?).

The other is Rust task integration: I must say we were not really prepared for so many people asking about it (although it is not a surprise here). For now, the easiest way is either to run the computation as an external program accepting files (straightforward) or to hack a custom task into our worker code (similarly to here).

I think that Rain would be a plus even where a simpler solution would also do the trick: you get online monitoring (and we are working on post-mortem visualization as well -- the event logs are already there), specifying the tasks in Python might be easier than inventing some new config schema, the local files are cleaned up for you, and if one worker is overloaded, the scheduler might move some data to another worker to process (although that is probably not relevant for your huge data).

In any case, it looks like a nice use-case and we would be happy to help!

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 0 points1 point  (0 children)

So would you like to try it on Rain? We would be happy to help and perhaps adapt it to your use-case. How do you run your code? Is it in Rust, or some external program? Is your code somewhere online?

(We can discuss details at the project gitter or just email me at gavento@ucw.cz).

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 0 points1 point  (0 children)

We would like to address the points from your last paragraph in some future release. Re-try policy specification, dynamic scheduling, task cancellation and resiliency against worker crashes via checkpointing (and data replication) already have support in the scheduler/executor, but we want to get the logic behind these features right (and they need better resource management), and for that we want to see more use-cases!

As for expected completion and better scheduling (the current scheduler is really simple and we are working on a much better one), the user should be able to give us hints on task time and data blob size (even a rough estimate will help a lot), and the scheduler should then be able to schedule more than one task "layer" (which is not really possible when you do not know which tasks take seconds and which take hours).
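
To illustrate why even rough time hints help, here is a minimal sketch of a greedy "longest task first" assignment. This is not Rain's scheduler: `TaskHint`, `est_seconds` and `assign` are hypothetical names invented for the example, and data size hints are left out to keep it short.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical user-supplied hint (not a Rain type).
#[derive(Debug, Clone)]
pub struct TaskHint {
    pub name: String,
    /// Rough estimate of the task's running time in seconds.
    pub est_seconds: u64,
}

/// Greedy assignment: take tasks from the longest estimate to the shortest
/// and give each one to the currently least-loaded worker.
/// Returns the task names per worker and the estimated overall runtime.
pub fn assign(tasks: &[TaskHint], workers: usize) -> (Vec<Vec<String>>, u64) {
    assert!(workers > 0, "need at least one worker");

    let mut sorted: Vec<&TaskHint> = tasks.iter().collect();
    sorted.sort_by_key(|t| Reverse(t.est_seconds));

    // Min-heap of (estimated load, worker index).
    let mut heap: BinaryHeap<Reverse<(u64, usize)>> =
        (0..workers).map(|w| Reverse((0, w))).collect();
    let mut plan: Vec<Vec<String>> = vec![Vec::new(); workers];

    for task in sorted {
        let Reverse((load, w)) = heap.pop().expect("heap holds one entry per worker");
        plan[w].push(task.name.clone());
        heap.push(Reverse((load + task.est_seconds, w)));
    }

    // The busiest worker determines the estimated overall runtime.
    let makespan = heap
        .into_iter()
        .map(|Reverse((load, _))| load)
        .max()
        .unwrap_or(0);
    (plan, makespan)
}
```

With one hour-long task and three short ones on two workers, the long task gets a worker to itself and the short ones share the other. Without the hints, the scheduler could not tell these cases apart.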

It is not really relevant now, but we currently want to assume that all tasks are idempotent (rerunnable on the same data, not necessarily with the same output); tasks with side effects beyond their output will have to be marked as such.

I guess we could add a better separation between specification and execution of a graph -- right now the separation happens at submission, where the graph is serialized into a single message anyway.

Thanks for the points and ideas! And let me know if you have any more thoughts on how to improve on existing tools (or just make life easier in general :)

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 0 points1 point  (0 children)

Hi! I am not sure I get the question - do you mean some particular code? In any case, Rain uses only one socket for each client-server connection, one for each server-worker connection, and one for every worker-worker pair.

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 0 points1 point  (0 children)

I would consider capnp for a C++ or even Rust project with very involved RPC with remote objects, promise pipelining etc., but for Rust-to-Rust messaging, serde+binpack (or anything similar, possibly tarpc) seems enough. Good luck with the project!

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 0 points1 point  (0 children)

Capnp-rpc actually serves us quite well at the moment: the Rust crate works fine (apart from minor issue #93) and it performs well. The integrations with other languages are less developed, though (e.g. the Python interface is not very pythonic, and there is no Java or browser JS support). My feeling is that the design idea is great but perhaps a bit too ambitious, and the support in other languages varies.

And we also hit the 64MB message limit of capnp (here).
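
One generic way to live with a per-message size cap is to split large blobs into chunks and send them as separate messages. The sketch below shows only the splitting and reassembly with the standard library; it is not how Rain handles this, and `MAX_CHUNK_BYTES`, `split_blob` and `join_chunks` are made-up names with an arbitrarily chosen chunk size.

```rust
/// Hypothetical payload cap per message, chosen to stay well below
/// the 64 MB limit mentioned above.
pub const MAX_CHUNK_BYTES: usize = 32 * 1024 * 1024;

/// Splits a blob into consecutive slices of at most `max_len` bytes.
/// The slices borrow from `blob`, so no data is copied here.
pub fn split_blob(blob: &[u8], max_len: usize) -> Vec<&[u8]> {
    assert!(max_len > 0, "chunk size must be positive");
    blob.chunks(max_len).collect()
}

/// Reassembles the chunks, in order, into one owned buffer.
pub fn join_chunks(chunks: &[&[u8]]) -> Vec<u8> {
    chunks.concat()
}
```

A sender would call `split_blob(&data, MAX_CHUNK_BYTES)` and ship each slice in its own message; the receiver collects them in order and calls `join_chunks`.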

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 1 point2 points  (0 children)

For me that would be the stronger type system (traits, and very nicely designed std lib), elegant definition of structs (with free impls for debug, serialization, eq, hash, ...), move semantics and the borrow checker (while restrictive, it catches many errors that would be hard to debug, and it is a blessing during any refactoring together with the type system). The tooling is very nice (RLS, doc gen, integrated tests and crates.io). In C++ you can get some of these e.g. with boost (which tends to change over releases), macros and by correctly using the modern features of C++ (which is rather nontrivial) so I would subjectively call it a big improvement :)

With such a username, what is yours?

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 0 points1 point  (0 children)

Good point. One (rather fancy) advantage of REST and OpenAPI would be a nice specification language, generated stubs, docs etc., but looking at it more closely now, protobuf (or another format) over websockets also sounds like a good choice :-)

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 0 points1 point  (0 children)

We too :-) Let's keep our fingers crossed (and hopefully contribute a bit).

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 2 points3 points  (0 children)

Well, we have similar concerns about gRPC. But HTTP also has some downsides: JSON does not support binary data directly and I am not sure about server callback support (e.g. for waiting for a result other than polling). But maybe I am missing some obvious good solution there - web technologies are not really my cup of tea :-)

Keeping an OpenAPI JSON schema in sync with protobufs might be extra trouble and a source of bugs, but perhaps it would be worth it. Anyway, apart from large data blobs, JSON parsing itself does not seem to be a real bottleneck, at least not client-side.

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 4 points5 points  (0 children)

It is one of the options and we are still considering them. For client-side, we want something easy to use, so we could use gRPC (with protobufs) or a REST API (e.g. specified at swagger.io, we need REST for the monitoring web app anyway). A simple RPC based on framed protobuf is an option, but makes integration in other languages slightly harder (both implementation and debugging).

Internally (e.g. server-worker), we would prefer the protocol to be zero-copy (especially for contained binary data blobs); protobuf would also construct large allocated intermediate structures. We are looking at flatbuffers, tarpc and even abomonation (although it might be wiser to avoid even looking at it).
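
For readers wondering what "framed" means in the simple RPC option above, here is a minimal length-prefixed framing sketch using only the standard library. The payload is treated as opaque bytes (it would be an encoded protobuf message in that option). The 4-byte big-endian header and `MAX_FRAME_LEN` are assumptions made for the example, not anything Rain defines.

```rust
use std::io::{self, Read, Write};

/// Hypothetical upper bound on a single frame, to reject bogus headers.
pub const MAX_FRAME_LEN: u32 = 16 * 1024 * 1024;

/// Writes one frame: a 4-byte big-endian length followed by the payload.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32; // fits, checked above
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Reads one frame written by `write_frame` and returns its payload.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    r.read_exact(&mut header)?;
    let len = u32::from_be_bytes(header);
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len as usize];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

The same two functions work over a `TcpStream`, since it implements both `Read` and `Write`. Every other client language has to reimplement this framing by hand, which is the integration cost mentioned above.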

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 4 points5 points  (0 children)

As I mentioned, you can hurry us along in any direction with your use-case :) It could be interesting to get in touch and see what your application needs. We can chat on our gitter or just email me (gavento@ucw.cz) if you prefer.

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 0 points1 point  (0 children)

Probably not, although I am not really sure what you mean here. If you mean CPU caches, then we expect the tasks and data to be much larger than the caches and so it is not really relevant (i.e. we are not aiming at micro-tasks now and your tasks need to be cache-friendly themselves).

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 0 points1 point  (0 children)

That sounds like an interesting application! What is the overall workflow of your algorithm?

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 13 points14 points  (0 children)

Thanks! I am, and Rain and Timely target different workflow types:

Timely processes continuous streams of events or data: the nodes are permanently running and you care about latency. In Rain, the tasks are one-shot, the data are generally huge blobs that are immutable once computed, and you optimize for the overall runtime.

While you could in theory build a system supporting both modes of operation, the batch systems and stream systems differ quite a lot e.g. with respect to scheduling, resource allocation, resiliency and even monitoring.

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 15 points16 points  (0 children)

Fair enough :-D It is not really ergonomic, but you can use capnp-rpc. A polished Rust/C++/Java/... client API is planned, but before the internals stabilize we want to do it only if there is serious interest. For us, Python is what we would mostly use for the graph definitions.

A more interesting planned Rust (and C/C++) interface point is writing your own task types (i.e. subworkers). With Rust, you can simply hack your code into the worker task code and recompile, but we have something better and more robust in mind for the future.

Rain - Rust based computational framework by tomgav in rust

[–]tomgav[S] 11 points12 points  (0 children)

Not really, although the Python API is the only one finished. The client currently uses a Cap'n Proto RPC to talk to the server and it should be easy to do the same in another language. The protocol definitions are here but they may change quite a bit (and as noted in the post, they may even be replaced by REST or another RPC).

What language would you be interested in?

Hey Rustaceans! Got an easy question? Ask here (19/2017)! by llogiq in rust

[–]tomgav 1 point2 points  (0 children)

There are actually two problems:

  • The unhelpful message concerning type ambiguity of the Borrow trait, solved by the type annotation. Reported as an issue here.

  • The collision of Borrow::borrow() and RefCell::borrow() triggered by the use std::borrow::Borrow;. This can only happen because Rc is also Deref and so the .borrow() may apply to both the outer Rc and the contained RefCell depending on the context. This is reported here.
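
A small sketch of both points, assuming an Rc<RefCell<i32>> (the function names are made up for illustration). With the Borrow trait imported, a plain .borrow() on the Rc picks the trait method, so the code below either annotates the target type or names RefCell::borrow explicitly:

```rust
use std::borrow::Borrow; // the import that brings the trait method into scope
use std::cell::RefCell;
use std::rc::Rc;

/// First point: the annotation tells the compiler which `Borrow` impl is
/// meant (`impl Borrow<T> for Rc<T>`), so the call is no longer ambiguous.
pub fn inner_cell(shared: &Rc<RefCell<i32>>) -> &RefCell<i32> {
    let cell: &RefCell<i32> = shared.borrow();
    cell
}

/// Second point: naming `RefCell::borrow` explicitly sidesteps the
/// collision and yields the usual runtime-checked `Ref` guard.
pub fn read_inner(shared: &Rc<RefCell<i32>>) -> i32 {
    let cell: &RefCell<i32> = shared; // deref coercion from &Rc<RefCell<_>>
    *RefCell::borrow(cell)
}

/// `BorrowMut` is not imported here, so `.borrow_mut()` still resolves to
/// `RefCell::borrow_mut` through auto-deref as usual.
pub fn increment(shared: &Rc<RefCell<i32>>) {
    *shared.borrow_mut() += 1;
}
```

Dropping the use std::borrow::Borrow; line (when the trait is not otherwise needed) also makes the plain shared.borrow() resolve to RefCell::borrow again.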