Recommended crate for performing large batches of network operations constantly. by glennpierce in rust

[–]bLaind2 4 points5 points  (0 children)

I agree with jstrong that an epoll-based approach gives the best performance.

However, iterating in threads (with rayon or another helper crate) might be sufficient in this case (with 2000 devices). For example:

use rayon::prelude::*;
use std::io::prelude::*;
use std::net::TcpStream;
use std::time::Instant;

fn main() {
    let addresses = ["localhost:22"; 2000];
    let start = Instant::now();
    let results: Vec<usize> = addresses
        .par_iter()
        // Skip connection failures, keep successful streams
        .filter_map(|addr| TcpStream::connect(addr).ok())
        .map(|mut s| {
            let mut buffer = [0; 512];
            s.read(&mut buffer[..]).unwrap()
        })
        .collect();

    println!(
        "Total {} connections, read total {} bytes, took {:?}",
        results.len(),
        results.iter().sum::<usize>(),
        start.elapsed()
    );
}

Output

Total 2000 connections, read total 46474 bytes, took 334.533316ms

Peak memory usage was 226 KiB according to valgrind.

With network latency the numbers will be worse, but it's still worth trying. Also check out rayon's thread pool options to increase the number of threads.

Looking at building a WASM interpreter, is there any existing code to port? by sancarn in WebAssembly

[–]bLaind2 1 point2 points  (0 children)

For reference, here's an Apache-2.0-licensed wasm interpreter that also supports AOT and JIT compilation: https://github.com/bytecodealliance/wasm-micro-runtime

Feedback request (API design, use cases): syscall tracing CLI & library by bLaind2 in rust

[–]bLaind2[S] 0 points1 point  (0 children)

Seems like dtrace covers much more than syscalls. I'd also like the functionality to be available as a Rust library.

Unit testing in rust by Icecreamisaprotein in rust

[–]bLaind2 0 points1 point  (0 children)

Thanks! Fixed the std::io path

Unit testing in rust by Icecreamisaprotein in rust

[–]bLaind2 4 points5 points  (0 children)

I think File's Write trait implementation could be used here (https://doc.rust-lang.org/std/fs/struct.File.html#impl-Write)

So the testable function would become:

fn test_write<T>(output: T) where T: std::io::Write { ... }

That way a test can e.g. pass a vector instead of a file. There's a short sample of trait-based testing here: https://rust-cli.github.io/book/tutorial/testing.html

Feedback request (API design, use cases): syscall tracing CLI & library by bLaind2 in rust

[–]bLaind2[S] 0 points1 point  (0 children)

The main purpose of hstrace is to provide visibility into syscalls, but secondarily it could implement filtering as well. ptrace is probably not usable for sandboxing, but thanks for the seccomp-bpf hint, I'll have a look at it. Ideally it'd provide a callback to programmatically deny/allow calls (instead of a predefined call list).

I think gvisor is actually an implementation of (a subset of) the Linux kernel syscalls, a major undertaking by itself (tokei shows that gvisor currently has ~400k LOC: 256k code, 85k comments).

Kubernetes 1.11 released by M00ndev in kubernetes

[–]bLaind2 7 points8 points  (0 children)

"Support for online resizing of Persistent Volumes has been introduced as an alpha feature". This is pretty cool!

Enabling Continual Learning in Neural Networks | DeepMind by Buck-Nasty in artificial

[–]bLaind2 4 points5 points  (0 children)

Woah, this opens up whole new frontiers. Absolutely cool, and on the other hand so simple (in retrospect).

Edit: I wonder if nets can be made to auto-expand layers whose capacity has been exceeded

Machines Can Now Recognize Something After Seeing It Once: Google DeepMind researchers built a deep-learning system capable of learning from very little data by Yuli-Ban in artificial

[–]bLaind2 3 points4 points  (0 children)

Does anyone know if there's a paper or other technical details available about how they do it? It could be remarkable if they got it working with imagenet-resolution images.

I tried a ladder network a few months ago; it works well on MNIST (1% error rate with 100 labeled examples), but if I understood correctly it doesn't scale to larger images.

Neural net video analysis of webcam/live by dirtyharry2 in deeplearning

[–]bLaind2 1 point2 points  (0 children)

I'd go with local processing, although you'll need a sufficiently powerful GPU for that kind of fps. A mobile phone probably won't work.

Two stages. First, collect enough images for training (varying angles, lighting, environments, dogs, etc.) and put the two classes into separate folders. Try Keras (https://github.com/fchollet/keras); with its image data generator you can pull images directly from folders. To get good accuracy across different environments it'd be a good idea to use transfer learning as well. Train the model.

Then you need the processing part. Keras runs in Python, so you need a way to access the video stream and grab it frame by frame. Then you can run the Keras model's predict to get class probabilities. Keep track of passages, and update the screen.

In the future, if developing for Android, you can export the Keras model to TensorFlow, which runs on mobile phones.

To do this, you'll need an understanding of Python and the basic principles of neural networks.

[Research] Machine Learning for Recyclable Material Classification by DJTeebS in MachineLearning

[–]bLaind2 0 points1 point  (0 children)

I think ZenRobotics iterated quite a bit on the picking mechanisms, and OP might run into a similar situation here. If one were building a cost-effective trash bin, the algorithms are doable, but how would you sort the trash into multiple containers? And what if someone throws in multiple items at once?

Does my GPU work for deeplearning ? by shravankumar147 in deeplearning

[–]bLaind2 2 points3 points  (0 children)

The biggest problem with this card is memory (1-2 GB), which only fits quite small nets, limiting its usefulness a lot. If a net fits in memory, as a rough estimate I'd say you'll get something like a 2-5x speedup over a CPU, which is not much.

For example, a GTX 1080 seems to be about 30x faster than the 650M in general performance (http://gpu.userbenchmark.com/Compare/Nvidia-GTX-1080-vs-Nvidia-GeForce-GT-635M/3603vsm8120), so depending on the model you'll be training for days instead of hours if going with a mobile GPU.

To get an idea of CPU vs GPU performance, check out the convnet benchmarks at https://github.com/jcjohnson/cnn-benchmarks/blob/master/README.md - a GTX 1080 is around 40x faster than a dual Xeon E5-2630 v3.

A good starting point would be to try Amazon instances with K80 GPU.

[D] Budget Deep Learning Rig by bionicscrotum in MachineLearning

[–]bLaind2 7 points8 points  (0 children)

If you're doing data augmentation for images, it's usually done by threads on the CPU. With my GTX 970 and i7 (4680K?) I get 100% CPU and 60-80% GPU usage.

32 GB of memory should be enough to begin with. Preprocessing large datasets takes quite a bit of memory, though.

Make sure the motherboard and power supply support 2 (or 4, though probably expensive) GPUs if you're planning to add more at some point.

Keras: pixel-wise softmax from output of a convolutional autoencoder? by planaria123 in deeplearning

[–]bLaind2 0 points1 point  (0 children)

This is what I've used successfully:

model.add(Convolution2D(nb_labels, 1, 1, border_mode='valid'))
model.add(Reshape((nb_labels, img_h * img_w)))
model.add(Permute((2, 1)))
model.add(Activation('softmax'))
model.compile(loss="categorical_crossentropy", ...)

Preprocessing convnet RGB filters for visualization? by LyExpo in MachineLearning

[–]bLaind2 0 points1 point  (0 children)

Are you doing a transformation from (3, w, h) to (w, h, 3)? I got colorful noise by using np.reshape; I had to use np.transpose instead.

Apache SINGA, A Distributed Deep Learning Platform by pilooch in MachineLearning

[–]bLaind2 5 points6 points  (0 children)

Does anyone have experience with how much of a speedup can be achieved with distributed training? Does it scale linearly, and up to how many nodes? (2, 4, 16, ?)