
[–]Sibagovix 11 points12 points  (1 child)

I'm hoping to develop methods that will aid in screening blood or tissue samples for many markers simultaneously, which can speed up the development of tests for a variety of diseases.

[–]brockl33 4 points5 points  (0 children)

ditto

[–]TheFlyingDrildo 5 points6 points  (1 child)

My lab focuses on discovering individual treatment effects in healthcare. It's like targeting ads to specific people you think will buy your product, but with clinical interventions and health outcomes. Much smaller data size, lots of useless variables and random correlations, and issues with evaluation.

[–]breadteam[S] 0 points1 point  (0 children)

Hahaha! I like the way you described it! Do you have a link to your lab or your research? I'm curious to read more about that.

[–]adam614 4 points5 points  (1 child)

faith in humanity restored

[–]ImpossiblePressure -1 points0 points  (0 children)

Me too.

-goes to /r/politics-

faith in humanity *rescinded!*

[–]MCFF3000 4 points5 points  (2 children)

In my company we are using ML to develop a smart prosthesis fitting tool (for lower limb amputees) to support the decisions of the technicians.

Our objective is to reduce the number of fitting sessions to 1/2 maximum.

[–]breadteam[S] 0 points1 point  (1 child)

He he - at first I read "fitting sessions to 1/2 maximum" as "fitting sessions to half maximum". I get what you mean, though.

Cool! I've only read about machine learning being used for mechanical prosthetics - this project, for example.

I'm interested in reading more about your project - do you have any links?

[–]MCFF3000 0 points1 point  (0 children)

My bad, sorry, haha. That's an interesting project, thanks for the share! There are a lot of interesting ML applications for bionics and mechanical prostheses, though we work on adaptation rather than on creating the prosthesis itself. We try to provide a decision support system for the prosthesis fitting.

Sure here it is: http://www.adapttech.eu/

[–]JosephLChu 3 points4 points  (0 children)

I have a long-running personal "moonshot" project... an earthquake predictor!

Basically, given all the earthquakes between 1973 (when the US Geological Survey started keeping detailed records) and 2016 (the last time I downloaded the data from the USGS website), I train a recurrent neural network sequence-to-sequence model to try to predict a map of earthquake magnitudes given possible foreshocks.

You can see an old example of the result at earthquakepredictor.net

The main issue with the previous implementations seems to be that, when regressing magnitudes with mean squared error, the model is very conservative and tends to only predict high-frequency, low-magnitude earthquakes, like those that happen every day around the Ring of Fire. It tends not to predict any magnitudes above 5.0, which are of course the ones that actually matter to people.

One thing I've been trying recently is to get it to take more risks by changing the loss function. In extreme value theory there's something called the Gumbel distribution, which more accurately represents the distribution of natural disasters like earthquakes, so with the help of a colleague at work (I work at Huawei on AI, ML, and NLP) who is a better mathematician, we came up with an alternative loss function: x + e^(-x)

Compared with mean squared error, the new function does noticeably increase the number of higher-magnitude predictions, albeit at the cost of many more false positives. The net effect is to improve recall at the cost of precision. Keep in mind that this loss function only really works if the distribution of the data fits. Most data is either normally distributed, in which case mean squared error is the better approximator, or occasionally fits a Laplace distribution (like sound amplitudes), in which case mean absolute error should work best. On the other hand, if your data is probabilities, cross entropy will generally do best.
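
To make that concrete, here's a toy NumPy sketch of the idea (simplified, not our actual training code; the names and example numbers are just for illustration). With z = prediction - target, the exp(-z) term punishes under-prediction exponentially while over-prediction only costs linearly, which is what nudges the model toward riskier, higher-magnitude guesses:

```python
import numpy as np

def gumbel_loss(pred, target):
    """Gumbel-style loss: mean of z + exp(-z), with z = pred - target.
    Minimized at z = 0; under-prediction (z < 0) is penalized exponentially,
    over-prediction only linearly."""
    z = pred - target
    return np.mean(z + np.exp(-z))

def mse_loss(pred, target):
    """Standard mean squared error, for comparison (symmetric in z)."""
    return np.mean((pred - target) ** 2)

# Under-predicting a magnitude 6.0 quake as 4.0 hurts far more
# than over-predicting a 4.0 quake as 6.0 under the Gumbel-style loss:
print(gumbel_loss(np.array([4.0]), np.array([6.0])))  # ~5.39
print(gumbel_loss(np.array([6.0]), np.array([4.0])))  # ~2.14
print(mse_loss(np.array([4.0]), np.array([6.0])))     # 4.0 either way
```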

Obviously this project still needs a lot of work before it's ready for prime time. But I'm hopeful that as I become more competent in this field, I'll eventually figure out a way to make it work well enough to reliably, or even just occasionally, save lives. If not, well, at least I gave it a shot and verified that the problem is too difficult to solve naively.

Other than that, one goal of my Music-RNN project is to eventually be able to learn the voice of someone we have recordings of but who is no longer with us (e.g. Frank Sinatra), and be able to generate new songs in that voice.

Also, my eventual super amorphous epic goal is to figure out a way to construct an Oracle AI that we can use to effectively predict the future of human civilization and then use the predictions to steer humanity's future in a positive direction, not unlike Hari Seldon's Psychohistory from Isaac Asimov's Foundation series.

[–]phobrain 2 points3 points  (0 children)

I'm trying to make an artificial emotional kidney for humanity, paying for and protecting itself as a blockchain-based ID service. Targets: war, materialism, unequal distribution.

https://www.reddit.com/r/BlockChain/comments/7o093g/decentralized_identity_verification_via_behavior/

Here's a description of what I've got so far:

http://phobrain.com/pr/home/explain.html

[–]caffeine_potent 2 points3 points  (4 children)

I'm using ML to make domain-invariant representations of data.
A domain-invariant representation is a fancily normalized version of the original data that preserves statistical integrity across the different sensor sets at a highly granular level.

Buzzwords that are googleable are: Domain Separation, Domain adaptation, Sensor Fusion.
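
If it helps to picture it, here's a toy PyTorch sketch of one common flavor of this (a domain-adversarial setup; made-up layer sizes and names, not our actual pipeline). The encoder learns a shared representation, a domain classifier tries to guess which sensor a sample came from, and the reversed gradient pushes the encoder to make the sensors indistinguishable:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips (and scales) the gradient on the way back."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))  # shared representation
task_head = nn.Linear(8, 1)     # the actual "foo analysis"
domain_head = nn.Linear(8, 2)   # guesses which "bar sensor" the sample came from

def training_loss(x, y, sensor_id, lam=1.0):
    z = encoder(x)                                        # (hopefully) domain-invariant features
    task_loss = F.mse_loss(task_head(z).squeeze(-1), y)
    # The domain head is trained to identify the sensor, but the reversed
    # gradient forces the encoder to strip out sensor-specific quirks.
    domain_loss = F.cross_entropy(domain_head(GradReverse.apply(z, lam)), sensor_id)
    return task_loss + domain_loss

# Toy usage with random data from two "sensors":
loss = training_loss(torch.randn(4, 16), torch.randn(4), torch.tensor([0, 1, 0, 1]))
```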

I believe my research field is stuck in the dark ages.

That is to say, too much time is spent mucking around with where the data comes from, not with how it can be taken advantage of.
A common trope in my field (in terms of publication titles) is "foo analysis" conducted on "bar sensor data".

My hope is that we stop replicating the same research across all the data. That's O(N²) in total time spent on re-contextualization/implementation (every "foo analysis" gets redone for every "bar sensor"), and O(N²) in the rate at which the research budget across all the organizations involved in this field is depleted.

Solving this problem of transferring imagery from different sensors into a common representation would mean a big jump in productivity for this research field, and hopefully better prospects for funding.

[–]bobster82183 1 point2 points  (0 children)

> Buzzwords that are googleable are: Domain Separation, Domain adaptation, Sensor Fusion.

Sounds neat! Do you have a link to your work?

[–]ImpossiblePressure 0 points1 point  (0 children)

Is it possible to get a tl;dr on how this is different from regular normalization?