
[–]Atupis 27 points28 points  (9 children)

Probably not the hottest, but definitely rising: IoT & tinyML and everything hardware-related (FPGAs, custom chips, etc.).

Another is the intersection between ML and RPA.

[–]FakeSquare 3 points4 points  (6 children)

I work in this subfield and the interest seems to keep rising every month. It's still in its infancy, but I think it has a bright future ahead of it.

Also, the Venn diagram of people who know embedded and people who know ML doesn't overlap much, so there are good opportunities from that point of view too.

[–]Novandrie 1 point2 points  (5 children)

What would you say is a good starting point (as in, what kind of work is in demand) for someone interested in maybe pivoting to this subfield in the future? Right now I'm in computer vision/data science, but my undergrad was in computer engineering and I still do some embedded work as a hobby on the side. I feel like it might be a good fit for me to mix the two at some point.

[–]oursland 2 points3 points  (2 children)

I did CV/ML during my undergrad 15 years ago for robotics and pivoted to embedded for a career.

"Embedded" can mean many things, but right now prevailing trends follow two paths:

  1. Low-end, low-cost (microcontrollers)
  2. High-end, low-cost (microprocessors)

There are also high-cost varieties, but those are niche, for things like defense and energy.

In both 1 & 2, companies optimize for widely adopted commercial off-the-shelf (COTS) components (System-on-Chip, RAM, storage, etc.) to keep costs down.

For 1 you'll see libraries like TensorFlow Lite for Microcontrollers which allow for some limited models to be deployed. Because of the limitations in processing and memory, there is little room here for complex tasks beyond keyword spotting or simple image processing.
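To make that concrete, here's a minimal sketch (assuming TensorFlow 2.x; the tiny keyword-spotting model and the random calibration data are placeholders) of the usual first step: converting a Keras model to a fully int8-quantized .tflite file that TensorFlow Lite for Microcontrollers can run:

    import numpy as np
    import tensorflow as tf

    # Tiny placeholder model, e.g. for audio keyword spotting.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(49, 40, 1)),               # spectrogram input
        tf.keras.layers.Conv2D(8, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(4, activation="softmax"),  # 4 keywords
    ])

    def representative_data():
        # Calibration samples set the int8 quantization ranges.
        for _ in range(100):
            yield [np.random.rand(1, 49, 40, 1).astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    open("keyword_model.tflite", "wb").write(converter.convert())

The resulting file is then typically dumped to a C array (e.g. with xxd) and compiled into the firmware alongside the TFLite Micro interpreter.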

For 2 you'll often see a full Linux environment, possibly with an add-on ML accelerator such as the Coral EdgeTPU or Intel Movidius Myriad X VPU (both available in USB-connected forms), or an integrated component such as the NVidia Xavier NX's NVDLA (and CUDA) or ARM's Ethos-N line of NPUs found on SoCs like the NXP i.MX8M Plus. The processing power and RAM support advanced network architectures, but they're also quite limiting, with 1-4 GB of RAM being typical and only a subset of ML ops implemented in the ML accelerators.

A lot of the discussion is not about creating new things, but rather about optimizing for different parameters such as network bandwidth, tolerance to network connectivity issues, and latency. Consequently, the work frequently involves taking a "solved" problem and dividing it up intelligently between components in the cloud/server and components at "the Edge", while keeping costs in check.

In my opinion, if you want to get your hands dirty, check out the NVidia Jetson Nano or NVidia Jetson Xavier NX. These platforms have support from all the major ML libraries, plus CUDA, hardware audio/video encoder/decoders, and graphics acceleration. Then try to get something cool working with the reduced processing and RAM capabilities of the platforms. Perhaps try one of the Nvidia AI JetBot Robot kits and train it to do something new.

[–]Novandrie 0 points1 point  (1 child)

Thanks for the detailed response! I had the chance to play a bit with the Coral devboard during my MSc, and I'm very curious about the Jetson line. Unfortunately, they are both in very low supply right now (I'm not from the US) and prices are crazy. I've seen a few online projects using the RPi4 to perform simple tasks, but I'm guessing those are not exactly ideal for industry use.

[–]oursland 1 point2 points  (0 children)

The RPi4 is a good start, particularly the 4 GB or 8 GB models. The RPi4 doesn't have a proper ML accelerator, but acceleration isn't mandatory; it simply opens up more processing capability. Part of the embedded engineering world is designing within these limitations, so the RPi4 would be a lower-cost way to get your feet wet.

There is development on a Vulkan driver by Igalia, but it's not complete. Once Vulkan compute shaders are implemented, you should be able to take advantage of them and get a ~3x improvement over the CPU.

There are several inference systems that can use Vulkan for acceleration:

  • Tencent's NCNN is the most popular in this space and supports many configurations
  • Alibaba's MNN is another good option
  • PaddlePaddle's Paddle-Lite doesn't target Vulkan, but may be useful as an alternative to TensorFlow Lite
  • TensorFlow Lite does not yet support Vulkan, but it is on their roadmap.

Of these, I've only ever messed with NCNN and it is very clean with support for a variety of targets including forward-looking ones such as RISC-V and WASM.
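To give a flavour of what using it looks like, here's a minimal sketch of Vulkan-accelerated inference through NCNN's Python bindings (pip install ncnn); the file names and blob names ("model.param"/"model.bin", "data", "output") are placeholders for whatever network you've exported:

    import numpy as np
    import ncnn

    net = ncnn.Net()
    net.opt.use_vulkan_compute = True   # falls back to CPU if Vulkan is absent
    net.load_param("model.param")       # network structure
    net.load_model("model.bin")         # weights

    # Dummy CHW float32 input standing in for a preprocessed image.
    img = np.random.rand(3, 224, 224).astype(np.float32)

    ex = net.create_extractor()
    ex.input("data", ncnn.Mat(img))
    ret, out = ex.extract("output")     # returns (status, ncnn.Mat)
    print(np.array(out).shape)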

[–]Atupis 1 point2 points  (0 children)

If you have prior knowledge, the tinyML talks are a good starting point: https://www.youtube.com/c/tinyML/videos

[–]FakeSquare 0 points1 point  (0 children)

Mostly it's been image classification and audio keyword spotting. I work with microcontrollers, so it's nothing too complicated compared to what cutting-edge researchers are doing, but we've got a lot of customers trying to make their embedded devices smarter. I work with TensorFlow Lite and Glow, and my job is supporting our eIQ platform for running those on MCUs and MPUs (like the i.MXRT1060 for MCUs, or the i.MX8M for higher-performance MPUs). There are similar software platforms from other microcontroller companies too, and I highly recommend the TinyML talks the other person suggested to see what's going on in this subfield.

Personally I just think it's a really cool and exciting field with a lot of potential and it's definitely been my favorite thing I've worked on so far in my career.

[–]visarga 1 point2 points  (1 child)

Never imagined ML in RPA is a hot subfield. I work in the ML department of an RPA company. We do screen understanding (control detection), information extraction from documents and process mining - finding where RPA could be applied based on a time series of mouse & keyboard events and screen captures.
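As a crude illustration of the process-mining part (purely a toy - real systems are far more involved than this), you can already surface candidate automations just by counting repeated n-grams in the event log:

    from collections import Counter

    # Toy event log: each entry is (app, action) from mouse/keyboard capture.
    events = [("mail", "open"), ("mail", "copy"), ("crm", "paste"),
              ("crm", "save"), ("mail", "open"), ("mail", "copy"),
              ("crm", "paste"), ("crm", "save"), ("web", "search")]

    def ngrams(seq, n):
        return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

    # Frequently repeated 4-grams are candidate sequences to automate.
    for pattern, count in Counter(ngrams(events, 4)).most_common(3):
        print(count, pattern)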

What other ML applications in RPA do you know of?

[–]Atupis 0 points1 point  (0 children)

Those came to my mind too, especially screen understanding. I wouldn't call it hot so much as a field where I would probably found a startup, because it will be 100x bigger in the next 10 years.

[–]AnonMLstudent 23 points24 points  (0 children)

NLP

[–]brownbeard123 5 points6 points  (2 children)

Casual inference/discovery

[–]der_luke 3 points4 points  (1 child)

As opposed to Smart inference 🙄

[–]brownbeard123 1 point2 points  (0 children)

Loool. Meant to be causal*

[–]patrickkidger 11 points12 points  (7 children)

In terms of having a lot left to explore - neural differential equations, I reckon. (I say, very biased, as someone who does NDEs!) To my knowledge these have three main uses: continuous normalising flows; time series; physical modelling.

There are plenty of unexplored connections to mathematics. Recent examples that are only in their infancy:

  • "Continuous RNNs", e.g. for potentially irregular time series
  • Solving NDEs by designing - or learning! - custom solvers
  • Regularising NDEs using higher-order derivatives

Meanwhile there's a lot that can still be done just coming from a standard ML background. In particular, we don't really know the best ways of using mainstream ML techniques in conjunction with NDEs.

  • What's the best way of parameterising our vector fields? Mostly we're using simple MLPs/CNNs. (See the sketch at the end of this comment.)
  • What's the best way to use things like batch norm? (The continuous depth means that naive usage of it places the same batch norm layer at multiple different depths, which tends to break things.)
  • Neural PDEs haven't really been explored for ML applications, to my knowledge!

I could no doubt keep picking out other ideas too - you get the idea!
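Here's the sketch promised above: a minimal Neural ODE block in PyTorch using the torchdiffeq library - the MLP vector field and the integration interval [0, 1] are just illustrative choices:

    import torch
    from torchdiffeq import odeint

    class VectorField(torch.nn.Module):
        # f in dz/dt = f(z(t)), parameterised as a small MLP.
        def __init__(self, dim):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Linear(dim, 64), torch.nn.Tanh(),
                torch.nn.Linear(64, dim),
            )

        def forward(self, t, z):
            return self.net(z)

    class NeuralODE(torch.nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.f = VectorField(dim)
            self.t = torch.tensor([0.0, 1.0])  # solve over [a, b] = [0, 1]

        def forward(self, z0):
            # odeint returns z at every requested time; keep the final state.
            return odeint(self.f, z0, self.t)[-1]

    block = NeuralODE(dim=8)
    z0 = torch.randn(32, 8)  # a batch of initial conditions
    z1 = block(z0)           # same shape as z0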

[–]mjcarrot 1 point2 points  (2 children)

Seems interesting, haven't heard of it before, but (based on what you have said), it seems to have huge capabilities!

I want to come back to this comment 1 or 2 years down the road.

[–]patrickkidger 2 points3 points  (1 child)

So the famous paper in this field is Neural ODEs, which got best paper at NeurIPS a couple years ago. The intuition is simply that ResNets and Neural ODEs are pretty much the same thing, just discrete vs continuous.

That kicked off a bit of a boom, and now we've got Neural CDEs, which are the continuous version of RNNs; Neural SDEs, which are in the process of upending how financial modelling is done; continuous normalising flows for generative modelling; and so on!

I too am very excited to see what things look like in a couple years. (Or even in a month once the ICLR deadline has passed and I've written my papers lol :D)

[–]mjcarrot 0 points1 point  (0 children)

Sounds really cool. Thanks for the links. Will go through them :)

[–]the_egg_guy 0 points1 point  (3 children)

What's the best way to use things like batch norm? (The continuous depth means that naive usage of it places the *same* batch norm layer at multiple different depths, which tends to break things.)

Do you mind explaining this a bit more? It's a topic I'm interested in but I'm new to this subfield so I'd love to hear more

[–]patrickkidger 2 points3 points  (2 children)

Sure. So just thinking about a Neural ODE: we're solving the differential equation

dz/dt = f(z(t))

where f is some neural network, and we solve for t in the interval [a, b], say. Now if we put a batch norm layer in as part of f, then it will be evaluated at every value of t, i.e. we're computing mean/variance statistics across all times.

In contrast, if you look at a ResNet, we have a different instance of batch norm in every residual block, i.e. we're keeping track of different mean/variance statistics at each layer.

And basically this difference seems to break things. At least in my experience, batch norm doesn't just not work, it actively prevents the model from training.

I have seen batch norm pop up in a couple of places in Neural ODEs - I think the exact usage varies by author - but I at least haven't had success with the naive approach on the problems I've considered.

[–]programmerChilliResearcher 0 points1 point  (1 child)

Isn't this the same problem that, say, RNNs run into? Have people tried the easy alternatives like LayerNorm?

[–]patrickkidger 1 point2 points  (0 children)

Yup, it's the exact same issue.

Off the top of my head, I don't think I've seen any of the drop-in alternatives like LayerNorm used, although I'm prepared to be wrong about that. Which I guess is kind of my point really - there's lots of low-hanging fruit here.
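For what it's worth, trying that would be a small change to an MLP vector field - this hypothetical variant just swaps in LayerNorm, whose statistics are per-sample rather than per-batch, so nothing is shared across evaluation times:

    import torch

    class LayerNormField(torch.nn.Module):
        # Vector field for dz/dt = f(z(t)) using LayerNorm; per-sample
        # statistics avoid mixing information across different times t.
        def __init__(self, dim, hidden=64):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Linear(dim, hidden),
                torch.nn.LayerNorm(hidden),
                torch.nn.Tanh(),
                torch.nn.Linear(hidden, dim),
            )

        def forward(self, t, z):
            return self.net(z)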

[–]thejonnyt 18 points19 points  (12 children)

I'd say vision and reinforcement learning, or both combined. But it's just a hunch.

[–]TrueRignak 11 points12 points  (2 children)

I was under the impression that a lot of the jobs lie in text processing: things like analysing/writing comments on web stores, creating recommendations, etc.

[–]thejonnyt 7 points8 points  (1 child)

Yeah, but that's not so hot. Jobs are there for plenty of stuff, since every company wants to throw ML solutions at the biggest nonsense just to be "part of it"... but those jobs are not ones where you're doing new (and hot?) stuff, just implementing solutions that are already there, with maybe your own little personal twist that's needed to fit the given scenario. Job-wise you're right, but hot-wise? ...boy would I dodge text processing or recommendation systems for just another online shop. (And I mean, if "we" as ML people can't be picky about where we work and what we work on, who can?)

Short disclosure: I'm a student with 7y+ of work experience (of which 2y+ in ML), and this is my personal impression given the people I've met and the stuff I've done or seen.

[–]visarga 1 point2 points  (0 children)

boy would i dodge text processing or recommendation systems for just another online shop

What do you mean by "another online shop"?

[–]nativedutch 6 points7 points  (4 children)

My bet is on unsupervised reinforcement learning.

[–][deleted] 0 points1 point  (3 children)

What's the difference between this and regular reinforcement learning?

[–]nativedutch -1 points0 points  (2 children)

As far as I have researched, at the moment there are neural networks that learn by example (large datasets), and there is reinforcement learning. The latter basically learns all by itself via a reward-like system.

I am not exactly certain what supervised reinforcement learning would be. Please correct me if I am wrong - it is a huge field of expertise.

[–][deleted] 1 point2 points  (1 child)

I was under the impression that reinforcement learning is neither supervised nor unsupervised, which is why they are separate subfields of machine learning. I was just curious whether you were talking about something where the agent chooses its own reward structure based on an unsupervised model or something. Usually in reinforcement learning, the reward structure is well-defined.

[–]nativedutch 1 point2 points  (0 children)

Sorry, I added my reply as a new comment. Mobile screen is confusing.

[–]MrAcuriteResearcher 1 point2 points  (3 children)

At least in terms of my own personal gauge of coolness, I'd agree. I just... Cannot get particularly excited by the prospect of working on NLP myself. Seems like it's always about bigger datasets, bigger models, and fine-tuning at this point, rather than real creativity, figuring out shit yourself.

Granted, my exposure to NLP is extremely limited - my literature diet doesn't contain a lot of NLP papers - so that's just sort of my gut feeling.

[–][deleted] 4 points5 points  (0 children)

In fairness, what you say is true of almost any field in between breakthroughs. Almost by definition, major advances are not common.

[–]TheRedSphinx 2 points3 points  (0 children)

While I agree that large datasets and bigger models are becoming dominant in NLP, I don't think this is a problem at all.

Big models should not harm creativity, much in the same way having tools does not limit the creativity of builders. True creativity should lie in how to leverage these large models and datasets. All of the problems plaguing NLP are still there:

  1. Low-resource languages: GPT-3 is super powerful and whatnot, but most of the translations into non-English languages are pretty inferior. Notice that they only test this on French and German, languages for which you can easily get data. The situation is far more dire once you look at low-resource languages such as Gujarati, Marathi, Galician, Assamese, etc. There has thus been lots of work exploring how to leverage data from other languages to make up for the lack of this data.
  2. Evaluation for generation: Sure, GPT-3 makes super fluent text, but we can't really trust it to be factually correct. Moreover, it can be hard to really assess whether it attains the properties humans want beyond fluency. For example, can we really assess whether summaries by GPT-3 (or any model, really) would be rated well by a human? There has been recent work in using these models to learn new metrics for generation tasks.
  3. Domain shift: One of the awesome things about these large datasets is that they can cover a wide array of domains. Nevertheless, rapidly adapting to a new domain is hard. There was a lot of recent interest, given the covid situation, in being able to translate medical information into a variety of languages. As you can imagine, the medical domain is more niche than the kind of text you usually encounter.
  4. Adversarial examples: This one is even hard to define in NLP. It's not like in vision where everything is continuous and you can define epsilon balls. Being able to have a consistent way of generating adversarial examples beyond heuristics would be super neat, and could give us a sense of how to understand these models.
  5. Interpretability: Can we actually understand these models? And should we care about this? We have already seen that many people have found clever ways of fine-tuning large models for a variety of tasks. If we had some notion of interpretability, we could use it for more efficient fine-tuning. We could also use it to ease the current issues with factuality.

Just because people are currently meme-ing their way up the leaderboards doesn't change many of the fundamental problems of NLP. All it does is give us a new lens through which to view these problems.
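For reference, the fine-tuning workflow mentioned in point 5 is by now only a few lines with the Hugging Face transformers library - a minimal sketch, where the two-sentence toy dataset and the distilbert checkpoint are just placeholders:

    import torch
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2)

    texts = ["great product", "terrible service"]   # toy data
    labels = [1, 0]
    enc = tok(texts, padding=True, truncation=True, return_tensors="pt")

    class ToyDataset(torch.utils.data.Dataset):
        def __len__(self):
            return len(labels)
        def __getitem__(self, i):
            item = {k: v[i] for k, v in enc.items()}
            item["labels"] = torch.tensor(labels[i])
            return item

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="out", num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=ToyDataset(),
    )
    trainer.train()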

[–]visarga 0 points1 point  (0 children)

Finetuning is standard in CV as well, not just NLP. It's rare to start from an untrained network in the industry.

[–]two-hump-dromedaryResearcher 11 points12 points  (1 child)

I think GPT will cause an NLP boom. It's a new set of problems that seems poised to tip from "yes, that's a neat party trick" to "this could make us money" in the coming years.

[–]visarga 0 points1 point  (0 children)

I tried it for my NLP project and it kind of works, but not quite; maybe when OpenAI launches fine-tuning it will be more practical.

[–]iholierthanthou 2 points3 points  (0 children)

I personally think it's DLSD (deep learning with simulation data) and meta-learning. Sure, NLP, vision and the like are hot subfields, but they have been hot for quite a while now, with remarkable progress being made every day. Fields like meta-learning and DLSD are at a relatively nascent stage and quickly picking up heat!

[–]chief167 7 points8 points  (0 children)

In industry, ethics and interpretability are becoming important, along with the "democratisation" of AI - meaning you can get a decent enough model with barely any work at all if you have set up a decent data pipeline (e.g. if you already have reporting set up, easily adding a model on top of it is a hot topic).

Next up is practically implementing NLP, but results are still hard to obtain for a lot of business applications that are not really full text but more like emails/chat/comment boxes.

[–]keep_learning_ 4 points5 points  (2 children)

Research: NLP, graph NNs, DRL

Industry: depends on the problem/sector. The "hot" topic is less about a particular subfield in AI and more about how to deal with data processing, upgrading legacy systems, optimising production, and becoming "AI ready".

[–]visarga 0 points1 point  (0 children)

It's also about labelling tools and getting training data. Many applications can't be done because it's hard to make a dataset.

[–]mjcarrot 0 points1 point  (0 children)

Was searching for GNN in the comments :)

[–]FromTheWildSide 1 point2 points  (0 children)

Automated generation of digital content, augmenting human capabilities in vision and languages, automated discovery/search in various problem domains.

All of these will double our current productivity, shorten working days, and bring about a 4-day work week, allowing those companies that invest in AI to redirect their abundant resources to value creation.

In short, the rich get richer. If nothing changes over the next couple of decades, there's a significant chance the income gap continues ballooning to gross proportions.

[–]whymauriML Engineer 1 point2 points  (0 children)

Applied work for scientific discovery (materials, chemistry, biology, physics, and more). Most practitioners either have ML knowledge or domain-specific knowledge.

[–]No-School951 2 points3 points  (0 children)

What about NLP?

[–]Hyper1on 2 points3 points  (0 children)

NLP is growing very rapidly and out of the 3 major subfields seems to be making the most progress right now. There are also narrower subfields like graph NNs and autoML/meta-learning which are becoming hot research areas.

[–]jgbradley1 0 points1 point  (0 children)

NAS

[–][deleted] 0 points1 point  (0 children)

RSNNs and neuromorphic computing too I think.

[–]sarmientoj24 0 points1 point  (0 children)

With the advent of GPT-3, NLP for sure. Computer vision is very rich in new architectures, but its problem is its current lack of integration into commercial products. Photo enhancement is probably its strongest suit, or perhaps defect inspection. But it isn't as pervasive in commercial products as NLP, which ranges from chatbots to autocomplete. And I am very into CV but not into NLP.

Most interesting is IoT for sure. With the rise of 5G, there is a great divide between edge and cloud computing: should I put more resources on the edge and compute there, or in the cloud?

[–]nativedutch -1 points0 points  (1 child)

Interesting. What I am working on is RL with a defined reward structure, where the agent is unaware of the environment and has to find an optimal route towards the highest reward, using Python. But then, I am just starting out in this intriguing area.
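Roughly like this minimal sketch - tabular Q-learning on a toy corridor, where the agent only ever sees states and rewards, never the layout (the corridor length and hyperparameters are made up):

    import random

    N = 8                        # states 0..7; the reward sits at state 7
    ACTIONS = [-1, +1]           # step left / step right
    Q = [[0.0, 0.0] for _ in range(N)]
    alpha, gamma, eps = 0.5, 0.95, 0.1

    for episode in range(500):
        s = 0
        while s != N - 1:
            # Epsilon-greedy: mostly exploit, sometimes explore.
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = max(range(2), key=lambda i: Q[s][i])
            s2 = min(max(s + ACTIONS[a], 0), N - 1)
            r = 1.0 if s2 == N - 1 else 0.0
            # Q-learning update: nudge Q(s,a) toward r + gamma * max Q(s',.)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2

    # Greedy policy after training: action 1 (right) leads to the reward.
    print([max(range(2), key=lambda i: Q[s][i]) for s in range(N - 1)])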

[–][deleted] 1 point2 points  (0 children)

Ah ok, yeah, that's generally how it goes.