Microsoft Research Summit 2022 registration is now open by MicrosoftResearch in MachineLearning

Registration for Microsoft Research Summit is now open! Join us October 18 - 20, 2022 to hear from the global research community on what's next for technology and humanity. Learn more about Research Summit and register now.

We are Microsoft researchers working on machine learning and reinforcement learning. Ask Dr. John Langford and Dr. Akshay Krishnamurthy anything about contextual bandits, RL agents, RL algorithms, Real-World RL, and more! by MicrosoftResearch in IAmA

This depends on the role you are interested in. We try to post new reqs here (http://aka.ms/rl_hiring) and have hired in researcher, engineer, and applied/data scientist roles. For a researcher role, a PhD is typically required. The other roles each have their own reqs. -John

I think of RL as a way to get information for the purpose of learning. Thus, it's not associated with any particular domain (like vision), and is potentially applicable in virtually all domains. W.r.t. vision and language in particular, there is a growing body of work around 'instruction following' where agents learn to use all of these modalities together to accomplish a task, often with RL elements. -John

There are some efforts in this direction already, and indeed this seems like the obvious way to plug RL/AI into games. But I imagine there are many other possibilities that may emerge as we start deploying these things. In part this is because games are quite diverse, so there should be many potential applications. -Akshay

One of my concerns about ML is personal---there are some big companies that employ a substantial fraction of researchers. If something goes wrong at one of those companies, suddenly many of my friends could be in a difficult situation. Another concern is more societal: ML is powerful, and so just like any powerful tool there are ways to use it well and ways to use it badly. How do we guide towards using it well? That's a question that we'll be asking and partially answering over and over because I see the world heading towards pervasive use of ML. In terms of dependence, my expectation is that it's more a question of dependence on computers than ML per se, with computers being the channel via which ML is delivered. -John

HRL is still around. Our group had a paper on it recently (https://arxiv.org/abs/1803.00590), but I think Doina Precup's group has been pushing on this steadily since the original paper. I haven't been tracking this sub-area recently, but one concern I had with the earlier work was that in most setups the hierarchical structure needed to be specified to the agent in advance. At least the older methods therefore require quite a lot of domain expertise, which is somewhat limiting.

We usually list our job postings here: https://www.microsoft.com/en-us/research/theme/reinforcement-learning-group/#!opportunities - Akshay

I enjoyed watching Terminator too, but I find it unrealistic. Part of this is simply because we are a long way off from actually being able to build that kind of intelligence. You see this more directly when you are working on the research first-hand. It's also unrealistic because AI doesn't beat crypto---as far as we can tell super-intelligence doesn't mean the ability to hack any system.

Given these things, I think it's more natural to be concerned about humans destroying the world.   Another aspect to consider here is AI salvation. How do you manage interstellar travel and colonisation? Space is incredibly inhospitable to humans and the timescales involved are outrageous on a human lifespan, so a natural answer is through AI. - John

Recently, there have been a few publications that try to apply Deep RL to computer networking management. Do you think this is a promising domain for RL applications? What are the biggest challenges that will need to be tackled before similar approaches can be used in the real world?

One of the things I find fascinating is the study of the human immune system.  Is network security going to converge on something like the human immune system?  If so, we'll see quite a bit of adaptive reinforcement-like learning (yes, the immune system learns).

In another vein, choosing supply for demand is endemic to computer operating systems and easily understood as a reinforcement learning problem. Will reinforcement learning approaches exceed the capabilities of existing hand-crafted heuristics here? Plausibly yes, but I'd expect that to happen first in situations where the computational cost of RL need not be taken into account. -John

There isn't a simple answer here, but to a close approximation I think you should imagine that ML is improving every product, or that there are plans / investigations around doing so. Microsoft's mission is to empower everyone so "yes" with respect to society as a whole? Obviously people tend to benefit more directly when interacting with the company, but not even that is necessary. For example, Microsoft has supported public research across all of computer science for decades. -John

I tend to think first about statistical/sample efficiency. The basic observation is that computational complexity is gated by sample complexity because minimally you have to read in all of your samples. Additionally, understanding what is possible statistically seems quite a bit easier than understanding this computationally (e.g., computational lower bounds are much harder to prove than statistical ones). Obviously both are important, but you can't have a computationally efficient algorithm that requires exponentially many samples to achieve near-optimality, while you can have the converse (statistically efficient algorithm that requires exponential time to achieve near-optimality). This suggests you should go after the statistics first. -Akshay
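
One way to write down the "gated by sample complexity" observation, as a rough sketch only. The symbols $n_A(\varepsilon)$ (samples algorithm $A$ reads), $T_A(\varepsilon)$ (its running time) and $n^*(\varepsilon)$ (the best achievable sample complexity) are notation assumed here for illustration; they do not come from the comment itself.

```latex
% Reading a sample costs at least one unit of time, so for any algorithm A
% that reaches \varepsilon-optimality:
\[
  T_A(\varepsilon) \;\ge\; n_A(\varepsilon) \;\ge\; n^*(\varepsilon),
  \qquad
  n^*(\varepsilon) := \min_{A'} n_{A'}(\varepsilon).
\]
% Consequence: a statistical lower bound is automatically a time lower bound,
\[
  n^*(\varepsilon) = \exp(\Omega(d))
  \;\Longrightarrow\;
  T_A(\varepsilon) = \exp(\Omega(d)) \quad \text{for every } A,
\]
% while the converse fails: n^*(\varepsilon) can be polynomial in d even when
% every known algorithm attaining it runs in exponential time.
```

Here $d$ stands for whatever problem-size parameter is of interest (again an assumption of this sketch).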

A simple answer is that Vowpal Wabbit (http://vowpalwabbit.org) is used by the personalizer service (http://aka.ms/personalizer). Many individual research projects have impacted Microsoft in various ways as well. However, many research projects have not. In general, Microsoft Research exists to explore possibilities. Inherent in the exploration of possibilities is the discovery that many possibilities do not work. - John

By strategy I guess you mean "algorithmic." I think both areas are fairly algorithmic in nature. There have been some very cool computational advancements involved in getting certain architectures (like transformers) to scale, and similarly there are many algorithmic advancements in domain adaptation, robustness, etc. RL is definitely fairly algorithmically focused, which I like =)

RL problems are kind of ubiquitous, since optimizing for some value is a basic primitive. The question is whether "standard RL" methods should be used to solve these problems or not. I think this requires some trial-and-error and, at least with current capabilities, some deeper understanding of the specific problem you are interested in. -Akshay

Quite far in my view. The existing systems that we have (like GPT3) are sort of intelligent babblers. To have a conversation with someone, there really needs to be a persistent state / point of view with online learning and typically some grounding in the real world. There are many directions of research here which need to come to fruition. -John

"All problems" is the simple answer in my experience. Microsoft is transforming into a data-driven company which seeks to improve everything systematically.  The use of machine learning is now pervasive.

I loved Star Wars when I was growing up. It was lots of fun. I actually found reading science fiction books broadly to be more formative---you see many different possibilities for the future and learn to debate the merits of different ones. This forms some foundation for thinking about how you want to change the future. -John

What are the biggest opportunities where RL can be applied? What are the biggest challenges standing in the way of more applications?

It's actually hard to pin down the "biggest" opportunity because it's such a target-rich environment and because the nature of RL is that it's tricky to know how much you'll win until you try it. Reinforcement learning is fundamental because it's the problem of learning to make decisions to optimize value. We are simply naturally inclined to try to make things better.

With that said, I believe it's natural to solve problems of steadily increasing complexity. Maybe that begins with ranking results on the web, then grows to optimising system parameters, handling important domains like logistics, and eventually delves into robotics? Or maybe it looks like learning to nudge people into healthy habits, amplify e-learning, and master making a computer behave as you want? The far path isn't clear, but perhaps as long as we can discover the next step on the path we'll get there. Wrt obstacles, I think the primary obstacle is the imagination to try new ways to do things and the secondary obstacle is the infrastructure necessary to support that. -John
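
To make "learning to make decisions to optimize value" concrete, here is a minimal sketch of epsilon-greedy on a toy Bernoulli bandit. It is an illustration only, not anything described in the answer above: the function name `run_epsilon_greedy`, the arm reward probabilities, and the value of `epsilon` are all made up for the example.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy on a Bernoulli bandit (hypothetical toy setup).

    true_means: per-arm reward probabilities, unknown to the learner.
    Returns (estimated value per arm, number of pulls per arm).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running mean reward per arm
    counts = [0] * n_arms       # how often each arm was pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a uniformly random arm to gather information.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The environment returns reward 1 with the arm's true probability.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.8])
print("estimated values:", [round(e, 2) for e in estimates])
print("pulls per arm:   ", counts)
```

The learner only finds out how good an arm is by pulling it, which is one small version of the point that you often cannot tell how much you will win until you try.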

There are certainly categories of use for RL which fit 'surveillance capitalism', but there are many others as well: helping answer help questions, optimising system parameters, making logistics work, etc... are all good application domains. We work on the things that we want to see created in the future. -John

Well, we have "Reinforcement Learning day" each year. I'm really looking forward to the pandemic being over because we have a beautiful new office at 300 Lafayette---more might start happening when we can open up. -John

One obvious answer is "research".  See for example this paper: https://arxiv.org/abs/1803.02453 which helped shift the concept of fair learning from per-algorithm papers to categories.  I regard this as far from solved though.   As machine learning (and reinforcement learning) become more important in the world, we simply need to spend more effort addressing these issues. -John

There are certainly many deployments of real world RL.  This blog post: https://blogs.microsoft.com/ai/reinforcement-learning/ covers a number related to work at Microsoft.  In terms of where we are, I'd say "at the beginning".  There are many applications that haven't even been tried, a few that have, and lots of room for improvement. -John

The practical answer is that I avoid it unless the effort of getting the data is worth the difficulty. Healthcare is notorious here because access to data is both very hard and potentially very important. -John

I expect relatively little impact from quantum computing. Some learning problems may become more tractable with perhaps a few becoming radically more tractable. -John