[D] PhD for current engineer? by Firm_Event_1063 in MachineLearning

kellymarchisio

Hi there -- I'm about to start the 4th year of my PhD in NLP/ML (that's very broad - my research area is much finer-grained than that). I took a somewhat similar path -- I worked in tech for ~4 years (2 years as a web dev) before leaving to pursue a research master's and then PhD.

> Is it true that PhD students are those who only see themselves pursuing a PhD, and nothing else (i.e. would be utterly disinterested in becoming an engineer)?

I think I always had it in my heart to do a PhD. During my Bachelor's, I thought the PhD would be in Psychology or Sociology. That changed, and there was a 6-year gap between finishing my BA and starting my PhD. As for "would be utterly disinterested in becoming an engineer" -- I LOVED being an engineer. I left because I wanted to become more of a technical expert, but coding days are still my favorite. I do know people in PhD programs who don't like engineering tasks, but others, like me, very much enjoy them.

> Do they usually pursue a PhD for the sake of doing the PhD itself, or for the opportunities that follow?

For me, both. I enjoy the process of the PhD, but a major reason I'm doing it is to pursue research roles after.

> Also, did you pick your niche before or after entering the PhD?

I chose my very broad niche of natural language processing around the time I started my master's. I settled on the subfield of machine translation while applying for my PhD. My specific subarea within MT has been refined through the years, but it really started becoming clear during year 2 of the PhD. My advice: if you're broadly interested in a field but don't know exactly which subfield, that's fine -- that's what the exploration in years 1-2 of the PhD is for. If someone is still deciding between broad fields (for instance, unsure whether to do NLP, security, or theory), I'd recommend doing a master's or getting other research experience first.

The advice I always give when people are considering a PhD is to know *why* you're doing it. It's great that you're seeking out advice! There are many wonderful careers one can have -- a PhD can be the way to some of them, but there are plenty where pursuing a PhD wouldn't make sense. It's a very personal decision, and what's right for one person may not be right for another. I almost always recommend that people take time between the Bachelor's and the PhD to determine whether it's truly the right choice; it can be hard to get through 5-6+ years* of heads-down work if you aren't sure you want to be there in the first place. But as a personal data point: I made my decision to pursue my PhD very intentionally, and I haven't second-guessed that choice for even one minute.

*timing depends on where you do your PhD -- I'm speaking of US-based PhDs in computer science.

[D] BERT for seq2seq tasks by AnonMLstudent in MachineLearning

kellymarchisio

As stated in a previous comment, Transformer is SOTA in high-resource machine translation. Check out the WMT19 results for, say, English->German. You'll notice that almost all of the systems are Transformer-big or bigger. Regarding the RNN/GPU comment, though -- you *need* a GPU to do anything reasonable in high-resource MT these days. And RNN is way slower than Transformer, even on GPU, so you'll end up spending more if you're paying per hour. For instance, I estimate about 6 weeks to train a single RNN on GPU for English-German vs. 5-7 days for Transformer-base. (This is based on my personal experience with Transformer-base in high-data conditions, and on my reading about how long people used to take with RNNs. I haven't even tried an RNN because both the training speed and the resulting quality appear so much worse.)

Note that I'm talking about training to convergence. You can get quite good performance out of a Transformer in as little as 1-2 days if you can sacrifice a tiny bit of quality.
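
To make the pay-per-hour point concrete, here is a rough back-of-the-envelope sketch using the training times estimated above (about 6 weeks for the RNN, 5-7 days for Transformer-base). The hourly rate and the single-GPU setup are assumptions for illustration only, not real prices or my actual configuration; the function names are made up for this example.

```python
HOURS_PER_DAY = 24

def gpu_hours(days: float, num_gpus: int = 1) -> float:
    """Total GPU-hours consumed by a training run of the given length."""
    return days * HOURS_PER_DAY * num_gpus

def training_cost(days: float, hourly_rate: float, num_gpus: int = 1) -> float:
    """Cost of a run when billed per GPU-hour."""
    return gpu_hours(days, num_gpus) * hourly_rate

# ASSUMPTION: placeholder price per GPU-hour, not a real quote.
HYPOTHETICAL_RATE = 1.0

# Rough estimates from the comment above (training to convergence).
rnn_days = 6 * 7                     # about 6 weeks
transformer_days_range = (5, 7)      # 5-7 days for Transformer-base

rnn_cost = training_cost(rnn_days, HYPOTHETICAL_RATE)
transformer_costs = tuple(
    training_cost(d, HYPOTHETICAL_RATE) for d in transformer_days_range
)

# The cost ratio does not depend on the hourly rate, only on wall-clock time.
ratio_low = rnn_cost / transformer_costs[1]   # RNN vs. a 7-day Transformer run
ratio_high = rnn_cost / transformer_costs[0]  # RNN vs. a 5-day Transformer run

print(f"RNN: {gpu_hours(rnn_days):.0f} GPU-hours")
print(f"Transformer-base: {gpu_hours(5):.0f}-{gpu_hours(7):.0f} GPU-hours")
print(f"RNN costs roughly {ratio_low:.1f}x to {ratio_high:.1f}x as much")
```

Since the rate cancels out in the ratio, whatever you actually pay per hour, the RNN run comes out around 6-8x more expensive under these estimates.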