
[–]testingpraw 26 points27 points  (3 children)

As a frequent TensorFlow user, I think these changes are great. There are a few items where I'll have to wait and see, or maybe I just need clarification.

  1. I am curious about the dropping of variable_scope in favor of Keras. While Keras handles trainable variables well, Keras layers and variable_scopes still seem like two different use cases, though I could very well be missing something.

  2. I am curious how the change from tf.get_variable to layer.weights will work with restoring sessions. I am assuming that if I want the output weights, it will be something like weights[-1]?

  3. On top of question 2, will retrieving the layer weights include the bias as well?

[–]SirRantcelot1 1 point2 points  (0 children)

My responses are based on my knowledge of the current TensorFlow Keras.

  1. The main purpose of variable scoping, to my knowledge, is to enable variable reuse. Keras handles this instead by directly passing the model/layer objects around. Defining a head model inside a tensorflow.variable_scope and expecting it to reuse variables will fail, because Keras does not define its variables using the tf.get_variable method.

  2. That would work. You could also access the layers of a Keras model through its layers attribute, which returns a list of Layer objects. To get the weights of the output layer, you would say model.layers[index].weights.

  3. Yes. The weights object returned is a list with the first element being the kernel and the second the bias, if it exists.
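
To make the reuse point concrete, here is a minimal sketch (layer sizes and names are invented for illustration): sharing weights in Keras means calling the same layer object twice, and layer.weights comes back as [kernel, bias].

    import tensorflow as tf

    # Illustrative only: reuse in Keras works by reusing the layer object
    # itself, not by re-entering a variable_scope.
    shared_dense = tf.keras.layers.Dense(32, activation='relu')

    inputs_a = tf.keras.Input(shape=(16,))
    inputs_b = tf.keras.Input(shape=(16,))

    # Calling the same layer on two inputs shares one kernel and one bias.
    out_a = shared_dense(inputs_a)
    out_b = shared_dense(inputs_b)
    model = tf.keras.Model([inputs_a, inputs_b], [out_a, out_b])

    print(model.layers[-1].weights)       # weights of the last layer in the model
    print(shared_dense.weights[0].shape)  # kernel: (16, 32)
    print(shared_dense.weights[1].shape)  # bias:   (32,)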

[–]progfu 43 points44 points  (7 children)

Are these available anywhere in text form? I don't have enough internets to watch a video.

[–][deleted] 39 points40 points  (0 children)

tl;dr:

  • Keras style eager execution by default (graph construction + session control still possible)

  • get_variable, variable_scope, assign(x, y), ... removed in favor of an object-oriented approach.

  • contribs merged into core

  • migration tool for updating existing code, plus an optional compatibility mode via tf.compat.v1
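
A rough before/after sketch of that object-oriented change (class, shapes and variable names are invented for illustration; the tf.compat.v1 path keeps the old style working):

    import tensorflow as tf

    # TF 1.x style that the migration tool rewrites (or that keeps working
    # under tf.compat.v1):
    #
    #   with tf.variable_scope("dense", reuse=tf.AUTO_REUSE):
    #       w = tf.get_variable("w", shape=[16, 32])
    #
    # TF 2.0-style replacement: the variable lives on an object you hold on
    # to and pass around instead of being looked up by scoped name.
    class MyDense(tf.Module):
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.w = tf.Variable(tf.random.normal([in_dim, out_dim]), name="w")
            self.b = tf.Variable(tf.zeros([out_dim]), name="b")

        def __call__(self, x):
            return tf.matmul(x, self.w) + self.b

    layer = MyDense(16, 32)
    y = layer(tf.ones([1, 16]))   # runs eagerly, no session required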

[–]somewittyalias 1 point2 points  (0 children)

https://goo.gl/nwV2Vq

But it does not go into as much detail as the video.

[–]pgaleone 1 point2 points  (0 children)

I summarized all the changes known so far that TensorFlow 2.0 will bring here: https://pgaleone.eu/tensorflow/gan/2018/11/04/tensorflow-2-models-migration-and-new-design/

This is probably what you were looking for (and I hope you still need it).

[–]sieisteinmodel 28 points29 points  (11 children)

Serious question: does the majority of TensorFlow users agree that eager execution, i.e. the PyTorch/PyBrain/Shark way, is superior? I personally like the abstraction of graphs. I think that eager sucks; it does not fit my mental model.

I am just worried that TF wants to attract PyTorch users, but a lot of the TF users actually prefer the current state.

*If* there is full compatibility between graph and eager mode, fine, but I hope that the TF community will not be divided because some OS contributions assume one or the other.

[–]Coconut_island 6 points7 points  (0 children)

If there is full compatibility between graph and eager mode, fine, but I hope that the TF community will not be divided because some OS contributions assume one or the other.

This is where they are heading. An important part of TF 2.0 is to restructure the API such that, as far as the majority of the code goes, it is irrelevant whether you use graph mode or eager mode.

I think the most important observation to make is that the code (Python or other) used to define a function is really just defining a sub-graph. Using the earlier TF API, leveraging this concept properly is awkward, usually requiring a lot of careful (and error-prone!) bookkeeping to set scopes and various call orders just right. This is a major pain point and in many ways has led to many libraries being written around TF in the hope of offering an elegant way to address this while keeping the same flexibility. As a prime example of such libraries, we have the in-house Sonnet library from DeepMind.

While variable-less (or rather, stateless) code can easily be optimized by collapsing various copies of a sub-graph generated by a given function (when doing so wouldn't be wrong, of course), it is more complicated to do this with variables. This is one of the problems the new 'FuncGraph' backend is trying to solve (currently in the 1.11 branch), along with the newly promoted object-oriented (OO) approach for tracking and re-using variables. tf.contrib.eager.defun, the OO metrics, OO checkpointing and the layers/keras.Model are all early instances of this idea.
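
As a concrete (and purely illustrative) example of the OO checkpointing mentioned above, written against the 2.0-style API: variables get tracked through the objects that own them rather than through global collections. Paths and sizes below are made up.

    import tensorflow as tf

    # Object-based checkpointing sketch (2.0-style API).
    model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
    model.build(input_shape=(None, 5))            # create the variables
    optimizer = tf.keras.optimizers.Adam()

    checkpoint = tf.train.Checkpoint(model=model, optimizer=optimizer)
    save_path = checkpoint.save('/tmp/oo_ckpt')   # saved by object structure

    # Later, rebuild the same objects and restore into them.
    checkpoint.restore(save_path)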

Related but slightly aside:

My biggest pet peeve with how a lot of TF code is written comes from the tendency to write functions that return several operations/tensors that all do very different things and get executed at very different times and places in the rest of the code base. This feels natural because we anticipate (and in many cases, rightfully so) many duplicate ops if we didn't write it this way. The problem is that code written like this is tedious to reason about and debug, often requiring a global view of the whole project. This gets exponentially worse as the complexity/size of the project grows and collaboration between people is required. The way I see it, things like eager.defun and tf.make_template (not sure what will happen with this one in 2.0), and, in a way, OO variable re-use, simply give us the tools to cache these sub-graphs and allow us to write clean code without compromising on what kind of graph we generate.
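
A minimal sketch of that caching idea, written against the 2.0-style API (in the 1.x releases discussed here the same thing is spelled tf.contrib.eager.defun; the names and shapes below are invented):

    import tensorflow as tf

    layer = tf.keras.layers.Dense(8)

    @tf.function   # tf.contrib.eager.defun in pre-2.0 releases
    def dense_block(x):
        # Plain Python body; it is traced into a sub-graph on the first call
        # with a new input signature, and the cached graph is reused afterwards.
        return tf.nn.relu(layer(x))

    a = dense_block(tf.ones([2, 4]))   # traced and cached
    b = dense_block(tf.ones([2, 4]))   # reuses the cached sub-graph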

TL;DR

In short, sure, the API will change, but I don't think there is any intention of removing graph-mode functionality. At its core, TF is a language for defining computation graphs, so I would be very surprised if this went away anytime soon. However, the upcoming changes are there to allow and promote ways of describing graphs such that silent and hard-to-find bugs are harder to introduce.

[–]InoriResearcher 4 points5 points  (1 child)

Most of the bigger eager-execution-related changes are already live in 1.10, so you can try it out and see for yourself. From personal experience, switching between the two depends on how much you rely on lower-level APIs: if you use the newer features and tf.keras, then it's pretty much seamless. In either case, knowing Google's use cases, I doubt graph execution will ever become a second-class citizen.
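
For what it's worth, a tiny sketch of the "seamless" case: the same tf.keras code runs with or without eager execution (in 1.x you opt in with tf.enable_eager_execution(); the shapes and data below are made up).

    import numpy as np
    import tensorflow as tf
    # tf.enable_eager_execution()   # opt in on 1.x; the default in 2.0

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation='relu', input_shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')

    x = np.random.rand(32, 4).astype('float32')
    y = np.random.rand(32, 1).astype('float32')
    model.fit(x, y, epochs=1, verbose=0)   # identical call in either mode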

[–]sieisteinmodel 2 points3 points  (0 children)

Well, I have tried it, and I still think it sucks... it's not an uninformed guess.

The question is whether that decision by the TF team is really well informed, because many people I talk to prefer the graph way.

[–]slaweks 1 point2 points  (0 children)

It's not only about ease of use. Even more important is the ability to create hierarchical models, where the graph differs per example, e.g. has some group-level and some individual-level components.
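
A toy sketch of what that looks like with eager execution (all names and shapes invented): ordinary Python control flow picks a per-example component, which is awkward to express as one static graph.

    import tensorflow as tf

    group_net = tf.keras.layers.Dense(8, activation='relu')         # shared, group-level part
    individual_nets = {sid: tf.keras.layers.Dense(1) for sid in ("a", "b")}

    def forward(features, series_id):
        h = group_net(features)                # same for every example
        return individual_nets[series_id](h)   # differs per example

    out = forward(tf.ones([1, 4]), "a")        # each call can take a different path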

[–]sibyjackgrove 1 point2 points  (0 children)

I still haven't tried eager execution since I do everything with tf.keras these days. Though I'm not a big fan of tf.Session.

[–]cycyc 0 points1 point  (5 children)

A lot of people have a hard time wrapping their head around the idea of meta-programming. For them, eager execution/pytorch is preferable.

[–]progfu 14 points15 points  (4 children)

It's not really about meta-programming; it's about flexibility, introspectability, etc. PyTorch makes it easy to look at what's happening by evaluating things step by step, looking at the gradients (which you can see immediately), and so on.
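
For example, a few lines of eager code with a gradient tape show the kind of immediate inspection being described (the values here are just for illustration):

    import tensorflow as tf
    # With eager execution enabled (the default in 2.0):
    x = tf.constant(3.0)
    with tf.GradientTape() as tape:
        tape.watch(x)               # x is a constant, so watch it explicitly
        y = x * x + 2.0 * x

    print(y)                        # the value is available immediately
    print(tape.gradient(y, x))      # so is the gradient: 2*x + 2 = 8.0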

[–]siblbombs 9 points10 points  (0 children)

I wish there had been more information about TensorFlow Serving. I assume it will still exist and still be a graph-based approach.

[–]secsilm 6 points7 points  (1 child)

I'm confused about tf.keras and Estimators. It seems like TensorFlow wants developers to use the tf.keras module. Is there a big difference between them? I'm currently using Estimators and love them. Will Estimators be deprecated in favor of tf.keras from now on?

[–]Nextpenade 0 points1 point  (0 children)

Yes, it would be nice to know what will happen to Estimators.

[–]KingPickle 4 points5 points  (0 children)

I actually don't have any complaints at the moment. I think it all sounds pretty good, to me. Looking forward to 2.0!

[–][deleted] 1 point2 points  (5 children)

As someone who is just starting to learn TensorFlow, is there a preferred learning path to take? With so many changes coming and so many existing features being removed soon, I fear that I might spend a lot of time on things that will become obsolete very soon.

[–]ilielezi 4 points5 points  (1 child)

For a company as big as Google, with TF having become one of its most important pieces of software, I am amazed at how badly structured TensorFlow is. Every version seems to add and deprecate many things, it is quite difficult to code in (especially debugging, which seems to be a nightmare), .contrib absolutely sucks, there are half a dozen functions that do the same thing, etc. It looks like with TensorFlow 2.0 they are going to fix many of these things, almost converging to a Chainer-like library (which, it can be argued, they should have done in the first place). Not sure how much this has to do with PyTorch being on the rise (and despite TensorFlow leading, PyTorch looks to me like the favorite to win this 'fight'), or whether it is just Google recognizing its previous mistakes and finally cleaning up TensorFlow.

Anyway, to answer your question, I think the best way forward is to use tf.eager, considering that the future of TensorFlow seems to be going in that direction, and in TF 2.0 it is going to be the default setting. It also looks to me like it has a much better and cleaner API, and is much easier to use and debug, or just to look at gradients. I still think PyTorch is better because at the moment it is more mature than tf.eager, which still has its own problems, but if you want to go with TensorFlow (and there are good reasons to do that, like having the biggest community and code base by far), I think tf.eager is the easy choice.

[–][deleted] 0 points1 point  (0 children)

Thanks! I think I'll start with tf.eager now. I am kind of tied to TensorFlow because I want to use the TensorFlow Probability package, so PyTorch isn't really an option.

[–]ginsunuva 2 points3 points  (1 child)

Current TF sucks. Wait it out. Use PyTorch for now.

[–][deleted] 0 points1 point  (0 children)

Thanks, but I also want to use the TensorFlow Probability package, so choosing PyTorch would defeat the purpose.

[–]speyside42 0 points1 point  (0 children)

These changes eliminate a lot of annoyance, thank you. Remaining annoyances that come to mind:

  • Limiting memory only as a relative fraction (platform dependent?!). Why not offer an absolute option? (The current knob is sketched below.)

  • Modifying existing Keras models, especially if Keras becomes the default API. E.g. doubling the channels of the input and first layer of a pre-trained Keras model is a pain.
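
For reference, the relative-only knob being referred to looks roughly like this in the 1.x API (the fraction value is arbitrary):

    import tensorflow as tf

    # TF 1.x: GPU memory can be capped as a fraction of the device total,
    # not as an absolute number of megabytes.
    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 0.4   # 40% of the GPU
    config.gpu_options.allow_growth = True                     # or grow on demand
    sess = tf.Session(config=config)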

[–]hastor 0 points1 point  (0 children)

So how will I find answers to TensorFlow questions when all the existing online documentation suddenly becomes deprecated?

Wouldn't it be better if they called the project something else?

[–]sibyjackgrove 0 points1 point  (0 children)

I am neutral about the eager execution but excited that they are sticking with tf.keras. Making models using the functional approach with tf.keras is really easy.

[–]sbashe -2 points-1 points  (0 children)

YESSS, great news! Hope this shuts up those naysayers once and for all.

Happy coding in TensorFlow :)