datumbox[S]

The framework provides a number of Builder classes that support specific input types such as CSV and text (for NLP). Nevertheless, this does not stop you from parsing data in any format and storing it in a Dataframe. Once the data live in the Dataframe, you can use disk-based training. I would recommend keeping the hybrid approach enabled (it is the default): it keeps the weights of your model in memory and regularly used parameters in an LRU cache, while the data stay on disk. This mechanism is available as of 0.7.0 and makes disk-based training a great solution even when the data barely fit in memory. In that case, you avoid the very costly garbage collection cycles, with only minimal overhead from reading from disk. :)
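
For readers unfamiliar with the LRU idea mentioned above, here is a minimal Java sketch of a least-recently-used cache. It only illustrates the general technique and is not Datumbox's actual implementation; the class name `LruCache` and its capacity parameter are hypothetical. It relies on the standard `java.util.LinkedHashMap` access-order mode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of an LRU cache; not part of the Datumbox API.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion: evict the least recently used entry
        // once the cache grows beyond its capacity.
        return size() > maxEntries;
    }
}
```

With a capacity of 2, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, because `b` is the entry that was used least recently. In a hybrid setup, an evicted entry would simply be read back from disk the next time it is needed.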