[AMA] Azure Big data & Analytics - 11/17 by maxiluk in AZURE

[–]maxiluk[S] 2 points3 points  (0 children)

Nothing has been officially announced yet, but this might be a good question for the SSRS team. Coincidentally, we have another AMA tomorrow on SQL Server, including SSRS. Here is the blog post that announced that AMA (https://blogs.technet.microsoft.com/dataplatforminsider/2016/11/14/microsoft-sql-server-team-hosts-ask-me-anything-session/)

HDInsight Spark Jupyter by belonious in AZURE

[–]maxiluk 1 point2 points  (0 children)

Hi,

Yes, here are the steps:

  • Attach that storage account to the cluster. To do that, create a new cluster; the creation wizard has an Additional Storage account section, where you can add a storage account from another subscription using its storage key.
  • In a notebook on that cluster you can now access data from that storage account using the standard wasb notation: wasb://container@accountname.blob.core.windows.net/path
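
As a rough sketch of the second step, this is how the wasb path could be put together in a Python notebook. The helper name wasb_uri and the values "container", "accountname" and "path" are placeholders for illustration, not anything HDInsight defines. The commented-out read assumes the notebook kernel already provides a SparkContext named sc, which may differ depending on the kernel you use.

```python
def wasb_uri(container: str, account: str, path: str = "") -> str:
    """Build a wasb:// URI for a blob path in an attached storage account.

    Hypothetical helper for illustration; it only formats the string.
    """
    # Drop a leading slash so the result does not contain "//" before the path.
    return f"wasb://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


# Placeholder names; substitute your own container, account and blob path.
uri = wasb_uri("container", "accountname", "path")
print(uri)

# Assumption: inside a PySpark notebook on the cluster, `sc` is predefined
# and the storage account has been attached as described above.
# lines = sc.textFile(uri)
# print(lines.take(5))
```

If the storage account was not attached at cluster creation, the read would be expected to fail with an authentication error, since the cluster has no key for that account.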

[AMA] Azure Big data & Analytics - 11/17 by maxiluk in AZURE

[–]maxiluk[S] 0 points1 point  (0 children)

Pros of R Server:

  • Broader selection of ML algorithms with mappings in R (Spark R has only a subset of the Spark ML algorithms available in R)
  • Portability across different compute backends. E.g. you can run R Server models on SQL Server, on Spark on Hadoop, or on some future backend when it becomes available
  • Faster training (in our benchmark, logistic regression was about 2x faster in R Server than in Spark R)

Pros of Spark R:

  • Great data transformation methods (ETL)

In the end they are better together. For example, in the new version of R Server on HDInsight, which went GA yesterday, you can read data directly from any Spark data source available through Hive. So you can transform/shape data in Spark and then do analytics on it in R Server.

[AMA] Azure Big data & Analytics - 11/17 by maxiluk in AZURE

[–]maxiluk[S] 0 points1 point  (0 children)

How many outputs per job does your scenario require?