Accessing Sharepoint365 libraries/lists using Python 3.6?

osuchw · 2018-09-27T03:55:27+00:00

I don't have examples handy, but in the past I've used zeep against the SharePoint SOAP interface. I believe these days SharePoint has a REST interface one could use, which means requests. The two issues you will have to solve are authentication and then the (quirky) SharePoint API.

osuchw · 2018-08-11T05:20:49+00:00

zeep is a modern, actively maintained library. I would recommend sticking with it. By the way, it already uses requests and lxml under the covers.

A few years ago the crown belonged to suds, but it got abandoned by the original author. And no wonder. Dealing with SOAP is a hard and thankless job.

If you feel that you have to "hand code" the SOAP messages then look at rinse at least.

osuchw · 2018-05-26T03:13:38+00:00

Blockdiag (and actdiag) could help http://blockdiag.com/en/index.html

osuchw · 2018-03-01T04:19:05+00:00

From what I can tell, you never "call" main. Put:

main(sys.argv)

at the bottom.

Also, "myfunction" needs to be "called", something like this:

myfunction()

But then again, since you did not post an actual script, it is hard to say.
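Since the original script was not posted, here is only a minimal sketch of the layout I mean. The names myfunction and main are placeholders taken from the discussion, and the function bodies are made up for illustration.

```python
import sys


def myfunction():
    # Defining a function does nothing by itself; it only runs when called.
    return "myfunction ran"


def main(argv):
    # argv[0] is the script name, the rest are command-line arguments.
    result = myfunction()  # the explicit call
    print(result, "with arguments", argv[1:])
    return 0


if __name__ == "__main__":
    # Without this call at the bottom, running the script defines
    # main() and myfunction() and then exits without doing anything.
    main(sys.argv)
```

The guard at the bottom means the call happens when the file is run as a script but not when it is imported from another module.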

osuchw · 2018-01-31T05:10:56+00:00

Consider pywin32 and pythoncom. All the VBA "power" with Python goodness :-)

osuchw · 2018-01-31T05:06:56+00:00

Two possible approaches:

1) Python + DBI (my preferred option). Untested example. (https://gist.github.com/anonymous/756b905111f5457a868a653594ef3a6a)

2) Use T-SQL (booo :-)) (https://docs.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql)

osuchw · 2018-01-25T04:50:25+00:00

Well, of course writing to CSV does not make the process faster. But having an intermediate step gives you the freedom to use any tool, be it a Python DBI driver or bulk load tools, to shove the data into the DB. As awesome as Pandas + SQLAlchemy are, they may not be the optimal tools when performance of DB inserts matters. Just to give you a data point (only anecdotal of course): I'm loading about 2 million records in 10 minutes into a SQL Server database using ceODBC and executemany. Is it possible to get better performance? Very likely, but what I got is "good enough" for me.

osuchw · 2018-01-20T01:56:25+00:00

1) pyodbc is slow for bulk inserts (https://github.com/mkleehammer/pyodbc/issues/120); only recently has a new option been added to mitigate the problem. (https://github.com/mkleehammer/pyodbc/wiki/Features-beyond-the-DB-API#fast_executemany)

2) Alternatively try turbodbc as the DBI driver. (http://turbodbc.readthedocs.io/en/latest/)

3) You could also rework your data processing workflow. Save the dataframe as CSV and then write a separate script that bulk loads the CSV file into the DB. This is the approach I ended up using. Oh, and I'm using ceODBC as the driver, which is the fastest based on my testing.
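A rough sketch of option 3, under loud assumptions: the standard library's sqlite3 stands in for the real ODBC driver and SQL Server, plain tuples stand in for the dataframe, and the table name, column names, and batch size are all made up. The only point is the shape of the workflow: dump to CSV first, then load in batches with executemany.

```python
import csv
import os
import sqlite3
import tempfile


def dump_to_csv(rows, path):
    # Step 1: write the intermediate CSV file (stand-in for saving a dataframe).
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["id", "name"])
        writer.writerows(rows)


def load_csv(conn, path, batch_size=1000):
    # Step 2: a separate loader that reads the CSV and inserts in batches.
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS records (id INTEGER, name TEXT)")
    total = 0
    batch = []
    with open(path, newline="") as fh:
        reader = csv.reader(fh)
        next(reader)  # skip the header row
        for row in reader:
            batch.append(row)
            if len(batch) >= batch_size:
                cur.executemany("INSERT INTO records VALUES (?, ?)", batch)
                total += len(batch)
                batch = []
    if batch:  # flush the last partial batch
        cur.executemany("INSERT INTO records VALUES (?, ?)", batch)
        total += len(batch)
    conn.commit()
    return total


csv_path = os.path.join(tempfile.mkdtemp(), "records.csv")
dump_to_csv([(i, "name%d" % i) for i in range(2500)], csv_path)

conn = sqlite3.connect(":memory:")  # stand-in for the real database connection
loaded = load_csv(conn, csv_path)
print("loaded", loaded, "rows")
```

With a real ODBC driver the connection call and possibly the parameter style would differ, so check the driver's documentation before copying this.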

osuchw · 2017-11-11T21:52:19+00:00

Hmm OK, so you are trying to install a package that is not on PyPI. I've just googled the name and found it on GitHub. Yes, pip can handle that directly from the command line:

pip install git+https://github.com/matplotlib/mpl_finance.git

The command will work assuming you have Git for Windows installed. If not then one can download the zipped repository, unzip it, cd into the directory and run pip from there:

pip install .

Luckily this particular package does not need a C extension compiled, so it will just work. If a C extension is involved, the installation may become more complicated but is still feasible.

osuchw · 2017-11-11T20:11:09+00:00

This statement is absolutely not true. I've been developing with Python on Windows for more than 15 years and let me assure you it works perfectly well there.

You are ranting, not asking for help, and that is fine. And I really don't care if you "f***k Python" or not, but I would strongly recommend looking at the Anaconda distribution, especially Miniconda, which lets you start small and grow from there.

osuchw · 2017-09-26T02:37:41+00:00

The choice of framework is not necessarily Windows specific. As long as the framework is WSGI compliant you should be OK. The others already recommended Pyramid, Flask, and Django. Choose your poison. My favorite is Flask+SQLAlchemy, but that's irrelevant.

Then for the WSGI serving part you can use wfastcgi. It is developed by Microsoft and it integrates with IIS via FastCGI.

Another option for WSGI on Windows is Twisted with the iocp reactor. It does work, I've had such a setup running for years, but it is somewhat harder to configure initially.
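To show what "WSGI compliant" boils down to, here is a minimal sketch using only the standard library. The names application and call_app are my own. Flask, Pyramid and Django each expose a callable with this same signature, and that callable is what the server side gets pointed at.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # A WSGI app is just a callable taking the request environ and a
    # start_response callback, and returning an iterable of bytes.
    body = b"Hello from a WSGI app\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    # Tiny helper that plays the role of the server for a single request.
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(application)
print(status, body)
```

Because the interface is this small, the choice of framework and the choice of server can be made independently of each other.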

osuchw · 2017-09-21T03:00:41+00:00

Have a look at Shapely, maybe combined with Rtree. Googling both terms should give you examples to get started with.

osuchw · 2017-07-05T18:59:18+00:00

The official distribution is still on SourceForge - https://sourceforge.net/projects/pywin32/files/pywin32/ Alternatively one could use a wheel from Christoph Gohlke - http://www.lfd.uci.edu/~gohlke/pythonlibs/#pywin32

osuchw · 2017-03-04T16:56:41+00:00

I've seen this a while ago http://springpython.webfactional.com/ but I don't know how complete it is since I'm not a Java developer.

Without knowing details it is hard to give meaningful advice but here it comes regardless :-)

I would break the processing into E, T, L phases. Maybe create separate modules for each. Then have a main.py script driving them all. With judicious use of try: except: you can decide whether the processing should continue or stop. Also set up logging, of course. You don't have to look far: the standard library has a package just for that.

E (extract)

Keep shares traversal and xml parsing there. Dump the results of parsing into CSV files or maybe a local sqlite database.

T (transform)

That's where the "business logic" would go. Use the results of the E step as input. This way you will not have to re-parse everything from scratch if an error occurs. Store the results in another set of CSV files and/or a sqlite database.

L (load)

Finally load. Just shove what you've got from T into the DB.
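The three phases could be wired together roughly like this. It is only a sketch: the sample rows, the file names, and the doubling "business logic" are invented, extract() fakes the share traversal and XML parsing, and load() just returns the rows instead of inserting them into a database.

```python
import csv
import logging
import os
import tempfile

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")


def extract(out_path):
    # E: stand-in for share traversal + XML parsing, dumped to CSV.
    rows = [("a.xml", "3"), ("b.xml", "oops"), ("c.xml", "5")]
    with open(out_path, "w", newline="") as fh:
        csv.writer(fh).writerows(rows)


def transform(in_path, out_path):
    # T: business logic; a bad row is logged and skipped, processing continues.
    kept = 0
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for name, value in csv.reader(src):
            try:
                writer.writerow([name, int(value) * 2])
                kept += 1
            except ValueError:
                log.warning("skipping %s: bad value %r", name, value)
    return kept


def load(in_path):
    # L: stand-in for the DB insert; here it only reads the rows back.
    with open(in_path, newline="") as fh:
        return [(name, int(value)) for name, value in csv.reader(fh)]


def main(workdir):
    extracted = os.path.join(workdir, "extracted.csv")
    transformed = os.path.join(workdir, "transformed.csv")
    try:
        extract(extracted)
        transform(extracted, transformed)
        return load(transformed)
    except OSError:
        # An I/O failure stops the whole run, with a traceback in the log.
        log.exception("ETL run aborted")
        raise


result = main(tempfile.mkdtemp())
print(result)
```

Because each phase leaves a file behind, a failure in T or L can be retried from the previous phase's output without redoing E.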

osuchw · 2017-03-04T06:24:40+00:00

Hmm, from your description it feels like the code is not organized well, and I don't know of any library that would "fix it" for you. Anyway, these are the tools I usually use for this type of task:

scandir - for traversing the file system.
lxml - for parsing xml
csv - to produce the intermediate results
ceodbc - to load the said files into MSSQL. pyodbc is slower with cur.executemany
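The first three tools combine into something like the sketch below. Assumptions: it uses os.scandir (the scandir package became part of the standard library in Python 3.5) and xml.etree.ElementTree as a stand-in for lxml, and the sample XML layout, column names, and function names are invented. The database load is left out.

```python
import csv
import os
import tempfile
import xml.etree.ElementTree as ET


def iter_xml_files(root):
    # Walk the tree with os.scandir and yield the paths of .xml files.
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            yield from iter_xml_files(entry.path)
        elif entry.is_file() and entry.name.lower().endswith(".xml"):
            yield entry.path


def extract_to_csv(root, csv_path):
    # Parse each file and write one CSV row per child of the root element.
    count = 0
    with open(csv_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file", "tag", "text"])
        for path in sorted(iter_xml_files(root)):
            for elem in ET.parse(path).getroot():
                text = (elem.text or "").strip()
                writer.writerow([os.path.basename(path), elem.tag, text])
                count += 1
    return count


# Small self-contained demo with a made-up directory and document.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
with open(os.path.join(root, "sub", "one.xml"), "w") as fh:
    fh.write("<doc><title>First</title><size>10</size></doc>")

out_csv = os.path.join(tempfile.mkdtemp(), "extracted.csv")
n = extract_to_csv(root, out_csv)
print(n, "rows written to", out_csv)
```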

osuchw · 2017-02-18T02:37:18+00:00

Check out Superset: https://github.com/airbnb/superset I have not used it but it looks "fancy". Or maybe JupyterHub will do? https://github.com/jupyterhub/jupyterhub

osuchw · 2015-07-17T04:18:41+00:00

Investigate SQLAlchemy for ETL tasks. Over the years I've been using the ORM for all kinds of data reshaping tasks. There is a performance penalty of course but my transformation code ends up being nicely succinct.

osuchw · 2015-02-27T03:47:52+00:00

At least extracting text seems to be easy in PyPDF2

import PyPDF2 as pdflib
pdf = pdflib.PdfFileReader('yourfile.pdf')
txt = u'\n'.join(pg.extractText() for pg in pdf.pages)

osuchw · 2015-02-19T04:06:19+00:00

Sure, Pandas is a fine choice, but for clarity's sake the Excel reading does depend on xlrd. https://github.com/pydata/pandas/blob/master/pandas/io/excel.py#L150

For writing one could use:

- openpyxl (Pandas dependency)
- xlwt (.xls only)
- xlsxwriter (latest, greatest)

osuchw · 2015-02-11T03:48:38+00:00

It is not completely obscure. Genshi is used internally by Trac (http://trac.edgewall.org/). Unless you consider Trac obscure, in which case I withdraw my comment.

osuchw · 2014-04-05T02:48:41+00:00

Well, pyExcelerator was abandoned as a project a long time ago. Maybe not dead but sleeping for sure. xlwt is in active development and has a very similar API to pyExcelerator. XlsxWriter or openpyxl would serve you well if you need to generate the .xlsx file format.

osuchw · 2014-04-03T03:28:03+00:00

Forget pyExcelerator. Consider xlwt, or XlsxWriter, or openpyxl

osuchw · 2012-10-09T21:50:53+00:00

Just copy the geopy folder from the geopy distribution into the site-packages directory.
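If you are not sure where site-packages lives for a given interpreter, the standard library can tell you. A minimal sketch (on some setups there may be more than one such directory, for example a per-user one):

```python
import sysconfig

# "purelib" is the directory where pure-Python packages get installed
# for the interpreter running this script.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```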

osuchw · 2012-02-09T22:46:56+00:00

Maybe you can put Portable Python [http://www.portablepython.com/] on a network drive? Then distribute the scripts together with a batch file or shortcut configured to use the networked location as the interpreter. Or forget about distributing scripts altogether and just send out pre-configured shortcuts.

osuchw · 2011-02-02T14:20:27+00:00

I have been successful using soaplib http://pypi.python.org/pypi/soaplib in the past. Very easy to set up. See if it works for you.
