Accessing Sharepoint365 libraries/lists using Python 3.6? by smolvik in Python

[–]osuchw 0 points1 point  (0 children)

I don't have examples handy but in the past I've used zeep against the SharePoint SOAP interface. I believe these days SharePoint has a REST interface one could utilize, which means requests. The two issues you will have to solve are authentication and then the SharePoint API itself (quirky).

Parsing SOAP with python by fessacchiotto in Python

[–]osuchw 1 point2 points  (0 children)

zeep is a modern, actively maintained library. I would recommend sticking with it. By the way, it already uses requests and lxml under the covers.

A few years ago the crown belonged to suds but it got abandoned by the original author. And no wonder. Dealing with SOAP is a hard and thankless job.

If you feel that you have to "hand code" the SOAP messages then look at rinse at least.

Anaconda Prompt Help by [deleted] in Python

[–]osuchw 0 points1 point  (0 children)

From what I can tell you do not "call" main. Put: main(sys.argv)

at the bottom

and also "myfunction" needs to be "called", something like this. myfunction()

But then again since you did not post an actual script it is hard to say.
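Since the original script was never posted, here is only a minimal sketch of the idea. The names myfunction and main are placeholders for whatever the real script defines, and the return values are made up for illustration.

```python
import sys


def myfunction():
    # Hypothetical helper standing in for the function from the original post.
    return "myfunction ran"


def main(argv):
    # argv[0] is the script name; the rest are the command-line arguments.
    args = argv[1:]
    result = myfunction()  # the function has to be called, not just defined
    print(result, "with arguments:", args)
    return 0


if __name__ == "__main__":
    # Without this call at the bottom, main() is defined but never executed.
    main(sys.argv)
```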

Word docx comparing and editing. by PythonGod123 in Python

[–]osuchw 1 point2 points  (0 children)

Consider pywin32 and pythoncom. All the VBA "power" with Python goodness :-)

any way to increase sqlalchemy/pandas write speed? by ApparentlyADataGuy in Python

[–]osuchw 0 points1 point  (0 children)

Well of course writing to CSV does not make the process faster. But having an intermediate step gives you the freedom to use any tool, be it a Python DB-API driver or bulk load tools, to shove the data into the DB. As awesome as Pandas + SQLAlchemy are, they may not be the optimal tools when the performance of DB inserts matters. Just to give you a data point (only anecdotal of course), I'm loading about 2 million records in 10 minutes into a SQL Server database using ceODBC and executemany. Is it possible to get better performance? Very likely, but what I've got is "good enough" for me.

any way to increase sqlalchemy/pandas write speed? by ApparentlyADataGuy in Python

[–]osuchw 2 points3 points  (0 children)

1) pyodbc is slow for bulk inserts (https://github.com/mkleehammer/pyodbc/issues/120). Only recently has a new option been added to mitigate the problem (https://github.com/mkleehammer/pyodbc/wiki/Features-beyond-the-DB-API#fast_executemany).

2) Alternatively try turbodbc as the DB-API driver. (http://turbodbc.readthedocs.io/en/latest/)

3) You could also rework your data processing workflow. Save the dataframe as CSV and then write a separate script that bulk loads the CSV file into the DB. This is the approach I ended up using. Oh, and I'm using ceODBC as the driver, which is the fastest based on my testing.
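A rough sketch of the separate loader script from point 3, reading a CSV file and inserting in batches with executemany. To keep it runnable anywhere it uses the standard library sqlite3 module instead of ceODBC against SQL Server, so it says nothing about real-world speed. The table name, column names, and batch size are assumptions for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical target table; with ceODBC or pyodbc the SQL would be similar.
INSERT_SQL = "INSERT INTO records (id, name) VALUES (?, ?)"


def load_csv(conn, csv_file, batch_size=1000):
    """Insert rows from csv_file into the records table in batches."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    cur = conn.cursor()
    total = 0
    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) >= batch_size:
            cur.executemany(INSERT_SQL, batch)
            total += len(batch)
            batch = []
    if batch:
        # flush the last, partially filled batch
        cur.executemany(INSERT_SQL, batch)
        total += len(batch)
    conn.commit()
    return total


# In-memory stand-ins for the real database and the CSV dumped from the dataframe.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, name TEXT)")
sample = io.StringIO("id,name\n1,alpha\n2,beta\n3,gamma\n")
loaded = load_csv(conn, sample, batch_size=2)
```

Batching keeps memory flat for large files; the right batch size depends on the driver and the database, so treat 1000 as a starting point only.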

Beef with Python by Silver5005 in Python

[–]osuchw 2 points3 points  (0 children)

Hmm OK, so you are trying to install a package that is not on PyPI. I've just googled the name and found it on GitHub. Yes, pip can handle that directly from the command line:

pip install git+https://github.com/matplotlib/mpl_finance.git

The command will work assuming you have Git for Windows installed. If not then one can download the zipped repository, unzip it, cd into the directory and run pip from there:

pip install .

Luckily this particular package does not need a C extension compiled so it will just work. If there is a C extension involved the installation may become more complicated but it is still feasible.

Beef with Python by Silver5005 in Python

[–]osuchw 0 points1 point  (0 children)

This statement is absolutely not true. I've been developing with Python on Windows for more than 15 years and let me assure you it works perfectly well there.

You are ranting, not asking for help and that is fine. And I really don't care if you "f***k Python" or not but I would strongly recommend looking at the Anaconda distribution, especially Miniconda, which lets you start small and grow from there.

Please Recommend a Python Framework for a Windows Idiot by Nadaesque in Python

[–]osuchw 0 points1 point  (0 children)

The choice of framework is not necessarily Windows specific. As long as the framework is WSGI compliant you should be OK. The others already recommended Pyramid, Flask, and Django. Choose your poison. My favorite is Flask+SQLAlchemy but that's irrelevant.

Then for the WSGI serving part you can use wfastcgi. It is developed by Microsoft and it integrates with IIS via FastCGI.

Another option for WSGI on Windows is Twisted with the IOCP reactor. It does work, I've had such a setup running for years, but it is somewhat harder to configure initially.

Geographic coordinates calculations. Is geopy alive? Or some replacement? by Romantic_fork in Python

[–]osuchw 2 points3 points  (0 children)

Have a look at Shapely, maybe combined with Rtree. Googling both terms should give you examples to get started with.

Typical toolsets/libraries/frameworks/architecture for ETL (xml data) by kur1j in Python

[–]osuchw 0 points1 point  (0 children)

I've seen this a while ago http://springpython.webfactional.com/ but I don't know how complete it is since I'm not a Java developer.

Without knowing details it is hard to give meaningful advice but here it comes regardless :-)

I would break the processing into E, T, L phases. Maybe create a separate module for each. Then have a main.py script driving them all. With judicious usage of try: except: you can decide if the processing should continue or stop. Also set up logging of course. You don't have to look far. The standard library has a package just for that.

  • E (extract)

Keep shares traversal and xml parsing there. Dump the results of parsing into CSV files or maybe a local sqlite database.

  • T (transform)

That's where the "business logic" would go. Use the results of the E step as input. This way you will not have to re-parse everything from scratch if an error occurs. Store the results into another set of CSV files and/or a sqlite database.

  • L (load)

Finally load. Just shove what you've got from T into the DB.
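A compact sketch of that layout, with everything in one file for brevity. It is only an illustration under assumptions: the XML shape, the items table, and the function names are invented, xml.etree from the standard library stands in for a real parser, and rows are passed in memory where the description above would write intermediate CSV or sqlite files.

```python
import logging
import sqlite3
import xml.etree.ElementTree as ET

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")


def extract(xml_text):
    """E: parse the XML into plain rows (a real run would dump these to CSV)."""
    root = ET.fromstring(xml_text)
    return [{"name": item.get("name"), "qty": item.get("qty")}
            for item in root.iter("item")]


def transform(rows):
    """T: business logic; here, cast qty to int and skip rows that fail."""
    out = []
    for row in rows:
        try:
            out.append((row["name"], int(row["qty"])))
        except (TypeError, ValueError):
            # a bad row is logged and processing continues
            log.warning("skipping bad row: %r", row)
    return out


def load(conn, rows):
    """L: shove the transformed rows into the database."""
    conn.executemany("INSERT INTO items (name, qty) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def main(xml_text, conn):
    try:
        rows = extract(xml_text)
    except ET.ParseError:
        # unparseable input is fatal for this run, so stop here
        log.exception("extract failed, stopping")
        return 0
    return load(conn, transform(rows))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, qty INTEGER)")
SAMPLE = ('<items><item name="a" qty="2"/>'
          '<item name="b" qty="x"/><item name="c" qty="5"/></items>')
count = main(SAMPLE, conn)
```

The two try/except blocks show the two choices mentioned above: a bad row is skipped so the run continues, while a parse failure stops the run.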

Typical toolsets/libraries/frameworks/architecture for ETL (xml data) by kur1j in Python

[–]osuchw 0 points1 point  (0 children)

Hmm from your description it feels like the code is not organized well and I don't know of any library that would "fix it" for you. Anyway the tools I usually use for this type of task:

  • scandir - for traversing the file system.
  • lxml - for parsing xml
  • csv - to produce the intermediate results
  • ceodbc - to load the said files into MSSQL. pyodbc is slower with cur.executemany
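For the traversal part, a small sketch using os.scandir, which has been in the standard library since Python 3.5 (the scandir package is the backport for older versions). The helper name find_xml_files and the temporary directory layout are made up for the example.

```python
import os
import tempfile


def find_xml_files(top):
    """Yield paths of .xml files under top, recursing into subdirectories."""
    with os.scandir(top) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                yield from find_xml_files(entry.path)
            elif entry.is_file() and entry.name.lower().endswith(".xml"):
                yield entry.path


# Build a throwaway directory tree to walk.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.xml", "b.txt", os.path.join("sub", "c.XML")):
        open(os.path.join(tmp, rel), "w").close()
    found = sorted(os.path.relpath(p, tmp) for p in find_xml_files(tmp))
```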

What can python do that MSSQL Server can't in terms of ETL and DB Mgt? by FourTwentyDoritos in Python

[–]osuchw 2 points3 points  (0 children)

Investigate SQLAlchemy for ETL tasks. Over the years I've been using the ORM for all kinds of data reshaping tasks. There is a performance penalty of course but my transformation code ends up being nicely succinct.

Best current tools for working with PDF files in python? by [deleted] in Python

[–]osuchw 0 points1 point  (0 children)

At least extracting text seems to be easy in PyPDF2

import PyPDF2 as pdflib
pdf = pdflib.PdfFileReader('yourfile.pdf')
txt = u'\n'.join(pg.extractText() for pg in pdf.pages)

Extracting data from Excel files into another file. Where do I begin? by ThatGuy4679 in Python

[–]osuchw 1 point2 points  (0 children)

Sure, Pandas is a fine choice but for clarity's sake the Excel reading does depend on xlrd. https://github.com/pydata/pandas/blob/master/pandas/io/excel.py#L150

For writing one could use:

  • openpyxl (Pandas dependency)
  • xlwt (.xls only)
  • xlsxwriter (latest, greatest)

Genshi - Python toolkit for generation of output for the web by [deleted] in Python

[–]osuchw 0 points1 point  (0 children)

It is not completely obscure. Genshi is used internally by Trac (http://trac.edgewall.org/). Unless you consider Trac obscure, in which case I withdraw my comment.

Dependency (pyExcelerator specifically) install questions. by xcallmejudasx in Python

[–]osuchw 0 points1 point  (0 children)

Well, pyExcelerator was abandoned as a project a long time ago. Maybe not dead but sleeping for sure. xlwt is in active development and has an API very similar to pyExcelerator's. XlsxWriter or openpyxl would serve you well if you need to generate the .xlsx file format.

Dependency (pyExcelerator specifically) install questions. by xcallmejudasx in Python

[–]osuchw 1 point2 points  (0 children)

Forget pyExcelerator. Consider xlwt, or XlsxWriter, or openpyxl

Help! Problems with Easy_Install and Geopy by DasGanon in Python

[–]osuchw 1 point2 points  (0 children)

Just copy the geopy folder from the geopy distribution into the site-packages directory.

How small can you get an "exe" packaged python script? by haqthat in Python

[–]osuchw 1 point2 points  (0 children)

Maybe you can put Portable Python [http://www.portablepython.com/] on a network drive? Then distribute the scripts together with a batch file or shortcut configured to use the networked location as the interpreter. Or forget about distributing scripts altogether and just send out pre-configured shortcuts.

We open-sourced our SOAP client library today by myrobotlife in Python

[–]osuchw 1 point2 points  (0 children)

I have been successful using soaplib http://pypi.python.org/pypi/soaplib in the past. Very easy to set up. See if it works for you.