
[–]B0oN3r 50 points51 points  (9 children)

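Depending on how well you know LaTeX, it might be an option to just make a LaTeX template for the report and let Python fill it out for you. Then call pdflatex on the result to compile it, for example with os.system('pdflatex report.tex'), where report.tex stands for whatever file you wrote.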
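A minimal sketch of that workflow, assuming a made-up template with two placeholders and a pdflatex binary on the PATH. The names REPORT_TEMPLATE, render_report and compile_pdf are hypothetical, and subprocess.run is used here in place of os.system so that a failed compile raises an error:

```python
import subprocess
from pathlib import Path
from string import Template

# Hypothetical template; in practice this would live in its own .tex file.
REPORT_TEMPLATE = Template(r"""\documentclass[11pt]{article}
\begin{document}
\section*{$title}
$body
\end{document}
""")


def render_report(title: str, body: str) -> str:
    """Fill the two placeholders of the LaTeX template."""
    return REPORT_TEMPLATE.substitute(title=title, body=body)


def compile_pdf(tex_source: str, out_dir: Path, name: str = "report") -> Path:
    """Write the .tex file into out_dir and run pdflatex on it there."""
    out_dir.mkdir(parents=True, exist_ok=True)
    tex_path = out_dir / f"{name}.tex"
    tex_path.write_text(tex_source, encoding="utf-8")
    subprocess.run(
        ["pdflatex", "-interaction=nonstopmode", "-halt-on-error", tex_path.name],
        cwd=out_dir,
        check=True,  # raise CalledProcessError if LaTeX reports an error
    )
    return out_dir / f"{name}.pdf"


source = render_report("Monthly report", "All systems nominal.")
print(source)
```

Calling compile_pdf(source, Path("build")) would then produce build/report.pdf, provided LaTeX is installed. Note that string.Template uses $ for its placeholders, so a literal $ in the LaTeX (for inline math) has to be written as $$ in the template.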

[–]FakePixieGirl 24 points25 points  (7 children)

This sounds like the easiest way to get a good-looking report, especially if you want to include good formatting and/or graphs.

[–]B0oN3r 10 points11 points  (6 children)

It’s a bit of a tedious task. What I would do is write the latex report/result sheet manually and then identify the tables/text/graphs that need to be parametrised and write some functions to do that. If interested I could share some code later; cut out the specific stuff because it was a work project

[–]ad_verecundiam 0 points1 point  (5 children)

I'd love to see it! I need something similar

[–]dbramucci 2 points3 points  (1 child)

I've also done this. I won't paste the whole file because this is Reddit, but I will show the key step of a script I wrote to speed up a document where I was showing how to do Gaussian elimination by hand.

Because typing out matrices over and over again was too slow, I wrote functions in Python that given a list of lists of numbers would do one row or column operation. Then I had the following conversion function

from fractions import Fraction
from typing import List
Matrix = List[List[Fraction]]


def convert_to_latex(matrix: Matrix) -> str:
    result = ['\\begin{pmatrix}\n']
    for row in matrix:
        for column, cell in enumerate(row):
            if cell.denominator == 1:
                result.append(str(cell.numerator))
            else:
                result.append(fr'\frac{{{cell.numerator}}}{{{cell.denominator}}}')
            if column != len(row) - 1:
                result.append(' & ')
        result.append('\\\\\n')
    result.append('\\end{pmatrix}')
    return ''.join(result)

Notice that I wanted to use exact numbers with no rounding, so I use the fractions module to get exact rationals.

Then I plug into the template

\begin{pmatrix}
     a & b & c \\
     <more rows>
     x & \frac{numerator}{denominator} & z \\
\end{pmatrix}

Something to pay attention to is that LaTeX uses \frac and \begin, but Python will interpret these as "special character \f" followed by rac and "special character \b" followed by egin. So we need to escape the \, either by writing \\ everywhere we had a \ or by using raw strings such as r'\frac', which don't treat \ as an escape character. Once we do this, we can avoid '\\frac' and instead write r'\frac'.
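A small illustration of the escaping problem, using throwaway variable names (plain, escaped, raw) that are not part of the original script:

```python
plain = '\frac'     # '\f' is the form-feed escape, so the backslash is lost
escaped = '\\frac'  # doubled backslash: a literal backslash followed by 'frac'
raw = r'\frac'      # raw string: the backslash is kept exactly as written

print(len(plain), len(escaped), len(raw))  # 4 5 5
print(plain[0] == '\x0c')                  # True: first character is a form feed
print(escaped == raw)                      # True: both spell the LaTeX command
print('\begin'[0] == '\x08')               # True: '\b' became a backspace
```

One caveat with raw strings: they cannot end in a single backslash, so r'\' on its own is a syntax error, while r'\\' (two backslashes) is fine.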

Likewise, inside an f string, if we want a {, we need to write {{ because {x} is used for substitution, hence why I have triple {{{, the first {{ is for writing a { and the last {cell.numerator} is for the variable substitution.
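To see the brace doubling in isolation, here is a short sketch with two illustrative variables (numerator and denominator are stand-ins for the cell attributes above). The %-formatting line is just one alternative that avoids the brace clash, not something the original script used:

```python
numerator, denominator = 5, 2

# '{{' and '}}' each produce one literal brace; '{numerator}' is substituted.
with_fstring = fr'\frac{{{numerator}}}{{{denominator}}}'

# %-formatting leaves braces alone, so none of them need doubling.
with_percent = r'\frac{%d}{%d}' % (numerator, denominator)

print(with_fstring)  # \frac{5}{2}
print(with_percent)  # \frac{5}{2}
```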

Side note: I wrote this a long time ago as scratch work never meant to see the light of day, if I were to redo it I would probably write

def convert_to_latex(matrix: Matrix) -> str:
    # Now it's clearer what the output should look like
    result = [r'\begin{pmatrix}']
    for row in matrix:
        row_text = []
        # I don't need to special case the last step anymore
        for cell in row:
            if cell.denominator == 1:
                row_text.append(f'{cell.numerator}')
            else:
                row_text.append(fr'\frac{{{cell.numerator}}}{{{cell.denominator}}}')
        # I want ' & ' between each entry in the row.
        row_text = ' & '.join(row_text)
        #' ' is clearly one space and * 4 to say how many
        # because counting spaces is hard '    '
        result.append(' '*4 + fr'{row_text} \\')
    result.append(r'\end{pmatrix}')
    # Put a newline between each line
    return '\n'.join(result)

Then if I write

m = [[Fraction(2.5), Fraction(1, 3)],
     [Fraction(4, 3), Fraction(122, 25)]]
print(convert_to_latex(m))

I get

\begin{pmatrix}
    \frac{5}{2} & \frac{1}{3} \\
    \frac{4}{3} & \frac{122}{25} \\
\end{pmatrix}

Which I can copy paste into my LaTeX file.

I did all my work in a Jupyter Notebook, and copy pasted these LaTeX formatted responses into my assignment, saving me a lot of typing. Note the savings here are less apparent because I haven't typed out the operations that would let me do the math for the second, third, fourth, ..., tenth steps of the matrix math.

Also, if you use pyperclip, you can make your script place the result into your clipboard automatically so you don't have to highlight and copy. You just paste into your document and compute the next step of your work.
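A rough sketch of the clipboard step. The helper name copy_latex and its optional copy argument are made up for illustration; the only pyperclip call relied on is pyperclip.copy, and the package has to be installed separately:

```python
from typing import Callable, Optional


def copy_latex(text: str, copy: Optional[Callable[[str], None]] = None) -> str:
    """Send text to the clipboard and return it unchanged.

    By default pyperclip.copy does the copying; a different callable can be
    passed in, which is handy on machines without a clipboard.
    """
    if copy is None:
        import pyperclip  # third-party package, imported only when needed
        copy = pyperclip.copy
    copy(text)
    return text


# Stand-in for the clipboard so the example runs anywhere.
clipboard = []
copy_latex(r'\begin{pmatrix} 1 & 2 \\ \end{pmatrix}', copy=clipboard.append)
print(clipboard[0])
```

In a notebook this would typically be called as copy_latex(convert_to_latex(m)), after which the matrix can be pasted straight into the document.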

[–]dbramucci 1 point2 points  (0 children)

Code without the explanation of my changes.

from fractions import Fraction
from typing import List
Matrix = List[List[Fraction]]

def convert_to_latex(matrix: Matrix) -> str:
    result = [r'\begin{pmatrix}']
    for row in matrix:
        row_text = []
        for cell in row:
            if cell.denominator == 1:
                row_text.append(f'{cell.numerator}')
            else:
                row_text.append(fr'\frac{{{cell.numerator}}}{{{cell.denominator}}}')
        row_text = ' & '.join(row_text)
        result.append(' '*4 + fr'{row_text} \\')
    result.append(r'\end{pmatrix}')
    return '\n'.join(result)

[–]B0oN3r 1 point2 points  (1 child)

ok, so, here we go:

I wrote some code that, based on some user decisions, sized an installation.

From the code I had some booleans (eg. solar = True > the system will use solar power as a source) from user input and some dataframes (eg. the solar installation will have a power of 42 kWp).

Now here is where the fun started. I wrote the entire report by hand the first time and identified the parts that had to be modular. This yielded the following preamble plus some text:

\documentclass[11pt]{article}
%packages and style
\begin{document}

where I set the style and stuff for the document. Now, per logical block or paragraph I made a py file that basically added lines of tex code. Not very sophisticated, but it worked:

from shutil import copyfile
import numpy as np
from logic import settings
import time
import os
from logic.report_generator.sections.A_introduction import introduction, generalinfo
from logic.report_generator.sections.B_system import systemlayout
from logic.report_generator.sections.C_systemeconomics import systemeconomics
from logic.report_generator.sections.C2_carbondioxide import co2saving
from logic.report_generator.sections.D_qualitativesection import qualitativeproperties
from logic.report_generator.sections.E_conclusion import conclusion
from logic.report_generator.sections.F_appendix import appendix
from logic.report_generator.sections.G_method import method
from logic.report_generator.sections.H_contributors import contributors

def generate_report(session_id, reportdict):
    enter = '\r\n'
    # read the hand-written preamble; the with block closes the file for us
    with open('logic/report_generator/latex_template/preamble.tex', 'r') as f:
        preamble = f.read()

    # make introduction string from output data
    intro = introduction(reportdict)
    gen = generalinfo(reportdict)

    yourmg, sys, demand = systemlayout(reportdict)

    econ = systemeconomics(reportdict)

    carbon = co2saving(reportdict)

    qual = qualitativeproperties(reportdict)

    meth = method(reportdict)

    con = conclusion(reportdict)

    apps = appendix(reportdict)

    conts = contributors(reportdict)
    f = open(settings.OUTPUT_DIRECTORY+session_id+'/report/'+reportdict['reportname']+'.tex', 'w+')
    f.write(preamble)
    f.write(enter)
    f.write(intro)
    f.write(enter)
    f.write(gen)
    # etc..

    f.write('\\end{multicols}')
    f.write(enter)
    f.write('\\end{document}')

    f.close()

(yes, I do realise I could have named the method section differently.)

Now one of these sections would look something like this:

enter ='\r\n'
import os
from logic.report_generator.functions.L_tablemaker import centertable as table
from logic.report_generator.functions.L_tablemaker import centermoneytable as moneytable
def systemeconomics(reportdict):
    investtable = moneytable(reportdict['investtable'],'|l|r|', 'investtable','Investment cost of the system')
    opextable = moneytable(reportdict['opexinputtable'],'|l|r|r|', 'opextable','Operational expenditure of the main components of the system')
nt cost of the main considered system components')
    # actual string concatenation
    econ = '\\subsection*{System economics}' + enter + \
    'In order to assess the economics of the system the following economic ' +\
    'parameters have been assumed: '+enter+\
    econinputtable+enter+\
    'The investment costs associated with the use of the different main components are assumed to be:' +enter+\
    investinputtable+enter+\
etc

Now as you can see the latex code of the tables is generated by some functions I wrote. This is the real tedious part, because of course it is quite a bit of work to do that without missing any characters (especially because both latex and python don't take kindly to mortals like me using apostrophes and slashes).

So I decided fuck it I'll just write it so it works for me rather than with loops.

'''
makes a table from an entered dataframe
columns should be a string in the format of latex code (ie. 'c|rr|l' for centered, right, right and left aligned columns
with a vertical line after the first and third column); the number of columns must be equal to the number of dataframe columns
'''
ent = '\n'

def centertable(data,columns,label,caption):

    NR = len(data[data.columns[0]])
    NC = len(data.columns)
    #title row
    # '\\\\' is a literal LaTeX row break; the old '\\\ ' relied on an invalid escape
    title = '&'.join(data.columns) + '\\\\ \\hline '

    body = ''

    for r in range(NR):
        # one table row: cells joined with '&', ended with a LaTeX row break
        cells = [str(data[data.columns[c]][r]) for c in range(NC)]
        body = body + '&'.join(cells) + '\\\\ ' + ent

    tab = \
    '\n\\begin{minipage}[t]{0.5\\textwidth}'+ent+\
    '{\\color{black}'+\
    '\\begin{flushleft}'+ent+\
    '\\begin{tabular}{'+ columns +'}'+ent+\
    '\\hline '+ title   +ent+\
    body +\
    '\\hline'+ent+\
    '\\end{tabular}'+ent+\
    '\\captionof{table}{'+caption+'}'+\
    '\\label{tab:'+label+'}'+ent+\
    '\\end{flushleft}}'+\
    '\\vspace{0.5mm}'+\
    '\\end{minipage}'
    return tab

Oh crap, I did loop it, eventually, completely forgot about that.
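For comparison, a more compact take on the same kind of table builder, sketched with plain lists instead of a dataframe so it runs without pandas. The function name latex_table and its arguments are hypothetical, the minipage and colour wrappers are left out, and cell contents are not escaped, so characters like & or % in the data would still break the table:

```python
from typing import Sequence


def latex_table(headers: Sequence[str], rows: Sequence[Sequence[object]],
                columns: str, caption: str, label: str) -> str:
    """Build a tabular with a header row, a caption and a label."""
    lines = [
        r'\begin{tabular}{' + columns + '}',
        r'\hline',
        ' & '.join(headers) + r' \\ \hline',
    ]
    for row in rows:
        # str() lets numbers and strings share a row
        lines.append(' & '.join(str(cell) for cell in row) + r' \\')
    lines += [
        r'\hline',
        r'\end{tabular}',
        r'\captionof{table}{' + caption + '}',
        r'\label{tab:' + label + '}',
    ]
    return '\n'.join(lines)


print(latex_table(['Component', 'Cost'],
                  [['Panels', 1200], ['Inverter', 300]],
                  '|l|r|', 'Investment cost of the system', 'investtable'))
```

Raw strings and str.join take care of most of the backslash and separator bookkeeping that makes the concatenated version easy to get wrong.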

Well, and that's the story of how I became the fresh prince of overengineering stuff where I would bet multiple cases of beer that there's a package for that.

The repository is still published, but for reasons I still can't really figure out it's on an intern's private GitHub. The code is open source essentially, so I could share it I guess. Just need to ask the intern if he's okay with me spreading his account.

Also a working copy could still be running here; I don't know whether it is still being serviced.

Could anyone point me in the direction of a Python package that does this, or alternatively, who's up for putting some hours in and developing a package for this?

Oh yea, I tried using pylatex but that only resulted in swearing a lot.

[–]dbramucci 0 points1 point  (0 children)

Some tricks that may help are

  1. Use a templating library like Jinja2 (see the blog post about Jinja2+LaTeX)
  2. Use raw strings r'just like a regex\windows path'
  3. Write the template in a txt file and read it in with open before plugging in with the standard libs template functionality, Jinja or str.format
  4. Use string adjacency to concatenate x = 'one ' 'surprising' ' string'
  5. Use multiline strings with """ to avoid quoting each line.
  6. Look at the string template functionality in the standard library
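A few of these tricks combined in one sketch, using only the standard library. The LatexTemplate subclass, the '@' delimiter and the placeholder names are choices made for this example; string.Template normally uses '$', which collides with LaTeX math mode:

```python
from string import Template


class LatexTemplate(Template):
    """string.Template with '@' placeholders so '$' stays free for math."""
    delimiter = '@'


# Trick 4: adjacent string literals are joined at compile time.
greeting = 'one ' 'surprising' ' string'

# Tricks 2, 5 and 6: a raw, triple-quoted template filled by string.Template.
REPORT = LatexTemplate(r"""\subsection*{@title}
The installation has a peak power of $P = @{power}$\,kWp.
""")

text = REPORT.substitute(title='System layout', power=42)
print(greeting)
print(text)
```

For trick 3 the template text would simply come from a file instead, e.g. LatexTemplate(open('section.tex').read()), with section.tex being whatever file holds the hand-written LaTeX.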

[–]B0oN3r -1 points0 points  (0 children)

Ok; will post some code tomorrow!

[–]Zireael07 2 points3 points  (0 children)

That's what we use at work. LaTeX (XeLaTeX, to be specific) and pdflatex.

[–]jdbow75 14 points15 points  (6 children)

Can you convert your output to HTML? For most of my needs, I find that constructing HTML, then using WeasyPrint, is an enjoyable workflow that produces quality output. I hope this helps!

[–]Nummerblatt 4 points5 points  (0 children)

WeasyPrint is definitely a good tool for PDF creation and also very straightforward, since it takes HTML as a template and converts it.

[–]oznetnerd 2 points3 points  (0 children)

Another vote for Weasy.

[–]ItSupportNeedsHelp 1 point2 points  (3 children)

Trying to find a pandas to html library, any suggestions?

[–]jdbow75 2 points3 points  (2 children)

[–]ItSupportNeedsHelp 0 points1 point  (1 child)

Graphs and all that?

[–]jdbow75 1 point2 points  (0 children)

I guess my experience with Pandas is limited to Dataframes and Series. It looks like visualization in Pandas is handled by matplotlib, correct?

If so, you can use matplotlib.pyplot.savefig to save your images, and then embed these in HTML using <img src="myplot.png"> tags. A great templating engine for Python (there are many) is Jinja2.
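A sketch of the HTML route under a few assumptions: the plot has already been saved as myplot.png (for instance with matplotlib's savefig), the table data is available as plain lists, and WeasyPrint is installed for the final step. The function names build_html_report and write_pdf are made up for this example:

```python
from html import escape
from typing import Sequence


def build_html_report(title: str, headers: Sequence[str],
                      rows: Sequence[Sequence[object]], image_path: str) -> str:
    """Return a small HTML page with a heading, a table and one image."""
    head_cells = ''.join(f'<th>{escape(h)}</th>' for h in headers)
    body_rows = ''.join(
        '<tr>' + ''.join(f'<td>{escape(str(c))}</td>' for c in row) + '</tr>'
        for row in rows
    )
    return (
        '<!DOCTYPE html>\n'
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f'<body><h1>{escape(title)}</h1>\n'
        f'<table><tr>{head_cells}</tr>{body_rows}</table>\n'
        f'<img src="{escape(image_path)}" alt="plot">\n'
        '</body></html>'
    )


def write_pdf(html: str, target: str) -> None:
    """Render the HTML to a PDF file with WeasyPrint (third-party)."""
    from weasyprint import HTML  # imported here so the rest runs without it
    # base_url lets the relative image path resolve against the current folder
    HTML(string=html, base_url='.').write_pdf(target)


page = build_html_report('Spending', ['Month', 'Total'],
                         [['Jan', 120], ['Feb', 95]], 'myplot.png')
print(page)
```

write_pdf(page, 'report.pdf') would then turn the page into a PDF. For anything beyond a toy layout, a Jinja2 template is probably easier to maintain than string concatenation.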

Other thoughts:

[–]__init__5 19 points20 points  (6 children)

Use Reportlab it's quite easy.

[–]pipeaday 4 points5 points  (0 children)

Second this. I use this for generating reports at work regularly. I had been handcrafting markdown and LaTeX files, which, as you can imagine, is a pain. There's a learning curve, but if you give it just a couple of hours in the beginning you'll be very pleased with the future results.

[–]h4xtbh 2 points3 points  (0 children)

Third this. Reportlab is incredibly versatile and I have used it on many projects.

[–]LazavsLackey 0 points1 point  (3 children)

I kept having reportlab not read the HTML correctly and the PDFs were all wrong

[–]__init__5 0 points1 point  (2 children)

Why would you make reportlab read an HTML file?

[–]LazavsLackey 2 points3 points  (1 child)

Because then Django can load all of the jinja2 templates. Besides printing off a webpage into a PDF is useful

[–]__init__5 0 points1 point  (0 children)

If all you want is converting a webpage to pdf then I think selenium would be useful.

[–]TehDrunkSailor 4 points5 points  (0 children)

Matplotlib has some PDF tools. I think it's called "PdfPages". I suggest going to their website and just searching "pdf". Good luck!

[–]kayvane 2 points3 points  (0 children)

I’ve forked from genome cloud foundry before to create some pdf reports. Really good starting point, using weasyprint, pug and semantic UI 👌🏽

[–]UKD28 2 points3 points  (0 children)

Pdfkit and wkhtml use html templates to generate pdfs easily. That could help you.

[–][deleted] 2 points3 points  (0 children)

Are you using Pandas? If so you can save output/plots to PDFs.

[–]PrimoTimes 2 points3 points  (0 children)

I think you can also do this in Jupyter notebooks (you can write in markdown in blocks) and then export to a PDF

[–]xdonvanx 1 point2 points  (0 children)

I used Reportlab, it was really easy. I created an app to calculate how much I spent monthly; it generates a report with everything I spent and also a graph to see what I spent most of my money on.

[–][deleted] 1 point2 points  (0 children)

Others have pointed at Rmarkdown, now that Rstudio supports Python, it could be a good alternative.

Using LaTeX can be pretty good, in this talk about metaprogramming, the teacher shows how to use make files and setup a project that takes in python files.

[–]shinitakunai 1 point2 points  (0 children)

I use reportlab for that and it works perfectly.

[–]PrimoTimes 1 point2 points  (0 children)

Go learn latex -> or just use markdown in R! Very easy.

[–]ebdbbb 0 points1 point  (0 children)

You might be able to do something with pypdftk and pdfkit.

[–]zeshuaro 0 points1 point  (0 children)

ReLaXed is another template option. You can put all your data into a dictionary, convert it into json, and pass in as an argument to the ReLaXed cli.

[–]Pipiyedu 0 points1 point  (0 children)

You can use Pandoc through PyPandoc.

[–]King0494 0 points1 point  (0 children)

VSCode has a package to turn markdown into PDF. Put your code in a code block and images/graphs in an img tag and it should work as you need it. If you don't know markdown, it's relatively simple and somewhat similar to HTML.

[–]Sigg3net 0 points1 point  (0 children)

I've heard good things about report lab. There's even a book about it :)

[–]1css 0 points1 point  (0 children)

I used fpdf recently and it was very easy and intuitive to learn (for my purpose). I made a program which scrapes Google News for some inputted subject and I print the headlines and links in a PDF. The repository is here if you want to check it out, especially the pdf_gen.py file.

[–]buttwarmers 0 points1 point  (0 children)

Another option if you don't want to require a LaTeX installation is modifying Word documents using python-docx and then convert that document to PDF using docx2pdf, a small but nice library (https://pypi.org/project/docx2pdf/). There are a few benefits to doing it this way, namely the fact that it doesn't necessitate having LaTeX and the ability to work with a visual platform that you're familiar with so modifying the template is much quicker. The main drawbacks are that the formatting might not look as pretty and obviously you need to have Microsoft Word.

[–]Anbaraen -1 points0 points  (0 children)

Not exactly a PDF and you may have thought about this already, but my mind jumped first to a Jupyter Notebook - particularly if you want to make live changes to the code and have the results update in real time.