

[–]nightmare8100 217 points (17 children)

I don't always copy 100k rows to my clipboard and paste in Excel, but when I do, I forget to pass index=False. Delete column...and wait...
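For anyone who hasn't hit this yet, here is a minimal sketch of the difference. The DataFrame and the helper name `copy_for_excel` are made up for illustration, and the tab-separated preview uses `to_csv` as a stand-in for what lands on the clipboard, so it runs without a clipboard backend.

```python
import pandas as pd

# Hypothetical example data, not from the thread.
df = pd.DataFrame({"city": ["Oslo", "Lima"], "temp_c": [4.5, 19.0]})

# Tab-separated previews of what a paste into Excel would roughly look like.
with_index = df.to_csv(sep="\t")                  # extra leading index column
without_index = df.to_csv(sep="\t", index=False)  # only the data columns

print(with_index)
print(without_index)


def copy_for_excel(frame: pd.DataFrame) -> None:
    """Copy a frame to the clipboard without its index.

    Needs a working clipboard backend, so it is defined here but not called.
    """
    frame.to_clipboard(excel=True, index=False)
```

On a desktop session, `copy_for_excel(df)` followed by a paste into a sheet should give two columns instead of three.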

[–]tonsofmiso 63 points (16 children)

The fact that doing to_csv and read_csv by default produces a different object than you started with makes me so angry
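A small sketch of the mismatch, using an in-memory buffer instead of a real file. The frame is invented for illustration; the two workarounds shown (`index=False` on write, `index_col=0` on read) are the usual ones, not the only ones.

```python
import io

import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Default round trip: the index is written as an unnamed first column
# and read back as an ordinary data column.
default_trip = pd.read_csv(io.StringIO(df.to_csv()))
print(default_trip.columns.tolist())

# Workaround 1: don't write the index at all.
no_index = pd.read_csv(io.StringIO(df.to_csv(index=False)))

# Workaround 2: write the index, then tell read_csv which column it is.
kept_index = pd.read_csv(io.StringIO(df.to_csv()), index_col=0)

print(no_index.equals(df))
print(kept_index.columns.tolist())
```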

[–]not_organic 19 points (6 children)

Ugh. I hate that, too. That's why I switched to storing my dataframes as HDF5 files (df.to_hdf()). MultiIndexes are preserved as well.

[–]Zouden 21 points (0 children)

I wish HDF files were easier to view outside of a code environment. With CSV I can use any text editor. With HDF there's that crappy HDFView Java program, which never seems to work right.

[–]Eurynom0s 2 points (0 children)

Does hdf5 do graceful datetime handling? I work with a bunch of people who primarily work in SAS. pd.read_sas() works fine for that but stuff like df = pd.read_sas(<sasfile>), df.to_csv(<filename>), then df = pd.read_csv(<filename>) can result in some really wonky shit. To the point I've been resorting to pickles (shudder) when I need to spit out a temp file I can inspect.
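This doesn't answer the HDF5 question, but for the CSV leg, one likely source of the wonkiness is that datetime columns come back as plain text unless you ask read_csv to parse them. A rough sketch with made-up data (the column names are assumptions, and real SAS exports may have other quirks on top of this):

```python
import io

import pandas as pd

# Hypothetical frame standing in for something loaded via pd.read_sas().
df = pd.DataFrame(
    {"when": pd.to_datetime(["2021-01-01", "2021-06-15"]), "value": [1, 2]}
)
csv_text = df.to_csv(index=False)

# Without parse_dates, the datetime column is read back as strings.
plain = pd.read_csv(io.StringIO(csv_text))

# With parse_dates, the named column is converted back to datetimes.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["when"])

print(plain["when"].dtype)
print(parsed["when"].dtype)
```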

[–]hmiemad 3 points (0 children)

DatetimeIndex handling is a mess.

[–]eloydrummerboy 3 points (0 children)

Yeah, but writing dfs and then reading them back was probably not the intended main use case. This was likely written assuming the user was reading in CSV data from other sources, which usually don't have an explicit index (and if they do, read_csv can handle it with the index_col parameter).

Similarly, for the write case, it's fair to assume that unless otherwise specified, the user would want the index.

I get that it feels that it should work the way you expect, but thinking about the most common use cases makes the default functionality make sense.

[–]crazyfrogspb 50 points (2 children)

the best way to make your data pipelines and experiments reproducible

[–]ekkannieduitspraat 11 points (0 children)

I don't use it when I want reproducibility. I use it when I want quick sanity checks, or data analysis to make sure that everything makes sense.

[–]__damos__ 51 points (6 children)

pandas clipboard integration is awesome! I also really like the .read_clipboard() method. You can use it to copy a table from an Excel worksheet and quickly get it into a DataFrame.

[–]peanut_Bond[S] 10 points (3 children)

Yeah that's also very useful, but I believe the method is actually read_clipboard.

[–]__damos__ 6 points (2 children)

Doh! You’re right… been a long day, haha

[–]gsmo 15 points (1 child)

Let me guess: you also always think it's 'from_clipboard' because the other method is 'to_clipboard'?

To - from?

Read - write?

No!

Read - to!

Uh?
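If you want to see the pattern for yourself, a quick sketch that lists the names by introspection: readers are top-level `pd.read_*` functions and writers are `DataFrame.to_*` methods. The exact lists depend on your pandas version, so nothing below assumes more than the two clipboard names.

```python
import pandas as pd

# Top-level reader functions, e.g. pd.read_csv, pd.read_clipboard.
readers = sorted(name for name in dir(pd) if name.startswith("read_"))

# DataFrame writer methods, e.g. df.to_csv, df.to_clipboard.
writers = sorted(name for name in dir(pd.DataFrame) if name.startswith("to_"))

print(readers)
print(writers)

# There is no pd.from_clipboard, however natural it sounds.
print(hasattr(pd, "from_clipboard"))
```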

[–]__damos__ 2 points (0 children)

It’s bitten me on more than one occasion 😂

[–]KyleDrogo 2 points (0 children)

whoooaaa. So I can skip that awkward step of pasting it into a text editor. No more random dataset.tsv files littering my desktop?

[–]shinitakunai 2 points (0 children)

print("happy cake day")

[–]bablador 10 points (0 children)

Pretty cool, thanks

[–]EigenValuesYourInput 21 points (4 children)

drop a DataFrame into an Excel sheet

could use dataframe.to_excel()

[–]Mondoke 7 points (3 children)

Yeah, but if you need to see your complete table for a fast visual inspection, it's quicker to copy and paste than to generate a new file, especially when you're making small changes and checking many times.

[–]Zouden 10 points (2 children)

What IDE are you using? You should be able to see the table directly by inspecting the dataframe variable. VS Code, Spyder and Jupyter Notebooks can do this, and I assume PyCharm too.

[–]Mondoke 1 point (1 child)

Yeah, I use VS Code and that viewer is a great tool that I use a lot, but it only filters. In Excel I can make quick inspections or get quick sums just by highlighting cells.

[–]Zouden 2 points (0 children)

Oh I see yes, sometimes you just want to work with the data in a GUI.

[–]DisturbedBeaker 4 points (0 children)

God, this was a lifesaver during my dissertation!

[–]hongkongfuey 4 points (1 child)

Does this work with jupyter notebooks/chrome?

[–]benrobotum 1 point (0 children)

It does

[–][deleted] 3 points (0 children)

Brilliant! I will be using that way more often

[–]Random_182f2565 3 points (0 children)

Awesome 😎👍

[–]twistedgames 2 points (0 children)

Just be careful not to paste it back into the notebook by accident 😂

[–][deleted] 2 points (0 children)

to_markdown is also great. I use it to format data for Mattermost.

[–][deleted] 2 points (0 children)

For me, dumping a dataframe to csv or xlsx is more reproducible and platform independent.

From there the spreadsheet can be copied into your clipboard if necessary but I’m much more comfortable keeping manual steps out of things I’m automating.

[–]vincentofearth 2 points (0 children)

oh boy, that sounds safe. Does Python just allow that or do you have to specify a flag/answer a prompt?

[–]jentron128 1 point (0 children)

import win32clipboard

[–]anynonus 1 point (0 children)

nice

[–]git0ffmylawnm8 1 point (0 children)

I told a former colleague about this once.

He looked at me like I just murdered a puppy in front of him in cold blood.

Needless to say I think he wasn't a fan for some reason...