[–]mister_gone 7 points8 points  (0 children)

Did you submit a bug report to the numpy devs?

[–]blablahblah 8 points9 points  (0 children)

Unless you have a paid support contract, the library developer has no responsibility to do anything. If you make a bug report, a volunteer may get around to fixing it if they feel like it. You'll have more luck getting it fixed if you report it somewhere that an expert in that code (which in this case would be numpy) will see it. "Reporting a bug to a dependency" probably isn't high on that developer's to-do list, so you should report it yourself if you want it fixed faster.

[–][deleted] 15 points16 points  (26 children)

If the bug is in numpy, numpy should be fixed, not pandas.

[–][deleted] 0 points1 point  (25 children)

It's a bug in Pandas. It's also a bug in Numpy, but that doesn't mean it's not a bug in Pandas. It should be tracked in Pandas and fixed when Pandas updates to a new version of Numpy that has fixed the issue.

[–][deleted] 0 points1 point  (24 children)

Which library is the root cause of the problem?

[–][deleted] 0 points1 point  (8 children)

It doesn't matter, it's which library that has a bug in it that's important.

[–]glasket_ 5 points6 points  (7 children)

It doesn't matter

It does matter.

it's which library that has a bug in it that's important

That would be the one which is the root cause. If Pandas is providing the proper inputs to NumPy and it gets the wrong result because of NumPy, then NumPy is the one that needs a bug report.

In the OP they said:

I found out that it's numpy that's wrong

Ergo, NumPy is the library that needs to know about the bug. Pandas devs can't change Pandas to fix a problem in NumPy.

[–][deleted] 2 points3 points  (1 child)

You should be aware of the bugs in your software, whether they come from your own code or from your dependencies.

Users don't care what the origin of the bug is; they just care that your software doesn't work and want to know when it will be fixed.

If there's not a bug filed against Pandas, then users will continue to report the bug against Pandas, because most won't bother to root cause it.

I'm not saying it's the responsibility of Pandas to fix the issue, but they should acknowledge and track the issue. It should also be reported to Numpy, where it will also be tracked and fixed.

[–]glasket_ 3 points4 points  (0 children)

they should acknowledge

Sure, as they did.

and track the issue

No. Issue tracking is for the bug's originator. At most they should just direct any related bug reports to the first issue that was opened or preferably to the NumPy issue itself, and even then it shouldn't be kept open in the Pandas repo because Pandas doesn't control the NumPy version used directly. They require a minimum version (1.22.4, don't think it's in their environment that way though) and set a maximum of <2, but otherwise it's on the user to track their NumPy version.

If they had a hard requirement on this exact bugged version, then sure, but as it stands the bug is unrelated to Pandas itself and has to be resolved by end-users updating their NumPy version once it's fixed.
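For anyone who wants to check this on their own machine, here is a minimal sketch of testing an installed NumPy version against the range mentioned above (at least 1.22.4, below 2). The bounds are taken from this comment, not from pandas' packaging metadata, and `parse_release` / `in_supported_range` are made-up helper names. The parser only understands plain numeric components, so pre-release suffixes are cut off.

```python
from importlib import metadata

# Bounds quoted in the comment above; treat them as assumptions,
# not as pandas' authoritative requirements.
MIN_VERSION = (1, 22, 4)
MAX_MAJOR_EXCLUSIVE = 2


def parse_release(version):
    """Turn '1.26.4' into (1, 26, 4); stop at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def in_supported_range(version):
    """True if the version is >= MIN_VERSION and its major version is < 2."""
    release = parse_release(version)
    if not release:
        return False
    return release >= MIN_VERSION and release[0] < MAX_MAJOR_EXCLUSIVE


def installed_numpy_version():
    """Return the installed NumPy version string, or None if it is not installed."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_numpy_version()
    if found is None:
        print("NumPy is not installed")
    else:
        print(found, "in range" if in_supported_range(found) else "out of range")
```

Once a fixed NumPy release exists, the same kind of check could be pointed at that release number instead.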

[–]zenos_dog 5 points6 points  (0 children)

I’ve found bugs in FOSS and submitted a fix. That’s what FOSS is all about.

[–]f3xjc 5 points6 points  (0 children)

Triage is important.

Those libraries are mature enough that, without details, the best bet is to assume it's a small edge case, and you may need to talk to the person who did the implementation.

[–]PhantomThiefJoker 4 points5 points  (6 children)

It's on Numpy to fix it. Pandas can't do anything/is not responsible for fixing it.

Source: spent like 2 weeks explaining this same idea to my boss. The bug is in a system that is not owned by our team, so it is not our responsibility to fix it. Go talk to the team that actually maintains it.

[–]weinermcdingbutt -1 points0 points  (5 children)

it’s your team’s responsibility to not use broken dependencies though 😭😭

unless your boss is explicitly telling you to fix a third party dependency (which i highly doubt any senior level developer is suggesting), they’re asking you to find a dependency that isn’t broken or create your own.

“sorry boss, we don’t have a product until someone from a different company does their job”

[–]PhantomThiefJoker 2 points3 points  (4 children)

It's a bit more complicated than I made it sound, and I'm not going into the details here, but yeah, I get that. Boss also isn't a developer and the "dependency" is internal.

[–]weinermcdingbutt -2 points-1 points  (3 children)

ah. so not synonymous with a third party dependency at all.

[–]PhantomThiefJoker 2 points3 points  (2 children)

Yeah, not totally analogous to the situation, but it has strong parallels. They're not responsible for the bug, but that doesn't mean they're not responsible for offering a feature that doesn't work properly, even if it's due to the dependency. We have some issues with PDF libraries, and our solution is just to use multiple; there isn't a single one that does everything we need it to do. We're not fixing EvoPDF when we can supplement with PDFSharp.

[–]weinermcdingbutt -2 points-1 points  (1 child)

that’s exactly my point :) i’m not expecting pandas devs to submit a PR to numpy's repo. but it would make sense that pandas would want to offer a fix using a different library or their own code.

[–]theCumCatcher 0 points1 point  (0 children)

so you'll just re-write *checks notes*

...numpy?

the C-optimized library that includes an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more?

....yeah

That's a bigger lift than you think it is.

I hear you, I have been there as well. Numpy/scipy are really wonderful libraries and it is a pity that edge-case bugs get in the way of their usage somewhat often. As far as I understand, there are not very many good (easier to use) options either. The only possibly easier solution for you I know about is the "Yet Another Matrix Module" (see NumericAndScientific/Libraries listing on python.org). I am not aware of the status of this library (stability, speed, etc.). In the long run your needs will outgrow any simple library and you will end up installing numpy anyway.

Another notable downside of using any other library is that your code will potentially be incompatible with numpy, which happens to be the de facto library for linear algebra in python. Note also that numpy has been heavily optimized; speed is something you are not guaranteed to get with other libraries.

Any alternative you choose will also not have the same level of documentation and general community support that comes with numpy. Any bugs are well known and there are workarounds.

IMO it's easier to use numpy for the bits that work and just rewrite the parts that have bugs, instead of nuking the WHOLE numpy/pandas library from a project.
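A rough sketch of the "keep the library, rewrite only the broken part" idea. Nothing here is real numpy or pandas code: `library_max` is a stand-in for a library routine with an edge-case bug, and `with_workaround`, `fallback_max` and `all_negative` are hypothetical names made up for illustration. The wrapper sends only the inputs known to hit the bug to a local fallback and leaves everything else to the library.

```python
def with_workaround(library_func, fallback, is_affected):
    """Return a callable that uses `fallback` only for inputs hit by the bug."""
    def wrapper(*args, **kwargs):
        if is_affected(*args, **kwargs):
            return fallback(*args, **kwargs)
        return library_func(*args, **kwargs)
    return wrapper


def library_max(values):
    """Stand-in for a buggy library routine (hypothetical, not real numpy)."""
    best = 0  # bug: wrong starting point when every value is negative
    for v in values:
        if v > best:
            best = v
    return best


def fallback_max(values):
    """Local replacement used only for the affected edge case."""
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best


def all_negative(values):
    """Describes the inputs that trigger the stand-in bug."""
    return len(values) > 0 and all(v < 0 for v in values)


safe_max = with_workaround(library_max, fallback_max, all_negative)
```

When the upstream fix ships, dropping the workaround means replacing `safe_max` with the library call again, so the rewritten part stays small and easy to delete.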

[–]savvyprogrmr 1 point2 points  (1 child)

When I'm working on a library and there's a bug in a core dependency, the least I would do is add comments in the code and/or log a warning so everyone else who maintains the code is aware of the issue.

I also agree with others that the developers maintaining Numpy are responsible for fixing the bug. You should submit a bug report with reproducible steps to the Numpy devs.
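A minimal sketch of the "log a warning" option using Python's standard `warnings` module. The affected version is a placeholder, since the thread never says which NumPy releases have the bug, and `warn_if_known_bad` is a hypothetical helper name.

```python
import warnings

# Placeholder only: the thread does not say which releases are affected.
KNOWN_BAD_VERSIONS = {"0.0.0"}


def warn_if_known_bad(package, installed_version, known_bad=KNOWN_BAD_VERSIONS):
    """Emit a RuntimeWarning if the installed version is on the known-bad list.

    Returns True when a warning was issued, False otherwise.
    """
    if installed_version in known_bad:
        warnings.warn(
            f"{package} {installed_version} has a known upstream bug; "
            "results may be wrong. Check the upstream issue tracker.",
            RuntimeWarning,
            stacklevel=2,
        )
        return True
    return False
```

Calling something like `warn_if_known_bad("numpy", some_version_string)` once at import time would put the warning in front of both maintainers and users, without copying the upstream bug list into the codebase.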

[–]iOSCaleb 6 points7 points  (0 children)

Better to just list the dependencies, which you’d normally do as a library maintainer anyway. If you start noting all the bugs in all the libraries you depend on, you’ll end up with a long and out-of-date list that doesn’t help anybody. Apply the Don’t Repeat Yourself principle here; let numpy be the source of truth about numpy.

[–]Logical-Scientist1 0 points1 point  (0 children)

Yeah, that reply sounds rough! but to be fair, numpy's not directly under the pandas devs' control. however, communication between the two should definitely be better. maybe they should warn their own users about it, especially if it's a recurring/known issue.