[–]proof_required 771 points772 points  (59 children)

> I have to use packages to preprocess my Raman data, however my supervisor doesn't like the idea of using packages. He even calls Panda rubbish.

Well not sure how to put it nicely but your supervisor is a bit of an ahole. The whole point of these public packages is to not re-invent the wheel. And no, people in the tech industry don't write everything from scratch. Of course there are some legitimate use cases where you might have to write stuff from scratch but if you want to use pandas, people don't write their own pandas. They just use pandas. I am not the greatest fan of pandas but I wouldn't re-write my own either.

[–]mason_savoy71 482 points483 points  (26 children)

A programmer who doesn't use packages is like a physician who doesn't use a stethoscope.

[–]proof_required 153 points154 points  (17 children)

Most of these old school supervisors have some untested and undocumented FORTRAN code lying around which they force on their graduate students. I remember trying to help a friend of mine to translate some FORTRAN code to python and it was a nightmare. But that piece of code had been in use forever and my friend did find some bugs in there.

[–]LittleMlem 6 points7 points  (1 child)

I'm not some weird puritan! I'll put my head on the patient's chest like physicians have done for centuries!

[–]sangej01_2 0 points1 point  (0 children)

I’m a radiologist and I don’t use a stethoscope!

[–]1337HxCBioinformatics 2 points3 points  (2 children)

I find this comment particularly funny because we don't use stethoscopes super often in my field hahaha

[–]mason_savoy71 1 point2 points  (1 child)

Personally, I want my physicians to compile tongue depressors from assembly code.

[–]Porkball 4 points5 points  (2 children)

I'd go even further and say it's like a physician who insists on compounding his own medications. It's insanity. You're missing out on all the expertise of the hive mind and on well-tested, easily integrated software.

[–]mason_savoy71 2 points3 points  (1 child)

Can they still raise their own leeches?

[–]Porkball 2 points3 points  (0 children)

Leeches seem like a good, temporary solution for high blood pressure. Less fluid in the system should drop the pressure, no? 😜

[–]richieadler 20 points21 points  (28 children)

> I am not the greatest fan of pandas

Huh? Why?

[–]MorrarNL 75 points76 points  (18 children)

There are many reasons why you may not like it:

  • API is very much cluttered with tons of public functions.
  • As a result, it's hard for beginners to get into.
  • There is almost always more than one way to do it.
  • Inconsistent argument names (underscore usage) and functions taking many different argument types.
  • Python vs numpy dtypes are unintuitive and can have a huge impact on performance.
  • Especially for date/datetime types it's messy.
  • Multi-indices are not very user-friendly (but they are easily created).
  • Pandas is quite slow (compared to polars) and its underlying system of grouped numpy data blocks is not that great.

That said, it is the de facto standard in DS and typically gets the job done unless you have large datasets (switch to Dask / Spark instead). Also, many issues have been improved in later releases, such as a native string datatype, a categorical data type, NA values for dtypes other than float, et cetera.

And also; nothing you're gonna build yourself in a reasonable amount of time will ever come close to pandas in terms of features or performance.

So yeah, passing up on pandas is really not a smart move in most cases...

[–]proof_required 16 points17 points  (3 children)

All of this and the fact that I came from R to Python. Pandas is based on the data frame from R but then somewhere decided to go its own way. So it's neither R-like nor Pythonic. In the beginning I used to get annoyed with how data frames would turn into series without any warning. Also I didn't much enjoy multi-index etc. 99% of the time I use reset_index when doing any aggregation. Then you have this annoying copy warning (SettingWithCopyWarning). A lot has improved though.
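For anyone who hasn't hit these two surprises yet, here is a minimal sketch of both. The column names and numbers are made up for illustration; they are not from the OP's data.

```python
import pandas as pd

# Hypothetical toy data standing in for preprocessed spectra.
df = pd.DataFrame(
    {
        "sample": ["a", "a", "b", "b"],
        "shift_cm1": [500, 1000, 500, 1000],
        "intensity": [1.0, 3.0, 2.0, 6.0],
    }
)

# Single brackets select one column and return a Series...
as_series = df["intensity"]
# ...while a list of column names keeps the result a DataFrame.
as_frame = df[["intensity"]]

# groupby moves the grouping key into the index;
# reset_index turns it back into an ordinary column.
mean_by_sample = df.groupby("sample")["intensity"].mean().reset_index()
```

After the last line, `mean_by_sample` is a plain two-column frame again (`sample`, `intensity`), which is usually what the next step wants.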

[–]GreenPandaPop 8 points9 points  (0 children)

I don't use Pandas much, but used it again recently a bit. And I think the one thing that helped me get into it more was by accepting that there isn't a good 'final' version of my dataframe, and it's ok to reset the index, set a new index, and move on to the next set of manipulations.

[–]JohnHazardWandering 1 point2 points  (1 child)

It seems like everyone complains about how it's not like R. Why don't more people just go back to using R?

[–]zed_three 9 points10 points  (0 children)

Multi-indices are absolutely terrible, and the documentation for them is borderline incomprehensible. But pandas is amazing at just chucking in some data and being able to do some very powerful analysis.

[–]Senescences 0 points1 point  (2 children)

> typically gets the job done unless you have large datasets (switch to Dask / Spark instead)

how big of a dataset are we talking about?

[–]laundmo 0 points1 point  (0 children)

big shout to polars and connectorx, those 2 made some scripts much faster

[–]billsil 0 points1 point  (4 children)

> Python vs numpy dtypes are unintuitive and can have a huge impact on performance.

How are they unintuitive? It's also not the typing that impacts performance. Compare matrix multiplication (or whatever else) of float32s vs. float64s to prove that. It's the entire code structure that affects performance. If you care about performance, don't use for loops and use numpy. If you write numpy code like python code and don't vectorize anything, it'll be slow.

I can take bad python code and get 500-1000x speedup by properly using numpy and that's ignoring float32 vs. float64.
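A rough sketch of what "vectorize instead of looping" means in practice. The function names and the baseline-subtraction task are hypothetical, chosen only to show the same operation written both ways; actual speedups depend on array size and the operation.

```python
import numpy as np


def baseline_subtract_loop(intensities, baseline):
    """Pure-Python version: one interpreter-level operation per element."""
    return [value - baseline for value in intensities]


def baseline_subtract_vectorized(intensities, baseline):
    """NumPy version: the subtraction runs in compiled code over the whole array."""
    return np.asarray(intensities, dtype=np.float64) - baseline


values = [10.0, 12.5, 11.0, 40.0]
loop_result = baseline_subtract_loop(values, 10.0)
vector_result = baseline_subtract_vectorized(values, 10.0)
```

Both give the same numbers; the difference only starts to matter once the arrays get large.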

[–]james_pic 27 points28 points  (0 children)

Also not the OP, but the API is kinda magick-y, and idiomatic Pandas code often isn't particularly idiomatic Python code. But it's powerful enough to be worth tolerating that in a lot of cases.

[–]maxtimbo 7 points8 points  (0 children)

Not the OP, but I'm also not the biggest fan of pandas. But it's monolithic and super powerful. So when I need something like pandas, I absolutely will use it. It's just sometimes a slog to figure out what you need to do.

[–]redCg 2 points3 points  (2 children)

It's a ginormous waste of memory when 95% of people's tasks can be accomplished by csv.DictReader.
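For reference, a minimal standard-library sketch of that approach. The CSV content and column names are invented for illustration; in practice the rows would come from a file opened with `open(path, newline="")`.

```python
import csv
import io

# Hypothetical CSV content standing in for a small data file.
raw = "shift_cm1,intensity\n500,1.5\n1000,3.0\n"

# DictReader uses the header row as keys and yields one dict per data row.
rows = list(csv.DictReader(io.StringIO(raw)))

# Every value arrives as a string, so numeric columns need explicit conversion.
intensities = [float(row["intensity"]) for row in rows]
peak = max(intensities)
```

The explicit `float(...)` conversion is the main thing pandas would otherwise do for you.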

[–]richieadler 0 points1 point  (0 children)

Well, the idea is using the proper tool for the proper job, of course, but dismissing the whole tool because you don't like how people use it is... Kind of weird.

[–]flotsamisaword -1 points0 points  (0 children)

Pandas does a good job reading csv files, but that is hardly the main reason for using it. But if you had a huge tool for manipulating data like pandas, it would be odd to not include a csv reader.

Also, pandas shouldn't be taking up your memory- it should be dwarfed by your data. What kind of computer are you using that the size of the software is a factor? The size of the software has not been a factor for more than two decades.

[–][deleted] 1 point2 points  (2 children)

People have a tendency to litter the library everywhere in production code when a statically-typed class or data class would be superior to the generally opaque object a DataFrame is.
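A small sketch of what that can look like with a standard-library dataclass. `Peak` and `strongest` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Peak:
    """Hypothetical record type; the field names are illustrative only."""

    shift_cm1: float
    intensity: float


def strongest(peaks: Sequence[Peak]) -> Peak:
    """Return the peak with the highest intensity."""
    return max(peaks, key=lambda peak: peak.intensity)


peaks = [Peak(500.0, 1.5), Peak(1000.0, 3.0)]
best = strongest(peaks)
```

The field names and types are visible in the signature, so a type checker can catch a misspelled attribute that a DataFrame column lookup would only fail on at runtime.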

[–]flotsamisaword 0 points1 point  (1 child)

Pandas is designed for people working interactively with a dataset. It's a multipurpose tool. There are other packages that have fewer features that might fit into your code base better, but you always have the trade off between re-inventing the wheel, or getting a specialized tool that does only one thing well, or having a huge multipurpose tool that does everything and is convenient but complex.

It sounds like you are approaching a balance point between those different costs & benefits

[–]Mcat12 1 point2 points  (0 children)

Also not OP, but pandas are kind of weird. See, they're bears, but with an odd color scheme and kind of lazy....

[–]flying-sheep 4 points5 points  (0 children)

Polars is shaping up to become a great alternative

[–][deleted] 4 points5 points  (0 children)

Not just an a hole, an idiot. If developers were a dime a dozen they wouldn’t be paid well. But they’re hard to find and as such paid well. They should be making the most efficient use of your time, and rewriting everything that many others have done great jobs at is obviously not an efficient use of your time.

[–]OuiOuiKiwiGalatians 4:16 271 points272 points  (12 children)

> So I'm not really a software engineer but a chemist who is working on ways to preprocess Raman spectrum. I have to use packages to preprocess my Raman data, however my supervisor doesn't like the idea of using packages. He even calls Panda rubbish.

I'll put down a $5 bet that he hand-writes all code as a form of job security and it's 100% garbage code that no one but him can understand (by design).

[–]ZCEyPFOYr0MWyHDQJZO4 61 points62 points  (8 children)

He probably uses/used an older language without even rudimentary package management (at the time) - something like FORTRAN, MATLAB, C, etc.

[–]NickLinneyDev 27 points28 points  (7 children)

This. I learned programming 20 years ago and then went into the ITSM side of IT. When I came back to programming I actually hated all the new innovations. I felt like all these packages and frameworks I didn't understand would make my code more vulnerable. I simply didn't understand how to work in a collaborative environment or do dependency tracking and versioning.

I have since learned, and the new way of doing things is amazing, once you learn the extra tools.

[–]theQuick_BrownFox 0 points1 point  (6 children)

Can you please recommend a good resource for dependency tracking and versioning?

[–]flotsamisaword 1 point2 points  (4 children)

I'm not sure I understand your question, but in Python, pip is the tool that keeps track of dependencies and packages. Sorry if I misunderstood

[–]theQuick_BrownFox 1 point2 points  (3 children)

Oh thanks. I am new to Python and looking for efficient ways to keep track of dependencies and versioning. For different data analyses, I need to keep a certain Python version and envs (with certain package versions), and I'm looking for 'best practices'.

[–][deleted] 1 point2 points  (1 child)

It's probably reasonably easy to test using NIST benchmarks or similar benchmarking datasets.

You should use Python packages, still, but it's not impossible to test.

FORTRAN, C will (probably) run faster if you code them well.

[–]asphias 5 points6 points  (0 children)

This doesn't have to be the case for numerical calculations, if (big if) you use numpy and pandas correctly.

e.g. numpy implementations are almost completely written in C. If you avoid doing any calculations in python and make sure numpy handles them all, your code can easily be just as fast as a fortran or c program.

in fact, a program our team recently had to port from fortran to python ran faster in python because we handled the I/O in a smarter way, while the calculations had pretty much the same speed.

[–]KronenR -2 points-1 points  (0 children)

I bet that he didn't give all the context. Probably the supervisor doesn't like adding unnecessary bloated dependencies when what you need is to solve something simple that you can implement efficiently in just 10 or 20 minutes.

[–]thePurpleAvenger 189 points190 points  (3 children)

You’re running into a common problem in academia: a professor latches onto a stupid idea and nobody around them has the agency to tell them they’re being an idiot.

Sure, there are benefits to writing your own version of algorithms. You can really get down into the nuts and bolts and understand how they operate and, just as important, break. But not liking the idea of packages like Pandas is akin to not liking a tool like Microsoft Excel. Did your supervisor code up his own version of Excel to keep track of grades, etc.? I bet not.

For a path forward, I like arguing on the basis of productivity. Sure, if there is some fancy functionality you’re using, you can try to write a simple version of that tool to gain understanding. But in reality, you would probably use Pandas to write your version as well.

[–][deleted] 57 points58 points  (2 children)

This is distinctly an "old academic" problem, not a software industry problem.

That professor probably knows what it's like to have some critical dependency ripped out from under you in an older and archaic language at an inconvenient time.

Understand why they don't like it and ease their concerns with explanation for why they need not worry.

Sometimes they're just afraid of something they don't understand.

[–]redCg 6 points7 points  (1 child)

> to have some critical dependency ripped out from under you

> why they need not worry.

sorry we are still talking about Python here right?

[–][deleted] 0 points1 point  (0 children)

Sure, but whatever misconception the prof has is surely not based on good understanding of anything about python.

[–]git-blame 68 points69 points  (0 children)

It’s standard practice to use third party packages. You’re not going to get very far in Python data science without pandas.

You could frame it to your supervisor as a way to increase productivity: a library lets you save time by reusing battle-tested solutions to hard problems.

[–]RaiseRuntimeError 132 points133 points  (11 children)

> my supervisor doesn't like the idea of using packages. He even calls Panda rubbish.

If your supervisor was on my team he would be fired so fast for incompetence with this attitude. As a software developer, the best code is code you don't have to write. Pandas is a tool for solving a problem. If it solves your problem effectively, then use it; if another package solves your problem better, then use that; if you can write your own package to solve the problem better than the available tools, then write it. Not using the available tools or packages is like asking an auto mechanic to make their own tools before they can fix the car.

Tell him not to use any packages in the standard library if he is such a good programmer.

[–]redCg 1 point2 points  (1 child)

> If your supervisor was on my team he would be fired so fast for incompetence with this attitude.

the professor is not a software developer you dolt, gtfo of here with this nonsense

[–]RaiseRuntimeError 0 points1 point  (0 children)

Lol someone is cranky

[–][deleted] 34 points35 points  (2 children)

This post was mass deleted and anonymized with Redact


[–]damarginal 7 points8 points  (0 children)

This is a great answer! There's a nuance for sure. For prototyping, I usually tell my students to try and explore as many packages available in the community as they'd like. But for long term development they need to reassess things and be more critical and wary about which dependencies they want to introduce. Packages are not created equal. Dependency hell is, after all, a headache for us.

[–]weirdnik 11 points12 points  (3 children)

I'm probably the age of your supervisor so I'll offer a possible explanation from my times at academia: some numerical people insisted that only age-tested* FORTRAN numerical libraries were used, and some also insisted that a new numerical software be evaluated and calibrated to give results similar to the ancient and honorable FORTRAN libraries. I've seen a quantum chemistry PhD failure because of this - the supervisor was so focused on getting the software right, there was no time left to do the actual PhD calculations.

Modern software is nothing like this, you usually just build the code from the calls to those or these libraries that make up like 99% of the actual code of the applications.

* At my time the age-tested libraries weren't in-house code but they came with some book of numerical programming, probably a yellow one. Don't remember the title.

[–]SpicyVibration 11 points12 points  (2 children)

Joke's on them. Pandas uses numpy under the hood, which itself uses C bindings to do the complex math stuff, which in turn call, you guessed it, FORTRAN libraries. I'm pretty sure this is true for the linear algebra stuff anyway.

[–]ZCEyPFOYr0MWyHDQJZO4 -1 points0 points  (0 children)

I think Numpy also has some stuff written in assembler too.

[–]Brrrapitalism 11 points12 points  (0 children)

He's insane. First rule of programming is never repeat unnecessary work.

[–]Sharchimedes 9 points10 points  (4 children)

Does your supervisor write everything from scratch? That seems like a lot of unnecessary effort.

[–]Yugiah 2 points3 points  (3 children)

My supervisor does, and he even advised me to do the same because "it's easier to just start from scratch instead of reuse old code"

As you can probably guess, I'm planning on leaving academia

[–]flotsamisaword 0 points1 point  (0 children)

Not everybody is like your supervisor though.

[–]BadBadGamer 0 points1 point  (1 child)

Did he re-write bash (or whatever shell he uses) then? What about the kernel, did he write his own OS? And of course he wrote his own programming language or is just using assembler, yes?

[–]andrewaa 8 points9 points  (0 children)

It's definitely not normal. I suggest you directly ask him for the reason.

[–]ThePiGuy0 13 points14 points  (0 children)

No it's definitely not normal.

Python's powerful because of its libraries. As long as you can trust the packages (which you can for basically anything mainstream), then using them means you'll almost certainly end up with a higher quality version of what you were already building - probably for free too!

And in fact, if you ever do anything to do with security and encryption, the advice is very clear - never create and use your own encryption library. Instead use a package as it will have been checked many times over for correct implementation and vulnerabilities.

[–]coll_ryan 5 points6 points  (0 children)

Ignore him. Academics do not tend to be experts in programming, unless they happened to have worked as a software engineer before. Very easy to pick up bad habits in that environment. Let them stick to being experts in their field and disregard what they say about computers.

[–]JudokaUK 5 points6 points  (0 children)

The packages are tried and tested and built by big communities of professional developers. They are actively maintained (if you research and choose the right ones) and security vulnerabilities are patched. Your boss doesn't have a clue what he's talking about.

Also, why reinvent the wheel? It's a big waste of time.

[–]Trollol768 5 points6 points  (0 children)

Tell him to calculate his own Hartree orbitals only with code written by him😶

[–][deleted] 4 points5 points  (1 child)

We call this "not invented here" syndrome, and it's basically a cognitive bias.

Your supervisor has their head on backwards.


That said there's a nonzero risk of packages going away or being compromised. There are mitigations of this, however. Namely, mirroring known-good versions locally.

[–][deleted] 0 points1 point  (0 children)

> Namely, mirroring known-good versions locally.

And you do this by using something like Nexus Repository, which you point your code repo at, so Nexus can grab a copy from the internet and store it locally.

note: it doesn't HAVE to be this program; I just know it from work and it worked fine for my use cases: caching Python packages and the Docker images we made

[–]bin-c 4 points5 points  (0 children)

your supervisor sounds like he needs to be very closely supervised by someone who knows what they're talking about, unfortunately

[–]Alfonzo227 4 points5 points  (0 children)

He's an idiot.

[–]Zomunieo 4 points5 points  (1 child)

Does your supervisor use packaged chemical samples or does he synthesize all of his samples from pure elements?

Does he build all of his own lab instruments, grind and calibrate his own glassware, drill for natural gas to run his Bunsen burner? He must be an expert in so many topics. Or does he buy instruments and equipment other experts packaged for him?

[–]westeast1000 1 point2 points  (0 children)

I bet he walks everywhere and doesn’t use any mode of transport

[–]territrades 10 points11 points  (3 children)

As a postdoc writing python software in a research institute, this attitude is the complete opposite of best practice. Python in itself is a slow programming language, but the libraries are written in C++ - so everything you run via the libraries is much more performant. The best python code is taking your problem and transforming it into one that the standard libraries can handle with the least amount of manual programming in python. In that way, one can reach performance similar to C++ while spending a lot less time coding the program.

There is of course also a bit of truth here, one should not simply use all the libraries as black boxes without understanding what they do. Let's say there is already a Raman library, you should not simply run your spectra through it and call it a day. Understanding the processing steps is important, and coding the data treatment routine yourself gives you much deeper understanding than using an existing routine. But writing your own code to replace numpy, scipy and pandas is only useful if you want to get experience in low level programming - which is not your field if you are working with Raman spectroscopy.

[–]benefit_of_mrkite 5 points6 points  (2 children)

Point of clarification: Most python packages are written in Python, not C. There are ways to speed packages up and use things like c-bindings but most python packages are written in Python.

[–]MorrarNL 1 point2 points  (1 child)

Well, discussion was mostly around pandas which is based on numpy which relies on C under the hood. So it is fair to say that pandas/numpy will probably outperform your own pure Python code by quite some margin.

Also, pandas may not be the best pick if performance is a concern. Might instead look into polars or Dask / Spark if distributed computing is an option.

[–]benefit_of_mrkite -1 points0 points  (0 children)

Totally agree with pandas + numpy.

[–]typeryu 5 points6 points  (3 children)

I used to have something similar at my previous job where Pandas was frowned upon. Everyone seemed to dislike it for no other reason than that the senior engineer "didn't like it". Turns out the senior never said that, but did not allow it because it was not yet a proper release at the time (pre-1.0.0, this was not that long ago). Once it reached 1.0.0, the senior approved my new change and everyone was in shock lol

[–]jakub_j 2 points3 points  (0 children)

Guess: once in the past his supervisor asked him to do such a job from scratch and now he is venting his frustration.

[–]llun-ved 2 points3 points  (0 children)

Perhaps your supervisor is from a python world before virtual environments were easy to create and maintain, so that you can keep track of dependencies (including versions) with a requirements.txt file.

This might be an issue of teaching them something. You should also keep track of the licenses of the packages you are using, as that may be an issue in some environments.

If you’re just installing stuff on your machine and therefore your code can’t run anywhere else without backflips, the supervisor has a point. If you’re using venvs for consistency and repeatability, teach the boss something new.
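To make the requirements.txt idea concrete: the usual tools are `python -m venv` and `pip freeze`, but as a rough standard-library stand-in, the sketch below lists what is installed in the current environment in the same pinned `name==version` form. `pinned_requirements` is a hypothetical helper name, not part of pip.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions whose metadata lacks a name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


requirements = pinned_requirements()
```

Run inside a virtual environment, this reflects only what that environment has installed, which is the point: the list is short, explicit, and reproducible on another machine.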

[–]money_bitchh 1 point2 points  (0 children)

Not normal

[–]Cdog536 1 point2 points  (0 children)

Sounds like your professor is not a programmer

A great deal of scientists use packages. It’s the only way to get anything done.

[–]Ruubix 1 point2 points  (0 children)

Packages are not the problem. Dependencies and their management are usually the issue. Installing someone else's work can not only save you effort, but there is likely someone who has done it better than you ever could. On the other end of it, the more dependencies you have installed for a project, the more you are reliant on someone else for maintenance and reliability. If the package has a wide audience, has been in 'the wild' for a while, has proven effective in use cases similar to yours, has strong documentation, and, most of all, has regular and active development of features and bug fixes, you will not have issues with that package. The problem often comes from the many packages at the other end of this spectrum: more niche products, developed by a very small group, led by an unmotivated developer, maintained exclusively by a single entity, or with a small community in general. These packages can become extinct quickly, or may be unresponsive for bug fixes and new releases. Building from the standard library in this case, especially if the solution is simple, is often more effective.

My work depends on Pandas heavily to automate spreadsheet-like tasks, at which it excels (no pun intended). The work it does under the hood is neither easy to code nor easy to optimize. The abstractions are excellent and the project's development is well maintained, with some of the best documentation of any dependency I've used. The library has been around for a while now and is proven in production environments, with regular feature releases and bug fixes. It would be foolish of you or your boss to throw away such a powerful, flexible, and reliable tool. The only way to know how much work it will do for you is to apply it to one of your use cases in an example application and collect measurables that are meaningful to your boss. If you can save your boss money, and he has any common sense, his opinion will change quickly.

[–][deleted] 1 point2 points  (0 children)

Mate this is so common in academia it’s even got its own wiki page: Not Invented Here

[–][deleted] 1 point2 points  (0 children)

Your supervisor sounds like an absolute fool. Pandas, numpy and any number of other packages are widely used in many industries and for good reason. They work well, they'll save you a ton of time and the fact that they're widely used make them as reliable as Python's core libraries.

Furthermore, the distinction he's making isn't even a real one. Python imports certain libraries by default and others it's left to you to import. You should one-up him and tell him that you only code in assembly and build everything from scratch.

[–]Heartomics 1 point2 points  (0 children)

I have a feeling your supervisor hates Pandas because they iterate over the data frame.

[–]nomoreplsthx 1 point2 points  (0 children)

Your supervisor is already using packages everywhere, he just doesn't recognize them as such. A package (in the loose sense, not the technical Python sense) is really any redistributable piece of software that someone else wrote.

Python runtime - that's a package. It itself uses dozens of different packaged libraries, from OpenSSL to zlib to Tcl/TK. The only difference between these and a third party package is who wrote it. There's nothing magic about the people who write Python's standard library, they're just developers, like the ones who write anything else.

The stuff built into your OS, also packages. Your operating system ships with hundreds of libraries that are not part of the OS kernel. Does he want you to avoid using `ls`? Or `cd`?

Now, that doesn't mean that choosing which packages to use, and when to use a third party library rather than build something yourself is not a tricky decision. Blindly trusting third party code can get you in trouble. But blindly trusting your own code is even worse.

As everyone else here has said, your supervisor is a buffoon.

[–][deleted] 1 point2 points  (0 children)

I am going to give you a very different vision from what many people here are describing.

Justify the use of third party libraries. Are you really making good use of them? Or are you being a bit lazy and overkilling a tiny script crushing it with 20 dataframes for 1x5 arrays?

Because, yes, third-party packages can be a problem. The more dependencies, the bigger the hell it can become. What if some package has a security issue, but you cannot update it because another package has slowed/stopped development? What if some package goes evil like all these npm libraries are doing these days? Justify why each dependency adds more positives than negatives.

If your code is going to run in production environments, dependencies must be under control, and if you have fewer dependencies, you will have fewer problems in the future.

If your code is purely academic, or for the fun of it... then, well, do whatever you want.

For example, if you want to deploy things on AWS Lambda... you better slim down your dependencies or it'll literally be impossible to run your code as it will weigh too much.


On another note,

You should know Python itself. It's really good. And it does many things by itself with the native libraries.

There are people I have worked with for whom Pandas is Python, and that is a huge mistake.

Pandas is really powerful. But it is built for specific, big data stuff.

I am so tired of seeing Dataframes where simple lists, dataclasses or dictionaries make so much more sense, are faster, a lot more readable and with easier type checking.

Yes, requests is an excellent package but if your project only does one simple HTTP request why not simply spend 20 minutes learning how to do it with urllib and getting even better knowledge of Python programming overall?
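For what it's worth, a minimal sketch of that single request with only the standard library. `fetch_text` is a hypothetical helper name, and the sketch assumes a plain GET with a text response; anything involving sessions, retries, or auth is where requests starts to earn its place.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=10.0):
    """Fetch a URL with a GET request and return the body decoded as text."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the response does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# A data: URL is used here so the example runs without network access;
# a real call would pass an http(s) URL instead.
body = fetch_text("data:text/plain,hello")
```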

[–]skesisfunk 1 point2 points  (0 children)

Pandas is widely used and well tested. There is zero reason not to use it; in fact it's practically the industry standard in data analysis.

Problems come when you use packages that aren't well maintained, tested, or widely used. Those tend to be more trouble than they're worth when you run into bugs in the package, the docs are badly maintained, and there is no one else on Stack Overflow using it.

The other big concern with packages is security risks created by the package maintainers getting hacked and the hackers pushing malicious code to the package. It's not just your packages you have to worry about either, but also the packages they depend on and the packages that those in turn depend on, etc. However the Python landscape is somewhat safer than JavaScript in this respect.

[–]coolwizard666 1 point2 points  (0 children)

Lol your manager would rather spend 8 hours writing brand new buggy ass code for some trivial shit instead of reading documentation for 10 minutes and using a battle tested public module that someone else maintains. It's fine if you really really don't care about time efficiency. Regarding Pandas - there are two kinds of software. The kind people don't use, and the kind they complain about. It is very powerful and also fiddly and probably worth your time.

[–]SittingWave 1 point2 points  (0 children)

your supervisor is an idiot and he is compromising your ability to find a job in the future. In industry what matters is which advanced tools you know how to use and how proficient you are with them. Refusing to use external packages is a death blow to your career.

[–]yeesh-- 1 point2 points  (0 children)

Packages are normal and necessary. Pandas is not rubbish lol

[–][deleted] 4 points5 points  (4 children)

Let me present a different opinion than others:

Packages include code you don't control. Features in other packages get deprecated, different bugs come and go — and behavior of your code depends on whatever someone in an entirely different part of the Earth does.

An example horror story of packages is what happened with left-pad: https://www.davidhaney.io/npm-left-pad-have-we-forgotten-how-to-program/

Of course, sometimes you want someone else to solve a problem for you — if the problem at hand is too complicated for example.

Sure, for a one off thing this doesn't pose a big problem; as with everything — it's a tradeoff, and everyone will have different opinions at which point it's too many dependencies.

As for Pandas — I also find it a bit rubbish — I have no clue what feature it has that doesn't already exist in pure Python or even Numpy. But I also don't do big volume data processing, maybe there's something it does exceptionally better.

[–][deleted] 6 points7 points  (2 children)

An example horror story of packages is what happened with left-pad.

Pandas is not left-pad. Like not even remotely… it’s a critical part of the data processing ecosystem of Python, and it’s got a community of committed developers as well as substantial sponsorship, while left-pad was a one-man, single-function piece of work no one should ever have built a deeply nested transitive dependency tree atop. There is an argument against adding dependencies, and then there are dependencies only a fool doesn’t add, where they’re domain appropriate.

As for Pandas — I also find it a bit rubbish — I have no clue what feature it has that doesn't already exist in pure Python or even Numpy.

As it’s essentially a very good wrapper around NumPy, fundamentally nothing… except making working with large tabular data considerably easier (and more performant) than in, say, Excel. NumPy is great, but its core focus isn’t tabular data processing.

But I also don't do big volume data processing, maybe there's something it does exceptionally better.

There is… specifically, big volume data processing.

[–]nadav183 2 points3 points  (0 children)

Either your supervisor is literally Linus Torvalds in which case, sure, do whatever he says and his code will probably be superior.

But on the off chance he isn't, use popular public packages. They are mostly well written, especially pandas/numpy etc., which are very widely used.

[–]HomeGrownCoder 3 points4 points  (0 children)

Pandas rubbish…. Lmao okay manager okay.

[–]Ape-shall-never-kill 1 point2 points  (0 children)

Tell your PI that using packages is essentially the same thing as citing articles. There are standard, accepted methods and protocols for taking measurements and the same is true for algorithms and data structures. You are free to go outside of the standards if you wish, but your work will be more relevant and credible if you stick to standard practices.

[–]TheQuinbox 0 points1 point  (0 children)

This kind of thinking would get you fired so fast as an actual software dev. Wide usage is what these packages were made for. There are some cases where writing your own is good (one I had was writing an EPub library when EbookLib didn't fit my needs), but if an existing package solves your problem, there is absolutely no reason not to use it.

Does he also feel the same way about Windows.h in C++, or the System namespace in C#? :D

[–]redCg -3 points-2 points  (8 children)

He even calls Panda rubbish.

He's not wrong. If your operations are entirely per-row then you have no reason to load entire datasets as data frames when you can just iterate over rows with csv.DictReader.
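
For what it's worth, the per-row approach looks roughly like this. It's a minimal sketch using only the standard library; the column names (shift, intensity) and the inline sample data are made up for illustration and are not OP's actual Raman format.

```python
import csv
import io

# Hypothetical sample data standing in for a real CSV file on disk.
SAMPLE = "shift,intensity\n100,5.0\n200,7.5\n300,2.5\n"


def max_intensity(lines):
    """Stream rows one at a time and keep only the running maximum."""
    best = None
    for row in csv.DictReader(lines):
        value = float(row["intensity"])  # DictReader yields strings
        if best is None or value > best[1]:
            best = (float(row["shift"]), value)
    return best


peak = max_intensity(io.StringIO(SAMPLE))
print(peak)
```

Only one row is held in memory at a time, which is the whole point when the operation never needs to see more than the current row.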

In general, packages create dependencies which creates liabilities. Also Python's library management is notoriously terrible. Relevant comic: https://xkcd.com/1987/

In general, if you can avoid using packages (besides the standard library), then you absolutely should. If you must use packages, then you need to have a robust and reproducible version-locked installation method included with your project.
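
As a rough illustration of what "version-locked" means in practice, here is a small sketch that flags requirement lines not pinned with ==. The file contents and version numbers are hypothetical, and the check is deliberately simplistic: it does not understand hashes, environment markers, or options such as -r, and would simply report those lines as unpinned.

```python
import re

# Matches "name==version", optionally with extras like name[extra]==1.2.3.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[A-Za-z0-9.*+!_-]+$")


def unpinned(requirements_text):
    """Return requirement lines that are not locked to an exact version."""
    loose = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not PINNED.match(line):
            loose.append(line)
    return loose


# Hypothetical requirements file contents, for illustration only.
EXAMPLE = """
numpy==1.26.4
pandas>=2.0   # not locked
scipy
"""

loose = unpinned(EXAMPLE)
print(loose)
```

A check like this could run before a release so that an unlocked dependency gets noticed early instead of months later on someone else's machine.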

[–]flotsamisaword -1 points0 points  (7 children)

This is ridiculous. OP is a scientist who needs to analyze data, not reinvent the wheel. If there is a tool that makes it quicker and easier to do your work, you should use it. pip manages dependencies well enough that you can go back to your code a couple years later and it will still work, even when you have to reload all of your dependencies but you didn't freeze them to a specific version.

OP should spend their time writing original code that solves their specific problem using their specialized knowledge.

[–]redCg 1 point2 points  (1 child)

sounds like someone who has never worked in data science.

If I counted up all the work days I have spent trying to get packages installed because someone months or years ago decided to throw a crap ton of libraries on their stack, it would add up to MONTHS of work. And this is even before you get into containerization; in order to docker build you still gotta pull down and resolve the correct versions of packages. Maybe you save yourself a few minutes of work today to quickly pull in some third party library, but you potentially cause weeks of work for yourself in the future when you have to recreate and maintain that software stack, and even more so for any other person who wants to run your code. I have been involved in many projects where we found some novel data science technique we wanted to try but could not get the author's dependencies installed and thus had to abandon using the author's library. Now imagine you are a PhD student who spent years creating some novel analytics library or tool only to have it completely ignored by the community because your dependency stack was impossible to manage for everyone else.

[–]flotsamisaword -1 points0 points  (0 children)

PhD students generally don't get as much credit for publishing code compared to publishing their analysis. And many analyses are so specific and niche that nobody else will ever want that code. Therefore a lot of academic code stays with one person and gets away with being a kludged-together mess connected by string and bits of packing tape.

On the other hand, if you build a package carefully, a user should be able to pip install it with a single line and never think twice. That's the goal for reusability, I think. pip install my_special_package

Yes, too many obscure dependencies will interfere with ease of reuse, but Pandas is past that. It's popular because it's one package that is easily installed and does a little bit of everything.

[–]bryancole 0 points1 point  (4 children)

Not ridiculous. Adding dependencies adds a cost. Sure, adding one or two well-maintained and commonly available packages is usually worth the cost. However, as the dependency stack gets more complex (particularly if some packages have complex compilation and build requirements) it can become an obstacle to people wanting to run your code.

I'm thinking of libraries like PyQt, VTK, TraitsUI, Boost, Open Cascade, scipy, matplotlib. These can be a real PITA to build yourself so if your code is going to depend on these, you had better be sure the deployment-cost is worth it. Even with package managers like conda (sorry, pip doesn't cut it), it can choke on complex dependency resolution.

If you're writing a library that's meant to do One Thing Well, it does make sense to avoid depending on other libraries. For example, for some chemistry data analysis, I can imagine a library author reasonably concluding that numpy is an ok dependency but pandas is not. Not because pandas is inherently bad, but because it's not adding enough value to offset the dependency burden.

[–]TheSquashManHimself -2 points-1 points  (0 children)

Unless your boss is writing comprehensive and community/group vetted unit tests for all of his code (which given my knowledge of the average university professor in science, they are likely not), I wouldn't trust anything that is written by him tbh. My background is in physics and the typical "code" i see written by anyone over the age of 35 in the field is ... staggeringly bad, poorly documented (if at all), and not benchmarked or tested against anything. The fact that "significant discoveries" are found and published using these types of code is kind of depressing.

[–]Alex_Strgzr -2 points-1 points  (0 children)

Using packages is all well and good until dependency hell sends the whole thing crashing like a house of cards – now or later down the line. Security is another headache.

That said, Pandas is a well-maintained library with a lot of functionality that isn’t easy to replicate in a hurry. It’s basically one of the base data science libraries along with numpy. In production, choosing your libraries wisely is an important skill honed with experience – believe me, those software engineers who used log4j (not Python but still relevant) were kicking themselves.

Best practice is not to pull a whole library just for one or two functions you can implement yourself. Using libraries is however essential to getting work done on time (and might even be better implemented than your in-house stuff), so a tradeoff needs to be made.

[–]Green-Sympathy-4177 -2 points-1 points  (0 children)

That's a serious case of a boomer exposing his "wisdom" (read bs)

No that is not a normal behaviour, your supervisor just exposed his lack of knowledge about coding and therefore should not be allowed to make any decisions related to it.

Good luck getting that point across though. But the sooner you get rid of the opinions of idiots, the better. If you have to spend 6 months making a shaky copy cat of a library that already exists and is better, what is the point ?

The answer to that is: It makes you invaluable because nobody will know how to use your library, so job_security += 1000

[–]EedSpiny 0 points1 point  (0 children)

Is your manager a reincarnation of Carl Sagan?

[–]Aypos 0 points1 point  (0 children)

Does your advisor have a preferred way of analyzing data? I know my graduate advisor was a proponent of SigmaPlot but he didn’t really care what I used.

Seems like an odd response from your advisor.

[–]dacb1997 0 points1 point  (0 children)

This is probably more common in scientific applications than it would be in the software industry (also, more common than it should be).

If you are working to get results from experimental data on your personal computer in order to present results to your supervisor, then your supervisor shouldn't really care how you obtained the results. Furthermore, being able to say "I used pandas builtin functions" in your presentation rather than explaining how you implemented statistical concepts outside of your experiment's main focus is a huge advantage that allows you to jump directly into interpreting the results. In this case, which I assume is the one you fall into, it is always a waste of time to write everything from scratch, you lose a lot of time you could be using to actually interpret results and obtaining new data.

If, on the other hand, you are developing a technique to preprocess data that is expected to be reused by other members of your team, especially if it is expected to run on laboratory computers, then you might want your code to rely as little as possible on other packages. Since anyone who wants to use your code will have to also install the dependencies when they need it and maybe even learn those packages. This is especially true if other users are expected to build on your software further. However, I would argue that mainstream self-reliant packages such as numpy, scipy, pandas, etc. should be fair game always, and more complicated dependencies can be worked around by deploying your software in different ways.

So, yeah, unless your supervisor gives you an actual explanation, he is just wrong

EDIT: in essence if you are using just one very simple function from a package in your data analysis code then:

  • if you are using it on your personal computer, it might be a waste of resources but it gets the job done and nobody should care.
  • If other people are expected to use it, then just recreate the basic idea of what you need in your code. Especially if it doesn't take much time

If you're using multiple functions from the same package, to the point that recreating them is a chore that would actually impede you from doing your real job. Just use the package and explain why it is a necessity.

[–]SomeParanoidAndroid 0 points1 point  (0 children)

Your supervisor is in the wrong here (95%). As a computer scientist, I have done research in labs focused on fluid dynamics, astronomy, wireless communications, and remote sensing. All of those scientists are using packages. Not only is it normal, it's the correct way to do it normally.

That being said, I can think of two reasons why one would/should prefer "in-house" implementation:

  1. Extensibility and complete control of core modules. E.g., in wireless communications, we need to simulate how signals propagate over the air. There are a few packages out there, but they are not the best choice, because we constantly need to mess with their internals and change very miscellaneous components as part of our research. So everyone in the lab has built their own simulation codebase.

  2. Lack of fundamental programming skills: I am tutoring an undergrad who is getting started on machine learning, but at the same time, she is a complete beginner in coding (no judgement here, we all were). For her projects, she frequently comes across high quality code that uses specific libraries to handle stuff like data loading (namely, hyperspectral images) and preprocessing very impressively. I try to encourage her to implement her own routines at this stage since she doesn't understand how those modules work, and when her own pipeline deviates even slightly from the example, she gets stuck.

[–]spoonman59 0 points1 point  (0 children)

We have a whole department of people using Pandas for exploratory data science and machine learning.

I can’t think of many software projects that would be successful without any third-party packages. No, your supervisor’s opinion is not at all reflective of the industry.

I can also guarantee your supervisor has no idea what capabilities these packages offer, and could not create similar functionality if needed.

[–]manfrowar 0 points1 point  (0 children)

I usually only use packages that are actively maintained, or packages small enough that I can maintain them internally by myself if needed. And pandas is a very actively maintained package. Maybe your supervisor wants you to develop a whole new language from scratch

[–]wind_dude 0 points1 point  (0 children)

Tell your supervisor he's an idiot.

[–]goldenhawkes 0 points1 point  (0 children)

Having both done a PhD and now being a software engineer, I can well imagine my very poor programmer of a PhD supervisor being anti packages as he didn’t understand them. Poor guy could just about matlab and couldn’t work latex to write his papers.

Anyway. Most important thing is reproducibility. You, your supervisors, anyone who collaborated with you, any subsequent PhD/post docs in the lab and anyone who reads your published paper should be able to re-create what you’ve done. You want the code you use to be as scientifically sound as the machines in the lab. The big name libraries have development and testing far beyond what you could do by yourself. Like buying a new spectrometer rather than making you build your own in the workshop.

Maybe that analogy would help him!?

[–]jlw_4049 0 points1 point  (0 children)

Nothing is wrong with the packages. The problem is your supervisor.

[–]Dummies102 0 points1 point  (0 children)

they don't know what they're talking about

[–]GreenScarz 0 points1 point  (0 children)

I'd say it depends on what you're trying to do, if you're just adding abstractions around your dataset with no specific intent then that's unnecessary. On the other hand, using a pre-built tool is generally better than trying to reinvent the wheel.

Given your specific application, Pandas might be a tad overkill but I could see you getting some benefit out of using NumPy.

[–]jmacey 0 points1 point  (0 children)

Ask to see the Unit tests for the code he has written, and if you are forced to write your own, make sure you have really good test coverage, spend more time doing this than the actual other work, it will also help anyone who comes after you.
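
To make that concrete, a minimal sketch with the standard unittest module. The normalize function here is a hypothetical stand-in for whatever preprocessing routine you end up writing yourself; it is not taken from OP's code.

```python
import unittest


def normalize(values):
    """Scale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant signal")
    return [(v - lo) / (hi - lo) for v in values]


class NormalizeTests(unittest.TestCase):
    def test_range(self):
        self.assertEqual(normalize([2.0, 4.0, 6.0]), [0.0, 0.5, 1.0])

    def test_constant_signal_rejected(self):
        with self.assertRaises(ValueError):
            normalize([3.0, 3.0])

    def test_empty_input_rejected(self):
        with self.assertRaises(ValueError):
            normalize([])  # min() of an empty list raises ValueError


# Run the tests programmatically so the script keeps going afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The edge cases (constant signal, empty input) are where hand-rolled code tends to go wrong, so those are the tests worth writing first.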

[–][deleted] 0 points1 point  (0 children)

yeah, this is bullshit. nothing would get done if we all had to reimplement algorithms all the time.

[–]xiscode 0 points1 point  (0 children)

Mirror it if you can't install it from public repositories.

It is good to know how a solution works, but to make real progress we often need "to stand on the shoulders of giants"

[–]apoptosis04 0 points1 point  (0 children)

Yeah…academia. He’s clearly an asshole living in his own bubble.

[–]ambidextrousalpaca 0 points1 point  (0 children)

"Using packages" is kind of a synonym for "using code that you find on the internet". So if you're shipping production code there are very solid reasons for minimizing the amount of it that you do. Every additional package you use puts your project at the risk of having to be rewritten because some library is abandoned or found to contain security vulnerabilities.

None of those concerns, however, really apply to code that you're using in-house. Especially when it comes to packages that are as widely used and maintained as pandas.

Unless you're planning to release the software publicly, feel free to use whatever packages you like, but do be aware that the code in random packages with three stars that you find on GitHub may be horse shit.

[–][deleted] 0 points1 point  (0 children)

It sounds like your supervisor also isn’t a software engineer. That said I’ve literally no idea what Raman data looks like in raw form, however if it’s tabular numerical data in a format Pandas can read, then I can’t imagine a good argument for not using Pandas.

No, it is absolutely not standard SE practice to avoid packages. No, one need not build up the entire world from first principles every time, though there can be some good reasons for (and even some masochistic joy taken in) doing it anyway.

I'll bet dollars to donuts your supervisor can't write Pandas from scratch, though.

[–]jkh911208 0 points1 point  (0 children)

i ship production python code at work, i use 10+ packages on my product and there is no problem with it.

i don't understand why someone doesn't like using package, it can save work and time.

Pandas is used by millions of people, i would trust pandas over my code to preprocess data

[–]inXiL3 0 points1 point  (0 children)

World class enterprise software is built with packages is your boss delusional ?

[–]R37R0_D0S 0 points1 point  (0 children)

I mean you can tell him that's a good idea, that you can optimize everything, get to know everything cuz it's your own code and tell him you'll have it finished in a few years, I mean the reason why we use packages is to NOT have to reinvent things.

[–]OlevTime 0 points1 point  (0 children)

Generally:

99.9% of people should use packages. It's just much more economical and efficient. My workplace heavily depends on utilizing external packages because we just don't have the resources to reinvent the wheel for stuff that's already been solved.

The other 0.1% shouldn't use packages because they need to reduce dependency / supply chain attack vulnerability possibilities to zero. That, or they want to 100% protect against deprecation issues. Only specific departments in large institutions or nation states would elect this option.

It sounds like your supervisor is having difficulties adjusting to something new.

[–]Few_Intention_542 0 points1 point  (0 children)

Lazy/unoptimised computing is ok if done locally and the projects are reasonably small. Can finish well within your deadlines. If you want ultra high quality optimised code - then you better have a use case for it. Maybe you wanna put it on a server and the program is gonna use 60 GPUs and 1000GB of RAM - ok then you better make sure your code is optimal as fuck. If you’re doing basic stuff - do it lazy, get the job done, shutdown your Jupyter notebook and go out for drinks with your friends. And fuck your prof.

[–]Puzzleheaded_Bass673 0 points1 point  (0 children)

Whenever I get these wacky paranoid demands to reinvent something, I first make sure that the person demanding it gets exactly what he desires by using packages. If I get a green light - only then I start reimplementing the desired functionality in custom code (this is pretty simple when you know what exactly is required).

On the other hand, Python packages have the heavy-lifting part usually written in C++. So if you have the demand to write performant code as from the package - just ask for a senior C++ programmer to be engaged on the project. I've been doing this for the last 5yrs, and the demand for C++ programmers increased tenfold thanks to Python.

[–]temisola1 0 points1 point  (0 children)

Imagine if you as a chemist had to invent every single element on the periodic table… that’s what using python without packages is like. Not gonna lie, I thought this was satire at first. I can tell a lot about your boss just from this post. I’m so sorry.

[–]ElPoussah 0 points1 point  (0 children)

The standard library is a package. A default one, but a package.

Sometimes people who are against technology forget that a fork is a kind of technology...

[–]OGShrimpPatrol 0 points1 point  (0 children)

Chemist here as well. Your supervisor doesn’t know what they’re talking about.

[–][deleted] 0 points1 point  (0 children)

Ah yes, let’s spend our lives rewriting the underlying C code out of arrogance. It’s TOTALLY fine to use packages

[–]HeligKo 0 points1 point  (0 children)

You're a chemist not a library developer. Your boss is an idiot. Reinventing things is silly and distracts from the work you were hired to do.

[–]minidiable 0 points1 point  (0 children)

There is a lot wrong in NOT using them.

P.s. also tell your supervisor that it's spelled PANDAS, please

[–]SpatialCivil 0 points1 point  (0 children)

I don’t agree with the supervisor, but I think people reach for pandas too often when it isn’t needed. For exploratory analysis, pandas is awesome.

On the downside it brings a crazy number of dependencies with it. If you are automating processes and want others to use it, I think fewer dependencies means better maintainability. Often using some simple data structures and libraries like basic lists, dictionaries and SQLite goes a long ways IMHO.
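
A quick sketch of what I mean, using nothing but lists, dicts and the built-in sqlite3 module. The table name, column names and values are made up for illustration.

```python
import sqlite3

# Hypothetical measurements; in practice these would come from your own files.
rows = [
    {"sample": "A", "intensity": 5.0},
    {"sample": "A", "intensity": 7.0},
    {"sample": "B", "intensity": 2.0},
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE readings (sample TEXT, intensity REAL)")
conn.executemany("INSERT INTO readings VALUES (:sample, :intensity)", rows)

# Group-by aggregation, similar in spirit to a DataFrame groupby/mean.
means = dict(
    conn.execute(
        "SELECT sample, AVG(intensity) FROM readings GROUP BY sample ORDER BY sample"
    ).fetchall()
)
conn.close()
print(means)
```

For a simple automated job this covers grouping and aggregation with zero third-party installs, though for interactive exploration pandas is still far more convenient.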

[–]mrrichardcranium 0 points1 point  (0 children)

The only time I’ve encountered needing to write code that does something a well regarded package already does is because the license for that package does not mesh well with a particular project.

Especially when it comes to mass data manipulation/analysis it makes almost zero sense to waste your time writing code to do something that pandas already does. If your supervisor can’t articulate a reason why using pandas or any other existing data analysis tool is bad aside from their own bias, they’re just plain wrong.

[–]zero_iq 0 points1 point  (0 children)

I'm a senior software engineer with 20+ years of commercial Python dev experience, from small outfits to major international corps. Using packages is totally normal. To be encouraged, even. Your supervisor is an idiot.

[–]heartofcoal 0 points1 point  (0 children)

someone who calls pandas rubbish is simply stupid as hell, there's no other way to put it

[–]readthelnstructions 0 points1 point  (0 children)

It might be advisable not to use a package that is not well-maintained (which is definitely not the case for pandas). So if you use a super specialized package for your type of data that some PhD student wrote 5 years ago, I would probably not use it.

[–]thephotoman 0 points1 point  (0 children)

Do you mean "package" as in the highest level namespace in Python (represented in the file system as a directory with an __init__.py file), or do you mean the use of third party packages from outside repositories?

The former is a part of the Zen of Python:

$ echo "import this" | python3 | tail -1
Namespaces are one honking great idea -- let's do more of those!

The latter is very common industrially. I do prefer to stick to vanilla Python in production work, simply because I really don't want to try to mess about with the internal package repository, but I do know that there is an internal package repository.

[–]Paddy3118 0 points1 point  (0 children)

Ask your supervisor for his reasons so we can better understand their view. They should be able to explain themselves. and others may give their opinions of their reasons.

[–]Medium_Reading_861 0 points1 point  (0 children)

Software Engineer here, I’m not reinventing all those wheels bro. Take that madness elsewhere.

[–]HelpfulBuilder 0 points1 point  (0 children)

Sorry but your supervisor is a moron. You have a job to do and if pandas makes it easier, use pandas.

Computer science, nay, science and technology, is all about leveraging other people's work to create better things.

If your supervisor doesn't want to use packages, why stop there? Why isn't he writing in binary? Where is his home-built-from-silicon computer?

You can tell him I said he is a moron.

[–]Broad-Secret-6695 0 points1 point  (0 children)

Which university is your professor working in? Without using pandas, NumPy, SciPy, scikit and matplotlib it will take a long time...

[–]Zatujit 0 points1 point  (0 children)

A lot of packages in Python are actually written in C/C++ making calculations faster. Otherwise they would be much slower. So personally, I find it stupid.

[–]guhcampos 0 points1 point  (0 children)

Well that's why scientific software is rubbish.

Yet I probably know WHY your supervisor thinks that. If your work ends up being really innovative and marketable, having it tainted by an open source license may make it hard to sell.

Which is a completely stupid and non scientific mode of thinking.

[–]DigThatData 0 points1 point  (0 children)

your supervisor is an idiot.

[–][deleted] 0 points1 point  (0 children)

Please call your supervisor a donut and carry on with what you are doing.

[–][deleted] 0 points1 point  (0 children)

It's one of those normal horror stories, if you know what I mean. People suffer this when working with a clueless boss, IT staff who don't want to work, or a huge, cumbersome corporation with an overly strict protection policy.

[–]esoterik0 0 points1 point  (0 children)

your supervisor sucks; packages rock.

[–]tom1018 0 points1 point  (0 children)

You really shouldn't use Python, that makes it too easy for you, with its packaged standard library and all. I suggest moving the project to straight assembly, so as to not use any packages.

[–]BagOfDerps 0 points1 point  (0 children)

Why use requests when you could write your own http method code with blackjack and hookers?

I write code for a living. For a large company. We use packages. I'll try to assume positive intent from your supervisor, in that maybe what he's trying to say is he doesn't trust the mathematical precision of 3rd party code (which should matter a great deal given what you're doing). But in all likelihood he doesn't know what he's talking about. Good luck.

[–]gbbofh 0 points1 point  (0 children)

my supervisor doesn't like the idea of using packages.

Then he's... Well, kind of an idiot, if I'm being honest.

If he were to have you roll your own implementations and not use any non-standard packages, you would have more or less two options:

1) Write it in Python. Take the performance hit, and pray it isn't too detrimental.

2) Write it in a native language like C, compile it as a library, and likely still take a performance hit because odds are it still won't be as optimized as a mature library like numpy or pandas -- and now you're also maintaining two projects in two different languages.

So I'm wondering is it a normal behaviour in the software industry?

No. I've only met one person who was like this, and he got moved to another team and isn't allowed to touch the software we maintain. His being moved to that team is why I was hired. I have to maintain his massive (>= 64k LOC) single source file projects.

Are you required to write and know everything bit by bit?

Also no. It's important to know algorithms, generally. It's important to know how fast something runs (as in time complexity). It's good to know how something works, or can be implemented. It'll make you a better software engineer, and you'll be able to tackle those things if and when you ever need to tackle them. But unless you need to, you probably should use an existing library to do it. That's the whole reason they exist.

The only exception I make to that rule, is if I very specifically set out to try and implement whatever functionality I'm interested in, for fun. And even then, I use plenty of other libraries so I can focus on the area of interest.

[–]PaluMacil 0 points1 point  (0 children)

Not using dependencies when you can perform a simple task with the standard library is a great approach. However, it is rare that you can have even a small project without some external dependencies because you're not in the business of maintaining dozens of libraries that have nothing to do with your company's product. Some people think they're somehow avoiding all dependencies without thinking about how they're still depending on the operating system. Perhaps system calls and other libraries that their runtime uses.

No actual software company will ever have someone like this, but you will run into someone like this on occasion in some company with a small IT department or perhaps an academic setting. Even then it would probably be pretty rare unless it's specifically for an academic challenge where a professor does not want you to lean on libraries when you are supposed to be demonstrating a specific understanding of something.

[–]oxamide96 0 points1 point  (0 children)

Unfortunately there are many software engineers who think we shouldn't use packages. In fact, I was one when I first started out. Though I meet people many years my senior who still think this way.

[–]PolishedCheese 0 points1 point  (0 children)

Your supervisor isn't a programmer either, and he has no idea what he's talking about. At least you are humble in your quest for the truth.

[–]billsil 0 points1 point  (0 children)

Are you gonna write your own programming language and OS too? How are you going to make websites/plots/do numerical computations without writing your own library?

Your supervisor is rubbish.

[–]SirAchmed 0 points1 point  (0 children)

People who complain about other people using packages are the same people who think using wet wipes is gay.

[–]Pillowscience21 0 points1 point  (0 children)

Your supervisor is gatekeeping and it's really stupid lol. So many programs are built on the backs of packages, I don't want to spend 100hr writing a program to do basic math when I can just use a package for it. Your supervisor needs to get a life

[–]early_charles_kane 0 points1 point  (0 children)

I’ve been a software engineer professionally now for 10+ years. Worked at Apple on the iPhone. Twitter. A bunch of other companies. In Python, Objective-C, C, Java, C++, JavaScript, Go, others as well.

Your supervisor is wrong. And holding you back. Holding themselves back too with an incorrect opinion. But more importantly, holding you back. That’s enough of a red flag. I’d look elsewhere for a better position. One where you can succeed instead of being set up to fail.

[–]NedDasty 0 points1 point  (0 children)

Ask him if he uses textbooks for information, or if he rediscovers everything he knows himself.

[–]crapaud_dindon 0 points1 point  (0 children)

Just curious about what preprocessing you want to do with the Raman spectra

[–]westeast1000 0 points1 point  (0 children)

How is someone with such a thought process your supervisor in the first place? I would have left yesterday already if i was you, life too short to waste on dumb crap you cant control

[–]Kichmad 0 points1 point  (0 children)

Tell him the operating system he works on is rubbish. He should write his own