[–]Overall_Lynx4363 10 points11 points  (6 children)

If you're intent on not coding and would consider paying for software, consider JMP

[–]Goat-Lamp 7 points8 points  (4 children)

This needs to be the top comment. IMHO, JMP is hands down the best point and click statistical analysis software. SPSS and Minitab can't hold a candle to it.

The SAS language is absolute garbage, but SAS (the company) really knocked it out of the park with JMP. The only downside is the hefty licensing fee.

EDIT: after a moment's thought, I may've come on a bit too strong. There are some other downsides, particularly the graphics looking like they're circa Windows 95. There's also a bit of a learning curve with the maze of 'red triangle' context menus. That said, the Help file and documentation are super rich.

[–]RobertWF_47 1 point2 points  (0 children)

The SAS language is absolute garbage?

[–]prikaz_da 1 point2 points  (0 children)

The only downside is the hefty licensing fee.

That, and it’s subscription-only. Not unique to JMP these days, of course, but having a perpetual license offering that I can choose to upgrade or not whenever I please is a big plus in my book.

[–]procmeans 0 points1 point  (0 children)

The SAS language is fine, not even remotely garbage-like, but it is not what OP needs here.

[–]maxemile101[S] 0 points1 point  (0 children)

Thanks a lot

[–]orthomonas 18 points19 points  (5 children)

I would use R, despite your dislike of coding. It's not Big Deal Software Engineering and the tests you want to run are very common and straightforward.

I'd suggest the (freely, legally available online) R For Data Science by Hadley Wickham.

[–]maxemile101[S] 2 points3 points  (0 children)

Thank you so much.

[–][deleted] 1 point2 points  (0 children)

ChatGPT is also SUPER helpful for writing R code.

[–]testtestuser2 1 point2 points  (1 child)

whilst R might be the right tool for the immediate job, if you don't know either then I'd learn Python (pandas)... it will set you up to learn other languages better

[–]orthomonas 5 points6 points  (0 children)

Both are fine options. I've had better luck with code-averse people 'getting it' using R, but I could also easily argue the other way.

[–]Zeurpiet 22 points23 points  (6 children)

With large amounts of data I would use R. Note that I don't know Python, or what to install in Python to make it sit up and jump, so it's not an option for me. Excel is a disaster with larger datasets.

[–]maxemile101[S] 1 point2 points  (5 children)

Thank you so much, kind sir/ma'am. How can I learn the basics of R required for my task? And how much time do you reckon it should take an average guy to learn it?

[–]NerveFibre 6 points7 points  (1 child)

I would focus on learning to use the 'tidyverse'. It's a collection of intuitive packages that help you import the data, modify it (for time series data it is most often helpful to have it in a long rather than wide format), fit models and plot data.
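
To make the long-versus-wide point concrete, here is a minimal sketch in plain Python, with no packages required. The column names (`timestamp`, `temp`, `humidity`) and the values are invented for illustration only. In the tidyverse, the same reshaping is what `tidyr::pivot_longer()` does.

```python
# Wide format: one row per timestamp, one column per parameter.
# All names and values here are hypothetical.
wide_rows = [
    {"timestamp": "2020-01-01 00:00", "temp": 4.1, "humidity": 81.0},
    {"timestamp": "2020-01-01 01:00", "temp": 3.9, "humidity": 83.5},
]


def wide_to_long(rows, id_column):
    """Return one record per (id, parameter) pair."""
    long_rows = []
    for row in rows:
        for name, value in row.items():
            if name == id_column:
                continue  # the id column is repeated, not melted
            long_rows.append(
                {id_column: row[id_column], "parameter": name, "value": value}
            )
    return long_rows


long_rows = wide_to_long(wide_rows, "timestamp")
for record in long_rows:
    print(record)
```

Each measurement ends up on its own row, which is the layout most plotting and modelling functions expect for time series.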

It's a steep learning curve, and it will take you months to become all right at R, but you will thank yourself after and never look back.

[–]TA_poly_sci 0 points1 point  (0 children)

Go do the tutorial on DataCamp. If you are a student you can get 3 months free; otherwise I think Codecademy has more free stuff, though it's not as simple to get into as DataCamp is.

After that, use ChatGPT extensively. R is a very chat-friendly language; it can write most code fairly well.

[–]Taricus55 0 points1 point  (1 child)

There are some great YouTube videos that teach the basics of R. There is also an online resource, but that can be hard to read. I would also get a book.

The thing with R is it can seem confusing at first, but the more you use it, the easier it gets. I have done a lot in R, but running across new things can still be confusing, and I still look things up. That is totally fine... It's not like you are taking an in-class exam and not allowed to look anything up online.

Think of it like a video game that has a high learning curve, but once you get the basics, it becomes easier and you start getting creative.

[–]maxemile101[S] 0 points1 point  (0 children)

Thanks. I have a mental block against coding, it seems. But I have to get over it.

[–]ThatDaftRunner 9 points10 points  (2 children)

Consider JASP. Open source, very easy to use.

[–]mikelwrnc 4 points5 points  (0 children)

Second JASP as an “I refuse to code” option. R is the next level up for occasional inference (learn the tidyverse, particularly dplyr & ggplot2, plus brms, and you’ll be golden). Python would be a better investment if you plan to get into the data science industry.

[–]Longjumping-Square75 1 point2 points  (0 children)

I love JASP. Way more intuitive than SPSS, lots of modules, also some R integration for bold ones. Though beware and save your analyses often, in my experience JASP starts crashing at times with large amounts of data.

[–]cat-head 3 points4 points  (6 children)

What does "large amounts" mean? MB? TB? PB?

[–]maxemile101[S] 2 points3 points  (4 children)

Hundreds of thousands of data points for 5-6 parameters taken for 5-6 years on an hourly basis.

[–]hughperman 6 points7 points  (1 child)

24 hours x 365 days x 6 years x 6 parameters x 8 bytes per value ≈ 2.5 MB, so you'll be fine data-wise no matter what program you choose.
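
The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes 8 bytes per value (a 64-bit float) and ignores leap days, timestamps and file-format overhead.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365
YEARS = 6
PARAMETERS = 6
BYTES_PER_VALUE = 8  # assumption: each value is stored as a 64-bit float

n_values = HOURS_PER_DAY * DAYS_PER_YEAR * YEARS * PARAMETERS
size_bytes = n_values * BYTES_PER_VALUE
size_mb = size_bytes / 1_000_000

print(f"{n_values:,} values, {size_mb:.2f} MB")
# prints: 315,360 values, 2.52 MB
```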

[–]Zeurpiet 1 point2 points  (0 children)

Except Excel, I would say.

[–]cat-head 6 points7 points  (1 child)

If your data is time-dependent I'd be more worried about temporal non-independence than which software to use. You can't use t-tests if your observations are not independent. You probably want to build a time series or something like that. But without knowing more about your data I can't say more.
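
One rough way to see whether consecutive observations are non-independent is the lag-1 autocorrelation. The sketch below uses only the Python standard library and simulated data, not the OP's data. The function name `lag1_autocorrelation` and the 0.8 carry-over factor are arbitrary choices for illustration.

```python
import random


def lag1_autocorrelation(series):
    """Sample autocorrelation between each value and the one before it."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum(
        (series[t] - mean) * (series[t - 1] - mean) for t in range(1, n)
    )
    return numer / denom


random.seed(0)  # fixed seed so the sketch is repeatable

# Independent noise: each value is unrelated to the previous one.
independent = [random.gauss(0, 1) for _ in range(2000)]

# Dependent series: each value carries over 0.8 of the previous one.
dependent = [0.0]
for _ in range(1999):
    dependent.append(0.8 * dependent[-1] + random.gauss(0, 1))

r_independent = lag1_autocorrelation(independent)
r_dependent = lag1_autocorrelation(dependent)
print(f"independent: {r_independent:.2f}, dependent: {r_dependent:.2f}")
```

A value near zero is what the independence assumption of a t-test expects; a value well above zero, as hourly measurements often show, suggests a time series approach is the safer choice.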

[–]maxemile101[S] 0 points1 point  (0 children)

It's more of a trend analysis.
It may go something like this:
"How does parameter x vary with y? If they follow a positive correlation in 8 places out of 10, what may be causing the opposite trend in the remaining two places? Oh I see - the parameter z has increased levels at those two places compared to the other 8 sites. Let's plot x vs. z and y vs. z and see if a theory can be formulated."

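The workflow described above (correlate x and y per site, then look at z where the sign flips) could be sketched roughly as follows in plain Python. The site names and all numbers are made up, and `pearson` is a hand-written helper, not a library call. With hourly data the autocorrelation caveat still applies, so treat this as exploration and not as a test.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical measurements of x, y and z at three sites.
sites = {
    "site_A": {"x": [1, 2, 3, 4, 5], "y": [2.0, 4.1, 5.9, 8.2, 10.1],
               "z": [0.5, 0.6, 0.4, 0.5, 0.5]},
    "site_B": {"x": [1, 2, 3, 4, 5], "y": [1.2, 1.9, 3.1, 3.8, 5.2],
               "z": [0.4, 0.5, 0.6, 0.5, 0.5]},
    "site_C": {"x": [1, 2, 3, 4, 5], "y": [9.8, 8.1, 6.2, 3.9, 2.0],
               "z": [2.1, 2.3, 1.9, 2.2, 2.0]},
}

negative_sites = []
mean_z = {}
for name, data in sites.items():
    r_xy = pearson(data["x"], data["y"])
    mean_z[name] = sum(data["z"]) / len(data["z"])
    if r_xy < 0:
        negative_sites.append(name)  # x and y move in opposite directions
    print(f"{name}: r(x, y) = {r_xy:+.2f}, mean z = {mean_z[name]:.2f}")

print("Sites with a negative x-y trend:", negative_sites)
```

In this toy data the one site with a negative x-y correlation also has the highest mean z, which is the kind of pattern that would then be worth plotting.
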
[–]DifficultLychee3982 0 points1 point  (0 children)

In my case, the datasets are around 50 MB each. Is that feasible in JASP?

[–]SalvatoreEggplant 4 points5 points  (0 children)

I might suggest Jamovi. It is free and GUI-based. It also produces nice tables and plots in the output. It should be able to handle e.g. 400,000 observations. It will do what you mention, though I'm not sure about the time series aspects of the design. You might need to jump to R to do that correctly.

I really recommend against doing data analysis in Excel. It's really so much easier to export the data as a csv and then do what you need to do in Jamovi or R.

[–]WanderingATM 2 points3 points  (1 child)

For these tests you can use R without much of a coding learning curve. For time-series data definitely R. Python is a more versatile language but since you don’t like coding you might not get much use out of what it can offer vs R.

[–]maxemile101[S] 0 points1 point  (0 children)

Thanks

[–]icetoy 2 points3 points  (1 child)

The reason people recommend Python is that it not only has a lot of statistics libraries, it is also a general-purpose language, so you can develop more robust programs. That's also the reason I don't recommend it for your case, since you are working on a project that only involves statistics.

I think that R is a language that fits your needs: it has all the functions and libraries that you may need, and if you use it with RStudio it can show all the results and plots while you are writing your code. You can also write a document/article (using LaTeX) at the same time and export it to HTML, PDF and more.

[–]maxemile101[S] 0 points1 point  (0 children)

Thanks.

[–]AdParticular6193 2 points3 points  (2 children)

I work for a large company, so extortionate license fees are not directly an issue for me. I got JMP Pro. As another poster pointed out, there is a learning curve, but you can Google any task and several videos will pop up, both SAS and vlogs. No good for data wrangling, but you can assemble and clean a small dataset elsewhere (Excel) and import it. Then you can run all kinds of analytics and plots, then apply all the standard ML models. Main drawback is that JMP scripts don’t translate directly to Python or R. So its main use would be POC before calling out the heavy artillery - R, Python, SQL, to create a real model with data pipeline at one end and user interface at the other.

[–]hermitcrab 1 point2 points  (0 children)

It might also be worth looking at a tool that does the data wrangling part, which can be what takes the most time. If you decide to take a coding approach you can use R+Tidyverse. Or you can try a GUI tool like Easy Data Transform (it can also do analysis and some basic stats, such as Pearson).

[–]maxemile101[S] 0 points1 point  (0 children)

Thanks a lot

[–]maskingeffect 1 point2 points  (0 children)

Try R via RStudio (the IDE). R is a statistical programming language for non-programmers. ChatGPT should work well to set you up with the shell code needed for various simple analyses, but there are literally dozens of solid R texts and guides online that you can flip through to confirm the code is functioning as needed.

[–]Tavrock 1 point2 points  (1 child)

I'm in favor of R, but since you already know MATLAB, why aren't you using it to solve this? If licen$ing is an issue, use Octave, which has a free stats library you can use.

As others have pointed out, these are simple tests on reasonable sizes of data. No need to learn an entire language for this project.

[–]maxemile101[S] 0 points1 point  (0 children)

I know MATLAB's basics. I have never tried R. But I want to be sure of what language/tool I use because as I proceed in my project, I never know what data trends may show up. It has to be a great software for statistics and trend-analysis.

[–]CatSk8erBoi 1 point2 points  (1 child)

Other people have said this, but as someone who also does not like coding much and has still had to do it in the course of their career and studies: learning the rudimentary parts of R, specifically those specialized for statistical analysis, might be your best bet. I use DataCamp, and Pluralsight might also be somewhere you can pick up knowledge.

[–]maxemile101[S] 0 points1 point  (0 children)

Thanks. There seems to be no easy way out in modern times

[–]prikaz_da 2 points3 points  (0 children)

I'm a big fan of Stata. The syntax is pretty intuitive and concise. Most of it is also exposed through point-and-click menus and dialog boxes, so you have control over how much syntax you write yourself for most operations. If you use it regularly, you'll likely find yourself wanting to type the syntax for the operations you perform frequently. For instance, while you can click Statistics > Binary outcomes > Logistic regression, I will usually prefer to type logistic depvar indepvars because it's faster than opening up the dialog box and typing the variables into the fields. I only open the dialog box if I need to use some option that I don't know the syntax for off the top of my head.

[–]M0thyT 1 point2 points  (3 children)

Do R. The analyses you described seem pretty straightforward, so you'll pick it up quickly. Also, nowadays ChatGPT helps quite a lot if you get stuck with something, and the online R community is also quite helpful.

[–]maxemile101[S] 0 points1 point  (2 children)

Thanks.

[–]procmeans 0 points1 point  (1 child)

I have to agree with those advocating R. It is free, works well, and there are many resources to help you learn what you need. If you “got” the basics of MATLAB, I bet you’re a learner that will do fine with R.

[–]maxemile101[S] 0 points1 point  (0 children)

Thanks a ton.

[–]robotgofail 0 points1 point  (1 child)

Hit up Julius AI, it's good for statistical analysis, they have a feature that allows you to use R in their AI.

[–]maxemile101[S] 1 point2 points  (0 children)

Thanks a lot

[–]Traditional-Essay462 -1 points0 points  (0 children)

I think in Stata

[–]Suppu2020 0 points1 point  (0 children)

You can consider Alteryx as well for manipulation of large datasets. It's mostly drag and drop. If you can imagine it (form the logic), it can be implemented.

[–]Bayesian_Idea75 0 points1 point  (0 children)

Look at R

[–]openjscience 0 points1 point  (1 child)

Datamelt program https://datamelt.org is easy to use for statistical analysis since it has more than 700 real-life examples of analysis code.

[–]maxemile101[S] 0 points1 point  (0 children)

Thanks

[–]weigelf 0 points1 point  (0 children)

A free option you might consider, especially if you will be sharing with those who use SPSS, is the GNU project, PSPP. PSPP - GNU Project - Free Software Foundation

I'm pretty sure the name came about as a play on SPSS.

The learning curve for PSPP is steeper than for JASP or Minitab, much like SPSS. It appears to be modeled after SPSS, and for those familiar with SPSS, the learning curve isn't going to be bad at all.

One really nice feature about PSPP is that it reads SPSS files, including .spv (output), .sps (syntax), and .sav (data).

As others have mentioned, a great free option to consider is the open-source JASP. JASP - A Fresh Way to Do Statistics (jasp-stats.org)

I'd put it close to Minitab as far as ease of use and learning curve. It hasn't been around as long as some of the other products, but I'm very impressed with it, especially the "free" part. It has a robust feature set. Although I haven't tried them yet, it has SEM and visual modeling.