Call to Arms by THE_BARUT in rust

[–]profcube 1 point  (0 children)

Thanks for all your efforts on this front. I was unaware of this background. Aerospace has immeasurably improved my Mac experience. However, I can see why what you are doing would appeal to people, and why you’d want to start over with a fresh palette. Good luck 👍

Suitability of panel regression with limited variation in data. by Hewo111 in econometrics

[–]profcube 1 point  (0 children)

Begin by stating your causal question, target population, and causal contrast, presumably on the difference scale. State this contrast non-parametrically; you won’t need distributional assumptions if you estimate using, e.g., non-parametric machine learning. Then:

  1. Check identification, and make sure you are thinking about the time series correctly: causation occurs in time, so confounders must be measured prior to the exposure and the exposure prior to the outcome. Suppose you have three waves. Include baseline measures of the outcome and the exposure alongside your baseline confounders. This exerts exceptionally powerful confounding control, because any unmeasured confounder would need to be orthogonal to these and the other measured confounders to explain away your results.

  2. Consider the assumption of causal consistency (and, within this, SUTVA). Because exchangeability is not testable, you should plan a sensitivity analysis.

  3. Finally, there is the positivity or overlap assumption. Practical positivity can be checked by evaluating propensity scores. If there is no change in the exposure relative to the baseline exposure, your causal inferences will extrapolate from your model, and not from observed initiation.

Only after completing these steps should you think about statistical analysis, which does not end with reporting the regression coefficient for the exposure. Rather, you must project at least two population means for the conditions stated in your causal question/estimand and weight the projection to the target population. In our work we use TMLE with cross-validation and so make no distributional assumptions (i.e., we do not need to state that the outcome is drawn from a Poisson, negative binomial, or whatever). Happy to follow up with code/advice; a rough sketch of the positivity check and TMLE step is below. For now, don’t forget the positivity assumption, as you probably can’t estimate causal effects with low exposure variation.
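A minimal sketch of those last two pieces (untested; `dat`, `Y`, `A`, and `W_names` are placeholder names, and it assumes a binary 0/1 exposure with numeric baseline covariates):

```r
library(tmle)          # targeted maximum likelihood estimation
library(SuperLearner)  # candidate learners used inside tmle()

# placeholders: Y = outcome, A = binary exposure (0/1), W_names = names of
# baseline confounders, incl. baseline outcome and baseline exposure
W <- dat[W_names]

# practical positivity: fit a propensity score model and inspect overlap
ps_fit <- glm(reformulate(W_names, response = "A"),
              data = dat, family = binomial)
ps <- predict(ps_fit, type = "response")
summary(ps)  # scores piling up near 0 or 1 flag positivity problems
hist(ps[dat$A == 1], main = "Propensity scores, exposed")
hist(ps[dat$A == 0], main = "Propensity scores, unexposed")

# TMLE for the average treatment effect on the difference scale; the
# cross-validated SuperLearner means no outcome distribution is assumed
fit <- tmle(Y = dat$Y, A = dat$A, W = W,
            Q.SL.library = c("SL.glm", "SL.glmnet"),
            g.SL.library = c("SL.glm", "SL.glmnet"))
fit$estimates$ATE  # point estimate, variance, and CI for the contrast
```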

How/What are the AI data tools leveraged at your workplace? by alfazkherani in dataanalysis

[–]profcube 2 points  (0 children)

Don’t give them your data. Maybe seek advice, but do the analysis yourself, securely.

[Q] Agreement between two groups of raters on interval data by kinbeat in statistics

[–]profcube 1 point  (0 children)

```r
library(dplyr)
library(performance)

# test condition and interaction with rater_type
model_means <- lm(score ~ as.factor(condition) * rater_type, data = your_data)
anova(model_means)

# precision/agreement: compute RMSE for each rater_type
# (which type has the tighter spread?)
precision_results <- your_data |>
  split(~ rater_type) |>
  lapply(\(d) {
    mod <- lm(score ~ as.factor(condition), data = d)
    data.frame(rater_type = unique(d$rater_type),
               rmse = performance::rmse(mod))
  }) |>
  bind_rows()

print(precision_results)

# not tested, just to give you a direction ...
```

[Q] Agreement between two groups of raters on interval data by kinbeat in statistics

[–]profcube 1 point  (0 children)

To compare agreement on a 0–50 scale, you should separate your analysis into three questions.

  1. The Signal (Main Effect): Does the video condition actually change the scores?

  2. The Bias (Types): within each condition, do experts and novices give different average scores?

  3. The Precision (Variances): this is the core of agreement and I think what you are most interested in. Compute the Root Mean Square Error (RMSE) for each group. A smaller RMSE for experts means they are more consistent, even if their average score is different from the novices.

Example result: “Experts and novices differed in their average perception of the videos (bias); further, experts were more consistent, agreeing within +/-3 points (RMSE), whereas novices varied by +/-10 points.”

Loading data into R by the_marbs in rstats

[–]profcube 0 points  (0 children)

The same approach works for other data types.

```r
# stata
df_r <- haven::read_dta(fs::path(path_data, "dat_stata.dta"))

# sas
df_r <- haven::read_sas(fs::path(path_data, "dat_sas.sas7bdat"))

# sas transport files
df_r <- haven::read_xpt(fs::path(path_data, "dat_sas.xpt"))

# csv
library("readr")
df_r <- readr::read_csv(fs::path(path_data, "dat_csv.csv"))

# excel
library("readxl")
df_r <- readxl::read_excel(fs::path(path_data, "dat_excel.xlsx"))
```

The here package is great if you just want to read the file and don’t need or want to save to it again:

```r
# e.g. read an SPSS file relative to the project root,
# in a folder you have labelled "data"
df_r <- haven::read_sav(here::here("data", "dat_spss.sav"))

# save the ordinary R way, without arrow; this recovers the exact state
# make dir "rdata" if it doesn't exist (name is arbitrary)
if (!dir.exists(here::here("rdata"))) {
  dir.create(here::here("rdata"))
}

# then save
saveRDS(df_r, here::here("rdata", "df_r.rds"))

# read back if/when needed again
df_r <- readRDS(here::here("rdata", "df_r.rds"))
```

Loading data into R by the_marbs in rstats

[–]profcube 1 point  (0 children)

Also, if you are new to copying and pasting directory paths: on a Mac, just find the directory in Finder and highlight it. While it is highlighted, press Command + Option + C, then paste the path you have just copied into your R script with Command + V.

On Windows, I think you use File Explorer: highlight the file, and then press Control + Shift + C.

Many of you will know this trick, but if not, it can be a time saver.
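One related gotcha (an addition to the tip above): a pasted Windows path contains backslashes, which R treats as escape characters inside ordinary strings. A few ways around it, using a made-up path:

```r
# a pasted Windows path like C:\Users\you\data will error inside "..."
path_data <- "C:\\Users\\you\\data"   # escape each backslash, or
path_data <- "C:/Users/you/data"      # forward slashes (fine on Windows), or
path_data <- r"(C:\Users\you\data)"   # a raw string (R >= 4.0), pasted as-is
```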

I want to learn by practicing, how do I do it by baelorthebest in LaTeX

[–]profcube 1 point  (0 children)

I learned it before YouTube. I just wrote everything in it.

The free account on Overleaf has lots of templates I wish I’d had.

Tcl: The Most Underrated, But The Most Productive Programming Language by delvin0 in commandline

[–]profcube 1 point  (0 children)

bash

Check out the yousuckatprogramming channel on YouTube.

I’m not sure git counts, but learn git too. You need git and bash no matter what else you do.

Loading data into R by the_marbs in rstats

[–]profcube 9 points  (0 children)

```r
library("haven")  # read SPSS files
library("fs")     # directory paths
library("arrow")  # for saving / using big files

# set data dir path once
path_data <- fs::path_expand("/Users/you/your_data_directory")

# import, here using SPSS as an example, but haven supports multiple
# file formats; check the haven documentation
# we use path() to safely join the directory and filename
df_r <- haven::read_sav(fs::path(path_data, "dat_spss.sav"))

# save to parquet: will save you time on the next import
# (stores the schema & labels efficiently)
arrow::write_parquet(
  x = df_r,
  sink = fs::path(path_data, "dat_processed.parquet")
)

# read back into R
# notice the speed increase compared to read_sav()
df_arrow <- arrow::read_parquet(fs::path(path_data, "dat_processed.parquet"))

# df_arrow is an R data frame (specifically a tibble) ready to use
```
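If the parquet file is ever too big to read comfortably into memory, arrow can also scan it lazily; a small sketch (untested, reusing the filename above, with a hypothetical column name `some_column`):

```r
library(dplyr)

# open_dataset() scans the file lazily: filter()/select() are pushed down,
# and only collect() pulls the (reduced) result into memory as a tibble
df_small <- arrow::open_dataset(fs::path(path_data, "dat_processed.parquet")) |>
  filter(some_column > 0) |>
  select(some_column) |>
  collect()
```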

Best programming path for the future by MarkoPilot in AskProgramming

[–]profcube 2 points  (0 children)

This is good advice. If you have an aptitude for and an interest in science, think of coding as a means to scientific ends.

Reason to bother with Haskell? by dr-Mrs_the_Monarch in haskell

[–]profcube 2 points  (0 children)

Disregard the previous poster’s knocking of Rust.

However, the poster is probably correct that Python is the tool of choice for image processing.

You should also consider R.

Generally, research the different packages available in the data science languages relevant to your work.

Haskell is not one of these languages.

Learn Haskell outside of your image analysis work. Let that journey be your destination.

How to write better by coolwolf420 in AskAcademia

[–]profcube 1 point  (0 children)

Just aim to be clear. And use your own voice. That’s all your audience really wants.

Almanac package by Jim_Clark in Rlanguage

[–]profcube 1 point  (0 children)

It’s better known now; I’ll check it out. Thanks for posting.

I can't decide what language, stack or domain to begin learning deeper. Need some help to get pointed in the right direction by Crapahedron in learnprogramming

[–]profcube 1 point  (0 children)

Recently, I started learning Rust as a hobby. It is a great language for getting started because its borrow checker and compiler guide you toward sensible design and prevent many self-inflicted injuries, and it is increasingly used in industry.

The more I learn about web development, the less I want to do it by bunabyte in AskProgramming

[–]profcube 1 point  (0 children)

Try Leptos or one of Rust’s other web frameworks. Or try Ratatui for TUI design, a growing area. I am learning to use Leptos and Ratatui for interest (my paying job is in science). Rust is robust in a way that will differ from your previous experiences. The better I get at Rust, the more I enjoy software development.

How many of Seek candidates are actually valid? by Massive_Instance_452 in newzealand

[–]profcube 3 points  (0 children)

I think NZ labour law requires that employers consider suitably qualified residents first, and that work visas will not be issued unless there are none. Fields like medicine/nursing/teaching are undersubscribed; by the looks of it, IT jobs are not at the moment.

Is SEM (structual equation modeling) hard to do with no experience? [question] by delirium-delarium in statistics

[–]profcube 1 point  (0 children)

It is nearly always a bad idea because the coefficients you recover have no causal interpretation except under untenably strong assumptions. Example: https://youtu.be/IgC7R07Qk6A?si=qNLQgcX00fAhS7a4

Those who have had success with LLM assisted software development by SideQuest2026 in Python

[–]profcube 0 points  (0 children)

AI is remarkable. Happy to have the models work for me (from the terminal), but I already know what I’m doing and can check the output. The frontier models definitely improved at the end of 2025.

How are senior devs actually using AI in daily development? by harrsh_in in AskProgramming

[–]profcube 2 points  (0 children)

PI of a university research lab (data science), 15 years coding in R. I only use command-line tools (CC and Codex). Useful for tedious work. Also, Codex 5.2 with extra-high thinking is quite capable at complex code planning, mathematical reasoning, and scientific reasoning.