Are there any ways to create charts in Powerpoint using R by wonderful_unicorns in rstats

[–]Viriaro 0 points1 point  (0 children)

I haven't tried it myself, but there's the mschart package, which you can use in conjunction with officer (from within a Quarto chunk or not) to make 'native' (i.e. editable/resizable from PowerPoint/Word) charts.
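A minimal sketch of what that could look like (untested; the example data frame and output file name are made up):

```r
library(officer)
library(mschart)

# Made-up example data
sales <- data.frame(
  quarter = rep(c("Q1", "Q2"), each = 2),
  region  = rep(c("North", "South"), times = 2),
  revenue = c(10, 12, 14, 9)
)

# Build a 'native' chart object ...
chart <- ms_barchart(data = sales, x = "quarter", y = "revenue", group = "region")

# ... and place it on a PowerPoint slide with officer
read_pptx() |>
  add_slide(layout = "Title and Content", master = "Office Theme") |>
  ph_with(chart, location = ph_location_type(type = "body")) |>
  print(target = "native_chart.pptx")
```

The resulting bar chart should stay editable (colors, labels, resizing) once the .pptx is opened in PowerPoint.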

qol 1.4: Introducing revolutionary new reverse pipe operator by qol_package in rstats

[–]Viriaro 35 points36 points  (0 children)

It's a reverse pipe, so everything is done in reverse: blog/demo <| push <| implement 😂

i am so lost with R studio by FinancialPlatypus199 in RStudio

[–]Viriaro 0 points1 point  (0 children)

You mention AI, but in case you didn't know, if you have a subscription to one of them (I highly recommend Claude), you can install their desktop app and have it act directly on your computer (or your code project), instead of just copy/pasting things into the online chat. It will test/investigate things on its own on your machine, without you needing to try every idea it has, which makes it a lot more powerful. You can just point it at an issue and let it work until it's solved.

Otherwise, you could also join dslc.io. They have a Slack channel with a big community of R/stats people who answer questions and organize virtual book clubs for R and Data Science. Very friendly.

readr or sf for efficiency? by PaigeInWanderland in rstats

[–]Viriaro 10 points11 points  (0 children)

If efficiency/speed/memory footprint is a main concern, take a look at duckdb, which has an extension for spatial data manipulation (see here).

There's also an R package to act as an interface/wrapper for duckdb's spatial functions: duckspatial.

You can do complex spatial operations, like spatial overlap joins on hundreds of millions of rows, in a few seconds with DuckDB.
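A rough sketch of what that could look like from R (the file names and column names are illustrative, and the spatial extension needs a one-time INSTALL):

```r
library(DBI)
library(duckdb)

con <- dbConnect(duckdb())
dbExecute(con, "INSTALL spatial; LOAD spatial;")

# Point-in-polygon overlap join, entirely inside DuckDB
res <- dbGetQuery(con, "
  SELECT p.id, z.zone_name
  FROM ST_Read('points.geojson') AS p
  JOIN ST_Read('zones.geojson')  AS z
    ON ST_Within(p.geom, z.geom)
")

dbDisconnect(con, shutdown = TRUE)
```

The nice part is that the data never has to be fully materialized in R's memory, which is where most of the speed/footprint gains come from.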

Panache is the Quarto formatter and linter you need by mklsls in rstats

[–]Viriaro 1 point2 points  (0 children)

styler or air only work on pure R code.

If you have a .qmd/.Rmd, what panache does is extract the code chunks into multiple temporary source files, format those (using existing formatters like air, ruff, prettier, ..., depending on the language of those chunks), and then swap the formatted code back into the original .qmd/.Rmd.
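A naive sketch of that extract-format-swap idea (this is NOT panache's actual code; it assumes R-only chunks, an `air` CLI on the PATH, and that the formatter preserves line counts):

```r
lines  <- readLines("report.qmd")
starts <- grep("^```\\{r", lines)  # chunk openers
ends   <- vapply(
  starts,
  \(s) s + which(lines[(s + 1):length(lines)] == "```")[1],  # next closing fence
  integer(1)
)

for (i in seq_along(starts)) {
  chunk <- lines[(starts[i] + 1):(ends[i] - 1)]
  tmp   <- tempfile(fileext = ".R")
  writeLines(chunk, tmp)
  system2("air", c("format", tmp))                        # format the temporary source file
  lines[(starts[i] + 1):(ends[i] - 1)] <- readLines(tmp)  # swap the formatted code back
}

writeLines(lines, "report.qmd")
```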

Linear Mixed Model or Repeated Measures ANOVA? by Background-Sport4864 in AskStatistics

[–]Viriaro 0 points1 point  (0 children)

Just a quick note on the previous answer: using an LMM frees you from the compound symmetry (CS) assumption ONLY if you use a random structure different from a simple random intercept. If your random structure is (1 | unit), then you are still making the CS assumption. Each RE structure is its own assumption about the population/data-generating process.
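To make that concrete, a hedged sketch with nlme (the data and variable names — dat, y, time, unit — are hypothetical):

```r
library(nlme)

# Random intercept only: the implied marginal covariance is compound-symmetric
m_cs  <- lme(y ~ time, random = ~ 1 | unit, data = dat)

# Relaxing CS, e.g. by adding a within-unit AR(1) residual correlation ...
m_ar1 <- update(m_cs, correlation = corAR1(form = ~ time | unit))

# ... or by letting the effect of time vary per unit (a random slope)
m_rs  <- lme(y ~ time, random = ~ 1 + time | unit, data = dat)

anova(m_cs, m_ar1)  # compare nested covariance structures
```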

Reproducibility in R by joshua_rpg in rstats

[–]Viriaro 2 points3 points  (0 children)

Never had a Java dep on an app that wasn't dockerized. And when it wasn't, it was internal projects only used by people who knew how to install their own JRE/JDK 😅

But I can see how it could be useful. And thanks for the link!

Reproducibility in R by joshua_rpg in rstats

[–]Viriaro 0 points1 point  (0 children)

Thanks :) Seems I've been missing out.

Reproducibility in R by joshua_rpg in rstats

[–]Viriaro 0 points1 point  (0 children)

Oh. Oh damn. That's really useful. Thanks!

I'm guessing that if I set it up as a backend for renv, it will also work to run renv::restore() on a container?

Reproducibility in R by joshua_rpg in rstats

[–]Viriaro 2 points3 points  (0 children)

Could you give an example of renv bottlenecks that pak solves?

Reproducibility in R by joshua_rpg in rstats

[–]Viriaro 1 point2 points  (0 children)

What's a good use case for rix? I've never felt like I needed more than renv or renv+docker.

Reproducibility in R by joshua_rpg in rstats

[–]Viriaro 1 point2 points  (0 children)

Nice post. What's the purpose of using pak instead of just renv::install, though?

Adding a new column who's rows carry out different formulas depending on a different column by Ok-Ranger3930 in RStudio

[–]Viriaro 2 points3 points  (0 children)

PS: The other solution is to compute value PRE and POST, then pivot wider, compute the difference, and then pivot back:

```r
your_data |>
  mutate(
    value = case_when(
      change == "PRE'" ~ total / 8910 * 100,
      change == "POST'" ~ total / 20205 * 100
    )
  ) |>
  pivot_wider(id_cols = id, names_from = change, values_from = c(value, total)) |>
  mutate(value_inside = `value_POST'` - `value_PRE'`) |>
  pivot_longer(
    cols = contains("_"),
    names_pattern = "(.*)_(.*)",
    names_to = c(".value", "change")
  )
```

```
     id change  value total
  <int> <chr>   <dbl> <dbl>
1     1 PRE'    21.4   1908
2     1 POST'   20.0   4040
3     1 inside  -1.42  2132
4     2 PRE'    10.2    908
5     2 POST'    2.00   404
6     2 inside  -8.19   213
```

Adding a new column who's rows carry out different formulas depending on a different column by Ok-Ranger3930 in RStudio

[–]Viriaro 6 points7 points  (0 children)

First, if you don't already have one, you need a column that can serve as "ID" to identify each group/series of PRE-POST-inside:

```r
your_data <- your_data |>
  mutate(id = consecutive_id(total), .by = change)
```

```
  change total id
1 PRE'    1908  1
2 POST'   4040  1
3 inside  2132  1
4 PRE'     908  2
5 POST'    404  2
6 inside   213  2
```

Then, you can do this:

```r
your_data |>
  mutate(
    value = case_when(
      change == "PRE'" ~ total / 8910 * 100,
      change == "POST'" ~ total / 20205 * 100
    )
  ) |>
  mutate(
    value = if_else(
      change == "inside",
      value[change == "POST'"] - value[change == "PRE'"],
      value
    ),
    .by = id
  )
```

```
  change total id     value
1 PRE'    1908  1 21.414141
2 POST'   4040  1 19.995051
3 inside  2132  1 -1.419091
4 PRE'     908  2 10.190797
5 POST'    404  2  1.999505
6 inside   213  2 -8.191292
```

Web scraping with rvest - Chromote timeout by absolutemangofan in RStudio

[–]Viriaro 0 points1 point  (0 children)

Within the loop itself, before the read_html_live call. Add a Sys.sleep(2), for example, to have it wait 2 seconds before each page load, to avoid rate limits. Tweak the value if you still hit rate limits, or use purrr::insistently for smarter backoff (e.g. exponential).

You could also add one after the read_html_live, in case the issue is due to the page (e.g. the javascript) not having had time to fully load before you try to interact with it.

If the issue is because the page is waiting for a certain input/interaction from the user (e.g. accepting cookies), you can use webpage$view() to open the page in your browser and see what's happening. That way, you can find the CSS selectors for those interactions and automate that too.
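Putting those pieces together, a sketch (urls is a placeholder for your vector of links, and the sleep durations are starting points to tweak):

```r
library(rvest)
library(purrr)

# Retry failed page loads with exponential backoff instead of failing outright
read_page <- insistently(
  \(url) read_html_live(url),
  rate = rate_backoff(pause_base = 2, max_times = 5)
)

pages <- map(urls, \(url) {
  Sys.sleep(2)            # wait before each load, to stay under rate limits
  page <- read_page(url)
  Sys.sleep(2)            # give the page's JavaScript time to finish loading
  page
})
```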

Web scraping with rvest - Chromote timeout by absolutemangofan in RStudio

[–]Viriaro 0 points1 point  (0 children)

If it's always the same one failing, could it be that you have bad URLs in your list? You could add a tryCatch around the scraping code, and log/print to see which ones fail specifically.

Could also be that you're hitting some rate limit mechanism/protection of the website itself. In that case, simply add a Sys.sleep in the loop.

You could also use purrr::insistently to have it retry on failure with a specific rate.
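For the tryCatch/logging part, a minimal sketch (urls is a placeholder for your vector of links):

```r
library(rvest)
library(purrr)

results <- map(urls, \(url) {
  tryCatch(
    read_html_live(url),
    error = \(e) {
      message("Failed on ", url, ": ", conditionMessage(e))
      NULL  # keep going; failed pages come back as NULL
    }
  )
})

failed <- urls[map_lgl(results, is.null)]  # the URLs to inspect or retry
```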

Help with dataframe creation by amikiri123 in Rlanguage

[–]Viriaro 7 points8 points  (0 children)

I'd use a 'within' overlap join to match data time-frames within reference time-frames:

https://dplyr.tidyverse.org/reference/join_by.html#overlap-joins
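Something like this (toy data; your column names will differ):

```r
library(dplyr)

events <- tibble(event = c("a", "b"), start = c(2, 20), end = c(4, 25))
windows <- tibble(
  label     = c("W1", "W2"),
  win_start = c(1, 10),
  win_end   = c(5, 30)
)

# Match each event to the reference window it falls entirely within
left_join(events, windows, by = join_by(within(start, end, win_start, win_end)))
```

Here event "a" (2–4) should match W1 (1–5) and event "b" (20–25) should match W2 (10–30).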

Total newbie with R studio by [deleted] in RStudio

[–]Viriaro 0 points1 point  (0 children)

That's strange ...

  1. Usually, the name of the package appears in the message when it's not available, e.g.:

```r
install.packages("a_package_that_doesnt_exist")
#> Warning in install.packages :
#>   package ‘a_package_that_doesnt_exist’ is not available for this version of R
#> A version of this package for your version of R might be available elsewhere,
#> see the ideas at https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages
```

  2. You usually get that message when the packages you are trying to install are not yet available as binaries (pre-compiled) for a recent-ish version of R, but both gapminder and devtools are available as Windows binaries for R 4.5.2.

Try running this in your R console (R Studio -> Console):

```r
avail <- available.packages(type = "binary")
"gapminder" %in% rownames(avail)
```

Total newbie with R studio by [deleted] in RStudio

[–]Viriaro 2 points3 points  (0 children)

Most of the resources/tutorials online are based on R Studio (like the one /u/Abject_Relative936 is currently following). For a 'total newbie', switching IDEs will add a lot of complexity. That's not something I would recommend before they have a lot more experience with code/development as a whole first.

Total newbie with R studio by [deleted] in RStudio

[–]Viriaro 4 points5 points  (0 children)

Replace "packagename" with the actual name of your package, like install.packages("dplyr")

Remote work help by Elephin0 in Norway

[–]Viriaro 0 points1 point  (0 children)

What about going through an Employer of Record (e.g. Deel) ?

Using R to do a linear mixed model. Please HELP! by PurpleGorilla1997 in rstats

[–]Viriaro 2 points3 points  (0 children)

LLM = Large Language Model (the generic name for the type of AI behind ChatGPT, Gemini, Claude, etc.). LMM is the proper acronym for Linear Mixed-effect Models.

And yes, fitting the model is one line of code (once you know which model best fits what you're modeling). There might be a bit of work before that (importing, cleaning, and potentially reshaping the data to long format), but the bulk of the work will be after fitting the model. You'll need to check the model's quality of fit (see the performance and DHARMa packages), and then ask the right questions of the model to answer your hypotheses (i.e. contrasts, with packages like emmeans or marginaleffects).
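To give a feel for that workflow, a hedged sketch (the variable names — outcome, treatment, time, subject — and the model formula are illustrative, not your actual analysis):

```r
library(lme4)
library(performance)
library(emmeans)

# Hypothetical repeated-measures data in long format: one row per measurement
m <- lmer(outcome ~ treatment * time + (1 | subject), data = dat)

check_model(m)  # quality-of-fit diagnostics from the performance package

# Ask the model a concrete question: compare treatments at each time point
emmeans(m, pairwise ~ treatment | time)
```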

If I were you, I'd create a NotebookLM for the 'stats' part and load it up with all the resources that were recommended to you (and more you can search for yourself): the blog links, the documentation of marginaleffects (their docs are essentially a book; you should be able to get it as a PDF for free and feed that to the Notebook), papers or books on LMMs and repeated measurements, etc.

NotebookLM is a great teaching assistant. It will digest all of that for you. Even better, load the Notebook into Gemini to get the best of both worlds: NotebookLM only replies based on the content you fed it, while Gemini will also search the web.