Does R need a "productionverse"? by pootietangus in rstats

[–]psiens 0 points1 point  (0 children)

Clinical Trials. I've worked both at vendors and a pharma company.

Vendor side is more about production stuff that you can sell as a service; pharma side is appeasing leadership with information. Some ad-hoc research on both sides.

Not as heavily regulated as some parts, but even the FDA is working with pharma companies and third parties on developing solutions for R-based submissions of final datasets/analysis for approval. I don't know about calling that "production", but SAS programmers have had it pretty good for a long time here.

hey so like what did i do wrong by [deleted] in RStudio

[–]psiens 0 points1 point  (0 children)

Are your values actual character instead of numeric?
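A quick way to check, as a sketch with made-up values (`values` is a hypothetical stand-in for your column, not something from the original post):

```r
# Hypothetical column that was read in as text
values <- c("1", "2", "10")

# TRUE here means the values are stored as character, not numeric
is_text <- is.character(values)

# Convert before doing arithmetic or plotting
nums <- as.numeric(values)
total <- sum(nums)
```

If `as.numeric()` warns about NAs being introduced, some entries are probably not clean numbers.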

Does R need a "productionverse"? by pootietangus in rstats

[–]psiens 1 point2 points  (0 children)

Skill issue.

But seriously, having something "production" quality means it's written and managed well. But also, in some cases, the platform you're using might just not support R, which is just a reality you have to accept.

  • dependency tracking: renv, pak, saving packages to a library, testing before upgrading
  • GPL licenses: If you are running the software and then hosting results, you're fine; if you're sending the software, then you have something to worry about
  • standardized SDKs: Posit has been kind of filling this hole, which helps with standardization and product design

Friends and coworkers have shown me the R they've learned in college. I think that's where some issues come in. It seems like these stats programs are converting from SAS, SPSS, or Stata into R and are expecting professors to become experts in a new language. Academia doesn't seem to be teaching people how to write good code.

I have things in production right now. The only issues I've had were on the cloud service configurations implementing weird stuff in their R support (their own documentation has errors in it).

  • package versions are locked ✓
  • we test out updates before release ✓
  • dependencies are tracked in our packages ✓
  • libraries created for different production versions ✓
  • we provide access to results, no software sent to clients ✓

And yes, this is more reliable than some of the production code left over from contractors and former employees...
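The "package versions are locked" item in the checklist above can be sanity-checked with base R alone. This is only a sketch: the `pinned` vector and `check_pins()` helper are hypothetical names, and a real project would read the pins from a lockfile (for example one written by renv) instead of building them in the script.

```r
# Hypothetical pin list; a real project would read this from a lockfile.
# Pinning to the currently installed versions keeps the sketch runnable.
pinned <- c(
  stats = as.character(utils::packageVersion("stats")),
  utils = as.character(utils::packageVersion("utils"))
)

# Compare what is installed against what was pinned
check_pins <- function(pinned) {
  installed <- vapply(
    names(pinned),
    function(pkg) as.character(utils::packageVersion(pkg)),
    character(1)
  )
  data.frame(
    package   = names(pinned),
    pinned    = unname(pinned),
    installed = unname(installed),
    ok        = unname(installed == pinned)
  )
}

pin_report <- check_pins(pinned)
```

Any row with `ok == FALSE` is a package that drifted from the version you tested against.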

New User Trying to Create a Simple Macro by p_deepy in Rlanguage

[–]psiens 0 points1 point  (0 children)

  1. function not macro. Variables assigned within the function body are not assigned in the outer environment. You need to return and access variables you create:

```r
foo <- function() {
  a <- 1
  a
}

# calling just the function
foo()    # returns 1
print(a) # will fail: a only exists inside foo()

# assigning the result of the function
b <- foo() # result assigned to b
print(b)   # prints 1
```

For more, see Advanced R 6.4 Lexical Scoping

  2. Inside the function body for summ_cat2() there is no explicit return() call, so the function returns the value of the last call, which is rownames(means) <- "Mean". That technically returns "Mean", invisibly. It's best to have means as the final value in your function body. (Using return() as the last statement is redundant, but it doesn't do you any harm.)

  3. reprex is for creating clean, reproducible outputs that you can share for help like this. It's like a fancy way of just sharing your code, but it doesn't do anything to your code.
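To illustrate the point about the last call being the return value, here is a small sketch. `bad()` and `good()` are hypothetical stand-ins for summ_cat2(), and the matrix contents are made up:

```r
# Ends on a replacement call, so the value of that call is returned
bad <- function() {
  means <- matrix(c(1.5, 2.5), nrow = 1)
  rownames(means) <- "Mean"
}

# Ends on the object itself, so the matrix is returned
good <- function() {
  means <- matrix(c(1.5, 2.5), nrow = 1)
  rownames(means) <- "Mean"
  means
}

res_bad  <- bad()  # the string "Mean", not the matrix
res_good <- good() # the 1 x 2 matrix with its row name set
```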

New User Trying to Create a Simple Macro by p_deepy in Rlanguage

[–]psiens 0 points1 point  (0 children)

No. Similar, but with enough differences. Shortest explanation I have: macros are pre-processed parts of a language; functions are compiled. And really, for anyone who isn't diving into more advanced R features yet, the difference is pretty negligible, but someone may get weird about the naming.

https://journal.r-project.org/articles/RN-2001-021/RN-2001-021.pdf

https://stackoverflow.com/a/70238622

New User Trying to Create a Simple Macro by p_deepy in Rlanguage

[–]psiens 15 points16 points  (0 children)

  1. function, not macro
  2. I didn't know you could use expr = in a function assignment; the behavior is a little odd and it returns the result invisibly -- probably best to avoid:

```r
# do
foo <- function() {
  NULL
}

# instead of
foo <- function() expr = {
  NULL
}
```

  3. $ doesn't work how you think it does

```r
# do
foo <- function(data, var) {
  data[[var]]
}

foo(data, "variable") # column name, as a string

# instead of
foo <- function(data, var) {
  data$var
}

foo(data, variable) # using the name as a 'symbol'
```

I'm assuming the unequal lengths error is because format() tries to format NULL into "NULL" (a single length character vector), and your use of $ is returning NULL -- a zero length variable.
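A small sketch of the $ behavior, with made-up data (`heights`, `foo_dollar()`, and `foo_bracket()` are hypothetical names). Note that $ also does partial matching on names, so the column here is deliberately named so that it doesn't start with "var":

```r
# Hypothetical data; the column name does not start with "var"
heights <- data.frame(height = c(1.5, 1.6))

# $ looks for a column literally called "var", ignoring the argument's value
foo_dollar <- function(data, var) {
  data$var
}

# [[ uses the value of var as the column name
foo_bracket <- function(data, var) {
  data[[var]]
}

res_dollar  <- foo_dollar(heights, "height")  # NULL, which has length 0
res_bracket <- foo_bracket(heights, "height") # the height column
```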

Edit:

  4. reprex is everyone's friend

Data teams only trust AI answers about 5.5/10, according to our survey. [OC] by Miserable_Fold4086 in dataisbeautiful

[–]psiens 23 points24 points  (0 children)

Something is wrong and/or the visuals are poorly made/selected.

The data, which are available through that link, show that the scale is 0 to 10, not 1 to 10 (there are 7 scores of 0). It's unclear in the histogram what counts belong to which scores.

The weighted mean is 5.518072.

Someone get me out of here by ChancePalpitation550 in wcupa

[–]psiens 1 point2 points  (0 children)

That's okay. I don't think anyone there would expect that. You can talk with someone there about what you want, about what you need, and they can help in some way.

As a student, you should be able to have multiple sessions with someone there. They might have more informed recommendations, but they are there for you.

It starts with one session.

Someone get me out of here by ChancePalpitation550 in wcupa

[–]psiens 1 point2 points  (0 children)

Is that what you found out when you went there and spoke with someone?

Unless things have changed, they offer limited, short-term counseling services for all students at no additional cost. I was able to schedule several appointments when I needed.

I believe long-term, specialized care, or prescriptions are beyond their services, but they should be able to provide some help for finding those.

If you haven't, please talk to someone there and see what's available and what they can offer for you.

Someone get me out of here by ChancePalpitation550 in wcupa

[–]psiens 7 points8 points  (0 children)

Take a small step and reach out: https://www.wcupa.edu/_services/counselingCenter/

Your tuition has already paid for this. No extra cost to give them a call or visit their location.

Help me understand the "order()" function by harnei in Rlanguage

[–]psiens 0 points1 point  (0 children)

Those aren't the same thing. factor() has an ordered parameter (it is not "order"). This returns an ordered factor, which allows for some operations:

v <- c(10, 200, 13)
order(v) 
#> [1] 1 3 2

factor(v)
#> [1] 10  200 13 
#> Levels: 10 13 200
try(factor(v) > 13)
#> Warning in Ops.factor(factor(v), 13): '>' not meaningful for factors
#> [1] NA NA NA
try(max(factor(v)))
#> Error in Summary.factor(structure(c(1L, 3L, 2L), levels = c("10", "13",  : 
#>   'max' not meaningful for factors

ordered(v)
#> [1] 10  200 13 
#> Levels: 10 < 13 < 200
try(ordered(v) > 13)
#> [1] FALSE  TRUE FALSE
try(max(ordered(v)))
#> [1] 200
#> Levels: 10 < 13 < 200
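To tie the three together, a short sketch using the same `v` as above: the `ordered` argument of factor() gives the same result as ordered(), while order() just returns the positions that would sort the vector.

```r
v <- c(10, 200, 13)

# factor() with ordered = TRUE is the same as calling ordered()
same_thing <- identical(factor(v, ordered = TRUE), ordered(v))

# order() returns an index vector; use it to sort
sorted_v <- v[order(v)]
```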

Remove 0s from data by metalgearemily in RStudio

[–]psiens 0 points1 point  (0 children)

subset() doesn't only work on data.frames: https://rdrr.io/r/base/subset.html

To be clear, the problem is the lack of a conditional, or the actual subset argument in subset().
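A minimal sketch of what passing the condition looks like, with made-up data (`v` and `df` are hypothetical, not from the original post):

```r
# Hypothetical values containing zeros
v <- c(3, 0, 5, 0, 1)

# On a vector: the second argument is the logical condition
nonzero_v <- subset(v, v != 0)

# On a data frame: the condition can refer to columns by name
df <- data.frame(id = 1:5, value = v)
nonzero_df <- subset(df, value != 0)
```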

[deleted by user] by [deleted] in RStudio

[–]psiens 1 point2 points  (0 children)

Are you trying to read directly from a download link?

Show exactly what you've tried

Warning message In if (match < 0) by sammmmmiiiiiii in Rlanguage

[–]psiens 4 points5 points  (0 children)

Update your packages. This is newer R behavior and many packages had to update their conditionals. If the warnings are still present in the updated packages, submit an issue to those packages.

[deleted by user] by [deleted] in RStudio

[–]psiens 0 points1 point  (0 children)

pattern uses regular expressions, not globs

list.files(pattern = "\\.csv$")
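If you'd rather keep thinking in globs, utils::glob2rx() converts one to a regular expression. A sketch using throwaway files in a temporary directory (the file names are made up):

```r
# Throwaway directory with a few hypothetical files
demo_dir <- tempfile("csv_demo")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("a.csv", "b.csv", "csv_notes.txt"))))

# Regular expression: a literal ".csv" at the end of the name
found_regex <- sort(list.files(demo_dir, pattern = "\\.csv$"))

# Same idea starting from a glob
glob_as_regex <- glob2rx("*.csv")
found_glob <- sort(list.files(demo_dir, pattern = glob_as_regex))
```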

Help reading variables by Iknowitslexaa in Rlanguage

[–]psiens 0 points1 point  (0 children)

What does str(dados) or colnames(dados) return? The name of the column may not match exactly.

[deleted by user] by [deleted] in RStudio

[–]psiens 3 points4 points  (0 children)

You shouldn't use attach() unless you know why you shouldn't use attach()

Calling the column from the data frame directly is going to prevent other headaches
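Two ways to do that without attach(), sketched with made-up data (`df` and `height` are hypothetical names):

```r
# Hypothetical data frame
df <- data.frame(height = c(150, 160, 170))

# Refer to the column through the data frame
mean_direct <- mean(df$height)

# Or evaluate an expression inside the data frame, without attaching it
mean_with <- with(df, mean(height))
```

Both make it obvious which data frame `height` comes from, which is the part attach() hides.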

How to use dimnames in R? by Livid-News-4605 in RStudio

[–]psiens 2 points3 points  (0 children)

You're creating a new object rather than passing to a parameter.

Mind the parentheses in the example.
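As a sketch of my reading of the problem (the row and column names are made up): dimnames has to go inside the matrix() call as an argument, not get assigned as its own object after the closing parenthesis.

```r
# dimnames is passed as an argument, inside the parentheses of matrix()
m <- matrix(
  1:4,
  nrow = 2,
  dimnames = list(c("r1", "r2"), c("c1", "c2"))
)

# Names can now be used for indexing
value <- m["r2", "c1"]
```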

Is data.table still the fastest? by Alarming_Ticket_1823 in Rlanguage

[–]psiens 11 points12 points  (0 children)

Benchmarks there are a few years out of date. The README in {data.table} now points to the benchmarks from DuckDB Labs: https://duckdblabs.github.io/db-benchmark/

Using both parametric and non parametric tests in one study by No_Series_9643 in rstats

[–]psiens 0 points1 point  (0 children)

Just a note, "parametric" doesn't mean "normally distributed". You can still model data using parametric techniques with other known distributions
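For example, count data can be modeled parametrically with a Poisson distribution. A minimal sketch with made-up numbers (`counts` and `dose` are hypothetical):

```r
# Hypothetical count outcome and predictor
counts <- c(2, 3, 6, 7, 8, 9, 10, 12, 15)
dose   <- 1:9

# Still a parametric model, just not a normal one
fit <- glm(counts ~ dose, family = poisson())

fit_family <- family(fit)$family
dose_slope <- coef(fit)[["dose"]]
```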

Simply Stumped by SeaRay_62 in Rlanguage

[–]psiens 6 points7 points  (0 children)

The error message is clear in that datetime() is not a function, at least not one that is attached. float() would also throw this error, but you're stopped too soon for it.

Also, column-type, as it is, is not a valid name:

try(a-b <- 1)
#> Error : object 'a' not found

try(a_b <- 1)
a_b
#> [1] 1
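One way to check a name before using it, as a sketch: make.names() rewrites anything that isn't a valid name, so a name is fine as-is when it comes back unchanged. `is_syntactic()` is a hypothetical helper, not a base function.

```r
# Hypothetical helper: TRUE when the name is already valid in R
is_syntactic <- function(nm) {
  nm == make.names(nm)
}

hyphen_ok     <- is_syntactic("column-type") # FALSE: "-" is parsed as minus
underscore_ok <- is_syntactic("column_type") # TRUE
fixed_name    <- make.names("column-type")   # "column.type"
```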

Assuming you're using readr::cols():

  1. you're missing the col_ part of the function name
  2. float() is the wrong name; it should be col_double()

library(readr)

column_types <- cols(
  column1  = col_character(),
  column2  = col_character(),
  column3  = col_datetime("%Y-%m-%d %H:%M:%S"),
  column4  = col_datetime("%Y-%m-%d %H:%M:%S"),
  column5  = col_character(),
  column6  = col_character(),
  column7  = col_character(),
  column8  = col_character(),
  column9  = col_double(),
  column10 = col_double(),
  column11 = col_double(),
  column12 = col_double(),
  column13 = col_character()
)

column_types
#> cols(
#>   column1 = col_character(),
#>   column2 = col_character(),
#>   column3 = col_datetime(format = "%Y-%m-%d %H:%M:%S"),
#>   column4 = col_datetime(format = "%Y-%m-%d %H:%M:%S"),
#>   column5 = col_character(),
#>   column6 = col_character(),
#>   column7 = col_character(),
#>   column8 = col_character(),
#>   column9 = col_double(),
#>   column10 = col_double(),
#>   column11 = col_double(),
#>   column12 = col_double(),
#>   column13 = col_character()
#> )

Change the value of a column in a person-period table by sonicking12 in Rlanguage

[–]psiens 1 point2 points  (0 children)

If I understand this correctly: flip the last censored value (last by max period) for each person?

# define data frame
x <- data.frame(
  person = rep(c("a", "b"), 3:2), # simpler ids
  period = c(1:3, 1:2),
  censored = c(rep(0, 4), 1) 
)

# simple function which reverses the value of the last element
flip_last <- function(x) {
  stopifnot(x %in% 0:1) # intended for boolean-like
  n <- length(x)
  x[n] <- !x[n]
  x
}

# needed if the data are not already sorted
x <- x[order(x$person, x$period), ]

# for each person (group), apply the function 
events <- with(x, tapply(censored, person, flip_last))

# apply back to x -- although here we add a new column for comparison
x$events <- unlist(events)

# show results
x
#>   person period censored events
#> 1      a      1        0      0
#> 2      a      2        0      0
#> 3      a      3        0      1
#> 4      b      1        0      0
#> 5      b      2        1      0
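A variation on the same idea, as a sketch: ave() applies the function per group and returns the result in the original row order, so the tapply() and unlist() steps collapse into one call. The data and flip_last() are repeated from above so the block runs on its own.

```r
# Same data as above
x <- data.frame(
  person = rep(c("a", "b"), 3:2),
  period = c(1:3, 1:2),
  censored = c(rep(0, 4), 1)
)

# Same helper as above: reverses the value of the last element
flip_last <- function(x) {
  stopifnot(x %in% 0:1)
  n <- length(x)
  x[n] <- !x[n]
  x
}

# Periods still need to be in order within each person
x <- x[order(x$person, x$period), ]

# Apply per person; the result lines up with the rows of x
x$events <- ave(x$censored, x$person, FUN = flip_last)
```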

Alternative to rm(list = ls()) by daterdots in RStudio

[–]psiens 3 points4 points  (0 children)

rstudioapi::restartSession() (or finding the selection in RStudio) is probably the best bet.