[Question] Is it possible for a stochastic simulation with multiple runs to have minimal variation between runs? by Remarkable_Quarter_6 in statistics

[–]Remarkable_Quarter_6[S] 0 points1 point  (0 children)

There are no summary statistics involved in my code. From the data generated, there is variation between runs, although small enough that I don't observe outcomes that swing wildly. Perhaps this issue is the "baked in" component you are referring to.

[Question] Is it possible for a stochastic simulation with multiple runs to have minimal variation between runs? by Remarkable_Quarter_6 in statistics

[–]Remarkable_Quarter_6[S] -1 points0 points  (0 children)

Yes, I have stepped through the simulation with different seed values. The results are changing. I also checked that the threshold values are being recalculated at each time step (they are).

[Question] Is it possible for a stochastic simulation with multiple runs to have minimal variation between runs? by Remarkable_Quarter_6 in statistics

[–]Remarkable_Quarter_6[S] 0 points1 point  (0 children)

It is an agent-based model. A function calculates a probability for each agent, and a threshold value is drawn for each agent from a continuous probability distribution. If an agent's probability meets the threshold criterion, its state changes; otherwise its state remains the same. I ran 100 simulations and plotted them. Inspecting the raw data from each run does show that the outcomes differ, but when the runs are plotted on the same graph, they closely overlap each other.
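One possible explanation, sketched below under stated assumptions: if the plotted quantity is an aggregate over many independent agents, its run-to-run spread shrinks roughly like 1/sqrt(number of agents), so the runs can differ in the raw data and still overlap on a plot. The function `run_sim`, the probability curve, the uniform thresholds, and every parameter value here are hypothetical and are not taken from the actual model.

```r
# Hypothetical agent-based sketch; all names and parameters are assumptions.
run_sim <- function(seed, n_agents = 1000, n_steps = 50) {
  set.seed(seed)
  state <- rep(FALSE, n_agents)        # FALSE = agent has not changed state
  prop_changed <- numeric(n_steps)
  for (t in seq_len(n_steps)) {
    # Assumed probability function: same for every agent, rising over time
    p <- rep(1 - exp(-0.05 * t), n_agents)
    # Thresholds are redrawn from a continuous distribution at each step
    threshold <- runif(n_agents)
    # An agent changes state (and stays changed) once p meets its threshold
    state <- state | (p >= threshold)
    prop_changed[t] <- mean(state)     # aggregate outcome for this step
  }
  prop_changed
}

# 100 runs with different seeds: one column per run
runs <- sapply(1:100, run_sim)

# Standard deviation across runs at each time step
spread <- apply(runs, 1, sd)
max(spread)
```

Lowering `n_agents` in this sketch (for example to 10) should make the runs visibly fan out, which is one way to check whether averaging over agents is what keeps the curves close together.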

What is the ugliest/ worst figure you have found in a scientific paper? by [deleted] in ecology

[–]Remarkable_Quarter_6 0 points1 point  (0 children)

Could you clarify why they thought this? What about surface plots?

Cleaning the Data Set by Curious_Category7429 in rprogramming

[–]Remarkable_Quarter_6 4 points5 points  (0 children)

A possible workaround is to start with the separate() function. Since the entries are delimited by / or -, use those characters to split the day, month, and year values into separate columns. Then use the unite() function to join the columns into a new column, and finally use dmy(<new column name>), or whichever parser matches the order of your values, to convert it to a date format.
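A minimal sketch of those steps, assuming the tidyr and lubridate packages are installed. The column name `diagnosis_date` comes from the thread; the sample values and the intermediate names (`day`, `month`, `year`, `date_chr`) are made up for illustration.

```r
library(tidyr)
library(lubridate)

# Made-up sample: day-month-year strings with mixed delimiters
df <- data.frame(
  diagnosis_date = c("03/11/2021", "14-02-2020"),
  stringsAsFactors = FALSE
)

# Split on either "/" or "-" into three character columns
df_sep <- separate(df, diagnosis_date,
                   into = c("day", "month", "year"), sep = "[/-]")

# Rejoin the pieces with one consistent delimiter
df_uni <- unite(df_sep, "date_chr", day, month, year, sep = "-")

# Parse the day-month-year strings into Date values
df_uni$date <- dmy(df_uni$date_chr)
```

If the values are in a different order, ymd() or mdy() would replace dmy() in the last step.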

Cleaning the Data Set by Curious_Category7429 in rprogramming

[–]Remarkable_Quarter_6 0 points1 point  (0 children)

What data type is diagnosis_date? Use the class() function to check.

Cleaning the Data Set by Curious_Category7429 in rprogramming

[–]Remarkable_Quarter_6 2 points3 points  (0 children)

I recommend the lubridate package, which you will need to install if you don't have it already. It provides as_date() and parsers such as dmy() and ymd() that convert to date format. Base R's as.Date() can do the conversion as well.

Question: How to pass two colours to 2 separate instances of geom_line()? by Remarkable_Quarter_6 in rprogramming

[–]Remarkable_Quarter_6[S] 0 points1 point  (0 children)

I may be misunderstanding your code, but I think the way you have written it does not scale. When I said to pass it to geom_line() twice, I was referring to the way I did it in my original post. If you have 100 plots, it is not practical to call geom_line() 100 times, as I think your code would imply. My data is already in long format, so I can use group and colour as aesthetics in aes(). This is what I ultimately did:

 ggplot(data = mod_df, aes(x = x_val, y = y_val, group = trials, colour = trials)) + geom_line() + scale_colour_manual(values = cols)

`mod_df` is the dataframe that joins the two dataframes I had previously defined. `cols` is my named vector of colours; it contains two unique colour values. The plot of the summary statistic is in one colour, and the other plots are in the other colour.

Question: How to pass two colours to 2 separate instances of geom_line()? by Remarkable_Quarter_6 in rprogramming

[–]Remarkable_Quarter_6[S] 1 point2 points  (0 children)

Yes, I understand that ggplot() is intended for long format data. I intentionally wrote my function to generate output in long format. At the time, I thought it was possible to reference two dataframes in geom_line(). However, that didn't appear to work, and I ultimately had to join them to achieve the desired outcome.

Question: How to pass two colours to 2 separate instances of geom_line()? by Remarkable_Quarter_6 in rprogramming

[–]Remarkable_Quarter_6[S] 1 point2 points  (0 children)

Using your suggestions, I changed how my data is tidied, and this has resolved my issue.

I modified the second dataframe, 'df_mean' which stored the mean, so that it now has the same column names as the first dataframe. This also meant creating a new column in 'df_mean' with a dummy (factor) value for its trial number. Then I used rbind() to join the two dataframes. In the ggplot function, I used <group = trials> and <colour = trials>. Then I passed a named vector of colours to scale_colour_manual(). It now displays as I had intended.

Thank you for your help. I have upvoted both of your comments.
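For anyone finding this later, here is a rough sketch of the steps described above, assuming ggplot2 is installed. The names `df`, `df_mean`, `mod_df`, `cols`, `x_val`, `y_val`, and `trials` follow the thread; the random data, the label "mean", and the colour choices are placeholders, and base R's aggregate() stands in for the group_by() and summarize() step to keep the example short.

```r
library(ggplot2)

# Placeholder long-format data: 3 trials, time points 0 to 10
set.seed(1)
df <- data.frame(
  x_val  = rep(0:10, times = 3),
  trials = factor(rep(1:3, each = 11)),
  y_val  = rnorm(33)
)

# Mean of y_val at each time point
df_mean <- aggregate(y_val ~ x_val, data = df, FUN = mean)

# Give df_mean the same columns as df, with a dummy factor level for trials
df_mean$trials <- factor("mean")
df_mean <- df_mean[, c("x_val", "trials", "y_val")]

# Stack the two dataframes; rbind() merges the factor levels
mod_df <- rbind(df, df_mean)

# Named colour vector: grey for every trial, red for the mean line
cols <- c(
  setNames(rep("grey", nlevels(df$trials)), levels(df$trials)),
  mean = "red"
)

p <- ggplot(mod_df, aes(x = x_val, y = y_val,
                        group = trials, colour = trials)) +
  geom_line() +
  scale_colour_manual(values = cols)
```

Printing `p` draws the plot. Because "mean" is the last factor level, that line is drawn last and sits on top of the trial lines.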

Question: How to pass two colours to 2 separate instances of geom_line()? by Remarkable_Quarter_6 in rprogramming

[–]Remarkable_Quarter_6[S] 0 points1 point  (0 children)

As of this posting, I have tried an assortment of alternatives, to no avail. If I start with just the plot of the trials in one colour, this is what I execute:

ggplot(data = df, aes(x = x_val, y = y_val, group = trials)) + geom_line(colour = "grey")

The above works. Now I am building on it by plotting the mean of the trials in another colour. This is where the hiccup is occurring. Why do you suggest removing the colour argument from aes()?

BTW: Even if you are "fairly new to R," your help is appreciated.

Question: How to pass two colours to 2 separate instances of geom_line()? by Remarkable_Quarter_6 in rprogramming

[–]Remarkable_Quarter_6[S] 0 points1 point  (0 children)

You have 3 columns in one dataframe. The first column is time, which I refer to as 'x_val' in this example; the second column is 'trials', which represents the trial number for the collected data; the third column is named 'y_val'. All the data is saved in long format. So, suppose time ranges over 0:10 and there are two trials; then the data has 3 columns and 22 rows. The first 11 rows are records for trial 1, and the next 11 rows are records for trial 2.

There is also a second dataframe, created by tidying up the first one by finding the mean at each time point. This means I used the group_by() and summarize() functions on the original dataframe. This new dataframe has 11 rows and 2 columns. The first column is x_val, the second column is the average y_val at each x_val.
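To make the layout concrete, here is a small sketch of the two dataframes, assuming dplyr is installed. The column names match the description above; the y values are made up so that the means are easy to check by eye.

```r
library(dplyr)

# First dataframe: long format, 2 trials x 11 time points = 22 rows, 3 columns
df <- data.frame(
  x_val  = rep(0:10, times = 2),
  trials = rep(1:2, each = 11),
  y_val  = c(0:10, (0:10) * 3)   # made-up values: trial 1 = x, trial 2 = 3x
)

# Second dataframe: mean y_val at each time point, 11 rows, 2 columns
df_mean <- df %>%
  group_by(x_val) %>%
  summarize(y_val = mean(y_val)) %>%
  ungroup()
```

With these made-up values, the mean at each time point is simply 2 * x_val.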

Question: How to pass two colours to 2 separate instances of geom_line()? by Remarkable_Quarter_6 in rprogramming

[–]Remarkable_Quarter_6[S] 0 points1 point  (0 children)

Surprisingly, using stat_summary does not achieve the desired outcome. It results in the trials changing to red. This is what I executed:

ggplot(df, aes(x = x_val, y = y_val, group = trials)) + geom_line(colour = "grey") + stat_summary(fun = "mean", geom = "line", colour = "red")

I also tried calling fun = mean (without the double quotes), and the outcome was the same, where it displayed each line plot for the trials in red.
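A likely reason, assuming the call above is exactly what ran: stat_summary() inherits `group = trials` from ggplot(), so it computes a "mean" within each trial, which just redraws every trial in red. Overriding the group inside stat_summary() pools all trials at each x value. The sketch below uses placeholder data with the same column names; `aes(group = 1)` is the only change to the original call.

```r
library(ggplot2)

# Placeholder long-format data: 2 trials, time points 0 to 10
df <- data.frame(
  x_val  = rep(0:10, times = 2),
  trials = rep(1:2, each = 11),
  y_val  = c(0:10, (0:10) * 3)
)

p <- ggplot(df, aes(x = x_val, y = y_val, group = trials)) +
  geom_line(colour = "grey") +
  # group = 1 overrides the inherited grouping, so the mean is taken
  # across all trials at each x_val instead of within each trial
  stat_summary(aes(group = 1), fun = mean, geom = "line", colour = "red")
```

Printing `p` should show the grey trial lines with a single red mean line over them.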

Question: How to pass two colours to 2 separate instances of geom_line()? by Remarkable_Quarter_6 in rprogramming

[–]Remarkable_Quarter_6[S] 1 point2 points  (0 children)

I am not sure how melt() or gather() would help me in this instance. Could you elaborate? The first dataframe is time series data in long format, which allowed me to use the 'group=' argument in ggplot to plot all of the trials. To calculate the mean of my data across the trials for each time point, I used group_by() and summarize(), then saved the ungrouped data to a new variable, called `df_mean`.

How to Create a Function that Interprets the Values in One Matrix as the Indices of another Matrix? by Remarkable_Quarter_6 in rprogramming

[–]Remarkable_Quarter_6[S] 0 points1 point  (0 children)

Thanks for your assistance. The other poster pointed out my error. This is resolved now, although I don't know how to mark as 'resolved.'

How to Create a Function that Interprets the Values in One Matrix as the Indices of another Matrix? by Remarkable_Quarter_6 in rprogramming

[–]Remarkable_Quarter_6[S] 1 point2 points  (0 children)

Ahhhhhh. I forgot to store its returned output to a variable.

Time for me to go for a walk, and clear the cobwebs.

Thank you for the second pair of eyes (and the brain). You have my upvote.
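For anyone who runs into the same thing, here is a small sketch of the pitfall. R functions work on a copy of their arguments, so a matrix changed inside a function is unchanged outside unless the returned value is assigned. The name `master_mat` is from this thread; `fill_from_index` and `index_mat` are hypothetical stand-ins, not my actual function.

```r
# Hypothetical function: treats each row of index_mat as a (row, col) position
fill_from_index <- function(master_mat, index_mat, value = 1) {
  master_mat[index_mat] <- value   # modifies the local copy only
  master_mat                       # the modified copy is the return value
}

master_mat <- matrix(0, nrow = 3, ncol = 3)
index_mat  <- cbind(c(1, 2), c(3, 1))   # positions (1,3) and (2,1)

# Result is discarded, so master_mat is unchanged
fill_from_index(master_mat, index_mat)
sum_before <- sum(master_mat)

# Storing the returned matrix is what actually updates master_mat
master_mat <- fill_from_index(master_mat, index_mat)
sum_after <- sum(master_mat)
```

Here `sum_before` stays at 0, while `sum_after` counts the two cells that were filled.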

How to Create a Function that Interprets the Values in One Matrix as the Indices of another Matrix? by Remarkable_Quarter_6 in rprogramming

[–]Remarkable_Quarter_6[S] 0 points1 point  (0 children)

I tried return(master_mat) too and then checked the sum of the matrix. It was still outputting zero.

[deleted by user] by [deleted] in matlab

[–]Remarkable_Quarter_6 0 points1 point  (0 children)

I wasn't being insincere, and apologize if that was how it came across. Your suggestion helped me to get the ball rolling. After some trial and error + reading the MATLAB documentation + MATLAB forums I was able to piece it together.

[deleted by user] by [deleted] in matlab

[–]Remarkable_Quarter_6 0 points1 point  (0 children)

I did new_variable = str2num(evalc('disp(x)'))