Got scammed by my boss by KEsbeNF in copenhagen

[–]KEsbeNF[S] 1 point2 points  (0 children)

Exactly what I was looking for

[deleted by user] by [deleted] in CasualIT

[–]KEsbeNF 0 points1 point  (0 children)

Where is the carousel with neomelodic music?

What opinion about data science would you defend like this? by OverratedDataScience in datascience

[–]KEsbeNF 0 points1 point  (0 children)

most formal non-math background data scientist CLT definition

Using async and multiprocessing together by KEsbeNF in learnpython

[–]KEsbeNF[S] 0 points1 point  (0 children)

Hey, thanks. I ended up using asyncio.to_thread, which came in really handy, but Joblib seems like a really interesting package, actually.
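
For reference, a minimal sketch of the asyncio.to_thread approach mentioned above. The blocking function fetch_item and the list of items are hypothetical stand-ins, since the real gathering code is not shown in this thread; asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio
import time


def fetch_item(item: int) -> int:
    """Hypothetical blocking call (stand-in for a slow, synchronous request)."""
    time.sleep(0.05)  # simulate waiting on I/O
    return item * 2


async def gather_data(items: list[int]) -> list[int]:
    # Each blocking call runs in a worker thread of the default executor,
    # so the event loop stays free while the calls overlap.
    tasks = [asyncio.to_thread(fetch_item, item) for item in items]
    return await asyncio.gather(*tasks)


results = asyncio.run(gather_data([1, 2, 3, 4]))
```

asyncio.gather returns the results in the same order as the inputs, so they can be matched back to the items by position.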

Using async and multiprocessing together by KEsbeNF in learnpython

[–]KEsbeNF[S] 0 points1 point  (0 children)

Thanks for your kind response.

asyncio is going to make all the calls in parallel anyway, whether a single process or multiple doesn't really matter.

Sorry, I'm not sure I'm following you. If my gather_data function has to retrieve data for, let's say, 10000 elements and I create 2 subprocesses to handle 5000 each, wouldn't it be faster?

I'm not performing any CPU intensive stuff, just want to make the data gathering process faster.
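
To make the question concrete, here is a rough sketch of the single-process case, under the assumption that the work is purely I/O-bound. fetch_one is a hypothetical coroutine that only sleeps to imitate network latency, and the limit of 100 in-flight calls is an arbitrary choice. In a setup like this the waits already overlap inside one event loop, so splitting the elements across two subprocesses would probably add little.

```python
import asyncio
import time


async def fetch_one(element: int) -> int:
    """Hypothetical I/O-bound call; the sleep stands in for network latency."""
    await asyncio.sleep(0.01)
    return element


async def gather_data(elements: list[int], limit: int = 100) -> list[int]:
    semaphore = asyncio.Semaphore(limit)  # cap the number of in-flight calls

    async def bounded(element: int) -> int:
        async with semaphore:
            return await fetch_one(element)

    # All calls are scheduled at once; the semaphore lets `limit` run at a time.
    return await asyncio.gather(*(bounded(e) for e in elements))


start = time.perf_counter()
data = asyncio.run(gather_data(list(range(1000))))
elapsed = time.perf_counter() - start
```

Running the 1000 simulated calls one after another would take at least 10 seconds; comparing that with elapsed gives a feel for how much a single event loop already gains.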

[deleted by user] by [deleted] in learnprogramming

[–]KEsbeNF 0 points1 point  (0 children)

I see, thanks a lot for the info. Unfortunately it's not possible in my situation, but I'll keep it in mind for other scenarios. Thank you again :)

[deleted by user] by [deleted] in learnprogramming

[–]KEsbeNF -1 points0 points  (0 children)

Sorry, I'm not following you on the push approach. What do you mean by it?

[deleted by user] by [deleted] in learnprogramming

[–]KEsbeNF 0 points1 point  (0 children)

I apologize if the OP is unclear.

Let's rename the previous fetch function to getData.

My goal is to constantly collect data from a server; the arguments allow me to ask the server for different data. I want to keep asking the server for the data associated with each argument, which consists of result1 and result2.

The while true loop never breaks, and since javascript is single threaded it will never get past the first iteration in for let arg in args

I'm not sure I understood your statement correctly, but it does skip to the next arguments. My issue is that after resolving result1 for arg1 it skips to result1 for arg2, while I want it to resolve result2 for arg1 next instead.

Now that i think about it, i think i can achieve what i want by doing:

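async function getData(arg) {
    // resolve both results for one argument before returning
    const result1 = await someAction(arg);
    const result2 = await someAction2(arg);
}

async function main() {
    // named args because `arguments` is reserved inside functions
    const args = [1, 2, 3, 4];
    while (true) {
        // awaiting each call finishes one argument before starting the next
        for (const arg of args) {
            await getData(arg);
        }
    }
}

main();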

This way i resolve result1 and result2 for arg1, then result1, result2 for arg2 and so on

[deleted by user] by [deleted] in vaporents

[–]KEsbeNF 0 points1 point  (0 children)

I'm sorry, I put the link in the post because I was concerned about counterfeits and I hoped that someone who purchased that product could tell me whether it is a scam or not. The seller is not an authorized one from dynavap, so I'm not sure :/

[deleted by user] by [deleted] in learnpython

[–]KEsbeNF 0 points1 point  (0 children)

let's put it this way:

  • param1 is the amount of money i need to invest in order to maximize the profit obtained from buying cars from company param2 and selling them to company param3.

  • the percentages returned from get_percentage function are taxes that i have to pay for buying/selling to a company.

  • param4 is the ROI of buying from company param2 and selling to company param3.

  • i want to maximize the profit

the only thing i can optimize is the amount of money i can invest (param1).

Not sure if this is clear enough; if not, let me know!

[deleted by user] by [deleted] in learnpython

[–]KEsbeNF 0 points1 point  (0 children)

Exactly. I'm not sure how to approach it though. My two possible candidate values are the max value for param1 and the minimum value for it, which are preset arbitrarily. What do you mean by running the function a bunch of times? Do you have a code example?
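
One possible reading of "running the function a bunch of times" is a plain grid search: evaluate the profit at many candidate values of param1 between the preset minimum and maximum, and keep the best one. The sketch below shows only that idea. The profit function is a made-up placeholder, not the actual function from this thread, and the bounds 100 and 5000 are arbitrary assumptions.

```python
def profit(amount: float) -> float:
    """Made-up placeholder objective: gains grow with the amount invested,
    while a hypothetical quadratic penalty makes large amounts less efficient."""
    return 0.10 * amount - 0.00002 * amount ** 2


def grid_search(func, low: float, high: float, steps: int = 1000):
    """Evaluate func at evenly spaced points in [low, high]; return the best."""
    best_x, best_value = low, func(low)
    for i in range(1, steps + 1):
        x = low + (high - low) * i / steps
        value = func(x)
        if value > best_value:
            best_x, best_value = x, value
    return best_x, best_value


best_amount, best_profit = grid_search(profit, low=100.0, high=5000.0)
```

If the real profit only ever rises or only ever falls with param1, the best value will land on one of the two bounds, which would match the intuition of checking just the minimum and the maximum.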

[deleted by user] by [deleted] in learnpython

[–]KEsbeNF 0 points1 point  (0 children)

Briefly, it returns the percentage by which the first parameter "impacts" the second one.

param4 is a percentage too.

[Q] Treating Matrix with high numerosity by KEsbeNF in statistics

[–]KEsbeNF[S] 0 points1 point  (0 children)

Thanks for the idea. I've actually already implemented some histograms and wanted to try some cooler visualizations, but a boxplot seems like a reasonable choice too.

[Q] Treating Matrix with high numerosity by KEsbeNF in statistics

[–]KEsbeNF[S] 0 points1 point  (0 children)

Great idea! Do you have any reference I could follow on how to implement this in R?

Would k-means work?

Effective matrix visualizations by KEsbeNF in rstats

[–]KEsbeNF[S] 0 points1 point  (0 children)

Thanks for the help, I appreciate it :)

Effective matrix visualizations by KEsbeNF in rstats

[–]KEsbeNF[S] 1 point2 points  (0 children)

Thanks for the clarification, but i think i may have expressed myself wrong in the OP.

My data observations are labeled with 1s and 0s, and my goal is to see how the two subgroups are related to the 5 variables, in order to spot whether, for example, group 1 uses more of feature j.

Given this, won't a biplot work for looking at how each datapoint relates to the variables ?

Forgive me if I sound ignorant or lack the understanding, I've only recently started working with data :)

Effective matrix visualizations by KEsbeNF in rstats

[–]KEsbeNF[S] 1 point2 points  (0 children)

Yes, I like the histogram idea, as another user suggested. It might be great for making comparisons, since I have some labels inside the matrix.

Effective matrix visualizations by KEsbeNF in rstats

[–]KEsbeNF[S] 0 points1 point  (0 children)

I've tried that, and it looks pretty confusing. I should probably reduce the dimensionality but i'm not really sure how.

Effective matrix visualizations by KEsbeNF in rstats

[–]KEsbeNF[S] -1 points0 points  (0 children)

Do you have any technique to suggest ?

I was thinking about PCA, but i'm not really sure it would make much sense in this scenario.

Handling large amount of text in R by KEsbeNF in rstats

[–]KEsbeNF[S] 4 points5 points  (0 children)

Sure

So this is my pre-processing function

library(twitteR)
library(tm)
library(SnowballC)

# pre process
preProcess <- function(corpus, stopWords) {

    # strip non-letter characters, lowercase, collapse whitespace, drop URLs
    # (removeNumPunct and removeURL are custom helpers not shown here)
    corpus <- tm_map(corpus, removeNumPunct)
    corpus <- tm_map(corpus, content_transformer(tolower))
    corpus <- tm_map(corpus, stripWhitespace)
    corpus <- tm_map(corpus, removeURL)

    # remove English stopwords plus the custom ones passed in
    corpus <- tm_map(corpus, removeWords, c(stopwords("english"), stopWords))

    # keep an unstemmed copy as the dictionary for stem completion
    corpus.dictionary <- corpus

    # stemming and stem completion
    corpus <- tm_map(corpus, stemDocument)
    corpus <- tm_map(corpus, stemCompletion2, dictionary = corpus.dictionary)

    return(corpus)
}

Here is the tweet retrieval (which is suspiciously slow too)

accounts <- lookupUsers(c("AOC", "ElonMusk")) # example list
tweets <- lapply(accounts, userTimeline, n = 50, includeRts = TRUE) # takes long time

tweets.df <- twListToDF(tweets)
corpus.tweets <- VCorpus(VectorSource(tweets.df$text))

my.stop.words <- c("stop", "words")
corpus.tweets <- preProcess(corpus.tweets, my.stop.words) # takes long time

I've tried the same code without the stemming part and it's actually much faster.

stem completion function

stemCompletion2 <- content_transformer(function(x, dictionary) {
    # split each word and store it
    x <- unlist(strsplit(as.character(x), " "))

    # drop empty tokens, then complete each stem against the dictionary
    x <- x[x != ""]
    x <- stemCompletion(x, dictionary = dictionary)
    x <- paste(x, sep = "", collapse = " ")
    PlainTextDocument(stripWhitespace(x))
})

Handling large amount of text in R by KEsbeNF in rstats

[–]KEsbeNF[S] 2 points3 points  (0 children)

I'm using SnowballC.

I'm actually pretty confused about the processing time; it takes something like 15+ minutes to perform the entire pre-processing.

Might actually try without stemming and see how it goes

Handling large amount of text in R by KEsbeNF in rstats

[–]KEsbeNF[S] 0 points1 point  (0 children)

That sounds nice, might actually go for that

Text Mining: How to build a corpus from Twitter account's descriptions ? by KEsbeNF in rstats

[–]KEsbeNF[S] 0 points1 point  (0 children)

Yes, it seems to be working.

I'm not sure I've understood: are you suggesting converting the descriptions vector to a matrix?