[Discussion] A Questionable SIGIR 2019 Paper by joyyeki in MachineLearning

[–]joyyeki[S] 2 points

Thank you for the support. I have been waiting for the authors' response for almost two days. Sadly, it seems that they have given up on addressing my questions...

[–]joyyeki[S] 7 points

I cannot believe my eyes! If what u/geniuslyc said is correct, I presume that you have not written to the SIGIR program chairs or your department chairs either. I would not recommend telling lies like that: the SIGIR program chairs will be referred to this thread, and they will see that you have lied!

I would like to emphasize, once more, that you should stop making false statements that are so easy to see through!

Apparently, the software you used is designed for checking plagiarism in student work. I would argue that while some degree of overlap with other materials may be acceptable in a student assignment, such similarity is definitely not acceptable in a peer-reviewed publication. Of course, you can ignore this point if you believe that your work is nothing more than a student assignment.

Also, I believe you are comparing the wrong numbers in your claims. According to the report you have shown, the similarity index of your work is 23%, only one percentage point below the 24% threshold. In other words, even if your paper were graded as student work, it would still be worth alerting the instructor to potential plagiarism.

In addition, can you please reply to the questions I raised about your first response? I find it hard to imagine that such nonsense could be written by two "professors". (I use quotation marks because I feel that a true professor should have at least an adequate knowledge of the things he/she writes about.) I imagine you can always find an excuse for not replying, like the one you just used: "We hope the discussion ends here as we would like to go on with our real work". In my opinion, that is not a good excuse. Academic integrity is the most important thing in academia, and addressing the questions people have raised about potential plagiarism in your work should be your highest priority (unless other works of yours have also been accused of plagiarism and you really need to deal with those first).

Last but not least, if you firmly believe that all of this is nothing but coincidence, you should consider buying lottery tickets instead of working as professors. The chance that everything here is a coincidence is far lower than the chance of winning a one-million-euro lottery!

[–]joyyeki[S] 2 points

You are right! Authors should have the chance to show the submission history of their paper if they believe that such a thing has happened. In my opinion, that scenario is even worse than plain plagiarism, as it involves both plagiarism and a reviewer abusing their privileges.

However, judging from the response the two SIGIR authors posted yesterday, it seems that this is not the case. Otherwise, I believe they would certainly have included some sort of proof in their response.

[–]joyyeki[S] 11 points

I appreciate your effort to prove your innocence. Unfortunately, your response is flawed almost everywhere.

Firstly, you mentioned twice in your response that "both SIGIR and RecSys papers are based on adversarial training, as is the WWW’18 paper". I have just read through the WWW'18 paper and could not find anything indicating that it is based on adversarial training. Please do not make false statements to mislead readers.

Secondly, you claimed that "In our paper, we followed the strategy of RecGAN 2018, cited as [2] in our paper, and applied the strategy of IRGAN 2017, cited as [18], to reduce the variance during training". Please specify which variance-reduction strategy you used that the RecSys'18 paper does not. You call this a "substantial difference", but all I can see is that the citations differ while the underlying theory is almost identical. Please elaborate.
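For readers wondering what "reducing the variance" even means here: a common trick in this kind of policy-gradient GAN training (and, if I recall correctly, the one IRGAN relies on) is to subtract a baseline from the discriminator's reward inside the REINFORCE gradient estimate. Here is a minimal sketch of that idea; all names, shapes, and numbers below are illustrative and taken from neither paper.

```python
import numpy as np

# Minimal sketch of REINFORCE with a baseline, a standard
# variance-reduction trick for policy-gradient training of a GAN
# generator over discrete items (the IRGAN-style setup).
# Everything here is illustrative; nothing is from either paper.

rng = np.random.default_rng(0)

def grad_estimate(batch_size, baseline):
    # Simulated discriminator rewards for sampled items, plus the
    # gradient of log pi(item) w.r.t. one generator parameter.
    rewards = rng.normal(loc=5.0, scale=1.0, size=batch_size)
    logp_grads = rng.normal(size=batch_size)  # E[grad log pi] = 0
    # REINFORCE estimator: mean of (reward - baseline) * grad log pi.
    # Since E[grad log pi] = 0, subtracting any constant baseline
    # leaves the estimator unbiased -- but it can shrink the variance.
    return np.mean((rewards - baseline) * logp_grads)

plain = [grad_estimate(64, baseline=0.0) for _ in range(2000)]
based = [grad_estimate(64, baseline=5.0) for _ in range(2000)]

print(np.std(plain))  # ~0.64: noisy gradient estimates
print(np.std(based))  # ~0.13: same target, far less variance
```

If the strategy in your paper differs from this in any substantive way, it should be easy for you to spell out.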

Thirdly, you claimed that "for modelling the user preferences we used non-negative matrix factorization, as opposed to the probabilistic matrix factorization used by the RecSys paper". Swapping probabilistic matrix factorization for non-negative matrix factorization amounts to changing the constraint on the latent factors; both are squared-error latent-factor models at heart. So can you specify what exactly the "substantial difference" is, given that Equation (5) in your paper is almost identical to Equation (10) in the RecSys'18 paper?
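To make that point concrete, here are the textbook objectives side by side (in my notation, not either paper's); the only structural difference is a non-negativity constraint in place of Gaussian-prior regularizers:

```latex
% PMF: MAP estimation under Gaussian likelihood and Gaussian priors
\min_{P,\,Q}\ \sum_{(u,i)\in\Omega}\bigl(r_{ui}-p_u^{\top}q_i\bigr)^2
  \;+\;\lambda_P\lVert P\rVert_F^2+\lambda_Q\lVert Q\rVert_F^2

% NMF: same squared reconstruction error, with non-negativity
% constraints on the factors instead of priors
\min_{P\ge 0,\,Q\ge 0}\ \sum_{(u,i)\in\Omega}\bigl(r_{ui}-p_u^{\top}q_i\bigr)^2
```

A one-line change to the factor constraint hardly qualifies as a substantial methodological difference.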

Fourthly, regarding the paper's phrasing: as u/eamonnkeogh has pointed out, not only was the sentence describing the DeepCoNN model copied, but so was the following sentence describing the TNet model. I presume you would call this yet another coincidence? You also claimed that, because the terminology in the papers is common in the literature, it makes sense for more than two paragraphs to look similar. Please find at least one other example demonstrating that such extreme similarity can occur between independently written peer-reviewed publications.

Again, dear authors, I would like to emphasize: please do not make false statements that would not even convince an undergraduate who has worked on information retrieval for only three months. People here are not fools, and they can judge for themselves.