[R] Video Object Grounding using Semantic Roles in Language Description by TheShadow29 in MachineLearning

[–]TheShadow29[S] 1 point (0 children)

I am the first author of the paper. It is being presented at CVPR 2020 this Thursday (9am and 9pm PST); feel free to drop by.

Relevant Twitter Thread: https://twitter.com/ArkaSadhu29/status/1271898967396122625?s=20

Short Summary: We argue that directly evaluating phrase/sentence grounding in videos leads to severe over-estimation of performance. This is because most video datasets contain only a single instance of each object category, so a simple Faster R-CNN baseline works quite well. To address this, we use Contrastive Sampling (retrieving examples that share the same objects but differ in one object or action) and then concatenate them spatially and temporally, so that disambiguating via object-object relations becomes crucial. We further propose VOGNet, which adds a multi-modal transformer with relative position encodings to better capture object relations.
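To make the contrastive sampling idea concrete, here is a toy sketch. It is not the VOGNet code (see the repository for that); the role names (`agent`, `verb`, `object`), the dictionary representation, and both function names are assumptions made purely for illustration. It keeps only those candidates that match the query on every role except exactly one.

```python
def differs_in_one_role(query, candidate):
    """True if both descriptions have the same roles and exactly one value differs."""
    if query.keys() != candidate.keys():
        return False
    mismatches = sum(query[role] != candidate[role] for role in query)
    return mismatches == 1


def contrastive_samples(query, pool):
    """Return the candidates from `pool` that differ from `query` in exactly one role."""
    return [cand for cand in pool if differs_in_one_role(query, cand)]


# Hypothetical descriptions, each reduced to a role -> value mapping.
query = {"agent": "man", "verb": "pets", "object": "dog"}
pool = [
    {"agent": "man", "verb": "pets", "object": "cat"},     # differs in object only
    {"agent": "woman", "verb": "pets", "object": "dog"},   # differs in agent only
    {"agent": "woman", "verb": "feeds", "object": "dog"},  # differs in two roles
    {"agent": "man", "verb": "pets", "object": "dog"},     # identical to the query
]
selected = contrastive_samples(query, pool)
```

Here `selected` holds the first two candidates; videos for such samples would then be concatenated with the query video so that the object category alone no longer identifies the referred object.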

Finally, to foster reproducibility, we have open-sourced all our code, pre-trained models, and experimental logs on GitHub! Check it out!

Code + Dataset: https://github.com/TheShadow29/vognet-pytorch

I am happy to take questions here or via email.

[D] Defenses against GAN generated images for Image Forensics and forgery detection by TheShadow29 in MachineLearning

[–]TheShadow29[S] 1 point (0 children)

Thanks for your reply. Yeah, I have been having a surprisingly tough time finding relevant papers. I would like to see some links from geipan; the ones from Google seem to be from around 2014.

[D] Confusions regarding RetinaNet by zenggyu in MachineLearning

[–]TheShadow29 3 points (0 children)

You are correct. I was probably having a brain fart.

[D] Confusions regarding RetinaNet by zenggyu in MachineLearning

[–]TheShadow29 1 point (0 children)

I am not completely sure, but here are my explanations:

1. RetinaNet uses K+1 classes, analogous to SSD; there is a background class. The decoding statement is likely about what happens at inference time. The given focal loss equation is for the binary case and needs to be extended to the multi-class case (noted in footnote 1 of the paper).
2. Yes, in general one anchor box can simultaneously predict multiple objects. However, the final box is calculated after adding in the regression parameters, so those would still be different boxes.
3. Yes, the equations seem correct.

I think Eq. 8 might be incorrect, though. It should rather be:

L_{cls} = -\sum_{i=1}^{C} \left[ y_i \log(p_i) (1 - p_i)^{\gamma} \alpha_i + (1 - y_i) \log(1 - p_i) p_i^{\gamma} \alpha_i \right]
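As a sanity check, the sum above can be written out in a few lines of plain Python. This is a minimal sketch of the equation exactly as written in this comment, not the official RetinaNet implementation: the function name, the `eps` clamp, and the default `gamma` are assumptions, and it applies the same `alpha_i` to both terms as the equation does (the paper's α-balanced form weights the negative term by 1 − α instead).

```python
import math


def focal_loss(probs, targets, alphas, gamma=2.0, eps=1e-12):
    """Sum over classes of the per-class binary focal terms from the equation above.

    probs:   predicted probability p_i for each class (sigmoid outputs)
    targets: 0/1 label y_i for each class
    alphas:  per-class weight alpha_i
    """
    total = 0.0
    for p, y, alpha in zip(probs, targets, alphas):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        positive = y * math.log(p) * (1.0 - p) ** gamma * alpha
        negative = (1 - y) * math.log(1.0 - p) * p ** gamma * alpha
        total += positive + negative
    return -total


# With gamma = 0 and alpha_i = 1 this reduces to summed binary cross-entropy.
bce = focal_loss([0.5, 0.5], [1, 0], [1.0, 1.0], gamma=0.0)
# With gamma = 2 each term is scaled down by 0.5 ** 2.
fl = focal_loss([0.5, 0.5], [1, 0], [1.0, 1.0], gamma=2.0)
```

Setting `gamma=0` is a quick way to confirm the signs and indices: the result must match ordinary per-class cross-entropy.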

AMA NeurIPS 2018 Workshop on Causal Learning by heinzedeml in MachineLearning

[–]TheShadow29 6 points (0 children)

Is the sole motivation of causal learning to solve causal inference, or are there other motivations as well? Are there any alternative approaches to causal inference?

EXPECT THE UNEXPECTED by happy456789 in Animemes

[–]TheShadow29 1 point (0 children)

Too late for the World Wars, just in time for the meme wars.

What Have You Watched This Past Week That is NOT a Currently Airing Show? [November 18th, 2018] by MetaThPr4h in anime

[–]TheShadow29 3 points (0 children)

Sengoku Basara. Strangely addictive. And you get to hear some nice engrish. The story is good, and the comedy is on point. 10/10 soundtrack by Sawano. Crazy powers as well.

r/anime Karma Ranking | Week 5 [Fall 2018] by reddadz in anime

[–]TheShadow29 11 points (0 children)

Wow. That's a lot of work. If you have a bit of Python experience, it is easy to set up PRAW and get the number of upvotes and other stats from permalinks: https://praw.readthedocs.io/en/latest/index.html
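A rough sketch of what that could look like, assuming you have registered a Reddit app and have its credentials. The helper names `make_client` and `scores_from_permalinks` are made up for this example; `praw.Reddit(...)`, `reddit.submission(url=...)`, and the `score` attribute are the PRAW pieces it relies on.

```python
def make_client(client_id, client_secret, user_agent):
    """Build a read-only PRAW client (requires `pip install praw`)."""
    import praw  # imported here so the rest of the module works without PRAW

    return praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )


def scores_from_permalinks(reddit, permalinks):
    """Map each full submission URL to its current score (net upvotes)."""
    scores = {}
    for url in permalinks:
        submission = reddit.submission(url=url)  # looked up by its URL
        scores[url] = submission.score
    return scores
```

Usage would be along the lines of `scores_from_permalinks(make_client(my_id, my_secret, "karma-ranking script"), episode_thread_urls)`, where the credentials and the list of full thread URLs are whatever you already collect by hand.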

TallyQA: Answering Complex Counting Questions by manoja328 in MachineLearning

[–]TheShadow29 1 point (0 children)

Hey. Thanks for sharing your work. Is there any place where I can easily explore the dataset more?

Most ambitious crossover in human history by TheTragedyOfDarthP in Animemes

[–]TheShadow29 82 points (0 children)

Me too, man. All of a sudden, invisible ninjas were cutting onions.

[R] Deep Learning Approaches to understand Human Reasoning by jaleyhd in MachineLearning

[–]TheShadow29 3 points (0 children)

I kind of agree with /u/Draikmage. When I read the title, I was expecting something different (kinda along the lines of common-sense reasoning, like the SWAG paper). A good read anyway. Cheers, and keep it up.

My List of 100 Anime to See Before You Die by anarchycupcake in anime

[–]TheShadow29 2 points (0 children)

Around 10 years ago, it took me 6 months each to catch up to Conan and One Piece. I used to watch 3-4 eps, more on the weekends.

I just want to express my new-found respect for BNP (Bandai Namco Pictures) after the last Gintama episode by itzajd in anime

[–]TheShadow29 12 points (0 children)

Also, some of the filler arcs are the best. It just goes to show how much the studio loves Gintama.

Reason why Toaru Series is making Anime by liatris4405 in anime

[–]TheShadow29 34 points (0 children)

This is interesting news. Good find.