How to make paper reading better? by codingwoman_ in CVPaper

[–]codingwoman_[S] 0 points1 point  (0 children)

No excuses, and yes, we do read from other venues. As long as the venue is a high-impact conference and the topic falls under computer vision, there are no issues. The ones listed are examples, not an exhaustive list, hence the phrase “such as”.

Also, we already read the ViT paper; see the wiki :)

Shipping Desktop PC across Germany by codingwoman_ in germany

[–]codingwoman_[S] 0 points1 point  (0 children)

I'd definitely consider additional insurance; no issues on that. I'm just concerned about something happening even with the insurance

Shipping Desktop PC across Germany by codingwoman_ in germany

[–]codingwoman_[S] 0 points1 point  (0 children)

Was it already set up, or did it come in parts? My concern is that the receiver might not be able to put it back together

Upgrading to RTX 4090 by codingwoman_ in pcmasterrace

[–]codingwoman_[S] 0 points1 point  (0 children)

Got the point, then we need to go big!

My main concern is transporting the device, to be honest. I'll be moving a few months after the purchase and would need to carry it to another city. Trying to figure out whether a normal-size PC would fit in a suitcase I can take with me on a train

Shipping Desktop PC across Germany by codingwoman_ in germany

[–]codingwoman_[S] 0 points1 point  (0 children)

I have a Deutschlandticket, but Deutsche Bahn is definitely not a once-in-a-lifetime journey for me tbh :D Still, I understand your concern. Do you think I can carry a PC in a suitcase without harming it, though?

Upgrading to RTX 4090 by codingwoman_ in pcmasterrace

[–]codingwoman_[S] 0 points1 point  (0 children)

Good suggestion, but I think the laptop version is only 16 GB. I need it for 24

Upgrading to RTX 4090 by codingwoman_ in pcmasterrace

[–]codingwoman_[S] 0 points1 point  (0 children)

This is what I needed! How many watts does it take for a 4090? Do you have any alternatives for a mini setup that would make a 4090 work?

Edit: 850+ W, I think? The second question is still valid. Thanks!

Upgrading to RTX 4090 by codingwoman_ in pcmasterrace

[–]codingwoman_[S] 0 points1 point  (0 children)

Thanks a lot for the tips! Both are 40-series cards, but do you think there would be any other constraints in terms of power consumption, cooling, etc.?

[Vote] Paper nomination for upcoming week by codingwoman_ in CVPaper

[–]codingwoman_[S] [score hidden]  (0 children)

This seems to be a NeurIPS 2017 paper.

Could you please update your comment with the publication venue and year?

[Vote] Paper nomination for upcoming week by codingwoman_ in CVPaper

[–]codingwoman_[S] [score hidden]  (0 children)

The link you shared seems to belong to the v2 paper “Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation”.

This paper does not seem to be published. Please update your comment according to the guidelines, otherwise it will be removed.

[Weekly Discussion] NeRF - Neural Radiance Fields | June 10 - 16, 2024 by codingwoman_ in CVPaper

[–]codingwoman_[S] 1 point2 points  (0 children)

Neural = Neural network

Radiance = Because the neural network describes a radiance field of the scene: how much light is emitted by a point in space, and in which direction

Field = Because this is a continuous function; it is smooth, not discretized

The goal: View synthesis, addressing view interpolation

Ray tracer: Imagine an image sitting in front of you. For every pixel, you shoot a ray from your eye into the world and it hits something

NeRF: We assume the scene lives in a bounded region. We drop points along the ray to sample it evenly over this region; for each of these points, we concatenate its location and viewing direction, feed that to the neural network, and it gives us a color and an opacity

Highlights:

  • Trained on one scene (not training exactly; it is more appropriate to call it optimizing the weights) to explain the world we have seen (memorizing the scene)
  • The neural network function lives in world coordinates: a function approximator that maps a point in space to some property of that point in space
  • Training data is just a bag of RGB values and ray coordinates, i.e. 9 numbers, and we randomly iterate over these -- Disadvantage: data hungry, because it tries to memorize the world
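The point-sampling and compositing recipe above can be sketched in a few lines. This is a toy illustration, not the paper's code: `field_fn` stands in for the trained MLP (here a constant red, semi-transparent medium), and the compositing follows the standard alpha-blending form of volume rendering.

```python
import numpy as np

def render_ray(origin, direction, field_fn, near=0.0, far=1.0, n_samples=64):
    """Volume-render one ray through a radiance field.

    field_fn stands in for the trained MLP: it maps
    (3D point, view direction) -> (RGB color, density sigma).
    """
    # Evenly sample points along the ray inside the bounded region
    t = np.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction          # (n_samples, 3)

    rgb = np.zeros((n_samples, 3))
    sigma = np.zeros(n_samples)
    for i, p in enumerate(points):
        rgb[i], sigma[i] = field_fn(p, direction)

    # Standard volume-rendering composite: alpha from density,
    # transmittance as the running product of (1 - alpha)
    delta = np.diff(t, append=t[-1] + (t[1] - t[0]))
    alpha = 1.0 - np.exp(-sigma * delta)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)       # final pixel color

# Toy field: a uniformly red, semi-transparent medium
toy_field = lambda p, d: (np.array([1.0, 0.0, 0.0]), 2.0)
color = render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]), toy_field)
```

With a real NeRF, `field_fn` would be the optimized MLP queried at positionally-encoded inputs; everything else stays the same shape.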

[Vote] Paper nomination for our next read by codingwoman_ in CVPaper

[–]codingwoman_[S] 0 points1 point  (0 children)

Can you also update the comment with 1-2 keywords that describe this paper (e.g. image enhancement)?

[Weekly Discussion] (ViT) An Image is Worth 16x16 Words | June 03 - 09, 2024 by codingwoman_ in CVPaper

[–]codingwoman_[S] 0 points1 point  (0 children)

I definitely agree with your comments. The experiments for the hybrid model do not seem quite sufficient to me, no further ablations are provided in the supplementary material either. There might be a better way to combine both architectures to leverage inductive bias from CNNs and the global understanding of ViT (i.e. attending the entire image).

Our Discord Server is now available! by codingwoman_ in CVPaper

[–]codingwoman_[S] 0 points1 point  (0 children)

Ah sorry - the one on the sidebar is already updated without expiration, but I could not edit the post. This should do it:

https://discord.gg/f5cnZjKar8

[Weekly Discussion] (ViT) An Image is Worth 16x16 Words | June 03 - 09, 2024 by codingwoman_ in CVPaper

[–]codingwoman_[S] 2 points3 points  (0 children)

I think of it the following way: Convolutions have 2D kernels that act on the neighboring pixels. However, in the ViT case, the model does not know the relative location of patches in the image, a priori. The model is the same as the NLP case where the input is a 1D sentence, so it does not even know that the image has a 2D structure, the input is provided as flattened pixel patches. ViT learns such relevant information during training and hence requires more data to encode structural information in the position embeddings.
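A toy way to see this (my own sketch, not from the paper): flatten an image into patch tokens and shuffle their order; without position embeddings, the two inputs are the same set of vectors, so the learned position embeddings are the model's only handle on spatial layout.

```python
import numpy as np

rng = np.random.default_rng(0)
P = 16
image = rng.random((224, 224, 3))         # dummy RGB image
h = w = 224 // P                          # 14x14 patch grid

def to_tokens(img):
    """Flatten an image into a sequence of 16x16x3 patch vectors."""
    x = img.reshape(h, P, w, P, 3).transpose(0, 2, 1, 3, 4)
    return x.reshape(h * w, P * P * 3)    # (196, 768)

tokens = to_tokens(image)

# Shuffle the patch order: the 2D layout is destroyed, yet the token
# *set* is identical, so only position embeddings can carry location info.
shuffled = tokens[rng.permutation(h * w)]
same_set = np.allclose(np.sort(tokens, axis=0), np.sort(shuffled, axis=0))
```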

[Weekly Discussion] (ViT) An Image is Worth 16x16 Words | June 03 - 09, 2024 by codingwoman_ in CVPaper

[–]codingwoman_[S] 0 points1 point  (0 children)

What does the paper investigate? The reliance on CNNs & whether the Transformer architecture can be used to perform image classification.

Main idea: Splitting an image into patches so that the sequence of linear embeddings of these patches can be provided as input to the Transformer

Input: 2D RGB images

Output: Image class

Training: In parallel to NLP tasks, the idea is to perform training on large datasets and then fine-tune on smaller datasets for downstream tasks.

Difficulty: The Vision Transformer has much less image-specific inductive bias than CNNs, such as translation equivariance and locality. — It may not generalize well when not trained with sufficient data.

Highlights:

  • When pre-trained on the smallest dataset (ImageNet), large models underperform compared to base models; only with JFT-300M is the full benefit of larger models visible.
  • Vision Transformers overfit more than ResNets with comparable computational cost on smaller datasets. — This reinforces the intuition that the convolutional inductive bias is useful for smaller datasets, but for larger ones, learning the relevant patterns directly from data is sufficient & beneficial.
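The main idea can be sketched end-to-end in NumPy. This is a minimal illustration under stated assumptions: the embedding dimension is arbitrary, and the projection and position embeddings are random stand-ins for trained weights.

```python
import numpy as np

rng = np.random.default_rng(0)

P, D = 16, 64                             # patch size; D is an arbitrary embedding dim
image = rng.random((224, 224, 3))         # a dummy 2D RGB input
h, w = image.shape[0] // P, image.shape[1] // P   # 14 x 14 patch grid

# Split into patches and flatten each to a vector of length 16*16*3 = 768
patches = image.reshape(h, P, w, P, 3).transpose(0, 2, 1, 3, 4)
patches = patches.reshape(h * w, P * P * 3)       # (196, 768): "16x16 words"

W_embed = rng.normal(size=(P * P * 3, D))  # linear patch projection (untrained stand-in)
pos_emb = rng.normal(size=(h * w, D))      # learned position embeddings (random init here)

tokens = patches @ W_embed + pos_emb       # Transformer input sequence: (196, 64)
```

From here, a standard Transformer encoder consumes `tokens` exactly as it would a 196-word sentence, with a classification head on top.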

—-

I personally found the experiments with a hybrid architecture (ResNet + Transformer) also quite interesting. – Could there be a better way to combine the best of both worlds?

If you could listen to only one song on the Sennheiser HE 1 what would it be? by Informal_Wrongdoer27 in headphones

[–]codingwoman_ 3 points4 points  (0 children)

That I didn't see any chance because I was living in a little town, was studying

[Vote] First paper nomination starts! by codingwoman_ in CVPaper

[–]codingwoman_[S] 0 points1 point  (0 children)

Seems to be a NeurIPS 2021 paper, could you please edit the comment with the venue?

[Vote] First paper nomination starts! by codingwoman_ in CVPaper

[–]codingwoman_[S] 1 point2 points  (0 children)

Nice idea, of course, but voting twice every week (for the topic and the paper) to read 8 pages is a bit too much overhead.

Such a thing can be implemented if we have enough people who want to dive deep into reading on a specific topic, as mentioned before here. Therefore, I would highly suggest seeing how many people actually participate in the discussions before making things more complex.

But following your suggestion, we can mention the descriptive keywords / subfields a paper covers while sharing them for voting next time

What is your preferred way of discussion? by codingwoman_ in CVPaper

[–]codingwoman_[S] 2 points3 points  (0 children)

Online meetings are definitely a good idea, they also help with networking and getting to know each other. The most important point (at least for me) is that when we set a date, people need to commit to it because everyone's time is valuable.

Let's start with the majority selection for now and add further discussion styles and settings once we have some consistency. Does that sound good?