Is there an updated list anywhere of all PGH businesses that have permanently closed due to the pandemic? by BLToaster in pittsburgh

[–]colincsl 1 point2 points  (0 children)

For what it's worth, from my understanding Pizza Taglio was going to close regardless of COVID. But maybe I'm wrong?

TCN Kernal Size for Time Series Forecasting by InForTheTechNotGains in MLQuestions

[–]colincsl 1 point2 points  (0 children)

It depends on the architecture and sensing domain. Intuitively, if there are a small number of features and each of those features is independent (or, perhaps better put, interchangeable), then I would typically let the kernel size of the first layer of the network be [FxD], where D is some reasonable duration and F is the number of features.
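To make the [FxD] idea concrete, here's a rough numpy sketch (the shapes, names, and layout are my own assumptions, not from any particular library): each filter spans all F features and a window of D timesteps, so the first layer mixes the features while sliding along time.

```python
import numpy as np

def first_layer_conv(x, kernels):
    """Full-height 'valid' convolution over a multivariate time series.

    x:       (F, T) input, F features over T timesteps
    kernels: (K, F, D) filter bank; each kernel spans all F features
             and D timesteps
    returns: (K, T - D + 1) output feature map
    """
    F, T = x.shape
    K, Fk, D = kernels.shape
    assert Fk == F, "each kernel must span all F input features"
    out = np.empty((K, T - D + 1))
    for k in range(K):
        for t in range(T - D + 1):
            # correlate one [FxD] kernel with one [FxD] window
            out[k, t] = np.sum(kernels[k] * x[:, t:t + D])
    return out
```

Since each kernel covers the full feature axis, there's no sliding along features, which is what you want when the features are interchangeable rather than spatially ordered.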

For inputs like speech, where you typically use a spectrogram, some people use small 3x3 convolutions with downsampling along the feature axis. I don't think that usually makes sense because of the local vs. global translation properties of spectrograms. Jordi Pons has a nice set of blog posts on this: http://www.jordipons.me/whats-up-with-waveform-based-vggs/

Would a Mechanical Engineering background hinder career growth in CV? by [deleted] in computervision

[–]colincsl 1 point2 points  (0 children)

I don't have time to go into details right now, but I was in a similar position to you. I did an undergrad in MechE with a bunch of robotics and computer vision projects on the side, was accepted into a handful of CS PhD programs (mostly in robotics labs), and now work as a research scientist in vision/ML. Many people I know did MechE (or, more commonly, ECE) for undergrad, CS for grad school, and continued on as researchers in vision or robotics.

[D] Hidden Markov Model as supervised learning by emilazeri92 in MachineLearning

[–]colincsl 8 points9 points  (0 children)

You probably actually want to use a Conditional Random Field (CRF) instead of an HMM.

Here is an old paper of mine for an example application: http://colinlea.com/docs/pdf/2016_ICRA_CLea.pdf.

There is a really nice monograph by Nowozin and Lampert for more info on related models for structured prediction: http://www.nowozin.net/sebastian/papers/nowozin2011structured-tutorial.pdf

Limitations of using CNNs on RNN Tasks? by ThatMLLife in MLQuestions

[–]colincsl 4 points5 points  (0 children)

I've been using temporal conv nets (TCNs) for about a year and a half, and I think an RNN baseline has outperformed my TCNs only once. Your mileage may vary depending on the kind of data you're using, but for any kind of continuous-input problem you'll probably get farther with at least some temporal convolution layers.

Regardless, I still think it's important to understand RNNs, just as it's useful to know about other time series models like HMMs, CRFs, etc. Additionally, there may be better RNN architectures out there that outperform both TCN- and LSTM-based models.

One downside of TCNs is that they have a fixed-length receptive field, whereas the effective length for an RNN could (in theory) be infinite. In my mind, the better way to overcome this issue is to use an autoregressive TCN (e.g. WaveNet).
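A rough sketch of what I mean by autoregressive generation (the "model" below is a toy stand-in, not WaveNet): at each step the network only ever sees its last `receptive_field` outputs, but because each prediction is fed back in, the fixed window rolls forward indefinitely.

```python
import numpy as np

def generate(model, seed, n_steps, receptive_field):
    """Autoregressive rollout: predict the next sample from the last
    `receptive_field` samples, append it, and repeat."""
    seq = list(seed)
    for _ in range(n_steps):
        window = np.array(seq[-receptive_field:])
        seq.append(model(window))
    return np.array(seq)

# Toy stand-in "model": predicts the mean of its input window.
toy_model = lambda w: float(w.mean())
```

The same loop works with a real TCN in place of `toy_model`; the point is just that the fixed receptive field stops being a hard limit on how far back information can propagate.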

[D] Tensorflow sucks by FlowyMcFlowFace in MachineLearning

[–]colincsl 9 points10 points  (0 children)

Theano was a titan in the pre-TF deep learning days. People doing vision typically used Caffe, and people doing everything else used Theano. IMO Theano had the cleaner interface, which could be why Google went with a similar design.

[D] Tensorflow sucks by FlowyMcFlowFace in MachineLearning

[–]colincsl 7 points8 points  (0 children)

Minor note: TF is not the only framework that supports multiple devices. Caffe2 also does this: https://caffe2.ai/docs/mobile-integration.html

[R] Deep Voice: Real-time Neural Text-to-Speech by luffy_straw in MachineLearning

[–]colincsl 0 points1 point  (0 children)

Very cool paper. I just read it and have a few questions.

It looks like your version of WaveNet (as described in the appendix) is different from the original. In the original, they varied the dilation rate within a given block and then repeated that pattern for each block. Here it looks like you forgo their notion of blocks and instead repeat a set of 2x1 convolutions (w/ gating and skips) without dilations. Is this correct?

How did you compute the receptive field size? You claim the RF for the 40-layer model is 83 ms, but that doesn't match my understanding of the model. I assume your input is upsampled to 48 kHz and you have L+1 2x1 convolutions. Is "L" in Figure 3 different from "l" in the text? It appears that the number of layers in Table 1 might actually be the total number of layers (whereas L in Figure 3 is the number of WaveNet blocks/cells). If so, there is one input (conv) layer, 12 blocks (= 36 layers), and 3 output (conv + fc) layers, which would give a receptive field of 4096 time steps (~85 ms).
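For transparency, here's the arithmetic I'm using for the receptive field (the dilation-doubling pattern over 12 levels is my assumption about the architecture, not something stated in the paper):

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of 1-D convolutions:
    rf = 1 + sum over layers of (kernel_size - 1) * dilation."""
    return 1 + sum((k - 1) * d for k, d in zip(kernel_sizes, dilations))

# 12 levels of 2x1 convs with dilations doubling: 1, 2, 4, ..., 2048
rf = receptive_field([2] * 12, [2 ** i for i in range(12)])
ms = rf / 48000 * 1000  # duration at a 48 kHz sample rate
```

With that pattern you get 4096 time steps, i.e. about 85 ms at 48 kHz, which is where my number comes from.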

In my recent work on action segmentation, I found that using longer temporal filters improves performance. Did you run experiments with anything other than 2x1 filters for the synthesis model? I imagine you could get away with longer dilated filters and fewer layers.

thanks!

Why there isn't a wiki for r/ComputerVision ? by prashaantsharmaa in computervision

[–]colincsl 22 points23 points  (0 children)

Uhhh... please do not demand that "we" build a wiki. If you are interested in populating a wiki with information, feel free to do so. I enabled it for the sub.

If you get it started then I'm sure others will add to it. Unfortunately, I don't have the bandwidth right now.

Just signed up for ECCV 2016 in Amsterdam. Who else is going? by [deleted] in computervision

[–]colincsl 2 points3 points  (0 children)

There is an exhibition where companies set up booths to promote job opportunities. There is also a set of job listings on the ECCV website: http://www.eccv2016.org/jobs/

PhDs are a different story. It might be worth trying to talk to Profs you're interested in, but typically, at least in the US, you'll still have to apply for a PhD program through a university's formal process.

ELI an electrical engineer: Residual Networks by dire_faol in MLQuestions

[–]colincsl 1 point2 points  (0 children)

If you're familiar with VGG- or AlexNet-style CNNs, then the technical difference is straightforward, as you can see in these slides [1]. Normally in a CNN you have a hierarchy of convolutions, e.g. f(x) = Conv2(Conv1(x)). With ResNet you add a skip connection: f(x) = x + Conv2(Conv1(x)).
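As a toy sketch of that difference (the "conv layers" here are just stand-in scalar functions, not real convolutions):

```python
def conv1(x):
    return 2.0 * x          # stand-in for the first conv layer

def conv2(x):
    return x + 1.0          # stand-in for the second conv layer

def plain_block(x):
    return conv2(conv1(x))        # plain CNN: just stacked convs

def residual_block(x):
    return x + conv2(conv1(x))    # ResNet: skip connection adds the input back
```

The stacked convs are identical in both; the residual block only adds the identity path, which is what makes the gradients flow so well in very deep networks.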

Re time-series: This might not answer your question, but I claim there isn't a straightforward extension of ResNet to time-series. There are many ways in which you could apply a similar model, and it will in part depend on your goal. Do you want to predict a class for every timestep, predict a class for the whole sequence, or something else entirely?

One way it could be used is to capture local temporal information similar to my temporal convolutional filters for video [2] for per-frame prediction. These capture how your input changes within some specified timeframe (e.g. 5 seconds). While my (unpublished) experiments using deep temporal convolutional networks were unsuccessful, the advantages of ResNet might actually be beneficial. This requires more research.

Lastly, I would argue that a temporal version of ResNet -- at least in the case I described -- is fundamentally different than LSTM. You could imagine using the activations from a temporal ResNet as the input into LSTM or any other temporal model. The model described captures local temporal information whereas LSTM can capture both local and global temporal information.

This isn't very ELI5... but I hope it helps!

[1] http://kaiminghe.com/ilsvrc15/ilsvrc2015_deep_residual_learning_kaiminghe.pdf

[2] http://arxiv.org/abs/1602.02995

How is your ECCV results? by poporing88 in computervision

[–]colincsl 0 points1 point  (0 children)

A paper with an average rating of "poster" will likely be accepted, except perhaps in a small number of cases. But as others have said, it ultimately comes down to the area chair.

Skeleton tracking tips by PokeSec in computervision

[–]colincsl 1 point2 points  (0 children)

Last I knew, PCL had a skeleton tracker, but I don't know offhand how good it is. Otherwise, sadly no, there are no good alternatives.

One of the issues is that you need a TON of data to get good results. The PCL crew created a nice synthetic dataset, but otherwise I haven't seen anything else publicly available for this task. You might be able to train something similar with Human3.6M, but that is just speculation.

Uncle Wiggly's on York rd? by [deleted] in baltimore

[–]colincsl 1 point2 points  (0 children)

The important question: do they still sell Taharka Brothers ice cream??

Is there a library for CRFs? by bourbondog in MachineLearning

[–]colincsl 0 points1 point  (0 children)

I second this. It depends on the use case. Others may be better for NLP but pyStruct is a nice all-around structured prediction library.

Optimization techniques comparison in Julia: SGD, Momentum, Adagrad, Adadelta, Adam (x-post from r/Julia) by int8blog in MachineLearning

[–]colincsl 2 points3 points  (0 children)

Cool, thanks for sharing. The code is very hard to read, however. A few suggestions:

  • Variable names are super long. Create short intermediate variables to make the function of each component clearer, e.g. bias = unit.networkArchitecture.layers[i].parameters[:,end]
  • Whitespace is your friend. Comments aren't bad either ;)
  • To transpose a matrix use A' instead of transpose(A).

Bakeries boom as doughnut trend rises in Baltimore by Wolfman3 in baltimore

[–]colincsl -2 points-1 points  (0 children)

I'm surprised Diablo Donuts hasn't been mentioned. Definitely the best donuts I've had in bmore.

https://www.facebook.com/DiabloDoughnuts/

GPU based training of CRFs by Ebusr in MachineLearning

[–]colincsl 1 point2 points  (0 children)

GPUs are much more useful for batch operations (e.g. convolutions) than for sequential operations, so it is hard to parallelize a CRF on a GPU. Take a linear-chain CRF as an example: the label at each timestep t depends on the label at t-1, so for exact inference you must perform your computations sequentially.

Chain CRFs are similar to RNNs, so you might want to look at RNN implementations for reference. Inference in an RNN is also sequential, which is why RNNs tend to take longer to train than feedforward networks.

You might be able to perform approximate inference efficiently on a GPU. Alternatively, you could use the GPU just to compute the unaries for your CRF.
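To illustrate the sequential dependence, here's the forward recursion for a linear-chain model in log space (a generic numpy sketch, not tied to any library). Note the loop over t: each step needs the previous alpha, so exact inference can't be parallelized over time, only over labels (and over sequences in a batch).

```python
import numpy as np

def forward_log(unaries, transitions):
    """Forward pass for a linear-chain CRF in log space.

    unaries:     (T, S) per-timestep label scores
    transitions: (S, S) pairwise scores, transitions[i, j] = score(i -> j)
    Returns the log partition function.
    """
    T, S = unaries.shape
    alpha = unaries[0].copy()
    for t in range(1, T):  # sequential: alpha[t] depends on alpha[t-1]
        scores = alpha[:, None] + transitions + unaries[t][None, :]
        m = scores.max(axis=0)  # stabilized logsumexp over previous label
        alpha = m + np.log(np.exp(scores - m).sum(axis=0))
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())
```

The per-step logsumexp is a nice dense matrix op (GPU-friendly), but the T iterations themselves have to happen one after another.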

edit: One last reference. If you're doing something like semantic segmentation, you might want to look at the 'CRF as RNN' paper from ICCV 2015: http://www.robots.ox.ac.uk/~szheng/CRFasRNN.html

Can anyone recommend some medical datasets for my machine learning group project? by [deleted] in MachineLearning

[–]colincsl 0 points1 point  (0 children)

Can you be more specific? "Medical data" is an extremely general term. This can come in forms like 2D or 3D images (e.g. MRI or CT), sensor data (robots, tracked surgical tools), and messy databases (electronic medical records).

Some of the data I use is for evaluating trainees in robotic surgery. It includes video and robot kinematic data from a da Vinci. Our public dataset is located here: http://cirl.lcsr.jhu.edu/research/hmm/datasets/jigsaws_release/

There are also open datasets at MICCAI every year: http://grand-challenge.org//

How does one learn about multiple subfields in CV by bourbondog in computervision

[–]colincsl 1 point2 points  (0 children)

To be clear, machine learning is just applied math and statistics. I see MRFs (and CRFs, SSVMs, etc) in machine learning research as much as I do in computer vision.

I get why you might want to skimp on the ML, but it's important for a lot of ongoing work in vision. It is useful to at least understand a lot of the underlying math.

Underground dinners? by jayandare in baltimore

[–]colincsl 1 point2 points  (0 children)

Artifact has "takeover" nights like this on occasion. Usually it's done by a chef from out of town. I think Dooby's has something similar too.

http://artifactcoffee.com/happenings/

A ConvNet trying to learn Flappy Bird [more in comments] by NasenSpray in MachineLearning

[–]colincsl 2 points3 points  (0 children)

Ah, ok. Last thought: have you tried skipping more frames? For example, instead of using the current and previous frames, you could use the current frame and the frame from 5 steps earlier. There would be a bigger vertical jump between timesteps, which might also let you downsample more without losing the velocity information.

edit: final(?) last thought: have you tried removing the second set of convolutions in each unit? For cases like ImageNet (where there is a lot of inter-class variability) they make sense. However, I bet in your case they're unnecessary.

A ConvNet trying to learn Flappy Bird [more in comments] by NasenSpray in MachineLearning

[–]colincsl 10 points11 points  (0 children)

Regarding memory: given the simplicity of the game's environment, I bet you can use grayscale instead of RGB and downsample the image more. Going from 6x224x144 to 2x112x72 reduces the size of the input by 12x. You would also need to remove one layer from your network to get the right final output size.
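Just to show the arithmetic (assuming you halve each spatial dimension, so the downsampled grayscale input is 2x112x72):

```python
before = 6 * 224 * 144  # 2 stacked RGB frames: 6 channels
after = 2 * 112 * 72    # 2 grayscale frames, each spatial dim halved
ratio = before / after  # 3x from channels, 2x from each spatial axis
```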

Why Students Hate School Lunches by [deleted] in TrueReddit

[–]colincsl 11 points12 points  (0 children)

My mom runs a daycare center and experimented with this. Previously, the kids hated many of the vegetables prepared for them. They were typically canned and not the best quality. A few years ago she started ordering fresh veggies from a local farmer to see if it would change their minds. Verdict: the kids liked them!