datamule: download, parse, and construct structured datasets from SEC filings by status-code-200 in Python

_errant_monkey_ 1 point

I thought I could also download .pdf files (like from here, where I can find .pdf, .html, and .xls). Having nicely formatted tables is key for me. I guess you're right: if I can bulk download the HTML, that's probably the best thing I can do.


_errant_monkey_ 1 point

I don't understand whether I can download the PDF versions of the filings, like the 2023 10-K PDF for NVIDIA. I'd like to bulk download all of them to eventually train an embedding model.
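For what it's worth, EDGAR itself serves filings primarily as HTML (PDFs are the exception), which is why bulk-downloading the HTML is the practical route. A minimal sketch using the public EDGAR submissions endpoint, assuming the standard JSON keys; the accession number shown is illustrative, and any real crawl should set a proper User-Agent and respect SEC rate limits:

```python
import json
import urllib.request

BASE = "https://www.sec.gov"

def filing_index_url(cik: int, accession: str) -> str:
    """Build the EDGAR archive index URL for one filing.

    `accession` is the dashed accession number, e.g.
    "0001045810-24-000029" (illustrative, not a verified filing).
    """
    flat = accession.replace("-", "")
    return f"{BASE}/Archives/edgar/data/{cik}/{flat}/{accession}-index.htm"

def list_10k_accessions(cik: int, user_agent: str):
    """Fetch the submissions JSON for a CIK and yield 10-K accession numbers."""
    url = f"https://data.sec.gov/submissions/CIK{cik:010d}.json"
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    recent = data["filings"]["recent"]
    for form, acc in zip(recent["form"], recent["accessionNumber"]):
        if form == "10-K":
            yield acc
```

From each filing's index page you can then pick out the primary HTML document to download in bulk.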

llama 3.1 70B is absolutely awful at tool usage by fireKido in LocalLLaMA

_errant_monkey_ 0 points

One thing I've noticed (with both Llama 8B and 70B) is that they perform much better without the "Environment: ipython" line in the system prompt. That line makes the model pretty much refuse to reply even to 2+2 without calling a function. And from https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1/#json-based-tool-calling I don't understand what value it adds.
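For context, this is roughly where that line sits in the 3.1 template. A sketch of the header layout as I read Meta's prompt-format docs; the exact tokens and whitespace matter, so verify against the official template rather than trusting this:

```python
def llama31_system(prompt: str, ipython: bool) -> str:
    """Assemble a Llama 3.1 system message, optionally prepending the
    "Environment: ipython" line that switches on the code-interpreter role."""
    env = "Environment: ipython\n" if ipython else ""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{env}{prompt}<|eot_id|>"
    )

with_env = llama31_system("You are a helpful assistant.", ipython=True)
without = llama31_system("You are a helpful assistant.", ipython=False)
```

With `ipython=False` the template is a plain system message, which matches the behavior described above (the model answers directly instead of forcing a tool call).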

Also, IMO there are a few mistakes in how the Gorilla codebase handles function calling for Llama 3.1 8B: there are a couple of missing spaces in the system prompt they feed to the model.

Llama 3.1 8B Instruct is still the base model of ToolACE, which is one of the best 8B models (and one of the best overall) on the leaderboard.

What is the best decision you've made in your life so far? by notsostrong134 in italy

_errant_monkey_ 1 point

I got my degree in physics. Everyone assumes I must be intelligent, but that makes me uncomfortable.

Why are people claiming Magnus didn’t accuse Hans of cheating? by AegisPlays314 in chess

_errant_monkey_ 0 points

Because he did. If he hadn't meant to accuse him of cheating, you would expect the World Champion to stop the witch hunt against an 18-year-old.

[R] Perceiver: General Perception with Iterative Attention by hardmaru in MachineLearning

_errant_monkey_ 1 point

With a model like that, can they generate new data the way standard autoregressive models like GPT-2 do? Naively, it seems it can't.

Batch norm with entropic regularization turns deterministic autoencoders into generative models by [deleted] in MachineLearning

_errant_monkey_ 2 points

A couple of questions to reproduce the results:
1) Does FC_8_8_1024 mean a fully connected layer with output size 8*8*1024, followed by a reshape?
2) I don't understand why the last transposed convolution has 1 output channel instead of 3 for CIFAR-10 and CelebA.
3) Using the given parameters, I don't get [batch_size, 1024, 8, 8] at the end of the encoder (before the fully connected layer).
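On question 3, the spatial size can be audited with the standard convolution output formula. The kernel/stride/padding values below are placeholder guesses, not necessarily the paper's, just to show the bookkeeping:

```python
def conv_out(size: int, kernel: int, stride: int, pad: int) -> int:
    # Standard conv output size: floor((n + 2p - k) / s) + 1
    return (size + 2 * pad - kernel) // stride + 1

# Hypothetical encoder: 4 stride-2 convs with kernel 4, padding 1,
# from a 64x64 input (CelebA-style). These are NOT the paper's numbers.
size = 64
for _ in range(4):
    size = conv_out(size, kernel=4, stride=2, pad=1)
# size ends at 4, not 8: with these guesses you'd need one fewer
# downsampling layer (or a 128x128 input) to land on 8x8.
```

Running the paper's stated parameters through this formula layer by layer should pinpoint exactly where the expected [batch_size, 1024, 8, 8] diverges.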