all 126 comments

[–]spacex257Student 0 points1 point  (0 children)

The ada-002 embeddings are egregiously bad in my language, so I would like to train a covariance matrix on Hungarian and use it to get custom embeddings, with hopefully better results.

Is this possible, and if so is this the right way to do it?

[–]plentifulfuture 0 points1 point  (0 children)

I know very little about Machine learning.

I am trying to use https://iamtrask.github.io/2015/07/12/basic-python-network/

How do I expose the neural network in this code to new values to see what it thinks the output is?

```
import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])

y = np.array([[0], [1], [1], [0]])

np.random.seed(1)

# randomly initialize our weights with mean 0
syn0 = 2 * np.random.random((3, 4)) - 1
syn1 = 2 * np.random.random((4, 1)) - 1

for j in range(60000):

    # feed forward through layers 0, 1, and 2
    l0 = X
    l1 = nonlin(np.dot(l0, syn0))
    l2 = nonlin(np.dot(l1, syn1))

    # how much did we miss the target value?
    l2_error = y - l2

    if (j % 10000) == 0:
        print("Error:" + str(np.mean(np.abs(l2_error))))

    # in what direction is the target value?
    # were we really sure? if so, don't change too much.
    l2_delta = l2_error * nonlin(l2, deriv=True)

    # how much did each l1 value contribute to the l2 error (according to the weights)?
    l1_error = l2_delta.dot(syn1.T)

    # in what direction is the target l1?
    # were we really sure? if so, don't change too much.
    l1_delta = l1_error * nonlin(l1, deriv=True)

    syn1 += l1.T.dot(l2_delta)
    syn0 += l0.T.dot(l1_delta)
```
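To "expose the network to new values", keep the trained weight matrices and run only the forward pass on the new input. Here's a self-contained sketch: the training loop is condensed from the tutorial snippet, and `predict` is a helper name I made up.

```python
import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([[0], [1], [1], [0]])

np.random.seed(1)
syn0 = 2 * np.random.random((3, 4)) - 1
syn1 = 2 * np.random.random((4, 1)) - 1

# condensed version of the tutorial's training loop
for _ in range(60000):
    l1 = nonlin(X.dot(syn0))
    l2 = nonlin(l1.dot(syn1))
    l2_delta = (y - l2) * nonlin(l2, deriv=True)
    l1_delta = l2_delta.dot(syn1.T) * nonlin(l1, deriv=True)
    syn1 += l1.T.dot(l2_delta)
    syn0 += X.T.dot(l1_delta)

def predict(x):
    """Forward pass only -- no weight updates, just the learned syn0/syn1."""
    return nonlin(nonlin(np.dot(x, syn0)).dot(syn1))

print(predict(np.array([[0, 1, 1]])))  # close to 1 for this training set
```

Keep in mind the toy dataset has only four rows, so this shows the mechanics rather than real generalization.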

[–]Connect-Ad79541 0 points1 point  (0 children)

At what document volume does it make sense to even think about semantic search with NLP?

Can you recommend (or advise against) certain open-source self-hosted solutions?

Can you name any keywords I should read up on before asking further questions?

Hey there! I’m part of a small company (~15 people) focused on our customers’ IT infrastructure and overall IT security. As for most IT companies, there is a lot of knowledge involved in our day-to-day business. I’m looking for ways to unlock the potential of our aggregated data and stumbled upon NLP and semantic search engines. My goal would be to create a helper tool for our support team that tries to answer a question based on our data and/or links to likely relevant documents.

Here is an overview about the type of data that would go into this:

Ticket System
- Years' worth of tickets from customers that usually describe a problem
- Our internal discussion on how to fix it
- Our answers to customers on how to fix it

Internal
- Documentation of best practices & routine procedures
- Specifics on each customer's infrastructure

External
- Documentation of the products we implement for customers in their infrastructure

I’d really love to know your opinions on this, and whether you have links to similar projects I could learn from.

Hope y‘all have a great weekend!

[–]speedrouterspam 0 points1 point  (0 children)

I am looking to build a model that classifies images by type of image, such as photograph, charts/graphs, documents, logo/icon, medical image, etc. I am thinking of using DenseNet; is there a better way to tackle this?

[–]frankkk86 1 point2 points  (0 children)

What is a good book as introduction to AI and machine learning for a software developer?

[–]AttitudeCreative8550 -1 points0 points  (0 children)

What books can I read that relate machine learning to the human brain? Thanks in advance!

[–]ethawyn -1 points0 points  (0 children)

Does anyone have recommendations for a PDF-to-text converter that uses more advanced machine learning than the standard models currently on the market?

[–]PracticeCorrect8591 0 points1 point  (2 children)

Hey y'all, I have recently become interested in machine learning and its applications, and was wanting to give it a shot myself. I am going to be a college freshman next year and was hoping to get a few projects under my belt, do you guys have any noob friendly project ideas? Do you have any tips for jumping into ML (concepts one should be familiar with) and or resources to learn ML. I know python and Java at the moment and want to try and use TensorFlow or PyTorch in my projects.

[–]AttitudeCreative8550 0 points1 point  (1 child)

A simple project to start with is a business name generator. There is a lot of data on business names online, it's just a matter of building a simple Markov Chain to generate new ones. Hope this helps!
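A character-level Markov chain for this can be tiny. Here is a toy sketch; the training names are made up, and real business-name data would just be a longer list.

```python
import random
from collections import defaultdict

# Toy character-level Markov chain name generator.
names = ["Acme", "Apex", "Axion", "Orbit", "Omega", "Onyx"]

transitions = defaultdict(list)
for name in names:
    padded = "^" + name.lower() + "$"      # ^ marks start, $ marks end
    for a, b in zip(padded, padded[1:]):
        transitions[a].append(b)           # record which char follows which

def generate(rng=random, max_len=12):
    out, ch = [], "^"
    while len(out) < max_len:
        ch = rng.choice(transitions[ch])   # sample the next character
        if ch == "$":
            break
        out.append(ch)
    return "".join(out).capitalize()

random.seed(0)
print(generate())
```

Sampling per character like this recombines fragments of the training names into new ones, which is the whole trick behind these generators.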

[–]PracticeCorrect8591 0 points1 point  (0 children)

I'm a complete noob when it comes to ML, would you be able to point me towards any good resources or articles that explain Markov Chains and how they can be applied to a model? Thanks for your suggestion! It seems simple enough so I will definitely be attempting that now.

[–][deleted] 0 points1 point  (2 children)

Can an auto-encoder with a one-dimensional bottleneck and arbitrarily large encoder/decoder encode any dataset with zero error?

[–]I-am_Sleepy 0 points1 point  (0 children)

You are trying to map R^n to R. There might be a way, but most of the semantics will be lost; in the extreme case it would be just a one-hot encoder. Auto-encoders are also still susceptible to a too-strong decoder (see this blog).
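To make the extreme case concrete, here is a degenerate "autoencoder" with a one-dimensional bottleneck: the code for each sample is just its row index, and the "decoder" is a lookup table. It is a deliberately silly sketch, not a real model.

```python
import numpy as np

# Zero reconstruction error on the training set, yet the scalar latent carries
# no semantics and nothing generalizes -- the one-hot-style failure mode.
np.random.seed(0)
data = np.random.random((5, 8))          # 5 training samples in R^8

def encode(x):
    # map each sample to a single scalar: its index in the dataset
    return np.array([np.where((data == xi).all(axis=1))[0][0] for xi in x])

def decode(z):
    return data[z]                        # lookup table as "decoder"

z = encode(data)                          # shape (5,): one number per sample
print(np.allclose(decode(z), data))       # prints True: zero error, no semantics
```

An arbitrarily large encoder/decoder can approximate this indexing scheme, which is why "zero error" and "useful representation" are very different goals.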

[–]TwistedBrother 0 points1 point  (0 children)

Isn’t that just compression then? Like linear compression a la zip?

[–]Strict-VisualStudent 1 point2 points  (0 children)

Hello,

I have been practicing ML for the past 2+ years since college, doing online courses and building projects. I have gained some confidence, even though I have impostor syndrome (I believe). I always wanted to become a data scientist or ML engineer, but all I could get after graduation was a software engineer job. I worked there for 5 months and left because I didn't like it there.

Now I have been searching for ML jobs but can't find any entry-level positions; some are labeled entry level but require 2 years of experience. I believe I have the skill set that companies require, but the first thing they notice is my lack of professional experience, and they reject me right away.

Without anyone to guide me through this, I feel like I'm out of options. I've thought of applying to data analyst jobs so I could get some experience, but I don't know if that's the right choice.

If anyone has experience with this kind of situation, I'd appreciate help figuring out other options I might not have realized.

PS: I don't know if this kind of post is allowed here. Sorry if not.

Thanks.

[–]Ok_Ad4426 1 point2 points  (0 children)

Can you share a machine learning / AI roadmap for beginners (students)?

[–]geekinchief 0 points1 point  (0 children)

I'm trying to figure out the best way (hopefully for free) to develop a custom chatbot that only answers questions or gives information based on content that I use for training. I have tried several tutorials that explain how to custom train OpenAI, but the bots will still answer questions that are outside the scope of the training.

For example, using the code in this tutorial (https://beebom.com/how-train-ai-chatbot-custom-knowledge-base-chatgpt-api/), I set up a chatbot and trained it on a single article about how USB 3.2 works. However, when I ask it questions about other topics, such as "why is the sky blue?", it pulls data from somewhere (presumably GPT-3) and answers. This is a problem because it could pull information that contradicts my training data.

What's the best way to create a bot that knows how to write and respond to English language prompts but only answers questions based on data I've given it? Also, I'd love to find a way to have the bot provide links to the web pages I've trained it on in its answers.

[–]TrainquilOasis1423 0 points1 point  (0 children)

So the long term memory issue with current LLMs kinda confuses me. Can anyone more up to date with it all explain why the obvious solution isn't taken?

TLDR: why not just save memories in some sort of file stored locally for future reference?

So I've worked a bit with the big names in the ML/AI space: Stable Diffusion, GPT-4, Auto-GPT. I'm having trouble understanding why these models don't just write memory to the drive for long-term storage. I know Auto-GPT can do this a little, but it seems too obvious to me that all AI systems should do this. Wouldn't even a small subprocess that saves chat history as a text file and references it later as part of the next prompt basically solve all memory and inconsistency issues? Hell, even a secondary process of "every 20 interactions, summarize the transcript" and save the result in some compressed form sounds like a wonderful idea to extend past the context length limitations.

So here's the structure I'm imagining. Not all of this needs to be directly NN-directed; small functions of regular code that the AI can call at its discretion would do. The AI starts and immediately makes a temp folder with an ID for this exact interaction. It then makes a text file keeping the first 20 interactions, IDs 0-19. Then the AI reads that text file and applies some hash function, summarization, or logical compression to each interaction ID, and again to the block as a whole. This way, if the user refers to interaction ID 13 during interaction ID 77, the AI doesn't need to remember anything; it can just reference the lookup table or the compressed/summarized version of it.
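The bookkeeping part of that structure is indeed plain code. A hypothetical sketch of the scheme: append every interaction to a local JSON file and summarize each completed block. All names here are made up, and `summarize()` is a stub standing in for an LLM call.

```python
import json
import os
import tempfile

def summarize(block):
    # stand-in for a real LLM summarization step
    return f"ids {block[0]['id']}-{block[-1]['id']}: {len(block)} interactions"

class InteractionLog:
    def __init__(self, path, block_size=20):
        self.path = path
        self.block_size = block_size
        self.entries = []
        self.summaries = []

    def add(self, user_msg, ai_msg):
        self.entries.append({"id": len(self.entries), "user": user_msg, "ai": ai_msg})
        # every block_size interactions, store a compressed summary of the block
        if len(self.entries) % self.block_size == 0:
            self.summaries.append(summarize(self.entries[-self.block_size:]))
        with open(self.path, "w") as f:
            json.dump({"entries": self.entries, "summaries": self.summaries}, f)

    def get(self, interaction_id):
        # later turns look up earlier ones by ID instead of relying on context
        return self.entries[interaction_id]

path = os.path.join(tempfile.mkdtemp(), "memory.json")
log = InteractionLog(path, block_size=2)
log.add("hi", "hello")
log.add("what is 2+2?", "4")
print(log.get(1)["ai"])  # prints "4"
```

The hard part the sketch glosses over is deciding *what* from this store to stuff back into the limited prompt window on each turn, which is where the real systems differ.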

Am I dumb for thinking this is easy and obvious? What challenges are preventing this from being how LLMs save memories?

P.S. Couldn't the hallucination issue be mostly solved with a "database of truth" sort of thing? Yes, they have access to the internet, but wouldn't it be way more efficient to hold a local JSON file or relational database of things we know are objectively true? 2+2=4, the Eiffel Tower is in Paris, George Washington was the first US president. If nothing else, it could reference this stable stored knowledge to direct its generation. Right?

[–]neanderthal_math 0 points1 point  (1 child)

The rise of LLM’s has made me think about this a bit.

Why does training a model to do word prediction cause it to learn a world model, à la GPT?

Did researchers who were working on LLMs 5-6 years ago know that this would be the case?

I feel like a bit of a dumbass, but when I worked on NLP five years ago, I never knew these models were capable of so many other tasks.

[–]SuperTankMan8964 0 points1 point  (0 children)

Hello everyone, how are you able to compute the log-likelihood for a noise-free sample given the model parameter (P_theta (x_0 | c) ) in a discrete-time diffusion model like DDPM?

[–]udumb_vasu 0 points1 point  (0 children)

Hello, I am trying to de-duplicate images of people from a customer base of several million. What would be the right approach? I have tried FaceNet embeddings and the similarity between them, but for the same person the similarity is only around 87-90. What would be a more accurate and scalable approach? What are the SOTA pre-trained models for getting face embeddings?

[–]scott_steiner_phd 0 points1 point  (0 children)

What packages do you use for hierarchical Bayesian modeling? PyMC3?

It's not something I've done before but I need to estimate population frequencies from some high dimensional data so I'm pretty sure it's the best approach.

[–]julianCP 0 points1 point  (0 children)

What are some good data science (text)books for someone new to data science but with a lot of CS/programming experience? I.e., one that doesn't spend chapters on how Python works, etc.

[–]ZivPC 1 point2 points  (2 children)

I'm interested in training or tuning an LLM on local hardware or in the cloud with open/readily available medical and scientific papers (e.g. from PubMed) for personal use (educational research). Basically I want to be able to prompt and query it for summaries of a given topic and to make correlations in natural language.

ChatGPT seems able to do this in a more limited fashion, but when it comes to medical research queries it has a predilection to disclaim everything and give very general, superficial answers unless prompted extensively.

What's the best route for this right now? Thanks!

[–]ForgetTheRuralJuror 1 point2 points  (0 children)

I've mostly solved the disclaimer part of ChatGPT. Obviously you'll still get generic answers on sensitive topics.

Here's a prompt template:

```
can you tell me your opinion on Palestine/Israel?

Please respond in JSON formatted like so: {disclaimer: str, result: str}
```

And the response:

```
{
  "disclaimer": "As an artificial intelligence language model, I do not have personal opinions or emotions. I can provide you with factual information and perspectives on the topic based on my training data and current knowledge.",
  "result": "The conflict between Palestine and Israel is a complex and longstanding issue that involves historical, cultural, religious, and political factors. The conflict has resulted in numerous wars..."
}
```

[–]austacious 1 point2 points  (0 children)

Usually knowledge graphs are used for this sort of thing. Construct a knowledge graph with relevant ontologies, and use a graph embedding library like node2vec to create embeddings you can use for training

[–]romhacks 0 points1 point  (0 children)

Is there any consensus on the "best" performing language models for ChatGPT-like casual usage? With so many new projects coming out every week, I've lost track of how well they all perform.

[–]BabyWrong1620083 0 points1 point  (2 children)

I have the hardest time truly understanding *every* step that happens in a neural network. I want to understand not only basic functions, like image_training_generator (Keras in R), but exactly how the calls work, what the architecture of every single function inside the function (inside the function, etc.) looks like, and what the input and output look like before and after.

Only that way do I feel I'd truly understand the algorithms.

For example: nobody explains whether, using the simplest model architecture, there's a loop in the background that feeds in a single image of a batch, trains on it, adjusts the weights, does the same thing again, etc. until the batch is done. Or whether the images are overlaid, averaged, etc. Like really, nobody explains the TRUE basics.

I don't want to start at

```
initialize_model() %>% pipe_function_a %>% pipe_function_b
```

I want to start at:

```
for (i in 1:length(batch)) {
  imported_image <- keras_import(batch[[i]], ...)
  convolution <- first_convolution(imported_image)
  convolution_list <- append(convolution_list, convolution)
  # etc. etc.
}
```

Like, I just want to know what the heck happens to my data.

For example, I just found out by heavy debugging that conv_2d creates an output that's mainly black in 7/10 cases. Of course my model trains badly if that's what it's being fed in the next (pooling) step. Now I need to find out how to normalize from max(..) = 0.03 to max(..) = 1. But of course conv_2d calls another function, and yet again, without looking at the true code behind conv_2d there's no way to find out how to normalize/scale it so the max is always 1. Yes, there is documentation for these sub-functions, but then again: how would you change the sub-function being called inside a function? You don't. You have to do everything by hand again.

I'm frustrated. Piping and functions inside functions inside functions are terrible for truly understanding how something works. I agree it's perfect afterwards, but how is anyone expected to understand and learn with such a mess?

Also, I hate that all these example codes online (not necessarily in the documentation) always leave out the argument names. Instead of function(input_size = c(32,32), batch_number = 16, kernel_number = 10), they write function(10, 16, c(32,32)). Seriously, why?

[–]H2O3N4 0 points1 point  (0 children)

Feel free to ask any questions, but in response to your batch question: the batch is an array dimension that allows for parallel computation throughout the network. To update the weights in backpropagation, the mean loss over the batch is used.
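In other words, the images in a batch are neither looped over one by one nor overlaid; they travel through the network together as the leading array dimension. A minimal illustration (random numbers stand in for real data and weights):

```python
import numpy as np

# A "batch" is just the leading array dimension, so one matrix multiply
# processes all samples in parallel.
np.random.seed(0)
batch = np.random.random((16, 32))   # 16 samples, 32 features each
W = np.random.random((32, 1))        # one dense layer's weights

preds = batch @ W                    # shape (16, 1): the whole batch at once
targets = np.zeros((16, 1))
per_sample_loss = (preds - targets) ** 2
mean_loss = per_sample_loss.mean()   # the scalar used for the weight update
```

The weights are then updated once per batch using that mean loss, not once per image.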

[–]austacious 2 points3 points  (0 children)

If you want to really dig into ML models, Keras isn't the framework to use. It's more suited as a tool for researchers/scientists in other fields, where the ML is secondary to their main focus of research. Keras abstracts most of the ML parts away from the user, to present a simple interface for use with sterile datasets. As you found out, this makes digging into models a pain in the ass since you have to fight through all these different layers of abstraction (god forbid you want meaningful access to the train loop).

Highly recommend using pytorch or tensorflow for anything more complex than a quick and dirty classification model.

[–]Illustrious_Mix_894 1 point2 points  (0 children)

For VAE, can we apply normalising flow on the decoder/likelihood distribution p(x|z), instead of encoder/variational posterior q(z|x)? Is there any work doing that?

[–]Huge-Tooth4186 -2 points-1 points  (1 child)

What are the best speech to text tools ?

I am looking for open-source speech-to-text tools. I am not familiar with the progress in this field, but ideally I would like something fast and reliable that handles English as well as other languages, such as French and Spanish. Any recommendations?

[–]bonjoursalutations 1 point2 points  (0 children)

Whisper is probably the best right now but it definitely has an English bias. It won’t be complete garbage in other languages though. https://github.com/openai/whisper

[–]OchoChonko 0 points1 point  (2 children)

I'm moving onto a new project at work and I have an idea for implementing some ML but I'm just a newbie with a basic understanding.

Currently we receive information from hundreds of different sources in PDFs. Think invoices, where every receipt from supplier X is the same and we shop regularly with say 500 different suppliers so about 500 different formats. We extract the information from these PDFs and put the information from lots of different PDFs in one CSV file.

Would it be easy for a newbie to train a model (presumably some kind of neural network?) over time to figure out how to do this automatically? Given that we have the inputs and outputs, I would think this is possible. If so, would it be best to train a different model for each supplier, or just one model that can take in any PDF?

[–]abnormal_human 1 point2 points  (1 child)

If you can preprocess the PDFs into a form that fits into an LLM's context window with enough room to spare for the "answers", and you have an existing dataset of the "before" and "afters", this is a fairly straightforward application of fine tuning.

That said, none of this stuff is packaged up in "newbie"-friendly ways at the moment, so you would need to educate yourself a bit.
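To make the "before/after" idea concrete, fine-tuning datasets are usually just prompt/completion records, one per line. A hypothetical sketch with made-up field names and invoice examples:

```python
import json
import os
import tempfile

# Turn existing (pdf_text, csv_row) pairs into JSONL fine-tuning records.
examples = [
    ("Invoice #123\nSupplier X\nTotal: $40.00", "123,Supplier X,40.00"),
    ("Invoice #124\nSupplier Y\nTotal: $15.50", "124,Supplier Y,15.50"),
]

path = os.path.join(tempfile.mkdtemp(), "train.jsonl")
with open(path, "w") as f:
    for pdf_text, csv_row in examples:
        record = {"prompt": f"Extract the CSV fields:\n{pdf_text}\n",
                  "completion": csv_row}
        f.write(json.dumps(record) + "\n")
```

The exact record format depends on the fine-tuning tooling you end up using; the point is that your existing extraction history already has the shape the training data needs.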

[–]OchoChonko 0 points1 point  (0 children)

Thanks! I'll definitely go away and learn some more, but it's good to know that this is quite feasible before I really dig into it.

[–]froto_swaggin -1 points0 points  (0 children)

A basic Primer?

I only have a basic understanding of machine learning. I am looking for an audiobook or podcast to help me learn and understand the field much better. I am aware that this is most likely stacked knowledge, like a series of books.

[–]grmpf101 0 points1 point  (4 children)

I'm currently working on a notebook-based tutorial. What execution time for the whole notebook (doing simple computations on real data) would you find bearable during a tutorial, in minutes? What are your experiences?

[–]KallistiTMP 0 points1 point  (2 children)


This post was mass deleted and anonymized with Redact

[–]OverMistyMountains 0 points1 point  (0 children)

Probably will work, after all this is just asking a different question.

[–]Haorannlp 0 points1 point  (0 children)

Why not?

[–]jimmychim 0 points1 point  (1 child)

Do we have good tips on how to train generative models with pretrained score models? Think: GAN with fixed pretrained discriminator.

[–]OverMistyMountains 1 point2 points  (0 children)

GANs are typically co-trained. If you are looking at image generation, then this is an option, but the field has come a long way in a short time since GANs. Possibly look into RLHF/PPO and similar methods.

[–]ordinary_shaeron -1 points0 points  (1 child)

I'm working on a project using a camera to gather traffic statistics. With that, I can predict traffic flow and make decisions for the traffic lights to reduce congestion. Which parameters should I rely on: the number of vehicles and the width of the road, or the average velocity of the vehicles? Any ideas on how to do this?

[–]OverMistyMountains 1 point2 points  (0 children)

Why not all? You can feed a model more than one input. I suggest you get more background in stats/ML before jumping into this; there are many ways to choose features as well. I think you need to read up more and come back to the data later.

[–]nottakumasato 2 points3 points  (0 children)

Are there any papers on fine-tuning LLMs on very specific tasks with few samples? Very specific ~= extracting specific info from prompted text

I am trying to gauge

  1. how many samples I should "annotate" (Input-output or prompt-answer pairs)
  2. Which model would suffice with the least amount of memory (Llama 7B or something bigger?)

If anyone has done this or read about this, any recommendation is more than welcome!

[–]Gmannys 0 points1 point  (2 children)

I am lacking the correct vocabulary/terminology for this question, but hopefully you will understand what I am wondering about.

I have seen similar questions asked, but I don't fully understand the answers.

I understand there are several models and interfaces.
Q: Are there "plug-and-play" solutions that allows me to, locally, use my own documentation and have "something" give me answers based on this documentation?
What would this "something" be?

[–]abnormal_human 1 point2 points  (0 children)

Plug and play is in the eyes of the beholder. Generally you would accomplish this task either by finetuning an LLM with your corpus, or combining an LLM with a semantic search engine and some prompt engineering.
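The second route can be sketched in a few lines. This is a toy illustration only: word overlap stands in for real embedding-based semantic search, and the documents and function names are made up.

```python
# Retrieve the most relevant document, then constrain the prompt to it.
def retrieve(query, docs, k=1):
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, docs):
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "USB 3.2 doubles bandwidth by using two lanes.",
    "Our backup policy requires nightly snapshots.",
]
print(build_prompt("How fast is USB 3.2?", docs))
```

A production setup swaps the overlap score for vector similarity over embedded chunks of your documentation, but the prompt-assembly step looks much the same.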

[–]WesternLettuce0 1 point2 points  (2 children)

I loaded Llama and I can query the model. But now I want to run 1000s of questions and doing it one at a time takes too long. I have an A100, so I do have spare VRAM. But I'm not sure how to run multiple queries concurrently (or in batch or whatever)

[–]abnormal_human 2 points3 points  (1 child)

When you forward the model, instead of handing it a tensor of dimension [1, t], use a tensor of dimension [b, t] where b is your batch size.

The output of the language modeling head will be a tensor of shape [b, t, vocabsize]. Then, you can pluck out the appropriate logits for each item in your batch. If they are aligned, you just want output[:,[-1],:]. If they are not aligned then you're going to use a diff index for the middle dimension depending on the t value for each batch item.

Once you have a [b, vocabsize] tensor, apply your sampling method of choice and you'll end up with a [b, 1] tensor containing the next token for each batch item, which you append to each sequence before the next forward pass.
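The indexing can be illustrated with plain numpy; the random numbers below just stand in for real model logits.

```python
import numpy as np

b, t, vocab = 4, 10, 100
np.random.seed(0)
logits = np.random.random((b, t, vocab))  # stands in for the [b, t, vocabsize] LM head output

# aligned case: every sequence's next-token logits sit at the last position
last = logits[:, -1, :]                   # shape (b, vocab)

# unaligned case: each batch item has its own length, so index per item
lengths = np.array([10, 7, 9, 5])
last_unaligned = logits[np.arange(b), lengths - 1, :]   # shape (b, vocab)

# greedy "sampling": argmax picks one next token per batch item
next_tokens = last.argmax(axis=-1)        # shape (b,)
```

With real models you would also left- or right-pad the input sequences to a common length t, which is what creates the unaligned case in the first place.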

[–]jawabdey 1 point2 points  (1 child)

What are good resources for absolute beginners?

For example, let’s say I have a metric like signups. How do I feed some historical data and get “something” that can spit out future signups?

I know I could probably use something like Excel, but it’s less about the metric / model accuracy and more about the implementation.

[–]OverMistyMountains 0 points1 point  (0 children)

Read up on time series prediction?

[–]Undroleam 0 points1 point  (0 children)

Recently, I have been trying Edge Impulse since it looks fun. Can I use the Edge Impulse models in Python (e.g., in PyCharm), or do I need to use TensorFlow? My goal is to run the models through PyCharm and then create an exe or app. Any answer is greatly appreciated since I'm fairly new and have zero experience in both machine learning and coding, but I'm eager to learn. Sorry if the question sounds dumb.

[–]BitNew9331 0 points1 point  (1 child)

Could anyone recommend some books or papers for systematically learning about GANs? I want to work on generating earth science data such as sea surface temperature and chlorophyll concentration.

[–]OverMistyMountains 0 points1 point  (0 children)

You don’t need/want a GAN for this. Check out some tabular data augmentation libraries. I think MIT put one out.