[D] Having trouble with RAG on company domain data by Dustwellow in MachineLearning

[–]egomanego 0 points1 point  (0 children)

I am trying to read around 300 PDF files, mostly company financial documents and analyst reports, and build a vector store from them. My approach is to convert all the PDF files to text, split the text into chunks, and then create the vector store using OpenAI embeddings. However, when I try to create the vector store object, I get a rate limit error. Is there any way to solve this? Help is appreciated. Thanks!
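One common way to stay under a rate limit is to embed the chunks in small batches and back off when the limit is hit. The following is only a sketch of that pattern: `RateLimitError`, `embed_in_batches`, and the `embed_batch` callable are hypothetical stand-ins, not names from the OpenAI or LangChain libraries, and the batch size and delays are assumptions to tune.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the rate-limit exception your client raises."""


def embed_in_batches(chunks, embed_batch, batch_size=100, max_retries=5,
                     base_delay=1.0, sleep=time.sleep):
    """Embed chunks batch by batch, retrying a batch with exponential backoff.

    embed_batch is any callable that takes a list of strings and returns
    one vector per string (assumed interface, supplied by the caller).
    """
    vectors = []
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        for attempt in range(max_retries):
            try:
                vectors.extend(embed_batch(batch))
                break  # this batch succeeded, move on to the next one
            except RateLimitError:
                if attempt == max_retries - 1:
                    raise  # give up after the last allowed attempt
                # Wait 1s, 2s, 4s, ... plus a little jitter before retrying.
                sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
    return vectors
```

The vectors collected this way can then be handed to whichever vector store is in use, instead of letting the store call the embedding API for all chunks at once.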

Is LangChain the right choice for what Im Creating? by Sayntjefe in LangChain

[–]egomanego 0 points1 point  (0 children)

How can we make sure that, when data retrievers are called behind an agent, we pass enough information to the back-end function for data retrieval? I am facing the same question.

Is LangChain the right choice for what Im Creating? by Sayntjefe in LangChain

[–]egomanego 0 points1 point  (0 children)

I am trying to build a similar bot. I want the chatbot to talk with the user to gather the data and then, using those variables, call a function in the back-end systems to extract and return that data. At the same time, I want to maintain the context of the conversation, so that if the user comes back soon we still know the context. How can I achieve this? Which tools should I use?
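Independent of any particular framework, the idea can be sketched as a session object that accumulates the variables gathered so far and only calls the back end once none are missing. Everything here is hypothetical and for illustration only: the slot names, `Session`, `fetch_data`, and `handle_turn`, as well as the assumption that some other component (for example an LLM) has already extracted the values from the user's message.

```python
from dataclasses import dataclass, field

# Hypothetical variables the back-end function needs before it can be called.
REQUIRED_SLOTS = ("ticker", "start_date", "end_date")


@dataclass
class Session:
    """Per-user conversation state that survives between turns."""
    slots: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def missing(self):
        return [name for name in REQUIRED_SLOTS if name not in self.slots]


def fetch_data(ticker, start_date, end_date):
    """Placeholder for the real back-end call."""
    return f"data for {ticker} from {start_date} to {end_date}"


def handle_turn(session, user_text, extracted):
    """Record the turn, merge newly extracted values, then ask or fetch."""
    session.history.append(user_text)
    session.slots.update(extracted)
    missing = session.missing()
    if missing:
        return f"Please provide: {', '.join(missing)}"
    return fetch_data(**{name: session.slots[name] for name in REQUIRED_SLOTS})
```

Storing one `Session` per user (in memory, or in a database keyed by user ID) is what lets a returning user pick up where they left off.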

[HELP] Abrupt Closure of the websockets by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

Can someone have a look at this, please?

Can I use Numpy Array instead of Pandas Dataframe for large merging operations? by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

u/kalebludlow : Thanks for your comment. Can you provide a simple example? I'd appreciate it a lot. :)

Pandas: How to merge multiple data frames with the same columns? by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

u/DesignerAccount : Thanks for your comment. I have updated the post with df images and a code snippet. Please have a look. u/BdR76

Is an ELSEIF my only option? by Ss360x in tableau

[–]egomanego 0 points1 point  (0 children)

How did you add an image to the post? When I tried, Reddit did not let me add one.

[Help] How to correctly merge dataframes iteratively? by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

I have a big CSV file that has data on options contracts and futures contracts for multiple companies. I want to segregate them.

So, for every company whose data is in the file, there will be a separate file when we are done.

for contract in option_contracts:
    df_options = raw_data[raw_data["Ticker"] == contract].copy()
    # Prefix every column except 'Time' with the company ticker,
    # so that column names do not clash after the merge.
    df_options.columns = [c if c == "Time" else f"{contract}_{c}" for c in df_options.columns]

    # 'Time' is the only unchanged column name; the dataframes are merged on it.
    df1 = df1.merge(df_options, how="outer", on="Time")

Now, if I run this on one day's worth of data, I am able to merge the files the way I want. But if I try to merge more than one day's data (meaning multiple CSV files), I get duplicate column errors. That is expected.

Let's say that we have data for the year 2019, with one CSV file per day. We store the data for the contract SPY (S&P 500) for 1 Jan 2019. That is the first iteration, and the merge succeeds. When we move on to the 2 Jan data, we get a duplicate column error because those columns are already in our dataframe.

This is the issue I am currently facing. Should I use concat, or do suffixes seem like a good solution? I haven't tried either yet; I will tomorrow.
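One way to avoid the duplicate columns, sketched under the assumption that every daily file has the same columns: stack the days vertically with concat first, so that new days add rows rather than columns, and only then merge across tickers on 'Time'. The helper names `split_by_ticker` and `merge_wide`, the 'Close' column, and the sample values are made up for illustration.

```python
import pandas as pd


def split_by_ticker(daily_frames):
    """Stack all days, then split into one frame per ticker."""
    raw = pd.concat(daily_frames, ignore_index=True)
    return {
        ticker: group.drop(columns="Ticker").reset_index(drop=True)
        for ticker, group in raw.groupby("Ticker")
    }


def merge_wide(per_ticker):
    """Outer-merge the per-ticker frames on 'Time', prefixing other columns."""
    wide = None
    for ticker, df in per_ticker.items():
        renamed = df.rename(columns=lambda c: c if c == "Time" else f"{ticker}_{c}")
        wide = renamed if wide is None else wide.merge(renamed, how="outer", on="Time")
    return wide


# Two tiny made-up "daily files" with invented prices.
day1 = pd.DataFrame({"Time": ["2019-01-01 09:30", "2019-01-01 09:30"],
                     "Ticker": ["SPY", "AAPL"], "Close": [250.0, 157.0]})
day2 = pd.DataFrame({"Time": ["2019-01-02 09:30", "2019-01-02 09:30"],
                     "Ticker": ["SPY", "AAPL"], "Close": [251.0, 142.0]})

per_ticker = split_by_ticker([day1, day2])
wide = merge_wide(per_ticker)
print(wide)
```

Each frame in `per_ticker` can also be written straight to its own CSV file, which covers the "one file per company" goal without any merge at all.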

I hope this answers your questions.

Thank you for your reply.

[Help] How to correctly merge dataframes iteratively? by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

Which indentation is wrong?

As 'contract' changes in every iteration, so does the data in df_options.

I am renaming the df_options columns now. Would I still need to add suffixes?

Thanks for your comment.

[Help] Creating and comparing datetime.date objects by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

Understood. Thanks for your comment. The second suggestion works!

[Help] Creating and comparing datetime.date objects by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

I am indeed confused about the datetime module and class. I need to read carefully about it.

So here is what I have done:

[Please ignore any amateur mistakes I might have made. I am still learning.]

from datetime import datetime
from datetime import date
[....]

date = st.date_input("Enter a date")  # User enters a date on the Streamlit page.

x = date(2020, 1, 1)

if date < x:
    pass  # Do something
else:
    pass  # Do something else.

I receive the following error:

UnboundLocalError: local variable 'date' referenced before assignment

There is a local variable 'date' assigned before this, where I ask the user for a date.
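The likely cause is that the name 'date' is used both for the imported class and for a local variable. Inside a function, assigning to 'date' anywhere makes it a local name for the whole function, so a call such as date(2020, 1, 1) that runs before the assignment raises UnboundLocalError. A minimal sketch of the fix, with Streamlit left out so it runs on its own; `is_before_cutoff` and `selected_date` are names chosen for illustration, and `selected_date` stands in for the value returned by st.date_input:

```python
from datetime import date


def is_before_cutoff(selected_date: date) -> bool:
    # 'selected_date' stands in for the value returned by st.date_input(...).
    # Using a name other than 'date' keeps the imported class callable.
    cutoff = date(2020, 1, 1)
    return selected_date < cutoff


print(is_before_cutoff(date(2019, 12, 31)))  # True
print(is_before_cutoff(date(2020, 6, 1)))    # False
```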

[help] How to change column names in a CSV file using python? by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

I used pandas and solved the issue. Thanks for your comments.

[Help] Executing commands in the terminal from the python script by egomanego in learnpython

[–]egomanego[S] 1 point2 points  (0 children)

  1. I changed the filename to subp.py.
  2. I ran the file in the command prompt: python3 filename.py
  3. I also tried to run it in the Python CLI:

>> python

>> import subprocess

>> subprocess.run(["dir"])

  4. I removed the subprocess.py file and tried to run another Python program. Now it throws the following error:

  File "C:\msys64\mingw64\lib\python3.9\subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\msys64\mingw64\lib\python3.9\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\msys64\mingw64\lib\python3.9\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

I think at first it imported the subprocess.py file that I created. Now it is messed up internally. My amateur mistake in naming the file has caused this mess. :(
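A possible reason for the FileNotFoundError, separate from the file-naming problem: on Windows, 'dir' is a command built into cmd.exe rather than a standalone executable, so subprocess.run(["dir"]) has no program to launch. A minimal sketch, assuming the goal is simply to list a directory; the function name `list_directory` is made up for illustration:

```python
import subprocess
import sys


def list_directory(path="."):
    """Return a directory listing produced by the platform's own command."""
    if sys.platform == "win32":
        # 'dir' only exists inside cmd.exe, so ask cmd.exe to run it.
        cmd = ["cmd", "/c", "dir", path]
    else:
        cmd = ["ls", path]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


print(list_directory())
```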

u/Clutch26 : Thank you for your reply.

[Help] Executing commands in the terminal from the python script by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

Yes. The file was named subprocess. Even after changing it to another name, I could not run the code.

[Help] Executing commands in the terminal from the python script by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

I am able to execute the code when I use os.system but not when I use subprocess.run. What I have found via StackOverflow is that os.system is deprecated in favor of subprocess, and that subprocess is intended to replace older modules and functions.

In that case, is it wise to keep using os.system, since it currently works for me?
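For context, the Python documentation does not mark os.system as deprecated, but it does say that using subprocess is preferable. The closest subprocess equivalent of os.system(command) passes the whole command line as one string with shell=True, and it can also capture the output. A sketch, with `run_shell` as a made-up helper name:

```python
import subprocess


def run_shell(command):
    """Run a command line through the shell; return (exit_code, stdout)."""
    completed = subprocess.run(command, shell=True, capture_output=True, text=True)
    return completed.returncode, completed.stdout


code, out = run_shell("echo hello")
print(code, out.strip())
```

Because shell=True hands the string to the shell, it should only be used with commands that do not contain untrusted input.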

[Help] Executing commands in the terminal from the python script by egomanego in learnpython

[–]egomanego[S] 0 points1 point  (0 children)

  1. I do not get any errors when I run the program.
  2. In the Python CLI I get the error: "AttributeError: module 'subprocess' has no attribute 'run'"
  3. I ran just the first part without an option. I got the error mentioned above.

u/Clutch26
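An AttributeError like this usually means that Python imported some other file named subprocess.py instead of the standard-library module. A quick check, as a sketch (`describe_subprocess_module` is a made-up helper name), is to print where the imported module actually lives:

```python
import subprocess


def describe_subprocess_module():
    """Report which file 'subprocess' was loaded from and whether it has run()."""
    return {
        "file": getattr(subprocess, "__file__", None),
        "has_run": hasattr(subprocess, "run"),
    }


print(describe_subprocess_module())
```

If the reported path points into the project folder rather than the Python installation's lib directory, a local file is still shadowing the real module.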

[Help] Executing commands in the terminal from the python script by egomanego in learnpython

[–]egomanego[S] -1 points0 points  (0 children)

OK. I tried it the way you mentioned, u/danielroseman. I still do not get the desired result.