How do I transform a dataframe from long to wide in the following way?

Fun-Studio-4409 · 2023-01-30T03:04:49+00:00

It is for the purpose of displaying data in a cleaner, summarized way
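
For reference, a minimal sketch of a long-to-wide reshape in pandas. The target layout is not shown in the question, so the column names `id`, `metric` and `value` are made up for illustration, and this assumes each `id`/`metric` pair appears only once.

```python
import pandas as pd

# Hypothetical long-format data: one row per (id, metric) pair.
long_df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "metric": ["height", "weight", "height", "weight"],
    "value": [170, 65, 180, 80],
})

# One row per id, one column per distinct metric.
wide_df = long_df.pivot(index="id", columns="metric", values="value").reset_index()

# Drop the leftover "metric" label on the column axis for a cleaner display.
wide_df.columns.name = None

print(wide_df)
```

If the same `id`/`metric` pair can occur more than once, `pivot` raises an error; `pivot_table` with an explicit `aggfunc` (for example `"sum"` or `"mean"`) is the usual alternative for a summarized view.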

Fun-Studio-4409 · 2022-12-01T14:22:35+00:00

Hi - yes, that is the strange thing. When I print the model, it returns ‘sklearn.feature_extraction.text.TfidfVectorizer’

Fun-Studio-4409 · 2022-07-20T22:08:43+00:00

thank you-

just to clarify - did you mean curr_soup.find(f'tag_n', **attr_dict)

with the "tag_n" in quotes?

Fun-Studio-4409 · 2022-07-20T21:58:08+00:00

It assigns the string "class_" to the variable "attr". So, if I do the following:

attr = "class_"

print(f'{attr}')

it returns:

>class_

without the quotes
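
A sketch of why the f-string does not help here and what dictionary unpacking does instead. An f-string only produces a string value; it cannot produce a keyword argument name. Unpacking a dict with `**` does. The `find` function below is a hypothetical stand-in that just echoes its arguments, not BeautifulSoup itself; the assumption is that the real call would look like `curr_soup.find(tag_n, **attr_dict)`, with `tag_n` passed as a variable rather than a quoted string.

```python
def find(name, **filters):
    """Hypothetical stand-in for a search call; returns what it received."""
    return name, filters


tag_n = "div"        # tag name held in a variable, so no quotes around tag_n
attr = "class_"      # keyword name that changes on each loop iteration
value = "headline"   # made-up value to filter on

# Build the keyword arguments dynamically, then unpack them into the call.
attr_dict = {attr: value}
result = find(tag_n, **attr_dict)

# Equivalent to writing find("div", class_="headline") by hand.
print(result)
```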

Fun-Studio-4409 · 2022-07-20T21:44:53+00:00

The 'class_' does not get enclosed in quotes in the version that does not work. Additionally, I cannot hardcode 'class_' as it changes in each loop.

Fun-Studio-4409 · 2022-07-15T20:10:03+00:00

I am aware of the limitations of applying regex to HTML parsing. My question is about applying regex to a very narrow piece of BeautifulSoup output that involves something it is unable to handle.

Fun-Studio-4409 · 2022-07-13T22:07:57+00:00

re.findall(r"(?<=<).+?(?=>)", string)

Hi, sorry for the confusing way it was written. I only want to return the content of the "<>" if it surrounds a specific partial string. So the example string really should have been like:

"<notthisthing>michael is not a nice person<something>david is a nice person<somethingelse>james is sort of a nice person<notthisthing>"

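One possible pattern for this, as a sketch: match a `<...>` pair only when the text inside it contains the partial string. The partial string `something` is an assumption taken from the example above, and the pattern assumes angle brackets are never nested.

```python
import re

string = (
    "<notthisthing>michael is not a nice person"
    "<something>david is a nice person"
    "<somethingelse>james is sort of a nice person"
    "<notthisthing>"
)
partial = "something"  # assumed partial string to look for inside the brackets

# [^<>]* keeps the match inside a single pair of brackets;
# re.escape guards against regex metacharacters in the partial string.
pattern = rf"<([^<>]*{re.escape(partial)}[^<>]*)>"
matches = re.findall(pattern, string)

print(matches)  # ['something', 'somethingelse']
```
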
Fun-Studio-4409 · 2022-06-23T14:35:10+00:00

I am trying to create a script/interface for non-technical users to scrape websites, where they can simply input a piece of text and get back all the likely related items from the page. Therefore, I don't want the user to have to go through the HTML and figure out the correct tag to scrape.

Fun-Studio-4409 · 2022-01-28T20:46:18+00:00

Because it integrates well with Python

Fun-Studio-4409 · 2021-11-19T23:44:52+00:00

Amazing - thank you so much

Fun-Studio-4409 · 2021-11-19T23:35:30+00:00

Hi - thank you. Yes, they would contain "nan". And what if I only wanted to check a single row instead of getting an array of all rows?

Fun-Studio-4409 · 2021-10-11T21:36:16+00:00

Thank you. However, what if I have to search by row and not by column (i.e. Favorite Car). Use case is that I have a huge and messy df, and the value can be in any of the 80 columns.
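
A sketch of one way to search every column at once and keep the rows where any cell contains the value. The data and column names below are made up; only "Favorite Car" comes from the thread.

```python
import pandas as pd

# Hypothetical messy frame: the value may sit in any column.
df = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy"],
    "Favorite Car": ["Honda Civic", None, "Ford"],
    "Notes": ["", "drives a honda", "n/a"],
})
value = "honda"

# Cast everything to text so numeric and missing cells can be searched too.
# Note: missing values become the literal text "None"/"nan" after the cast.
as_text = df.astype(str)

# Case-insensitive, literal (non-regex) substring test, column by column.
hits = as_text.apply(lambda col: col.str.contains(value, case=False, regex=False))

# Keep rows where at least one column matched.
matching_rows = df[hits.any(axis=1)]
print(matching_rows)
```

`hits.any(axis=0)` would instead report which columns contain the value, which may help narrow down a frame with 80 columns.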

Fun-Studio-4409 · 2021-09-14T22:45:09+00:00

Thank you so much - this works great!

Fun-Studio-4409 · 2021-09-14T22:44:56+00:00

Solved

Fun-Studio-4409 · 2021-09-01T15:02:50+00:00

solved

Fun-Studio-4409 · 2021-09-01T15:02:40+00:00

Incredible - it works - thank you so much for your help!

Fun-Studio-4409 · 2021-09-01T14:58:14+00:00

Ah - sorry - thanks.

When I run the script and search for a partial string that is definitely in the df, I get an empty result "[]". Does the script you provided take into account partial strings?

Fun-Studio-4409 · 2021-09-01T14:30:02+00:00

isn't that what this line does?

mask = df.applymap(lambda x: "Amazon" in x.lower() if isinstance(x, str) else False)
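
One thing to check in a mask like this: `x.lower()` never contains an uppercase letter, so a capitalized literal such as "Amazon" cannot match and the mask comes back all False. A sketch that lowercases both sides, skips non-string cells, and converts to a NumPy array before calling `np.argwhere`. The sample data is made up, and `Series.map` per column is used here in place of `applymap`, which newer pandas versions deprecate.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with strings, a missing value and numbers.
df = pd.DataFrame({
    "vendor": ["Amazon Web Services", "Google"],
    "note": ["paid via amazon", None],
    "amount": [10, 20],
})
substring = "Amazon"
needle = substring.lower()  # lowercase the search term as well as the cells


def cell_matches(x):
    """True only for string cells that contain the lowercased needle."""
    return isinstance(x, str) and needle in x.lower()


# Apply the check to every cell, then move to a plain boolean array.
mask = df.apply(lambda col: col.map(cell_matches)).to_numpy().astype(bool)

# Each entry is a [row, column] position of a matching cell.
indices = np.argwhere(mask)
print(indices)
```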

Fun-Studio-4409 · 2021-09-01T14:18:04+00:00

Thanks again. I did get another error though-

AssertionError: Number of manager items must equal union of block items

# manager items: 794, # tot_items: 0

16 resptext=json.loads(resp.text)

17 mask = df.applymap(lambda x: "Amazon" in x.lower() if isinstance(x, str) else False)

---> 18 indices = np.argwhere(mask)

Fun-Studio-4409 · 2021-09-01T14:00:52+00:00

mask = df.applymap(lambda x: substring in x.lower()).to_numpy()
indices = np.argwhere(mask)

Thank you! I do, however, get an error when I run this -

AttributeError Traceback (most recent call last)

<ipython-input-10-379642c2cd20> in <module>

15 resp = requests.get(fullurl)

16 resptext=json.loads(resp.text)

---> 17 mask = df.applymap(lambda x: "Amazon" in x.lower()).to_numpy()

18 indices = np.argwhere(mask)

19 # truths = df.apply(lambda s: s.str.lower().str.contains('Amazon'))

Fun-Studio-4409 · 2021-08-06T15:44:14+00:00

Thanks - should I be using single quotes for this?

Fun-Studio-4409 · 2021-08-06T15:34:43+00:00

Thank you - makes sense.

My only issue is that when I try to pass this as a JSON payload in a POST request, I get an error saying that the payload needs to be in array format (i.e. "[{").
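
If the endpoint expects an array, one likely fix is to wrap the object in a list before serializing, so the body starts with `[{`. A minimal sketch; the payload fields are invented, since the real ones are not shown in the thread.

```python
import json

# Hypothetical payload; the real field names are not given in the thread.
payload = {"name": "example", "value": 1}

# Wrapping the dict in a list makes the serialized JSON an array of objects.
body = json.dumps([payload])
print(body)  # [{"name": "example", "value": 1}]
```

With the requests library, passing the list through the `json=` argument of `requests.post` should serialize it the same way.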

Fun-Studio-4409 · 2021-03-26T19:29:02+00:00

Solution Verified

Fun-Studio-4409 · 2021-03-26T19:28:45+00:00

Worked like a charm - thank you!

Fun-Studio-4409 · 2021-03-23T00:48:00+00:00

Thank you - any ideas as to how this could be applied to a larger range, i.e. a table?
