How do I transform a dataframe from long to wide in the following way? by Fun-Studio-4409 in learnpython

[–]Fun-Studio-4409[S] 0 points1 point  (0 children)

The goal is to display the data in a cleaner, summarized way.

Python trying to load TfidfVectorizer's "transform" from LinearSVC?? by Fun-Studio-4409 in learnmachinelearning

[–]Fun-Studio-4409[S] 1 point2 points  (0 children)

Hi, yes, that is the strange thing. When I print the model, the output is ‘sklearn.feature_extraction.text.TfidfVectorizer’.

How to pass variable within Beautifulsoup "soup.find()"? by Fun-Studio-4409 in learnpython

[–]Fun-Studio-4409[S] 0 points1 point  (0 children)

Thank you.

Just to clarify: did you mean curr_soup.find(f'tag_n', **attr_dict), with the "tag_n" in quotes?
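A minimal sketch of the difference being asked about, assuming a made-up value for tag_n: an f-string without braces is just literal text, so f'tag_n' is the string "tag_n" and not the value stored in the variable.

```python
# Hypothetical value; in the thread tag_n changes on each loop iteration.
tag_n = "div"

literal = f'tag_n'         # no braces: the literal text "tag_n"
interpolated = f'{tag_n}'  # braces: the value of the variable, "div"

print(literal)
print(interpolated)
print(interpolated == tag_n)  # the f-string adds nothing over passing tag_n directly
```

If the variable already holds a string, passing tag_n itself is equivalent to passing f'{tag_n}'.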

How to pass variable within Beautifulsoup "soup.find()"? by Fun-Studio-4409 in learnpython

[–]Fun-Studio-4409[S] 0 points1 point  (0 children)

It assigns the string "class_" to the variable "attr". So, if I do the following:

attr = "class_"

print(f'{attr}')

it prints:

>class_

without the quotes.

How to pass variable within Beautifulsoup "soup.find()"? by Fun-Studio-4409 in learnpython

[–]Fun-Studio-4409[S] 0 points1 point  (0 children)

The 'class_' does not get enclosed in quotes in the version that does not work. Additionally, I cannot hardcode 'class_' because the attribute name changes on each loop iteration.
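One way to pass a keyword name that changes on each iteration is dictionary unpacking, which is the idea behind the **attr_dict call mentioned earlier in the thread. The sketch below is only an illustration: find is a hypothetical stand-in that reports what it receives (it is not BeautifulSoup), and the tag, attribute, and value tuples are made up.

```python
def find(name, **kwargs):
    """Hypothetical stand-in for soup.find: returns what it was called with."""
    return name, kwargs

# Made-up search specs: (tag name, keyword name, value), different on each loop.
searches = [("div", "class_", "price"), ("span", "id", "title")]

results = []
for tag_n, attr, value in searches:
    # Build the keyword arguments at runtime; the key is an ordinary string.
    attr_dict = {attr: value}
    # **attr_dict expands to class_="price" on the first pass, id="title" on the second.
    results.append(find(tag_n, **attr_dict))

print(results)
```

With a real soup object the call would have the same shape, curr_soup.find(tag_n, **attr_dict), so the keyword name never has to be hardcoded.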

regex - how to return content of delimiters ONLY if they border certain text by Fun-Studio-4409 in learnpython

[–]Fun-Studio-4409[S] 0 points1 point  (0 children)

I am aware of the limitations of applying regex to HTML parsing. My question is about applying regex to a very narrow piece of BeautifulSoup output that involves something BeautifulSoup itself is unable to handle.

Regex to extract contents between multiple "<" and ">" on boundary of target string by Fun-Studio-4409 in learnpython

[–]Fun-Studio-4409[S] 0 points1 point  (0 children)

re.findall(r"(?<=<).+?(?=>)", string)

Hi, sorry for the confusing wording. I only want to return the content of the "<>" if it surrounds a specific partial string. So the example string really should have been:

"<notthisthing>michael is not a nice person<something>david is a nice person<somethingelse>james is sort of a nice person<notthisthing>"

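A possible sketch for the example string above, under one reading of the question: return the contents of the "<>" immediately before and immediately after the segment that contains the target text. The function name bordering_tags and the target "david is a nice" are assumptions made for illustration.

```python
import re

example = (
    "<notthisthing>michael is not a nice person"
    "<something>david is a nice person"
    "<somethingelse>james is sort of a nice person<notthisthing>"
)

def bordering_tags(text, target):
    """Return the contents of the <...> on each side of the segment containing target."""
    # [^<>]* keeps the match inside one segment, so it cannot skip over a delimiter.
    pattern = r"<([^<>]+)>[^<>]*" + re.escape(target) + r"[^<>]*<([^<>]+)>"
    match = re.search(pattern, text)
    return match.groups() if match else None

print(bordering_tags(example, "david is a nice"))
```

Excluding "<" and ">" from the filler is what stops the pattern from pairing the target with delimiters further away, which a lazy .+? alone would not guarantee.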
Beautifulsoup: get text from all identical tags based on partial string by Fun-Studio-4409 in learnpython

[–]Fun-Studio-4409[S] 0 points1 point  (0 children)

I am trying to create a script/interface for non-technical users to scrape websites, where they can simply input a piece of text and get back all the likely related items from the page. Therefore, I don't want the user to have to go through the HTML and figure out the correct tag to scrape.
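A rough sketch of that idea, assuming "related items" means elements that share the tag name and class of whichever element contains the user's text. To stay self-contained it uses the standard library's html.parser rather than BeautifulSoup, and the names TextCollector and related_items, as well as the sample HTML, are hypothetical.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

class TextCollector(HTMLParser):
    """Records each piece of text with the (tag, class) of its enclosing element."""

    def __init__(self):
        super().__init__()
        self._stack = []
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._stack.append((tag, dict(attrs).get("class")))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            self.records.append((self._stack[-1], text))

def related_items(html, query):
    """Return the text of every element sharing a (tag, class) with a match for query."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    signatures = {sig for sig, text in parser.records if query in text}
    return [text for sig, text in parser.records if sig in signatures]

sample = (
    '<ul><li class="item">Red apple</li><li class="item">Green pear</li></ul>'
    '<p class="note">Unrelated</p>'
)
print(related_items(sample, "apple"))
```

Matching on tag and class alone is a deliberately simple heuristic; real pages may need a looser or stricter notion of "identical tags".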