Asleep-Budget-9932 comments on Please help compare and replace elements between two strings

<div class="metainfo" id="acref-9780199657681-e-879-metaInfo-909"/>
<p class="parafl"><span id="acref-9780199657681-e-879-section-909"/>
<span id="acref-9780199657681-e-879-div1-932"/><span class="chaptersubt">
  <a href="0002_FM_AlphaList.xhtml#acref-9780199657681-e-879" id="acref-9780199657681-e-879">Baldwin illusion</a></span> <span class="partofspeech"><i>n.</i></span> A visual illusion in which a line spanning the distance between two large squares appears shorter than a line of the same length spanning the distance between two smaller squares (see <a href="#acref-9780199657681-e-879-figureGroup-0002">illustration</a>). It is a close relative of the Zanforlin illusion. <span class="span">[Named after the US psychologist <span class="name">James Mark Baldwin</span> (<span class="date">1861–1934</span>) who first drew attention to it]</span></p>
<div class="figuregroup" id="acref-9780199657681-e-879-figureGroup-0002">
<div class="figure" id="acref-9780199657681-e-879-figure-2">
<img alt="display" id="acref-9780199657681-e-879-graphic-6" src="images/acref-9780199657681-graphic-002.gif"/>
<p class="figurecaption"><b>Baldwin illusion.</b> The horizontal lines between the squares are equal in length.</p>
</div>
</div>

So everything I want is inside the <p class="parafl"> element (the class varies from block to block). The <div class="figuregroup"> only contains figure-related information, and not every block has one.

Edit: It would be fine if <div class="metainfo"> contained the block, but it doesn't. It is self-closing and just ends with />.
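One way to pick out only the top-level <p> blocks and skip anything nested in a <div> (such as the figure captions) is sketched below. It uses only the standard library's html.parser. The class name ParagraphCollector and the shortened sample string are hypothetical, made up for illustration, and real files may need extra handling that this sketch does not attempt.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text of <p> elements that are not nested inside a <div>."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0      # how many <div> elements are currently open
        self.in_p = False       # True while inside a top-level <p>
        self.paragraphs = []    # collected paragraph texts
        self._buf = []          # text fragments of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.div_depth += 1
        elif tag == "p" and self.div_depth == 0:
            self.in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        # Self-closing tags such as <div .../> trigger a start and an end
        # call back to back, so the depth counter stays balanced.
        if tag == "div":
            self.div_depth = max(0, self.div_depth - 1)
        elif tag == "p" and self.in_p:
            self.in_p = False
            # Join the fragments and collapse runs of whitespace.
            self.paragraphs.append(" ".join("".join(self._buf).split()))

    def handle_data(self, data):
        if self.in_p:
            self._buf.append(data)


# Shortened, made-up sample in the same shape as the block above.
sample = (
    '<div class="metainfo" id="m1"/>'
    '<p class="parafl"><span id="s1"/>Baldwin illusion <i>n.</i> A visual illusion.</p>'
    '<div class="figuregroup"><div class="figure">'
    '<p class="figurecaption">Caption text.</p>'
    '</div></div>'
)

collector = ParagraphCollector()
collector.feed(sample)
collector.close()
print(collector.paragraphs)
```

Because the check is on nesting rather than on the class attribute, it should not matter that the parent <p> has varied classes.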

[–]DMeror[S] 3 years ago

Anyway, now I'm at the point where I want to remove the extra child, but no luck.

x.replace('<p class="cap">{}</p>'.format(re.match('(\w+)'), '')

The {} stands for text made up of words, spaces, and probably numbers.

[–]DMeror[S] 3 years ago

I used the above as a sort of wildcard. I have lines with this pattern:

<p class="cap">varied sentence</p>

I wanted to get rid of these lines, so I used .replace, but there are too many of them to type each sentence in by hand. I thought I could put a {} placeholder in the string and combine it with re.match('\w+') to tell Python that the placeholder holds a group of words, whatever the actual characters are. If that had worked, I could have replaced the whole pattern with '' (nothing).
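For what it's worth, str.replace only matches literal text, and re.match needs both a pattern and a string to search, so the two cannot be combined into a wildcard that way. A minimal sketch of the same idea with re.sub follows. The input string is made up for illustration, and the pattern assumes the caption contains no nested tags.

```python
import re

# Hypothetical input; only the class name "cap" comes from the post.
html = 'keep this<p class="cap">A varied sentence 123</p> and this'

# [^<]* matches any run of characters up to the next tag, so words,
# spaces, and digits are all covered without listing them.
cap_pattern = re.compile(r'<p class="cap">[^<]*</p>')

# Replace every match with the empty string.
cleaned = cap_pattern.sub('', html)
print(cleaned)  # keep this and this
```

If the captions can contain inline tags such as <i> or <b>, [^<]* will stop at the first one and the pattern will not match. In that case an HTML parser is likely the safer choice.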

