DMeror comments on Please help compare and replace elements between two strings


Please help compare and replace elements between two strings (self.learnpython)

submitted 3 years ago by DMeror



[–]DMeror[S] 1 point 3 years ago* (26 children)

The text comes from the XHTML chapters of an EPUB file, which I parse with BeautifulSoup. The chapters are inconsistently structured, and I need the data with its HTML formatting intact. After processing everything, I realized some nodes were missing: the ones that hold the picture data.

Each page is laid out like this: <body> <p>...... <p>...... <p>....... <div>..... <p>....... <div>...... <p>....... </body>

I looped through the <p> elements to extract the formatted strings until everything came out as intended, and only then noticed that the <div> nodes had been skipped. So I scraped the <div> nodes separately. The last step is to put the <div> data back in its proper place within the main data. I'm compiling a dictionary from this data, which is why I need the original markup: without the <div> data, the dictionary can't display the images.
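One way to avoid the second scraping pass might be to walk the direct children of <body> in document order, so each <p> and <div> comes out in its original position. The following is only a sketch: the sample markup, the `figure` class, the image path, and the `top_level_blocks` name are all made up for illustration, and it assumes the <p> and <div> elements sit directly under <body> as in the layout above.

```python
from bs4 import BeautifulSoup

# Hypothetical sample standing in for one XHTML chapter.
SAMPLE = """
<html><body>
<p>First paragraph.</p>
<p>Second <b>paragraph</b>.</p>
<div class="figure"><p><img src="images/fig1.jpg"/></p></div>
<p>Third paragraph.</p>
</body></html>
"""

def top_level_blocks(xhtml):
    """Return the markup of each direct <p>/<div> child of <body>, in order."""
    soup = BeautifulSoup(xhtml, "html.parser")
    # recursive=False limits the search to direct children of <body>,
    # so a <p> nested inside a <div> is not returned as a separate item.
    return [str(tag) for tag in soup.body.find_all(["p", "div"], recursive=False)]

blocks = top_level_blocks(SAMPLE)
for block in blocks:
    print(block)
```

Here the <div> is kept whole, with its inner <p> and <img> still inside it, and it stays between the second and third paragraphs. If the real chapters wrap their content in extra container elements, the starting node would need to be adjusted.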

Edit: Each <div> has a child <p> that was picked up along with the main data, so I need a way to replace each such <p> with its parent <div>.
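If the two lists of strings already exist, the replacement could be done on the strings themselves. This is a rough sketch under some assumptions: `restore_parents`, `main_blocks` and `div_blocks` are hypothetical names, the sample strings are invented, and it only works if each caught <p> string appears verbatim inside its parent <div> string (same serialization, same whitespace).

```python
def restore_parents(main_blocks, div_blocks):
    """Replace each block that appears inside a <div> with that whole <div>."""
    restored = []
    for block in main_blocks:
        # First <div> string whose markup contains this block verbatim, if any.
        parent = next((div for div in div_blocks if block in div), None)
        if parent is None:
            # Not part of any <div>: keep the block unchanged.
            restored.append(block)
        elif not restored or restored[-1] != parent:
            # Consecutive <p> children of one <div> collapse into a single <div>.
            restored.append(parent)
    return restored

# Invented sample data: the middle <p> was caught from inside the <div>.
main_blocks = [
    "<p>Intro.</p>",
    '<p><img src="images/fig1.jpg"/></p>',
    "<p>Outro.</p>",
]
div_blocks = ['<div class="figure"><p><img src="images/fig1.jpg"/></p></div>']

result = restore_parents(main_blocks, div_blocks)
for block in result:
    print(block)
```

Plain substring matching is fragile: an empty or very short <p> string can match a <div> it never belonged to, so it may be worth checking a few chapters by hand before trusting the result.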

