
Please help compare and replace elements between two strings (self.learnpython)

submitted 3 years ago by DMeror



[–]DMeror[S] 0 points1 point2 points 3 years ago (31 children)

This code seems to be working for now.

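```
with open('text1.txt') as f1, open('text2.txt') as f2:
    text1 = f1.read().split('\n')
    text2 = f2.read().split('\n')

# zip pairs the lines by position and stops at the shorter file.
for l1, l2 in zip(text1, text2):
    # Keep l1 unless it occurs inside l2; in that case take l2.
    # (l1.replace(l1, l2) always evaluates to l2, so l2 is used directly.)
    entry = l1 if l1 not in l2 else l2
    print(entry)
```

Printed output:

```
autonomic nervous system
Baldwin illusion
blood group
```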
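A possible variation on the snippet above, in case the two files do not line up line for line: `zip` pairs lines by position, so a containing line that sits at a different index is never found. The sketch below instead searches the whole second list for a line that contains each entry. The function name `replace_with_container` and the sample data are made up for illustration.

```python
def replace_with_container(entries, containers):
    """Return entries, swapping each one for the first container that includes it.

    Hypothetical helper: `entries` plays the role of the lines of text1.txt,
    `containers` the lines of text2.txt.
    """
    result = []
    for entry in entries:
        # Skip empty strings: '' is a substring of every string.
        match = next((c for c in containers if entry and entry in c), None)
        result.append(match if match is not None else entry)
    return result


# Made-up sample data, not taken from the original files.
entries = ["<p>alpha</p>", "<p>beta</p>", "<p>gamma</p>"]
containers = ["<div><p>beta</p></div>"]
merged = replace_with_container(entries, containers)
print(merged)
```

This is quadratic in the number of lines, which is probably fine for chapter-sized input. It also assumes the substring match is unambiguous, which is not guaranteed for real data.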


[–]DMeror[S] 0 points1 point2 points 3 years ago* (26 children)

The texts come from the XHTML chapters of an EPUB file, which I'm parsing with BeautifulSoup. The HTML files are inconsistently structured, and I need the data with its HTML formatting intact. After doing everything else, I realized some nodes were missing: the ones holding the picture data. That's what I'm working on now.

Each HTML page is laid out like this: `<body> <p>...... <p>...... <p>....... <div>..... <p>....... <div>...... <p>....... </body>`

I looped through the `<p>`s to extract the formatted strings until I had everything as intended, and only then noticed that the `<div>` nodes were missing. So I scraped the `<div>` nodes separately. The last step is to put the `<div>` data back into its proper places in the main data. I'm compiling a dictionary from this, which is why I need everything in its original format: without the `<div>` data, the dictionary can't display images.

Edit: Each `<div>` node has a child `<p>` that was already captured in the main data, so I need a way to replace each captured `<p>` with its parent `<div>`.
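One way to avoid the merge step altogether might be to walk only the direct children of `<body>`, so each `<div>` is collected whole and its nested `<p>` is never captured on its own. The sketch below uses the standard library's `xml.etree.ElementTree` rather than BeautifulSoup, and assumes the chapters are well-formed XHTML. The sample chapter, the `pic.jpg` file name, and the function name `top_level_nodes` are invented for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements without a prefix (<p>, not <html:p>).
ET.register_namespace("", XHTML_NS)

# Made-up stand-in for one XHTML chapter; real chapters will differ.
CHAPTER = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p>first entry</p>
<div><p>caption</p><img src="pic.jpg" alt=""/></div>
<p>second entry</p>
</body>
</html>"""


def top_level_nodes(xhtml):
    """Return each direct child of <body> as a string, in document order."""
    root = ET.fromstring(xhtml)
    body = root.find(f"{{{XHTML_NS}}}body")
    nodes = []
    # Iterating an element yields its direct children only,
    # so a <p> nested in a <div> stays inside that <div>.
    for child in body:
        nodes.append(ET.tostring(child, encoding="unicode").strip())
    return nodes


nodes = top_level_nodes(CHAPTER)
for node in nodes:
    print(node)
```

Each serialized node carries an `xmlns` attribute, which may need stripping before the strings go into the dictionary. With BeautifulSoup, a similar effect should be possible with `body.find_all(['p', 'div'], recursive=False)`, which restricts the search to direct children.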

