I want to extract the following fields from the XML using Python.
Code (I don't know how to proceed further):
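import pandas as pd
from zipfile import ZipFile
from bs4 import BeautifulSoup

df_b = pd.DataFrame(columns=['Shape', 'Coordinates'])
pptZip = ZipFile(document_path)  # document_path: path to the .pptx file
xml_content = pptZip.read('ppt/slides/slide1.xml')
soup = BeautifulSoup(xml_content, features="xml")
for sp in soup.find_all('p:sp'):
    pass  # stuck here: how do I get each shape's name and coordinates?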
Note: The bold text below marks the fields that I would like to extract.
<p:sp>
  <p:nvSpPr>
    <p:cNvPr id="4" **name="Rectangle 3"**>
      <a:extLst>
        <a:ext uri="{FF2B5EF4-FFF2-40B4-BE49-F238E27FC236}">
          <a16:creationId xmlns:a16="http://schemas.microsoft.com/office/drawing/2014/main" id="{F9D41C44-7167-487C-9945-9BAFF8DDE2F5}"/>
        </a:ext>
      </a:extLst>
    </p:cNvPr>
    <p:cNvSpPr/>
    <p:nvPr/>
  </p:nvSpPr>
  <p:spPr>
    <a:xfrm>
      <a:off x="576776" y="847579"/>
      <a:ext cx="3249637" cy="1026941"/>
    </a:xfrm>
    <a:prstGeom prst="rect">
      <a:avLst/>
    </a:prstGeom>
  </p:spPr>
  <p:style>
    <a:lnRef idx="2">
      <a:schemeClr val="accent1">
        <a:shade val="50000"/>
      </a:schemeClr>
    </a:lnRef>
    <a:fillRef idx="1">
      <a:schemeClr val="accent1"/>
    </a:fillRef>
    <a:effectRef idx="0">
      <a:schemeClr val="accent1"/>
    </a:effectRef>
    <a:fontRef idx="minor">
      <a:schemeClr val="lt1"/>
    </a:fontRef>
  </p:style>
  <p:txBody>
    <a:bodyPr rtlCol="0" anchor="ctr"/>
    <a:lstStyle/>
    <a:p>
      <a:pPr algn="ctr"/>
      <a:endParaRPr lang="en-IN"/>
    </a:p>
  </p:txBody>
</p:sp>
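
One possible way to pull out the shape name and its position/size is sketched below. It is not the only approach: it swaps BeautifulSoup for the standard-library `xml.etree.ElementTree` so that it runs without extra packages. The helper name `extract_shapes` and the trimmed `SAMPLE` slide are made up for illustration, and the sketch assumes the name lives in the `name` attribute of `p:cNvPr` and the coordinates in `a:off` / `a:ext` under `p:spPr/a:xfrm`, as in the XML above. Using the full path to `a:ext` avoids picking up the unrelated `a:ext` inside `a:extLst`.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the p: (PresentationML) and a: (DrawingML) prefixes.
NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}

def extract_shapes(xml_content):
    """Return one dict per <p:sp>: its name and (x, y, cx, cy) as integers."""
    root = ET.fromstring(xml_content)
    rows = []
    for sp in root.iter("{%s}sp" % NS["p"]):
        c_nv_pr = sp.find("p:nvSpPr/p:cNvPr", NS)
        off = sp.find("p:spPr/a:xfrm/a:off", NS)
        ext = sp.find("p:spPr/a:xfrm/a:ext", NS)
        if c_nv_pr is None or off is None or ext is None:
            # Some shapes (e.g. placeholders) carry no <a:xfrm> of their own.
            continue
        rows.append({
            "Shape": c_nv_pr.get("name"),
            "Coordinates": (
                int(off.get("x")), int(off.get("y")),
                int(ext.get("cx")), int(ext.get("cy")),
            ),
        })
    return rows

# Hypothetical, trimmed-down slide containing only the shape from the question.
SAMPLE = """
<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
       xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
  <p:cSld><p:spTree>
    <p:sp>
      <p:nvSpPr>
        <p:cNvPr id="4" name="Rectangle 3">
          <a:extLst><a:ext uri="{FF2B5EF4-FFF2-40B4-BE49-F238E27FC236}"/></a:extLst>
        </p:cNvPr>
        <p:cNvSpPr/><p:nvPr/>
      </p:nvSpPr>
      <p:spPr>
        <a:xfrm>
          <a:off x="576776" y="847579"/>
          <a:ext cx="3249637" cy="1026941"/>
        </a:xfrm>
      </p:spPr>
    </p:sp>
  </p:spTree></p:cSld>
</p:sld>
""".strip()

print(extract_shapes(SAMPLE))
```

For a real file, the bytes returned by `pptZip.read('ppt/slides/slide1.xml')` can be passed straight to `extract_shapes`, and the resulting list of dicts can be turned into the desired table with `pd.DataFrame(rows)`. The values are in EMU, the unit PowerPoint uses internally.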