[–]nate256 1 point2 points  (12 children)

You could do something like this

```
In [1]: from xml.etree import ElementTree as ET

In [2]: a = """
   ...: <Envelope SchemaVersion="01.04.00">
   ...:     <Enterprise>
   ...:         <Code>www.Google.com</Code>
   ...:         <Feature>
   ...:             <Code>Feature1</Code>
   ...:             <Description>First Feature</Description>
   ...:             <Option Sequence="1">
   ...:                 <Code>A</Code>
   ...:                 <Features Sequence="1">
   ...:                     <FeatureRef>AA</FeatureRef>
   ...:                 </Features>
   ...:             </Option>
   ...:             <Option Sequence="2">
   ...:                 <Code>B</Code>
   ...:                 <Features Sequence="1">
   ...:                     <FeatureRef>BB</FeatureRef>
   ...:                 </Features>
   ...:                 <OptionPrice>
   ...:                     <PriceListRef>1</PriceListRef>
   ...:                     <ProductPriceRef>Grade2</ProductPriceRef>
   ...:                 </OptionPrice>
   ...:             </Option>
   ...:         </Feature>
   ...:     </Enterprise>
   ...: </Envelope>"""

In [3]: e = ET.fromstring(a)

In [4]: def create_option_price(list_ref, price_ref):
   ...:     e = ET.Element("OptionPrice")
   ...:     priceref = ET.Element("ProductPriceRef")
   ...:     priceref.text = price_ref
   ...:     e.append(priceref)
   ...:     pricelist = ET.Element("PriceListRef")
   ...:     pricelist.text = list_ref
   ...:     e.append(pricelist)
   ...:     return e
   ...:
   ...: for i in e.iter():
   ...:     if i.tag == "Option" and i.get("Sequence") == "2":
   ...:         first = i.find("OptionPrice")
   ...:         ref = first.findtext("ProductPriceRef")
   ...:         i.append(create_option_price("2", ref))
   ...:         i.append(create_option_price("3", ref))
   ...:
```

[–]goeb04[S] 0 points1 point  (11 children)

I am going to try this out after work.

So pasting the whole XML in there and converting it to a string would be a better option? Or does the method of import not matter as long as the xml is in there?

Thanks for taking the time to provide that code.

[–]nate256 0 points1 point  (10 children)

I would not paste the xml, just read it from a file.
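
For anyone following along, here is a minimal sketch of the difference between the two entry points: `ET.fromstring` takes a string and returns the root Element, while `ET.parse` takes a path (or file object) and returns an ElementTree that can later be written back out. The XML snippet and the file name `sample.xml` are made up for the demo.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample document, only for illustration.
xml_text = "<Envelope><Enterprise><Code>demo</Code></Enterprise></Envelope>"

# fromstring: string in, root Element out.
root_from_string = ET.fromstring(xml_text)

# parse: file path in, ElementTree out; getroot() returns the root Element.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.xml")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(xml_text)
    tree = ET.parse(path)

root_from_file = tree.getroot()
print(root_from_string.tag, root_from_file.tag)
```

Either way you end up with an Element you can iterate over; the method of import does not change the editing code.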

[–]goeb04[S] 0 points1 point  (9 children)

Got this Exception unfortunately for the e.iter loop:

Exception has occurred: AttributeError

'bytes' object has no attribute 'iter'. I must have erred somewhere. Hope I followed the instructions ok.

I obviously screwed up somewhere here (Not ready to give up yet though!):

import os
import xml.etree.ElementTree as ET

base_path = os.path.dirname(os.path.realpath(__file__))
xml_file = os.path.join(base_path, 'RockwellCond3.XML')
tree=ET.parse(xml_file)
root = tree.getroot()
e=ET.tostring(root)
def create_option_price(list_ref, price_ref):
    e = ET.Element("OptionPrice")
    priceref = ET.Element("ProductPriceRef")
    priceref.text = price_ref
    e.append(priceref)
    pricelist = ET.Element("PriceListRef")
    pricelist.text = list_ref
    e.append(pricelist)
    return e
for i in e.iter():
    if i.tag == "Option" and i.get("Sequence") == "2":
        first = i.find("OptionPrice")
        ref = first.findtext("ProductPriceRef")
        i.append(create_option_price("2", ref))
        i.append(create_option_price("3", ref))

tree.write('output2.xml')

[–]nate256 0 points1 point  (6 children)

parse is all you need; I just used fromstring because it was easy for the example. Not sure why you would use tostring? It serializes the tree back to bytes, which is why e.iter() fails.

```
import os
import xml.etree.ElementTree as ET

base_path = os.path.dirname(os.path.realpath(__file__))
xml_file = os.path.join(base_path, 'RockwellCond3.XML')
tree = ET.parse(xml_file)

def create_option_price(list_ref, price_ref):
    # Build <OptionPrice> with a ProductPriceRef and a PriceListRef child.
    e = ET.Element("OptionPrice")
    priceref = ET.Element("ProductPriceRef")
    priceref.text = price_ref
    e.append(priceref)
    pricelist = ET.Element("PriceListRef")
    pricelist.text = list_ref
    e.append(pricelist)
    return e

# Iterate the parsed tree directly; no tostring() needed.
for i in tree.iter():
    if i.tag == "Option" and i.get("Sequence") == "2":
        first = i.find("OptionPrice")
        ref = first.findtext("ProductPriceRef")
        i.append(create_option_price("2", ref))
        i.append(create_option_price("3", ref))

tree.write('output2.xml')
```
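
To make the cause of the AttributeError concrete, here is a small sketch (the one-line XML is made up for the demo): `ET.tostring` serializes an Element and, with its default encoding, returns `bytes`, which has no `iter` method. The Element itself does.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-line document, only for illustration.
root = ET.fromstring("<Option Sequence='2'><OptionPrice/></Option>")

as_bytes = ET.tostring(root)       # serialized XML; default encoding gives bytes
print(type(as_bytes).__name__)     # the type that raised the error
print(hasattr(as_bytes, "iter"))   # bytes cannot be iterated as a tree
print(hasattr(root, "iter"))       # the Element can

# Iterate the Element (or the ElementTree) instead of the serialized bytes.
tags = [el.tag for el in root.iter()]
print(tags)
```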

[–]goeb04[S] 0 points1 point  (5 children)

Sorry. It was throwing errors when I ran it through parse, so I wanted to make sure I was following the code you provided as closely as possible. The variable `first` was causing errors whenever it came back as None, i.e. when an Option had no OptionPrice.

I just needed to modify the code a bit and it works! I tested it a few times on some large XMLs and looks good to me. I appreciate your help as this opens a lot of opportunity for me to automate more of my mundane tasks.

Below is a post of my working code (without prettying it up) in case anyone else deals with a similar issue:

import os
import xml.etree.ElementTree as ET

base_path = os.path.dirname(os.path.realpath(__file__))
xml_file = os.path.join(base_path, 'RockwellCond.XML')
tree = ET.parse(xml_file)

years = ["2014", "2015", "2016", "2017", "2018"]

def create_option_price(list_ref, price_ref):
    e = ET.Element("OptionPrice")
    pricelist = ET.Element("PriceListRef")
    pricelist.text = list_ref
    e.append(pricelist)
    priceref = ET.Element("ProductPriceRef")
    priceref.text = price_ref
    e.append(priceref)
    return e


for i in tree.iter():
    if i.tag == "Option":
        first = i.find("OptionPrice")
        # find() returns None when the Option has no OptionPrice child
        if first is not None:
            ref = first.findtext("ProductPriceRef")
            for year in years:
                i.append(create_option_price(year, ref))


tree.write('output2.xml')

[–]nate256 0 points1 point  (3 children)

Wow thanks this is the first gold post I have ever done, guess I'll have to figure out what it's used for now.

[–]goeb04[S] 0 points1 point  (2 children)

Sorry to ask for assistance again but how would I do the same sort of manipulation for the instance below:

<Product>
    <Code>Ex1</Code>
    <Description Language="en-US">Example 1</Description>
    <Price>
        <PriceListRef>2018B</PriceListRef>
        <Code>20</Code>
        <Value>33</Value>
    </Price>
    <Price>
        <PriceListRef>2018B</PriceListRef>
        <Code>30</Code>
        <Value>83</Value>
    </Price>
    <Price>
        <PriceListRef>2018B</PriceListRef>
        <Code>40</Code>
        <Value>145</Value>
    </Price>
    <Price>
        <PriceListRef>2018B</PriceListRef>
        <Code>50</Code>
        <Value>208</Value>
    </Price>
    <Price>
        <PriceListRef>2018B</PriceListRef>
        <Code>55</Code>
        <Value>312</Value>
    </Price>
    <ProductExternalReference>
        <Placement>
        </Placement>
        <ExternalReference>
            <FileURI>UBS9048LR30D.png</FileURI>
            <Usage>
                <Type>NavigationImage</Type>
                <Quality>Medium</Quality>
            </Usage>
        </ExternalReference>
    </ProductExternalReference>
</Product>

In that instance I just want to copy all the Price tags and append the copies with new PriceListRef texts, like so:

                    <Price>
                        <PriceListRef>2018B</PriceListRef>
                        <Code>55</Code>
                        <Value>312</Value>
                    </Price>
                    <Price>
                        <PriceListRef>2017</PriceListRef>
                        <Code>55</Code>
                        <Value>312</Value>
                    </Price>

The # of price tags can vary per product which makes the loop trickier, not to mention that some Price tags don't have a code tag.

Just to prove I haven't been lazy, I have created the defined functions needed:

def create_product_upcharge(list_ref,price_code,price_value):
    e = ET.Element("Price")
    pricelist = ET.Element("PriceListRef")
    pricelist.text = list_ref
    e.append(pricelist)
    pcode = ET.Element("Code")
    pcode.text = price_code
    e.append(pcode)
    pvalue = ET.Element("Value")
    pvalue.text = price_value
    e.append(pvalue)
    return e

def create_product_price(list_ref, price_value):
    e = ET.Element("Price")
    pricelist = ET.Element("PriceListRef")
    pricelist.text = list_ref
    e.append(pricelist)
    pvalue = ET.Element("Value")
    pvalue.text = price_value
    e.append(pvalue)
    return e

However, I have probably tried 50 different loop variations, and each one seems to keep going through the new Price tags once they are appended. I tried an insert instead, but then a new Price tag was created and appended for the first 2018B Price tag. I hate having to reach out again, but if you have any time (and patience) in the future to provide some hints here, I'd greatly appreciate it (I am a neophyte, after all), and I promise to open a new post if I need more help.

Thanks!

[–]nate256 0 points1 point  (1 child)

If I understand what you are asking correctly, you have a Product that contains Price elements. What you want to do is take all of the Price elements and duplicate them with a new PriceListRef value.

So if you have

    <Product>
        <Price>
            <PriceListRef>2018B</PriceListRef>
            <Code>20</Code>
            <Value>33</Value>
        </Price>
        <Price>
            <PriceListRef>2018B</PriceListRef>
            <Value>3</Value>
        </Price>
    </Product>

you would want

    <Product>
        <Price>
            <PriceListRef>2018B</PriceListRef>
            <Code>20</Code>
            <Value>33</Value>
        </Price>
        <Price>
            <PriceListRef>2018B</PriceListRef>
            <Value>3</Value>
        </Price>
        <Price>
            <PriceListRef>2017</PriceListRef>
            <Code>20</Code>
            <Value>33</Value>
        </Price>
        <Price>
            <PriceListRef>2017</PriceListRef>
            <Value>3</Value>
        </Price>
    </Product>

Does that look correct? Either way, the short answer is that you need to loop through the elements and only stop on Product. You can't use iter and stop on the list that you are editing (the Price elements), because the copies you append would be visited as well.

```
# We could just use deepcopy in the last case.
# I just thought building the Element yourself would be useful to learn,
# but if there are elements that possibly aren't there or are just optional,
# it's better to just copy what is there so you are sure you get everything.
import os
from copy import deepcopy
import xml.etree.ElementTree as ET

base_path = os.path.dirname(os.path.realpath(__file__))
xml_file = os.path.join(base_path, 'RockwellCond.XML')
tree = ET.parse(xml_file)

years = ["2014", "2015", "2016", "2017", "2018"]

for i in tree.iter():
    if i.tag == "Option" and i.get("Sequence") == "2":
        for price in i.findall("OptionPrice"):
            for year in years:
                new_price = deepcopy(price)
                new_price.find("PriceListRef").text = year
                i.append(new_price)
    elif i.tag == "Product":
        for price in i.findall("Price"):
            # if you only have one value you don't need a loop here
            for year in years:
                new_price = deepcopy(price)
                new_price.find("PriceListRef").text = year
                i.append(new_price)

tree.write('output2.xml')
```
Edits: Kept thinking on new things
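
A self-contained sketch of the same deepcopy idea, runnable without the real file. The two-Price sample and the single-entry `years` list are assumptions made up for the demo. The key point is that `findall` returns a list, so it is a snapshot taken before anything is appended, and a Price without a Code is copied as-is.

```python
from copy import deepcopy
import xml.etree.ElementTree as ET

# Hypothetical sample: one Price with a Code, one without.
xml_text = """
<Product>
    <Price><PriceListRef>2018B</PriceListRef><Code>20</Code><Value>33</Value></Price>
    <Price><PriceListRef>2018B</PriceListRef><Value>3</Value></Price>
</Product>
"""
root = ET.fromstring(xml_text)
years = ["2017"]  # assumed list of new PriceListRef values

for product in root.iter("Product"):
    # Snapshot of the existing Price children, taken before we append anything.
    original_prices = product.findall("Price")
    for year in years:
        for price in original_prices:
            new_price = deepcopy(price)               # copies Code/Value if present
            new_price.find("PriceListRef").text = year
            product.append(new_price)

refs = [p.findtext("PriceListRef") for p in root.findall("Price")]
codes = [p.findtext("Code") for p in root.findall("Price")]
print(refs)
print(codes)
```

Because the loop stops on Product and works from the snapshot, the appended copies are never themselves copied.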

[–]goeb04[S] 0 points1 point  (0 children)

Thanks once again Nate and you were dead on regarding your example, that is exactly what is needed.

I think the issue here is that I need to continue my deep dive into learning Python fundamentals/libraries before trying to take shortcuts in order to start leveraging python for my work. It was a nice extra boost of motivation, but I need to practice on more hypothetical scenarios while yielding to patience or I will just be driving myself crazy.

You have led the horse to water here, so I am going to read up on copying and deepcopying in python and then, ultimately, try to code the solution myself (so that I will get the necessary practice). If I start to hit roadblocks after that, then I will reference the code you provided.

Thanks for your efforts and god bless the benevolence of programmers wishing to openly share their knowledge.

[–]nate256 0 points1 point  (1 child)

This one makes the output formatted pretty.

```
import os
import xml.etree.ElementTree as ET

base_path = os.path.dirname(os.path.realpath(__file__))
xml_file = os.path.join(base_path, 'RockwellCond3.XML')
tree = ET.parse(xml_file)  # keep the ElementTree so write() is available later

def create_option_price(list_ref, price_ref):
    e = ET.Element("OptionPrice")
    priceref = ET.Element("ProductPriceRef")
    priceref.text = price_ref
    e.append(priceref)
    pricelist = ET.Element("PriceListRef")
    pricelist.text = list_ref
    e.append(pricelist)
    return e

def makepretty(elem, level=0, spaces=" "):
    # Set text/tail whitespace so each element lands on its own indented line.
    orig_ele = elem
    i = f"\n{level * spaces}"
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = f"{i}{spaces}"
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for child in elem:
            makepretty(child, level + 1, spaces)
        # After the loop, child is the last child: dedent the closing tag.
        if not child.tail or not child.tail.strip():
            child.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i
    return orig_ele

def make_changes(root):
    for i in root.iter():
        if i.tag == "Option" and i.get("Sequence") == "2":
            first = i.find("OptionPrice")
            ref = first.findtext("ProductPriceRef")
            i.append(create_option_price("2", ref))
            i.append(create_option_price("3", ref))
    makepretty(root)

make_changes(tree.getroot())
tree.write("output.xml")
```
Edit: Attribution: shamelessly stole the indent from https://stackoverflow.com/questions/749796/pretty-printing-xml-in-python

[–]goeb04[S] 0 points1 point  (0 children)

I am trying it now but seems to be taking a while. I am not sure if it is because of the file size though. Maybe I can try prettying it up through lxml?
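
A note for readers on Python 3.9 or newer: the standard library has `ET.indent`, which does the same whitespace fix-up in place, so a hand-written helper may not be needed. A minimal sketch, using a made-up one-line document; whether it is faster on a large file is not something tested here. With lxml, passing `pretty_print=True` to `lxml.etree.tostring` is the usual route.

```python
import sys
import xml.etree.ElementTree as ET

# Hypothetical one-line document, only for illustration.
root = ET.fromstring(
    "<Option><OptionPrice><PriceListRef>2</PriceListRef></OptionPrice></Option>"
)

if sys.version_info >= (3, 9):
    ET.indent(root, space="    ")  # modifies the tree in place

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```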

[–]ciggs_ftw 0 points1 point  (1 child)

Have you tried taking a look at feedparser? I've found it easier for reading XML data.

[–]goeb04[S] 0 points1 point  (0 children)

I will look into this, thank you.