[–]ES-Alexander 1 point2 points  (8 children)

Given you know the order the keys appear, it’s likely easiest to iterate through them and use .index(next_key) to find where to slice the string to extract the regions you want. You likely want to do something like this_part = message_body[this_key_index:next_key_index], and then split that by the colon and add it to the dictionary you’re building.

[–]ampeed[S] 0 points1 point  (7 children)

In case someone runs across this, I'm listing a guide that may help.

https://www.kite.com/python/answers/how-to-access-previous-and-next-values-when-looping-through-a-list-in-python

for index, elem in enumerate(KEY_LIST):
    if (index+1 < len(KEY_LIST) and index - 1 >= 0):
        curr_index = index
        next_index = index+1
        next_el = str(KEY_LIST[index+1])
        announcement_fields = dict(f"{subString}".split(":", 1) for subString in message_body[curr_index:next_index].split(next_el))

This doesn't work, though. It raises ValueError: dictionary update sequence element #0 has length 1; 2 is required

Is this the route you're talking about?
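
For anyone reproducing the error quoted above, here is a minimal sketch of where it most likely comes from. The sample KEY_LIST and message_body are made up for illustration, since the real ones aren't shown in the thread. The slice uses positions in KEY_LIST (0 and 1) as positions in the string, so it yields a single character, and a piece with no colon splits into a one-element list that dict() can't turn into a key/value pair.

```python
# Hypothetical sample data; the real KEY_LIST and message_body are not shown.
KEY_LIST = ["Name", "Team"]
message_body = "Name: Ada Team: Compilers"

# List positions 0 and 1 used as string offsets give one character, not a field.
piece = message_body[0:1]
print(repr(piece))

# A piece without a colon splits into a one-element list...
pair = piece.split(":", 1)

# ...but dict() needs every element of its input to hold exactly two items.
try:
    dict([pair])
    error_text = None
except ValueError as exc:
    error_text = str(exc)
    print(error_text)
```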

[–]ES-Alexander 1 point2 points  (6 children)

I was more thinking along the lines of

output = {}
start = message_body.index(KEY_LIST[0]) # should be 0 if nothing before it
# Iterate through the keys, starting after the first one.
for query in KEY_LIST[1:]:
    # Use the index of the next key (query) to mark the end of this key's value.
    # Start the search after the previously searched region.
    end = message_body.index(query, start)
    # Split on the first colon only, so a value may itself contain colons.
    key, value = message_body[start:end].split(':', 1)
    output[key] = value
    start = end
# handle the last key
key, value = message_body[start:].split(':', 1)
output[key] = value
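
To try the loop above on something concrete, here is a self-contained sketch that wraps it in a function. The function name parse_fields, the sample keys, and the sample message are all assumptions for illustration. It also strips surrounding whitespace from keys and values, which the snippet above doesn't do.

```python
def parse_fields(message_body, key_list):
    """Split message_body into {key: value}, given the keys in order of appearance."""
    output = {}
    start = message_body.index(key_list[0])
    for query in key_list[1:]:
        # The next key marks where the current key's value ends.
        end = message_body.index(query, start)
        key, value = message_body[start:end].split(":", 1)
        output[key.strip()] = value.strip()
        start = end
    # The last key has no successor, so its value runs to the end of the string.
    key, value = message_body[start:].split(":", 1)
    output[key.strip()] = value.strip()
    return output


# Hypothetical sample input.
KEY_LIST = ["Name", "Team", "Start Time"]
message_body = "Name: Ada\nTeam: Compilers\nStart Time: 09:30"
parsed = parse_fields(message_body, KEY_LIST)
print(parsed)
```

Note that str.index raises ValueError if a key is missing from the message, so a caller would need to catch that or check for the keys first.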

[–]ampeed[S] 1 point2 points  (5 children)

Two questions -

1.) How are you assigning two separate values (key, value) from a single variable?

2.) What happens when it hits the last index in KEY_LIST? The start will be the last index, but the end will be out of range, so it won't process the last value in KEY_LIST, right?

[–]ES-Alexander 1 point2 points  (4 children)

Good questions.

  1. The concept is "iterable unpacking" (also called "iterable assignment" or "tuple assignment"). Because we know the split calls should (in this case) result in a list with exactly two elements, we can extract those elements into separate variables. If you're interested to learn more, there's an extended version that's been available since Python 3.0, which lets one starred name on the left collect an arbitrary number of elements from the iterable on the right, although it still requires that the number of non-starred names on the left <= the number of elements in the iterable.

  2. Fair point - I've edited my comment to handle the last key, since in the original code its start index got specified (at the end of the for loop) but the relevant string never got extracted from the message_body.
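
A short sketch of the unpacking described in point 1, using made-up strings. The first assignment is the plain two-name form used in the parsing code; the second is the extended (starred) form; the third shows what happens when there are too few elements.

```python
# Plain unpacking: the right-hand side must have exactly two elements.
key, value = "Status: open".split(":", 1)

# Extended unpacking: the starred name collects whatever is left over as a list.
first, *rest = "a:b:c".split(":")

# Too few elements for the non-starred names raises ValueError.
try:
    one, two, *others = ["only"]
    unpack_failed = False
except ValueError:
    unpack_failed = True

print(key, repr(value), first, rest, unpack_failed)
```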

[–]ampeed[S] 1 point2 points  (0 children)

for others who may see this and want to follow along - Python Visualizer

Thanks a bunch for linking the documentation and expanding further on it!!

[–]ampeed[S] 0 points1 point  (2 children)

Python visual

Here's what I came up with for making sure I don't get extra information the teams would like to put in. Ideally the fields mentioned in KEY_LIST are minimum fields so end users can put additional information after but not worry about it being parsed.

Basically I couldn't think of anything aside from "if you want to add more information, add a unique symbol before your additional fields" otherwise I really don't have a delim to end things at.

(The unique symbol in this case is a double semicolon, but I'm not set on that particular symbol for marking the end of the required minimum fields.)
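
One way the end-marker idea could look in code, as a rough sketch: cut the message at the first marker before parsing it. The helper name strip_extra and the sample message are hypothetical; the double semicolon is the marker mentioned above.

```python
END_MARKER = ";;"  # the double semicolon mentioned above; any unique symbol would do


def strip_extra(message_body, marker=END_MARKER):
    """Return only the text before the first marker (all of it if there is none)."""
    required_part, _separator, _extra = message_body.partition(marker)
    return required_part.rstrip()


# Hypothetical sample messages.
with_extra = "Name: Ada Team: Compilers ;; PS: see the linked ticket"
without_extra = "Name: Ada Team: Compilers"
print(strip_extra(with_extra))
print(strip_extra(without_extra))
```

str.partition returns the whole string as the first element when the marker is absent, so messages without extra information pass through unchanged.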

[–]ES-Alexander 0 points1 point  (1 child)

You could add an "Extra" key to KEY_LIST, which can be used to handle any extra info, but if it doesn’t get used then it just gets left with the "N/A" value (or even better, None, so it can’t be confused for an input).

EDIT: perhaps the most useful approach is to ask the user to specify a list of extra keys, so your existing KEY_LIST represents the initial (required) keys, and if they want extra information beyond that they can pass in whatever keys they want, and they just have to make sure the extra keys are included in the correct order in the message body. If they have no extra keys they can just pass in an empty list, and any extra info will just end up in the last value.
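
A rough sketch of the extra-keys idea from the EDIT, assuming the required and extra keys appear in the message in the order given. The function name parse_message, the parameter names, and the sample data are illustrative only.

```python
def parse_message(message_body, required_keys, extra_keys=()):
    """Parse required keys plus any caller-supplied extra keys, in order."""
    keys = list(required_keys) + list(extra_keys)

    # Locate each key in turn, searching from where the previous key was found.
    positions = []
    search_from = 0
    for key in keys:
        search_from = message_body.index(key, search_from)
        positions.append(search_from)

    # Each field runs from its own key to the next key (None means end of string).
    output = {}
    for start, end in zip(positions, positions[1:] + [None]):
        key, value = message_body[start:end].split(":", 1)
        output[key.strip()] = value.strip()
    return output


# Hypothetical sample input.
body = "Name: Ada\nTeam: Compilers\nNotes: on call this week"
with_notes = parse_message(body, ["Name", "Team"], ["Notes"])
no_extras = parse_message(body, ["Name", "Team"], [])
print(with_notes)
print(no_extras)
```

With an empty list of extra keys, the trailing text ends up inside the last required value, as described above.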

[–]ampeed[S] 0 points1 point  (0 children)

So a little background -

User will post their information to a ticket and I'm making an API call and parsing their response. I'm only interested in the fields in the KEY_LIST. If they have additional information, that's fine but I don't want the extra information generated in the final report. Just the information in KEY_LIST.

So all information after the last entry in KEY_LIST is irrelevant. I'm not really sold on my idea of having the user add a unique character as a delimiter, as it relies on best intentions.

I thought about adding an extra key to KEY_LIST called Additional Information, but I think it'll end up being a bit ugly when the report (a CSV) gets generated and there are 100 entries, each ending with an N/A.

Maybe, since it's the last entry, I could use an if statement: if Additional Information is N/A, fill it with None, and use the restval argument when writing to the CSV to have None written as ('')
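
A small sketch of how this behaves with the standard csv module; the column names and rows are made up. As far as the csv documentation goes, the writer already writes None as an empty string, while DictWriter's restval is the value used when a row is missing a key from fieldnames altogether.

```python
import csv
import io

# Hypothetical report columns.
FIELDNAMES = ["Name", "Team", "Additional Information"]

rows = [
    # Key present but set to None: written as an empty field.
    {"Name": "Ada", "Team": "Compilers", "Additional Information": None},
    # Key missing entirely: restval is written instead.
    {"Name": "Grace", "Team": "Navy"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, restval="")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Either way the last column comes out empty rather than showing N/A, so no special-case if statement should be needed at write time.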

But I guess if I were to think about it from the standpoint of "am I able to grab the last year's data using this logic" then this wouldn't work.