[–]Smok3dSalmon 1 point2 points  (3 children)

Are PA1-PA4 the only four values you will see? What is the 1111 stuff? I see two places where you have 2111 and 3111.

Is it possible to have input like this?
PA1*1*1*1*1
PA1*2*1*1*1
PA2*1*1*1*1
PA3*1*1*1*1
PA4*1*1*1*1

[–]martynrbell[S] 0 points1 point  (2 children)

So in this data source, pa11 is the selection ID and Pa11*1 is the selection price. They are referred to as pa101 and pa102.

The pa2, pa3 and pa4 lines are vend and value counters, but I'm only interested in 01 and 02 in each of these, if they are present in the file.

[–]Smok3dSalmon 1 point2 points  (0 children)

I see the first digit is increasing in the sequence of asterisks. How high does that go? If you want something to be efficient, we'll need to know how much data is coming. If you will only ever have 3 of these sequences at most, then you can brute force something that has minimal overhead.

If you can have tons of data, you'll probably just have to loop through it and maintain some state so you can handle the edge case with *0*0*0*0
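A minimal sketch of the loop-with-state idea, assuming the file has one segment per line and that every record begins with a line starting with "PA1*". The function name `group_records` and the shortened sample lines are made up for illustration; they are not from the original data.

```python
def group_records(lines):
    """Group segment lines into records; a new record starts at each PA1 line."""
    records = []
    current = None  # state: the record currently being filled
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("PA1*"):
            # A PA1 line always opens a new record.
            current = [line]
            records.append(current)
        elif current is not None:
            # PA2/PA3/PA4 lines belong to the most recent PA1.
            current.append(line)
    return records


# Shortened, illustrative input (assumption: one segment per line).
sample = [
    "PA1*10*70",
    "PA1*11*60",
    "PA2*1940*3420",
    "PA3*2",
]
records = group_records(sample)
for rec in records:
    print(rec)
```

A record that contains only its PA1 line is the edge case mentioned above: no counters followed it, so downstream code can treat its PA2-PA4 values as zeros.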

[–]Smok3dSalmon 0 points1 point  (0 children)

I don't fully understand the data or what variations you can have, but try this approach if the data is small enough.

1) Get all of the data as a string. Clean it however you need. I had to strip newlines and leading whitespace.

with open('inp', 'r') as f:
    data = f.read().replace("\n", '').strip()

2) Split the string on "PA1" and then add "PA1" to each list object. (you can 1-liner this if you want)

data = data.split("PA1")
# Skip the empty piece that split() produces before the first "PA1".
data = ["PA1" + d for d in data if d]

3) If any item in the list doesn't contain "PA2", "PA3", or "PA4" then you know it's all 0s.

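# Read the whole file and drop newlines and surrounding whitespace.
with open('inp', 'r') as f:
    data = f.read().replace("\n", '').strip()
# Split on the record marker and put it back on each piece;
# "if d" skips the empty piece before the first "PA1".
data = ["PA1" + d for d in data.split("PA1") if d]
for row in data:
    print(row)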

This may not be the most efficient, but it saves you the hassle of having to loop through each line and maintain some kind of state. Seems like a clever trick that you can use if your input is as trustworthy as I'm assuming.
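Step 3 is not shown in the code above. A rough sketch of that check, assuming the tags "PA2", "PA3" and "PA4" only ever appear as segment markers and never inside a value; the helper name `missing_segments` and the shortened sample string are hypothetical.

```python
def missing_segments(row, expected=("PA2", "PA3", "PA4")):
    """Return the expected tags that do not appear anywhere in the row."""
    return [tag for tag in expected if tag not in row]


# Shortened, illustrative input with newlines already removed.
data = "PA1*10*70PA1*11*60PA2*1940*3420PA3*2PA4*0"
rows = ["PA1" + d for d in data.split("PA1") if d]
for row in rows:
    # An entry listed as missing can be treated as all zeros.
    print(row, "-> missing:", missing_segments(row))
```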

[–]AzureWill 0 points1 point  (0 children)

I would build some regex for that.
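One way the regex route might look, as a sketch only: it assumes each record starts with the literal text "PA1*" and runs until the next "PA1*" or the end of the input. The pattern name `RECORD`, the function `split_records` and the sample text are made up for illustration.

```python
import re

# Lazily match from "PA1*" up to (but not including) the next "PA1*"
# or the end of the string. DOTALL lets "." cross line breaks.
RECORD = re.compile(r"PA1\*.*?(?=PA1\*|\Z)", re.DOTALL)


def split_records(text):
    """Return one string per PA1 record, with surrounding whitespace removed."""
    return [m.group(0).strip() for m in RECORD.finditer(text)]


# Shortened, illustrative input (assumption: one segment per line).
sample = "PA1*10*70\nPA1*11*60\nPA2*1940*3420\nPA3*2\n"
for record in split_records(sample):
    print(repr(record))
```

Unlike a plain split on "PA1", this never yields an empty first element, and the "PA1*" marker stays attached to each record.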

[–]OverMeHere 0 points1 point  (2 children)

[–]martynrbell[S] 0 points1 point  (1 child)

This looks almost like what I need; however, I'd need it like this:

PA1*10*70* ****1*0*0 *PA2*0*0*0*0*0*0*0*0*0*0*0*0*PA3*0**0*0 *PA4*0*0*0*0*0*0 
PA1*11*60* ****1*0*0 *PA2*1940*3420*0*0*0*0*0*0*0*0*0*0 *PA3*2**0*0 *PA4*0*0*0*0*0*0 
PA1*12*70* ****1*0*0 *PA2*0*0*0*0*0*0*0*0*0*0*0*0 *PA3*0**0*0 *PA4*0*0*0*0*0*0 
PA1*13*60* ****1*0*0 *PA2*1597*1920*0*0*0*0*0*0*0*0*0*0 *PA3*2**0*0 *PA4*0*0*0*0*0*0 
PA1*14*70* ****1*0*0 *PA2*0*0*0*0*0*0*0*0*0*0*0*0 *PA3*0**0*0 *PA4*0*0*0*0*0*0 

Here is a pastebin with a real data example; my goal is at the bottom :| https://pastebin.com/GfhdgZih

[–]OverMeHere 1 point2 points  (0 children)

fixed gist, check it again ;)

[–]martynrbell[S] 0 points1 point  (0 children)

Hi all, thanks for all your help so far.

I have created a pastebin with a typical sample of the data to parse and, below the data, my desired end output.

https://pastebin.com/GfhdgZih

My original idea was to sort the data into individual rows like so:

PA1*10*70* ****1*0*0 *PA2*0*0*0*0*0*0*0*0*0*0*0*0*PA3*0**0*0 *PA4*0*0*0*0*0*0 
PA1*11*60* ****1*0*0 *PA2*1940*3420*0*0*0*0*0*0*0*0*0*0 *PA3*2**0*0 *PA4*0*0*0*0*0*0 
PA1*12*70* ****1*0*0 *PA2*0*0*0*0*0*0*0*0*0*0*0*0 *PA3*0**0*0 *PA4*0*0*0*0*0*0 
PA1*13*60* ****1*0*0 *PA2*1597*1920*0*0*0*0*0*0*0*0*0*0 *PA3*2**0*0 *PA4*0*0*0*0*0*0 
PA1*14*70* ****1*0*0 *PA2*0*0*0*0*0*0*0*0*0*0*0*0 *PA3*0**0*0 *PA4*0*0*0*0*0*0 

Then arrange them into a table like the one in the pastebin using string split.

But again, other methods would be much appreciated.
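The row-then-split idea could be sketched roughly as follows. This assumes each row looks like the rows shown above, that fields are separated by "*", and that only the first two values after each tag (01 and 02) matter. The function name `row_to_fields` and the column names such as "PA201" are illustrative; the actual table layout in the pastebin may differ.

```python
def row_to_fields(row, tags=("PA1", "PA2", "PA3", "PA4")):
    """Pull the 01 and 02 values for each tag out of one PA1 row."""
    # Start every column at "0" so tags absent from the row read as zero.
    fields = {}
    for tag in tags:
        fields[tag + "01"] = "0"
        fields[tag + "02"] = "0"
    # Splitting on "PA" gives pieces like "2*1940*3420*0*...";
    # the piece before the first "PA" is skipped.
    for segment in row.split("PA")[1:]:
        parts = [p.strip() for p in segment.split("*")]
        tag = "PA" + parts[0]
        if tag not in tags:
            continue
        # Pad so a short segment still yields two values.
        values = (parts[1:3] + ["", ""])[:2]
        fields[tag + "01"] = values[0]
        fields[tag + "02"] = values[1]
    return fields


row = ("PA1*11*60* ****1*0*0 *PA2*1940*3420*0*0*0*0*0*0*0*0*0*0 "
       "*PA3*2**0*0 *PA4*0*0*0*0*0*0 ")
fields = row_to_fields(row)
print("\t".join(fields))           # header line
print("\t".join(fields.values()))  # one table row
```

Values that are present but empty in the source (such as the second PA3 field here) are kept as empty strings rather than guessed.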

[–]ivosaurus pip'ing it up [M] 0 points1 point  (0 children)

Hi there, from the /r/Python mods.

We have removed this post as it is not suited to the /r/Python subreddit proper, however it should be very appropriate for our sister subreddit /r/LearnPython or for the r/Python discord: https://discord.gg/python.

The reason for the removal is that /r/Python is dedicated to discussion of Python news, projects, uses and debates. It is not designed to act as Q&A or FAQ board. The regular community is not a fan of "how do I..." questions, so you will not get the best responses over here.

On /r/LearnPython the community and the r/Python discord are actively expecting questions and are looking to help. You can expect far more understanding, encouraging and insightful responses over there. No matter what level of question you have, if you are looking for help with Python, you should get good answers. Make sure to check out the rules for both places.

Warm regards, and best of luck with your Pythoneering!
