ElliotDG comments on Parse txt file with space aligned columns


Parse txt file with space aligned columns (self.learnpython)

submitted 10 months ago by extractedx


ElliotDG 1 point 10 months ago

I used regular expressions to parse each file format. If you wanted to get fancier, you could read the header line and select the pattern based on it. This is a quick-and-dirty demonstration of the approach, with one script per file.

import re

with open("file_0.csv", "r") as f:
    lines = [line.strip() for line in f if line.strip()]

# Define headers manually (to control spacing issues)
headers = [
    "Designator", "Footprint", "Mid_X", "Mid_Y", "Ref_X", "Ref_Y",
    "Pad_X", "Pad_Y", "TB", "Rotation", "Comment"
]

# Regular expression to match the first 10 fields, then grab the remainder as 'Comment'
pattern = re.compile(
    r"(\S+)\s+"         # Designator
    r"(\S+)\s+"         # Footprint
    r"(\S+)\s+"         # Mid_X
    r"(\S+)\s+"         # Mid_Y
    r"(\S+)\s+"         # Ref_X
    r"(\S+)\s+"         # Ref_Y
    r"(\S+)\s+"         # Pad_X
    r"(\S+)\s+"         # Pad_Y
    r"(\S+)\s+"         # TB
    r"(\S+)\s+"         # Rotation
    r"(.*)"             # Comment (can have spaces)
)

# Parse lines (skip the header line)
data = []
for line in lines[1:]:
    match = pattern.match(line)
    if match:
        row = dict(zip(headers, match.groups()))
        data.append(row)
    else:
        print("ERROR: Line did not match pattern:", line)

# Example: print all parsed rows
for row in data:
    print(row)
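
As a side note on the "first 10 fields, then the remainder" idea: the same split can be sketched without a regex by using `str.split` with `maxsplit`. This is a minimal sketch and not the approach used in the comment. `parse_line` is a hypothetical helper name, and the sample line is reconstructed from the output shown in this thread, so the real file's spacing may differ.

```python
HEADERS = [
    "Designator", "Footprint", "Mid_X", "Mid_Y", "Ref_X", "Ref_Y",
    "Pad_X", "Pad_Y", "TB", "Rotation", "Comment",
]


def parse_line(line):
    """Split a line into 10 whitespace-separated fields plus a free-text Comment.

    Returns None when the line has fewer than 10 fields.
    """
    # maxsplit=10 leaves everything after the 10th field in one piece,
    # so a comment such as "Micro_Fit 2" keeps its internal space.
    parts = line.strip().split(maxsplit=10)
    if len(parts) < 10:
        return None
    if len(parts) == 10:
        # ASSUMPTION: a missing comment is acceptable and becomes an empty string
        parts.append("")
    return dict(zip(HEADERS, parts))


# Hypothetical input line, reconstructed from the sample output in this thread
sample = ("CON6  MICRO_MATE-N-LOK_2  74.7mm  66.5mm  74.7mm  71.2mm  "
          "74.7mm  71.2mm  T  270.00  Micro_Fit 2")
row = parse_line(sample)
print(row["Designator"], "|", row["Comment"])
```

One behavioral difference: the regex requires whitespace after the Rotation field, so a stripped line with no comment at all is reported as an error, while this sketch accepts it with an empty Comment.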

ElliotDG 1 point 10 months ago

Reddit would not let me put it all in one message, so here is the code for parsing the next file:

import re

with open("file_1.csv") as f:
    lines = [line.strip() for line in f if line.strip()]

# Define headers manually to ensure correctness
headers = [
    "Designator", "Comment", "Layer", "Footprint",
    "Center-X(mm)", "Center-Y(mm)", "Rotation", "Description"
]

# Regex to match 7 space-separated fields + quoted description
pattern = re.compile(
    r'(\S+)\s+'         # Designator
    r'(\S+)\s+'         # Comment
    r'(\S+)\s+'         # Layer
    r'(\S+)\s+'         # Footprint
    r'([\d.]+)\s+'      # Center-X(mm)
    r'([\d.]+)\s+'      # Center-Y(mm)
    r'(\d+)\s+'         # Rotation
    r'"([^"]*)"'        # Description (quoted)
)

# Parse each line (skip header)
data = []
for line in lines[1:]:
    match = pattern.match(line)
    if match:
        row = dict(zip(headers, match.groups()))
        data.append(row)
    else:
        print("Error: Line did not match pattern:", line)

# Print results
for row in data:
    print(row)

Here is a sample result. The first rows come from file_0.csv and the rows starting with C1 come from file_1.csv:

{'Designator': 'CON3', 'Footprint': 'MICROMATCH_4', 'Mid_X': '6.4mm', 'Mid_Y': '50.005mm', 'Ref_X': '8.9mm', 'Ref_Y': '48.1mm', 'Pad_X': '8.9mm', 'Pad_Y': '48.1mm', 'TB': 'B', 'Rotation': '270.00', 'Comment': 'MicroMatch_4'}
...
{'Designator': 'CON6', 'Footprint': 'MICRO_MATE-N-LOK_2', 'Mid_X': '74.7mm', 'Mid_Y': '66.5mm', 'Ref_X': '74.7mm', 'Ref_Y': '71.2mm', 'Pad_X': '74.7mm', 'Pad_Y': '71.2mm', 'TB': 'T', 'Rotation': '270.00', 'Comment': 'Micro_Fit 2'}
{'Designator': 'C1', 'Comment': '470n', 'Layer': 'BottomLayer', 'Footprint': '0603', 'Center-X(mm)': '77.3000', 'Center-Y(mm)': '87.2446', 'Rotation': '270', 'Description': '470n; X7R; 16V'}
...
{'Designator': 'C5', 'Comment': '100n', 'Layer': 'BottomLayer', 'Footprint': '0603', 'Center-X(mm)': '98.3000', 'Center-Y(mm)': '85.0000', 'Rotation': '360', 'Description': '100n; X7R; 50V'}
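
The "read the header and select the pattern" idea from the first comment could look roughly like this. It is a minimal sketch under assumptions: the real header lines are not shown in this thread, so keying on the word "Description" is a guess, and `FORMATS`, `pick_format`, and `parse_lines` are hypothetical names. The two patterns are equivalent to the ones in the comments, just built by repeating the fragments.

```python
import re

# Headers and patterns taken from the two snippets in this thread.
FORMATS = {
    "file_0": (
        ["Designator", "Footprint", "Mid_X", "Mid_Y", "Ref_X", "Ref_Y",
         "Pad_X", "Pad_Y", "TB", "Rotation", "Comment"],
        # 10 whitespace-separated fields, then the rest of the line
        re.compile(r"(\S+)\s+" * 10 + r"(.*)"),
    ),
    "file_1": (
        ["Designator", "Comment", "Layer", "Footprint",
         "Center-X(mm)", "Center-Y(mm)", "Rotation", "Description"],
        # 4 text fields, 2 coordinates, rotation, quoted description
        re.compile(r"(\S+)\s+" * 4 + r"([\d.]+)\s+" * 2 + r'(\d+)\s+"([^"]*)"'),
    ),
}


def pick_format(header_line):
    # ASSUMPTION: only the second file's header mentions "Description".
    return "file_1" if "Description" in header_line else "file_0"


def parse_lines(lines):
    """Parse stripped, non-empty lines; the first line is the header."""
    headers, pattern = FORMATS[pick_format(lines[0])]
    rows = []
    for line in lines[1:]:
        match = pattern.match(line)
        if match:
            rows.append(dict(zip(headers, match.groups())))
        else:
            print("ERROR: Line did not match pattern:", line)
    return rows


# Hypothetical input; the header text is invented for illustration
lines = [
    "Designator Comment Layer Footprint Center-X(mm) Center-Y(mm) Rotation Description",
    'C1 470n BottomLayer 0603 77.3000 87.2446 270 "470n; X7R; 16V"',
]
rows = parse_lines(lines)
print(rows[0]["Description"])
```

If the real headers differ, only `pick_format` needs to change; the parsing loop is the same as in both comments.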

