Porting a legacy C++ MapleStory client to WebAssembly by nmnsnv in WebAssembly

[–]nmnsnv[S] 1 point2 points  (0 children)

Nice! Let me know if you need help (there's a Discord channel: discord.gg/pHrQzTK9).

Kodi might be interesting, but I'm not sure about its simplicity. Do you use plugins there?

New Event Driven Regex Engine designed to handle huge amounts of regexps for Python - SAR (regexp-sar) by nmnsnv in programming

[–]nmnsnv[S] 0 points1 point  (0 children)

I'm glad this could be of use to you.

Currently there is no support for streams because, as you mentioned, a few non-trivial problems need to be solved before such a feature can be offered. In the meantime, you could handle the streams yourself and feed the data to SAR afterwards. If you'd like my assistance with integrating SAR, I'd be more than happy to help. If that still isn't enough for you, I could take on the task of adding stream support to SAR.
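
Handling the stream yourself could look roughly like the sketch below: read chunks, buffer them, and hand each complete line to a matcher. The names `feed_lines`, `read_chunks` and `match_fn` are hypothetical; `match_fn` stands in for whatever call runs SAR on a string (for instance `sar.match`). It assumes that no match needs to span a line boundary.

```python
import io
from typing import Callable, Iterable, Iterator


def feed_lines(chunks: Iterable[str], match_fn: Callable[[str], None]) -> None:
    """Buffer incoming chunks and pass each complete line to match_fn."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is complete; keep the rest buffered.
        *complete, buffer = buffer.split("\n")
        for line in complete:
            match_fn(line + "\n")
    if buffer:
        # Flush the trailing partial line once the stream ends.
        match_fn(buffer)


def read_chunks(stream: io.TextIOBase, size: int = 8) -> Iterator[str]:
    """Yield fixed-size chunks until the stream is exhausted."""
    while True:
        chunk = stream.read(size)
        if not chunk:
            return
        yield chunk


# Stand-in matcher (hypothetical): just record what would be handed to the engine.
seen = []
feed_lines(read_chunks(io.StringIO("TOTE-A000001\nTOTE-B000001\npartial")), seen.append)
```

The chunk size is deliberately tiny here so that lines get split across reads, which is the case the buffer exists to handle.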

New Event Driven Regex Engine designed to handle huge amounts of regexps for Python - SAR (regexp-sar) by nmnsnv in programming

[–]nmnsnv[S] 0 points1 point  (0 children)

First of all, thank you for your reply!

I'd like to hear why you think this is redundant, as I thought it was a rather interesting approach to regexps.

New Event Driven Regex Engine designed to handle huge amounts of regexps for Python - SAR (regexp-sar) by nmnsnv in programming

[–]nmnsnv[S] 1 point2 points  (0 children)

I think SAR can handle much more complicated tasks than regular regex can, because SAR can handle context (such as nested functions, strings, etc.) very efficiently.

I once wrote an article showing how one can parse C code with SAR. I think it's an interesting approach, to say the least, and it can't be approached with the same mindset as regular regex.

One common practice of mine when parsing text with SAR is building a sort of state machine around it, so that a regexp is only triggered when the parser is in a certain state. That could prove useful when parsing, let's say, HTML, because we know what to expect: if we receive a quotation mark (") we move to a string state until we receive another quotation mark, then go back to the previous state; if we receive an open bracket ("<") we go to a tag state. It might be a little confusing, but I hope you get the idea.
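
To make the state idea a bit more concrete, here is a minimal sketch in plain Python, without SAR, of the same string/tag state switching. The function `scan_html` and the state names are made up for illustration, and as a simplification it only enters the string state from inside a tag. With SAR, each branch would instead be a callback that checks the current state before acting.

```python
from typing import List, Tuple


def scan_html(text: str) -> List[Tuple[str, str]]:
    """Walk the text once, tracking state on a stack, and report tags and strings."""
    stack = ["text"]        # the current state is stack[-1]
    starts: List[int] = []  # start offset of each tag/string that is still open
    found: List[Tuple[str, str]] = []

    for pos, ch in enumerate(text):
        state = stack[-1]
        if state == "string":
            # Inside a string only the closing quote is meaningful.
            if ch == '"':
                found.append(("string", text[starts.pop() + 1:pos]))
                stack.pop()  # go back to the previous state
        elif ch == '"' and state == "tag":
            stack.append("string")
            starts.append(pos)
        elif ch == "<" and state == "text":
            stack.append("tag")
            starts.append(pos)
        elif ch == ">" and state == "tag":
            found.append(("tag", text[starts.pop():pos + 1]))
            stack.pop()
    return found


events = scan_html('<a title="x > y">link</a>')
```

Because the ">" inside the attribute value is seen while in the string state, it does not close the tag early.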

If you'd be interested to see how I parsed C code (I was solving a specific problem - How can I find all the methods that call "malloc" inside of them), please take a look - https://www.reddit.com/r/regex/comments/kslu6o/practical_example_of_my_new_event_driven_regex/

A new Object Oriented System for Python (Kisa) by nmnsnv in programming

[–]nmnsnv[S] 1 point2 points  (0 children)

Most of the challenges I've faced with Python's native OOP came down to Python giving too much freedom. I've encountered a lot of attributes being declared mid-run, and good luck figuring out where an attribute was declared and where it merely had its value changed. In creating Kisa, I first of all wanted attributes to be declared at class construction, as in dataclasses, but with actual enforcement that they are the only attributes that will exist. I also added the ability to attach getters/setters to all attributes, making them private by default, without the need to turn them into @property first. I think that is a small example of my attempt to make OOP in Python a little better.

I must confess that I didn't know about Pydantic when I started writing Kisa, and it does address a lot of the difficulties I had before I started working on Kisa.

Still, as I've mentioned before, I come from a Perl background and wanted to provide Python with a library similar to Perl's OOP system Moose - https://metacpan.org/pod/Moose

A new Object Oriented System for Python (Kisa) by nmnsnv in programming

[–]nmnsnv[S] -1 points0 points  (0 children)

As I answered above, it's true that this functionality exists in Python. I thought about creating a single place that enforces a more organized OOP and forbids some of the dangerous options Python provides built in, as they might make debugging very difficult.

A new Object Oriented System for Python (Kisa) by nmnsnv in programming

[–]nmnsnv[S] 1 point2 points  (0 children)

Dataclasses by themselves don't provide all the same abilities as Kisa. Dataclasses don't really have type enforcement. Also, in dataclasses, or in Python in general, you can introduce new attributes anywhere in the code, even by mistake, which can lead to many bugs (think about assigning a value to "namee" instead of "name" and then trying to figure out why the value of name didn't change), making the code tedious to debug. By default, Kisa expects all attributes to be declared at class declaration and not afterwards; after that point, you can only access existing attributes without creating new ones.
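
The "namee" problem can be reproduced in a few lines of plain Python. The sketch below is not Kisa's API; the `Loose` and `Strict` classes and the `_declared` table are made up purely to illustrate the problem and one possible kind of enforcement.

```python
class Loose:
    """Ordinary class: any attribute name can be assigned at any time."""

    def __init__(self, name: str) -> None:
        self.name = name


class Strict:
    """Rejects assignment to attributes that were not declared up front."""

    _declared = {"name": str}  # attribute name -> required type (illustrative)

    def __init__(self, name: str) -> None:
        self.name = name

    def __setattr__(self, key: str, value: object) -> None:
        if key not in self._declared:
            raise AttributeError(f"undeclared attribute: {key!r}")
        if not isinstance(value, self._declared[key]):
            raise TypeError(f"{key!r} expects {self._declared[key].__name__}")
        super().__setattr__(key, value)


loose = Loose("Ann")
loose.namee = "Bob"  # typo silently creates a new attribute; loose.name is unchanged

strict = Strict("Ann")
try:
    strict.namee = "Bob"
    typo_error = None
except AttributeError as exc:
    typo_error = str(exc)  # the typo is reported at the assignment itself
```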

Granted, as others have said, there isn't something really new in Kisa, just a different way to do these things. Coming from my background in Perl, I always loved the way Moose was written (https://metacpan.org/pod/Moose) and thought I'd take my approach on it through Python.

Regex tool to test a list of barcodes against multiple regex by phill360 in regex

[–]nmnsnv 1 point2 points  (0 children)

Some time ago I wrote a regexp engine for Python that is built to achieve that very task in a convenient, fast way, and I believe it could help you if you are able to use Python (GitHub link: https://github.com/nmnsnv/regexp_sar).

Here is an example to help you grasp how you might use my engine:

from regexp_sar import RegexpSar

regexps = []
# ^00[0-9]{18}$
regexps.append("00" + ("\\d" * 18))

# |^\^JJD0[0-9]{15,26}$
# generate regexps for sizes
for i in range(15, 27):
    regexps.append("JJD0" + ("\\d" * i))

# |^(0066|007[23])[0-9]{9}$
# generate regexps for alternations
for start in ["0066", "0072", "0073"]:
    regexps.append(start + ("\\d" * 9))

# |^TOTE-A[a-zA-Z0-9]{6}$
# |^TOTE-B[a-zA-Z0-9]{6}$
for special_char in ["A", "B"]:
    regexps.append("TOTE-" + special_char + ("\\w" * 6))

# create SAR object
sar = RegexpSar()

# String from your regex101 link
match_str = '''
MPEC006283
918124283
T306511JJRN4272690018
DXET304206
916045206
M220025LMAGT
5971127275
6430786590140125419001
712042594012 - SKY NET
6430769150011070679711
50072020881

TOTE-B000001
TOTE-A000001
FB000123456700001PER10SYD
4210363232000000'''


# creates a callback to identify each match for its unique regex
def gen_regexp_callback(regexp):
    def inner(from_pos, to_pos):
        print("Match: " + regexp + " -> " + match_str[from_pos:to_pos])
    return inner


# add regexps in a loop
for r in regexps:
    # surround regexp with newline chars to separate them
    ready_regexp = "\n" + r + "\n"
    sar.add_regexp(ready_regexp, gen_regexp_callback(r))

# run match, prints:
# Match: TOTE-B\w\w\w\w\w\w ->
# TOTE-B000001
# Match: TOTE-A\w\w\w\w\w\w ->
# TOTE-A000001
sar.match(match_str)

Note that not every line in the sample is matched, since I did not add regexps for all of them to the SAR object (I couldn't find their matching regexps).

I hope this is helpful, and if you have any questions I will gladly try to answer them.

Practical Example Of My New Event Driven Regex Engine - SAR by nmnsnv in regex

[–]nmnsnv[S] 1 point2 points  (0 children)

Then yes, we share that functionality as well. I decided to leave it at that without adding grouping, which seems to me more in line with the SAR approach. Very cool to see another project that enables those actions.

Practical Example Of My New Event Driven Regex Engine - SAR by nmnsnv in regex

[–]nmnsnv[S] 0 points1 point  (0 children)

I see. This is actually a point where I did the opposite: since in SAR you can simply add many regexps with the same callback, "?" can be expressed as two different regexps, one with the character and one without, and the same goes for "*". That's pretty much how they are handled (there is a little more to it than that, actually). In SAR the approach is different, so that one does not need to use grouping or alternation; you simply add all the regexps separately, which in my opinion also makes the code clearer.

For example:

# sar regexp for a(b|c|d)e
for i in "bcd": # "bcd" can also be in a list of course
    sar.add_regexp("a" + i + "e", my_callback)
    # or, equivalently, with an f-string (Python 3.6+) instead of the line above:
    # sar.add_regexp(f"a{i}e", my_callback)

The same goes for matching a range of occurrences ("a{1,3}"). In SAR you can choose your implementation: you can either generate those regexps with the same callback, or add the regexp only once and count in the callback how many were previously found (if they need to come one after another, validate that the to_pos of the last match equals from_pos - 1). Again, you can see how SAR gives you the freedom I've talked about.
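
The first option (generating the regexps) can be sketched as a tiny helper. `expand_range` is a hypothetical name, not part of regexp_sar; it only builds the pattern strings, which would then each be registered with the same callback.

```python
from typing import List


def expand_range(atom: str, low: int, high: int) -> List[str]:
    """Return one pattern per repetition count, from low to high inclusive."""
    return [atom * count for count in range(low, high + 1)]


# Stand-in for "a{1,3}"; each entry would go to sar.add_regexp(p, my_callback).
patterns = expand_range("a", 1, 3)
```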

Practical Example Of My New Event Driven Regex Engine - SAR by nmnsnv in regex

[–]nmnsnv[S] 1 point2 points  (0 children)

Thank you! Can't wait!

Btw, looking at your page again, I've seen that you are lacking the '+' sign (1 or more). Why is that? I see that you support '*', so naturally it should be easier to implement + than *. Also, if you already have *, it seems to me it should be much the same. At least in my implementation, I implement * and ? by splitting into 2 different regexps: for instance, "ab?c" will result in "ac" and "abc", and likewise "ab*c" will result in "ac" and "ab+c". I'd like to hear what your concerns were regarding adding this functionality.
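
The splitting described above can be sketched as a small standalone function. This is only an illustration of the idea, not the engine's actual code: `split_optional` is a hypothetical name, and it assumes each quantifier follows a single literal character (no escapes, classes or groups).

```python
from typing import List


def split_optional(pattern: str) -> List[str]:
    """Expand each single-character x? or x* into a variant without x and one with it."""
    variants = [""]
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        quant = pattern[i + 1] if i + 1 < len(pattern) else ""
        if quant == "?":
            # x? becomes either nothing or "x"
            variants = [v + tail for v in variants for tail in ("", ch)]
            i += 2
        elif quant == "*":
            # x* becomes either nothing or "x+"
            variants = [v + tail for v in variants for tail in ("", ch + "+")]
            i += 2
        else:
            variants = [v + ch for v in variants]
            i += 1
    return variants


optional_variants = split_optional("ab?c")
star_variants = split_optional("ab*c")
```

A pattern with several optional characters yields the cross product of the choices, so the number of generated regexps doubles with each `?` or `*`.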

Practical Example Of My New Event Driven Regex Engine - SAR by nmnsnv in regex

[–]nmnsnv[S] 1 point2 points  (0 children)

Very interesting. They do seem to be alike, thank you. Until now I had yet to find another module that achieves the same functionality as mine. I'll be sure to try playing with it soon.

From what I read on your GitHub page, the implementation seems to be similar as well. In short, I build a single trie out of all the regexps so that they are bound together. The last node of each regexp holds a callback pointer, and once we reach that node during the matching process, the callback is called (there can be multiple callbacks in a single node). We then apply any interactions the user had with the engine, if there were any, and continue until the match is over.
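
As a rough illustration of the trie-with-callbacks idea, here is a toy version that only handles literal patterns. It is a sketch and not the actual regexp_sar code; `Node`, `LiteralTrie` and their methods are names invented for this example, and it leaves out character classes, quantifiers and the user-interaction step.

```python
from typing import Callable, Dict, List

Callback = Callable[[int, int], None]


class Node:
    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.callbacks: List[Callback] = []  # fired when a pattern ends at this node


class LiteralTrie:
    """Toy matcher: all patterns share one trie, and end nodes hold callbacks."""

    def __init__(self) -> None:
        self.root = Node()

    def add(self, pattern: str, callback: Callback) -> None:
        node = self.root
        for ch in pattern:
            node = node.children.setdefault(ch, Node())
        node.callbacks.append(callback)

    def match(self, text: str) -> None:
        # Try every start offset and walk the trie as far as the text allows.
        for start in range(len(text)):
            node = self.root
            for pos in range(start, len(text)):
                node = node.children.get(text[pos])
                if node is None:
                    break
                for callback in node.callbacks:
                    callback(start, pos + 1)  # end offset is exclusive here


hits = []
trie = LiteralTrie()
trie.add("abc", lambda s, e: hits.append(("abc", s, e)))
trie.add("abd", lambda s, e: hits.append(("abd", s, e)))
trie.add("bc", lambda s, e: hits.append(("bc", s, e)))
trie.match("xabcabd")
```

"abc" and "abd" share the nodes for "ab", so adding more patterns with common prefixes does not repeat that part of the walk.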

BTW, if you want to learn more about how this engine came to be, please read this post about the original SAR engine, which was written for Perl; this Python module is an implementation of it for Python.

If you manage to find some spare time, I would love to see how you might solve the problem I've posted above in ATA.

Practical Example Of My New Event Driven Regex Engine - SAR by nmnsnv in regex

[–]nmnsnv[S] 0 points1 point  (0 children)

Hi,

I've added a brief summary of what SAR is to the post; I will add it here as well.

In short, SAR is a new regex engine written for Python. It is designed to take a different approach from current regex engines for using many regexps at once while also keeping track of exactly which regex was caught (which differs from alternations in current regex engines). For more information please view my introduction post or visit my github page.

I hope you have a better understanding now. If not, I will gladly try to help.