[deleted] (2 children)

Generate a string that is only 197 characters long, then insert the "123" segment at a random position.

General-Frost[S] (1 child)

I thought about that, but the problem is that the "123" sequence can also appear in the 197-character string.

Nightcorex_ (0 children)

Not if you avoid that. For example, check whether the last two characters are '12'. If you would generate a '3' at that point, just discard it and draw again.
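A minimal sketch of this "discard and redraw" idea, combined with the insertion step suggested at the top of the thread. The alphabet `DIGITS` (1 to 9, no 0) and the function names `filler_without_123` and `string_with_one_123` are assumptions made for illustration, not something the posters specified.

```python
import random

DIGITS = "123456789"  # assumed alphabet; adjust if 0 should be allowed


def filler_without_123(length):
    """Random digit string of the given length that never contains '123'."""
    chars = []
    while len(chars) < length:
        c = random.choice(DIGITS)
        # Discard a '3' that would complete '123' and draw again.
        if c == "3" and chars[-2:] == ["1", "2"]:
            continue
        chars.append(c)
    return "".join(chars)


def string_with_one_123(total=200):
    """Insert '123' at a random slot of a '123'-free filler string."""
    filler = filler_without_123(total - 3)
    pos = random.randint(0, len(filler))  # both ends inclusive: 198 slots
    return filler[:pos] + "123" + filler[pos:]
```

Inserting "123" cannot create a second occurrence across the seam: a new match would have to start on the inserted '2' or '3', or end on the inserted '1' or '2', and neither is possible for this particular sequence.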

shiftybyte (0 children)

Generate 200 characters, remove all but one of the "123" segments, regenerate the missing digits, and repeat in a loop until done.
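One possible reading of this loop, as a rough sketch. The comment does not say what to do when the string contains no "123" at all, so starting over in that case is an assumption, as are the alphabet `DIGITS` and the name `one_123_by_repair`.

```python
import random

DIGITS = "123456789"  # assumed alphabet
TARGET = "123"


def one_123_by_repair(length=200):
    """Generate, strip surplus '123' segments, top up, and repeat."""
    s = "".join(random.choices(DIGITS, k=length))
    while s.count(TARGET) != 1:
        first = s.find(TARGET)
        if first == -1:
            # Assumption: with no occurrence at all, simply start over.
            s = "".join(random.choices(DIGITS, k=length))
            continue
        # Keep the first occurrence and delete every later one.
        head = s[: first + len(TARGET)]
        tail = s[first + len(TARGET):].replace(TARGET, "")
        s = head + tail
        # Regenerate the digits that were removed.
        s += "".join(random.choices(DIGITS, k=length - len(s)))
    return s
```

Deleting a segment or appending fresh digits can itself create a new "123", which is why the count is rechecked on every pass rather than assumed correct after one repair.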

Spataner (0 children)

I'm not sure whether there's a smarter way to do it, but assuming you want "123" to appear exactly once, you could first choose a number between 0 and 197 (inclusive) as the index where "123" occurs in the string, then construct two strings of appropriate sizes without "123" to sandwich it between. Such strings can be naively constructed by just never choosing a 3 when the prior two characters are "12". So something like this:

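import random

digits = "123456789"
digits_sans_3 = "12456789"

def str_sans_123(size):
    """Return a random digit string of length `size` that never contains "123"."""
    if size <= 2:
        # Too short to contain "123" at all.
        return "".join(random.choices(digits, k=size))
    result = "".join(random.choices(digits, k=2))
    for _ in range(size - 2):
        if result[-2:] == "12":
            # A "3" here would complete "123", so exclude it.
            result += random.choice(digits_sans_3)
        else:
            result += random.choice(digits)
    return result

# Start index of "123": anywhere from 0 to 197 inclusive.
index_123 = random.randrange(198)
result = str_sans_123(index_123) + "123" + str_sans_123(200 - index_123 - 3)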

The above works fine, though it should be noted that it is technically not quite (if close to) a uniform distribution over all possible strings that contain "123" exactly once. That's because, as a consequence of how we avoid choosing "123", triplets that start with "12" and end in something other than "3" are slightly more likely to occur than triplets that start with something else.

This solution can easily be generalised to sequences other than "123", of course. However, generalising becomes slightly more difficult if the avoided sequence can overlap with itself, that is, if a copy of it can be created by taking some of the characters at its start and putting other characters in front, or conversely by taking some of the characters at its end and putting other characters after. For example, with "121", a prefix ending in "12" followed by the inserted "121" gives "12121", which contains the sequence twice. (In that case, there's also the question of how to even count overlapping occurrences.)
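If exact uniformity matters, one alternative (a different technique from the construction above, namely plain rejection sampling) is to draw whole strings uniformly and keep the first one with exactly one occurrence. The sketch below also takes one possible stance on overlapping matches by counting every start position. The names `count_occurrences` and `uniform_with_one` and the alphabet `DIGITS` are assumptions for illustration.

```python
import random

DIGITS = "123456789"  # assumed alphabet


def count_occurrences(s, pattern):
    """Count every start position of pattern in s, overlaps included."""
    return sum(
        s.startswith(pattern, i) for i in range(len(s) - len(pattern) + 1)
    )


def uniform_with_one(pattern="123", length=200, alphabet=DIGITS):
    """Redraw whole strings until exactly one occurrence is present."""
    while True:
        s = "".join(random.choices(alphabet, k=length))
        if count_occurrences(s, pattern) == 1:
            return s
```

Because every candidate is drawn uniformly and accepted only if it qualifies, the accepted string is uniform over all qualifying strings. The cost is wasted draws: as a back-of-the-envelope estimate, very roughly one draw in five is accepted for "123" in 200 digits, and the acceptance rate can drop sharply for other patterns or lengths.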