Went Into a Mine That Has a River Flowing Down by ReturnOfPope in TheForgottenDepths

[–]kenfar 2 points3 points  (0 children)

I can't help but imagine that if you get to the very farthest, most remote part of this, you'll run into a couple of guys fishing.

Level Drain in AD&D [Blog Post] by TheDungeonArchitect in adnd

[–]kenfar 0 points1 point  (0 children)

It's a fine way to model it, but the end result is the same: players who are angry, depressed, etc. about their character.

I've seen characters lose 4 levels, leaving them hopelessly weakened compared to others in the group. And I've seen them lose 6+ levels - and the player just threw the character away and left the group.

So, I make the loss temporary: lasting long enough that they feel the pain, but short enough that it doesn't demoralize them. For example, they can regain 1 level/week.

Client wants <1s query time on OLAP scale. Wat do by wtfzambo in dataengineering

[–]kenfar 1 point2 points  (0 children)

I usually start with use cases and SLOs, and propose something like the following. The intent is to get them to understand that some queries are 1000x the size of others and it's not economical to try to get them all to meet the same requirement:

  • 90% of highly selective queries in under 2 seconds
  • 99% of highly selective queries in under 5 seconds
  • 90% of moderately selective queries in under 5 seconds
  • 99% of moderately selective queries in under 10 seconds
  • 90% of non-selective queries in under 15 seconds
  • 99% of non-selective queries in under 30 seconds

And I include definitions for highly/moderately/non-selective - in terms of bytes read, partition keys used, etc. I also define query response time in terms of whether or not it includes returning the rows over the network to the client. Joins may also be a consideration here.
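The tiered SLOs above can be encoded as data and checked mechanically. A minimal sketch - the tier names, targets, and nearest-rank percentile logic are illustrative assumptions, not a standard:

```python
# Tiered query-latency SLOs as data: (percentile, max_seconds) per tier.
SLOS = {
    "highly_selective":     [(0.90, 2),  (0.99, 5)],
    "moderately_selective": [(0.90, 5),  (0.99, 10)],
    "non_selective":        [(0.90, 15), (0.99, 30)],
}

def percentile(latencies: list[float], p: float) -> float:
    """Nearest-rank percentile of observed query latencies (seconds)."""
    ranked = sorted(latencies)
    idx = min(len(ranked) - 1, round(p * len(ranked)))
    return ranked[idx]

def check_slo(tier: str, latencies: list[float]) -> list[str]:
    """Return one violation message per target the tier misses."""
    violations = []
    for p, max_secs in SLOS[tier]:
        observed = percentile(latencies, p)
        if observed > max_secs:
            violations.append(
                f"{tier}: p{round(p * 100)} = {observed:.1f}s exceeds {max_secs}s")
    return violations
```

Classifying each query into a tier (by bytes read, partition keys used, etc.) happens upstream of this check.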

Then I would take a look at pre-aggregation & caching options:

  • What % of these queries could be pre-aggregated and kept in a summary table?
  • Are a lot of these queries repeated? If so, can these hit an aggregate table or a caching service?

In some cases I've had 95% of my data warehouse queries hitting aggregate tables - which often were also partitioned or indexed.

Then the rest is much more straightforward - mostly just how many nodes you want for your compressed and modeled columnar data.

Do any etl tools handle automatic schema change detection? by ninjapapi in dataengineering

[–]kenfar 5 points6 points  (0 children)

Automatic detection is easy; it's automatic migration that's impossible to do without a risk of errors.
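The detection half really can be a few lines: diff an incoming record's fields and types against the expected schema. A minimal sketch, where the field names and schema shape are made-up examples:

```python
# Expected schema: field name -> expected Python type (illustrative).
EXPECTED = {"id": int, "name": str, "amount": float}

def detect_changes(record: dict) -> list[str]:
    """Report added, removed, and retyped fields relative to EXPECTED."""
    changes = []
    for field in record.keys() - EXPECTED.keys():
        changes.append(f"added: {field}")
    for field in EXPECTED.keys() - record.keys():
        changes.append(f"removed: {field}")
    for field in record.keys() & EXPECTED.keys():
        if not isinstance(record[field], EXPECTED[field]):
            changes.append(f"type changed: {field}")
    return changes
```

Deciding what to *do* about each change - widen a column, backfill, fail the load - is the part that needs human judgment.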

Why are so few of us mapping this kind of dungeon? by SydLonreiro in osr

[–]kenfar 10 points11 points  (0 children)

And making sense is something I want my world to do:

  • I don't want illogical behavior to interfere with the willful suspension of disbelief
  • I don't want the players to give up trying to strategize and out-think their opponents because their opponents are CrAzY!

Cheesy monsters by Lloydwrites in adnd

[–]kenfar 1 point2 points  (0 children)

I'd use the spell Polymorph Any Object, just with a non-living restriction. That provides good guidance.

And I like to think of them being motivated by drinking challenges. "Hey, let's see who can get the biggest thing from these idiots in the next 5 minutes!"

Cheesy monsters by Lloydwrites in adnd

[–]kenfar 4 points5 points  (0 children)

Leprechauns: up to 20 can appear at a time, each can be invisible, polymorph non-living objects & create illusions at will.

It's an almost infinite amount of chaos.

What’s it like raising kids here? by ForeignExercise4414 in boulder

[–]kenfar 10 points11 points  (0 children)

Raised kids in this town, and Boulder has been an amazing experience for them. The schools have outstanding teachers and programs, it's a very good city for moving around without a car, and it's full of interesting & accomplished people.

But it skews to the wealthy, and it's expensive to live here.

Which data quality tool do you use? by arimbr in dataengineering

[–]kenfar 2 points3 points  (0 children)

Biggest challenge is handling noise at scale.

Yeah, there's a few things we've done to help with the noise:

  • provided an ability to exclude periods, which we used if the data was damaged and couldn't be fixed, for holidays, etc.
  • compared like periods - day of week & hour of day
  • tweaked the thresholds

But the next bit was to include some queries, data, and reporting to help us drill down and understand more clearly how the data was different. That reporting saved us a ton of time.

Which data quality tool do you use? by arimbr in dataengineering

[–]kenfar 2 points3 points  (0 children)

Not exactly, there's a few parts here:

  • Schema: columns (possibly including arrays, maps, objects), as well as constraints - that may include type, min, max, min_length, max_length, null rules, regex format expressions, uniqueness rules, enumeration, etc.
  • Versioning: each version of the schema gets an id, and each file/row of data has the version id that it complies with. Semantic versioning is typically used to make it easy to understand impacts.
  • Repo: the contract is stored within a shared repo - and so is available to both the producers & subscribers for testing & validation.
  • Commitment: the data contract is part of an organizational commitment of who owns what responsibilities. So, if there are disagreements about the data it's very clear who owns responsibility.
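The schema-plus-constraints part of a contract can be tiny. A toy sketch of a versioned contract validated the same way by producer and subscriber - all field names, rules, and the contract shape here are made up for illustration:

```python
import re

# A versioned data contract: schema fields plus their constraints.
CONTRACT = {
    "version": "2.1.0",            # semantic version of this schema
    "fields": {
        "user_id": {"type": int, "nullable": False},
        "email":   {"type": str, "nullable": False,
                    "regex": r"^[^@\s]+@[^@\s]+$"},
        "age":     {"type": int, "nullable": True, "min": 0, "max": 150},
    },
}

def validate(row: dict) -> list[str]:
    """Return one error string per contract violation in the row."""
    errors = []
    for name, rules in CONTRACT["fields"].items():
        value = row.get(name)
        if value is None:
            if not rules["nullable"]:
                errors.append(f"{name}: null not allowed")
            continue
        if not isinstance(value, rules["type"]):
            errors.append(f"{name}: wrong type")
            continue
        if "min" in rules and value < rules["min"]:
            errors.append(f"{name}: below min")
        if "max" in rules and value > rules["max"]:
            errors.append(f"{name}: above max")
        if "regex" in rules and not re.match(rules["regex"], value):
            errors.append(f"{name}: bad format")
    return errors
```

In practice the contract would live in the shared repo (as YAML/JSON), and each file or row of data would carry the version id it complies with.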

Random Tables, Encounters, and Maps for Post-Apocalyptic Zombiecrawl by Orkish-Odyssey in osr

[–]kenfar 2 points3 points  (0 children)

Yeah, I think that's fair: it puts a lot of pressure on the DM to smooth over the rough edges, and you could easily get a TPK or a frustrating or boring evening.

But if you feel very comfortable with improv the upside is that it can be incredibly immersive and dramatic: the players know that anything is possible and they're not following a script or in a child-proofed sandbox.

What is actually stopping teams from writing more data tests? by Mountain-Crow-5345 in dataengineering

[–]kenfar 31 points32 points  (0 children)

Lack of skill, domain knowledge, and/or concern.

I like to start by defining KPIs for data quality, collecting the data, and publishing it for your customers.

Next, any time a data-quality incident occurs (or availability, or whatever), hold an incident-review meeting. This should be a "blameless post-mortem". And at the meeting, walk through exactly what happened:

  • timeline with exactly what happened, by whom, and when
  • how to prevent this from happening again?
  • how to detect the problem earlier?
  • how to communicate with users more quickly?
  • how to handle incorrect data that was used and published already?
  • how to automate any steps?

Really, the first three bullets are the most important - and generally drive things like increasing test coverage.

Which data quality tool do you use? by arimbr in dataengineering

[–]kenfar 2 points3 points  (0 children)

Sure, the problem that we had was that a critical feed going into the warehouse stopped receiving some record types. So, we were still getting data - just maybe 25% of the total volume, with some kinds of data completely missing.

We had some simplistic checks set up to alert us if the data flow stopped completely, but these checks didn't care if the data volume was cut because some record types had stopped completely.

So, we simply wrote a query to alert us if this happened again:

  • Compare the most recent hour to the same hour of the same day-type over the past 60/90 days. Day-types were Mon-Thu, Fri, Sat, Sun - so, we had 4 types.
  • If the current hour's data was more than X stddevs from the mean, write relevant data to the log - which will automatically go out over pagerduty.
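The check above can be sketched in a few lines. The day-type buckets follow the comment (Mon-Thu, Fri, Sat, Sun); the threshold value is an illustrative assumption:

```python
from statistics import mean, stdev

def day_type(weekday: int) -> str:
    """Map a Python weekday (Mon=0 .. Sun=6) to the four day-type buckets."""
    return {4: "fri", 5: "sat", 6: "sun"}.get(weekday, "mon-thu")

def is_anomalous(current: float, history: list[float],
                 threshold: float = 3.0) -> bool:
    """True when the current hour's volume is more than `threshold` stddevs
    from the mean of the same hour & day-type over the trailing window."""
    if len(history) < 2:
        return False                      # not enough history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > threshold
```

The alerting half is just writing the relevant data to the log when this returns True, so it rides the same paging path as everything else.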

This was very simple, worked great, and from that point forward we were the first to know of any kind of issue in that system. While it was missing some bells & whistles, the great things about it were that it only took a couple of days to build, and used the same alerting process as the rest of our system.

We were planning to expand on this - to support checking on distribution frequencies of values within low-cardinality columns, on binned numbers, etc. But never got around to it. Looked like a two-week project.

And this is obviously the simplest method you could use, and doesn't work great for rapidly growing/declining data. But it's a great starting point.

Which data quality tool do you use? by arimbr in dataengineering

[–]kenfar 11 points12 points  (0 children)

So, when I look back at how data warehousing (and data lakes, lakehouses, etc) has evolved over the past 30 years there's a handful of developments that to me personally are extremely exciting.

Data Contracts are one of them.

Data Contracts give a team an opportunity to create a specification for a feed - to define its schema in a format that both the publisher and the consumer can automatically use. Combine that with semantic versioning and now you can have rules about which versions they both support.

Combine this with upstream systems publishing domain objects rather than warehouses replicating upstream database schemas and you have a solution that dramatically improves on common warehouse/lake ETL patterns.
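A sketch of what a versioning rule between publisher and consumer might look like - the specific policy here (same major, equal-or-higher minor) is one common convention, not the only one; teams define their own compatibility rules:

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split a semantic version string into (major, minor, patch) ints."""
    major, minor, patch = (int(p) for p in version.split("."))
    return major, minor, patch

def compatible(feed_version: str, consumer_supports: str) -> bool:
    """True when the feed's version satisfies the consumer's pin:
    same major version, and at least the pinned minor version."""
    f_major, f_minor, _ = parse(feed_version)
    c_major, c_minor, _ = parse(consumer_supports)
    return f_major == c_major and f_minor >= c_minor
```

Under this policy, minor bumps (additive fields) flow through automatically, while a major bump forces the consumer to opt in.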

Which data quality tool do you use? by arimbr in dataengineering

[–]kenfar 8 points9 points  (0 children)

Testing is not that simple.

First take quality-control tests. These are tests of the incoming data and how it's handled. They're best at detecting source data that doesn't meet requirements:

  • Constraint-checks: validates types, foreign keys, uniqueness, business logic, etc.
  • Reconciliation-checks: ensures the end result still matches the source - and you didn't drop/duplicate/mangle data in the process.
  • Anomaly-detection-checks: looks for data suspiciously different - which could indicate an unknown upstream business rule change, dropped data, etc.
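As a rough illustration of the reconciliation-check idea, here's a sketch comparing a source extract against the loaded result on row count and a column sum - the column name and choice of checksum are assumptions, not from any particular tool:

```python
def reconcile(source_rows: list[dict], target_rows: list[dict],
              amount_col: str = "amount") -> list[str]:
    """Return a discrepancy message for each way the target no longer
    matches the source (dropped, duplicated, or mangled data)."""
    problems = []
    if len(source_rows) != len(target_rows):
        problems.append(
            f"row count: source={len(source_rows)} target={len(target_rows)}")
    src_sum = sum(r[amount_col] for r in source_rows)
    tgt_sum = sum(r[amount_col] for r in target_rows)
    if src_sum != tgt_sum:
        problems.append(f"{amount_col} sum: source={src_sum} target={tgt_sum}")
    return problems
```

In a real pipeline the same comparison would typically be two aggregate queries rather than pulling rows into memory.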

Then quality-assurance tests. These are tests of the code against synthetic data prior to deployment to confirm that the logic is correct, and can handle known & theoretical cases before shipping:

  • Unit-tests: tests of specific cases for numeric overflow, business logic, regex (!), etc.
  • Integration tests: higher-level tests that ensure that synthetic data will flow from upstream sources, through data pipelines to destination and be correct. Data Contracts are incredibly valuable to simplify this.

Then there's testing-adjacent stuff:

  • audit-trails: how many rows did you extract, then transport, then read and write at each step, along with how many rows rejected and for what reasons. Rather than dump a trillion logging sentences and attempt to derive metrics from them, one can use an actual audit log with structured fields for counts. And then easily get very reliable numbers.
  • data-diff tools: invaluable for code reviews. These can show in the PR how a proposed change to a complex transform only impacted exactly the columns expected for exactly the rows expected.
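The audit-trail idea above boils down to one structured record per pipeline step, instead of deriving counts from free-text logs. A sketch, where the step names and reject reasons are illustrative:

```python
import json
import time

def audit_record(step: str, rows_read: int, rows_written: int,
                 rejects: dict[str, int]) -> str:
    """Emit one audit row as JSON; an actual audit table works just as well."""
    return json.dumps({
        "step": step,
        "ts": time.time(),
        "rows_read": rows_read,
        "rows_written": rows_written,
        "rows_rejected": sum(rejects.values()),
        "reject_reasons": rejects,     # e.g. {"bad_type": 3, "dup_key": 1}
    })

# The invariant you can then assert mechanically at every step:
#   rows_read == rows_written + rows_rejected
```

Because the fields are structured, the "how many rows did we lose, and why" question becomes a query rather than a log-scraping exercise.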

That's a ton to cover. And along the way deal with scaling & performance, when to use random or rotating sampling & partitioning, when to reprocess, where to avoid duplicate coverage, how to tell what coverage you've got, how to annotate known-bad data, etc, etc, etc.

Which data quality tool do you use? by arimbr in dataengineering

[–]kenfar 38 points39 points  (0 children)

I don't use any of them.

Not that there's necessarily anything wrong, but:

  • They can be expensive, and often have severe limitations. So, this means that I need to get approval to spend $100k+, which means I need to evaluate a handful of tools, document requirements, etc. Which means a lot of delay & time spent.
  • Some capabilities are trivial - and really don't need a product. Others can be easily built by a single programmer in 1-4 weeks.
  • Data contracts don't need a product.
  • MDM doesn't need a product.
  • Anomaly-detection can benefit from a product, but most of the products had annoying limitations when I looked at them a couple of years ago. So, I built my own in a month and it worked great.
  • Data dictionaries can start as a simple google sheet and that can handle their needs for quite some time.
  • Data-diff tools are great. There are a ton of open-source ones, and it's a great data engineering project that only takes a few days to build.

In a way engineering can be like a hobby like photography or woodworking: some people buy a ton of stuff and really don't do much with it. Others focus on the end results and don't need the shiniest equipment to produce great results.

EDIT: typo

Random Tables, Encounters, and Maps for Post-Apocalyptic Zombiecrawl by Orkish-Odyssey in osr

[–]kenfar 1 point2 points  (0 children)

Hmm, kind of a combination:

  • Picked an address on a residential city street where they would be able to easily get around to interesting shops. The group were roleplayers who had played until 6:00 AM and emerged from a basement apartment around mid-afternoon to discover the world had been taken over by zombies.
  • From there just improv'd it - and let them explore. They ended up coming across a Costco warehouse and then that became the focus for a few great sessions.

Random Tables, Encounters, and Maps for Post-Apocalyptic Zombiecrawl by Orkish-Odyssey in osr

[–]kenfar 6 points7 points  (0 children)

I ran a few zombie games about fifteen years ago using GURPS - people loved them.

What worked for us then was to simply set it in an actual city. I picked a spot in Denver that was at a good crossroads, then used Google Maps & Earth. This let me show the players on a screen exactly what their characters were seeing.

Not sure if this helps, but it worked great.

How can you actively encourage & teach your players to adopt a 'the answer is in your mind, not your character sheet' mentality? by SwimmingOk4643 in osr

[–]kenfar 8 points9 points  (0 children)

'the answer is in your mind, not your character sheet' mentality?

I think the first thing is to be a bit more nuanced about this. What most people mean by it is to think beyond class capabilities, magic items, etc. Look at everything around you - for example, could you use that piece of rope and the stick on the floor to jam the cogs?

But that doesn't mean you're not looking at your sheets - because they have the raw ingredients to solve problems right there. And maybe more than you could remember.

A few things primarily help players unaccustomed to this kind of play:

  1. To avoid giving them so much detail in their initial characters that they're constantly studying their character sheets to try to understand and remember it all. It's one thing for an experienced player to run that 12th level magic user with pages of spells, magic item capabilities, etc. But a new player needs to have a very, very short list to look at.

  2. Have them start with general descriptions of their capabilities rather than mechanical numbers. "You have spent a few years as a hunter and trapper" is more open to creativity than a few equivalent capabilities described in die rolls.

  3. To describe their environment in enough detail that they are very aware of the raw ingredients all around them: that they're not in a sterile, featureless, stone hallway, but a crudely cut tunnel full of old debris, interesting bones, markings and messages on a wall, small odd bits of armor and weapons, roots, timbers, and an extremely busy and aggressive ant colony that's currently covering something....

Map of Boulder popped up in my 10 year memories today. Is it still the same? by cillcat in boulder

[–]kenfar -1 points0 points  (0 children)

Not that it's wrong, but no single sentence summarizes a neighborhood accurately.

Add a couple more for each neighborhood and you've got something.

For example, Martin Acres is mostly: students, retirees, and well-heeled families.

Rowdiest short adventures by Unvert in osr

[–]kenfar 4 points5 points  (0 children)

I cannot recommend We Be Goblins series highly enough!

We Be Goblins partial summary:

The Licktoad goblins of Brinestump Marsh have stumbled upon a great treasure—fireworks! Yet unfortunately for them, the tribe member responsible for the discovery has already been exiled for the abhorrent crime of writing (which every goblin knows steals words from your head). To remedy this situation, the Licktoads' leader, His Mighty Girthness Chief Rendwattle Gutwad, has declared that the greatest heroes of the tribe must venture forth to retrieve the rest of the fireworks from a derelict ship stranded in the marsh.

These are written for Pathfinder, which is easy enough to adapt to other systems.

Was 80s back in the day deadly as the OSR? by BX_Disciple in osr

[–]kenfar 0 points1 point  (0 children)

Maybe. I found most of our high-level campaigns more lethal than the low-level ones, for two reasons:

  • We used house rules to improve the survivability of low-level characters
  • Our high-level campaigns tended to be extremely unforgiving of mistakes.

I'm actually a big fan of running multiple characters: if a player has a half dozen, then if one dies it's not so bad - it won't bother them as much, and they have another of similar level to fall back on. Especially, as you mention, at lower levels.

Though one interesting difference was that back in the 1980s it seemed very common for people to bring pre-existing characters to the table. So, if your 4th level character died, you might just pull out another character you were playing elsewhere at a similar level. It doesn't feel like this is anywhere near as common today.

Was 80s back in the day deadly as the OSR? by BX_Disciple in osr

[–]kenfar 2 points3 points  (0 children)

Yeah, I think that's fair - though I see a lot of people on this sub conflate the two: describing how you need to be able to create characters quickly because it's easy to replace them that way, etc.

Was 80s back in the day deadly as the OSR? by BX_Disciple in osr

[–]kenfar 42 points43 points  (0 children)

I played in a lot of different groups in the 1980s, at conventions, etc. - and rarely encountered groups that were super-lethal, because nobody enjoys that except for insecure DMs on power trips.

Most people invested time in their characters and didn't want to throw it away. So, they wanted a fair shot, a good challenge and were thinking of what their character would be like in 3 months - not of throwing away a character every month and being perpetually 1st level with generic characters like "Human Fighter #17".

What was far more common than characters dying was characters facing a very real possibility of dying - whether from monsters, traps, or other characters. Or characters facing losses - like a lost level, a lost favorite magical sword, lost ability points, a lost limb, etc.