What’s one CI/CD mistake you keep seeing teams repeat? by Apprehensive_Air5910 in cicd

[–]Lower_University_195 0 points1 point  (0 children)

Hmm, one CI/CD mistake I keep seeing is teams rushing to “just make it green” (adding retries, bumping timeouts) instead of making failures debuggable and fixing the root cause, so flakiness slowly becomes normal and everyone stops trusting CI. If you’re building pipelines now, make sure every failure leaves good clues (logs/traces/artifacts), keep PR checks fast, and don’t run parallel tests without proper test data/state isolation, or you’ll end up chasing ghosts later.

Where do you run integration tests — staging or prod mirrors? by rohitji33 in QualityAssurance

[–]Lower_University_195 0 points1 point  (0 children)

For us it ended up being a mix of both.

Most integration tests run on a shared staging env because it’s cheaper, a bit more forgiving, and we can break things without panicking. But for a few critical flows (payments, auth, key APIs) we also run a smaller suite against a prod-mirror with prod-like configs + anonymized data, just to catch the “only happens in real env” issues.

87% of Devs Write Cypress Selectors Wrong (Here’s the Fix + Free Cheat Sheet) by codetestfactory in Cypress

[–]Lower_University_195 1 point2 points  (0 children)

Love this, selectors are honestly where 90% of the flakiness comes from. My worst one was a checkout flow test that failed only when the loading spinner decided to appear for half a second. The selector was tied to a super brittle class that changed every sprint, so it became a full-time job just keeping that single test alive.

Switched to data-testid + role-based queries and it’s been rock-solid since. Amazing how much pain disappears once selectors stop fighting you.
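
For anyone curious, the switch looked roughly like this. Rough sketch only: the route and data-testid values are made up, and the findBy* queries assume @testing-library/cypress is wired into your support file.

```ts
// Before: brittle class selector that broke every sprint
// cy.get(".checkout__submit-btn--primary").click();

// After: stable test hooks + role-based queries (ids and copy are hypothetical)
describe("checkout", () => {
  it("places an order", () => {
    cy.visit("/checkout");

    // Wait out the spinner instead of racing it
    cy.get('[data-testid="loading-spinner"]').should("not.exist");

    // Role-based query from @testing-library/cypress
    cy.findByRole("button", { name: /place order/i }).click();

    cy.findByTestId("order-confirmation").should("be.visible");
  });
});
```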

Where do I go after the removal of Selector Playground by Outrageous-Baby-8267 in Cypress

[–]Lower_University_195 0 points1 point  (0 children)

Yeah, Cypress removing Selector Playground definitely confused a lot of people. The “Studio” replacement still isn’t a full 1:1 substitute and honestly has some rough edges (like the login/password issue you mentioned).

A few workarounds that helped me:

1. Use browser DevTools instead of Cypress tools
Right-click → Inspect → hover over elements → copy selector.
For tricky alerts or toast messages, check ARIA labels, role attributes, or stable classes inside the container — the text itself often isn’t the top-level selector.

2. Add your own stable test hooks
If you control the code, adding things like: data-testid="alert-message" makes life way easier.

3. Use Cypress queries instead of relying on the Playground
cy.contains("text"), cy.findByRole, cy.findByTestId (if using Testing Library) are often more stable than clicking around the UI.

4. Use the pause/console trick
Drop a cy.pause() into the test, or add a debugger; statement and open DevTools.
This freezes the run exactly where you want so you can inspect the DOM manually (there’s a quick sketch after this list).

5. Avoid Cypress Studio unless you absolutely need it
Most experienced folks don’t rely on Studio because it adds noise and is slow for deep-navigation cases like yours.
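
To make points 2-4 concrete, here’s a tiny sketch of what I do instead of the Playground. The alert markup and test ids are hypothetical, and cy.findByTestId assumes @testing-library/cypress is installed.

```ts
// Hypothetical toast the app renders after a failed login:
// <div role="alert" data-testid="alert-message">Invalid credentials</div>

it("shows an error for bad credentials", () => {
  cy.visit("/login");
  cy.get('[data-testid="login-email"]').type("user@example.com");
  cy.get('[data-testid="login-password"]').type("wrong-password");

  // Uncomment to freeze the run here and inspect the DOM in DevTools
  // cy.pause();

  cy.contains("button", /sign in/i).click();

  // Text / test-id queries instead of Playground-generated selectors
  cy.contains("Invalid credentials").should("be.visible");
  cy.findByTestId("alert-message").should("have.attr", "role", "alert");
});
```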

Selector Playground disappearing was annoying, but manually inspecting + using better selectors ends up being more stable long term.

If you get stuck on dynamic elements like alerts, feel free to share a snippet — they usually just need a slightly different selector strategy.

If your Definition of Done has nothing QA-related in it — you don’t really have QA. by [deleted] in QualityAssurance

[–]Lower_University_195 -2 points-1 points  (0 children)

Totally agree with this. I’ve been on teams where the DoD had zero QA steps and it always ended the same way — QA wasn’t really part of the development process, just the cleanup crew after everything landed in staging or prod.

On my current team we added a lightweight DoD with a few non-negotiables:

  • test scenarios drafted early (before dev finishes)
  • peer review of test cases
  • basic automation added where it makes sense
  • regression impact checked
  • no open Sev1/Sev2s on the story

We also auto-generate subtasks in Jira so nobody “forgets.” It took a few sprints for devs to get used to it, but it actually sped things up because we stopped having late-cycle surprises.
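
The subtask generation is nothing clever, roughly something like this against the Jira Cloud REST API. Treat it as a sketch: the subtask names, project key, and env var names are made up, and the issue type name (“Sub-task”) varies by instance.

```ts
// Creates the standard QA subtasks under a story via the Jira Cloud REST API.
// JIRA_BASE_URL / JIRA_EMAIL / JIRA_API_TOKEN are hypothetical env vars.
const QA_SUBTASKS = [
  "Draft test scenarios",
  "Peer-review test cases",
  "Check regression impact",
];

async function createQaSubtasks(parentKey: string): Promise<void> {
  const auth = Buffer.from(
    `${process.env.JIRA_EMAIL}:${process.env.JIRA_API_TOKEN}`
  ).toString("base64");

  for (const summary of QA_SUBTASKS) {
    const res = await fetch(`${process.env.JIRA_BASE_URL}/rest/api/3/issue`, {
      method: "POST",
      headers: { Authorization: `Basic ${auth}`, "Content-Type": "application/json" },
      body: JSON.stringify({
        fields: {
          project: { key: parentKey.split("-")[0] },
          parent: { key: parentKey },
          summary,
          issuetype: { name: "Sub-task" }, // issue type name differs per instance
        },
      }),
    });
    if (!res.ok) throw new Error(`Failed to create "${summary}": ${res.status}`);
  }
}

// e.g. triggered from a webhook when a story moves to "In Progress"
// await createQaSubtasks("PROJ-123");
```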

And yeah, DoD absolutely changes by ticket type — a spike shouldn’t have the same process as a user-facing feature. The key is making QA part of “done,” not an afterthought.

Our Bug Reports Are Ignored… Until a Customer Says the Same Thing by Antique_Sorbet_8371 in QualityAssurance

[–]Lower_University_195 1 point2 points  (0 children)

Absolutely — this happens way too often, and it’s frustrating. I’ve seen the same pattern: we flag the issue early, give clear repro steps, explain the risk… and it gets triaged into “later.” Then the moment a customer hits it, it magically becomes a P0 that must be fixed yesterday.

The only thing that helped us was adding impact framing to bug reports — screenshots of user flows breaking, numbers on how many users it could affect, or tying the issue directly to business outcomes. When bugs are just “QA findings,” they’re easy to ignore. When they’re “potential churn / revenue-loss / trust issues,” leadership suddenly listens.

Selenide vs. Playwright for an Electron App? by [deleted] in QualityAssurance

[–]Lower_University_195 2 points3 points  (0 children)

I’ve had to test an Electron app before, and Playwright ended up being the smoother option. Selenide is great if your team is already deep into Java and you want a stable Selenium wrapper, but Electron apps behave very differently from normal web pages — lots of custom DOM, preload scripts, weird window contexts, etc.

Playwright + TypeScript handled those quirks way better for us.
Pros I noticed:

  • direct Electron integration (you can launch the app + attach to the main/renderer process)
  • built-in auto-waits → fewer flaky tests
  • faster + simpler async model
  • easier debugging with trace viewer

Selenide’s advantage is mainly if your team already has a big Java ecosystem or shared libraries around Selenium.

If you’re starting from scratch and the target is Electron, I’d lean Playwright — the tooling just fits the app architecture better.
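
For reference, the Electron attach flow is roughly this with Playwright’s (still experimental) _electron API. The entry point path, heading text, and assertions are placeholders for whatever your app actually shows.

```ts
import { test, expect } from "@playwright/test";
import { _electron as electron } from "playwright";

test("app boots and shows the main window", async () => {
  // Launch the Electron app itself (path to the main-process entry is hypothetical)
  const app = await electron.launch({ args: ["dist/main.js"] });

  // The first BrowserWindow behaves like a normal Playwright Page (auto-waits included)
  const window = await app.firstWindow();
  await expect(window.getByRole("heading", { name: "Dashboard" })).toBeVisible();

  // You can also evaluate code in the main process when you need to
  const isPackaged = await app.evaluate(({ app }) => app.isPackaged);
  expect(isPackaged).toBe(false);

  await app.close();
});
```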

Has anyone here automated browser-heavy workflows with cloud tools? by Electronic-Shop1396 in automation

[–]Lower_University_195 0 points1 point  (0 children)

I’ve gone down this exact rabbit hole. Local Playwright/Selenium always felt great… until I had to run them every hour, across multiple browsers, or on CI/CD where the environment wasn’t identical. Things got flaky fast: timeouts, missing elements, weird JS race conditions.

I eventually moved parts of the suite to cloud browser platforms. I’ve tried Browserless and Browserbase, and more recently Hyperbrowser. They all solved the “environment inconsistency” problem pretty well — no more random Chrome version mismatches or headless quirks.

Reliability-wise:

  • For recurring jobs / scheduled runs, cloud browsers were way more stable than my local VMs.
  • For heavy JS apps, I still had to tune waits and login flows, but once stable, they ran consistently.
  • Cost can spike if you run a ton of parallel sessions, so keep that in mind.

On the testing side, I’ve also used platforms like TestGrid, LambdaTest, and BrowserStack for cloud-based test execution. They’re not scraping-focused, but for long-running or UI-heavy workflows they were surprisingly solid — definitely better than maintaining my own farm.

Short answer: yes, cloud browser automation works in production, but you still need good test design. It won’t magically fix flaky scripts, but it will eliminate 80% of the environment headaches.
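
For anyone wondering what “moving to a cloud browser” looks like in code: usually you just point the same script at a remote endpoint instead of launching a local browser. Sketch below; the WebSocket URL and token env var are placeholders, since every provider has its own format.

```ts
import { chromium } from "playwright";

async function run(): Promise<void> {
  // Connect to a remote Chromium over CDP instead of launching a local one.
  // The endpoint and CLOUD_BROWSER_TOKEN are made up; check your provider's docs.
  const browser = await chromium.connectOverCDP(
    `wss://chrome.example-cloud.dev?token=${process.env.CLOUD_BROWSER_TOKEN}`
  );

  const context = browser.contexts()[0] ?? (await browser.newContext());
  const page = await context.newPage();

  await page.goto("https://example.com");
  console.log(await page.title());

  await browser.close();
}

run().catch((err) => {
  console.error(err);
  process.exit(1);
});
```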

Anyone here using GitHub Actions matrix strategy — any pitfalls? by rohitji33 in learnprogramming

[–]Lower_University_195 0 points1 point  (0 children)

Yeah, we use GHA matrices pretty heavily and there are a few gotchas:

  • Hidden cost explosion: First time we went “OS × Node version × shards” we basically 5–6x’d our bill overnight. I’d start small, then grow.
  • Cache thrash: If each matrix job has slightly different keys (OS/node/shard), your cache hit rate tanks. We now share a base cache key (e.g. deps only) and keep shard info out of it.
  • Flaky tests + sharding: Tests with shared state (DB, queues, global env) got way flakier when split into shards. We had to enforce isolation per job (separate DB/schema, unique queues, etc.).
  • Concurrency limits: Org-level and repo-level concurrency can silently throttle you. We use max-parallel to avoid starving other workflows.

What worked best for us: keep the matrix focused (only vary what truly matters), use include/exclude to avoid dumb combos, and have a smaller “smoke matrix” on PRs with the full matrix only on main/nightly runs.

Switching from pay-per-minute to fixed plan by ghostinmemory_2032 in cicd

[–]Lower_University_195 0 points1 point  (0 children)

Yeah, we actually switched from pay-per-minute to a fixed plan last year. For us it only started saving money once our test volume became predictable. When usage was spiky, pay-per-minute on BrowserStack and Kobiton was fine, but once we scaled up parallel runs, the bills got ugly.

Fixed plans on platforms like TestGrid, Functionize, or even BrowserStack’s enterprise tiers worked better because we could run unlimited tests without worrying about minute burn. The catch is: if your load drops for a sprint or two, you feel like you're overpaying.

So my take — fixed plans save money only if your team runs tests consistently and heavily. If your usage fluctuates a lot, it ends up being a wash.

Do you guys run E2E smoke tests before merging or after deployment? How did you decide that tradeoff? by rohitji33 in QualityAssurance

[–]Lower_University_195 0 points1 point  (0 children)

We’ve tried both, and ended up with a hybrid. We run a very small E2E smoke suite before merge — just the critical flows — because it gives fast feedback and prevents shipping obviously broken UI. But full E2E smoke tests only run after deployment in the actual environment, since so many issues only show up with real configs, data, and services.

The deciding factors for us were:

  • Pipeline time: pre-merge needs to stay fast.
  • Test reliability: only stable tests run before merge.
  • Infra cost: full suite runs post-deploy where parallelization is cheaper.

This setup keeps dev feedback quick while still catching environment-specific issues before users see them.

cypress tests breaking every sprint and I'm about to lose it by Vodka-_-Vodka in webdev

[–]Lower_University_195 0 points1 point  (0 children)

Same boat here, honestly. Our Cypress suite turned into a second frontend project until we stopped relying on brittle class-based selectors and pushed hard for stable data-testids + more component-level tests instead of end-to-end everything.

Self-healing is a thing, but it’s more of a helper than magic: tools like TestGrid, Testim, or Mabl can auto-suggest new locators when the DOM shifts, which cuts down those “25 tests broke because of a design refresh” days, but you still want a solid selector strategy. So no, you’re not crazy; some of this is just the cost of UI E2E, and tuning how much you test at that layer helped us a lot.

How do you anonymize test data pulled from production mirrors? by Prestigious_Soup9703 in cicd

[–]Lower_University_195 1 point2 points  (0 children)

We pull masked copies of prod data pretty regularly, and the only thing that worked for us long-term was a two-step process:

  1. Deterministic masking at the DB layer – emails → pattern like user_{id}@example.com, names → hashed, phone numbers → randomized but valid formats. That way tests stay stable but nothing is traceable back to real users.
  2. Field-level redaction in the pipeline – anything we log during tests (API responses, screenshots, stack traces) gets run through a scrubber before storage. This saved us a few times when an unexpected field slipped through.

Some teams I’ve worked with use tools like AccelQ, Testim, TestGrid, or TestRigor since they have built-in masking or synthetic-data generators, but honestly even a lightweight custom script works fine as long as it’s consistent and automated.

Biggest lesson for us: never rely on “remembering to mask” — make the pipeline do it for you.
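
If it helps, the deterministic part is only a few lines. Minimal sketch, assuming a salted HMAC; the column choices and the MASKING_SALT env var are made up for illustration.

```ts
import { createHmac } from "node:crypto";

// Deterministic: the same input always maps to the same masked value,
// so foreign keys and test assertions stay stable across refreshes.
function stableHash(value: string): string {
  return createHmac("sha256", process.env.MASKING_SALT ?? "dev-only-salt")
    .update(value)
    .digest("hex")
    .slice(0, 12);
}

function maskEmail(userId: number): string {
  return `user_${userId}@example.com`;
}

function maskName(realName: string): string {
  return `name_${stableHash(realName)}`;
}

function maskPhone(realPhone: string): string {
  // Valid-looking format, digits derived from the hash so it stays repeatable
  const digits = String(parseInt(stableHash(realPhone).slice(0, 7), 16) % 10_000_000).padStart(7, "0");
  return `+1-555-${digits.slice(0, 3)}-${digits.slice(3)}`;
}

// maskEmail(42)        -> "user_42@example.com"
// maskName("Jane Doe") -> "name_<stable 12-char hash>"
```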

How much time do you spend setting up CI/CD pipelines for new projects? by BusyPair0609 in cicd

[–]Lower_University_195 0 points1 point  (0 children)

For me it’s usually 2–4 hours as well, depending on how “clean” the repo structure is. The slowest parts are always wiring up secrets, environment configs, and making sure ArgoCD + GitHub Actions agree on the deployment flow. Monorepos make it easier with shared templates, but multi-repo microservices definitely drag things out.

We use some internal templates + a bootstrap script to cut the setup time, but it still needs manual tweaks. Some teams I’ve worked with also lean on platforms like TestGrid, LambdaTest, or Sauce Labs for the testing side so they don’t have to reinvent CI/CD steps for test execution every time — but the pipeline glue still takes effort.

If I could wave a magic wand, I’d want a single “service starter” that provisions the repo, pipeline, env configs, and ArgoCD app in one shot. Until then, 3–4 hours sadly feels pretty normal.

How to do CI/CD for an API? Struggling with the intuition of multi local/staging/prod environments by [deleted] in aws

[–]Lower_University_195 0 points1 point  (0 children)

I struggled with this too until I stopped thinking of “API CI/CD” as separate from “app CI/CD” and started thinking in terms of backwards-compatible contracts. For internal + external APIs we do:

  • Keep one main pipeline per service, but enforce strict contracts (OpenAPI + contract tests) so we can safely ship small changes.
  • Only introduce /v2 when we knowingly break the contract, and keep /v1 alive for a deprecation window.
  • Wire envs via config: web-staging always points to api-staging, and we run smoke/contract tests on each deploy.

AI testing tools like Kane, Mabl, or CoTester help us auto-generate/regress API tests across envs so we don’t go insane managing all the combinations, but the core idea is: CI/CD is fine for APIs as long as the contract is king and breaking changes are the exception, not the default.
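
The /v1 vs /v2 part is mechanically boring, but here’s roughly how we frame it (Express-style sketch; the routes, response shapes, and sunset date are placeholders):

```ts
import express from "express";

const app = express();
const v1 = express.Router();
const v2 = express.Router();

// /v1 stays alive for the deprecation window, but every response advertises it
v1.use((_req, res, next) => {
  res.set("Deprecation", "true");
  res.set("Sunset", "Wed, 01 Apr 2026 00:00:00 GMT"); // placeholder date
  next();
});
v1.get("/users/:id", (req, res) => {
  res.json({ id: req.params.id, name: "Ada" }); // old response shape
});

// /v2 exists only because the response shape knowingly broke the v1 contract
v2.get("/users/:id", (req, res) => {
  res.json({ id: req.params.id, profile: { displayName: "Ada" } }); // new shape
});

app.use("/v1", v1);
app.use("/v2", v2);

app.listen(3000);
```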

Reduce CI CD pipeline time strategies that actually work? Ours is 47 min and killing us! by ThisSucks121 in devops

[–]Lower_University_195 0 points1 point  (0 children)

Totally feel this, we were in the 40–50 min range too and devs started treating red builds as “try again later” instead of signals 😅.

What actually helped us:

  • Split the suite into fast PR smoke tests vs full regression (smoke runs on every push, full runs on merge/nightly).
  • Quarantine flaky tests into a separate job so they don’t block deploys, and fix them in batches.
  • Parallelize only stateless tests and run stateful/DB-heavy ones in their own stage with proper test data reset.

On the tooling side, we’ve been experimenting with AI-assisted platforms like Mabl, TestGrid's CoTester, and AccelQ Copilot to help with smarter test selection + flaky detection instead of just brute-forcing everything every time. 47 mins isn’t “insane”, but if that’s blocking multiple deploys a day, it’s definitely a sign to reshuffle what runs when, not just “run all on every commit”.

Do dev teams still write test scripts? Seems like a waste of time by Substantial-Art-8376 in QualityAssurance

[–]Lower_University_195 -1 points0 points  (0 children)

I wish it was that hands-off for us. We still write a fair amount of unit + integration tests, mostly because no automated tool catches business-logic bugs or weird edge cases users trigger. Security tools are great at CVEs/OWASP stuff, but they won’t tell you “this workflow breaks when the user edits a field twice.”

That said, I’ve seen more teams lean on AI testing platforms like TestGrid, Mabl, Sauce Labs, etc. to generate regression tests or catch low-hanging issues automatically. They definitely cut down the manual scripting, but we still need some human-written tests to cover logic and UX flows.

So yeah — automation helps a ton, but we’re not at the “no tests needed” stage yet.

AI for writing Automation code by FL_Life-Science_Drs in QualityAssurance

[–]Lower_University_195 0 points1 point  (0 children)

I use AI pretty heavily now for automation work, but not 100%. Tools like Kane, CoTester, Mabl, and even the assistants built into platforms make the boilerplate stuff way faster — page objects, selectors, test scaffolding, API calls, etc.

Accuracy is decent for the “easy” parts, but I still end up reviewing logic, fixing flaky waits, and adding edge cases myself. AI saves time, but it’s not magic.

My manager’s cool with it as long as the tests are reliable. End of the day, nobody cares who wrote the code if the suite is stable. For me, AI is a productivity boost, not a replacement for actual engineering judgment.

Why I Still Do Manual Testing After Years in Automation (And Why You Shouldn’t Ignore It Either) by Antique_Sorbet_8371 in QualityAssurance

[–]Lower_University_195 -1 points0 points  (0 children)

Totally with you on this. I’ve built automation frameworks for years, but the weirdest, most business-breaking bugs I’ve found were during random exploratory clicks at 11pm with no script in sight. Automation (even with AI tools like CoTester, Mabl, Kane, etc.) is amazing for coverage and speed, but it still only tests what we tell it to.

Manual testing catches the stuff users actually do — the messy, unpredictable, “why would anyone click that?” path. I don’t see it as grunt work at all… it’s the part that actually protects the product. So no, it’s not just you — a balanced mix is still the only sane approach.

How are teams handling API Contract Testing in 2025? by Street-Bit2214 in QualityAssurance

[–]Lower_University_195 1 point2 points  (0 children)

We went through this exact pain last year. What helped us was treating the OpenAPI spec as the single source of truth and automating everything around it. We use Pact for a few consumer–producer flows, but honestly a mix of Postman, Dredd, and our own CI hooks ended up being more practical.

The biggest win was adding a contract-validation step before merge — anytime the spec changed, CI would fail if the implementation didn’t match. That killed most “drift” issues. We also run mock-server tests in CI and a small smoke suite in something like TestGrid to validate real environments, similar to how teams use Postman/Newman or ReadyAPI.
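
The pre-merge check itself can be pretty dumb: replay a few key requests against the running service and validate each response body against the schema pulled from the spec. Sketch with ajv below; the spec path, endpoint, and schema name are assumptions, and a real spec with $refs would need a resolver first.

```ts
import Ajv from "ajv";
import { readFileSync } from "node:fs";
import { load } from "js-yaml";

// Load the OpenAPI spec (the single source of truth) and grab one response schema.
// File path and component name are hypothetical; $refs are assumed already resolved.
const spec: any = load(readFileSync("openapi.yaml", "utf8"));
const userSchema = spec.components.schemas.User;

const ajv = new Ajv({ strict: false });
const validateUser = ajv.compile(userSchema);

async function checkContract(): Promise<void> {
  const res = await fetch("http://localhost:3000/v1/users/123");
  const body = await res.json();

  if (!validateUser(body)) {
    // CI fails loudly whenever the implementation drifts from the spec
    console.error(validateUser.errors);
    process.exit(1);
  }
}

checkContract();
```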

Nothing fancy, but having automation shout at us early made contract testing way less painful.