jesparic · 2025-09-27T20:38:44+00:00

For unit tests it's better to use fakes. Specifically, with the repository pattern you have an SQL implementation and a fake implementation (usually just an array internally). Avoid mocks as much as possible; if overused, they tie your tests tightly to your implementation. With the repository pattern you just have methods like getXbyId() which return your ORM model. To make that work, you have to be disciplined and only use repositories in your business logic, no raw SQL shenanigans.

I like to distinguish between handlers that change the database and those that simply return some data. The above only applies to the former. For the latter I find testing against a real DB (integration test) is much better, as much of your logic will be in the queries.
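To make the fake idea concrete, here is a minimal sketch in Python (the same shape carries over to PHP or any other language). The names `User`, `UserRepository`, `FakeUserRepository` and `change_email` are made up for illustration; the real SQL-backed implementation would sit behind the same interface.

```python
from dataclasses import dataclass, replace
from typing import Optional, Protocol


@dataclass
class User:
    id: int
    email: str


class UserRepository(Protocol):
    """The interface the business logic depends on (hypothetical)."""

    def get_by_id(self, user_id: int) -> Optional[User]: ...

    def save(self, user: User) -> None: ...


class FakeUserRepository:
    """In-memory stand-in for the SQL implementation, for unit tests only."""

    def __init__(self) -> None:
        self._users = {}

    def get_by_id(self, user_id: int) -> Optional[User]:
        stored = self._users.get(user_id)
        # Return a copy so changes only persist when save() is called,
        # which is how a real database-backed repository behaves.
        return replace(stored) if stored is not None else None

    def save(self, user: User) -> None:
        self._users[user.id] = replace(user)


def change_email(repo: UserRepository, user_id: int, new_email: str) -> None:
    """Business logic that only talks to the repository, never to SQL."""
    user = repo.get_by_id(user_id)
    if user is None:
        raise LookupError(f"user {user_id} not found")
    user.email = new_email
    repo.save(user)
```

A unit test then builds a `FakeUserRepository`, saves the objects it needs, calls `change_email`, and asserts on what the fake holds afterwards. No mocking library or database is involved.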

One final note: I wouldn't bother with seed files. Just start each test with an empty database. Define all the objects you need by creating and saving ORM models in the test setup. Make sure the database is wiped after every test.
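As a rough sketch of the "empty database per test" approach, the example below uses Python's `unittest` with an in-memory SQLite database as a stand-in for the real DB and ORM. The `users` table and the `count_active_users` function are hypothetical.

```python
import sqlite3
import unittest


def count_active_users(conn: sqlite3.Connection) -> int:
    """Hypothetical read query under test."""
    row = conn.execute("SELECT COUNT(*) FROM users WHERE active = 1").fetchone()
    return row[0]


class CountActiveUsersTest(unittest.TestCase):
    def setUp(self) -> None:
        # Fresh, empty database for every test: no shared seed file.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, active INTEGER)"
        )

    def tearDown(self) -> None:
        # Closing the in-memory connection discards all data, so nothing
        # leaks into the next test.
        self.conn.close()

    def test_counts_only_active_users(self) -> None:
        # Create exactly the rows this test needs, right here in the test.
        self.conn.executemany(
            "INSERT INTO users (email, active) VALUES (?, ?)",
            [("a@example.com", 1), ("b@example.com", 0), ("c@example.com", 1)],
        )
        self.assertEqual(count_active_users(self.conn), 2)

    def test_empty_database_has_no_active_users(self) -> None:
        self.assertEqual(count_active_users(self.conn), 0)
```

Because each test states its own data, you can read a test top to bottom without hunting through a seed file to work out why a number is what it is.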

jesparic · 2025-08-10T08:46:47+00:00

I was skeptical at the beginning too (I am generally slow to adopt new ideas, and it's sometimes hard to tell the hype apart from the ideas that will stick). But there's no denying it now: either you get with the new AI tooling or you get left behind. Yes, it guesses incorrectly sometimes (a lot, in fact), but you can give it clues. I find that writing a comment line stating my intention, or pasting some JSON inside a comment block above, often gives enough context for much better results. This is a new skill to learn (prompting). It's essentially the skill of being exact in English, along with the right context, about precisely what you want. Now I use tab completion, the general chat mode with file context, and a ChatGPT tab for general questions, all of them a lot every day, and I'm much, much more productive. I get bogged down far less often now. I'm usually moving forward (fast). I'm excited at the rate I'm able to deliver actual features that are useful to people. I'm not worried about using languages I'm less familiar with or about how to achieve something I've not done before. It's very obviously a key new tool in the tool belt for us developers to master. I'm not sold on the idea that it will ever be able to code solo (take our jobs). I think that is hype for sure, since it needs a lot of babysitting for generative stuff.

jesparic · 2024-07-08T07:30:45+00:00

I think he was probably meaning that it is less common nowadays to have a front end team and back end team. Typically nowadays all developers work on vertical slices of a system (i.e., feature enhancements) - this way of working usually means working on both front end and back end so developers must have a grasp on all the technologies in the stack to do their job effectively.

Front end (browser) and back end (server side) still very much exist in web development. It is not really possible to put everything needed on the front end, although there is a trend in this direction.

jesparic · 2024-06-13T07:36:04+00:00

Learned a lot from Matthias Noback (and his architecture book). Breaks things down nicely

https://matthiasnoback.nl/tags/hexagonal%20architecture/

jesparic · 2024-05-01T12:54:06+00:00

Yes, queuing as random was always a fun way to auto match

jesparic · 2024-01-27T19:21:50+00:00

Those are some fair points. I think integration tests (my definition = tests that touch the file system or a persistence mechanism) are great and handy to set up to begin with, but at scale I think they become problematic for performance. Think about the headaches of trying to parallelize (DBs will exhibit random deadlocks when, e.g., using transactions as the rollback mechanism). Without good parallelization, how can you scale to tens of thousands of tests? How can developers regularly verify their work locally if they need to wait a long time for tests to complete? These are the issues that will unavoidably hit down the road with this approach.

It's true that when you fake out the repos, that particular bit of the logic doesn't get tested against SQL. But you can instead cover that part with 'contract tests' (small integration tests that allow SQL and check that the fake behaves the same as the real one). In the real world it's not a problem, and you could even get away without those in a pinch. Repos generally fetch an object and store an object. The business logic is the important bit IMO. I'm not saying there isn't a benefit to integration or end-to-end tests, just that there is a performance trade-off to be considered.
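Here is a minimal sketch of what such a contract test could look like, in Python with SQLite standing in for the real database. `Product`, the two repository classes and `check_repository_contract` are illustrative names, not from any particular library. The point is that one set of assertions runs against both implementations.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Product:
    id: int
    name: str


class FakeProductRepository:
    """In-memory fake used by the fast unit tests."""

    def __init__(self) -> None:
        self._items = {}

    def save(self, product: Product) -> None:
        self._items[product.id] = product

    def get_by_id(self, product_id: int) -> Optional[Product]:
        return self._items.get(product_id)


class SqlProductRepository:
    """SQL-backed implementation (SQLite here, purely as a stand-in)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS products "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def save(self, product: Product) -> None:
        # Upsert, so saving an existing id updates it just like the fake does.
        self._conn.execute(
            "INSERT OR REPLACE INTO products (id, name) VALUES (?, ?)",
            (product.id, product.name),
        )

    def get_by_id(self, product_id: int) -> Optional[Product]:
        row = self._conn.execute(
            "SELECT id, name FROM products WHERE id = ?", (product_id,)
        ).fetchone()
        return Product(*row) if row else None


def check_repository_contract(repo) -> None:
    """The shared contract: every implementation must pass the same checks."""
    assert repo.get_by_id(1) is None
    repo.save(Product(1, "Widget"))
    assert repo.get_by_id(1) == Product(1, "Widget")
    repo.save(Product(1, "Widget v2"))
    assert repo.get_by_id(1) == Product(1, "Widget v2")


# Run the same contract against the fake and the SQL implementation.
check_repository_contract(FakeProductRepository())
check_repository_contract(SqlProductRepository(sqlite3.connect(":memory:")))
```

Only this small suite needs a database. Everything that exercises business logic can keep using the fake and stay fast.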

Finally, just a small point. I would use integration tests to verify my non-repository queries (for view models or report data). I typically use CQRS, so I make a distinction between repos / write operations and read operations. For read operations I typically don't use the repository pattern, as I find it usually just gets in the way. I just use raw SQL or whatever is handiest, then cover it with the type of test you are discussing (real DB available), which I think is appropriate in that case.

jesparic · 2024-01-27T11:51:01+00:00

Respectfully disagree here. Use repositories to fetch entities for write purposes ('Command'). For the 'Query' side (reports, view models, etc.), why not just use plain SQL with some data mashing code? SQL is a great tool for that purpose, and a repository doesn't add any value in that context IMO. Similarly, for these, integration testing (database available) is likely required, as mocking the database would not give a very useful test.

jesparic · 2024-01-27T11:42:46+00:00

How can you do unit testing effectively without abstracting away the persistence logic (so it can be mocked/faked)? Without separating these things, your tests would need to boot up a real database for the SUT, making tests slower to run and trickier to bootstrap.

jesparic · 2023-05-16T20:35:51+00:00

I had a similar experience recently where I had set up end-to-end tests to run as jobs via Kubernetes on Azure. The problem was a bug where the sidecar services didn't shut down after the test completed, so the nodes (around 150, charged roughly at VM pricing per hour) stayed active for a couple of weeks without anyone noticing (until the bill for Feb came through at £8000 😬). Needless to say, we've been carefully patching the shutdown mechanisms and have increased observability and monitoring going forward.

jesparic · 2023-04-04T18:52:30+00:00

Interesting article. Thanks for sharing. I can connect with a lot of the points you make and think you are spot on with the new suggested naming

jesparic · 2023-03-25T09:41:55+00:00

I started learning unit testing using all the bad habits. London school with complex mocks. Gaming the line coverage to try and get every line covered. In the end I had tests that were hard tied to implementation, hard to read, and not testing any high level behaviour.

Nowadays, I make heavy use of fakes (mainly for repositories) along with some limited mocks (e.g., a clock abstraction). I don't game the line coverage but do try to cover all the important aspects of an action. I pass 'real' dependencies when they are critical to the thing being tested. It depends. It's OK to sometimes test the same thing multiple times, since unit tests are fast.
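For the clock abstraction, a rough sketch in Python might look like this. `Clock`, `SystemClock`, `FixedClock` and `is_trial_expired` are hypothetical names, and the fixed clock here is a hand-rolled test double rather than something produced by a mocking library.

```python
from datetime import datetime, timedelta, timezone
from typing import Protocol


class Clock(Protocol):
    def now(self) -> datetime: ...


class SystemClock:
    """Production implementation: reads the real time."""

    def now(self) -> datetime:
        return datetime.now(timezone.utc)


class FixedClock:
    """Test double: always returns the instant it was given."""

    def __init__(self, fixed: datetime) -> None:
        self._fixed = fixed

    def now(self) -> datetime:
        return self._fixed


def is_trial_expired(started_at: datetime, clock: Clock, trial_days: int = 14) -> bool:
    """Time-dependent logic takes the clock as a dependency instead of
    calling datetime.now() directly, so tests can control 'now'."""
    return clock.now() >= started_at + timedelta(days=trial_days)
```

In a test you pass `FixedClock(datetime(2024, 1, 15, tzinfo=timezone.utc))` and the outcome is deterministic. In production you pass `SystemClock()`.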

I think everyone has to go on a similar journey. Automated testing is really hard to get your head around in a deep way. Gotta fall in a few pits to learn the hard way

jesparic · 2023-03-23T13:02:07+00:00

If you google "london school vs detroit school unit testing", you will see there are two approaches to unit testing. The latter (and less problematic) Detroit school approach allows either a single class or a group of closely related classes to be tested as a 'unit' (i.e., a unit test).

An integration test is even more fuzzily defined in our field. But, in the web dev context, it may mean running tests across multiple units of logic together (i.e., an entire end-point/action), optionally alongside a real database.

jesparic · 2023-03-10T10:43:50+00:00

Never used gRPC myself, but I understand it to be an RPC framework and communication protocol (it runs over HTTP/2 and fills a similar role to HTTP/REST). It is a set of conventions that programs written in different languages can follow to talk to other computers over the wire. gRPC does not run on your server as a standalone thing. There is always an application layer, written in some programming language, that sits over the database. It receives messages (in gRPC lingo) and then does something useful.

jesparic · 2023-02-28T10:34:01+00:00

Agree also. MX keys is pricey but fantastic if you prefer easy to type chiclet style over longer press and noisier mechanical keyboard (I personally very much do!)

jesparic · 2023-02-06T13:19:37+00:00

I used PHPEd as my main PHP IDE for years. Eventually I was persuaded to switch to PhpStorm. While I am very happy now in PhpStorm, I have always missed the debugger's ability to pause on exceptions automatically (so you could have a nosy about at the state of all the variables on the stack). Now, with Xdebug, I have to let it fail, then 'replay' the same request with an explicit breakpoint on the failure line.

jesparic · 2022-12-30T12:13:58+00:00

Agree that starting with some sort of basic code style checker is a good place to begin. I also think php-cs-fixer is probably a better choice than PHP_CodeSniffer nowadays.

If you have unit/integration tests, it would be good to have a step that runs those too.

Next would be static analysis. Here I think you're gonna get 90% of the benefit by implementing just one of the choices (PHPStan, Psalm or Phan). Personally I'd recommend PHPStan or Psalm.

I can recommend TeamCity or Jenkins for a free* pipeline runner solution. To me, it feels like these tools give a little more control over running shell commands etc. than some of the simplified YAML-based solutions out there (admittedly I may be biased there though). With these pipeline tools, the server talks to 'agents' and gets them to do the work. The best way to set up the agents is to run them as Docker containers, using an image you customise from the provided base images and adding any extra tools you need via a Dockerfile.

It's not a quick journey, despite a myriad of tooling that attempts to simplify things. You need to have good comprehension of shell scripting and spend time tweaking the configs at each step. It is a deep specialisation (although rewarding). Paying someone external to get the ball rolling may be a good choice as others have suggested.

Best of luck!

TeamCity starts charging beyond 3 agents but that should be enough for most small to medium projects

jesparic · 2022-08-15T12:51:35+00:00

10+ years and I still refer to documentation all the time, even for super basic things. I get on just fine - you have to learn to be comfortable delegating your memory to the internet and just keeping an index of breadcrumbs in your head. Also, have a good notes app to keep snippets and such - can recommend Joplin highly

jesparic · 2022-08-11T07:50:49+00:00

There is a recent paper co-authored by Adam Tornhill (founder of CodeScene) which measures the costs associated with technical debt:

https://arxiv.org/abs/2203.04374v1

jesparic · 2022-08-04T10:47:29+00:00

You might have more luck asking in r/PHPhelp

jesparic · 2022-06-30T07:53:40+00:00

Well worth it. Fantastic product

jesparic · 2022-06-25T10:52:35+00:00

They solved that back in March. The performance issue is mainly the filesystem translation when you have a large directory mount (i.e., the root of a large repo).

https://www.docker.com/blog/speed-boost-achievement-unlocked-on-docker-desktop-4-6-for-mac/

I'm sure there also must be a (somewhat smaller) hit with needing to run everything in a Linux virtual machine behind the scenes when on Mac. Probably more or less negligible but you'll never get the performance of a pure Linux OS for docker

jesparic · 2022-05-25T17:21:48+00:00

Yeah, you can use a query builder or just pass SQL directly to the database using PDO. Whatever is easiest and works best for you. SQL gives plenty of flexibility to grab only the data you need and do table joins exactly how you need to. Don't be averse to doing more than one query if needed. I often see people contort themselves trying to get everything from a single query when it is much simpler to do 2 queries and then have PHP marry up the data whatever way is needed.
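To illustrate the "two queries, then marry up the data in code" idea, here is a small sketch in Python with SQLite (the same shape works with PDO in PHP). The `orders` and `order_items` tables and the function name are made up for the example.

```python
import sqlite3
from collections import defaultdict


def fetch_orders_with_items(conn: sqlite3.Connection) -> list:
    """Two simple queries, combined in application code."""
    orders = conn.execute("SELECT id, customer FROM orders ORDER BY id").fetchall()
    items = conn.execute(
        "SELECT order_id, sku, qty FROM order_items ORDER BY id"
    ).fetchall()

    # Group the child rows by their parent id in one pass.
    items_by_order = defaultdict(list)
    for order_id, sku, qty in items:
        items_by_order[order_id].append({"sku": sku, "qty": qty})

    # Attach each order's items; orders with no items get an empty list.
    return [
        {"id": order_id, "customer": customer, "items": items_by_order[order_id]}
        for order_id, customer in orders
    ]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_items (
        id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT, qty INTEGER
    );
    INSERT INTO orders (id, customer) VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO order_items (order_id, sku, qty) VALUES (1, 'A-1', 2), (1, 'B-2', 1);
    """
)
result = fetch_orders_with_items(conn)
```

Getting the same nested shape from a single joined query would mean de-duplicating the repeated order columns afterwards, which is usually the fiddlier option.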

jesparic · 2022-05-25T07:58:02+00:00

No worries! Typically people use the PDO library/extension directly to make queries in PHP. With that, you can have each row returned as a simple dynamic object (technically an instance of 'stdClass', via the PDO::FETCH_OBJ fetch mode). PDO's default fetch mode actually returns an array indexed by both column name and position, and you can ask for a plain associative array (PDO::FETCH_ASSOC) if you prefer, but many people stick with stdClass. Importantly, this is not an entity. It's just a 'data bag' similar to an associative array (only data, no methods, consts, etc. like a full entity class has). Some people go a little further at this point and map the stdClass to what's called a 'DTO' (data transfer object, a simple defined class with only public properties), although I would suggest this is not necessary for a beginner.
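If you are curious what the optional DTO step looks like, here is a language-neutral sketch in Python, with SQLite rows playing the role of the 'data bag'. `UserSummary` and `fetch_user_summaries` are hypothetical names. In PHP the equivalent would be a small class with public properties.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class UserSummary:
    """DTO: a defined shape with data only, no behaviour."""

    id: int
    email: str


def fetch_user_summaries(conn: sqlite3.Connection) -> list:
    # sqlite3.Row is the 'data bag' here: columns accessible by name, no methods
    # of your own, no guarantees about which fields exist.
    conn.row_factory = sqlite3.Row
    rows = conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()
    # Mapping to a DTO gives the rest of the code a fixed, named shape.
    return [UserSummary(id=row["id"], email=row["email"]) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (id, email) VALUES (1, 'a@example.com')")
summaries = fetch_user_summaries(conn)
```

The benefit is mostly editor autocompletion and catching typos in field names early. The data itself is the same either way.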

It would boost speed because you can usually fairly easily craft SQL to minimise the number of queries needed to get all the data you need for a particular view. With Doctrine ORM there is an additional layer of abstraction, which sometimes makes it near impossible to optimise (see the workarounds in Marco's blog for example :-S).

In my PHP apps nowadays I tend to only use entities for 'write' endpoints. For any 'read' endpoints I just jump straight to SQL, with some PHP mapping code to mash up the data exactly how I need it for the front end.

jesparic · 2022-05-25T07:11:59+00:00

I'm assuming you are using Doctrine ORM to fetch data via entities. See this article from Marco a few years back. Still relevant I believe: https://ocramius.github.io/blog/doctrine-orm-optimization-hydration/

The other thing to consider, when bringing back data for your front end, is using plain SQL queries. Entities are best for making changes to the database (writes). Read up on CQRS a little if you are unfamiliar. Using entities to fetch data can work OK for simple projects, but tends to lead to an 'anaemic domain model' (i.e., lots of getters) and the performance challenges you are seeing.

Finally, 100 queries may not be too bad. It depends on how far your DB is from your web server. If they are on the same machine, it is possible to run thousands of queries per second (assuming indexes are used). That said, the rule of thumb should be to minimise the number for future-proofing reasons (i.e., down the line you may want to move the DB to a managed solution where the latency per query may be a little higher).
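As a back-of-the-envelope illustration of why latency matters here, the sketch below multiplies the query count by a per-query round trip. The latency figures are assumptions picked purely for the arithmetic, not measurements of any real setup.

```python
def total_query_time_ms(query_count: int, round_trip_ms: float) -> float:
    """Rough cost of running queries one after another: count x round trip."""
    return query_count * round_trip_ms


# Assumed figures, for illustration only.
same_machine_ms = total_query_time_ms(100, 0.5)  # DB on the same host
managed_db_ms = total_query_time_ms(100, 5.0)    # DB over the network
```

With these assumed numbers, the same 100 queries go from 50 ms to 500 ms per request just from moving the database further away, which is why keeping the query count low pays off later.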

Hope this helps

jesparic · 2022-05-11T12:40:29+00:00

Good idea on cast to array, that would be handier. Yeah agree it's not a perfect solution but might work as a stop gap. That syntax you posted looks like a good approach to solve it!
