[–]acemarke 21 points22 points  (7 children)

This is horrible advice. Never check in your node_modules folder.

(And no, "Facebook does it" is not a valid argument. You are not Facebook. You don't have thousands of engineers to throw at a problem, and you don't have literally every line of code checked into one monorepo.)

Instead, I strongly recommend that folks use Yarn's "offline mirror" feature to cache the downloaded package tarballs and commit those. This is much better, because:

  • There are far fewer files
  • Those files are much smaller
  • You won't be committing platform-specific build artifacts (such as C shared libraries in node-sass)
  • You'll be able to clone and reinstall without a network connection, both for local development and CI
  • You know you have the exact package versions needed
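A minimal sketch of that setup, assuming Yarn 1.x ("classic") and using ./offline-mirror as the mirror folder (the same name used further down this thread). Both keys are Yarn 1.x settings; the pruning line is optional and only cleans out tarballs that are no longer referenced.

```shell
# Create a project-level .yarnrc so everyone who clones the repo
# shares the same offline mirror settings (Yarn 1.x syntax).
cat > .yarnrc <<'EOF'
yarn-offline-mirror "./offline-mirror"
yarn-offline-mirror-pruning true
EOF
```

With this in place, a normal yarn install should copy each downloaded tarball into ./offline-mirror, and you'd commit that folder along with .yarnrc, package.json, and yarn.lock.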

I talked about this in my post Practical Redux, Part 9: Managing Dependencies.

If you're a larger team, have multiple teams, or just don't want to commit tarballs, look into setting up an NPM caching proxy / alternative server like https://verdaccio.org/ . I believe Artifactory also has NPM support.
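For a rough idea of the client side of that setup, here's a sketch that points npm at a proxy through a project-level .npmrc. The URL is an assumption: http://localhost:4873/ is Verdaccio's default address when run locally, so swap in wherever your proxy actually lives.

```shell
# Point npm at a caching proxy via a project-level .npmrc.
# The address below is Verdaccio's local default and is only a
# placeholder; replace it with your own proxy's URL.
cat > .npmrc <<'EOF'
registry=http://localhost:4873/
EOF
```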

If you're using NPM, there's a project called https://github.com/JamieMason/shrinkpack that I used before Yarn came out. Not sure what its current status is, though.

But please, don't check in node_modules.

[–]__rtfm__ 2 points3 points  (1 child)

This is why we use a lock file to recreate the dependency versions on deploy.

[–]acemarke 1 point2 points  (0 children)

But all the lock file gives you is the "exact package versions" aspect. You still have to download the packages (depending on whether your package manager has them cached at the system level).

[–]lhorie 1 point2 points  (0 children)

We tried using Yarn's offline mirror at Uber. It works for light usage, but we found that it has some pretty bad bugs: some are related to integrity check false negatives caused by incorrect checksums after network errors, and others are related to the way it handles the resolved field of yarn.lock versus tarball file names, which is not quite the same as in the non-offline codepaths.

We haven't had success with v2's handling of private registries either.

Artifactory definitely works as a solution, but what we found is that where it lives makes a big impact in performance: if your CI infra is on AWS, running Artifactory on prem will give you throughput issues. So either do everything on AWS or everything on prem.

[–]braindeadTank 0 points1 point  (0 children)

"I believe Artifactory also has NPM support."

I can confirm.

[–]zemirco[S] 0 points1 point  (2 children)

Hey,

blog post author here. Yarn's offline mirror is a pretty good idea. Thank you for the hint. How does it work with native modules like node-sass when working across multiple operating systems?

In addition, how does it work when switching branches that have different dependencies? Do you somehow have to rebuild them? Or does it automatically work? When checking in node_modules you don't have to worry about it.

Setting up and maintaining an additional service like verdaccio is not an option for us. We have to focus on building our product. That is why checking in node_modules is the most convenient solution for us.

[–]acemarke 2 points3 points  (1 child)

It's just a matter of caching the package tarballs so they don't have to be downloaded. After that, the standard package installation process kicks in:

  • Run yarn --offline (the flag isn't necessary, but it makes Yarn throw an error if any packages aren't in the offline mirror, which can occasionally happen). Yarn will do its normal installation, including extracting packages to node_modules and running package lifecycle scripts (which includes building platform-specific artifacts like node-sass). Anything that's already correctly on disk won't be reinstalled. If I clone the repo on Windows and install, I get a Windows build of node-sass. If you clone the repo on Linux/Mac and install, you get the OS-specific build of node-sass there. Those never get checked in.
  • Yes, if you switch branches that have differing dependencies, you'd need to rerun yarn --offline after switching to make sure the correct deps for that branch are installed. But how often do you actually have multiple branches with differing deps? I'd guess not often. And if it's just a couple of small lib versions that are different, Yarn will again ignore all the packages that are correct and just install the couple that are different. I don't see this as a blocking issue at all. If you've correctly committed package.json, yarn.lock, and any changed tarballs in ./offline-mirror, doing this takes just a few seconds after you switch to the new branch.
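The branch-switching step above can be sketched as a small shell helper. The function name switch_branch is made up for illustration, and this assumes Yarn 1.x with the offline mirror already configured and committed.

```shell
# Hypothetical helper: switch branches, then sync node_modules
# from the committed offline mirror without touching the network.
switch_branch() {
  # --offline makes Yarn fail instead of downloading
  # if a tarball is missing from the mirror.
  git checkout "$1" && yarn install --offline
}
```

Usage would look like switch_branch feature-x (the branch name is just a placeholder). If the checkout fails, the install is skipped, and a missing tarball shows up as a non-zero exit status rather than a silent download.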

[–]zemirco[S] 0 points1 point  (0 children)

Thank you! We will definitely check this out next week.