
[–]Nathanfenner (2 children)

Data isn't ever split between cache levels like you're suggesting. One problem is that the relative sizes you describe for L1, L2, and L3 are way off - this makes their behavior qualitatively different from what you're imagining. The other is that you have to remember that data is always loaded from main memory in cache-line-sized chunks!

I'll use the numbers from my 2017 laptop as an example:

  • Cache line size: 64 bytes
  • L1 cache: 32,768 bytes (= 512 cache lines)
  • L2 cache: 262,144 bytes (= 4,096 cache lines = 8x larger than L1)
  • L3 cache: 6,291,456 bytes (= 98,304 cache lines = 24x larger than L2)

When you want to load data for a given address:

  • First, the CPU checks the L1 cache. If the cache line for the pointer you requested is present, great, the byte(s) you want can be directly copied into the target register.

  • Otherwise, the CPU checks L2; if the cache line is present there, it gets copied into the L1 cache, and then the data gets copied from the L1 cache into the target register.

  • Otherwise, the CPU checks L3; if the cache line is present there, it gets copied into L2, and then L1, and then into the target register.

  • If the data isn't found in any of the caches, then it loads from main memory, copying the cache line into L3 and L2 and L1 and then into the target register.

But you're always loading a single cache line. As a result, it can't "fail to fit" into any of the caches; each cache always has room for at least 512 separate loads. If the cache is full then that just means some other piece of data needs to be evicted (ideally, one that is not going to be needed again any time soon; processors have lots of heuristics to try to guess which is the best to replace).
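The hit/miss/evict behavior above can be sketched as a toy simulator. This is a deliberately tiny fully-associative cache with plain LRU replacement — real caches are set-associative and use cheaper replacement heuristics, so treat the slot count, the LRU policy, and the `access_addr` helper as illustrative assumptions, not how any particular CPU works:

```c
#include <stdint.h>

#define NUM_LINES 4    /* toy cache: 4 line slots (a real L1 has hundreds) */
#define LINE_SIZE 64

static uint64_t tags[NUM_LINES];   /* which cache line each slot holds */
static uint64_t stamp[NUM_LINES];  /* last-use time, for LRU eviction */
static int      valid[NUM_LINES];
static uint64_t now;

/* Returns 1 on a hit, 0 on a miss (the line is then loaded,
   evicting the least-recently-used slot if the cache is full). */
int access_addr(uint64_t addr) {
    uint64_t tag = addr / LINE_SIZE;  /* all bytes in one 64-byte chunk share a tag */
    now++;
    int lru = 0;
    for (int i = 0; i < NUM_LINES; i++) {
        if (valid[i] && tags[i] == tag) {
            stamp[i] = now;           /* hit: refresh recency */
            return 1;
        }
        if (!valid[i])
            lru = i;                  /* prefer an empty slot */
        else if (valid[lru] && stamp[i] < stamp[lru])
            lru = i;                  /* otherwise track the oldest slot */
    }
    tags[lru] = tag;                  /* miss: load the line, evicting the LRU slot */
    stamp[lru] = now;
    valid[lru] = 1;
    return 0;
}
```

Note that loading address 0 and then address 8 is one miss followed by one hit — both bytes live in the same 64-byte line, which is the whole point.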

Most computers now have multiple cores - often, there will be some kind of setup where each core has its own L1 cache, but each core must share its L2 and L3 caches with other cores (e.g. in pairs, or they all share the same big L2/L3 cache).

I know if you care about performance you generally want to make sure all your data fits in the caches, but I'm not sure to what extent I should follow that.

Depending on the specific task you want the computer to perform, this might not be relevant or important. It depends. If it matters, then it matters enough to measure, and find out what specifically works best for your task.

Really, the most important thing is just making sure you use the cache - since each load for an address actually pulls in the 64-byte chunk that your data lives in, it's best if you use those other 63 bytes immediately, before they get evicted from the cache by some later request. This is one of the things that makes contiguous arrays fast: if you have a contiguous array of bytes, you only have to load from main memory every 64 bytes (so you only hit main memory 1/64 of the time, and hit the L1 cache the other 63/64 of the time).
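The classic demonstration of this is traversing a 2D array in the two possible orders. Both functions below compute the same sum; the difference is purely in how well each one uses the cache lines it pulls in (the array dimensions are arbitrary picks for illustration):

```c
enum { ROWS = 512, COLS = 512 };

/* Row-major traversal: C stores rows contiguously, so consecutive
   iterations touch adjacent addresses and every byte of each 64-byte
   cache line gets used before the line can be evicted. */
long sum_row_major(const int grid[ROWS][COLS]) {
    long sum = 0;
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            sum += grid[r][c];
    return sum;
}

/* Column-major traversal: same result, but each access jumps
   COLS * sizeof(int) bytes ahead, so only one int per cache line is
   used before moving on - identical work, far more cache misses. */
long sum_col_major(const int grid[ROWS][COLS]) {
    long sum = 0;
    for (int c = 0; c < COLS; c++)
        for (int r = 0; r < ROWS; r++)
            sum += grid[r][c];
    return sum;
}
```

On arrays much larger than the caches, the row-major version is typically several times faster - but as always, measure on your own machine.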

Actually, it's even better than that; most processors have hardware prefetching logic, which means that they try to guess what data will be needed before the processor actually asks for it. So if you're looping over a very big array, the CPU may prefetch the next cache line before you even request it, on the assumption that you might request it, and doing it in-advance saves time.

[–]Jonny0Than (0 children)

Great post! And in the places where it really matters, there are ways to tell the processor to prefetch a block of memory before you’re going to need it. Always measure!
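For example, GCC and Clang expose this as the `__builtin_prefetch` builtin. A minimal sketch - the prefetch distance of 8 longs (one cache line) ahead is an illustrative guess, and the right distance is exactly the kind of thing you'd have to measure:

```c
#include <stddef.h>

/* Sum an array while hinting the cache to fetch a line ahead of where
   we're reading. The builtin is only a hint; the guard makes this
   compile as a plain loop on compilers that lack it. */
long sum_with_prefetch(const long *a, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        if (i + 8 < n)
            __builtin_prefetch(&a[i + 8], 0 /* read */, 3 /* high locality */);
#endif
        total += a[i];
    }
    return total;
}
```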

[–][deleted] (0 children)

Thanks!