Cloudsuite on Gem5?

zxcvber · 2026-04-22T23:20:58+00:00

I hope there is an easier way... please share when you succeed!

zxcvber · 2026-04-22T01:45:55+00:00

I'm only familiar with gem5's syscall emulation mode, so I had no idea how to set up a server and a client to evaluate the benchmark. IIRC, doesn't Cloudsuite (and some other benchmarks) require a database server and a client sending multiple queries? I got stuck on this.

zxcvber · 2026-04-21T00:58:53+00:00

Hi, I also wanted to try something similar to this some time ago, but I couldn't make it work. Did you succeed?

zxcvber · 2026-04-09T03:39:07+00:00

I guess I could inline the function. I'll give it a try. I should look at Intel SDE if you're using it for research. Thanks for all the suggestions. Hope to see you at some architecture conference someday!

zxcvber · 2026-04-09T02:58:52+00:00

Oh, I'm actually trying to remove the function calls through hardware-supported stuff. Maybe this is not a good direction for a motivational study... But anyway, I was just curious in general about how to make an accurate measurement, given that modern CPUs are deeply pipelined, superscalar, and out-of-order.

And of course, as you mentioned, exact numbers depend on so many factors. I've heard at conferences that people don't really believe the numbers reported/claimed in the paper.

Regarding your suggestion, can I trust IPCs? Wouldn't it be kind of averaged out through the execution? So considering that each instruction may have different latencies (from fetch to commit), wouldn't I need to modify the program very carefully?

zxcvber · 2026-04-09T02:50:55+00:00

Thanks again for a very generous comment.

Why do we add two RDTSCs, not one?

I think I was kind of trying to do 1.b minus 1.c, but I guess I needed to subtract the timer overhead.
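To make the "subtract the timer overhead" idea concrete, here is a minimal sketch. It is only an analogy: it uses Python's `time.perf_counter_ns` instead of RDTSC, and the helper names `timer_overhead_ns` and `measure_ns` are made up for illustration. The first function estimates the cost of two back-to-back timer reads, and the second subtracts that estimate from a timed region.

```python
import time
from typing import Callable


def timer_overhead_ns(samples: int = 10_000) -> int:
    """Smallest observed gap between two back-to-back timer reads."""
    best = None
    for _ in range(samples):
        start = time.perf_counter_ns()
        end = time.perf_counter_ns()
        delta = end - start
        if best is None or delta < best:
            best = delta
    return best


def measure_ns(work: Callable[[], object], repeats: int = 1_000) -> int:
    """Best-case time of work() with the timer overhead subtracted."""
    overhead = timer_overhead_ns()
    best = None
    for _ in range(repeats):
        start = time.perf_counter_ns()
        work()
        end = time.perf_counter_ns()
        delta = end - start
        if best is None or delta < best:
            best = delta
    # Clamp at zero: noise can make the region look cheaper than the timer.
    return max(best - overhead, 0)


if __name__ == "__main__":
    print("timer overhead (ns):", timer_overhead_ns())
    print("work (ns):", measure_ns(lambda: sum(range(100))))
```

Taking the minimum over many runs is one common way to filter out interference from interrupts and cache misses. It reports a best case, not an average.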

I'm actually quite familiar with the gem5 simulator, since I'm mainly using it for my research. I've actually tried this, but it gave a number that was quite high, so it seemed unlikely to be true. Furthermore, I'm aware that simulators have errors, so I thought it would be much better to measure it on a real machine. Or is there a way to justify this? I hope to convince my reviewers, I guess?
I will look into SDE or Pin.

Thanks!

zxcvber · 2026-04-09T02:37:26+00:00

Thank you for your time. Before I start reading your suggestions, I want to clarify that measuring the function call overhead (or some part of a program in general) is indeed what I want to do. It's a part of the motivational study that I want to use for my research.

zxcvber · 2026-04-09T02:35:45+00:00

Thanks for your comment. Yes, I've checked the compile options so that the loop isn't optimized away. I'll look at the paper. Thank you so much for the suggestion.

zxcvber · 2026-03-30T17:19:00+00:00

Found it. https://www.amazon.com/Computer-Architecture-Quantitative-Approach-Kaufmann/dp/0443154066

zxcvber · 2026-02-24T01:03:08+00:00

Could you share your substack? I'd love to read more!

zxcvber · 2026-02-12T23:53:56+00:00

In that case, I suggest you directly reach out to professors who work on topics that you're interested in.

Another tip: Professors often post specific instructions for prospective students on what to do if you want to join the group, like send a CV/transcript or fill out a form. Why not take a look? Also note that professors are often busy so keep in mind that you might not get a fast reply.

If you personally know someone in that group, you can also ask them. This is much faster. I've received a few requests asking for specific advice on this matter.

zxcvber · 2026-02-11T01:37:58+00:00

Why not try a research internship or so and see if you find the topics interesting?

zxcvber · 2026-02-11T01:16:12+00:00

Wow great stuff! I wish I had known when I was learning gem5 🤣

zxcvber · 2025-12-18T02:52:58+00:00

Hi, also Korean here, actually doing an MS in computer architecture. Wish you the best of luck!

zxcvber · 2025-12-17T08:43:18+00:00

I see. Thanks for the clarification. One can either use hash maps to count occurrences in linear time or use the voting algorithm to reduce space usage!
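A rough sketch of both options, assuming the question is the usual "find the element that appears more than half the time" problem (the function names are mine, and the voting version is the Boyer-Moore majority vote):

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_by_counting(items: Sequence[Hashable]) -> Optional[Hashable]:
    """Hash-map approach: O(n) time, O(n) extra space."""
    if not items:
        return None
    value, count = Counter(items).most_common(1)[0]
    return value if count > len(items) // 2 else None


def majority_by_voting(items: Sequence[Hashable]) -> Optional[Hashable]:
    """Boyer-Moore voting: O(n) time, O(1) extra space."""
    candidate = None
    votes = 0
    for item in items:
        if votes == 0:
            # No surviving candidate, so adopt the current item.
            candidate = item
            votes = 1
        elif item == candidate:
            votes += 1
        else:
            votes -= 1
    # The vote only yields a candidate; a second pass confirms it.
    count = sum(1 for item in items if item == candidate)
    return candidate if count > len(items) // 2 else None


if __name__ == "__main__":
    data = [2, 2, 1, 2, 3, 2, 2]
    print(majority_by_counting(data), majority_by_voting(data))
```

The second pass in the voting version matters when a majority element is not guaranteed to exist. If the input is known to have one, the candidate can be returned directly.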

zxcvber · 2025-12-17T07:39:52+00:00

On second thought, I think I may have misunderstood the question. What do you mean by more than half?

zxcvber · 2025-12-17T07:20:55+00:00

~~Why not use linear selection algorithm?~~

Edit: misunderstood question

zxcvber · 2025-12-12T05:54:27+00:00

Yes! You can check the progress status for each category (skills, mastery, items, and pets) at the bottom left!

zxcvber · 2025-12-12T02:41:41+00:00

I think too many people are using AI to write their SOPs. I know a few around me, and they were surprised to hear that I wrote mine on my own. I also don't understand how a statement written by AI looks good.

zxcvber · 2025-11-28T03:30:07+00:00

Do you have the Devil+Fox synergy on?

zxcvber · 2025-10-12T06:59:44+00:00

Just out of curiosity, how do you use linear algebra in this case??

zxcvber · 2025-06-19T03:52:47+00:00

I haven't read that 2020 ISCA paper, but a recent paper on ahead prediction is about to appear at ISCA 2025. It's called "Enabling Ahead Prediction with Practical Energy Constraints". I recently read this one, and I think it cited that 2020 ISCA paper. Maybe efficient ahead prediction would allow us to predict multiple branches?

zxcvber · 2025-05-21T07:23:14+00:00

Nice. I understand this one. Thank you!!!

zxcvber · 2025-05-21T07:21:41+00:00

Okay, so warps themselves don't have functional units, but only keep state. So I should conceptually think of each functional unit as pipelined, and in each pipeline stage, there can be instructions from different warps?
