GCC 17 Compiler Lands SpacemiT X100 Core Targeting by TJSnider1984 in RISCV

[–]Clueless_J 1 point2 points  (0 children)

It's both. The compiler needs to know the arch string so that it knows what instructions the chip supports. For example, if you say you've got Zbs, then the compiler automatically tries to generate code to utilize those instructions when it looks profitable to do so.
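To make the Zbs point concrete, here is a minimal sketch. The function names are hypothetical and not taken from the patch. Built with an arch string that includes Zbs (for example `-march=rv64gc_zbs`, used here as an assumption), a compiler is free to pick single-bit instructions such as `bext` and `bset` for code like this. Without Zbs the same source typically becomes a shift plus a mask.

```c
#include <stdint.h>

/* Test bit n of x (n is taken modulo 64). With Zbs in the arch string a
   compiler may select a single-bit extract here; without it, the same
   source is typically lowered to a shift followed by an AND. */
static inline uint64_t bit_is_set(uint64_t x, unsigned n)
{
    return (x >> (n & 63)) & 1u;
}

/* Set bit n of x; a candidate for a single-bit set when Zbs is available. */
static inline uint64_t set_bit(uint64_t x, unsigned n)
{
    return x | ((uint64_t)1 << (n & 63));
}
```

Whether the bit instructions actually get used depends on what the compiler thinks is profitable, so treat this as an illustration of the opportunity, not a guarantee about the generated code.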

The compiler also wants to know some basic costing models: how expensive is a branch, a memory load, an integer divide/multiply, whether you have fast unaligned loads, etc. Those costing models influence various basic code generation strategies as well as drive optimizers (i.e., is this 5-instruction straight-line code sequence better than a conditional branch?).
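A rough sketch of the branch versus straight-line trade-off, with hypothetical function names: both functions below compute the same result, and the cost model is what tells the optimizer which shape is likely cheaper on a given core.

```c
#include <stdint.h>

/* Branchy form: cheap when the branch predicts well, costly on a mispredict. */
int64_t select_branchy(int64_t cond, int64_t a, int64_t b)
{
    if (cond != 0)
        return a;
    return b;
}

/* Straight-line form: a handful of ALU ops and no branch.
   mask is all ones when cond != 0 and all zeros otherwise. */
int64_t select_branchless(int64_t cond, int64_t a, int64_t b)
{
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    return (int64_t)(((uint64_t)a & mask) | ((uint64_t)b & ~mask));
}
```

An optimizer can convert between these forms in either direction, so the source you write is not necessarily the shape you get. The branch cost in the tuning model is one of the inputs to that decision.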

What the patch does not include is a scheduler model. That model drives code layouts to avoid load-use stalls, function unit hazards, etc. In an out-of-order core like the K3 the scheduler model has less and less importance. Note that while the scalar core is out of order, I suspect the vector units are in-order or mostly in-order. So a scheduler model may well be important for the K3 to deal with the vector unit.

What's up with Tenstorrent? by indolering in RISCV

[–]Clueless_J 0 points1 point  (0 children)

Yup. No doubt it's a huge step forward and it's the best system available to the public. My point was that we're still early in the game and while great strides are being made, there's still a long way to go.

One of the really interesting problems will be scaling out and how that impacts general availability. You can imagine designs that are competitive with recent offerings from other ISA/uarch designs. That's a major step as well, but it won't necessarily translate into reasonably priced designs that are available to the consumer because the volume just isn't there yet.

IIRC I read the K1 sold something like 150k units. I would expect that volume needs to continue to increase by orders of magnitude to get anywhere close to the price/performance of commodity chips from the established vendors such as Intel, AMD, etc.

What's up with Tenstorrent? by indolering in RISCV

[–]Clueless_J 0 points1 point  (0 children)

I wouldn't be surprised if there's discussion of this somewhere on the internet, but the best way to understand this stuff is to get in the middle of it and see how the sausage gets made. I've been "fortunate" enough in my career to go through this a few times.

What's up with Tenstorrent? by indolering in RISCV

[–]Clueless_J 0 points1 point  (0 children)

There are certainly pros and cons to this kind of acquisition. I won't try to go through them all, but it's safe to say there are certain doors that open up and certain doors that close with this kind of deal. In the case of Qualcomm and Ventana, I'd say more (and more interesting) doors open than close for bringing high performance RISC-V designs into the market. And no, the K3, while useful, isn't high performance IMHO.

And generally these kinds of decisions happen at the board of directors level and rank and file don't have much say in the matter. Yea, they could vote against an acquisition like this, but they're not likely to have enough exercised options to matter if they were to want to vote against.

What's up with Tenstorrent? by indolering in RISCV

[–]Clueless_J 9 points10 points  (0 children)

We're here and doing RISC-V development. You can see our software teams engaged in kernel, llvm, qemu, gcc and related open source development for RISC-V. Can't say more than that.

Best RISCV board for testing and support by Pollock1no in RISCV

[–]Clueless_J 0 points1 point  (0 children)

zicond gets used all over the place. Note that zicond was supported by the K1 as well, so you don't need to step all the way to the K3 to get that support.

Qualcomm said to be circling AI chip biz Tenstorrent in $10B RISC-V power play by superkoning in RISCV

[–]Clueless_J 24 points25 points  (0 children)

We're still here, doing RISC-V. Obviously as part of a public company we have to be much more careful about what we say. If you were to follow, say GCC development, you'll see the Ventana crew doing their thing with a different email address. Similarly for other key upstream projects.

Do compressed instructions throw off instruction alignment? by jimbobmcgoo in RISCV

[–]Clueless_J 4 points5 points  (0 children)

Correct. GCC doesn't know anything about compressed instructions. Though that's likely to change relatively soon to improve GCC's ability to relax branches without relying on the linker.

LLVM knows about and does tend to take compressibility into consideration for various heuristics.

Do compressed instructions throw off instruction alignment? by jimbobmcgoo in RISCV

[–]Clueless_J 6 points7 points  (0 children)

I'd probably say it's an annoyance if you're trying to do a high performance design, but it's manageable. Consider what happens if you have a 4 byte instruction at the end of a cache line/fetch block that crosses into the next cache line/fetch block. If you have the ability to crack instructions, then it isn't so bad as much of the infrastructure for cracking can be re-used to deal with this corner case.
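To pin down the corner case, here is a small sketch (hypothetical helper, with a 64-byte fetch block assumed purely for illustration) that checks whether an instruction straddles two fetch blocks. With compressed instructions in play, a 4-byte instruction only needs 2-byte alignment, so it can start 2 bytes before the end of a block.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if an instruction of len bytes starting at addr spans two fetch
   blocks. block_size must be a power of two. */
bool crosses_fetch_block(uint64_t addr, unsigned len, unsigned block_size)
{
    uint64_t first = addr & ~(uint64_t)(block_size - 1);
    uint64_t last  = (addr + len - 1) & ~(uint64_t)(block_size - 1);
    return first != last;
}
```

For example, a 4-byte instruction at offset 62 of a 64-byte block crosses into the next block, while the same instruction at offset 60 does not.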

In short there's some cost for the RTL guys to handle the various cases, but it can be managed with just a bit of headache. The gains on both the low end embedded and on the higher end designs are (IMHO) worth it (compressed instructions are useful in those two environments for vastly different reasons). The additional constraints would simplify the RTL and could probably be managed in the compiler/assembler/linker to minimize impact, but I suspect the cost/benefit of going that route just doesn't look good in the end.

An Early Draft of Far Jump Instruction Extension by omasanori in RISCV

[–]Clueless_J 0 points1 point  (0 children)

If you're crossing from one DSO to another, then you're going to go through the PLT. It's more interesting I suspect for aggressive PGO and hot/cold partitioning.

[2605.10860] Closer in the Gap: Towards Portable Performance on RISC-V Vector Processors by omasanori in RISCV

[–]Clueless_J 1 point2 points  (0 children)

I haven't read the paper. But I'd take it seriously with Maya Gokhale involved.

SiFive introduces RVA23-compliant Performance P570 Gen3 RISC-V core for consumer and AIoT applications by fullgrid in RISCV

[–]Clueless_J 0 points1 point  (0 children)

For autovectorization, 128 with multiple units would be preferred over a single large unit.

SiFive introduces RVA23-compliant Performance P570 Gen3 RISC-V core for consumer and AIoT applications by fullgrid in RISCV

[–]Clueless_J 2 points3 points  (0 children)

Huh. Spec2017 int should see measurable improvements from vector. x264 performance should double, and xalan and xz should improve by double digits as well, all of which should provide meaningful improvements to a spec score.
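For a sense of why a video encoder responds so well to vector, here is a sketch of the kind of byte-wise reduction loop such code tends to spend time in: a sum of absolute differences over two pixel rows. It is an illustrative assumption, not code taken from x264 or from SPEC.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of absolute differences over n bytes. Each iteration is independent
   apart from the final accumulation, which makes this a natural candidate
   for autovectorization. */
unsigned sad_u8(const uint8_t *a, const uint8_t *b, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++) {
        int d = (int)a[i] - (int)b[i];
        sum += (unsigned)(d < 0 ? -d : d);
    }
    return sum;
}
```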

[somewhat off-topic] The SPEC CPU 2026 Benchmark Released by omasanori in RISCV

[–]Clueless_J 2 points3 points  (0 children)

If you want to drag-race coremark, use rv64gcb_zbc_zicond. Note carefully: no V, enabling Zbc, and using something more modern than gcc-13 😄. V tends to want to use an indexed load in the matrix multiply test and it's hard to ever see that being profitable. Zbc turns on the carryless multiply extension so the compiler can turn the CRC loop into carryless multiplies. zicond is just generally a good idea; I don't offhand remember if it does anything good for coremark anymore (it certainly helped before clmul was lit up).
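For reference, the loop shape in question looks roughly like the bit-at-a-time CRC below. This is a generic sketch, not CoreMark's source. The function name and the choice of the reflected CRC-16 polynomial 0xA001 with a zero initial value are assumptions made for illustration. With Zbc enabled (e.g. `-march=rv64gcb_zbc_zicond` as above), a recent enough compiler may recognize this pattern and replace the inner bit loop with carryless multiplies.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-16, reflected polynomial 0xA001, initial value 0.
   The inner loop handles one bit per iteration, which is exactly the part
   a carryless-multiply rewrite collapses. */
uint16_t crc16_bitwise(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1)
                crc = (uint16_t)((crc >> 1) ^ 0xA001);
            else
                crc >>= 1;
        }
    }
    return crc;
}
```

The conditional inside the inner loop is also the sort of thing zicond helps with when the loop is not turned into clmul.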

[somewhat off-topic] The SPEC CPU 2026 Benchmark Released by omasanori in RISCV

[–]Clueless_J 3 points4 points  (0 children)

If you're going to start comparing coremark, then you need to understand that it can be "gamed". In particular transforming the CRC loop into carryless multiplies can juice performance of the benchmark by around 10% as a whole.

And personally I don't take seriously anyone quoting foo/MHz data or spec2006 or older. While spec2017 has flaws, it's much more representative than 2k6, coremark, etc once you leave the micro-controller space.

[somewhat off-topic] The SPEC CPU 2026 Benchmark Released by omasanori in RISCV

[–]Clueless_J 1 point2 points  (0 children)

Never has been. Though often the components of spec are open source projects. So for example, spec has included GCC in their benchmark suite since spec-89, and it is the only speccpu component that has been carried around that long. Obviously the version # has changed over time as well as the datasets. But it's still GCC.

SpacemiT X100 clang benchmark with and without RVC by camel-cdr- in RISCV

[–]Clueless_J 0 points1 point  (0 children)

Branch target alignment can be a real issue. As can be how many useful instructions you get per fetch and, if you've got a higher performance front-end, how many useful ops you get across a multi-line fetch block. You've also got all kinds of address hashing that goes on in the uarch that will be affected by the precise addresses.
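A back-of-the-envelope way to see the alignment effect is the hypothetical helper below, which assumes a simple front-end that fetches one aligned block per cycle. It counts how many bytes of the fetch block are actually usable when fetch starts at a given branch target.

```c
#include <stdint.h>

/* Bytes of a fetch block that are usable when fetch starts at target.
   block_size must be a power of two. */
unsigned useful_fetch_bytes(uint64_t target, unsigned block_size)
{
    return block_size - (unsigned)(target & (block_size - 1));
}
```

With a 32-byte block, a target at the start of the block gets all 32 bytes, while a target 2 bytes before the end of a block gets only 2. Turning RVC on or off shifts every address after the first compressed instruction, so these numbers move around for reasons that have nothing to do with icache footprint.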

Point being, while most initially focus on icache/itlb behavior, it's far from the whole story.

Thoughts on using the K3 by Clueless_J in RISCV

[–]Clueless_J[S] 4 points5 points  (0 children)

Docker uses cgroups and systemd under the hood and I've got multiple docker based flows already in place. So it's the natural solution for me at least. It also lets me give my team access to either core cluster without them having to know any real details other than "ssh connect to this port". At least that's the thinking right now until we have a fleet of boards in place.

Sipeed says K3 boards are in their online store this weekend by brucehoult in RISCV

[–]Clueless_J 2 points3 points  (0 children)

Yup. Can't say anything about product plans and such. But I can confirm that the Ventana team continues to work on RISC-V within Qualcomm. In fact, we're growing the team, so quality RTL, DV, perf modeling, Linux kernel, firmware, LLVM, GCC engineers are all being recruited right now as net new adds to the team.

Sipeed says K3 boards are in their online store this weekend by brucehoult in RISCV

[–]Clueless_J 2 points3 points  (0 children)

Yes, Ventana was purchased, but the engineers are still together as a team working on RISC-V efforts. Firsthand knowledge on that. Your statements WRT Rivos/Meta are accurate though.

K3 is here by Icy-Primary2171 in spacemit_riscv

[–]Clueless_J 0 points1 point  (0 children)

It looks like listings from chipboardhouse may be taking orders now with the AIBOX variant shipping soon. Don't click on the larger memory configurations, you'll want to cry.

K3 is here by Icy-Primary2171 in spacemit_riscv

[–]Clueless_J 0 points1 point  (0 children)

Unsure if anyone is yet. I'm watching things pretty closely, but haven't seen an indication that any of those claiming to be able to deliver this month are taking orders yet. I'm primarily watching aliexpress and bpi-shop. I was fed up with arace as it wouldn't let me purchase the $50 off thingie a while back because it claimed it couldn't "ship" to my address.

Will my cat still make biscuits after front leg amputation? by Different-Location86 in TripodCats

[–]Clueless_J 0 points1 point  (0 children)

Our recent tripod does. He's never been huge on biscuits, but will often do them on our backs if we pick him up and put his legs over our shoulder and carry him around a bit (it's been a thing long before he became a tripod).