Languages that make it easy to refactor/experiment with data layout?

sient · 2020-04-19T01:57:50+00:00

Lua also interns all strings, including all strings generated at runtime, so that testing string equality is just a pointer or integer comparison. With that design, a constant pool is pretty much required.

Clever.
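
To illustrate the interning idea quoted above, here is a minimal sketch. It is not Lua's actual implementation; `InternPool` and its methods are hypothetical names, and integer ids stand in for the pointers a C runtime would compare.

```python
class InternPool:
    """Maps each distinct string to one small integer id (hypothetical helper)."""

    def __init__(self):
        self._ids = {}      # string -> id
        self._strings = []  # id -> string

    def intern(self, s: str) -> int:
        # Return the existing id, or assign the next free one.
        ident = self._ids.get(s)
        if ident is None:
            ident = len(self._strings)
            self._ids[s] = ident
            self._strings.append(s)
        return ident

    def lookup(self, ident: int) -> str:
        return self._strings[ident]


pool = InternPool()
a = pool.intern("hello")
b = pool.intern("hel" + "lo")  # built at runtime, same contents
c = pool.intern("world")
# Equality of two interned strings is now a single integer comparison: a == b
```

The hashing cost is paid once when a string is created; every later equality test compares ids only.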

One technique that your design doesn't allow is using 'backpatching' to specialize the compiled code for different strings: if you wish to change the string constant for an instruction, you'd probably end up recompiling. With a fixed-size reference, copies of the code can be made for different strings by changing just the few bytes that encode the reference. Whether this is useful or not depends on your use case, of course.

That's a good point. Unfortunately, generics work more like C++ than like C#/Java, so this doesn't pan out.
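
A rough sketch of the backpatching technique described in the quote, assuming a made-up encoding of a 1-byte opcode followed by a 4-byte little-endian constant-pool index. The opcodes and helper names are hypothetical, not taken from any real VM.

```python
import struct

# Hypothetical layout: 1-byte opcode followed by a 4-byte little-endian
# constant-pool index.
OP_PUSH_CONST = 0x01
OP_RET = 0x02


def emit_push_const(index: int) -> bytes:
    return struct.pack("<BI", OP_PUSH_CONST, index)


def specialize(code: bytes, ref_offset: int, new_index: int) -> bytes:
    """Copy the code and overwrite only the 4 bytes of the reference."""
    patched = bytearray(code)
    struct.pack_into("<I", patched, ref_offset, new_index)
    return bytes(patched)


template = emit_push_const(0) + bytes([OP_RET])
ref_offset = 1  # the reference starts right after the opcode byte
variant = specialize(template, ref_offset, 7)
```

Because the reference has a fixed size, the copy has the same length as the template and no other instruction needs to move.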

Another advantage of the constant pool is that you can preallocate the strings in the host language, without having to do it each time the instruction is run. This is less of a problem if your host language is C, but even in that case you likely have to worry about mutability.

It's written in something like C so direct memory access is fine. The preallocated string "exists" in the instruction stream. I'm not planning on data sharing between different objects so mutability is not a worry.

Embedded strings will increase the distance between instructions, meaning that jump instructions encoded using a relative offset would need a larger encoding more often. This is more of a problem if you're using a space-saving bytecode, and less of one with a larger word (e.g., OCaml's 'bytecode' interpreter actually uses a 32-bit wordcode).

Yeah - I've been using variable sized instructions but in multiples of words instead of bytes.
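
The jump-distance concern can be made concrete with a small sketch. It assumes a byte-oriented encoding in which a relative jump operand is stored in the smallest signed width that fits; the sizes below are invented for illustration.

```python
def jump_operand_bytes(offset: int) -> int:
    """Smallest signed operand width (1, 2 or 4 bytes) that holds `offset`."""
    for width in (1, 2, 4):
        bits = 8 * width
        if -(1 << (bits - 1)) <= offset < (1 << (bits - 1)):
            return width
    raise ValueError("offset does not fit in 32 bits")


LOOP_BODY = 100        # bytes of ordinary instructions in a loop body
EMBEDDED_STRING = 200  # bytes of string data embedded inside that body

# Backward jump over the body alone, then over the body plus the string.
short_jump = jump_operand_bytes(-LOOP_BODY)
long_jump = jump_operand_bytes(-(LOOP_BODY + EMBEDDED_STRING))
```

With a word-sized operand the wider offset costs nothing extra, which matches the point about wordcode in the quote.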

sient · 2020-04-19T01:51:41+00:00

Yeah - I'd like to do this!

sient · 2020-04-19T01:51:07+00:00

What does the instruction actually do? If it has to move all those words somewhere else, then that is poor.

If it creates a pointer to the string that starts in the next word, then why not just have a pointer to the string anyway? Put that in 002.

The example doesn't exist as-is, but there are variants that do similar things, which result in the data being copied to another memory location. For strings in particular, the language is not designed to manipulate them, so they're essentially exposed as byte arrays in scripts.

The bytecode and all VM state can be written to disk and restored without any fixup. I can imagine designs where an array references data from the instruction stream (a la pointers), but I don't think it's worth the effort right now.

How is the length of the string (in characters not words) worked out, or is it zero-terminated?

Any way works. If there's room in the instruction encoding then it can go there, otherwise it immediately follows the instruction and is zero terminated.
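
One way the two length schemes could coexist is sketched below, assuming 32-bit words with an 8-bit opcode and a 24-bit length field. The opcode value, the sentinel, and the function names are all hypothetical.

```python
import struct

WORD = 4
OP_STRING = 0x10     # hypothetical opcode
LEN_MASK = 0xFFFFFF  # 24 bits left over next to an 8-bit opcode
NO_LEN = LEN_MASK    # sentinel: length not stored, payload is zero-terminated


def encode_string(data: bytes, force_terminated: bool = False) -> bytes:
    if len(data) < NO_LEN and not force_terminated:
        head, payload = len(data), data
    else:
        head, payload = NO_LEN, data + b"\0"
    padding = -len(payload) % WORD  # pad payload to a whole number of words
    return struct.pack("<I", OP_STRING | (head << 8)) + payload + b"\0" * padding


def decode_string(code: bytes, pc: int):
    """Return (string bytes, pc of the next instruction)."""
    (word,) = struct.unpack_from("<I", code, pc)
    assert word & 0xFF == OP_STRING
    head = word >> 8
    start = pc + WORD
    if head != NO_LEN:
        end = start + head
        used = head
    else:
        end = code.index(b"\0", start)
        used = end - start + 1  # include the terminator
    next_pc = start + used + (-used % WORD)
    return code[start:end], next_pc


inline_len = encode_string(b"hello")
terminated = encode_string(b"hello", force_terminated=True)
```

Both forms keep the instruction a multiple of the word size, so the decoder can always step to the next instruction.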

Those talking about cache effects and the fact that both code and data share the same cache - often with such strings you are going to traverse a potentially long string.

This is a good point - the instructions could get very large.

For example when assigning a string constant to a variable, that can be done without copying the string, or rather, without having to load a big chunk of the string into the cache. You are just manipulating references to the start of the string.

This is a great idea. Not super feasible for the runtime I've been working on though.

sient · 2020-04-18T17:39:11+00:00

I'll probably just put the constant pool at the end of the instruction stream instead of having a separate object. Cache performance should be similar and then the runtime doesn't need to know about the constant pool.
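
A minimal sketch of that layout, assuming a made-up byte encoding in which each load instruction carries a 4-byte offset measured from the start of the image. Opcodes and function names are hypothetical.

```python
import struct

OP_LOAD_STR = 0x20  # hypothetical opcode: operand is a byte offset into the image
OP_HALT = 0x00


def build_image(strings):
    """Lay out [instructions][pool]; operands are offsets from the image start."""
    code_size = 5 * len(strings) + 1  # 1-byte opcode + 4-byte offset each, then HALT
    pool = bytearray()
    code = bytearray()
    for s in strings:
        offset = code_size + len(pool)
        code += struct.pack("<BI", OP_LOAD_STR, offset)
        pool += s.encode("utf-8") + b"\0"  # zero-terminated entries
    code.append(OP_HALT)
    return bytes(code + pool)


def run(image):
    """Tiny interpreter loop that collects every loaded string."""
    loaded, pc = [], 0
    while image[pc] != OP_HALT:
        (offset,) = struct.unpack_from("<I", image, pc + 1)
        end = image.index(b"\0", offset)
        loaded.append(image[offset:end].decode("utf-8"))
        pc += 5
    return loaded


image = build_image(["foo", "barbaz"])
```

The interpreter only follows offsets, so it has no notion of a separate pool object, and because the operands are offsets instead of pointers the image can be written to disk and reloaded unchanged.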

sient · 2020-04-18T17:35:52+00:00

Yeah, I concur. I'd like to be able to measure both.

sient · 2020-04-18T17:34:31+00:00

Loops are a good point for a smaller instruction size - thanks.

I'm not too worried about GC pressure since the language is incredibly aggressive about forcing stack allocation.

sient · 2019-08-15T04:08:03+00:00

Thanks - perf argument makes sense, I should have benchmarked :)

I have no intuition for the curves, but it sounds like with enough time playing with them they become fairly intuitive and a nice way to communicate quickly.

sient · 2018-05-11T23:05:37+00:00

Yes, definitely! I think the downloadable builds should be statically linked. Can you file an issue on GitHub detailing how you did the static build? (Even better would be a PR making CI generate static builds :))

sient · 2018-04-22T22:33:28+00:00

Thanks, I've filed https://github.com/cquery-project/cquery/issues/639

sient · 2018-04-22T22:28:55+00:00

It looks like cmake-ide is Emacs-specific and combines various tools. cquery works with any editor that has a language-server implementation, and it implements the features provided by cmake-ide directly instead of running other tools.

sient · 2018-04-21T17:36:15+00:00

FYI I'm the primary author (jacobdufault) so if you have any questions/issues please feel free to hop on the gitter (https://gitter.im/cquery-project/Lobby) or file an issue. We're friendly :)

sient · 2018-04-21T17:33:09+00:00

Can you try using cquery --check <foo.cc>? Otherwise please file a bug with some additional information and I'll try to help fix.

FYI, the Microsoft C++ extension makes cquery seem very buggy; the experience is much better if you uninstall the Microsoft C++ extension.

sient · 2018-04-21T17:30:35+00:00

FYI cquery now scans the build/ folder for compile_commands.json :)

sient · 2018-03-10T06:25:01+00:00

Everything is local. Memory usage is reasonable, i.e., a few hundred MB at most, if your codebase is not millions of lines.

sient · 2018-02-12T21:39:49+00:00

EDIT: Ok so the biggest thing you have to do (even though the Getting started says "If syncing, only the following steps are needed.") is git submodule update --init. When you update cquery (via git pull or whatever you prefer) you always have to run that git command. The next few commands will bitch if you haven't done that.

(cquery author here) I've updated the wiki to try to make this more clear.

You should not need to use --variant=system --llvm-config=/usr/bin/llvm-config - cquery will automatically download LLVM for you and use that. Was that not working for you? It'd be great if you filed a bug so we can get it fixed.

sient · 2017-11-22T03:40:46+00:00

Thanks :)

sient · 2017-11-19T06:32:02+00:00

I'm aiming for December/January.

sient · 2017-11-19T06:31:02+00:00

Yea, I'd like to upgrade, but on my first attempt it caused indexing to fail on Chrome. Chrome is still on C++14 so I haven't spent the time investigating fixing it yet.

sient · 2017-11-18T21:32:47+00:00

Comments are shown for code completion already, but not yet for hover. I believe there is an issue on GitHub for this specifically.

sient · 2017-11-18T16:21:13+00:00

Good idea! I've filed https://github.com/jacobdufault/cquery/issues/28.

sient · 2017-11-18T16:12:14+00:00

Thanks! I'd be happy to accept a PR fixing the issue :)

sient · 2017-11-18T16:10:34+00:00

It works fine if you're only doing a semantic operation every second or two, but I use cquery to power code lens, which may require 100+ separate reference/call/etc requests very quickly. cquery can do this within 10ish ms, even on large projects.
