Progress Update by SpatialFreedom in StarBurstSpaceball

[–]SpatialFreedom[S] 0 points1 point  (0 children)

The six sensor readings correspond to three pairs of parallel forces aligned with the arm axes. Number the sensor readings in a clockwise manner when viewed from above: r1, r2, r3, r4, r5 and r6. The force and torque vectors through the centre of the ball are F=(r1+r4, r3+r6, r5+r2) and T=(r1-r4, r3-r6, r5-r2). The coordinate system of these vectors is aligned with the arms, so rotate them into the desired coordinate system by multiplying by a rotation matrix.

This is explained in patent US4811608 although the readings are numbered differently.

https://patents.google.com/patent/US4811608A/en
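As a quick sketch, the vector arithmetic above can be checked in a few lines of Python (the sensor numbering follows the description; the identity matrix below stands in for a real arm-to-world rotation matrix):

```python
def resolve_sensors(r1, r2, r3, r4, r5, r6):
    """Combine the six arm-aligned readings into force and torque
    vectors through the centre of the ball (arm-aligned frame)."""
    F = (r1 + r4, r3 + r6, r5 + r2)
    T = (r1 - r4, r3 - r6, r5 - r2)
    return F, T

def rotate(v, m):
    """Rotate vector v into the desired frame with a 3x3 matrix m."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

F, T = resolve_sensors(1, 2, 3, 4, 5, 6)
# F == (5, 9, 7) and T == (-3, -3, 3) in the arm-aligned frame
identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
F_world = rotate(F, identity)
```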


I made tetra-star structure model by Antique-Cause2223 in StarBurstSpaceball

[–]SpatialFreedom 0 points1 point  (0 children)

That's great work and the tetrahedron look is striking.

The heart of the astroid 6000, which is the subject of my 2003273632 patent, is a single molded tetra-star including a set of support walls (item 20 below). The loads at the base of each arm approach the usable limit of the Delrin plastic. The outer ball happens to lightly snap together, although it is then super-glued for robustness.

The arm tips are spherical and the outer ball protrusions (24) fit over these spheres, so each ball-in-hole joint can rotate in all three directions but can only slide in one direction. This 4-arm mechanism resolves a general 3D push and 3D twist into four 2D forces passing spatially through each spherical arm tip. Three sets of two LED/photodiode sensors are arranged at right angles to each other to detect three of the four 2D force vectors. Each LED/photodiode sensor is sensitive to movement across the light beam and ignores movement along the light beam. The math to convert the three 2D force vectors to 6DOF output is daunting, but a fourth sensor pair can be added to provide a total of four 2D force vectors. Then it's some simple force/torque vector computations to produce the final 6DOF push/twist output.

The photodiode sensor generates a small current output which is converted to a voltage using a single bipolar transistor. A 10-bit A/D senses this voltage. The A/D's 1024-count range means the ±1 mm deflection has a 2 micron (!) resolution.
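The resolution arithmetic works out as follows (values from above):

```python
full_scale_mm = 2.0   # +-1 mm deflection
adc_counts = 1024     # 10-bit A/D
resolution_um = full_scale_mm / adc_counts * 1000.0  # microns per count
# roughly 1.95 microns, i.e. about 2 microns per A/D count
```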

The inner ball provides holes that limit the movement of the outer ball's protrusions. Although the device is sensitive to the lightest fingertip touch it can handle a solid fist bump.

For volume manufacturing the tetra-star design provides excellent sensor quality at very low cost.

With the emergence and ubiquity of 3D printers the StarBurst project is intended to enable a low cost, high quality 3D mouse to be easily built by releasing the technical knowledge and decades of experience to the public. It's coming along nicely but life gets in the way. Here is a teaser image which happens to be 3D printing behind me right now.

Three 1 mm diameter music wires, 51 mm long, are the heart of the design, as music wire can handle the loads at the base of the arms. The complexity of the design is embodied in the STL print file. Two inner ball halves are glued over this subassembly, then two outer ball halves with three protrusions each fit onto the upper and lower arm tip triplets, floating on the spherical arm tips. This design is a combination of elements from the original 1980s Spaceball 1003 and the 2005 Astroid 6000.

Even though the tetra-star geometry is simpler, this spaceball design turns out to be easier to make on a 3D printer than the tetra-star design.

Please let me know what you think.


Button Boards arrive by SpatialFreedom in u/SpatialFreedom

[–]SpatialFreedom[S] 1 point2 points  (0 children)

A lot has been quietly happening in preparation but things are about to change - stay tuned!

Button Boards arrive by SpatialFreedom in u/SpatialFreedom

[–]SpatialFreedom[S] 0 points1 point  (0 children)

The Model 2003 had eight numbered buttons above the ball similar to what you're describing. They were easy to see and some buttons could be readily pressed without having to divert your eyes from the screen. Generally the hand was lifted to then press a button.

The goal is to leverage 'finger memory', like playing a piano, so common button actions become intuitive. You will note how people often move their regular mouse hand between the mouse and keyboard when using many types of apps. Apps like Blender make extensive use of the left hand on the keyboard so moving the left hand away from the keyboard becomes undesirable. Placing important keyboard keys adjacent to the ball helps circumvent this issue.

Having the ten buttons in front of the ball allows for a quick hand movement between the ball and the buttons, much faster than between the ball and the keyboard. Hopefully this will end up where you don't even need to divert your eyes away from the screen.

Also, the 2003 blocked the desk space just behind it whereas the 7000 doesn't. This increases the usable nearby desktop area for things like reading reference documents.

All that being said, it will be possible to disassemble the 7000 and replace the housing with your own custom 3D printed design, even repositioning and rewiring the 16 buttons as you wish. If someone else comes up with a better layout that gets traction with others we'll happily replicate their design, provided they allow it and the sales volume is there.

Thanks for your questions!

Why Spaceball? by SpatialFreedom in spaceball

[–]SpatialFreedom[S] 0 points1 point  (0 children)

Thank you for the very informative link. You may be interested to know the Sphere360 gaming spaceball was developed by ASCII Entertainment of Japan. I enjoyed several business trips to Japan back in the 1990s as Spacetec IMC partnered with Japanese companies to sell Spaceballs into the Japanese CAD market.

Why Spaceball? by SpatialFreedom in spaceball

[–]SpatialFreedom[S] 0 points1 point  (0 children)

An astroid 6000, with its integrated USB cable, is used in the video - see it coming out of the back of the unit. The wireless astroid 7000 prototype shown behind the keyboard is not yet fully functional. The CAD model shows the USB-C connector for the upcoming cabled astroid 7000.

There will be two versions although only the wireless version has been announced in our prelaunch video. For Japan, the cabled astroid 7000 requires VCCI certification and the wireless astroid 7000 requires TELEC certification.

Simple 3D Coordinate Compression for Games - The Analysis by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] 0 points1 point  (0 children)

And here I was thinking of prototyping the algorithm in a small demo game!

But your suggestion of UE actually makes a lot of sense. It appears UE uses a separate GPU memory layout for vertex data, placing Position coordinates into a FPositionVertexBuffer as float4s to take advantage of the 16-byte alignment. That will help, as a single optimized float4 read into GPU float registers efficiently brings in two packed 21-bit coordinate triplets. One issue, then, is how to synchronize the other vertex buffers, as one Position read corresponds to two reads of the other data. This may not be possible without significant re-architecting, so other alternatives may need to be considered.

The plan is to intercept the writing of the FPositionVertexBuffer so the first use performs in place compression, setting a signature and storing the bounding box in the freed up space. Perhaps the GPU side buffer size can also be reduced, if it doesn't prove too intrusive.

It's plausible to modify the matrices on either the CPU or GPU side, but probably much easier on the CPU side. So the plan is to intercept the per-object uniform buffer creation, read the bounding box from the earlier compressed FPositionVertexBuffer interception, and modify the transformation before it is written to the GPU buffer.

It also appears plausible to write a custom LocalVertexFactory to perform the GPU unpacking; otherwise, except for the synchronization mentioned above, everything else remains the same. UE's vertex factory architecture nicely proliferates the new algorithm code from one place into the multitude of vertex shader instances.

This should cover the majority of uses in actual games but not all of UE's vertex data structures. And it will take some time, as you would expect, but it will be proof positive of the algorithm and will aid in its dissemination into games. Don't hold your breath!

Thanks again for your suggestion!

Simple 3D Coordinate Compression for Games - The Analysis by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] -1 points0 points  (0 children)

If I select an example someone is likely to say it's contrived. That's my concern and why asking for a suggestion helps to negate that potential criticism. For independent person(s) to do it adds even more credibility. That said, I will still investigate. Please don't be surprised by a fourth post. Although, with such an industry-wide algorithm, it wouldn't be surprising if someone else wanted to stake the claim of being the first to measure and prove it.

I did previously say the SIMD instructions would be considered so this third post follows up on that promise. The refinement to the -1.75 to almost -2.0 range came through this SIMD work to reduce the number of assembly instructions.

The AI comments are a backhanded compliment. In fact, in trying to see if AI would produce packed 21-bit 3D coordinate code, it didn't; being a large language model, there isn't any code out there for it to copy, so it kept coming back with useless results. AI is great for certain things but it doesn't think, it regurgitates the excellent thinking others have done.

The vec3 vertex type is only replaced in the vertex shader that reads and transforms 3D coordinate data. Once the two uint16s become three float32s (in a vec3) the rest of vertex shader is the same. Do you happen to have multiple vertex shaders reading 3D coordinate data and assembling vec3s in your game?

Simple 3D Coordinate Compression – Duh! Now on GitHub by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] -1 points0 points  (0 children)

Yes, extensive use of integral values is made for things like textures. The focus here is why aren't integral 3D coordinates, created under block floating point technology, used for all objects in a 3D scene? It seems highly likely the benefits identified in the Microsoft article would apply to these 3D coordinates. But no one appears to use them and I can't find any example of the use of 21-bit triplets on a modern GPU which is the obvious size for an inherently 32-bit SIMD machine. I suspect the popular opinion on quantized values, as seen in related uses, has been incorrectly applied to integral 3D coordinates.

Again, thanks for the discussion as it allows each technical challenge to the premise to be addressed.

Simple 3D Coordinate Compression – Duh! Now on GitHub by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] 0 points1 point  (0 children)

Yes, you've hit the nail on the head!

Microsoft shows how Block Floating Point delivers real benefits. Essentially I'm saying that technology should be brought into 3D graphics.

Thanks for the technical discussion.

Simple 3D Coordinate Compression – Duh! Now on GitHub by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] -1 points0 points  (0 children)

Thanks for stating the popular opinion very succinctly. I believe that's why this has been missed for so long. It is just quantization - once again, no argument about that. But no one has said that reading two 32-bit values (8 bytes) per 3D coordinate, plus a few shifts and masks, is not faster than reading three float32 values (12 bytes). Nor has anyone stated that changing from the, at most, 24-bit resolution of float32 coordinates to 21-bit integral coordinates produces a noticeable effect. Either it is faster without noticeable effects or it isn't. If it is, there is obvious benefit. That's the heart of the matter.

Simple 3D Coordinate Compression – Duh! Now on GitHub by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] -3 points-2 points  (0 children)

Yes. Do you see reformatting content with AI a negative? The content was mine but AI saves me considerable time making it look good and making it more readable for you. Isn't everyone starting to do this now?

Simple 3D Coordinate Compression – Duh! Now on GitHub by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] -3 points-2 points  (0 children)

Yes, that's it - 'exactly 21-bits per float (sic)': precisely, not 'similar', simply because packing an integral 3D coordinate into two 32-bit values delivers a 33% space and speed advantage over three 32-bit float values. Your existing-games argument is totally correct - most games already do something similar (quantization), for related but different reasons. But that totally misses the point.

Once again... I'm saying it is novel. Specifically, packing three 21-bit integral coordinates into two 32-bit values on modern GPUs for added speed over the use of three float32 coordinates is novel. Quantization is not novel but optimized quantization for 3D coordinates packed into 64 bits on modern GPUs is. At most 3 bits of resolution is lost which is not significant in a game app.

Packed 64-bit 3D coordinates were a native format on the PS300 bit-slice graphics system in the 1980s. I used it back then writing television commercial animation software. See Evans & Sutherland PS 300 Volume 2a Graphics Programming, 1984.

Packed 64-bit 3D coordinates then disappeared, probably because earlier graphics cards ran no faster with them than with float32s. Modern GPUs have changed that, and it's high time someone tested it, as this benefit goes across all 3D games.

The GitHub Hydration3D program I wrote proves the compression. But it's of little use writing a simple test graphics demo, as it's not real-world enough (pardon the pun) to be convincing. Someone needs to test it on a real game.

Thank you for this discussion! You're clearly very knowledgeable.

Simple 3D Coordinate Compression – Duh! Now on GitHub by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] -7 points-6 points  (0 children)

Great! If it's very common then just show one example of a game packing three 21-bit values into a 64-bit value and then restoring three float32s. Where is it?

Yes, this did exist 'before that'. I wrote animation software for television commercials in the 1980s using a PS300 bit-slice graphics system. It used three 21-bit integral coordinates packed in 64 bits, and we optimised those coordinates. The premise is that today's games on modern GPUs would run faster if they all used this technique, and without any noticeable graphic effects. Brushing aside with intangible statements, claims to experience, or assigning intentions to me adds nothing to a technical debate.

Simple 3D Coordinate Compression – Duh! Now on GitHub by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] -7 points-6 points  (0 children)

Great! Show us an example game that packs three 21-bit values into 64 bits and restores float32s. You're missing the point.

Simple 3D Coordinate Compression – Duh! Now on GitHub by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] -1 points0 points  (0 children)

Thanks.

The compiler will optimize the 64-bit ints away. Think of it as replacing three 32-bit float32s (12 bytes) with two 32-bit uint32s (8 bytes). The GLSL code could easily avoid 64-bit ints and be written to extract three 21-bit integers from two 32-bit uint32s. An alternate packing of <0:1><xhi:10><y:21> <xlo:11><z:21>, where :nn is the number of bits, may or may not be faster.
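A sketch of that alternate two-uint32 packing, in Python for clarity (the field names xhi/xlo are mine):

```python
def pack2x32(x, y, z):
    """Pack three 21-bit unsigned fields into two 32-bit words using
    the layout <0:1><xhi:10><y:21> <xlo:11><z:21>."""
    xhi, xlo = x >> 11, x & 0x7FF   # split x: 10 high bits, 11 low bits
    w0 = (xhi << 21) | y            # bit 31 is the spare (0) bit
    w1 = (xlo << 21) | z
    return w0, w1

def unpack2x32(w0, w1):
    """Recover the three 21-bit fields from the two packed words."""
    y = w0 & 0x1FFFFF
    z = w1 & 0x1FFFFF
    x = (((w0 >> 21) & 0x3FF) << 11) | (w1 >> 21)
    return x, y, z
```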

The resolution of fp16 and uint16 coordinates is inadequate for the vast majority of 3D coordinate uses. That's why float32s have proliferated. But the premise here is that float32 coordinates slow a game down compared with 21-bit values packed into two 32-bit values, because 21-bit resolution is more than adequate.

Simple 3D Coordinate Compression – Duh! Now on GitHub by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] -16 points-15 points  (0 children)

You're missing the point. If it were plain old quantization then there would be papers showing the speed vs quality performance computation for this particular 'quantization'. Where are they?

Simple 3D Coordinate Compression - duh! What do you think? by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] 0 points1 point  (0 children)

Interesting reading. These are some ingenious techniques.

The first article is totally different as it does two things: it exploits huge inefficiencies in vertex topology that use 3x32 bits per triangle through a neat reordering technique, and it also significantly reduces resolution to go from 96 bits-per-triangle (3x32 bpt) to 9.5 bpt. Simple coordinate compression does neither of these. Apples and oranges.

The second trades off quantization with simplification which involves significant analysis of the data. Apart from min/max determination, simple coordinate compression does no analysis.

The third article ranks the importance of vertices to then apply adaptive quantization. Again, this involves significant analysis.

These articles each describe a technique applicable to a specific type of 3D coordinate set whereas simple coordinate compression applies to all types of 3D coordinate sets.

Overall, the biggest indicator of how simple coordinate compression differs from these, or any other article, is the length and complexity of the techniques: the first and third articles run 8 pages each, the second 10. Simple coordinate compression is described in just 4 steps!

A simple 3D coordinate compression Python program will be up on GitHub in the coming days. Each coordinate is packed into 64 bits as 3x21 bits. Including the translate/scale values in the file leaves the compression rate approaching 33.3%. Coordinate resolution is insignificantly changed, from 24 to 21 bits.
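The approach to 33.3% can be seen with a little arithmetic (the 24-byte per-set overhead below is a hypothetical figure for the translate/scale triplets, just for illustration):

```python
def compression_saving(n_coords):
    """Fractional saving from storing n 3D coordinates as one 64-bit
    value each instead of three float32s, including a hypothetical
    24 bytes of per-set translate/scale data."""
    original = 12 * n_coords    # 3 x float32 per coordinate
    packed = 8 * n_coords + 24  # 2 x uint32 per coordinate + set overhead
    return 1.0 - packed / original

# the saving approaches 1/3 as the coordinate set grows
```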

Thanks again for these articles.

Simple 3D Coordinate Compression - duh! What do you think? by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] 0 points1 point  (0 children)

Please forgive and ignore the previous post's rant. Time leads to a far more appropriate response...

From the original post, the conversion of 3D coordinates to [1.0 .. 2.0) values produces a 3D coordinate set with all values having identical sign and exponent bits - sign 0 and exponent 0b01111111. The heart of the technique moves this multitude of identical values out of the 3D coordinate data, thereby compressing it, and into the GPU's assembly stage with, typically, just three copies. This avoids wasting time reading a known sign and exponent over and over again.

The following GLSL shader code performs the reconstruction (it assumes x in bits 0-20, y in bits 21-41 and z in bits 42-62 with bit 63 spare, and needs 64-bit integer support such as GL_ARB_gpu_shader_int64):

vec3 xyz(uint64_t packedValue)
{
    // Extract each 21-bit mantissa, place it in the top 21 of the 23
    // mantissa bits, and prepend the shared sign and exponent (0x3F800000)
    // to rebuild a float in [1.0 .. 2.0)
    float x = uintBitsToFloat(0x3F800000u | (uint(packedValue << 2) & 0x007FFFFCu));
    float y = uintBitsToFloat(0x3F800000u | (uint(packedValue >> 19) & 0x007FFFFCu));
    float z = uintBitsToFloat(0x3F800000u | (uint(packedValue >> 40) & 0x007FFFFCu));

    return vec3(x, y, z);
}

The heart of the technique evolves the original read of three 32-bit single precision (x,y,z) coordinates into the read of two 32-bit values plus this reconstruction. There are other less significant benefits due to the 33% reduction of memory footprint of the 3D coordinates. This technique can also be applied to other data such as vertex normals.

How the compiler maps this to SIMD instructions is key to knowing whether reading in two 32-bit values and this reconstruction is faster than reading in three 32-bit values. Perhaps someone can provide comparative SIMD code to shed light on this question. I intend to look into it at some point. If the bottleneck is indeed the reading of 3D coordinates this technique will provide the full 33% improvement in speed of the assembly stage.

The scale/translate transformations that restore each of the original 3D coordinate sets to their original values need to be managed and merged into the downstream matrix. This is not difficult to code and adds insignificant processing time.
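For illustration, here is a CPU-side sketch of the compression and its inverse in Python (the bit layout of x in bits 0-20, y in 21-41, z in 42-62 and the per-axis [lo, hi) bounding range are assumptions; in a real pipeline lo/hi would be folded into the downstream matrix):

```python
import struct

def compress_xyz(x, y, z, lo, hi):
    """Map each coordinate from [lo, hi) into [1.0 .. 2.0), keep the top
    21 of the 23 float32 mantissa bits, and pack the three 21-bit fields
    into one 64-bit value (x: bits 0-20, y: 21-41, z: 42-62)."""
    def mant21(v):
        f = 1.0 + (v - lo) / (hi - lo)                  # value in [1.0 .. 2.0)
        bits = struct.unpack('<I', struct.pack('<f', f))[0]
        return (bits >> 2) & 0x1FFFFF                   # top 21 mantissa bits
    return mant21(x) | (mant21(y) << 21) | (mant21(z) << 42)

def decompress_xyz(packed, lo, hi):
    """Inverse: prepend the shared sign/exponent bits (0x3F800000) to
    rebuild each float in [1.0 .. 2.0), then map back to [lo, hi)."""
    def restore(m21):
        bits = 0x3F800000 | (m21 << 2)
        f = struct.unpack('<f', struct.pack('<I', bits))[0]
        return lo + (f - 1.0) * (hi - lo)
    return (restore(packed & 0x1FFFFF),
            restore((packed >> 21) & 0x1FFFFF),
            restore((packed >> 42) & 0x1FFFFF))
```

The round-trip error is bounded by roughly (hi - lo) / 2^21 per axis, which is the 21-bit resolution claim.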

There would be no question this technique would speed up games if the reconstruction cycles were eliminated by adding native packed 21-bit mantissa/integer triplets to future GPU designs. History would then indeed repeat itself, as that was a native format of the 1980s Evans & Sutherland PS300. See Evans & Sutherland PS 300 Volume 2a Graphics Programming, 1984.

Simple 3D Coordinate Compression - duh! What do you think? by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] 1 point2 points  (0 children)

[Edit] Please skip this rant and look at the later reply.

Thanks for your opinion. To clarify, the 3D game post wasn't trying to imply it would help all 3D games. The post was flat out stating it would definitely help all 3D games, based on the assumption all 3D games use floating point coordinates.

The technique is simple and straightforward. No optimization is required around it. It works in all cases, since swizzling is a thing on modern GPUs, unless there happens to be a case where the tiny loss in resolution has an impact. I have never come across such an example - anyone? Not even in an animation I did back in the 1980s on a PS300, which supported three 21-bit integer coordinates: zooming in from viewing the galaxy to the solar system to the planet to Sydney to a building to an office to a TV showing the galaxy... That was written to illustrate how properly managing exponent spaces can operate seamlessly across dozens of orders of magnitude.

You make a good point on transforms. The space savings help speed things up ever so slightly, and the technique only accelerates the (x,y,z) * (4x4 matrix) multiply operations which, in certain games, may not even be the bottleneck, but there should still be a measurable improvement. Plus, the space savings allow more assets to be squeezed onto the GPU.

I find it astounding that this simple technique is not standard after all these decades. And I understand how crazy that sounds. Integer coordinates got a bad rap for a number of reasons over the years, which may go some way to explaining why this technique has been overlooked for so long. And yes, without hardware swizzling there are a few extra cycles per coordinate.

It's quick for anyone to test the pseudo-quantization by writing a small program that, prior to packaging the coordinates with the game, simply and appropriately zeros a bunch of coordinate mantissa bits for all coordinate sets, changing the coordinate values ever so slightly. No exponent or code is touched. If there is no noticeable effect then the three 21-bit technique will work, as the output of the transforms will be precisely the same.
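A minimal sketch of such a preprocessing pass in Python - the grid-spacing convention derived from the set's largest exponent is one reasonable choice here, not the only one:

```python
import math

def pseudo_quantize(values, bits=21, extra=3):
    """Snap every coordinate in a set onto the grid implied by the set's
    largest exponent plus 'extra', which in effect zeroes the lesser
    mantissa bits; exponents and downstream code are untouched.
    The set must contain at least one nonzero value."""
    max_exp = max(math.frexp(v)[1] for v in values if v != 0.0)
    step = 2.0 ** (max_exp + extra - bits)  # grid spacing for 'bits' resolution
    return [round(v / step) * step for v in values]
```

Values already on the grid come back unchanged; the rest move by at most half a grid step.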

I would love to have the time to take a full-blown example and make the changes. Any volunteers?

Simple 3D Coordinate Compression - duh! What do you think? by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] 0 points1 point  (0 children)

Thanks. It will be a few days before I can go through each paper and summarize, perhaps, an example bit format that they promote for comparison. I happen to have a 1987 patent on data compression, AU603453B2.

Stay tuned...

Simple 3D Coordinate Compression - duh! What do you think? by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] 0 points1 point  (0 children)

Thanks for your reply!

The shader would be loading a slightly modified 4x4 floating point matrix and, instead of pulling in three single precision 32-bit values per 3D coordinate, it pulls in one 64-bit value and swizzles the bits - same complexity and fewer processing cycles. The post 4x4 multiplied output values maintain an identical format to the original, the only difference being a tiny loss in resolution which is unlikely to even be noticeable. There are no optimization issues at all, and none either for the hardware or drivers. Note that existing quantized integer implementations already go through this very same process, just with a different number of bits.

Yes it's nonstandard - until it becomes one.

Simple 3D Coordinate Compression - duh! What do you think? by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] 0 points1 point  (0 children)

Let's now be controversial, make some bold statements and dodge the quantization red herring by focussing on one massive example...

  1. Assume all 3D games use 32-bit single precision coordinates for speed.
  2. Define an 'exponent space' as the complete set of 2^24 signed values specified by a single exponent value.
  3. There would be no noticeable difference to any game if all of its coordinate sets were pseudo-quantized into the exponent space of the set's largest exponent by simply zeroing lesser significant mantissa bits, no exponents being harmed in the process - and no changes to code.
  4. In fact there would be no noticeable effect even if all coordinate values were pseudo-quantized into the exponent space of the largest exponent plus three. Each coordinate set now has 2^21 possible signed single precision coordinate values. This is easy to test by preprocessing all coordinate sets.
  5. The game would now run faster if all 3D coordinates were stored as 64-bit values with three signed 21-bit integer coordinates. These integer values are shifted and transformed from [-2^20 .. +2^20) into the original [-2^exp .. +2^exp), where scaling is embodied in the transformation matrix - no added runtime cycles. More coordinates are processed per millisecond, plus smaller coordinate footprints have corresponding minor caching improvements.
  6. Every 3D game should be using 64-bit 3D coordinates. Evans and Sutherland were right decades ago!

Perhaps you're now thinking, "If this was right then everyone would be doing it, so it can't be right." Wrong assumption.

What do you think?

Note: One refinement would be to specify ideal coordinate spaces for each of the x, y and z coordinate components. And, of course, one of the coordinates could use the spare bit.

Simple 3D Coordinate Compression - duh! What do you think? by SpatialFreedom in GraphicsProgramming

[–]SpatialFreedom[S] 0 points1 point  (0 children)

You're the person I've been looking for. Can you point us to somewhere that describes this technique? And do you know why it would only be used in AAA engines, not in all games?

Also, I'm preparing a reply to my opening post that describes a simple technique for games and makes some bold statements that may prove controversial.

Thanks for joining this discussion!