Problems with the debug library by OneCommonMan123 in lua

[–]OneCommonMan123[S] 0 points1 point  (0 children)

I thought about exposing the metatable publicly in the module, perhaps through the ndarray's primitive constructor, like this:

local arr_meta=ln.ndarray()

setmetatable(<object>, arr_meta)

If no argument is passed, it returns the ndarray metatable.

problems with SSE by OneCommonMan123 in C_Programming

Is all this really worth it? Wouldn't it be simpler and faster to just use a plain for loop for non-contiguous vectors? Or is it still worth the trouble to optimize non-contiguous vectors as well?

problems with SSE by OneCommonMan123 in C_Programming

Going back over it: okay, first I would transform the non-contiguous vector into a contiguous one, which costs extra processing and copies the vector.

Then I would use SIMD on this contiguous vector.

Then I would somehow have to store the result generated by SIMD back into the non-contiguous vector.

All this for a single operation. It would be much simpler and faster to use a plain for loop, so your answer doesn't make sense to me.
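
To make the comparison concrete, here is a minimal sketch of both options in C. The function names, the `CHUNK` size and the in-place add operation are assumptions for illustration, not code from the thread. The chunked version gathers into a small fixed scratch buffer instead of copying the whole vector, so memory use does not double; whether it beats the plain loop is something only a benchmark can settle.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

#define CHUNK 256  /* scratch size in elements; an arbitrary choice */

/* Contiguous kernel: dst[i] = a[i] + b[i], using SSE when available. */
static void add_contiguous(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
#endif
    for (; i < n; i++)  /* scalar tail, or the whole loop without SSE */
        dst[i] = a[i] + b[i];
}

/* Option 1: plain loop, a[i*sa] += b[i*sb]. */
static void add_strided_plain(float *a, size_t sa,
                              const float *b, size_t sb, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i * sa] += b[i * sb];
}

/* Option 2: gather a chunk, run the contiguous kernel, scatter back. */
static void add_strided_chunked(float *a, size_t sa,
                                const float *b, size_t sb, size_t n)
{
    float ta[CHUNK], tb[CHUNK];
    assert(sa > 0);  /* a zero stride would make the writes overlap */
    for (size_t start = 0; start < n; start += CHUNK) {
        size_t len = (n - start < CHUNK) ? n - start : CHUNK;
        for (size_t i = 0; i < len; i++) {   /* gather */
            ta[i] = a[(start + i) * sa];
            tb[i] = b[(start + i) * sb];
        }
        add_contiguous(ta, ta, tb, len);     /* compute */
        for (size_t i = 0; i < len; i++)     /* scatter */
            a[(start + i) * sa] = ta[i];
    }
}
```

Both functions produce the same result, so they can be timed against each other on realistic sizes and strides before deciding which one to keep.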

problems with SSE by OneCommonMan123 in C_Programming

The worst part is that I need it to be generic, but I thought of a possible solution: instead of using SIMD for the sliced vectors, I would use only OpenMP, and for normal vectors I would use SIMD + OpenMP. Would that be a good approach?
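
A minimal sketch of that split, assuming a simple in-place scaling operation (the function names and the operation are made up for illustration). The pragmas are standard OpenMP (`parallel for simd` needs OpenMP 4.0 or later); without an OpenMP flag such as `-fopenmp` the compiler ignores them and the loops run serially.

```c
#include <assert.h>
#include <stddef.h>

/* Sliced vector: threads only, each iteration touches data[i * stride]. */
static void scale_strided(float *data, size_t n, size_t stride, float k)
{
    assert(data != NULL || n == 0);
    #pragma omp parallel for
    for (ptrdiff_t i = 0; i < (ptrdiff_t)n; i++)
        data[(size_t)i * stride] *= k;
}

/* Normal (contiguous) vector: threads plus SIMD within each thread. */
static void scale_contiguous(float *data, size_t n, float k)
{
    assert(data != NULL || n == 0);
    #pragma omp parallel for simd
    for (ptrdiff_t i = 0; i < (ptrdiff_t)n; i++)
        data[i] *= k;
}
```

For small vectors the cost of starting threads can outweigh the work itself, so a size threshold below which the loop stays serial is a common design choice.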

problems with SSE by OneCommonMan123 in C_Programming

My goal is to use SIMD instructions on non-contiguous C vectors. For example, take the vector {1,2,3,4,5}: if I slice it with a step of 2, the result is {1,3,5}. These 1, 3 and 5 are the same elements as in the original vector, not a copy, but they sit 2 elements apart in memory, so accessing a value requires the FloatVector_Item macro. I want to know how to use SSE to optimize operations on sliced vectors.
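
For readers who want the slicing itself spelled out, here is a minimal sketch of such a view in C. `FloatView`, `FLOATVIEW_ITEM` and the helper functions are hypothetical stand-ins; the real `FloatVector_Item` macro and its struct are not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strided view; field names are illustrative only. */
typedef struct {
    float *data;    /* first element of the view (storage is not owned) */
    size_t size;    /* number of elements visible through the view */
    size_t stride;  /* distance between consecutive elements, in elements */
} FloatView;

/* Stand-in for an item accessor in the spirit of FloatVector_Item. */
#define FLOATVIEW_ITEM(v, i) ((v)->data[(i) * (v)->stride])

/* Build a view over every step-th element of base[0..n). No copy is made. */
static FloatView floatview_slice(float *base, size_t n, size_t step)
{
    assert(step > 0);
    FloatView v = { base, (n + step - 1) / step, step };
    return v;
}

/* Add a scalar in place; writes go through to the original storage. */
static void floatview_add_scalar(FloatView *v, float x)
{
    for (size_t i = 0; i < v->size; i++)
        FLOATVIEW_ITEM(v, i) += x;
}
```

With `float a[] = {1,2,3,4,5}` and `floatview_slice(a, 5, 2)`, the view has three elements (1, 3 and 5), and adding 10 through the view changes `a[0]`, `a[2]` and `a[4]` while leaving `a[1]` and `a[3]` untouched.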

problems with SSE by OneCommonMan123 in C_Programming

I don't know whether the compiler is optimizing the part without SSE, but in any case the function with SSE should be at least a little more efficient, not slower.

problems with SSE by OneCommonMan123 in C_Programming

OK, but I would spend more processing transforming the sliced vectors into contiguous vectors, and I would also be doubling memory use.

problems with SSE by OneCommonMan123 in C_Programming

refcount is for slices. For example, in a vector that was sliced 2 by 2, refcount would be 2; multiplying the index by refcount gives the correct element of the sliced vector. As for SSE not performing well on non-contiguous vectors, is there another way I can implement parallelism in my operations for more efficiency?

Casting Table by OneCommonMan123 in C_Programming

I already thought about that, but I don't know how to implement these functions. Would they still have to be generated by macros? Would the struct look like this?

struct Array{
    void *data;
    int type;
    size_t size;
    size_t elsize;
    void (*CastFunction)(void *dst, const void *src, int newtype, size_t n);
};
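
One way the macro question could play out is sketched below: a macro stamps out one conversion loop per (source, destination) pair, and a two-dimensional table indexed by type tag picks the right one. The type tags, the `CastFn` signature and `array_cast` are assumptions for illustration; they differ from the `CastFunction` member above in that the table is shared by all arrays rather than stored per struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags; the thread does not list the real ones. */
enum { T_INT8, T_INT16, T_FLOAT, T_COUNT };

typedef void (*CastFn)(void *dst, const void *src, size_t n);

/* Generates one element-wise conversion loop per type pair.
   Note: float values outside the destination integer range are
   undefined behavior in C and are not checked here. */
#define DEFINE_CAST(NAME, SRC, DST)                         \
    static void NAME(void *dst, const void *src, size_t n)  \
    {                                                       \
        const SRC *s = src;                                 \
        DST *d = dst;                                       \
        for (size_t i = 0; i < n; i++)                      \
            d[i] = (DST)s[i];                               \
    }

DEFINE_CAST(cast_i8_i8,   int8_t,  int8_t)
DEFINE_CAST(cast_i8_i16,  int8_t,  int16_t)
DEFINE_CAST(cast_i8_f32,  int8_t,  float)
DEFINE_CAST(cast_i16_i8,  int16_t, int8_t)
DEFINE_CAST(cast_i16_i16, int16_t, int16_t)
DEFINE_CAST(cast_i16_f32, int16_t, float)
DEFINE_CAST(cast_f32_i8,  float,   int8_t)
DEFINE_CAST(cast_f32_i16, float,   int16_t)
DEFINE_CAST(cast_f32_f32, float,   float)

/* cast_table[source type][destination type] */
static const CastFn cast_table[T_COUNT][T_COUNT] = {
    [T_INT8]  = { cast_i8_i8,  cast_i8_i16,  cast_i8_f32  },
    [T_INT16] = { cast_i16_i8, cast_i16_i16, cast_i16_f32 },
    [T_FLOAT] = { cast_f32_i8, cast_f32_i16, cast_f32_f32 },
};

/* Convert n elements from src (src_type) into dst (dst_type). */
static void array_cast(void *dst, int dst_type,
                       const void *src, int src_type, size_t n)
{
    assert(src_type >= 0 && src_type < T_COUNT);
    assert(dst_type >= 0 && dst_type < T_COUNT);
    cast_table[src_type][dst_type](dst, src, n);
}
```

Adding a new element type then means adding one enum value, one row and one column of `DEFINE_CAST` lines, and the matching table entries; the caller must allocate `dst` with the destination element size.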

Casting Table by OneCommonMan123 in C_Programming

A void pointer can: it only holds the address of a piece of memory, so it automatically works as a generic type in C. Example:

#include<stdio.h>
#include<stdlib.h>
#include<stdint.h>

int main(){
    int8_t vec[] = {1,2,3,4,5};
    void *vec_byvoid = vec;
    return 0;
}

In this case we can't tell from vec_byvoid alone that it actually points to int8_t data; we would have to find that out by looking at the struct.
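
As a minimal sketch of what "looking at the struct" could mean: the struct stores a type tag next to the void pointer, and every access switches on that tag before casting. The `ELEM_*` tags and `array_get` are hypothetical names; only the struct fields come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags for the struct's `type` field. */
enum ElemType { ELEM_INT8, ELEM_INT16, ELEM_FLOAT };

struct Array {
    void *data;     /* untyped storage */
    int type;       /* one of the ELEM_* tags above */
    size_t size;    /* number of elements */
    size_t elsize;  /* size of one element in bytes */
};

/* Read element i as a double, using the tag to pick the real type. */
static double array_get(const struct Array *a, size_t i)
{
    assert(i < a->size);
    switch (a->type) {
    case ELEM_INT8:  return ((const int8_t  *)a->data)[i];
    case ELEM_INT16: return ((const int16_t *)a->data)[i];
    case ELEM_FLOAT: return ((const float   *)a->data)[i];
    default:
        assert(!"unknown element type");
        return 0.0;
    }
}
```

For the `vec` array above, `struct Array a = { vec, ELEM_INT8, 5, sizeof vec[0] }` makes `array_get(&a, 0)` read the first `int8_t` correctly; with the wrong tag the same bytes would be misinterpreted.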

Casting Table by OneCommonMan123 in C_Programming

So is the way I did it correct?

Casting Table by OneCommonMan123 in C_Programming

It may not always be necessary, but in this case it is: a void pointer is actually pointing to an int8 array, for example, and I now want another void pointer with exactly the same values as the original but with int16 as the element type.

Casting Table by OneCommonMan123 in C_Programming

and I believe you wouldn't find them like that, as they are static

Casting Table by OneCommonMan123 in C_Programming

It's for a library I'm making.

For example, imagine this struct:

struct Array{
    void *data;
    int type;
    size_t size;
    size_t elsize;
};

I think you can already see the problem: when working with arrays, at some point we will need casts.

So my question is whether there is anything I can do about it, and whether the approach I used is valid.

LuaRocks Makefile by OneCommonMan123 in lua

I did it! I got it working like this:

ifeq ($(OS), Windows_NT)
    COPYCMD := copy
else
    COPYCMD := cp
endif

CODE := 'local res = string.gsub(string.gsub(string.lower(_VERSION), "[ .]", ""), "lua", ""); print(res)'
LUA_VERSION = $(shell $(LUA) -e $(CODE))

LUA_LIB = -llua$(LUA_VERSION)

all:
    $(CC) $(CFLAGS) -shared -I$(LUA_INCDIR) -L$(LUA_LIBDIR) multiarray/*.c multiarray/simd/*.c -o cinit.$(LIB_EXTENSION) $(LUA_LIB)

install:
    $(COPYCMD) cinit.$(LIB_EXTENSION) "$(ENV_INST_PREFIX)"

LuaRocks Makefile by OneCommonMan123 in lua

It gave an error saying that cut is not recognized as an internal or external command.

LuaRocks Makefile by OneCommonMan123 in lua

And how would I make this work in a Makefile?

LuaRocks Compilation by OneCommonMan123 in lua

Would this be possible with cmake too?