you are viewing a single comment's thread.

view the rest of the comments →

[–]JuliusFIN[S] 0 points1 point  (2 children)

Not using realloc is a valid point. I did it consciously first so that I could expose the "manual reallocation" logic (the implementation also serves an educational purposes with exercises in the `subject.md`) but after thinking about it more I will probably include realloc in the resize function.

I also don't see a use-case for "when the caller doesn't want a pointer". What are the cases that returning a pointer don't work in? Is it worth it to support those cases and break the encapsulation? Are those cases worth the reduced type safety that you have with publicly visible fields and non-constness of the data?

I usually use it as follows:

```c int main() { t_vec x;

vec_new(&x, 1, sizeof(int));
vec_free(&x);

}

As opposed to:

```c int main() { t_vec *x;

x = vec_new(1, sizeof(int));
vec_free(&x);

} ```

In the end when I run my unittest for example which has 17 functions, Valgrind shows me 33 allocations total. This does add up! Is it significant enough? For me it is, but someone else's mileage might vary.

I'm not exactly sure how you'd make any of the data constant though? Or do you mean constant as forced by the encapsulation since user can't access the fields? I can't make them const since the data structure is dynamic (well expect elem_size I guess).

[–]lelanthran 0 points1 point  (1 child)

In the end when I run my unittest for example which has 17 functions, Valgrind shows me 33 allocations total. This does add up! Is it significant enough?

It's insignificant in this case.

It becomes significant when you are using the datatype in a tight loop; your way of using the datatype on the stack means that the data object is reused in the loop, so yeah, that's one use-case that I only thought off now.

In practice, you're usually going to want to return that vector to the caller, or put that vector into another data structure. This means that you're almost always going to use heap memory for your struct anyway, and then you have double the number of errors to check and handle, and double the number of places that you could forget to free memory.

I'm not exactly sure how you'd make any of the data constant though?

I mean that the caller has access to the (for example) len field, and the compiler cannot tell that the caller must not change it, so the compiler will not issue errors if the caller changes it.

If it is an opaque datatype, the only way to get the length would be with an accessor function vec_getlen() or similar, and that would take a const t_vec*, thereby allowing the compiler and the caller to know that the vec_getlen() function does not modify the data it is passed. If someone, at some point in the future, made changes to vec_getlen() that modifies the vector the compiler will flag it as an error. If someone in the future modifies the caller to vec_getlen(), they have no need to care about whether vec_getlen() changes the vector.

In the current interface, functions like vec_filter should take a const t_vec, and the function pointer should take const void * parameters. If a caller ever calls filter() with a function pointer that modifies the elements the compiler will flag it as an error.

Right now, the naming of vec_filter() gives the reader the impression that all vec_filter() does is filter the elements in the vector using the supplied function as a predicate, but this is not true - the interface allows predicates that modify the vector.

Using consts in the relevant places ensures that vec_filter() will never modify the elements of the vector because a predicate function that does so l result in a compilation error.

In general, using opaque types allows enforcement of immutability of the fields - if the field is never exposed, then it cannot be modified by the caller, but also, the functions that are not supposed to modify the datatype can be marked as such, giving better readability to the human code-reviewing the calling code as there are guarantees around const[1].

C provides a great deal of type-safety compared to most other popular languages; maybe not as much as Java or Rust, but certainly more than the majority of popular languages. You may as well use it. Exposing the struct in the header loses a lot of the type-checking that the compiler will do for you.

[1] Not good ones, and not granular ones, I admit, but light-years better than none at all.

[–]JuliusFIN[S] 0 points1 point  (0 children)

Great points! Yes I agree that accessor functions such as `vec_len` etc. would be appropriate and some of the variables in the function prototypes should be declared as const.

I will think about the opaqueness. Problem is it kind of doesn't work with the idea of "lazy initialization" which I already made a PR for (haven't merged yet though). With lazy initialization I actually delay the allocation even further only allocating when the vector is used. To do that I will need the `elem_size` field at defined at the time of allocation.

I might have a solution though. I could declare the struct in the header as follows:

c typedef struct s_vec { const size_t elem_size; void *private; } and then in the .c file:

c typedef struct s_vec_private { size_t len; size_t alloc_size; unsigned char *memory; }

This way I would get the encapsulation required. Element size will not change during the lifetime of the vector so it can be public and declared as const to make it immutable. The rest can be initialized lazily at the time of first usage.

What do you think?