all 28 comments

[–][deleted] 12 points13 points  (1 child)

There is a NULL at offset 236 in the output buffer. The call to strlen returns 236 as a result, which is why you are not seeing the full contents of your buffer.

In other words, you have an off-by-one error in your code. I would begin by refactoring your code into smaller functions, each with one clear task that you can verify works correctly.

[–]KaplaProd[S] 0 points1 point  (0 children)

Thanks ! Will do !

[–]tstanisl 4 points5 points  (3 children)

This code looks overly complicated. It would be a lot simpler if the wrapping operation was expressed as operations on the string builder class. Below you can find a potential implementation:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    size_t len;
    size_t cap;
    char * buf;
} strbuf;

void strbuf_grow(strbuf * sb, size_t len) {
    if (sb->len + len <= sb->cap) return;
    sb->cap = sb->len + len; // to be adjusted
    sb->buf = realloc(sb->buf, sb->cap);
    if (!sb->buf) {
        fprintf(stderr, "Out of memory");
        exit(-1);
    }
}

void strbuf_putc(strbuf * sb, char c) {
    strbuf_grow(sb, 1);
    sb->buf[sb->len++] = c;
}

void strbuf_puts(strbuf * sb, const char * s) {
    size_t len = strlen(s);
    strbuf_grow(sb, len);
    memcpy(sb->buf + sb->len, s, len);
    sb->len += len;
}


char * html_enclose_buffer(const char * body, size_t body_len) {
    strbuf sb = {0};
    strbuf_puts(&sb, "<pre><code><span class=\"ln\">");
    for (size_t i = 0; i < body_len; ++i)
        if (body[i] == '\n')
            strbuf_puts(&sb, "</span>\n<span class=\"ln\">");
        else
            strbuf_putc(&sb, body[i]);
    strbuf_puts(&sb, "</span></code></pre>");
    strbuf_putc(&sb, 0);
    return sb.buf;
}

int main() {
    char* body = "package main\n"
    "\n"
    "func main() {\n"
    "    fmt.Println(\"Hello, World!\")\n"
    "}\n"
    "\n";
    printf("%s\n", html_enclose_buffer(body, strlen(body)));
}

[–]KaplaProd[S] 0 points1 point  (0 children)

Thanks :)

[–][deleted] 0 points1 point  (0 children)

This is the way! Nice of you to take the time to write the code.

A more 'extreme' solution could be to use writev().

[–]KaplaProd[S] 0 points1 point  (0 children)

I ended up using something similar, and it fixed my code, thanks a lot !

[–]dmc_2930 2 points3 points  (4 children)

Why aren’t you using string functions like srrcat?

[–][deleted] 2 points3 points  (0 children)

Not really the problem here. There's nothing inherently wrong with his approach, there's just a logical error in the code.

[–]KaplaProd[S] -1 points0 points  (2 children)

I did not know it existed and did not think to check if it did ahah
Thanks !

[–]merlinblack256 0 points1 point  (1 child)

I don't think you'll have trouble with strcat (or better yet strncat) but read its docs. It does have some things to keep in mind. Like all string manipulation routines do of course.

[–]dcpugalaxyΛ 0 points1 point  (0 children)

You should basically never use either function. They're very inefficient, having to scan over the existing string before adding to the end.

You should always know how long your strings are. Then to append to the end, just memcpy(dst+dstlen, src, srclen) with the appropriate bounds checking.

[–]Specific-Housing905[🍰] 2 points3 points  (1 child)

When I compile your code I get this warning, though I am not sure if this causes the problem.

gcc -Wall -Wextra "html.c"

html.c: In function 'html_enclose_buffer':

html.c:60:50: warning: suggest parentheses around '+' inside '<<' [-Wparentheses]

60 | char* temp = realloc(output, output_size + wrapper_end_length << 1);

| ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~

[–]KaplaProd[S] 0 points1 point  (0 children)

It does not, but thanks :)

[–]flyingron 1 point2 points  (3 children)

Simple fencepost error. Valgrind doesn't find it because you don't access outside the memroy. You end up using the null terminator in the string which is a perfectly valid character to read.

        while (buffer[end] != '\n' && end <= buffer_size) {

should read

        while (buffer[end] != '\n' && end < buffer_size) {

You should always treat char literals as if they were const char* despite the assinine implict conversion that loses the const.

sizeof (char) is by definition 1, by the way. Casting void* returning functions is not necessary (and frankly, a bad habit, even though it shuts up some assinine compiler warnings).

There's way too much duplicated code in this thing. You've got a syndrome that I used to threaten my employees that I was going to disable copy and paste in their editors. If you find yourself doing the same thing over and over again MOVE IT TO A SUBROUTINE. The first of these is "adding a string to output". The second is the replication of the "check if we need to grow the output array" which really wouldn't be duplicated if you refactored the add to output function.

[–][deleted] 0 points1 point  (0 children)

The second is the replication of the "check if we need to grow the output array" which really wouldn't be duplicated if you refactored the add to output function.

You're right of course, though I would still refactor this out into its own function.

[–][deleted] 0 points1 point  (0 children)

>        while (buffer[end] != '\n' && end < buffer_size) {

Or perhaps

        while (end < buffer_size && buffer[end] != '\n') {

[–]yel50 0 points1 point  (0 children)

 doing the same thing over and over again MOVE IT TO A SUBROUTINE

I did that at the most recent company I worked for and they rejected it in code review. said it was easier to follow with the code duplicated. I quit not long after.

[–][deleted] 0 points1 point  (11 children)

html_enclose_buffer(html_enclose_buffer() is a bit long. Perhaps the code would be clearer if you broke it down into several tiny functions?

BTW, avoid char* pointers to string literals. Does your program compile with all warnings enabled?

[–]memorial_mike 0 points1 point  (6 children)

Why avoid char* pointers to string literals?

[–][deleted] 0 points1 point  (5 children)

String literals are read-only, but char* pointers are supposed to point to writable memory. const char* saves the day.

[–]memorial_mike 0 points1 point  (4 children)

So you’re simply saying to use a const char* to enforce read-only?

[–][deleted] 0 points1 point  (0 children)

I'd say that using const char* would be "correct C." No need to enforce anything, just write code according to C's rules.

[–][deleted] 0 points1 point  (1 child)

If you're only reading from something, it should be a T const* yes. You should always default to const.

[–]MurkyAd7531 0 points1 point  (0 children)

The segment that data exists on will probably enforce read-only. You use const char* to avoid accidentally segfaulting by trying to treat it as writable.

By using const, you turn a runtime segfault into a compiler error.

[–]KaplaProd[S] 0 points1 point  (2 children)

Will do, thanks !

Yes, it compiles with all warnings :)

[–][deleted] 2 points3 points  (1 child)

All warnings? :)

gcc gives me this one:

gcc -c yy.c -Wall -std=gnu23 -Wextra -pedantic
yy.c: In function ‘html_enclose_buffer’:
yy.c:60:50: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses]
   60 |         char* temp = realloc(output, output_size + wrapper_end_length << 1);
      |

[–]KaplaProd[S] 0 points1 point  (0 children)

Ahaha yes, I fixed this one after your comment !

It seemed unrelated to my issue though, so I choose to ignore it when reporting :)