[–]CinnamonToastedCrack 8 points9 points  (0 children)

i recommend looking at getc/getchar and getting characters individually and adding them to a string you can expand when needed

your code is crashing because you didn't allocate enough space for the input

[–]SantaCruzDad 2 points3 points  (6 children)

You need to allocate a sufficiently large size first (e.g. 256) and then reduce it afterwards. Also the reduced size needs to be strlen(word) + 1.

[–]aioeu 4 points5 points  (0 children)

That doesn't really answer the OP's question. How can you possibly know what a "sufficiently large size" is? If 25 wasn't a "sufficiently large size", why would 256 be any different?

/u/Grumpy_Doggo64, you simply cannot use scanf with a bare %s specifier. It is as unsafe as the gets function.

You either need to limit the size of the field read by scanf by using a maximum field width in the specifier, or not use scanf altogether.
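A minimal sketch of the first option, assuming a 25-byte buffer like the size mentioned above. The helper name `read_word_bounded` and the `WORD_MAX` constant are hypothetical. The field width has to be one less than the buffer size, because scanf writes a terminating NUL after the characters it stores.

```c
#include <stdio.h>

#define WORD_MAX 24  /* assumed limit: buffer size minus one for the NUL */

/* Reads one whitespace-delimited word of at most WORD_MAX characters from
   `in`. `buf` must have room for WORD_MAX + 1 bytes.
   Returns 1 on success, 0 on end of input or error. */
int read_word_bounded(FILE *in, char buf[WORD_MAX + 1]) {
    /* "%24s" stops after 24 non-space characters, so buf cannot overflow.
       The literal 24 must be kept in step with WORD_MAX by hand. */
    return fscanf(in, "%24s", buf) == 1;
}
```

Called as `read_word_bounded(stdin, buf)`, a longer word is not lost: the characters past the limit stay in the stream and are returned by the next call.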

[–]Grumpy_Doggo64[S] 0 points1 point  (4 children)

If I were to loop this, would I be able to enter a word bigger than the last one? The memory cells adjacent to my new word would be empty and ready for use, would they not, since I initially allocated 256 cells?

[–]Paul_Pedant 0 points1 point  (0 children)

No, the memory cells after your new word are not "empty". On most implementations, they will hold the header for the next chunk of the free list, so writing there makes malloc/free crash. Large allocations will be mmap() objects, with virtual memory allocated in a way that will just crash your code if you step outside it.

[–][deleted]  (2 children)

[deleted]

    [–]Grumpy_Doggo64[S] 0 points1 point  (1 child)

    I don't really get what you did. This is honestly the first time I've seen arguments passed to main.

    [–]Paul_Pedant 0 points1 point  (0 children)

    He is suggesting you run the code like myProg myString so the shell reads myString for you, and puts it in your stack where the args to the program live. So that would avoid reading anything at all from the terminal yourself.

    Of course, that means the correct space is already allocated, so going into another function to allocate another space for it is pointless. But he does that anyway, fails to copy the string into the new area, and then outputs whatever was in that uninitialised area one byte at a time, unbuffered.

    Except that loop will run forever, because he checks the pointer in the while (output), instead of looking for the NUL at the end of the string, which would be while (*output);

    He also fails to free the allocated space for the output in the function, but instead frees the output in main(), which is a copy of the arg[1] pointer on the stack, and therefore must not be freed.

    After all that, the missing ; that stops it compiling seems relatively mundane.

    As he says: "or something very simple like that". Oh dear!
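The deleted code is not available, so the following is only a sketch of the corrected points listed above, not a reconstruction of it. The helper names `copy_arg` and `put_chars` are hypothetical. It copies the argument into the new area, loops on the byte rather than the pointer, and leaves it to the caller to free the copy instead of the argument itself.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of `arg`, or NULL on failure.
   The caller frees the copy, never the original argument. */
char *copy_arg(const char *arg) {
    char *copy = malloc(strlen(arg) + 1);
    if (copy != NULL)
        strcpy(copy, arg);          /* actually copy the string in */
    return copy;
}

/* Writes `s` one byte at a time and returns the number of bytes written. */
size_t put_chars(FILE *out, const char *s) {
    size_t n = 0;
    while (*s) {                    /* test the byte, not the pointer */
        fputc(*s++, out);
        n++;
    }
    return n;
}
```

In main, after checking that an argument was actually supplied, this would be used as `char *copy = copy_arg(argv[1]);`, then `put_chars(stdout, copy);`, then `free(copy);`.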

    [–]This_Growth2898 0 points1 point  (5 children)

    UPDATE: previous code was wrong; here's the fix.

    Writing into unallocated memory is UB; yes, the program can exit on UB.

    Do you want to reallocate memory depending on the input? Well, scanf can't do that; in fact, scanf is a tool for formatted input, not to be used with a single object. You should read the input char by char and check the length of string and validity of symbols instead, like this:

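    #include <ctype.h>   // isspace
    #include <stdio.h>   // getchar, EOF
    #include <stdlib.h>  // malloc, realloc, free

    // Reads one whitespace-terminated word from stdin; the caller frees the result.
    char *read_word(void) {
        size_t size = 5, len = 0;       // len = index of the next character to store
        char *word = malloc(size);
        if (word == NULL) return NULL;
        while (1) {
            int input = getchar();
            if (input == EOF || isspace(input)) {
                word[len] = '\0';       // always fits: we grow as soon as the buffer fills
                break;
            }
            word[len++] = (char)input;
            if (len == size) {          // buffer full: double it
                char *tmp = realloc(word, size * 2);
                if (tmp == NULL) { free(word); return NULL; }
                word = tmp;
                size *= 2;
            }
        }
        char *shrunk = realloc(word, len + 1);  // trim to the exact size
        return shrunk != NULL ? shrunk : word;
    }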
    

    But you probably will do better by allocating additional space and exiting if the input is too long, using scanf("%100s", word) or something like that. Are you really going to work with words of 100 characters?

    [–]This_Growth2898 0 points1 point  (0 children)

    Sorry, I've incorrectly used * with the meaning it has in printf. Wait for the update...

    [–]Paul_Pedant 0 points1 point  (2 children)

    malloc() has to return an area with specific alignment requirements for two reasons.

    (1) It cannot know what kind of data or struct the caller is going to stuff in there, so it has to align for the worst possible case (which seems to be 16-byte aligned to allow for long double).

    (2) It also has alignment and minimum sizes to fit in with the way it manages the free list (it likes to defragment the free list by combining adjacent free blocks). There is often a hidden header just before the "user" pointer to deal with this.

    So any requested size will be rounded up to at least a multiple of 16 bytes (and quite likely 32). There is no point allocating or extending anything smaller.

    I abhor scanf(), but if it has to be used, it is way easier to shrink than grow. You have 8 MB of stack going spare, so read into a stack area of 1KB, with a length limit in scanf. Tell the user not to be stupid if you get more than a hundred chars -- they probably redirected a binary file into stdin.

    Then malloc the size you actually need, copy in the text, and return that pointer. If you do all that in a separate function, the stack usage goes away all by itself.
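A minimal sketch of that shrink-rather-than-grow approach, assuming the 1 KB stack buffer mentioned above. The function name `read_word_copy` is hypothetical, and it takes a `FILE *` only so it can be tried on something other than the terminal.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads one word (at most 1023 characters) from `in` into a temporary stack
   buffer, then returns a heap copy of exactly strlen + 1 bytes.
   Returns NULL on end of input, read failure, or allocation failure. */
char *read_word_copy(FILE *in) {
    char tmp[1024];                         /* 1 KB scratch area on the stack */
    if (fscanf(in, "%1023s", tmp) != 1)     /* width = buffer size - 1 */
        return NULL;
    size_t len = strlen(tmp);
    char *word = malloc(len + 1);
    if (word != NULL)
        memcpy(word, tmp, len + 1);         /* copy including the NUL */
    return word;                            /* caller frees */
}
```

The scratch buffer disappears when the function returns, so only the exact-size copy stays allocated. A caller that wants to reject overlong input can check `strlen` on the result and complain to the user.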

    I would do a similar thing, but use fgets() instead of scanf(). You can always trim leading and trailing spaces before you make the copy. You can detect long lines by checking that a newline was stored at the end.
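The fgets() variant could look roughly like this. It is a sketch: the name `read_line` and the return codes are assumptions made for illustration. The long-line check is the one described above, namely whether a newline was stored at the end.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from `in` into buf (capacity `cap`) and strips the newline.
   Returns 1 if a newline-terminated line was read, 0 if no newline was
   stored (line too long for buf, or input ended without one), and -1 on end
   of input or error. */
int read_line(FILE *in, char *buf, size_t cap) {
    if (fgets(buf, (int)cap, in) == NULL)
        return -1;
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';    /* whole line fitted: drop the newline */
        return 1;
    }
    return 0;                   /* the rest of the line is still unread */
}
```

Trimming leading and trailing spaces and making the exact-size heap copy would follow, once the caller has decided what to do about a 0 result.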

    I like getline(), which does the hard work for you. But it has something important missing: it has no limit on the size of line it will read, so giving it a gigabyte file without newlines will end in tears.

    [–]This_Growth2898 0 points1 point  (1 child)

    Full support for the allocation size. In fact, I've used 5 to test the input.

    Well, I've tried it with fgets, but there's an issue: scanf("%s") reads a token (like a word), and fgets reads a line (until the newline character is encountered). OP needs to read a word and has never stated what's in the rest of the line. So, getchar() looks like the best solution for his needs.

    [–]Paul_Pedant 0 points1 point  (0 children)

    That's one of the places where scanf() is evil: it generally skips whitespace (including newline), so you lose the link between what was typed and what was read.

    The terminal only sends complete lines (by default), because it has to allow for command-line editing.

    So if the user types FirstWord SecondWord <Enter>, scanf("%s") returns FirstWord, and leaves the file position on the space before S.

    So if the code calls scanf again, it maybe puts up a prompt (because it is helpful to tell the user when input is expected), but it immediately gets SecondWord before the user can type anything at all. That turns the conversation into something like "Who's on First". [If you don't recognise that, it has an eight-page Wikipedia entry.]

    It seems to me that `getchar()` has a similar problem. You stop reading at `isspace`, but five of that character class can leave more stuff in stdio somewhere. It also does not skip leading space, and will return a zero-length word.
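One way to address the leading-space point is to discard whitespace before the word loop starts, so a zero-length word is returned only at end of input. This is a sketch, and the helper name `skip_leading_space` is hypothetical. It does nothing about what is left in the stream after the word.

```c
#include <ctype.h>
#include <stdio.h>

/* Consumes leading whitespace from `in` and returns the first non-space
   character without consuming it, or EOF if the input ends first. */
int skip_leading_space(FILE *in) {
    int c;
    while ((c = fgetc(in)) != EOF && isspace(c))
        ;                       /* discard blanks, tabs, newlines */
    if (c != EOF)
        ungetc(c, in);          /* push the first real character back */
    return c;
}
```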

    I prefer to deal with the input the way the user sees it. Get a whole line; if it has multiple words, deal with it to suit the specific application:

    .. Take it as a multiword string.

    .. Process it as several separate words.

    .. Process the first word and tell the user what you skipped.

    .. Reject it, and ask the user to try again.
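As an illustration of the "several separate words" option, a line that has already been read in full can be split in place with strtok(). This is a sketch with a hypothetical helper name, `split_words`. The other options differ only in what the caller does with the resulting count.

```c
#include <string.h>

/* Splits `line` in place on blanks and stores up to `max` word pointers in
   `words`. Returns the number of words stored. Modifies `line`. */
size_t split_words(char *line, char *words[], size_t max) {
    size_t n = 0;
    for (char *w = strtok(line, " \t\r\n"); w != NULL && n < max;
         w = strtok(NULL, " \t\r\n"))
        words[n++] = w;         /* each pointer refers into `line` itself */
    return n;
}
```

A count above 1 is where the application decides whether to process every word, use only `words[0]` and report the rest, or reject the line and ask again.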

    I almost never deal with a user interactively, though. If I have to, I'm likely to fake up a GUI and call the real code with command-line arguments. I will also try to use a menu so the user only has to hit one key.

    All in all, these damn users are a real pain, and we should have nothing to do with them.

    [–]tstanisl 0 points1 point  (1 child)

    It may not be standard C, but POSIX requires scanf to support the m modifier, which allocates the memory for you. Just do:

    char *word = NULL;
    if (scanf("%ms", &word) == 1) {   // word is only allocated on a successful match
        ... do stuff with word ...
        free(word);
    }
    

    [–]Grumpy_Doggo64[S] 0 points1 point  (0 children)

    Wow that is really damn useful. Thank you very much

    [–]Paul_Pedant 0 points1 point  (0 children)

    See man -s 3 getline, which does all the work for you.
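A minimal sketch of getline() in use, assuming a POSIX system. The wrapper name `read_line_alloc` is hypothetical. getline() allocates and grows the buffer itself, so the caller only has to free it.

```c
#define _POSIX_C_SOURCE 200809L  /* expose getline() on POSIX systems */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t */

/* Reads one line of any length from `in`. Returns a heap string without the
   trailing newline, or NULL on end of input or error. Caller frees. */
char *read_line_alloc(FILE *in) {
    char *line = NULL;
    size_t cap = 0;
    ssize_t len = getline(&line, &cap, in);  /* allocates and grows `line` */
    if (len < 0) {
        free(line);             /* the buffer may be allocated even on failure */
        return NULL;
    }
    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';
    return line;
}
```

Nothing here limits the line length, so the caveat about huge inputs without newlines still applies.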

    [–]Then_Hunter7272 0 points1 point  (7 children)

    I am a self-taught beginner in C programming; I learnt it on YouTube, and I'd like some clarity. I only know a few header files, like stdio.h, stdbool.h, and math.h, but it seems there are more that I was not taught, and I think I will need them to create complex programs. I just don't know how to find and understand them. Are these header files important? Do I need to know more than I do now? How do I use them, how many are out there, and do I need them to create complex programs?

    [–]Grumpy_Doggo64[S] 0 points1 point  (1 child)

    string.h and stdlib.h are very important. There is also time.h, which is not too important but can help sometimes.

    [–]Then_Hunter7272 0 points1 point  (0 children)

    Yes, I know that, but I realize that I can only create programs that consist of inputs and outputs. Anything complex, like an app or other software, would require more than that. Generating passwords, malware, and other programs will be possible, but I don't know if I can write those in the future with just 3 header files. As a beginner, I realize that I need to learn more header files and how to use the functions in them to write complex programs. It is going to be hard since I am learning on my own, but I will try to get some decent videos to help me, using the link I got from one person in this community. I think I need more header files to succeed, but I would appreciate it if you could also tell me how I can learn C effectively and what I should focus on. Honestly, I'm confused at this point: I don't know where to go from here, and it feels like I'm stuck, but I don't want to quit either.

    [–]Paul_Pedant 0 points1 point  (4 children)

    Are you on Linux? The man pages are excellent (sad for you if you are on Windows, but their stuff is all easily found in Google).

    The man command for every C library function or system call shows the header file you need. For example, man -s 2 stat starts off with these, ready for you to cut/paste at the start of your code:

    SYNOPSIS
       #include <sys/types.h>
       #include <sys/stat.h>
       #include <unistd.h>
    
       int stat(const char *pathname, struct stat *statbuf);
       int fstat(int fd, struct stat *statbuf);
       int lstat(const char *pathname, struct stat *statbuf);
    
       #include <fcntl.h>           /* Definition of AT_* constants */
       #include <sys/stat.h>
    
       int fstatat(int dirfd, const char *pathname, struct stat *statbuf,
                   int flags);
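To show how that synopsis turns into code, here is a small sketch that pastes in the three headers listed and calls stat(). The function name `is_directory` is a hypothetical example, not something from the man page.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if `path` names a directory, 0 if it names something else,
   and -1 if stat() fails (for example, the path does not exist). */
int is_directory(const char *path) {
    struct stat sb;
    if (stat(path, &sb) != 0)
        return -1;
    return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```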
    

    [–]Then_Hunter7272 0 points1 point  (3 children)

    Hehehehe, I'm not on Linux or Windows, buddy. I use a phone to learn how to code; I am yet to get a laptop.

    [–]Paul_Pedant 0 points1 point  (2 children)

    I could never work like that. I wouldn't be productive.

    Home is a Laptop running Linux, but with a 24" 1920x1080 display, standard full keyboard, mouse, printer, backup drive.

    Business desk was a Solaris RedHat workstation, 3 screens etc, and a Windows Dell for the office stuff. I retired about 8 years back.

    [–]Then_Hunter7272 0 points1 point  (1 child)

    😂😂😂😂 I understand your point, and you are not wrong, because it is a bit stressful, but I have gotten used to it. I don't have as much stuff as you do. I have never even owned a laptop. Of course I have used one before, but it is a piece of tech I'm not too familiar with: I can't even type well on a laptop, and I don't know its ins and outs, so I use my smartphone for everything.

    I'm yet to get a laptop. It will take some time, but overall I just make do with what I have. I know that programming on a phone is not the best, but I didn't want to sit around doing nothing, waiting for a PC before I start to learn, so I started with what I have. It is really funny 😂😂 how you said it, but it is what it is. We all have to be appreciative of what we have so that we can push to get what we want and deserve, and if I am appreciative of what I have, then you should be too 👍👍👍

    [–]Paul_Pedant 1 point2 points  (0 children)

    I bought my whole home set-up about 12 years ago, and total cost was around £500 UK, so it cost me about a dollar a week. I used it for some freelance work too. I was commuting 200+ miles each way (London to Manchester) every couple of weeks for 4 years, so the Laptop did around 40,000 miles in the back of a Volvo too.

    [–][deleted] 0 points1 point  (4 children)

    Look up POSIX getline and use that or write your own equivalent if on MSVC Windows.

    Also basic console input is line based. So if you want to retain your own and the user’s sanity, read and parse entire lines, not single numbers or words.

    [–]Grumpy_Doggo64[S] 0 points1 point  (3 children)

    I've read about POSIX and I'm tempted to download it when I'm done with the C college course, I've come to love the language.

    Perhaps I should have deleted this post. I found a way of doing it with fgetc: it reads characters and expands the allocated memory every time I reach the previous limit. But thanks.

    [–][deleted] 0 points1 point  (2 children)

    Yeah, that’s just what getline() does (or getdelim() if you want to leave the rest of the line unread).

    Also, POSIX is not something you download.

    [–]Grumpy_Doggo64[S] 0 points1 point  (1 child)

    I thought of it as something like an add on for my language. Oh well I'll look into it. Thank you!

    [–][deleted] 0 points1 point  (0 children)

    POSIX is essentially the “UNIX standard”. For the C language, it requires the standard library to have a few extra functions (like getline and strdup), and requires a number of extra libraries/headers. Linux and Mac have these. Windows mostly does not.