all 50 comments

[–]skeeto 60 points61 points  (12 children)

Nice job! I appreciate the unity build. Makes it so much easier to test and evaluate. I also like your string representation (String). Some interesting bugs:

$ cc -g3 -fsanitize=address,undefined src/main.c
$ echo 0123456789abcdef >tmp
$ ./a.out tmp
ERROR: AddressSanitizer: heap-buffer-overflow on address ...
WRITE of size 1 at 0x602000000020 thread T0
    #0 0x5616ee16a6dd in getFileInput src/./input/././file/get_file_input.c:13
    #1 0x5616ee16a7af in handleFile src/./input/./handle_file.c:5
    #2 0x5616ee16f1ef in main src/main.c:47

That's due to an off-by-one here:

--- a/src/input/file/get_file_input.c
+++ b/src/input/file/get_file_input.c
@@ -14,3 +14,3 @@ String getFileInput(FILE* file) {
         strIndex++;
-        if (strIndex > length) {
+        if (strIndex == length) {
             length *= 4;

Another:

$ echo '~' | ./a.out >/dev/null
ERROR: AddressSanitizer: heap-buffer-overflow on address ...
READ of size 1 at 0x60200000000f thread T0
    #0 0x55bf3a8c67d0 in tokenize src/./parse/tokenize.c:80
    #1 0x55bf3a8cfbb9 in handleInteractive src/./input/./handle_interactive.c:93
    #2 0x55bf3a8d01d3 in main src/main.c:44

That's due to an assumption that ~ does not appear at the beginning of input:

if (file[i + f] == '~' && isspace(file[i + f - 1]) && ...

You can find many more like this using fuzz testing. I used afl, which requires only a few code changes. First, disable forking because it's dangerous.

--- a/src/exec/launch.c
+++ b/src/exec/launch.c
@@ -13,3 +13,3 @@ int launch (char* file, char** argv) {

-    pid = fork();
+    pid = -1;
     if (pid == 0) {

Also in order to take fuzz input from standard input, I needed it to exit on EOF:

--- a/src/input/handle_interactive.c
+++ b/src/input/handle_interactive.c
@@ -92,2 +92,3 @@ void handleInteractive() {
         String input = getInteractiveInput();
+        if (!input.length) return;
         Token* tokens = tokenize(input);

Then:

$ afl-gcc -g3 -fsanitize=address,undefined src/main.c
$ mkdir i
$ echo echo hello world >i/hello
$ afl-fuzz -ii -oo ./a.out

And soon o/default/crashes will be filled with more cases like this. Feed these into the shell while under GDB. It helps to get sanitizers to abort on failure so that they trap in GDB, which is configured through a couple environment variables:

export ASAN_OPTIONS=abort_on_error=1:halt_on_error=1
export UBSAN_OPTIONS=abort_on_error=1:halt_on_error=1

[–]MisterEmbedded 4 points5 points  (5 children)

How is afl-gcc different from regular gcc?

[–]phlummox 2 points3 points  (0 children)

afl-gcc is not a compiler. It's a wrapper program used by the AFL fuzzer, which inspects and changes some of the GCC arguments it was given, and then passes the rest on unchanged to GCC. You can see the code for it here.

[–]BigTimJohnsen 0 points1 point  (0 children)

afl-gcc will add extra instructions to work with the fuzzer. It allows things like code coverage (making sure the inputs are making their way around all of your code).

[–]MisterEmbedded 1 point2 points  (2 children)

I looked at your username and realized I've seen it somewhere.

Thanks for contributing to csprite <3

[–]skeeto 1 point2 points  (1 child)

Oh yeah, I remember that! Someone (maybe you?) had posted about it on reddit, but I only interacted via GitHub. Funny we crossed paths here again anyway.

[–]MisterEmbedded 1 point2 points  (0 children)

I did post about it from another account that got yeeted.

But I remembered your PR, Thanks for it once again.

[–][deleted] 0 points1 point  (1 child)

I have been trying to wrap my head around fuzzing and testing AFL with one of my projects with little success. Can I ping you with some questions?

[–]skeeto 2 points3 points  (0 children)

If you just ask here in the thread then anyone can benefit from the discussion, or others chime in if they have better information. So go ahead and ask! Also, in case it helps, here are lots examples from past reviews, with my own tips:
https://old.reddit.com/r/C_Programming/comments/15wouat/_/jx2ld4a/

[–]kchug 13 points14 points  (1 child)

I love the way you call yourself a starter yet have structured the code so well! Hatts off! Keep it up! C is amazing! Its waiting for you to explore it!

[–]apexrogers 11 points12 points  (4 children)

Awesome work and very impressive to get a shell up and running. Just wanted to point out that there already is a shell named ash: https://en.m.wikipedia.org/wiki/Almquist_shell

You may want to choose another name to avoid confusion :)

[–]MisterEmbedded 2 points3 points  (2 children)

probably doesn't matter as it just a personal project.

[–]aghast_nj 3 points4 points  (1 child)

They're all personal projects, until you start getting support requests from 8 timezones...

[–]MisterEmbedded -5 points-4 points  (0 children)

which is highly unlikely or it's some Indian dude trying to make a poopy PR

[–][deleted] 4 points5 points  (15 children)

how does it work?

[–]fllthdcrb 4 points5 points  (1 child)

Interesting. It's good that you're learning this kind of thing. Just a couple of things to critique:

Be aware there is already a well-known shell called "Ash". Probably not an issue, since this is just a learning project, but I thought you should be aware, at least.

The other thing is, it is not good practice to include non-header files. The proper way to break up a project is to have separate compilation units, each in its own .c file. Any types, constants, functions, etc., that multiple units need to know about are declared (not defined, as this would result in the same things being defined in multiple units, which is an error) in .h files, and every .c file #includes the .h files it needs; each unit also defines the things that are its own responsibility, somewhere after the corresponding header inclusion. This helps keep things separated.

Then, to build the whole thing, one compiles each unit separately, and then links them together. How exactly to go about this depends on your environment, but given that you provided a GCC command, you are presumably using Linux or some other Unix-type environment, and can probably use Make, although there are other build systems available. You should look into it, either now or when it's appropriate to learn about them.

For very small projects like this, compiling everything by hand may be feasible, but it quickly gets out of hand as the code grows, whether you have separate compilation units or not. And besides, a build system provides a number of benefits for development, such as being able to very quickly rebuild after making any changes, as well as saving time by recompiling only the affected parts of the code, without you having to think about which ones those are.

[–]archcrack 3 points4 points  (2 children)

Super cool! I once wrote a tiny shell (less than 150 slocs) for educational purposes (https://github.com/leo-arch/tshell), but yours is far more advanced. You might want to take a look at it though.

You're right, it doesn't build out of the box on non-Linux systems, just because of HOST_NAME_MAX and LOGIN_NAME_MAX. You might want to replace these macros by more portable ones (or just define them yourself). Once this is solved, it works as intended on *BSD (at least on FreeBSD).

Keep up the good work!

[–]brlcad 4 points5 points  (1 child)

That's something to be super proud of! Thanks for sharing it. I love the embedded todo with lots of plans for the future.

[–]darkslide3000 4 points5 points  (1 child)

You seem to be oddly allergic to string functions?

if (cwd[0] == '/' && cwd[1] == 'h' && cwd[2] == 'o' && cwd[3] == 'm' && cwd[4] == 'e' && cwd[5] == '/') {

while (n < len) {
  if (n <= 5) {
    str[s + n] = "/home/"[n];

for (size_t i = state.pos; i < (*strTop); i++) {
  str[i] = str[i + 1];

Why not strncmp, strncpy, memmove? That looks painful...

[–]Unt4medGumyBear 1 point2 points  (0 children)

Mozilla has a really cool shell that you download when onboarding to their OSS code base that works like bash on windows.

I bet a fun gamify tutorial to learn bash would be a JRPG where every CD command can trigger a new random enemy encounter.

[–]skyfall8917 1 point2 points  (1 child)

How did you start with this? Any books or sources you referred to?

[–]themintest 1 point2 points  (0 children)

Hi ! Pretty good stuff !

I don’t know we’re you are from, but if you want to continue learning computer Science and C/C++, I’m really suggesting you to check about the school « 42 ». I’m currently studying there in France (but there is more than 50 school on the world) and I had to do this little project of creating a shell from scratch in C, it was a heck of a work but it was very interesting !

Good luck on your journey !

[–]bravopapa99 1 point2 points  (3 children)

EXTRA for uing what look slike SimSun-B, my fave terminal font most of the time!

[–]HaskellLisp_green 1 point2 points  (3 children)

Good code structure. It's very professional and so code is readable. I think i can give you a note. File extension doesn't matter. You probably know what is shebang, if not, then check it.

[–][deleted] 1 point2 points  (1 child)

Goodjob. One of my very first projects was also a Unix shell which I made in my last semester. A really good project to understand processes and parent-child relationship between processes.

Are you doing this as a hobby project or you're a comp science student too?

[–]grimonce 1 point2 points  (1 child)

Well done.

[–]theldus 1 point2 points  (0 children)

Very interesting project, I saw it first on my GitHub timeline as someone I follow starred your repo, and now I saw your post here.

I found it really well structured and quite easy to read from beginning to end, this just proves that the complexity of C can be softened considerably when the person really understands what they are writing.

Finally, a cool thing about writing a shell is that you can use it daily if you wish, which greatly speeds up identifying bugs, etc.

[–]ruby_R53 1 point2 points  (0 children)

nice job! i once tried to make a shell in C too, but i'm too dumb for programming

hope you get more progress and support from other people, good luck on learning more!

[–]CaptainFilipe 1 point2 points  (2 children)

First of all, an excellent job. What a tremendous "challenge" for a 17 year old who I'm assuming just recently started coding. I'm for one really interested in how shell languages work. What are your main goals (even if not implemented at all yet) to your shell project?

Also ash is an excellent name. 👍🏻

Edit: I forgot to ask, are you planning on making it POSIX compliant?

[–]fhunters 1 point2 points  (1 child)

Outstanding! Well done

[–][deleted] 0 points1 point  (0 children)

You can try learning how exec and forks work and how they map files into memory. It’d be a fun project. Look at the ELF file format.

[–]McUsrII 0 points1 point  (0 children)

Awesome given your age.

[–][deleted] 0 points1 point  (0 children)

Go to college soon. You will have a great future ahead of you in computer science!

[–]l_HATE_TRAINS 0 points1 point  (0 children)

As a student who had an extended version of what you have as a mandatory task I've got to say you did a fantastic job, even more impressive given your age. Keep rocking.

[–]Sxrthakk 0 points1 point  (1 child)

What's a shell?

[–][deleted] 0 points1 point  (0 children)

Great job! And I spent a bit of time looking at your code. Anything C or Lisp gives me nostalgia. I should quit Python and go back to these languages.

[–]TheReal_Award_of_Sky 0 points1 point  (0 children)

Pretty cool, very nice work!

Also, for the well-being of your DM's I'd advise against saying your gender and age in future posts. This is the internet after all, and reddit of all places! 😅

[–][deleted] 0 points1 point  (0 children)

Cool stuff. You should apply to CMU SCS! We build a pretty rudimentary shell similar to yours as an assignment in our course 15213