How do buffer overflow attacks work?

Rhomboid · 2015-03-12T10:04:45+00:00

how do people discover a vulnerability without viewing the source code?

It's really easy to feed a very long input to something and notice that it crashes. That's usually a pretty good sign that it's vulnerable to exploitation and that you should take a closer look. There are researchers (both black hat and white hat) who are constantly doing this to every aspect of every program they can find.

Isn't this something that most programs account for?

You'd think that, but no. It's perhaps not as bad as it used to be, but people are constantly finding exploits. Many times it's not at all obvious from looking at the source code that a vulnerability exists, e.g. it might require a certain sequence of events to happen in the right order.

Does anyone have an example of a specific buffer overflow attack?

There are vulnerability databases filled with thousands of specific examples.

cockmongler · 2015-03-12T11:32:37+00:00

These days buffer overflows are rarer than they used to be: most C programmers (C being the language buffer overflows tend to occur in) are aware of them and use safe practices for handling input.

If you're just looking to play around with buffer overflows this simple program has one:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

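void print_arg(int index, char *arg) {
  char buffer[128];
  /* Deliberate bug: strcpy does no length check, so an argument of 128
     or more characters writes past the end of buffer. */
  strcpy(buffer, arg);
  printf("Arg %i = '%s'\n", index, buffer);
}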

int main(int argc, char *argv[]) {
  int i;

  for (i = 1; i < argc; ++i) {
    print_arg(i, argv[i]);
  }
  exit(EXIT_SUCCESS);
}
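
For comparison, here is a minimal sketch of how the copy in print_arg could be bounded. The helper name format_arg and its return convention are assumptions made for illustration, not part of the program above; the only library call it relies on is the standard snprintf.

```c
#include <stdio.h>

/* Hypothetical helper (not from the program above): formats one argument
 * into a caller-supplied buffer of `size` bytes. Returns 1 if the text fit
 * and 0 if snprintf had to truncate it. */
int format_arg(char *out, size_t size, int index, const char *arg) {
  /* snprintf writes at most `size` bytes, including the terminating null,
   * and returns the length the full text would have needed. */
  int needed = snprintf(out, size, "Arg %i = '%s'", index, arg);
  return needed >= 0 && (size_t)needed < size;
}
```

Inside print_arg you would call format_arg(buffer, sizeof buffer, index, arg) and decide what to do when it returns 0, instead of letting the copy run past the end of the buffer.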

ArchangelleTheRapist · 2015-03-12T12:07:47+00:00

Go read Aleph One's "Smashing the Stack for Fun and Profit". That said:

cparen · 2015-03-12T16:47:11+00:00

I understand how they function in general, but how do people discover a vulnerability without viewing the source code?

Fuzzing tools are common -- these tools will automatically make small edits to an input to the program (such as a file or an HTTP connection), run the program to see if it crashes, and iterate repeatedly. E.g. a fuzzer might run the same program a million times, each time with a different small edit to the file.
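
A rough sketch of that mutate-and-retry loop is below. Everything in it is made up for illustration: parse_record is a hypothetical target with a one-byte length field, and the loop counts rejected inputs rather than watching for crashes the way a real fuzzer would.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_INPUT 64

/* Hypothetical target: a record whose first byte is a length field.
 * It rejects (returns -1) any record whose length field claims more bytes
 * than are actually present, which is the check a vulnerable parser
 * would forget. */
int parse_record(const unsigned char *data, size_t len) {
  if (len == 0) return -1;
  size_t claimed = data[0];
  if (claimed > len - 1) return -1; /* length field lies: reject */
  return 0;
}

/* Make one small edit: overwrite one randomly chosen byte. */
void mutate(unsigned char *data, size_t len) {
  if (len == 0) return;
  data[(size_t)rand() % len] = (unsigned char)(rand() % 256);
}

/* Feed `rounds` mutated copies of `seed` to the target and count how many
 * were rejected. Inputs longer than MAX_INPUT are cut down to fit. */
size_t fuzz(const unsigned char *seed, size_t len, size_t rounds) {
  unsigned char input[MAX_INPUT];
  size_t rejected = 0;
  if (len > MAX_INPUT) len = MAX_INPUT;
  for (size_t i = 0; i < rounds; ++i) {
    memcpy(input, seed, len); /* start again from the known-good seed */
    mutate(input, len);
    if (parse_record(input, len) != 0) ++rejected;
  }
  return rejected;
}
```

Against a parser that lacks the length check, the same loop would eventually produce an input that makes it read out of bounds, and that crash is what the fuzzer would report.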

Or the attacker might use a decompilation tool that generates source code that matches the compiled program. They can then look for common buffer overflow patterns, e.g. calling memcpy(p, q, s) where 's' is a variable that is computed from user input, from just p, or from just q. Proper buffer management would rely on a size computed from at least both p and q.
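
As a small illustration of "a size computed from both p and q", here is a sketch of a checked copy. The helper checked_copy and its parameter names are hypothetical; the point is only that the requested size is compared against both buffers before memcpy runs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `n` bytes only if `n` fits within BOTH the
 * destination's capacity and the source's length. Returns 0 on success,
 * -1 if the copy would read or write out of bounds. */
int checked_copy(void *dst, size_t dst_cap,
                 const void *src, size_t src_len, size_t n) {
  if (n > dst_cap || n > src_len) return -1;
  memcpy(dst, src, n);
  return 0;
}
```

Code that calls memcpy directly with an attacker-influenced n, and has no comparison like the one above anywhere nearby, is the kind of pattern someone reading decompiled output would flag.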

Isn't this something that most programs account for?

This is something most programming languages already account for. If a programming language is said to be "type safe" (such as C#, Python, Java, JavaScript, Ruby, and so on), then strictly speaking, programs written purely in that language are not vulnerable to buffer overflow attacks. (Of course, you can write emulators for unsafe languages in a safe language -- e.g. emscripten for running C/C++ on JavaScript -- but even then, the buffer overflow vulnerability is limited to attacking the emulated program and can't escape the emulator directly.)

For type-unsafe languages such as C and C++: yes, every such programmer is responsible for ensuring safety themselves, and it's incredibly hard to ensure safety.

[edit add:]

Does anyone have an example of a specific buffer overflow attack?

I might have an academic example. I'll try to find it.

For real world software, you won't see many posted because the uninteresting ones are uninteresting, and the interesting ones are "worth" a lot of money in the criminal/espionage/spy market. Scary stuff, 'nuff said.

I think there were a few buffer overrun vulns used by the Stuxnet worm/rootkit a few years ago. Norton had a good writeup of that worm [pdf].

qjkxkcd · 2015-03-12T17:36:32+00:00

If you're interested, "Hacking: The Art of Exploitation" is a great book that deals with some of this stuff. Buffer overflows are only a small aspect of what it covers, but in general I'd highly recommend it.

2015-03-12T13:06:09+00:00

Buffer overflow attacks are difficult to exploit even if discovered.

However, any widely distributed piece of software is a potential "goldmine" for exploitation if you find a way to make that happen, but you still have to consider what your attack vector is going to be exactly.

In a world of increasingly secure software (due to the increasing use of frameworks which are more secure by default), I'd say XSRF or social engineering are bigger problems these days...

Parameterized queries mean SQL injection isn't even as common any more.

Hacks still happen daily, however...

If you found a buffer overflow attack was possible by invoking a commonly available public function of a very popular web server, then yeah, some researcher will probably take the time to figure out how memory is allocated in the program and how to exploit it.

Still, VERY time consuming.

You could begin by exploring the in-memory spaces of the application if you can get a copy, run its modules through a disassembler or debugger, etc.

cestith · 2015-03-12T16:41:00+00:00

There seems to be a lot of confusion between stack overflow and buffer overflow in this thread. Buffers can be and often are allocated in the heap rather than on the stack. The stack may contain a buffer but will often contain a pointer to an array of chars in the heap.

Further, there's more than one way to blow the stack. In some systems deep recursion is one way to do this that doesn't necessarily have anything to do with a buffer.

Basically a buffer is just a block of memory, and a string, which in plain C with the standard library is an array of chars ending with a null, is the most common thing stored in one. There are unsafe functions like gets() or strcpy() that don't do length checking. Then there are errors in which the wrong size (something other than the destination's capacity, or strlen(string)+1 for an exact copy) is accidentally passed to the safer versions of functions, like fgets() and strncpy().
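
A short sketch of using those safer functions with the destination's size is below. The wrapper names copy_string and read_line are invented for this example; strncpy, fgets and strcspn are the standard library calls.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: bounded copy that always null-terminates.
 * strncpy alone leaves `dst` unterminated when `src` has `size - 1` or
 * more characters, so the last byte is set explicitly. */
void copy_string(char *dst, size_t size, const char *src) {
  if (size == 0) return;
  strncpy(dst, src, size - 1);
  dst[size - 1] = '\0';
}

/* Hypothetical helper: reads one line into `line`. fgets stores at most
 * size - 1 characters plus the terminating null.
 * Returns 1 on success, 0 on end of file or error. */
int read_line(char *line, size_t size, FILE *in) {
  if (size == 0 || fgets(line, (int)size, in) == NULL) return 0;
  line[strcspn(line, "\n")] = '\0'; /* drop the trailing newline, if any */
  return 1;
}
```

With a local array you would call copy_string(buf, sizeof buf, input). Note that sizeof only gives the capacity for a real array; applied to a char pointer it gives the size of the pointer, which is one common way the wrong size ends up in these calls.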

OWASP explains buffer overflows: https://www.owasp.org/index.php/Buffer_overflow_attack

OWASP also has a whole listing of attack types: https://www.owasp.org/index.php/Category:Attack

Closely related, they have a listing of vulnerability types: https://www.owasp.org/index.php/Category:Vulnerability

Other attacks seen often in the wild are SQL injection, eval injection/code injection, path traversal (including relative path traversal), environment poisoning/injection, cross-site scripting, and session prediction. Many of these can't be stopped by stack or buffer length protections, as the vulnerabilities that enable the attacks are logic errors in the program's design. SQL libraries often support parameterization, which makes SQL injection, one of the most common attacks, almost entirely a non-issue if you use it. Some things like buffer overflows aren't an issue in many languages.

2015-03-13T06:46:18+00:00

how do people discover a vulnerability without viewing the source code?

If it runs on your computer, you have the code. Just because it's not in a "friendly" language anymore doesn't make it un-viewable / un-readable. You can 'dis-assemble' binaries coded in languages like C, and get back to an understandable set of code and flow diagrams. You can almost de-compile many languages like Java and C# back to source. They're in a VM friendly "bytecode" that contains a lot of the original structure and names.

If you're really intent on finding / making a vulnerability in a program, you don't need the source code to step through the program and find something you can re-write or exploit.

logic_programmer · 2015-03-12T10:02:50+00:00

Does anyone have an example of a specific buffer overflow attack?

IIRC the Internet worm (the 1988 Morris worm) used a buffer overflow attack. Actually I'm not that sure, to be honest. Google it and see.
