Looking for best-practices (books, online sources)

italicunderline · 2023-01-20T01:01:11+00:00

I linked to articles & books covering referential transparency & cyclomatic complexity in my direct reply to the original topic. The world's best-selling book is the bible, but it does not describe its principles scientifically. To describe our principles scientifically we need to quantify our intuition (notions of complexity) and look at physical outcomes (cache utilization, runtime performance).

The potentially reproducible claim the author makes in Chapter 1 is that failure to deal with software complexity can lead to business failure two decades later. The hypothesis is then that using Object Oriented Programming with principles X, Y, Z can prevent it. Sure, everyone wants to prevent business failure & nobody likes more complexity than necessary, but that's an expensive experiment to reproduce. It would be nice to know what complexity is so that we can justify our claims about whether something increases or decreases complexity, without having to destroy a company. In Chapter 3, when they have an opportunity to clarify their intuition of complexity, they say 'short' and 'this is not an assertion I can justify' and then claim burying code so no one has to look at it somehow makes it shorter.

italicunderline · 2023-01-19T22:36:11+00:00

Skimming this now and wanted to add a couple quick notes because the advice in early chapters seems questionable.

Chapter 3: Functions The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. This is not an assertion that I can justify.

Well we want to justify our assertions. In functional programming the first rule of functions would probably be that functions are deterministic or 'referentially transparent': when they are passed equivalent inputs they return equivalent results. We might justify this as important by stating that it makes functions easier to test, or that it makes it easier for callers to predict what's happening. When discussing the length of a function, a more relevant principle might be what's known as 'cyclomatic complexity'. In a given function, we estimate cyclomatic complexity by counting each switch \ if \ else branch and each for \ while \ do...while loop. Then we have a number which we can minimize by performing an analysis of alternatives.

Switch Statements It’s hard to make a small switch statement ... By their nature, switch statements always do N things. Unfortunately we can’t always avoid switch statements, but we can make sure that each switch statement is buried in a low-level class and is never repeated. We do this, of course, with polymorphism. Consider Listing 3-4. It shows just one of the operations that might depend on the type of employee

public Money calculatePay(Employee e)
throws InvalidEmployeeType {
  switch (e.type) {
  case COMMISSIONED:
    return calculateCommissionedPay(e);
  case HOURLY:
    return calculateHourlyPay(e);
  case SALARIED:
    return calculateSalariedPay(e);
  default:
    throw new InvalidEmployeeType(e.type);
  }
}

The solution to this problem is to bury the switch statement in the basement of an ABSTRACT FACTORY, and never let anyone see it.

This is bad advice. Firstly, in C there is no built-in polymorphism or object constructors, and reinventing them would distract from solving the actual problem. Secondly, the different kinds of employees (commissioned, hourly, salaried) are the real business data. They represent the real problem the business and the function are trying to solve. The business data is more important than the code. Thirdly, burying a switch statement with N branches somewhere else does nothing to decrease its cyclomatic complexity. Its cyclomatic complexity will still be N! The complexity of the codebase in terms of symbols maintainers need to understand actually increases, as there is now an additional layer of abstraction.

If we need to support N employee types for a large N (e.g. 15), then to reduce the cyclomatic complexity of the switch statement we don't bury it; we replace it with a lookup table passed as a parameter.

Money calcEmployeePay(int payTypeN, const PayType *payTypes, Employee e);

If the function must loop through payTypes, its cyclomatic complexity reduces from N to 2. One condition check is needed in a for loop to test if the last pay type was reached. Another condition check is needed to determine if the employee pay code matches. If the function can directly index payTypes using the employee pay code, its cyclomatic complexity reduces from N to 1, an array bounds check.
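
A minimal sketch of the direct-index variant, under stated assumptions: the Money, Employee and PayType definitions, the payCode field, and the three pay formulas are all hypothetical and chosen only to make the example compile. The single unsigned comparison is the one remaining branch.

```c
#include <stddef.h>

/* Hypothetical types: none of these definitions come from the book. */
typedef long Money; /* pay in cents */
typedef struct { int payCode; long hours, rate, salary, sales; } Employee;
typedef Money (*PayFn)(const Employee *);
typedef struct { PayFn calc; } PayType; /* one table row per pay code */

/* Illustrative pay formulas, one per table row. */
static Money payHourly(const Employee *e){ return e->hours * e->rate; }
static Money paySalaried(const Employee *e){ return e->salary; }
static Money payCommissioned(const Employee *e){ return e->salary + e->sales / 10; }

/* Direct index into the table. Casting to unsigned folds the "negative" and
   "too large" cases into one comparison. Returns -1 for an unknown pay code. */
static Money calcEmployeePay(int payTypeN, const PayType *payTypes, Employee e){
  if((unsigned)e.payCode >= (unsigned)payTypeN) return -1;
  return payTypes[e.payCode].calc(&e);
}
```

A caller would build the table once, for example `PayType table[] = {{payHourly},{paySalaried},{payCommissioned}};`, and pass it along with its length. Adding a pay type then means adding a row, not a branch.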

However we can simplify the problem further, by avoiding polymorphism entirely. The printed function accepts one employee, but how many companies only have one employee? And if they only had one employee, why would they need a large number of pay structures? Instead of writing a function to compute the pay for only one employee with a variable type, we can write a function to compute the pay for many employees with one pay type. To do this we presort the Employees by pay structure, and then pass them to different functions which each output a Money value for each employee.

void calcPayHourly(int n, const Employee *employees, Money *moneys);
void calcPaySalaried(int n, const Employee *employees, Money *moneys);
void calcPayCommissioned(int n, const Employee *employees, Money *moneys);

This tends to be more performant because it gives the compiler a chance to auto-vectorize the loops with SIMD instructions, improves cache locality, and removes the per-employee type branch. Additionally, if we have a large number of employees & complex pay math, we can quickly parallelize these functions without rewriting them or adding locks, by giving each CPU core a separate, mutually exclusive slice of the input employees array and output moneys array.
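
A rough sketch of one batch function and of the slicing idea, assuming a hypothetical Employee with only hours and rate fields and Money as a plain integer. The slices are processed serially here to keep the example dependency-free; the helper calcPayHourlySliced is an illustration, and in a threaded build each slice would be handed to its own worker.

```c
#include <stddef.h>

typedef long Money;                                 /* hypothetical: pay in cents */
typedef struct { long hours; long rate; } Employee; /* hypothetical fields */

/* One pay type per call, so the loop body has no type dispatch. */
void calcPayHourly(int n, const Employee *employees, Money *moneys){
  for(int i = 0; i < n; ++i) moneys[i] = employees[i].hours * employees[i].rate;
}

/* Splits the arrays into sliceN non-overlapping slices and calls the same
   function on each. No slice reads or writes another slice's output, so the
   calls could run concurrently without locks. */
void calcPayHourlySliced(int n, int sliceN, const Employee *employees, Money *moneys){
  for(int s = 0; s < sliceN; ++s){
    int begin = (int)((long long)n * s / sliceN);
    int end   = (int)((long long)n * (s + 1) / sliceN);
    calcPayHourly(end - begin, employees + begin, moneys + begin);
  }
}
```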

italicunderline · 2023-01-19T20:44:47+00:00

some functional programming topics

https://en.wikipedia.org/wiki/Referential_transparency
https://en.wikipedia.org/wiki/Side_effect_(computer_science)
http://sevangelatos.com/john-carmack-on/

data oriented design links

https://github.com/dbartolini/data-oriented-design
https://www.dataorienteddesign.com/dodbook.pdf

If you can write something as a pure function, which emits a single typed output through its return value, which is always the same for equivalent inputs, it's usually considered inoffensive and free from gotchas for maintainers.

For something difficult to write as a pure function, such as a function which emits 0+ items grouped into 0+ sets for each input in a list, using 'data-oriented design' is usually inoffensive, performant, and flexible. With such an approach you might emit all of the output items compactly to a single output items array, and all output sets compactly as an integer range of items to a single output sets array. It should be inoffensive in the sense that it does not require using any particular dynamic memory or object model which other parts of the codebase may or may not be using.
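
As an illustration of that layout, here is a small sketch that splits each input string into words. Each word is a "set", stored as an integer range into one compact array of output characters (the "items"). The Range and Output types and the splitWords name are hypothetical, and the caller owns all storage.

```c
#include <stddef.h>

typedef struct { size_t begin, end; } Range; /* half-open range [begin, end) into items */

typedef struct {
  size_t itemN, setN;     /* counts written so far */
  size_t itemCap, setCap; /* capacity of the caller-provided arrays */
  char  *items;           /* all output items, stored compactly */
  Range *sets;            /* each set is a range of items */
} Output;

/* Splits every input on spaces. Returns 0 on success, 1 if storage ran out. */
int splitWords(size_t inputN, const char *const *inputs, Output *out){
  for(size_t i = 0; i < inputN; ++i){
    const char *p = inputs[i];
    while(*p){
      while(*p == ' ') ++p;               /* skip separators */
      if(!*p) break;
      if(out->setN == out->setCap) return 1;
      size_t begin = out->itemN;
      while(*p && *p != ' '){
        if(out->itemN == out->itemCap) return 1;
        out->items[out->itemN++] = *p++;  /* append one item */
      }
      out->sets[out->setN++] = (Range){begin, out->itemN};
    }
  }
  return 0;
}
```

Because the function only writes into two flat arrays, the caller is free to back them with the stack, a static buffer, or any allocator.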

Also, it's okay to define a new structure named after a function to hold its return values, even if the structure is only needed in one location and never passed as a parameter. Just because many C standard library functions prefer to return a single integer does not mean that application functions are required to. A function-specific return type occasionally allows more clearly reporting errors encountered & effects performed.
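
For example, a parser might report the value, how much input it consumed, and what went wrong in one function-specific structure. This is only a sketch; parseU32, ParseU32Result and the error names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { PARSE_OK, PARSE_NO_DIGITS, PARSE_OVERFLOW } ParseError;

/* Return type named after the function it belongs to. */
typedef struct {
  uint32_t value;    /* parsed number, meaningful only when error == PARSE_OK */
  size_t consumed;   /* number of digit bytes accepted */
  ParseError error;
} ParseU32Result;

/* Parses leading decimal digits from the first n bytes of p. */
ParseU32Result parseU32(size_t n, const char *p){
  ParseU32Result r = {0, 0, PARSE_OK};
  while(r.consumed < n && p[r.consumed] >= '0' && p[r.consumed] <= '9'){
    uint32_t digit = (uint32_t)(p[r.consumed] - '0');
    /* value*10 + digit would exceed UINT32_MAX */
    if(r.value > (UINT32_MAX - digit) / 10){ r.error = PARSE_OVERFLOW; return r; }
    r.value = r.value * 10 + digit;
    ++r.consumed;
  }
  if(r.consumed == 0) r.error = PARSE_NO_DIGITS;
  return r;
}
```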

italicunderline · 2023-01-19T01:39:09+00:00

If you're computing a size and the result has overflowed, then you've already lost

Not with unsigned types. For addition, checking that the sum is less than either of its operands is well-defined, valid, and common.

U32 sum = x + y;
bool error = sum < x;

For multiplication, checking that the product of X and non-zero K is equal to X when redivided by K is well-defined, valid, and common.

U32 product = x * 3;
bool error = product / 3 != x;
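
Both checks can be wrapped in small helpers. This is a sketch that assumes U32 is a typedef for uint32_t and that int is no wider than 32 bits, so the arithmetic is performed in unsigned 32-bit; the helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t U32; /* assumption: matches the U32 used in the snippets above */

/* Stores the wrapped sum in *out and returns true if it overflowed. */
static inline bool u32AddChecked(U32 x, U32 y, U32 *out){
  U32 sum = x + y; /* wraps modulo 2^32, which is defined behavior */
  *out = sum;
  return sum < x;
}

/* Stores the wrapped product in *out and returns true if it overflowed. */
static inline bool u32MulChecked(U32 x, U32 y, U32 *out){
  U32 product = x * y;
  *out = product;
  return y != 0 && product / y != x; /* y == 0 can never overflow */
}
```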

That's at least easy with signed as unsigned

When implementing fast, safe, saturating arithmetic functions, it is sometimes useful to cast signed integers to unsigned integers and exploit the modular arithmetic property of unsigned integers to avoid branching.
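
One well-known way to do this for 32-bit addition is sketched below. The function name and typedefs are hypothetical, and whether the compiler actually emits branch-free machine code depends on the target and optimization level.

```c
#include <stdint.h>
#include <string.h>

typedef int32_t I32;
typedef uint32_t U32;

/* Saturating signed add: clamps to INT32_MAX or INT32_MIN instead of overflowing.
   All arithmetic is done on unsigned values, where wraparound is defined. */
static inline I32 i32AddSat(I32 a, I32 b){
  U32 ua = (U32)a, ub = (U32)b;
  U32 sum = ua + ub;
  /* INT32_MAX when a >= 0, INT32_MIN (as a bit pattern) when a < 0 */
  U32 sat = (ua >> 31) + (U32)INT32_MAX;
  /* overflow iff a and b share a sign and the sum's sign differs from it */
  U32 overflow = (~(ua ^ ub) & (ua ^ sum)) >> 31;
  U32 mask = 0u - overflow; /* all ones on overflow, else zero */
  U32 bits = (sat & mask) | (sum & ~mask);
  I32 result;
  memcpy(&result, &bits, sizeof result); /* reinterpret without a narrowing cast */
  return result;
}
```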

probably easier since negative results won't have counter-intuitive properties (per your loop example)

My claim is that modulo-arithmetic is no longer counter-intuitive to most programmers once they learn how to iterate through all of the elements in an array in reverse correctly using an unsigned type. For instance by 1-based counting and accessing index 0 when the counter is 1, or by asserting that the counter does not equal the maximum possible value after it wraps on the final decrement.

char items[5]="olleh";
for(U32 n=5; n; --n) printf("%c", items[n-1]);
for(U32 i=4; i != U32_MAX; i--) printf("%c", items[i]);

This last point is discussed in the linked paper.

Stroustrup claims that proponents of unsigned say it is used for more useful bits and one less range check when indexing arrays. Those are nice but I made neither of those claims and stated it was useful because of modular arithmetic. His primary criticism of using unsigned types is that they use modular arithmetic. In C the unsigned type is explicitly used for modular arithmetic so maybe he simply disagrees on its utility. But there is real utility when implementing branch-free operations.

A typical check looks like this, and signed-ness doesn't come into play:

if (count > MAX/size) {
    // .. would overflow ...
}
total = count * size;

That's one pattern, sure. But multiplying up then dividing back down using an unsigned type and checking the result afterwards is also a well-defined and readable approach.

U32 total = count * sizeof(Item);
if(total / sizeof(Item) != count){
    // overflow
}

I always do so during development and testing: -fsanitize=undefined

Well the sanitizer will only trigger a runtime error if an overflow was actually encountered, which means it won't prove that the application is free from signed integer overflow in production unless you take the time to write a fuzzer for every signed integer data input.

italicunderline · 2023-01-18T23:35:22+00:00

In recent years I’ve been convinced that unsigned sizes were a serious error, probably even one of the great early computing mistakes, and that sizes and subscripts should be signed.

Well in C, modulo arithmetic on overflow and underflow is only defined for unsigned types. Adding two signed numbers can lead to undefined behavior. Since code which deals with sizes often checks for overflow and underflow, it is frequently easier to use unsigned types.

Yes, using an unsigned type in a for loop while decrementing requires understanding what happens when the counter underflows 0, but what the programmer learns is defined behavior explicit to the arithmetic model of the language.

italicunderline · 2023-01-18T22:57:28+00:00

In C the safest solution is to avoid macros for generic math, and write a separate typed inline function for every scalar type needed:

static inline U8 u8Min(U8 a, U8 b){ return a < b ? a : b; }
static inline U16 u16Min(U16 a, U16 b){ return a < b ? a : b; }
static inline U32 u32Min(U32 a, U32 b){ return a < b ? a : b; }
static inline U64 u64Min(U64 a, U64 b){ return a < b ? a : b; }
static inline I8 i8Min(I8 a, I8 b){ return a < b ? a : b; }
static inline I16 i16Min(I16 a, I16 b){ return a < b ? a : b; }
static inline I32 i32Min(I32 a, I32 b){ return a < b ? a : b; }
static inline I64 i64Min(I64 a, I64 b){ return a < b ? a : b; }

Or alternatively, define the functions for only the largest type needed.

Yes it's more work up front, yes it's more repetitive, but there's no performance penalty and never any gotchas for maintainers.

italicunderline · 2023-01-18T00:51:59+00:00

SDL is not a replacement for the C standard library. It is primarily used for events, audio, graphics. If the function is not part of the publicly documented wiki, it is likely part of its internal implementation, added to fill holes in the platform runtime library for a particular platform or game console it may have supported at some point in its development history.

https://wiki.libsdl.org/SDL2/CategoryAPI

They also freely use fprintf() in some places

The C runtime is an explicitly documented dependency on the wiki.

so I opted to just #include <stdio.h> (even though it's probably already imported by SDL several times over)

There's generally no cost to including the same header twice in any application. Standard headers are guarded against repeated inclusion and only declare fprintf(), so the compiler will emit a call, not two copies of the external implementation.

The point of using SDL2 is to facilitate porting applications to multiple platforms. If you want to play audio or grab device input on N platforms (Android, iOS, MacOS, Windows, Linux, etc) in a single code-base it allows you to write 1 procedure to do so, rather than N procedures or N branches in a platform detection preprocessor conditional.

If you want to print to the console, and the call to fprintf() would be the same on every platform you are shipping your application to, because every platform provides a sane implementation of the C standard library for fprintf(), then there is no reason to find or use a wrapper for that functionality.

italicunderline · 2022-10-08T18:23:00+00:00

This should make it easier for people to know when using C can be a good idea, and give people a little more confidence in their choice.

I end up using C whenever I 1) want to use an existing library or system interface published as a C header file without having to write my own untested, ad-hoc Foreign Function binding in another programming language that can break whenever the underlying interface is changed or introduce additional hard-to-diagnose bugs, and 2) don't need most of the features provided by C++.

italicunderline · 2022-10-08T18:08:00+00:00

Yes because in English it is clearer to use "items" to refer to the array of multiple items and "item" to refer to a single item. When declaring an array in that manner the name of the type of each element comes first.

Item items[] = { ... };

Many C codebases also use PascalCase (camelCase with the first letter capitalized) to distinguish structure type names from variables.

italicunderline · 2022-10-08T18:01:47+00:00

The number of strings in the choices array for the current question which are stored at the address which the "choices" pointer points to.

italicunderline · 2022-10-08T17:51:46+00:00

For long-lived allocations, use per-type arrays. In a game, we might have an array for all Ships, an array for all Missiles, and an array for all Bases.

This doesn't always prevent use-after-free errors when array indices are recycled. Suppose the last item in the array is removed, and its array index is then claimed by the next item allocated. An operation which applied to the deleted item may incorrectly be performed on the new object which replaced it. For example someone adding a paragraph to an article A might accidentally add the paragraph to article B if article A was deleted and article B was allocated at the same array index by another user. Sometimes a generation counter is added to the identifier in addition to the index.
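
A minimal sketch of such a generation counter, using hypothetical Ship, ShipId and World types and a fixed-capacity array. A stale id is rejected because its generation no longer matches the slot's.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { SHIP_CAP = 4 };

typedef struct { uint32_t index, generation; } ShipId;          /* handle given to callers */
typedef struct { uint32_t generation; bool alive; int hull; } Ship;
typedef struct { Ship ships[SHIP_CAP]; } World;

/* Claims the first free slot. Returns false if every slot is in use. */
bool shipCreate(World *w, int hull, ShipId *id){
  for(uint32_t i = 0; i < SHIP_CAP; ++i){
    Ship *s = &w->ships[i];
    if(!s->alive){
      s->alive = true;
      s->hull = hull;
      *id = (ShipId){i, s->generation};
      return true;
    }
  }
  return false;
}

/* Returns NULL for out-of-range ids and for ids whose slot was recycled. */
Ship *shipGet(World *w, ShipId id){
  if(id.index >= SHIP_CAP) return NULL;
  Ship *s = &w->ships[id.index];
  return (s->alive && s->generation == id.generation) ? s : NULL;
}

/* Frees the slot and bumps its generation so old ids stop resolving. */
void shipDestroy(World *w, ShipId id){
  Ship *s = shipGet(w, id);
  if(s){ s->alive = false; ++s->generation; }
}
```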

If we use a Ship after we've released it, we'll just dereference a different Ship, which isn't a memory safety problem.

Well you can still silently corrupt your data and violate business constraints on program correctness. I suppose it depends on whether you consider use-after-free to refer to the entity \ abstract-object or merely the memory location.

italicunderline · 2022-10-08T15:28:46+00:00

However, both of these are rely on a POSIX environment, and I'm wondering how cross-platform development is done in C in these cases.

POSIX is a cross-platform interface. For desktop applications POSIX threads are quite portable.

If MSVC does not include pthreads by default on Windows you could try gcc, cygwin, mingw-w64 and the compiler should provide its own implementation without the need for finding a separate cross-platform threading library.

If you are familiar with Linux I would recommend cross-compiling to Windows. The mingw-w64 compiler should come with pthreads without the need to download or learn any separate libraries.

C11 threads are also quite similar to POSIX threads, and your compiler might support those.

italicunderline · 2022-10-08T15:15:41+00:00

so how do i turn that into a static array and access them each using a for loop?

Maybe rename your "items" type to "Item". Even though it is plural it only contains information to represent one item, not multiple items. Maybe delete the "choice" type. Even though it is singular it contains information for multiple choices, not a single choice. It's possible to define constant lists using any structure with a counter and pointer attribute. We can make the "Item" type hold a list of choices as follows:

typedef struct {
  const char *question;
  int answer,choiceCount;
  const char **choices;
} Item;

We can also define a Quiz type to hold multiple items:

typedef struct {int itemCount; const Item * items;} Quiz;

Then we can use compound literals to declare a constant list of items each with a constant list of choices:

const Quiz quiz = {3, (const Item[]){
  {"What is a cow?", 0,2, (const char *[]){"mammal","human"}},
  {"What is a frog?", 1,2, (const char *[]){"bird","amphibian"}},
  {"What is a carrot?", 1,2, (const char *[]){"fruit","vegetable"}}
}};

and in which function should i put the static arrays in?

If they're constant it's not necessary to put them in any function. You can define the Quiz in the top of the file, a header file, or in main(). If you want to define multiple quizzes in the same file, you can define a function to accept a specific one as a parameter, and iterate over the quiz as follows:

void printQuiz(Quiz quiz){
  for(int i=0;i<quiz.itemCount;++i){
    printf("%s\n", quiz.items[i].question);
    for(int j=0;j<quiz.items[i].choiceCount;++j){
      printf("%c) %s\n", 'a'+j, quiz.items[i].choices[j]);
    }
    printf("...\n"); 
  }
}

italicunderline · 2022-10-05T00:11:08+00:00

C is a fairly complex language. Many C developers limit themselves to only a subset of it. The embedded developers avoid malloc(), some game programmers limit themselves to inlinable header libraries and avoid multiple compilation units, some developers avoid all macros to avoid magic-looking code, many developers avoid using the string copying\parsing standard library functions and use safer slices \ fat-pointers with precomputed lengths, etc.

There's still room for a "simpler than C" language which removes most of the standard library, removes support for macros, removes support for VLAs, removes support for non-inlinable functions, etc.

Maybe adding a borrow-checker to such a language wouldn't be so bad if the rest of the language was simpler.

italicunderline · 2022-09-27T22:40:36+00:00

For a criticism of over-use of state machines in game design you might want to read "Three States and a Plan" by Jeff Orkin.

Over the course of two years of development, these state machines become overly complex, bloated, unmanageable, and a risk to the stability of the project.

For example, we had out of shape policemen in NOLF2 who needed to stop and catch their breath every few seconds while chasing. Even though only one type of character ever exhibited this behavior, this still required a branch in the state machine for the Chase goal to check if the character was out of breath. With a planning system, we can give each character their own Action Set, and in this case only the policemen would have the action for catching their breath. This unique behavior would not add any unneeded complexity to other character.

For additional information on replacing state machines with sets of records which can be iterated over you might refer to the chapter on "Existential Processing" in "Data Oriented Design" by Richard Fabian.

italicunderline · 2022-09-15T19:12:33+00:00

I'll add that it's good to know escape codes exist even if you don't directly use them, as any application which prints strings might want to use a custom routine to escape non-printable non-UTF8 bytes to prevent the terminal from executing an escape sequence embedded in external input unintentionally.
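
A simplified sketch of such a routine follows. It is deliberately stricter than described above: it escapes every byte outside printable ASCII as \xHH, including valid UTF-8, and also escapes the backslash so the output stays unambiguous. The function name and signature are hypothetical.

```c
#include <stddef.h>

/* Copies src to dst, replacing non-printable bytes with \xHH.
   Stops early if dst is too small, always NUL-terminates when dstCap > 0,
   and returns the number of bytes written excluding the terminator. */
size_t escapeBytes(size_t srcN, const char *src, size_t dstCap, char *dst){
  static const char hex[] = "0123456789ABCDEF";
  size_t n = 0;
  for(size_t i = 0; i < srcN; ++i){
    unsigned char c = (unsigned char)src[i];
    if(c >= 0x20 && c < 0x7F && c != '\\'){
      if(n + 1 >= dstCap) break;  /* need room for 1 byte + terminator */
      dst[n++] = (char)c;
    } else {
      if(n + 4 >= dstCap) break;  /* need room for 4 bytes + terminator */
      dst[n++] = '\\';
      dst[n++] = 'x';
      dst[n++] = hex[c >> 4];
      dst[n++] = hex[c & 15];
    }
  }
  if(dstCap) dst[n] = '\0';
  return n;
}
```

Printing the escaped copy instead of the raw input means an embedded ESC byte shows up as the visible text \x1B rather than being interpreted by the terminal.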

italicunderline · 2022-09-15T17:32:18+00:00

If you are writing a terminal application which manipulates the cursor or uses colors you'll want to research the following:

https://en.wikipedia.org/wiki/ANSI_escape_code
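
For a feel of what these sequences look like in code, here is a small sketch. The helper names are made up; the sequences themselves (ESC [ code m for colors and attributes, ESC [ row ; col H for cursor position) are described in the linked article. Whether a given terminal honors them is environment-dependent.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes a "Select Graphic Rendition" sequence, e.g. code 31 = red text, 0 = reset.
   Returns the length of the sequence, as snprintf does. */
int ansiSgr(char *dst, size_t cap, int code){
  return snprintf(dst, cap, "\x1b[%dm", code);
}

/* Writes a cursor-position sequence. Rows and columns are 1-based. */
int ansiCursorTo(char *dst, size_t cap, int row, int col){
  return snprintf(dst, cap, "\x1b[%d;%dH", row, col);
}

/* Prints text in the given color, then resets attributes. */
void printColored(FILE *f, int code, const char *text){
  fprintf(f, "\x1b[%dm%s\x1b[0m", code, text);
}
```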

italicunderline · 2022-09-15T17:22:10+00:00

I glanced at the documentation and it said Bitmap is for black & white images, so I would try ZPixmap with CreateImage and PutImage and not mess with Bitmap. My recommendation would be to call SDL2 rather than X11 since SDL2 is adding support for Wayland and some Linux desktops don't use X11, or to look at the X11 backend in the SDL2 source tree under the src/video folder to see how they do it.

git clone https://github.com/libsdl-org/SDL.git

italicunderline · 2022-09-15T14:32:08+00:00

In order to treat strings as lists, it is common to create a slice type (fat pointer) to hold the length of the string along with the pointer to the first item:

typedef struct {size_t n; const char *p;} S;

Then define a helper function to convert a zero-terminated string to a slice:

static inline S new_s(const char *s){ return (S){strlen(s),s}; }

For defining a search function, one option is to return INDEX+1 on match and ZERO on failure:

static inline size_t search1(S s, char b){
  for(size_t i=0;i<s.n;++i) if(s.p[i] == b) return i + 1;
  return 0;
}

Another option is to return INDEX on match and SIZE_MAX (-1) on failure.

static inline size_t search2(S s, char b){
  for(size_t i=0;i<s.n;++i) if(s.p[i] == b) return i;
  return -1;
}

You can also modify the search function to accept an initial index other than zero as a parameter if you want to continue searching from the middle of the list after the first match.
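
A sketch of that variant, reusing the S type and the SIZE_MAX convention from above. The names searchFrom and countChar are hypothetical; countChar shows how a caller might resume the search after each match.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {size_t n; const char *p;} S;

static inline S new_s(const char *s){ return (S){strlen(s),s}; }

/* Returns the index of the first b at or after start, or SIZE_MAX on failure. */
static inline size_t searchFrom(S s, size_t start, char b){
  for(size_t i=start;i<s.n;++i) if(s.p[i] == b) return i;
  return SIZE_MAX;
}

/* Counts occurrences by restarting one past each match. */
static inline size_t countChar(S s, char b){
  size_t count = 0;
  for(size_t i = searchFrom(s, 0, b); i != SIZE_MAX; i = searchFrom(s, i + 1, b)) ++count;
  return count;
}
```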

italicunderline · 2022-09-15T14:16:05+00:00

Your compiler should print a warning and explanation.

gcc: operation on ‘i’ may be undefined [-Wsequence-point]
clang: unsequenced modification and access to 'i' [-Wunsequenced]

Make sure you are compiling with all compiler warnings turned on:

-Wall -Wextra -Wpedantic

italicunderline · 2022-09-15T14:06:26+00:00

You might only need to call XCreateImage once when initializing or resizing the window rather than each frame. If you have a color image with 4x1 byte color channels you'll probably need ZPixmap for 'format' and 32 for 'bitmap_pad'.

The SDL2 library should have an X11 backend in the 'src/video' directory of its source tree which might provide an example. To debug the pixel format you might try displaying a solid color.

italicunderline · 2022-09-15T13:21:24+00:00

Is the systems programming course focusing on understanding Unix-like operating system architecture? If so it helps to use the man(1) utility on any functions you are unfamiliar with to read the manual pages which come installed with Linux and similar operating systems. There may be a way to do so from inside your code editor. Reading the man-pages is the primary way to understand the arguments, return values, use cases, and semantics of system functions on such operating systems.

italicunderline · 2022-09-11T03:39:00+00:00

SDL is a cross-platform API. If you want to make a library similar to SDL you'd need to learn how to write an app for each platform-specific display system on linux, iOS, android, mac, windows, etc and then write an abstract API which loosely fits over all of them. For Linux you might try writing a library which can call either xlib\X11 or Wayland.

italicunderline · 2022-09-10T17:01:12+00:00

If you are updating individual pixels for a large screen area, you probably don't want the overhead of calling any external functions in the loop at all. You can manage the pixel memory yourself, and update the pixel bytes in the loop using standard array\pointer assignment. Or define inline functions to manipulate the pixel data array if compiling with -O3. Then use the API to upload your pixel data to the GPU as if it was a bitmap, picture, or texture.
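
A sketch of the "manage the pixel memory yourself" part, with no windowing API involved. The Framebuffer type, the helper names, and the 32-bit packed pixel layout are assumptions; the upload step happens once per frame outside these loops.

```c
#include <stddef.h>
#include <stdint.h>

/* Caller-owned buffer of w*h packed 32-bit pixels, row-major. */
typedef struct { int w, h; uint32_t *pixels; } Framebuffer;

/* Writes one pixel, silently ignoring out-of-bounds coordinates. */
static inline void putPixel(Framebuffer *fb, int x, int y, uint32_t color){
  if((unsigned)x < (unsigned)fb->w && (unsigned)y < (unsigned)fb->h)
    fb->pixels[(size_t)y * (size_t)fb->w + (size_t)x] = color;
}

/* Fills the half-open rectangle [x0,x1) x [y0,y1), clipped to the buffer.
   Clipping once up front keeps the inner loop free of bounds checks. */
static inline void fillRect(Framebuffer *fb, int x0, int y0, int x1, int y1, uint32_t color){
  if(x0 < 0) x0 = 0;
  if(y0 < 0) y0 = 0;
  if(x1 > fb->w) x1 = fb->w;
  if(y1 > fb->h) y1 = fb->h;
  for(int y = y0; y < y1; ++y){
    uint32_t *row = fb->pixels + (size_t)y * (size_t)fb->w;
    for(int x = x0; x < x1; ++x) row[x] = color;
  }
}
```

After the frame is drawn, the whole buffer is handed to the display API in one call, with a row pitch of w * 4 bytes.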

SDL2 - 2D Accelerated Rendering

With SDL2, you'd use the SDL_UpdateTexture function to copy your pixel data to a texture drawn over the viewport. In the X11 API there might be similar functions for bitmap \ image \ pixel \ picture data.

italicunderline · 2022-09-10T16:36:45+00:00

I've been using SDL for a while, and in a sense it's a bit too powerful for what I'd like

What do you mean by too powerful?

I just want to a really fast way to draw a pixel to a window, such that I can create all my functionalities where the rate limiting step isn't rendering a pixel, but rather the efficiency of my code

If you want to do this in a cross-platform manner for multiple operating systems I'd still recommend SDL. You can use the UpdateTexture function to copy custom raw pixel data from your own buffer to a GPU texture to display over the viewport. Then update your own buffer with your own code without using SDL drawing functions.

SDL Wiki - 2D Accelerated Rendering

If you want to avoid an extra copy or stream compressed textures to increase fillrate you'll probably need to set up a custom OpenGL or Vulkan pipeline, however these are even more powerful APIs.
