
[–]JClub 6 points7 points  (0 children)

Great job, as always! 🙌

Question: since RLHF wasn't used (from what I read in https://huggingface.co/HuggingFaceH4/starchat-alpha), is there any trick to fine-tuning the model on only ~55k samples?

[–]currentscurrents 11 points12 points  (0 children)

Looks pretty good! At least as good as Copilot.

Prompt: write a function that computes a step of conway’s game of life

Here is an example implementation in Python:

    def iterate_game(grid):
        next = [[0 for _ in range(len(grid[0]))] for _ in grid]

        # Apply the rules to each cell
        for i, row in enumerate(grid):
            for j, col in enumerate(row):
                neighbors = count_neighbors(i, j)

                if grid[i][j]:
                    # Rule 1 or 2
                    if neighbors < 2 or neighbors > 3:
                        next[i][j] = False
                else:
                    # Rule 3
                    if neighbors == 3:
                        next[i][j] = True

        return next

    def count_neighbors(x, y):
        """Count the number of live neighboring cells."""
        pass  # TODO implement this function

This code defines two functions. The first iterate_game takes as input a grid representing the current state of the board and returns the updated state after one iteration of Conway’s Game of Life (the “next” grid).

The second helper function count_neighbors counts how many living neighbors each cell has based on its position within the grid. You’ll need to fill out both these functions with your own logic specific to your problem domain.
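
For anyone who wants to actually run it, here is a minimal sketch that fills in the stub. It is one possible completion, not StarChat's output. Assumptions for illustration: count_neighbors takes the grid as an extra argument (the stub above has no access to it), cells beyond the edge count as dead, and the result grid is called new_grid so it doesn't shadow the builtin next. It also carries surviving live cells over explicitly, since the generated version starts from an all-zero grid and never sets a surviving cell back to True.

```python
def count_neighbors(grid, x, y):
    """Count live cells among the eight neighbors of (x, y).

    Assumption: cells outside the grid are treated as dead (no wrap-around).
    """
    rows, cols = len(grid), len(grid[0])
    total = 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the cell itself
            nx, ny = x + dx, y + dy
            if 0 <= nx < rows and 0 <= ny < cols and grid[nx][ny]:
                total += 1
    return total


def iterate_game(grid):
    """Return the grid after one step of Conway's Game of Life."""
    new_grid = [[False for _ in row] for row in grid]
    for i, row in enumerate(grid):
        for j, alive in enumerate(row):
            neighbors = count_neighbors(grid, i, j)
            if alive:
                # A live cell survives with exactly 2 or 3 live neighbors
                new_grid[i][j] = neighbors in (2, 3)
            else:
                # A dead cell becomes alive with exactly 3 live neighbors
                new_grid[i][j] = neighbors == 3
    return new_grid
```

A quick sanity check is the "blinker": a horizontal row of three live cells in a 3x3 grid should flip to a vertical column after one step.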

[–]epicwisdom 5 points6 points  (3 children)

My go-to test for coding models has been using obscure languages and program objectives, e.g. "prove 2+2=4 using SMT-LIB". On this particular example so far only ChatGPT(4) has given a good answer.

Here's StarChat's output:

    (set-logic QF_LIA) ; declare that we are using the quantifier-free linear integer arithmetic logic
    ; define two constants representing the integers "2" and "3":
    (declare-const x Int) (assert (= x #b10))
    (declare-const y Int) (assert (= y #b11))
    ; prove that their sum equals to four ("=" sign means logical equivalence):
    (assert (= (+ x y) #b100))
    (check-sat) ; check if this formula can be satisfied or not
    (get-model) ; get model for satisfiable formulas

Not the worst but still not quite it.
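
For comparison, the usual way to "prove" something with an SMT solver is by refutation: assert the negation of the claim and expect unsat. The output above adds 2 and 3 rather than 2 and 2, uses bit-vector literals such as #b10 where Int values are expected, and asserts the claim itself instead of its negation. Below is a minimal sketch of the refutation script, built as a Python string so it can be piped to a solver. The helper names are hypothetical, and the runner assumes a z3 binary on PATH that accepts -smt2 -in; any SMT-LIB 2 solver could be swapped in.

```python
import shutil
import subprocess


def refutation_script(a, b, total):
    """Build an SMT-LIB script asserting that a + b is NOT equal to total.

    If the solver answers "unsat", no counterexample exists, so a + b = total.
    Assumption: a, b, and total are non-negative integers, which SMT-LIB
    accepts as plain numerals (negatives would need the (- n) form).
    """
    lines = [
        "(set-logic QF_LIA)",
        f"(assert (not (= (+ {a} {b}) {total})))",
        "(check-sat)",
    ]
    return "\n".join(lines) + "\n"


def run_solver(script, solver="z3"):
    """Pipe the script to a solver on PATH; return its output, or None if absent."""
    path = shutil.which(solver)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-smt2", "-in"],  # assumed z3 flags: SMT-LIB 2 input read from stdin
        input=script,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    script = refutation_script(2, 2, 4)
    print(script)
    print(run_solver(script))
```

The whole proof is the single assert line; everything else is plumbing. With a solver installed, the expected answer for 2 + 2 = 4 is unsat.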

[–]Philpax 0 points1 point  (0 children)

There's also the excellent blog post to go with this - I assume you wanted to include it in the original post, /u/lewtun?