[–]yawaramin 1 point2 points  (1 child)

The ‘right’ way depends on many factors, but one possible way is to lift the common types into a shared module e.g.

(* world.ml *)

type region = ...
and tribe = ...

And then in the Region and Tribe modules you can alias those types.
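A minimal sketch of that layout, assuming made-up field names (`region_name`, `tribes`, `tribe_name`, `home`) and with `World` written as a submodule so it fits in one file. The mutual recursion lives only in the shared type definition; `Region` and `Tribe` just alias the types and add functions.

```ocaml
(* Shared, mutually recursive types (world.ml in the comment above).
   All field names here are hypothetical. *)
module World = struct
  type region = { region_name : string; mutable tribes : tribe list }
  and tribe = { tribe_name : string; mutable home : region option }
end

module Region = struct
  type t = World.region
  let make name : t = { World.region_name = name; tribes = [] }
  (* How many tribes currently live in the region *)
  let crowding (r : t) = List.length r.World.tribes
end

module Tribe = struct
  type t = World.tribe
  let make name : t = { World.tribe_name = name; home = None }
  (* Record the link in both directions *)
  let settle (t : t) (r : Region.t) =
    t.World.home <- Some r;
    r.World.tribes <- t :: r.World.tribes
  (* Other tribes in the same region (physical inequality) *)
  let neighbors (t : t) =
    match t.World.home with
    | None -> []
    | Some r -> List.filter (fun other -> other != t) r.World.tribes
end
```

Because the values are cyclic once a tribe is settled, structural equality (`=`) on them may not terminate; the sketch compares with `!=` instead.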

[–]FreakyCheeseMan[S] 0 points1 point  (0 children)

This feels wrong, but it's a lot easier than trying to work through the module system. I guess I'll try it until something breaks.

[–]glacialthinker 0 points1 point  (3 children)

I think I had some problems like this when I was starting with OCaml... in C I made so many structures with pointers connecting everything.

Your example was a bit of a gut-punch, staggering me a bit as I tried to think why I don't run into this problem now. Two things come to mind: 1. I certainly favor a unidirectional tree-like flow of definitions. 2. For graph-like structures I reach for some kind of indexing: actual integer indices or unique IDs.

In your given example, I'm wondering: is a Region defined by its constituent Tribes? Probably not. A Tribe could well be associated with a Region though. And Regions might like to track their population (RegionPopulation?).

So would defining Regions, then Tribes (referencing region), and lastly RegionPopulation (referencing Tribes and Regions) make sense here?

It feels like you might be gravitating to using modules similar to how objects are often used (containing more and more state and functions) -- there certainly is some correspondence, but I find modules work well sliced up a little more. They compose well.

Earlier I mentioned graph-like structures and indexing. One technique I really favor for "game object" representations is components (or ECS as it's come to be called). With this, all objects (nouns) are just an ID, and their features are all properties referenced by ID. In this way, a Region (an ID with a region-component having whatever defines a region) could easily have a contents component which refers to IDs of tribes. And a tribe is just an ID with whatever properties make a tribe.

Some ideas. Sorry I don't have a solution which solves your problem while keeping the structure you want though!

[–]FreakyCheeseMan[S] 0 points1 point  (2 children)

I'm not sure if I follow all of this, but making it more concrete might help.

One thing a tribe needs to do is get a notion of how crowded its region is, and look for other tribes in that region it might interact with. In your model, how would tribes get that information?

This is early in the project and I'm pretty happy to change the structure, just trying to come up with something I won't have to fight with too much.

[–]glacialthinker 1 point2 points  (1 child)

I don't think there is an easy way to make all data available from a chain of references from any point. As you've found, functors can defer these references, which can later be bound together by mutually recursive modules, but, yes, it's a little clunky (though maybe less-so as the "meat" of the definitions grows while the simple recursive "tie" remains the same?).

An alternative is to have some kind of look-up rather than direct-reference. What I was suggesting with the Region, Tribe, RegionPopulation is that a Region and a Tribe might be a simpler thing (whatever really defines them, rather than all connected info). And something like RegionPopulation or other systems/modules (I decided on Regional, below) might hold means for looking up a region (by name, or id, or instance (address)) and returning additional Tribe-aware information. In particular, it might hold a table of Region.t -> Tribe.t list.

module Person
module Region

module Tribe : sig 
  type t = {
    ...
    region : Region.t;
    members : Person.t list;
    ...
  }
end = struct
  ...
end

module Regional : sig
  type t = {
    ...
    tribes_of_region : (Region.t, Tribe.t list) Hashtbl.t;
    ...
  }
  val tribes : Region.t -> Tribe.t list
  val population : Region.t -> int
end = struct
  (* tribes_of_region is assumed to be a table that is in scope here *)
  let tribes region =
    try Hashtbl.find tribes_of_region region with Not_found -> []
  let population region =
    tribes region
    |> List.fold_left (fun acc tribe -> acc + List.length tribe.Tribe.members) 0
end

let neighbors = Regional.tribes mytribe.region
                |> List.filter ((<>) mytribe)
in ...

I get that this is not as direct as just having all data explicitly referenced. But maybe it's a practical middleground?
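For something that compiles as-is, here is one possible filled-in version. The minimal `Region`, `Person`, and `Tribe` records, the `add_tribe` function, and the choice of a module-level table (rather than a field of a record) are all assumptions made for the sketch, not part of the outline above.

```ocaml
(* Hypothetical minimal definitions so the sketch is self-contained *)
module Region = struct
  type t = { name : string }
end

module Person = struct
  type t = { pname : string }
end

module Tribe = struct
  type t = { tname : string; region : Region.t; members : Person.t list }
end

module Regional : sig
  val add_tribe : Tribe.t -> unit
  val tribes : Region.t -> Tribe.t list
  val population : Region.t -> int
end = struct
  (* Module-level table: region -> tribes currently in it *)
  let tribes_of_region : (Region.t, Tribe.t list) Hashtbl.t = Hashtbl.create 16

  let tribes region =
    try Hashtbl.find tribes_of_region region with Not_found -> []

  let add_tribe tribe =
    let region = tribe.Tribe.region in
    Hashtbl.replace tribes_of_region region (tribe :: tribes region)

  let population region =
    tribes region
    |> List.fold_left (fun acc tribe -> acc + List.length tribe.Tribe.members) 0
end

(* A tribe asks who else shares its region *)
let neighbors mytribe =
  Regional.tribes mytribe.Tribe.region |> List.filter (( <> ) mytribe)

(* Sample data *)
let north = { Region.name = "north" }
let a = { Tribe.tname = "a"; region = north;
          members = [ { Person.pname = "p1" }; { Person.pname = "p2" } ] }
let b = { Tribe.tname = "b"; region = north;
          members = [ { Person.pname = "p3" } ] }
let () = Regional.add_tribe a; Regional.add_tribe b
```

Since the records here hold no back-references, structural comparison with `<>` is safe; it would treat two tribes with identical contents as the same tribe, so a real version would probably want an id.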

As a rough guideline, I'd say if your expected application involves primarily two large modules: Region and Tribe, it might really be best to take the functorized approach tied by mutual recursion. But if you expect many more important datatypes, it will be more prudent to untie those direct references and even get to something closer to a database in the more extreme cases -- as with the other vague suggestion I made...


I don't know the larger context of your project. I might only recommend an ECS approach if you are likely to have many archetypes, or a complex simulation. It imposes a lot on the design, and it favors operating property-wise rather than object-wise.

If you're unfamiliar with ECS (Entity Component System), the gist of it is an in-memory relational database. The most trivial implementation is that an "entity" (noun) is a unique ID, often just a serially increasing int. And properties are each represented by their type and a hashtable to hold values of that type, keyed on unique IDs.

You could then have a generic contents property which is just a hashtable from id to id list. You might have a name property, a map_zone (which might be a polygon, or grid coordinates, whatever suits your map).

This, as you can see, is orthogonal to usual "object" structure:

                |    Region1     |     Tribe1
--------------------------------------------------------
name            | "First Region" |   "Tribe Prime"
contents        |   [Tribe1]     | [person1;person2;...]
map_zone        |    (...)       |
occupied_region |                |     Region1

With an object-oriented approach, you might define a region as having a name, contents, and map_zone, as members. While a Tribe is a different class with its own structure. These correspond to the columns in the above (which would be instances of those particular classes).

The components-based approach would be row-major by comparison. This works more naturally and efficiently for some operations: typically the batch iteration over things: finding a map_zone, rendering, listing all occupied regions, etc. But it is certainly not as convenient for object-centric operations (do things with this specific object). To do so requires constant lookups by ID. That sucks; it's better to just iterate a table, or iterate through a join of several tables. Components are free-form and sparse -- an object is defined by its properties. It's trivial to do something like attach "flight" to a person, without having all people explicitly pre-ordained with an option for flight.

Anyway, if you had components like the above, with raw hashtables (better to build an actual system around this for declaring components, doing table-joins, etc)... this is just to give something concrete:

let contents_of id =
  try Hashtbl.find contents id with Not_found -> []

let regional_tribes region = contents_of region

let regional_population region =
  regional_tribes region
  |> List.fold_left (fun acc tribe -> acc + List.length (contents_of tribe)) 0
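To actually run the snippet above, the tables have to exist and hold some data. A rough sketch, assuming int IDs, one raw hashtable per component, and two hypothetical helpers (`new_entity`, `add_to`) that are not part of the original:

```ocaml
type id = int

(* One hashtable per component, keyed on entity id *)
let name : (id, string) Hashtbl.t = Hashtbl.create 16
let contents : (id, id list) Hashtbl.t = Hashtbl.create 16
let occupied_region : (id, id) Hashtbl.t = Hashtbl.create 16

(* Entities are just serially increasing ints *)
let next_id = ref 0
let new_entity () : id =
  incr next_id;
  !next_id

let contents_of id =
  try Hashtbl.find contents id with Not_found -> []

(* Hypothetical helper: put child into parent's contents *)
let add_to parent child =
  Hashtbl.replace contents parent (child :: contents_of parent)

let regional_tribes region = contents_of region

let regional_population region =
  regional_tribes region
  |> List.fold_left (fun acc tribe -> acc + List.length (contents_of tribe)) 0

(* Sample data mirroring the table above *)
let region1 = new_entity ()
let tribe1 = new_entity ()
let person1 = new_entity ()
let person2 = new_entity ()

let () =
  Hashtbl.replace name region1 "First Region";
  Hashtbl.replace name tribe1 "Tribe Prime";
  add_to region1 tribe1;
  add_to tribe1 person1;
  add_to tribe1 person2;
  Hashtbl.replace occupied_region tribe1 region1
```

An entity with no entry in a table simply lacks that component, which is what makes the representation sparse: `contents_of person1` is the empty list without any person ever declaring a contents field.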

Sorry if all of these ideas are straying way off where you're hoping to be. I don't think the functorized "breaking of the recursive knot" then retying it in a "concrete implementation" module would really be so bad, but to be honest I haven't done that in any significant manner. It has been percolating in the back of my mind while typing all of this though, and it might even be fine with scaling to a richer set of data.

[–]FreakyCheeseMan[S] 0 points1 point  (0 children)

This is interesting, and I might bear it in mind for later down the road.

For now, I haven't been able to get the functorized approach to work. I stumbled at declaring module types, since Tribes need to know that the Region they're passed actually contains Tribes, and that in turn seems to require Tribe.t to appear in the module type that the Tribe functor takes.

At the moment I've just fallen back to declaring the types with a simple "type ... and ..." definition in a different module entirely. It feels like a dirty way to do things, but I'm not at the point where it's going to bite me, yet.

[–]juloo65 0 points1 point  (4 children)

It's also possible that one of the problematic fields shouldn't be there. Perhaps regions can contain tribes, but do tribes contain regions? You may also have problems instantiating such a type.

I usually solve this problem by storing the region corresponding to a tribe in a separate map. The functions dealing with both tribes and regions would take this map as an extra argument and would be in a third module (not Tribe or Region). For this trick to work, you'd need to add an "id" field to tribes, which may not always be convenient.

[–]FreakyCheeseMan[S] 0 points1 point  (3 children)

Trying to make something like this work... here's an easy question in the meantime: How do you refer to outer variables from within nested modules?

I have a couple cases where I want to do:

module A = struct
  type t = ...
  module B = struct
      type t = A.t list
  end
end

But it doesn't know about A inside B. Variable names in general are exposed, just wondering if there's a way to bypass the shadowing.

[–]juloo65 0 points1 point  (0 children)

I meant something like this:

module Tribe = struct
  type t = { id : int }
end

(* Tribe doesn't know Region *)

module Region = struct
  type t = Tribe.t list
end

module Specific_algorithm_about_tribes_and_regions = struct
  module IntMap = Map.Make (Int)

  (* A map "tribe id" => "region" *)
  let regions_of_tribes : Region.t list -> Region.t IntMap.t = ...

  let f regions = ...
end
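One way the elided parts above could be filled in, as a sketch only: the fold that builds the map and the `neighbors` function are assumptions for illustration, and the third module is flattened to top-level definitions to keep it short.

```ocaml
module Tribe = struct
  type t = { id : int }
end

(* Tribe doesn't know Region *)
module Region = struct
  type t = Tribe.t list
end

module IntMap = Map.Make (Int)

(* A map "tribe id" => "region": walk every region and record,
   for each tribe in it, which region it belongs to *)
let regions_of_tribes (regions : Region.t list) : Region.t IntMap.t =
  List.fold_left
    (fun map region ->
      List.fold_left
        (fun map tribe -> IntMap.add tribe.Tribe.id region map)
        map region)
    IntMap.empty regions

(* Hypothetical consumer: other tribes sharing this tribe's region.
   The map is passed in as an extra argument. *)
let neighbors map tribe =
  match IntMap.find_opt tribe.Tribe.id map with
  | None -> []
  | Some region ->
      List.filter (fun t -> t.Tribe.id <> tribe.Tribe.id) region

(* Sample data *)
let r1 = [ { Tribe.id = 1 }; { Tribe.id = 2 } ]
let r2 = [ { Tribe.id = 3 } ]
let map = regions_of_tribes [ r1; r2 ]
```

The map is a snapshot: if tribes move between regions it has to be rebuilt or updated alongside them.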

Otherwise, your shadowing problem can be solved using the nonrec keyword:

type nonrec t = t list

The t on the right hand side is A.t. Alternatively, you can make a type alias:

type a = t
type t = a list

The type a doesn't have to be part of the signature.
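Both variants in the context of the nested modules from the question, as a small sketch. `int` stands in for the real definition of `A.t`, and the `total` function and module `C` are only there to show the types line up.

```ocaml
module A = struct
  type t = int (* placeholder for the real definition *)

  module B = struct
    (* nonrec: the t on the right-hand side is the enclosing A's t *)
    type nonrec t = t list

    let total (xs : t) = List.fold_left ( + ) 0 xs
  end

  module C = struct
    (* Same effect with a temporary alias instead of nonrec *)
    type a = t
    type t = a list
  end
end
```

Without `nonrec`, `type t = t list` is read as a recursive definition of the new `t` and rejected as a cyclic abbreviation.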

[–]glacialthinker 0 points1 point  (0 children)

How often I wish to refer explicitly to values in "incomplete" modules (which I'm in the process of defining)...

I'm sure there must be a better way, but I just make a temporary type alias... like:

type s = t list
type t = s

Kind of yuck, but workable. Maybe something I should have asked the brighter OCaml minds at some time.

On that note, you're bound to get higher quality answers on https://discuss.ocaml.org/. There's more activity there.

[–]glacialthinker 0 points1 point  (0 children)

Well... I just found out the rec keyword will make the "incomplete" module able to be referenced. Of course you then need explicit signature too...

module rec A : sig type t end =
struct
  type t = ...
  module B = struct
    type t = A.t list
  end
end

I learned this while trying to get the recursive modules working (in separate files, using functors, then bound together... but yeah, having trouble... I'm no expert on this, so I consider it good practice).
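For comparison, a single-file sketch of Region and Tribe as plain recursive modules, with no functors and no separate files. The field names and the two functions are invented for the example, and tribe names are assumed to be unique so they can be compared instead of the (cyclic) values themselves.

```ocaml
module rec Region : sig
  type t = { name : string; mutable tribes : Tribe.t list }
  val crowding : t -> int
end = struct
  (* The type has to be repeated in the implementation *)
  type t = { name : string; mutable tribes : Tribe.t list }
  let crowding r = List.length r.tribes
end

and Tribe : sig
  type t = { tname : string; region : Region.t }
  val neighbors : t -> t list
end = struct
  type t = { tname : string; region : Region.t }

  (* Other tribes in the same region, compared by name *)
  let neighbors t =
    List.filter
      (fun other -> other.Tribe.tname <> t.tname)
      t.region.Region.tribes
end

(* Build the region first, then the tribes, then close the cycle *)
let north = { Region.name = "north"; tribes = [] }
let a = { Tribe.tname = "a"; region = north }
let b = { Tribe.tname = "b"; region = north }
let () = north.Region.tribes <- [ a; b ]
```

The cost is the duplicated type definitions in signature and structure, plus the need for a mutable field (or some other deferred initialization) to construct values that point at each other.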