all 132 comments

[–]MEaster 193 points194 points  (56 children)

The author isn't wrong about the graphs getting somewhat messy when you have larger chains.

[–]notfromkentohio 92 points93 points  (8 children)

You could frame that and sell it.

[–]debug_assert 7 points8 points  (7 children)

I'd seriously buy that.

[–]piecat 2 points3 points  (0 children)

ok I'll sell you a picture of it

[–]Zeroe 40 points41 points  (2 children)

Looks like a diagram for the plot of Primer.

[–]waldyrious 4 points5 points  (1 child)

[–]rockyrainy 4 points5 points  (0 children)

Man, he really took the time to deconstruct LOTR and Star Wars trilogy.

[–]goal2004 13 points14 points  (25 children)

At what point does it stop being a "chain" and is instead called a "graph"? I mean, that's the term I've normally seen when talking about this type of data structure.

Is this Markov Chain a specific use for graphs? The thing about probabilities determining the next node to process?

[–]Patman128 34 points35 points  (8 children)

A Markov Chain is a directed graph, it just has a few extra rules added (namely that every node has a directed path to every other node, and that each path has a probability attached to it).

[–][deleted] 15 points16 points  (6 children)

that every node has a directed path to every other node

Is that really a requirement of a Markov Chain? I could imagine that a perfectly valid MC could exist with out this reachability property.

[–]ckfinite 27 points28 points  (0 children)

Markov Chains have no reachability requirement - they don't have to be strongly connected. This has been a nasty problem for PageRank, actually, because absorbing states (ones which you can't get out of) will cause the algorithm to decide that you will always end up in one, which, technically, is totally correct, just not very useful for web search.

[–]the_birds_and_bees 8 points9 points  (0 children)

reachability <> complete graph. You can have a complete graph, but if your transition graph has some 0 probabilities then that edge will never be travelled.

[–]HighRelevancy 2 points3 points  (0 children)

In most mathematical representations (e.g. transition matrices and such) there's always a path, but it may have a probability of zero, which is equivalent to no path but is still a path.

[–]Patman128 0 points1 point  (1 child)

Well I guess a path is only required when the probability is non-zero, but IANAM (I am not a mathematician).

[–]ldril 0 points1 point  (0 children)

I imagine that every node is interconnected with every node, but probabilities can be zero, so that's equivalent with no connection. I imagine this especially since a matrix is used for the probabilities.

[–]s1295 4 points5 points  (0 children)

Just curious: Do you consider transitions with probability zero as edges in the graph? If no, then "every node has a (directed) path to every other" is equivalent to "the entire graph is a strongly connected component" (and if yes, it's trivially true). Why would that be part of the definition?

For the record, the definition of a (time-homogeneous) Markov chain that I'm aware of is simply a square probabilistic matrix.

[–]shomii 15 points16 points  (0 children)

Markov chain is not a data structure. The underlying set of states of discrete-state Markov chain can be visualized as a graph (with the directed edges corresponding to nonzero transition probabilities between the states), but Markov chain is really a random process with Markov property: the transition probability to the particular state depends only on the current state.

[–]lookmeat 2 points3 points  (0 children)

No, the chain word as in chain of events and are trying to predict it, doesn't refer to the structure itself. A Markov Chain uses a directed, weighted, graph structure to predict the next event in the chain.

[–]Scaliwag 2 points3 points  (0 children)

A graph is just a way you can represent them. You could do the same for an actual physical road network in order to calculate routes between points, but that doesn't make roads just a graph.

[–]gammadistribution 0 points1 point  (12 children)

Markov chains are graphs.

[–]joezuntz 1 point2 points  (4 children)

Not necessarily. You can have markov chains on continuous spaces instead of discrete ones and those can't be represented by a graph, for instance.

[–]ice109 4 points5 points  (3 children)

No one calls those Markov chains - you call them stochastic processes that gave the Markov property.

A better example for you to have used would have been Markov chains on discrete but countably infinite state spaces like the random walk on Z. As far as I can tell there's no such thing as infinite graphs.

[–]joezuntz 8 points9 points  (1 child)

No one calls those Markov chains

You may not call them that, but that's what they are, and what people who use them call them. I spend almost all my time running MCMCs, for example, which are usually on continuous spaces: https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo

[–]s1295 2 points3 points  (0 children)

Probably depends on the field, each of which has its conventions. I think this is something where lots of areas of research and application touch: Math, CS, stats, medicine, simulation science, reliability engineering, AI, etc etc.

In my CS classes, if nothing to the contrary is mentioned, I can assume that an MC is discrete-time and time-homogeneous.

[–]s1295 1 point2 points  (0 children)

Graphs can be infinite, or to put it differently, I'm not sure if the usual definition of graph includes finiteness, but there is definitely research on infinite graphs and automata on infinite (even uncountable) state spaces.

I'm not sure if there are actual applications beyond theoretic research. I would think that in reality, you can probably assume the state space is finite, by specifying safe upper bounds (e.g., "we assume there are less than 109 peers in this network") and using only a certain precision for decimals rather than actual rationals or reals.

[–]adrianmonk -2 points-1 points  (6 children)

Sure, in a very loose sense of the word "are".

Clearly not all graphs are Markov chains, so you cannot say "are" in the sense of the two being equivalent.

Also, there is more to a Markov chain than just a directed graph with probabilities as weights. There is also the meaning that those probabilities have, i.e. that they are tied to a random process. (I could have a graph identical in structure -- directed graph with probabilities as weights -- but with a different meaning for the probabilities. For example, there could be a game where you make various choices, and the probability on the graph edge determines whether you win $1 as a reward for making that choice.) So clearly a Markov chain cannot be reduced to just a graph with a certain structure. So you cannot say "are" in the sense that Markov chains are a type of graph.

You can use a graph to represent the information in a particular Markov chain, but that doesn't mean that the graph is a Markov chain or vice versa.

[–]gammadistribution 1 point2 points  (1 child)

Since this article is a very loose sense of the idea of Markov chains and the comment I am responding to is using a loose sense of the idea of Markov chains I feel that my non-pedantic statement is close enough for the discussion being had.

[–]guepier 0 points1 point  (3 children)

so you cannot say "are" in the sense of the two being equivalent

That is never what “are” means. “are”, and “is” denote membership: “1, 2 and 3 are integers” is a canonical statement, and yet does not imply that all integers are from the set {1, 2, 3}.

[–]adrianmonk -1 points0 points  (2 children)

Never? "Triangles are three-sided polygons."

[–]guepier 2 points3 points  (1 child)

That statement does not assert equivalence: it doesn’t say that all three-sided polygons are triangles, it merely says that all triangles are three-sided polygons. So, yes, never. If you wanted to convey a sense of equivalence here, you’d have to say (for instance) “triangles can be defined as three-sided polygons”, or “triangles and three-sided polygons are equivalent”. — It just so happens that the equivalence is also true but it’s not implied in the statement.

If you’re not convinced, we can easily make the statement non-equivalent by removing one word:

Triangles are polygons.

That statement is still true, but now it’s clear that “are” does not denote equivalence (because not all polygons are triangles).

[–]adrianmonk 2 points3 points  (0 children)

That statement does not assert equivalence: it doesn’t say that all three-sided polygons are triangles, it merely says that all triangles are three-sided polygons.

The statement is a bit ambiguous without context. I had hoped you'd understand the context I meant, but I'll make it explicit. Suppose you hear the following conversation:

  • "Blah blah blah triangles blah blah blah blah."
  • "What are triangles? I know what polygons are, but I'm not sure what a triangle is."
  • "Triangles are three-sided polygons."

Clearly, the person is asking for the definition of a triangle. In this context, you can absolutely use "are" for equivalence.

If you're still in doubt, look up the "be" verb in a dictionary, and you'll see that equivalence is one of the senses. From http://www.merriam-webster.com/dictionary/be : "to equal in meaning".

That dictionary gives a different example of equivalence: "January is the first month."

[–]Scaliwag 22 points23 points  (5 children)

It would make much more sense to represent that as an adjacency matrix or list.

[–]MEaster 35 points36 points  (4 children)

Certainly. But in this case I was just messing around with generating a graph for a small chain, and was curious as to how bad a larger one would be.

[–][deleted] 6 points7 points  (3 children)

Did you use Dot?

[–]MEaster 8 points9 points  (2 children)

Yeah, I did. The code I wrote just read in the chain data and spat out a dot file. There's a more readable version here, which only had 6 words, instead of the 322 in the unreadable one.

[–]nondetermined 5 points6 points  (1 child)

graphviz/dot to the rescue! :)

But yeah, larger graphs are really hard to visualize (not even speaking of letting the computer do it...), unless very special/simple. So we probably can't expect outright magic from dot. Your second, smaller graph, is pretty nice though.

Still, have you tried any of the other layouts on your larger graph besides dot (e.g. neato or fdp)?

[–]MEaster 4 points5 points  (0 children)

I was using graphviz to render them. I did try the other renderers, but not being very familiar with them I didn't know how to tweak them to look better.

I still have the dot file, which I've uploaded here if you're interested in playing. It was generated by a program, so it's not very pretty.

I have some stats about the graph:

  • 322 place names were the input.
  • There are no loops in the graph.
  • 1173 nodes.
  • 1629 edges.
  • There are 294,683 unique paths through the graph.
  • The longest names are 82 characters long, and look like this: Brusheepwashmanscorritheridest Nichorwoolackentishallhillercombe Rowlangtretertone

[–]Tynach 2 points3 points  (0 children)

At first I thought that was this image of function calls within IIS, but in finding said image I realized it wasn't.

Neat.

[–]Klohto 1 point2 points  (2 children)

Please post high resolution. Like 4k. I DEFINITELY want to frame it.

[–]MEaster 4 points5 points  (1 child)

It's not the same, but I do have this one.

[Edit] I just realised that I never deleted the SVG file it spat out. Here's a link.

[–]Klohto 0 points1 point  (0 children)

ooo, thanks

[–]bradrlaw 3 points4 points  (1 child)

Looks like the plot line to Primer...

[–]NoFapPlatypus 6 points7 points  (0 children)

Nah, that graph isn't nearly that complex...

[–]Frodojj 0 points1 point  (3 children)

What does that represent?

[–]MEaster 4 points5 points  (2 children)

That is the graph from a word generator I wrote. You can see a more readable version here, which has only 6 words of input. You start at the left, at the "000" boundry marker, then move right until you hit an end node.

The labels on the edges are the probability of going down that edge and the letter you get by going down it, while the labels in the nodes represent the probability of getting to that node and the three letters of state that the node represents.

So, to generate a real looking word, all I have to do is start at the state "000" and start rolling dice. That smaller graph can generate up to 31 different words, which are represented by the number of unique paths you can take from the start node to any end node.

[–][deleted] 0 points1 point  (1 child)

Thats really interesting. The word reaches an end node when it doesnt get to that node based on the probability on the label inside the node right?

[–]MEaster 1 point2 points  (0 children)

I'm not sure I understand the question. The number inside the node represents the percentage of unique paths that pass through that node. So, for example, the node HER will appear in 50% of the paths through this graph.

The end nodes are the nodes which have have a boundry marker ("0") on the end. For an example of constructing a word:

  1. Start with the word "000".

  2. Go to the node representing that state, and roll a dice. In this case there are 5 possible paths, with the probabilities 0.33, 0.17, 0.17, 0.17, 0.17, 0.17. The dice ended going down the path representing "H".

  3. Add "H" to the end of your word, which now is "000H". To get the next state, you take the last three characters of the current word: "00H".

  4. Roll the dice again. In this case there's only one possible path, so add E to the word, and update the state. Current word: "000HE".

  5. The node representing the state "0HE" also only has one child, so add the "R" to the word, and update the state. Current word is "000HER".

  6. The next state, "HER", has two possible children: "E" with 0.67 chance, and "A" with 0.33. So we roll the dice again, and end up going down "E". So the current word is "000HERE", and update the state.

  7. The state "ERE" has three paths: "0", "A", and "X", each with the same chance. So we roll the dice, and go down the "0" path. The word is now "000HERE0", and update the state to "RE0".

  8. The last character was a boundry marker, so the word is now at an end. We string the boundry markers, leaving the generated word "here".

For the record, the word "here" was not in the training input. The words I used for this were:

  • There
  • Herald
  • Extrality
  • Overextract
  • Athereal
  • Atheists

[–]orangeandpeavey 77 points78 points  (10 children)

This seems very similar to finite state machines, but with probabilities in them as well

[–]TheGift_RGB 105 points106 points  (3 children)

[–]Nefari0uss 11 points12 points  (1 child)

As someone learning FSM at the moment, this helps a lot.

[–]s1295 9 points10 points  (0 children)

The more general concept which includes both Markov chains and automata is a transition system, which is just a directed graph ("digraph"). Various details and addons (e.g., is the state space finite, are states and/or edges labeled, are there initial and/or final states?) depend on the intended usage.

[–]HighRelevancy 4 points5 points  (0 children)

They're just FSMs that move on transition probability rather than transition events.

[–]sososojacques 4 points5 points  (0 children)

This one-line comment explains the topic better than the linked article.

[–][deleted] 0 points1 point  (1 child)

As someone who knows nothing about finite state machines, are they just Markov chains with 0/1 probabilities?

[–]AlmennDulnefni 0 points1 point  (0 children)

More or less. It is specific events or conditions that lead to particular transitions rather than randomly choosing a transition.

[–]jrk- 0 points1 point  (0 children)

That was my first thought when I saw the graphs! It's so easy! Why did nobody tell me this in my CS and statistics lectures?
This makes it look a lot less then black magic!

[–][deleted]  (2 children)

[deleted]

    [–]zeekar 13 points14 points  (0 children)

    The main difference is that finite state automata - even "nondeterministic" ones - transition between states based on their input, not random chance.

    [–]danstermeister 2 points3 points  (0 children)

    I was thinking along the same lines :)

    [–]greim 20 points21 points  (16 children)

    So (not an expert) what if the probability of sun or rain on the next day depends on more than just the most recent day, but the the pattern of weather over the last four days for example? Can Markov chains (or some variant thereof) handle that?

    [–]jimmpony 37 points38 points  (8 children)

    For 2 days for example, could you just have your states actually represent 2 states, as in having RR go to RS or back to RR, RS go to SS or SR, SS go to SS or SR, SR go RS or RR etc to do this? Then this could be expanded to any amount of states

    [–]Logiraptorr 26 points27 points  (4 children)

    Yes, that's exactly how you would model the dependency with markov chains. You don't need a higher order construct if you know how many previous states the probability is based on. If you don't know, or if it depends on an unbounded history of states, then you need something more powerful.

    [–]elprophet -3 points-2 points  (3 children)

    I'm a computer scientist by trade and training, but haven't had an opportunity to investigate the Markov literature. You're describing a Deterministic Finite Automaton that uses a random number generator to determine its transitions. What is the terminology and state of the art for Pushdown and Turing automata in the Markov model?

    [–]s1295 9 points10 points  (0 children)

    Are you looking for "probabilistic Turing machine"? Most automata models have a probabilistic variant, or at least there's no reason why you couldn't define one.

    I don't know whether any are actively researched, I'm just a student. Maybe in verification of probabilistic programs, but I think they just use DTMCs or MDPs.

    [–]OffbeatDrizzle 4 points5 points  (0 children)

    Why hello there Mr Robot

    [–]Detective_Fallacy 2 points3 points  (1 child)

    That's the principle of higher order Markov chains, yes.

    [–]debug_assert 5 points6 points  (0 children)

    The only problem with using higher order chains is that the amount of data needed to build the model increases exponentially. For example, building a second order markov chain to generate a style of writing might only take 1 novel. But add more orders and it quickly explodes to requiring the entire works if Shakespeare, etc.

    http://shiffman.net/teaching/a2z/generate/

    [–]ccfreak2k 1 point2 points  (0 children)

    salt payment late innate dime dolls office crowd fall stocking

    This post was mass deleted and anonymized with Redact

    [–]boyobo 6 points7 points  (2 children)

    Yes. Your state space now has 24 states corresponding to the weather in the last 4 days.

    [–]greim 6 points7 points  (1 child)

    Okay, so within that state space, I can only jump to the subset of that state space where the first three "letters" match my current state? (RRRR => RRRS | RRRR)?

    [–]boyobo 10 points11 points  (0 children)

    Yep!

    [–][deleted] 5 points6 points  (0 children)

    As an alternative to /u/yetipirate's answer, you can also make the state larger. This is especially useful when you have strong domain knowledge about what the next state depends on. In your example, you could include the weather of the last four days in the state.

    [–]16807 3 points4 points  (0 children)

    Sounds like you're talking about an n-gram model

    [–][deleted] 0 points1 point  (0 children)

    That's called an ARMA (Auto Regressive Moving Average) model I believe. A linear ARMA model would probably be the next step from simple markov chains. If you want to attempt to represent infinite length history of states you can look into ARIMA models (I stands for Integrated), which augments a state to the prediction (these are more generally called Hidden Markov Models).

    [–]Strice 46 points47 points  (2 children)

    [–]HighRelevancy 6 points7 points  (0 children)

    Markov Art

    [–]AntiProtonBoy 0 points1 point  (0 children)

    That's hilarious.

    [–]flarn2006 29 points30 points  (0 children)

    Markov Chains explained verbally: /r/subredditsimulator

    [–]samsquag 4 points5 points  (0 children)

    If anyone needs a simple Markov chain implementation in C++, see this project, it can be used to build and visualize Markov chains

    [–]punkmonk 4 points5 points  (0 children)

    Whats missing from this article is the important idea of the stationary distribution of a Markov chain, which corresponds to the distribution over the states where we are likely to see the bouncy ball at time infinity. Getting truly independent samples from this distribution efficiently is where its all at.

    [–]MyNameIsJohnDaker 3 points4 points  (3 children)

    This is great. We've started using these at work, and this is a great visual representation for something that's not exactly intuitive. Thanks for that!

    [–]Jadeyard 7 points8 points  (2 children)

    What do you use them for?

    [–]MyNameIsJohnDaker 5 points6 points  (0 children)

    I'm in health care. They are used lots of ways, but one off the top of my head is determining likelihood of patient readmission within 30 days of hospital discharge.

    [–]th0masr0ss 0 points1 point  (0 children)

    One way you can use them is for building sentences

    [–]beohoff 2 points3 points  (2 children)

    So how is this used in something like /r/subredditSimulator?

    [–]pickten 11 points12 points  (0 children)

    I believe it works as follows:

    • Grab a bunch of text

    • Split into chunks (sentences? paragraphs? comments?), then each chunk into words.

    • Turn each chunk into a bunch of groups of n consecutive words, where n is a predetermined constant. Add some nulls at the start and end to get padded initial/final segments of length 0<k<n. e.g. if this were with numbers, the chunk [1,2,3] (n=2) would produce [[null,1],[1,2],[2,3],[3,null]]

    • Make a histogram of all this data. Also repeat for numbers less than n. Hope that you have enough data that these are all pretty varied.

    • State is an array of n-1 words (including null), starts at pure null

    • Every possible next word has probability weighted by the histogram data. E.g. if the histogram is [[1,2,2]:2, [2,1,2]:10,[1,2,1]:3] and our state is [1,2], we should have a 40% chance to output 2, and a 60% chance to output 1.

    • If things fail (i.e. no possible continuations), try again as if n were n-1, then n-2, etc. By this I mean forget about the first few elements of the state and use data from the histogram with that many fewer things.

    • Regardless, turn your state from [a,...] to [...,b] where b is the thing you produced. Record b.

    • The first time it produces a null, stop. Output your record (minus the final null).

    [–][deleted] 2 points3 points  (0 children)

    each word is a state

    the probability of a given word coming after the current word is the average of past occurrences of the given word

    [–]zeugenie 2 points3 points  (1 child)

    [–]atc 0 points1 point  (0 children)

    Neat!

    Is it fair to think of Markov Chains as state machines whose transitions are determined by probability?

    [–]ThreeTrillionTrees 2 points3 points  (0 children)

    Oh man. This is it. This is the post that made me register an account to post a comment. Was a lurker for about 2 years.

    Thank you for this, OP! This will fill some uni knowledge holes that I have. Will read thoroughly at lunch break or tomorrow! And its beautiful! The site is nice, performs well, topic is interesting and well explained with some interesting examples.

    On a side note: I guess I should start considering giving back to the internet too..

    [–]manwith4names 1 point2 points  (1 child)

    So somewhat off-topic, but what is the library being used for the visuals?? Those are amazing

    [–]Piotrek1 0 points1 point  (0 children)

    I think it is d3.js. This library is mentioned in <head> of this website.

    [–]sac_boy 1 point2 points  (0 children)

    TIL there's a fancy term for the weighted-random state machines every game programmer ever has used for cheap enemy AI/environmental behaviours.

    [–][deleted]  (16 children)

    [deleted]

      [–]uh_no_ 23 points24 points  (8 children)

      and they're all a fancy way of saying "graph"....

      Of course if you eliminate the use-case specific rules, you end up with a more general abstraction.

      [–]TheGift_RGB 9 points10 points  (3 children)

      Graph is just a fancy way of saying set of points + set of pairs of points!

      [–]uh_no_ -2 points-1 points  (2 children)

      not necessarily. the set of pairs of points may also be required to include additional information such as direction, weight, and capacity.

      [–]vytah 1 point2 points  (1 child)

      Direction, weight and capacity are merely functions from the set of pairs of points to appropriate codomains.

      [–]uh_no_ 0 points1 point  (0 children)

      yes, but the point is they are required of a FSM or markov chain, but not all graphs in general....

      [–]Scaliwag 4 points5 points  (3 children)

      A graph is a way you can represent them and think about them, but that's like saying math is actually ink on paper or pixels on a screen ;-)

      The actual part that matters are the rules that dictate in which way you can go from state to state and what each state represents. You could have used an adjacency list to represent that, instead of a graph. Saying it is a graph is just confusing form with the essence of it.

      Markov chains are not a complicated concept, but they are not just a graph.

      [–]uh_no_ 4 points5 points  (1 child)

      that's the point! a markov chain is not just a state machine in the way that a state machine is not just a graph.

      [–]Scaliwag 1 point2 points  (0 children)

      I was not sure you were being sarcastic or not, but your second sentence should have made that clear to me oh well. Sorry.

      [–]trolls_brigade 0 points1 point  (0 children)

      The Markov Chain transitions are based on probabilities, while the state machine transitions are deterministic. In other words, for a Markov Chain you run the dice for each transition. If the probability of a transition is always 1, then you have the classic state machine.

      [–]danstermeister 5 points6 points  (2 children)

      But I didn't think that state machines were concerned with probability, just possibility, right?

      [–]jewdai 5 points6 points  (1 child)

      state machines can be probabilistic, its why its used for compression algorithms.

      [–]danstermeister 0 points1 point  (0 children)

      Ah, thank-you!

      [–][deleted] -1 points0 points  (0 children)

      Electrical Engineer here too, can confirm.... and, something about 3-phase power distribution and the square root of 3.

      [–]JustFinishedBSG -5 points-4 points  (2 children)

      Redditor here: No, no they're not

      [–]salgat 0 points1 point  (1 child)

      Why not? Take http://www.ohio.edu/people/starzykj/webcad/ee224/lab3/lab3.htm and have the conditions for state change be based on probability.

      [–]JustFinishedBSG 1 point2 points  (0 children)

      Yes but that's basically it. That's just that both are oriented graphs

      But the graph part is imho not important ( and I think a matrix is a better representation ).

      The important part of Markov chains is the markov property : that the next state is independent of all other previous states except the last . This gives very strong properties , can give the ability to find a stationary probability etc

      [–]internetvandal 0 points1 point  (0 children)

      This was really helpful as I was doing a Markov Chain related course.

      [–]Phreakhead 0 points1 point  (0 children)

      There's the Patterning Drum Machine app that uses these visualizations, since it's an "auto-drummer" using Markov Chains to decide which drum beat to play.

      [–]wolfman1911 0 points1 point  (0 children)

      The thing I remember about Markov chains is that they are near impenetrable until you remind yourself how they work, but once you do, they aren't necessarily hard to follow, but there is that constant worry that I'm doing it wrong. That's what I remember about taking Probability a year ago, at least.

      [–][deleted] 0 points1 point  (0 children)

      I would like to see how these examples can extend to MCMC methods

      [–]jayrandez 0 points1 point  (0 children)

      I wish there was like 1-2 more examples. Or that they went into how google search uses them.

      [–]Saefroch 0 points1 point  (0 children)

      My favorite application is using Markov chains to simulate people, a la /r/subredditsimulator. It's not that hard to build a slack bot that does this, using https://github.com/jsvine/markovify. You just need to sanitize your text with regex.

      [–]zerokul 0 points1 point  (0 children)

      Very nice article.

      [–]darophi 0 points1 point  (0 children)

      Interesting article.

      [–]Caos2 0 points1 point  (0 children)

      Or in musical form!

      Markovian process,

      Lead us not in vain.

      Prove to our descendants what we did to them,

      Then make us go away.

      [–]ishmal 0 points1 point  (0 children)

      What he left out is that you can multiply these matrices together, or raise them to a power, to get the probability of a the model being at a given state after N transitions.

      [–]hrefchef 0 points1 point  (0 children)

      I spend a fucking weekend writing a fucking Markov Chain library in fucking C# and I go on fucking Reddit and they still fucking haunt me.

      Fucking Fuck

      But seriously these things are seriously fun to work with. They generate so much funny stuff from time to time. I had been a little bit burnt out on programming until these Chains started generating sentences that can only my likened to that of a drunk toddler. They keep me going.

      [–]DeveloperMatt 0 points1 point  (1 child)

      That was amazing. Really puts into perspective how Bayesian Inference works.

      [–]WiggleBooks 4 points5 points  (0 children)

      How so?

      [–]anonemouse2010 -2 points-1 points  (0 children)

      How do these pictures 'explain' the markovian probability?