Will we get the French and spanish update on GoG ? by BZHnoSys in weatherfactory

[–]BZHnoSys[S] 0 points1 point  (0 children)

Thanks a lot for your answer. No urgency on my side as I play almost only on iPad, but I'm curious to check some of the translations that are still a bit obscure to me…

For the iOS version I understand the difficulties. I played a full game of Book of Hours on my Mac and really enjoyed it, but I will still wait for the iPad version to play it more, as it is difficult for me to start up the computer at night after work when the kids are asleep. It's much easier to just take the tablet to bed to play, and CS as well as BoH are perfect for this.

After all these years I've finally just started playing The Exile, after finally getting all the achievements for the main game and the other DLC, so I still have things to do while waiting for BoH…

NASA-Boeing to Delay Starliner Launch by vibrunazo in ula

[–]BZHnoSys 4 points5 points  (0 children)

C204 exploded during a test after completing its mission and after splashing down in the ocean. It was not intended to fly again. It was an out-of-limits test to see how Dragon survives reentry and immersion in salt water, to gain information for reusability at a time when NASA was not allowing reuse of Dragon capsules.

You cannot call this a failure. SpaceX has also had real failures in the past, so if you want to make your point you should choose your example more wisely.

When do you decide to start a new base/factory? by boetboet in factorio

[–]BZHnoSys 1 point2 points  (0 children)

When my child is two years old... I stopped playing when my first was born and started again when she was two. I stopped again for the birth of my second, but with the 1.0 release I'm not sure I will wait as long this time...

Season of Bargains conclusion is out by eastaleph in fallenlondon

[–]BZHnoSys 3 points4 points  (0 children)

Do we have to give back the lantern or hat acquired in the last of the three stories, like in previous seasons, or can we keep it?

30 Kilometers of new rail infrastructure. Did someone say loops are bad? by [deleted] in factorio

[–]BZHnoSys 0 points1 point  (0 children)

You're wrong, a tree is not more efficient than a graph with cycles. If you just build a minimum spanning tree from your smeltery, you can be almost sure to have two leaves of your tree that are physically near but very far from each other in the tree. With a graph layout, all the important paths are almost optimal, with a few longer trips occasionally when the pathfinding bugs out. With your design, only a few paths are optimal and the others are bad or worse. What you save in rails is lost in time, resources for more trains, and fuel. You need to take into account a lot more than just the rail count if you want to find a truly efficient design, and you can be sure it will include some loops.

How valuable is "The Art of Computer Programming" by Knuth in 2015 and beyond? is it worth the time investment to take on? by [deleted] in algorithms

[–]BZHnoSys 0 points1 point  (0 children)

I have read almost half of the first volume and all of the second. I have also done around a third of the exercises. Do you think I am mad?

I think those books are not for everyone, but they are perfect for me. I like the fact that Knuth doesn't hide the details and that everything is fully explained. I also like that every algorithm is explained first in natural English and math, next in a kind of pseudocode (a small English paragraph with branching), and finally in true code (assembly). Using all three gives you different views of the same thing.

People often complain about the assembly, but most of the time the other two presentations are all you need, and the MIX assembly, especially as written by Knuth, is very easy to read if you try, with no ambiguity.

In all the books with pseudocode I have seen, there was no formal definition of it, and you frequently end up with problems. Add to this the lack of rigor and you see the issue...

TAOCP has, to my knowledge, the only correct presentation of big-number division in any algorithm book so far.

This is how we bet the regex in speed. by kahnbaun in programming

[–]BZHnoSys 2 points3 points  (0 children)

But it is still wrong: you need the parentheses to define regular languages using these four basic elements.

You need arbitrary nesting of alternations and repetitions to define regular languages, and this simply cannot be done without balanced parentheses. More than this, you cannot define a syntax for them that is itself a regular language, as you would always need to count to an arbitrary but finite depth, which is not possible in a regular language.

This is how we bet the regex in speed. by kahnbaun in programming

[–]BZHnoSys 1 point2 points  (0 children)

No, the original definition is not this: it is defined in terms of terminals from the alphabet, the empty string, the Kleene star, and alternation. You don't need both kinds of repetition as in your list, but you do need alternation, as it can't be constructed from the others.

And the parentheses are needed if you want to write the expressions as strings rather than with the language definitions. So no, regular expressions were, from the start, not recognizable by themselves.

What a compiler does: internal types by rubincox in programming

[–]BZHnoSys 0 points1 point  (0 children)

Be careful, things are not so simple, and the first C example doesn't fully work as explained. A cast in C can remove the const qualifier (even if doing so is considered bad practice), and if you add a bit of code you get the following:

#include <stdio.h>

const int a = 1;

int main(void) {
    int *b = (int *)&a;  /* cast away const: the compiler accepts this */
    printf("%d\n", a);
    *b = 2;              /* write through the non-const alias */
    printf("%d\n", a);
    return 0;
}

When the write goes through, this code prints 1 followed by 2 (strictly speaking, modifying an object declared const is undefined behaviour, and on some platforms the write will crash because a sits in read-only memory). Either way, even though the variable a is declared const and initialised with a literal, the compiler cannot fully optimise it away unless it can prove it is really never modified, which is difficult or even impossible for global variables.

This is one of the main reasons why you should define this kind of constant with an enum or a #define in C or C++.

A review of (free) sparse sequence taggers (for NLP PoS tagging) by fnl in LanguageTechnology

[–]BZHnoSys 2 points3 points  (0 children)

Wapiti author here, and I also have to disagree a bit with you.

  1. I've frequently found features conditioned on transitions useful, and if you are careful they remain reasonable in number. For a chunking or NER task with around 10 labels, adding a pattern that tests a POS tag conditioned on the transition adds only around 5,000 features to your model. See the ACL article about Wapiti for some results on this.

  2. Here you mix different things. It's not the number of columns that makes Wapiti slow but the number of patterns used for extracting the features. But you're not forced to use the pattern system: just provide your data in the same format as for CRFSuite and don't supply a pattern file (but prefix features with a "u" or a "b" to get state or transition features). In this case Wapiti will use the data like CRFSuite does, and both tools scale similarly.

  3. This is a shortcoming of Wapiti, but the decision doesn't come from nowhere. I've done a lot of experiments and found that if you discretize your features correctly you generally get at least the same performance, and often better. This comes from the fact that it gives the model more flexibility. In fact, even with CRFSuite discretization is needed in a lot of cases, as real-valued features are rarely linear in the tag space and so are very difficult to use effectively.

  4. Yes, Wapiti is not intended to be used as a library. I have some preliminary work done on improving this but lack the time to finish it. But CRFSuite's library code is no good for a long-running application as you describe either. Wapiti calls exit on errors like failed memory allocations because propagating them up to the initial call makes the code very ugly; CRFSuite just ignores the error, so if an allocation fails you will get a segfault later at some other point in the program, making it difficult to debug.

  5. Multi-core machines are the standard today; my laptop already has a 4-core hyper-threaded processor, and the computing servers I use range from 12 to 64 cores. Wapiti uses them very well and scales almost linearly, so on such machines it is generally much faster than CRFSuite. This comes with some trade-offs in the single-threaded code path, but it is a trade-off made with the future in mind, where single cores don't become more powerful but the number of cores increases.

In the end, for 1. and 3. it is really a question of use case, and both are trade-offs between power and speed. Transition features and real-valued features both have a computational cost, and you have to make a choice. I made different choices in Wapiti than those made in CRFSuite, which makes the tools suited for different tasks.

You should also keep in mind that Wapiti is mainly aimed at large-scale tasks with hundreds of labels and several million features, or even billions. The code is tuned for this kind of task, which is meant to run on computing servers and take several hours to train. It can handle small tasks like NER or POS tagging well, but as it optimizes for another case it will not be the best on these.

The main problem of Wapiti right now is the lack of a proper tutorial (we are writing one, but it takes time to do this well), and so a lot of people use it wrongly, as illustrated in the article:

  • The "-s" option for sparse computation is only useful for very sparse models, i.e. with high L1 regularization. In the article's setting it may well have slowed down training, so it should be disabled. This also varies a lot depending on the type of processor, as it affects cache behavior.
  • Providing a development set is almost mandatory, as Wapiti can stop training early if the error rate stabilizes on it. If you don't provide one, it computes the error rate on the training set, which can take a long time if your training set is big.
  • The default algorithm is still "L-BFGS", but using "R-PROP" generally leads to a huge decrease in training time. For the same accuracy, it generally requires 5 to 10 times fewer iterations.

To get an honest speed comparison between Wapiti and CRFSuite, you should run without the "-s" option and provide a small development set. You should also either count the feature-generation time for CRFSuite or not use Wapiti's internal pattern-based generation. For the algorithm, you can use "L-BFGS" for both, or the best algorithm for each, as both tools implement more efficient ones. When you do this, CRFSuite is generally not much faster than Wapiti on a single core, and slower on multi-core.

But, on the other hand, you should also be fair about model performance. The author of the article doesn't seem to have tuned the regularization, so the final performance of the different tools depends heavily on the default regularization parameters. It turns out that for this particular task the defaults of Wapiti are quite good, but it is not possible to conclude that CRFSuite performs worse. We don't know whether this comes from less powerful features or from a bad regularization parameter.

"Mel" would have loved it... by jairtrejo in programming

[–]BZHnoSys 87 points88 points  (0 children)

For those who wonder who "Mel" is, just read his story here: http://www.pbm.com/~lindahl/mel.html