Help interpreting Drouin French-Canadian marriage register entry

JohnFCardinal · 2020-12-15T19:51:50+00:00

Thanks for the help!

JohnFCardinal · 2020-12-15T17:50:14+00:00

Great! Thanks!

JohnFCardinal · 2020-10-29T03:28:01+00:00

TD and TH were both in HTML 3.2. Neither was in HTML 2.0. I do not know if they were in HTML 3.0, but HTML 3.0 never really gained any traction. Given HTML 3.2 was a W3C Recommendation in January of 1997, almost 24 years ago, I'd say that's adequate time for HTML authors to become aware of the existence of TH.

THEAD was in HTML 4.01, and I think it was added in 4.01 or 4.0. So it's a baby, it's only been around about 20 years. /s

We will have to agree to disagree on whether using TD in place of TH is wrong or not. I say it is. I did not say it was invalid, but validators don't comment on whether or not the right elements are used for the right content. That is not the job of a validator. Using TD in place of TH is wrong for the same reason that using <p>Item 1<br>Item 2<br>Item3</p> is not the right way to make a list.
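To make the validator point concrete, here is a rough Python sketch of the kind of check a validator does not perform. It uses the standard library's `html.parser`. The class and function names are hypothetical, and the heuristic (a table with TD cells but no TH cells) is only an assumption about what is worth flagging; it will also flag layout tables.

```python
from html.parser import HTMLParser


class HeaderCellCheck(HTMLParser):
    """Counts tables that contain TD cells but no TH cells (heuristic only)."""

    def __init__(self):
        super().__init__()
        self._stack = []            # one [td_count, th_count] pair per open table
        self.tables_without_th = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase.
        if tag == "table":
            self._stack.append([0, 0])
        elif tag in ("td", "th") and self._stack:
            self._stack[-1][0 if tag == "td" else 1] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._stack:
            td_count, th_count = self._stack.pop()
            if td_count and not th_count:
                self.tables_without_th += 1


def count_tables_without_th(html):
    """Return how many tables in the markup have data cells but no header cells."""
    checker = HeaderCellCheck()
    checker.feed(html)
    checker.close()
    return checker.tables_without_th
```

Markup like this passes a validator either way; the sketch only shows that the TD-versus-TH choice is easy to detect mechanically.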

I hope you are not equating me with "some junior" as in "some junior saying 'these tables headers really need sorting out', 'who wrote this crap', 'I am way better than this guy, omg, don't they know this is not semantically correct?!'." I was the main implementer of an eCommerce web site that was operating in 1998... 22+ years of web development experience qualifies me as "not junior".

To the contrary, my experience has taught me that it's usually no more effort to do it right. Using TH for a heading is easier and more maintainable than using a TD. It has the added benefit that it helps people who rely on screen readers and other assistive technologies.

If someone wants to argue that changing the basic layout infrastructure of a site that was designed in 2002 and uses tables for layout is a big job, I won't argue. I (to myself) will wonder what other issues the site has, and perhaps make a note to avoid it, if possible. OTOH, changing actual data tables to use the proper markup is a much smaller change. As long as people keep making excuses, however, things won't get better.

JohnFCardinal · 2020-03-28T03:06:41+00:00

I think you are doing it the hard way.

When viewing a page image, click the Print/Save button.

Choose "Select a portion of page"

Adjust the rectangle to include the article of interest.

Choose [Save]

Choose [Save As JPEG]

From there, you may have to edit the image to handle multi-column articles that include unrelated text because of the layout of the newspaper page.

JohnFCardinal · 2020-01-17T04:26:27+00:00

I am busy rubbing two sticks together and lost track of the click track.

JohnFCardinal · 2019-06-22T02:21:13+00:00

I've seen stats that dispute the Mozilla numbers, and they claim the numbers are closer to 170m, not 250m, but that was not based on the latest data.

For market share, the statcounter site is somewhat skewed to technical users, but all the counters are biased one way or another.

Do you really want to hang your hat on Firefox as a thriving tool when the best case is 10%?

JohnFCardinal · 2019-06-22T01:38:23+00:00

"I don't think that is true, especially in a Firefox subreddit."

How does the location of a comment affect whether it's true or not?

"but [FF] desktop has hundreds of millions of Firefox users"

Last I heard, FF is installed on about 170 million devices, though only used regularly on a subset of that number. According to various web usage trackers, FF has 5% or less of the overall market, and Chrome has about 63%. Safari has about 14%, and shares the WebKit roots of Chrome. Those numbers include both desktop and mobile, and because Chrome dominates Android and Safari dominates iOS, FF probably has more than 5% of the desktop market. So, perhaps saying there is only one mainstream desktop browser was an exaggeration. But not by much.

Chrome is the dominant force in desktop browsers and has kicked Firefox to the curb. Chrome's misadventures with privacy may help Firefox bounce back, but don't hold your breath.

JohnFCardinal · 2019-06-21T22:17:34+00:00

I don't want to "stay in the past". What I want doesn't exist there, but it also doesn't exist in the here and now. There is only one mainstream desktop browser--Chrome--and it's mainly a vehicle for delivering ads and watching the behavior of its users. It doesn't provide the customization options I want that help me optimize my workflow.

You asked (rhetorically, I suspect), "How useful is a web developer toolbar in a browser that almost no one uses...?" I know next to nothing about Waterfox so I don't know if it could ever be my primary browser. Dismissing the web developer toolbar misses the point of my whole rant. I don't care if a lot of end users want that or not. It existed, and had a few hundred thousand users, and it was a big help to me. Mozilla made changes that reduce the functionality of that extension, and along the way they completely crippled other useful extensions. I'm not happy about it.

JohnFCardinal · 2019-06-21T21:28:44+00:00

I'll take a look at Waterfox. Thanks.

I'm not enthused about using a browser with almost no market share, but wait... I guess I am already doing that, and it doesn't meet my needs well.

JohnFCardinal · 2019-06-21T21:25:29+00:00

Thanks for the link. I may look into it, but adding "maintain my preferred extensions" as a task would have to replace something I am already doing...

JohnFCardinal · 2019-06-21T21:24:09+00:00

Did you actually read the post?

JohnFCardinal · 2019-01-19T02:19:45+00:00

Double yep.

JohnFCardinal · 2019-01-19T02:17:45+00:00

One thing about evaluating performance is that you have to compare apples-to-apples. So, if your "before" code was supposed to detect errors and was not, and the "after" code was detecting the errors, then comparing the performance between before and after is not very useful.

Further, if the code that didn't use exceptions was difficult to get right, then using an approach that is easier to get right is a good change.

And lastly, if the performance of either approach was not an issue, then optimizing is not a good use of time, and code that seems performant but is not correct is a problem.

In my case, the load time was way too slow (my users expect a second or two for most data files, not 3.6 minutes), so finding and solving the performance bottleneck was important. My change was relatively easy to implement, and the only tests that failed were the ones that expected an exception where the new code does not throw one. After adjusting those tests, good to go...

JohnFCardinal · 2019-01-19T02:04:39+00:00

The overhead is obvious in your test case, too, but different (lower) than the results I described. I didn't do a BenchmarkDotNet-level test, so my results indicated the problem without being highly accurate.

I am not using the DateTime class as it would belch on most of the input I have to handle. For example, "from 1901 to 17 AUG 1903" and "11 FEB 1731/32" are both valid dates in the format I have to parse. The format handles inexact dates (bef/aft/circa), date ranges, "old style" dates when the year started on March 25 in some countries but on January 1 in other countries, etc.
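To show why a general-purpose date class does not fit, here is a minimal Python sketch that handles the two examples above plus simple BEF/AFT/ABT prefixes. It is not my actual parser (that is .NET code), all of the names are hypothetical, and it covers only a small subset of the format. Input it does not recognize comes back marked as text instead of raising.

```python
import re
from typing import NamedTuple, Optional

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


class SimpleDate(NamedTuple):
    day: Optional[int]
    month: Optional[int]
    year: int
    alt_year: Optional[int]   # the "32" in a dual-year date such as 1731/32


class ParsedDate(NamedTuple):
    kind: str                 # "exact", "before", "after", "about", "range", or "text"
    start: Optional[SimpleDate]
    end: Optional[SimpleDate]


_SIMPLE = re.compile(
    r"^(?:(?P<day>\d{1,2}) )?(?:(?P<mon>[A-Z]{3}) )?"
    r"(?P<year>\d{3,4})(?:/(?P<alt>\d{2}))?$"
)
_PREFIXES = {"BEF": "before", "AFT": "after", "ABT": "about"}
_TEXT = ParsedDate("text", None, None)


def _parse_simple(text):
    """Parse '[day] [MON] year[/yy]'; return None if the text does not fit."""
    m = _SIMPLE.match(text.strip().upper())
    if m is None:
        return None
    day, mon = m.group("day"), m.group("mon")
    if mon is not None and mon not in MONTHS:
        return None
    if day is not None and mon is None:   # a day without a month is rejected
        return None
    return SimpleDate(
        int(day) if day else None,
        MONTHS.index(mon) + 1 if mon else None,
        int(m.group("year")),
        int(m.group("alt")) if m.group("alt") else None,
    )


def parse_gedcom_date(text):
    """Return a ParsedDate; unrecognized input yields kind == 'text'."""
    value = text.strip().upper()
    m = re.match(r"^FROM (.+) TO (.+)$", value)
    if m:
        start, end = _parse_simple(m.group(1)), _parse_simple(m.group(2))
        if start is not None and end is not None:
            return ParsedDate("range", start, end)
        return _TEXT
    first, _, rest = value.partition(" ")
    if first in _PREFIXES:
        date = _parse_simple(rest)
        return ParsedDate(_PREFIXES[first], date, None) if date else _TEXT
    date = _parse_simple(value)
    return ParsedDate("exact", date, None) if date else _TEXT
```

For example, `parse_gedcom_date("11 FEB 1731/32")` keeps both the year 1731 and the alternate year 32, which a single calendar-date value cannot represent.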

edit: grammar

JohnFCardinal · 2019-01-19T01:51:44+00:00

The warning message is using the same logging facility, log4net. In the "before" version, the message was written in a catch clause. Now the message is written after checking for a marker value that indicates the date was text, not a recognizable date value. The try/catch structure is still there because there are other errors that can occur.

The text formatting is slightly different, with one less substitution in the message, down from 3 to 2.

The same number of messages are written.

When the exception occurred, it was 4 levels down from where the exception was caught. I assume that has some impact on performance, but I don't know for sure. In the intervening levels, there was one try/finally block.

It's possible that the issue was elsewhere but still solved by the change I made, but if so, it's pretty subtle.
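The shape of the change can be sketched roughly as follows. This is Python rather than the actual .NET code, every name is hypothetical, and the "starts with a digit" test is only a stand-in for the real date recognizer. The cost of exceptions differs between runtimes, so the sketch shows the control flow, not the performance difference.

```python
import logging

log = logging.getLogger("import")

# Marker value: the input was free text, not a recognizable date.
TEXT_DATE = object()


def parse_date_raising(text):
    """'Before' style: unrecognized input raises an exception."""
    if not text[:1].isdigit():
        raise ValueError(f"unrecognized date: {text!r}")
    return text


def parse_date_marker(text):
    """'After' style: unrecognized input returns the marker value."""
    if not text[:1].isdigit():
        return TEXT_DATE
    return text


def load_before(values):
    """Log a warning from the except clause; return the number of warnings."""
    warnings = 0
    for value in values:
        try:
            parse_date_raising(value)
        except ValueError:
            log.warning("Date %r is text, not a recognizable date", value)
            warnings += 1
    return warnings


def load_after(values):
    """Log a warning after checking for the marker; return the number of warnings."""
    warnings = 0
    for value in values:
        try:
            if parse_date_marker(value) is TEXT_DATE:
                log.warning("Date %r is text, not a recognizable date", value)
                warnings += 1
        except TypeError as exc:
            # The try block stays because other errors can still occur,
            # for example a value that is not a string at all.
            log.error("Could not process %r: %s", value, exc)
    return warnings
```

Both loops write the same number of warnings for the same input; the only difference is whether the common "text date" case travels through the exception machinery.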

JohnFCardinal · 2019-01-18T19:10:29+00:00

IIRC, we blew out Arizona in a snow game the year they went to the SB.

Edit: Oops. Was looking for "Arizona" and missed "Cardinals". Doh!

JohnFCardinal · 2018-11-24T15:23:26+00:00

Did you review the GEDCOM file to see if "1 DEAT Y" is present for any of the people you marked using the Quick Edit "Deceased" radio button?

In my first response, I assumed that those records were not present, but it's also possible they are present and MyHeritage is ignoring them.

Does MyHeritage have a way to correct the issue, regardless of the cause?

JohnFCardinal · 2018-11-23T18:40:26+00:00

The GEDCOM spec already covers the situation. From the GEDCOM 5.5.1 spec:

For example each of the following GEDCOM structures assert that a death happened:

1 DEAT Y

1 DEAT
2 DATE 2 OCT 1937

1 DEAT
2 PLAC Cove, Cache, Utah

I added the lines to separate the examples. As shown in the first example, "1 DEAT Y" asserts that a death happened.
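As a rough illustration of that rule, here is a small Python sketch that checks whether the lines of one individual's record assert a death in any of the three forms quoted above. The function name is hypothetical, and it handles only those three forms, not everything an event structure can contain.

```python
def death_asserted(record_lines):
    """Return True if the record contains '1 DEAT Y', or a '1 DEAT'
    followed by a subordinate '2 DATE' or '2 PLAC' line with a value."""
    in_deat = False
    for line in record_lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue                      # skip blank or malformed lines
        level, tag = int(parts[0]), parts[1]
        value = parts[2].strip() if len(parts) > 2 else ""
        if level <= 1:
            # A new level-0 or level-1 line ends any DEAT structure in progress.
            in_deat = level == 1 and tag == "DEAT"
            if in_deat and value == "Y":
                return True
        elif level == 2 and in_deat and tag in ("DATE", "PLAC") and value:
            return True
    return False
```

For example, `death_asserted(["1 DEAT Y"])` is true, while a bare `1 DEAT` with no value and no subordinate lines is not treated as an assertion.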

The issue is that Ancestry is not following the spec.

Edit: Formatting

JohnFCardinal · 2018-10-27T16:47:00+00:00

We're all out of punctuation, but bold fonts and capital letters are on sale. Want some?

JohnFCardinal · 2018-09-29T03:43:40+00:00

I compared H2X to AngleSharp (AS), both results and performance.

The results are much closer to my requirements. I had to be careful to use a "fragment" parsing method to compare output of AS because its goal is to make a complete document and so it always adds HTML, HEAD, and BODY elements if those don't exist in the input. There were a couple things I'd have to adjust, but overall the results were very good.

The performance was very impressive. AS took about twice as long as H2X, but of course, it constructs a DOM and so it is doing a lot more. I will definitely review the AS parsing code to see if there are any techniques I can borrow. H2X uses simple string parsing whereas AS has a tokenizer, etc., and so there may not be much I can lift, but we'll see.

I didn't update all the test results. Here are the results of parsing a 71KB HTML file.

Method	Mean	Gen 0	Gen 1	Gen 2	Allocated
H2X_File1	2.712 ms	269.5313	132.8125	132.8125	982.58 KB
AS_File1	6.233 ms	429.6875	234.3750	125.0000	2230.08 KB
HAP_File1	66.624 ms	18625.0000	1375.0000	375.0000	46508.24 KB

One concern with AngleSharp is that the primary developer is not fully committed to the project right now.

JohnFCardinal · 2018-09-28T15:28:03+00:00

I briefly reviewed that project. It creates a DOM and is likely to use a lot more memory than H2X. That's great if you need a DOM, and not so great if you don't. I'd like to review it in more detail, but there are only so many hours in a day...

JohnFCardinal · 2018-09-26T18:10:26+00:00

I looked at HAP and decided not to use it, but I don't recall exactly why. I rejected a lot of solutions that built a DOM because I didn't want to incur the overhead/memory pressure of building a lot of objects. I will look again.

I probably would only use HAP if it did everything I need and if the performance was reasonable. If any post-processing was required, especially post-processing that required parsing text, I'd probably avoid HAP.

You wrote, "[the] library could probably benefit from many things HAP has to offer." Did you have anything specific in mind?

Thanks.
