Read Latex

Sunday, April 22, 2018

More AI, More ML: An Open Letter to Ancestry.com and 23andMe.com

That's it. You can stop reading now. Just do the title. How hard can it be?

Ancestry.com principally, and 23andMe.com to a lesser extent, let you use their genealogical services to assemble a family tree. I will focus on Ancestry here, but similar reasoning applies to 23andMe.com. There are two components to the family-tree building process: the PAPER of existing records and the BIOLOGY of DNA samples, which both services analyze. However, there is a glaring problem when it comes to certifying the authenticity of family trees derived from historical documents, that is, PAPER. Do you trust the source? Can you read the document? Are the spelling changes plausible, and if so, how plausible?

By using both DNA and PAPER, one can cross-check one against the other to confirm authentic lineages and refute specious ones. But there must be quality control in both the PAPER and the DNA. Laboratory techniques for DNA handling use statistical quality control methods that are reliable. There is no equivalent quality control methodology for PAPER, which in large part has been converted to MICROFILM and digitized with varying levels of image-processing quality control.

There are chain-of-custody issues when one submits a DNA sample to either service, and one should really submit multiple samples to be sure that the correct sample has been tested and labeled. There are also handling issues as samples make their way through the postal and delivery systems. More or less, the latter issues are being addressed.

Ancestry.com currently requires you to chase hints in time and space to determine if you are related to a given candidate ancestor listed in a public record or another family tree. For large trees this can be extremely labor intensive, without guarantee that one has constructed a forensically certifiable result.


One error source is this: Ancestry lets you use others' family trees that are themselves mashups of information of dubious origin, and there is no rhyme or reason to confirming whether information in these other trees is accurate. In other words, there is no quality control. No assurance that one is dealing in fact.

The addition of DNA helps one connect with living relatives and adds ground truth to previously assembled trees. There are forensic methodologies that increase certainty, such as requiring that independent sources confirm the same information. The more redundancy among independent records, the higher the certainty that the conclusions, the facts, are authentic.

The problem is, after one's family tree gets to an interesting level of complexity, the number of hints grows exponentially and many of the 'hints' lead to completely specious assemblies of data.

The fix to this is to associate with each tree, and with each fact in each tree, a certainty that the fact in the tree is indeed true. For a given ancestral line, these certainties can be multiplied together to provide a composite value that indicates the reliability of the information. As a detail, certainty is a number between 0 and 1 inclusive. A 1 means certainty is complete (which never exists in the real world of statistics). A 0 means there is no certainty whatsoever. A certainty of 0.9 means that the fact has a 90% chance of being true. If we chain two facts together, each with a certainty of 0.9, we have a certainty of 0.81, or 81%, that both facts are true.
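To make the chaining concrete, here is a minimal sketch in Python. The function name line_certainty and the sample values are my own illustration, not anything Ancestry.com provides, and multiplying the numbers together assumes the facts are independent of one another.

```python
from math import prod


def line_certainty(fact_certainties):
    """Multiply per-fact certainties along one ancestral line.

    Assumption: the facts are independent, so their certainties multiply.
    """
    for c in fact_certainties:
        if not 0.0 <= c <= 1.0:
            raise ValueError(f"certainty must be between 0 and 1, got {c}")
    return prod(fact_certainties)


# Two chained facts at 0.9 each, as in the example above.
two_links = line_certainty([0.9, 0.9])

# Hypothetical line of ten facts, each at 0.9.
ten_links = line_certainty([0.9] * 10)

print(round(two_links, 2))  # 0.81
print(round(ten_links, 3))  # 0.349
```

Even with every individual fact at 90%, a line of ten chained facts falls to roughly 35%, which is why long lineages deserve the most scrutiny.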

There are a host of microfilmed documents from all over the earth that have been read, digitized and collated by human beings, and many of these have been collected by Ancestry.com in what constitutes a controlling monopoly over historical ancestral information. The source of this control, this power, has its roots in Mormonism. This can be a good thing in that there is a single long-term historical and motivating organization or presence. This could be a bad thing if religious exclusion occurs.

The point of my open letter is this:

Recent advances in machine learning would enable PAPER documents to be parsed by machines, which would associate a level of certainty with each fact gleaned from them. Previous entries in this blog summarize these advances in detail.

The Mormon Church and Ancestry.com have close affiliations. In Salt Lake City, Utah, both have excelled in using advanced computation to solve important problems.

The problem is that there is a financial conflict of interest at play. Many families have invested generations of time and thousands of hours of work in building family trees using manual and computational methods. They may not take kindly to having their work questioned, especially closely held beliefs or assumptions that provide them with self-esteem or status in the community.

For people who, like myself, have spent 30 years justifying that they are related to a Hindenburg or a Henry the 2nd, this will be good news and bad news. It will be good news in that it will allow a more comprehensive family tree to be assembled, more RAPIDLY and with less human error. It will be GOOD NEWS in that it will allow a precise certainty to be associated with each fact in the tree. It will be bad news for those who have a need to be related to someone famous or historic, are not, and have significant social capital invested in those claims.

I have a large tree of both famous and historic ancestors, including kings and martyrs. But I would gladly trade it off for a complete and accurate picture of who I am actually related to.

Mainly, I don't have time to chase the 31,000 hints that have popped up in my Ancestry.com Family Tree, especially when I know that machines can do it better. To that end, it is time to make more exhaustive and complete use of handwriting and document analysis using the burgeoning progress taking place in Artificial Intelligence and Machine Learning. The opportunity for true and factual historical insight could be spectacular.


Sane Public Policy With a Gun Census







I observe. I think. I've thought. The pen is mightier than the AR-15. The AR-15 killing machine will rust and jam, especially if you shoot NATO ammunition. The pen endures forever. I polish this article every time there is a mass shooting. It is getting way too polished.

We are at a social Tipping Point. Malcolm Gladwell, in his book by the same name, makes the point of "opt-in" vs "opt-out" when it comes to states enrolling organ donors. States that require DMV applicants to opt out of organ donation produce more organ donors than states where applicants must opt in, because people are lazy.

Lazy or not, under current law, nearly anyone who is 18 or older can buy an assault rifle and a bump stock that lets it fire at a nearly fully automatic rate. To be denied the right to purchase, there must be a glaring red flag in the rubber-stamp background check that kills the purchase. Many mass shooters are "first-time" offenders; therefore the response, by definition, is always too little, too late.

Conferring on someone the right to perform mass-execution MUST have a higher barrier to entry than the current one. The burden should be on the applicant to prove that:
1) they are of sound mind
2) the people in their household and circle of trust are of sound mind
3) their killing horsepower need is reasonable

There is a difference between weapons for self-protection and those for mass-execution. When the 2nd amendment was developed, weapons did not confer on any single individual the ability to perform mass-execution. Give me a second while I reload my musket.

There is very good math, and I am a mathematician, that shows that proportionality in war is a good idea. Mainly it avoids mass-extinctions. See Robert McNamara in "The Fog of War" for an excellent explanation of this. So let's do some simple math:

If everyone had a hydrogen bomb in their garage that they could detonate when they became angry, depressed, despondent or mad at the neighborhood association, one person could destroy a city, a state, or even a nation.

As a democracy, the decision has been made that we do not allow individuals to possess or carry nuclear weapons because there is currently no scenario that would justify this. So now we have an immediate obligation to the constitution:

Imagine if every AR-15 owner had a hydrogen bomb, locked and loaded in their house, maybe under the bed or something. That would make for 3 million H-bombs. If that were the case, how often would we read about H-bomb explosions? The suicide rate is currently listed as 13 per 100,000 people. That works out to 390 incidents per year. Not so good for Earth Day festivities. The murder rate is 6 per 100,000 people. Maybe it is comforting that people kill themselves more often than others; it shows how good people really are. Anyway, that works out to 180 H-bomb explosions per year, for a total of 570. Now, we are assuming that AR-15/H-bomb owners commit suicide and murder at the same rate as the general population. Prepper be ready.
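For anyone who wants to check the arithmetic, here is a small Python sketch using only the figures quoted above. The constant and function names are mine, and the calculation makes the same assumption as the text: owners behave at the same rates as the general population.

```python
# Figures quoted in the text above.
OWNERS = 3_000_000       # assumed number of AR-15 (hypothetically, H-bomb) owners
SUICIDE_PER_100K = 13    # suicides per 100,000 people per year
MURDER_PER_100K = 6      # murders per 100,000 people per year


def expected_incidents(owners, rate_per_100k):
    """Expected yearly incidents if owners match the general population rate."""
    return owners * rate_per_100k / 100_000


suicides = expected_incidents(OWNERS, SUICIDE_PER_100K)
murders = expected_incidents(OWNERS, MURDER_PER_100K)
total = suicides + murders

print(suicides, murders, total)  # 390.0 180.0 570.0
```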

So we don't allow this and for good reason. Through the miracle of calculus, consider the following limiting arguments. How much killing horsepower should one individual be allowed to possess? I've already shown that a hydrogen bomb is too much. One can similarly reason that an atomic bomb is too much. We can continue to reason to more reasonable scales. Current law does not allow individuals to carry grenades, presumably because that is too much killing horsepower to confer on a single individual. Yet we allow uncredentialed people to purchase assault rifles willy-nilly. This is contradictory.

We can reason from below, as well as from above. We can reason from too little as well as from too much. How much killing horsepower is too little for self-defense? If you are attacked by a single person, you need to have what they have, plus a little bit extra so that you win. If a single attacker can have a hydrogen bomb, then you need two hydrogen bombs, just to make sure. So now we have a paradox. Whatever we allow someone else to have is all that we are allowed to have! To fix that, we must allow ourselves to have anything that they could possibly have plus a safety margin, and voilà, we are in the current impasse enabling mass shootings at schools, movies, concerts and work.

BUT a group of reasonable people could get together and say, here is how much killing horsepower we are going to confer on any single individual (this has tremendous ramifications for military leaders, but I digress). The killing horsepower that you are certified for depends on: 1) the soundness of your mind and 2) the threats you reasonably expect to confront in your current daily life. This creates a table of permitted killing horsepower. A commander-in-chief I know of has the killing horsepower of the entire hydrogen-bomb arsenal of the United States. That is probably too much power to confer on one person who, in a moment of questionable judgement, could make a mistake that would destroy the world. Again I digress.

In short, a table of permitted killing horsepower is created and everyone gets a ranking. This table of permitted killing horsepower is created by proper research, debate and due process, while we mourn those lost in the meantime. In the future, Artificial Intelligence tools will be used to screen applicants fairly, by quickly applying the wisdom of history and the hive. The model is that of a DMV for killing horsepower. Everybody hates the DMV; even so, you can't drive a big truck without a special license, and the lawyers come when they run you off the road. We know how to make DMVs. Apple could make them more user-friendly though. Part One Done.

The second part is an inventory, a complete common-sense gun census. An inventory must be made of the location and type of every gun and every round of ammunition. The manufacturers of weapons are required to keep records on how many are made. Many of these can then be tracked down by those records in combination with registration databases and store records. The location of ALL guns and ammunition repositories must be part of the certification process. If someone tries to cache weapons John Wick style, their killing horsepower privileges are reduced or removed. If you use a gun, that use is going to be audited by the justice system, and the gun census is part of that process. A gun census doesn't take weapons away from anyone except those who shouldn't have them in the first place.

With a complete inventory, it will be possible to screen for those who are currently in possession of killing horsepower outside the realm of their daily need and their soundness of mind.

Any other approach to this problem will find us lamenting the murder of our children in schools and the murder of our friends at movie theaters, concerts and work. Schools, movies, concerts and work make life worthwhile. Possession of killing horsepower outside our need or ability to wield it makes life more tragic than it already is.

In conclusion, we must implement these two fixes now. If things don't improve, we can implement a sunset clause, a return to crazy town, and see how that works. Judging from current events, crazy town isn't working at all.