Read Latex

Friday, January 04, 2019

Computing and the Future HW 1 - Too Many Languages




An interesting course at the University of Arkansas, led by Dr. Jared Daniel Berleant, is Information, Computing and the Future. Homework is submitted via blog, and lectures dating back to 2012 are archived, which helps in getting oriented to the material.

There is an assertion in these archived notes that programmer productivity is increasing for various reasons. The exact phrase that captured my interest is "Increased Programmer Productivity", or IPP for short. I want to assert the existence of some mitigating factors, to make the case that programmer productivity is lower than it might be due to the paradox of "too many choices", which manifests as:


  • too many languages
  • too many Integrated Development Environments (IDEs)
  • too many target architectures

The Tiobe language index lists 18 languages whose penetration exceeds 1%. Assuming there is an Integrated Development Environment for each language, we have 18 IDEs. This is a conservative estimate. Enumerating target architectures, we can again conservatively estimate 18 from this StackOverflow list. In the worst case this leads the programmer to a minimum of 18³ "choices", or 5000+ things to be "good at". Obviously this is all somewhat contrived, since programmers specialize and there isn't a separate IDE for each language, but how do programmers reassure themselves they have specialized in the best language+IDE skillset? At 18², there are over 300 combinations, and learning a language and IDE takes some time, say a year, to master. If you don't have 300 years lying around, check out some favorable candidates below.
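Here is the back-of-the-envelope arithmetic above as a small Python sketch. The counts of 18 languages, IDEs, and architectures are the rough estimates from the text, not hard data:

```python
# Rough "too many choices" arithmetic using the estimates above.
languages = ides = architectures = 18

worst_case = languages * ides * architectures   # 18**3 = 5832 "things to be good at"
pairings = languages * ides                     # 18**2 = 324 language+IDE combinations

years_to_master_each = 1
print(worst_case)                                        # 5832
print(pairings)                                          # 324
print(pairings * years_to_master_each, "years to master every language+IDE pair")
```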

I am interested in the overhead that occurs when we create codes that are illustrative, informative, provable and self-explanatory. The Explainability Problem in AI is an example of this. Addressing it has become law in the EU. Brexit, anyone? A hot topic in machine learning research is hyperparameter optimization. Briefly: how does one choose the activation function (the neuron response), the learning rate, the partition between training set and test set, and so on, to create the fastest-learning and best-performing AI? Speaking of this:
To see hyperparameters and neural networks in action, visit the TensorFlow neural network playground. Academic excursion question: Bayesian learning is known to be effective, but combinatorially expensive. Can it be shown that, in the limit of sufficiently exploring the parameter space, hyperparameter optimization is effectively just as intractable, because of the number of configurations that must be explored?
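To make the combinatorial point concrete, here is a minimal sketch of how quickly a grid over a handful of hyperparameters explodes. The hyperparameter names and candidate values are illustrative, not taken from any particular study:

```python
from itertools import product

# Illustrative hyperparameter grid -- the names and values are made up for this sketch.
grid = {
    "activation":     ["relu", "tanh", "sigmoid"],
    "learning_rate":  [0.001, 0.01, 0.1],
    "train_fraction": [0.6, 0.7, 0.8, 0.9],
    "hidden_layers":  [1, 2, 3, 4],
    "batch_size":     [16, 32, 64, 128],
}

configs = list(product(*grid.values()))
print(len(configs))  # 3 * 3 * 4 * 4 * 4 = 576 configurations before tuning anything else
```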





Another thing I am interested in, and the main focus of this entry, is how we bring people into our train of thought, into our line of reasoning, as seamlessly as possible. We document our thoughts, communicate them, run simulations, and perform mathematical analyses to make statements about the future and the limitations we have in predicting it¹. Some tools (and their requisite URLs) that assist in this process are:


  • Blogger for creating running Blogs in the Cloud
  • Scrimba for Teaching/Demonstrating Algorithms in Vue.js
  • JSFiddle for Teaching/Demonstrating Algorithms in JavaScript
  • Jupyter Notebooks for running Machine Learning in Python
  • MyBinder for hosting deployed Jupyter Notebooks

For brevity, the links above can simply be run; the discussion below enumerates some of the issues they raise or address.

Blogger

Blogger is an adequate blogging platform, enabling the inclusion of links, images, and videos such as the opening one above. The link provided above compares blogger.com to several competing platforms. It's not particularly futuristic per se, but it is better than paper or email. I would include an illustration, but this very page serves that purpose very well. It has been somewhat "future-proof" for the past few years anyway. It does the job, so no further commentary is necessary.

Scrimba

This system overcomes weaknesses of passive video tutorials. The architects of Scrimba wanted a system that would engage students, let them make the rite of passage from student to contributor rapidly, and reduce the overhead of learning by enabling instant interaction with the code of the example at hand. Scrimba's strong suit is currently its tutorial on Vue.js, which enables dynamic programming of graphical user interfaces via JavaScript in HTML pages. The Scrimba Vue "Hello World" example narrated by Per Harald Borgen is the perfect example of making the sale before the customer even realizes they've been sold. Per's intro leads to a more detailed exploration of Vue that actually does useful work, facilitated by Zaydek Michels-Gualtieri. Between these two tutorials one can learn the basics of Vue.js in a little over an hour. This makes the case FOR increased programmer productivity (IPP). However, the advent of yet another language and IDE adds yet another choice to the implementation mix, a mitigating factor AGAINST IPP.




Note 1: If programming were a crime, we could speak in terms of aggravating and mitigating (extenuating) circumstances that would enhance or reduce the programmer's guilt and corresponding sentence.

Note 2: According to my offspring, a powerful development trifecta is created when Vue.js development is combined with Google Firebase user authentication and Netlify, a platform for deploying applications.


JSFiddle

Byte-sized proofs of concept for functions, hacks, tips and tricks. JSFiddle simultaneously shows the HTML, CSS, and JavaScript in three separate INTERACTIVE windows, along with a main window where the HTML markup, CSS stylesheets and JavaScript code are executed in the browser to show the outcome. JSFiddle lives on the web, in the public cloud, so that anyone can contribute. Scrimba improves upon this concept by enabling users to grab control of the session and the code at any time during the presentation, pausing the narrator, saving past states, and recording their own audio so they can eventually narrate their own content.




Jupyter Notebooks

"Writing code that tells a story." I include this because Python 3 programming in Jupyter has become the lingua franca of information exchange in the current machine learning (ML) revolution. The Tiobe index, cited above, has declared Python to be the language of the year for 2018! One can obtain both Jupyter, Python 3, and a host of useful libraries like numpy, scipy, and pandas from the Anaconda consortium, a useful one-stop shop. It is worth noting that these codes are evolving very rapidly, I have to check my pandas version as we speak. An important feature of Jupyter Notebooks is platform independence, they run on Windows, MacOS Unix, Ubuntu Linux, etc. Further this platform is not "owned" by a company, like the disastrous ownership of Java by Oracle, or that of C# by Microsoft.
The video link claims that Java is supported, but I find such support tenuous at best. Kotlin is supported at an alpha level, and since the terser Kotlin compiles to the Java Virtual Machine, indirect support is implied.
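Coming back to how fast those libraries move: checking versions from a notebook cell is a one-liner per library. A minimal sketch; the version numbers you see will of course differ:

```python
# Quick version check inside a Jupyter cell -- handy since these libraries move fast.
import sys
import numpy, pandas, scipy

print(sys.version)           # Python interpreter version
print(numpy.__version__)
print(pandas.__version__)
print(scipy.__version__)
```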

It is worth noting that Jupyter can run 100+ different kernels, or language environments. So the general idea of a working lab notebook that runs on a local server has really taken off and thus is the wave of the future. I like the relative permissiveness of the local sandbox, compared to Java development, and the fact that results can be reproduced by any other investigator with a Jupyter Notebook setup. I also like the "bite-sized" morsel of a Jupyter notebook that can focus on a single problem, in the context of a greater objective.







1 Who doesn't love a good footnote? I hope to amplify notions of the thermodynamics of trust, showstoppers and such in future entries to this chain.

Sunday, April 22, 2018

More AI, More ML: An Open Letter to Ancestry.com and 23andMe.com

That's it. You can stop reading now. Just do the title. How hard can it be?

Ancestry.com principally, and 23andMe.com to a lesser extent, let you use their genealogical services to assemble a family tree. I will focus on Ancestry here, but similar reasoning applies to 23andMe.com. There are two components to the family-tree building process: the PAPER of existing records and the BIOLOGY of DNA samples, which both services analyze. However, there is a glaring problem when it comes to certifying the authenticity of family trees derived from historical documents, that is, PAPER. Do you trust the source? Can you read the document? Are the spelling changes plausible, and if so how much? By using both DNA and PAPER one can cross-check one against the other to confirm authentic lineages and refute specious ones. But there must be quality control in both the PAPER and the DNA. Laboratory techniques for DNA handling use statistical quality control methods that are reliable; however, there is no equivalent quality control methodology for PAPER, which in large part has been converted to MICROFILM and digitized with varying levels of quality control in the image processing. There are chain-of-custody issues when one submits a DNA sample to either service, and one should really submit multiple samples to be sure that the correct sample has been tested and labeled. There are also handling issues as samples make their way through the mail, postal and delivery systems. More or less, the latter issues are being addressed.

Ancestry.com currently requires you to chase hints in time and space to determine if you are related to a given candidate ancestor listed in a public record or another family tree. For large trees this can be extremely labor intensive, without guarantee that one has constructed a forensically certifiable result.


One error source is this: Ancestry lets you use others' family trees, which are themselves mashups of information of dubious origin, and there is no rhyme or reason to confirming whether information in these other trees is accurate. In other words, there is no quality control. No assurance that one is dealing in fact.

The addition of DNA helps one connect with living relatives and add ground truth to previously assembled trees. There are forensic methodologies that increase certainty, such as this: independent sources of information confirming the same fact. The more redundant the independent records, the higher the certainty that the conclusions, the facts, are authentic.

The problem is that, after one's family tree gets to an interesting level of complexity, the number of hints grows exponentially and many of the 'hints' lead to completely specious assemblies of data.

The fix is to associate with each tree, and with each fact in each tree, a certainty that the fact is indeed true. For a given ancestral line, these certainties can be multiplied together to provide a composite value that indicates the reliability of the information. As a detail, certainty is a number between 0 and 1 inclusive. A 1 means certainty is complete (which never exists in the real world of statistics). A 0 means there is no certainty whatsoever. A certainty of 0.9 means the fact has a 90% chance of being true. If we chain two facts together, each with certainty 0.9, we have a certainty of 0.81, an 81% chance that both facts are true.
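A minimal sketch of that bookkeeping in Python. The certainty values are the illustrative 0.9's from above plus an invented longer chain, not real genealogical data, and the multiplication assumes the facts are independent:

```python
# Composite certainty of an ancestral line is the product of per-fact certainties,
# assuming the facts are independent of one another.
from math import prod  # Python 3.8+

fact_certainties = [0.9, 0.9]          # two chained facts, each 90% certain
print(prod(fact_certainties))          # 0.81 -> 81% chance that both are true

longer_line = [0.95, 0.9, 0.8, 0.99]   # a longer, still hypothetical, chain
print(prod(longer_line))               # certainty erodes quickly as the chain grows
```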

There are a host of microfilmed documents from all over the earth that have been read, digitized and collated by human beings, and many of these have been collected by Ancestry.com in what constitutes a controlling monopoly over historical ancestral information. The source of this control, this power, has its roots in Mormonism. This can be a good thing in that there is a single long-term historical and motivating organization or presence. It could be a bad thing if religious exclusion occurs.

The point of my open letter is this:

Recent advances in machine learning would enable PAPER documents to be parsed by machines and would associate with each fact gleaned from them a level of certainty. Previous entries in this blog discuss these advances in detail.

The Mormon Church and Ancestry.com have close affiliations. In Salt Lake City, Utah, both have excelled in using advanced computation to solve important problems.

The problem is that there is a financial conflict of interest at play. Many families have invested generations of time and thousands of hours of work in building family trees using manual and computational methods. They may not take kindly to having their work, especially closely-held beliefs or assumptions questioned when those beliefs provide them with self-esteem or status in the community.

For people like myself who have spent 30 years justifying that they are related to a Hindenburg or a Henry II, this will be good news and bad news. It will be good news in that it will allow a more comprehensive family tree to be assembled, more RAPIDLY, with less human error. It will be GOOD NEWS that it will allow a precise certainty to be associated with each fact in the tree. It will be bad news for those who have a need to be related to someone famous or historic, are not, and have significant social capital invested in those claims.

I have a large tree of both famous and historic ancestors, including kings and martyrs. But I would gladly trade it off for a complete and accurate picture of who I am actually related to.

Mainly, I don't have time to chase the 31,000 hints that have popped up in my Ancestry.com Family Tree, especially when I know that machines can do it better. To that end, it is time to make more exhaustive and complete use of handwriting and document analysis using the burgeoning progress taking place in Artificial Intelligence and Machine Learning. The opportunity for true and factual historical insight could be spectacular.


Sane Public Policy With a Gun Census







I observe. I think. I've thought. The pen is mightier than the AR-15. The AR-15 killing machine will rust and jam, especially if you shoot NATO ammunition. The pen endures forever. I polish this article every time there is a mass-shooting. It is getting way too polished.

We are at a social Tipping Point. Malcolm Gladwell, in his book by the same name, makes the point of "opt-in" vs. "opt-out" when it comes to states enrolling organ donors. States that require DMV applicants to opt out of organ donation produce more organ donors than states where applicants must opt in, because people are lazy.

Lazy or not, under current law nearly anyone who is 18 or older can buy an assault rifle and a bump stock that approximates fully automatic fire. For a purchase to be denied, there must be a glaring red flag in the rubber-stamp background check. Many mass-shooters are "first-time" offenders, so the response, by definition, is always too little, too late.

Conferring on someone the right to perform mass-execution MUST have a higher barrier to entry than the current one. The burden should be on the applicant to prove that they:
1) are of sound mind
2) the people in their household and circle of trust are of sound mind
3) their killing horsepower need is reasonable

There is a difference between weapons for self-protection and those for mass-execution. When the 2nd amendment was developed, weapons did not confer on any single individual the ability to perform mass-execution. Give me a second while I reload my musket.

There is very good math, and I am a mathematician, showing that proportionality in war is a good idea. Mainly it avoids mass-extinctions. See Robert McNamara in "The Fog of War" for an excellent explanation of this. So let's do some simple math:

If everyone had in their garage, a hydrogen bomb, that they could detonate when they became angry, depressed, despondent or mad at the neighborhood association, one person could destroy a city, a state, or even a nation.

As a democracy, the decision has been made that we do not allow individuals to possess or carry nuclear weapons because there is currently no scenario that would justify this. So now we have an immediate obligation to the constitution:

Imagine if every AR-15 owner had a hydrogen bomb, locked and loaded in their house, maybe under the bed or something. That would make for 3 million H-bombs. If that were the case, how often would we read about H-bomb explosions? The suicide rate is currently listed as 13 per 100,000 people; that works out to 390 incidents per year. Not so good for Earth Day festivities. The murder rate is 6 per 100,000 people. Maybe it is comforting that people kill themselves more often than others; it shows how good people really are. Anyway, that works out to 180 H-bomb explosions per year, for a total of 570. Here we are assuming that AR-15/H-bomb owners commit suicide and murder at the same rate as the general population. Prepper be ready.
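The arithmetic behind those numbers, as a small sketch using the rates quoted above and the stated assumption that owners behave like the general population:

```python
# Expected yearly "H-bomb incidents" under the assumptions stated above.
owners = 3_000_000
suicide_rate = 13 / 100_000   # per person per year
murder_rate  = 6 / 100_000

suicides = owners * suicide_rate   # 390
murders  = owners * murder_rate    # 180
print(suicides, murders, suicides + murders)   # 390.0 180.0 570.0
```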

So we don't allow this and for good reason. Through the miracle of calculus, consider the following limiting arguments. How much killing horsepower should one individual be allowed to possess? I've already shown that a hydrogen bomb is too much. One can similarly reason that an atomic bomb is too much. We can continue to reason to more reasonable scales. Current law does not allow individuals to carry grenades, presumably because that is too much killing horsepower to confer on a single individual. Yet we allow uncredentialed people to purchase assault rifles willy-nilly. This is contradictory.

We can reason from below, as well as from above. We can reason from too little as well as from too much. How much killing horsepower is too little for self-defense? If one is attacked by a single person, you need to have what they have, plus a little bit extra so that you win. If a single attacker can have a hydrogen bomb, then you need two hydrogen bombs, just to make sure. So now we have a paradox: whatever we allow someone else to have determines what we ourselves must be allowed to have! To fix that we must allow ourselves to have anything they could possibly have plus a safety margin, and voila - we are in the current impasse enabling mass-shootings at schools, movies, concerts and work.

BUT a group of reasonable people could get together and say: here is how much killing horsepower we are going to confer on any single individual (this has tremendous ramifications for military leaders, but I digress). The killing horsepower that you are certified for depends on: 1) the soundness of your mind and 2) the threats you reasonably expect to confront in your current daily life. This creates a table of permitted killing horsepower. A commander-in-chief I know of has the killing horsepower of the entire hydrogen-bomb arsenal of the United States. That is probably too much power to confer on one person, who in a moment of questionable judgement could make a mistake that would destroy the world. Again I digress.

In short, a table of permitted killing horsepower is created and everyone gets a ranking. This table of permitted killing horsepower is created by proper research, debate and due process, while we mourn those lost in the meantime. In the future Artificial Intelligence tools will be used to screen applicants fairly, by quickly applying the wisdom of history and the hive. The model is that of a DMV for killing horsepower. Everybody hates the DMV, even so you can't drive a big truck without a special license and the lawyers come when they run you off the road. We know how to make DMV's. Apple could make them more user-friendly though. Part One Done.

The second part is an inventory, a complete common-sense gun census. An inventory must be made of the location and type of every gun and every round of ammunition. The manufacturers of weapons are required to keep records on how many are made. Many of these can then be tracked down by those records in combination with registration databases and store records.  The location of ALL guns and ammunition repositories must be part of the certification process. If someone tries to cache weapons John Wick style, his killing horsepower privileges are reduced or removed. If you use a gun, that use is going to be audited by the justice system, and the gun census is part of that process. A gun census doesn't take anyone's weapons away that shouldn't have them in the first place.

With a complete inventory, it will be possible to screen for those who are currently in possession of killing horsepower outside the realm of their daily need and their soundness of mind.

Any other approach to this problem will find us lamenting the murder of our children in schools, the murder of our friends at movie theaters, concerts and work. Schools, movies, concerts and work make life worthwhile. Possession of killing horsepower beyond our need or ability to wield it makes life more tragic than it already is.

In conclusion, we must implement these two fixes now. If things don't improve we can implement a sunset clause, a return to crazy town, and see how that works. Judging from current events, crazy town isn't working at all.



Friday, April 13, 2018

A Blazing Fast Introduction to Machine Learning

Introduction

In what follows I'm going to talk about Artificial Intelligence (AI), Machine Learning (ML) and Artificial Neural Networks (ANN's). More to the point, "What are they good for?", in practical terms. If you are wondering what you can use them for the answer is, "most everything". From light to radio, from radio to sound, from graphics to games, from design to medicine, the list goes on. Here are some working concepts.

Artificial neurons are an abstraction of biological neurons. The first thing we notice is that biological neurons use "many inputs to many outputs" connectivity. So in a mathematical sense they are not classic functions, because functions have only one output.
Anatomy of a Neuron
Biological Neuron

Artificial neurons have a "many inputs to one output" connectivity. So they are functions. Functions can have many inputs, provided they only have one output.


Artificial Neuron aka 'Perceptron'

This apparent shortcoming is remedied by connecting the output of a single artificial neuron to the inputs of as many other artificial neurons as we want. This happens in subsequent or "hidden" layers, restoring their power by forking or duplicating their outputs. It seems forgivable to think of biological neurons working this way also, but future improvements may reexamine this assumption. The Perceptron link records its invention in 1958, when it was envisioned as a machine rather than a software entity. This pivoting between software and hardware continues as special-purpose processors are developed to speed machine learning computations.
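A minimal sketch of a single artificial neuron in Python: many inputs, one output, a weighted sum passed through an activation (here a sigmoid). The weights, bias and inputs are made-up numbers purely for illustration:

```python
import math

def neuron(inputs, weights, bias):
    """Many inputs -> one output: weighted sum plus bias, squashed by a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))   # output stays between 0 and 1

# Illustrative values only.
print(neuron([0.5, -1.2, 3.0], [0.4, 0.1, -0.7], bias=0.2))
```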


Neural Network with Four Layers


Another thing we notice about the artificial neuron is that the magnitude of the output of any given neuron is clamped to some maximum value. So if you are in space staring at the sun, your brain doesn't fry because of the neural output; it fries because you are standing next to the sun. How peaceful.

I would be remiss here if I didn't mention that until recently, programming languages implemented the concept of functions in a similar way, many inputs were allowed but only one output was returned per function call. 



Python, which has become the de facto language of AI, allows one to return many outputs from a procedure call, thus implementing many-to-many relations. This is extremely convenient, amplifying the expressive power of the language considerably.
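For instance, a trivial sketch: a Python function can hand back several outputs at once as a tuple, which callers unpack directly:

```python
def min_max_mean(values):
    # One call, three outputs -- a "many outputs" convenience older languages lacked.
    return min(values), max(values), sum(values) / len(values)

lo, hi, mean = min_max_mean([3, 1, 4, 1, 5, 9])
print(lo, hi, mean)   # 1 9 3.8333...
```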

There are many details one must attend to in programming neural nets. These include the number of layers, the interconnection topology, the learning rate, and the activation function, such as the S-shaped sigmoid shown above at the tail of the artificial neuron. Activation functions come in many flavors. There are also cost or loss functions that let us evaluate how well a neuron is performing given the weights of each of its inputs. These cost functions come in linear, quadratic and logarithmic forms, the latter of which has the mystical name "Cross Entropy". Remember, if you want to know more about something you can always google it.
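A small sketch of two of those ingredients, the sigmoid activation and the cross-entropy loss, with toy numbers just to show the shape of the formulas:

```python
import math

def sigmoid(z):
    """S-shaped activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Logarithmic ("cross entropy") loss for a single binary prediction."""
    y_pred = min(max(y_pred, eps), 1 - eps)   # avoid log(0)
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

print(sigmoid(0.0))             # 0.5
print(cross_entropy(1, 0.9))    # small loss: confident and correct
print(cross_entropy(1, 0.1))    # large loss: confident and wrong
```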

Great strides have been made in neural networks by adjusting the input weights using "Gradient Descent" algorithms, which attempt to find combinations of input weights that maximize the effectiveness of each neuron. A neuron has many inputs to consider - many things shouting at it simultaneously - and its job is to figure out who to listen to, who to ignore, and by how much. These corrections are "Back Propagated" using the Chain Rule from our dear friend Calculus. This is repeated until the ensemble of neurons as a whole is functioning at its best as a group. The act of getting this to happen is called "Training the Neural Network". You can think of it as taking the neural network to school. So the bad news is, robots in the future will have to go to school. The good news is that once a single robot is trained, a whole fleet can be trained for the cost of a download. This is wonderful and scary, but I digress.
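Here is a minimal sketch of one gradient-descent update for a single sigmoid neuron with one weight and one toy example, just to show the "nudge the weight downhill" idea. Real networks repeat this across many neurons and layers via back-propagation; the numbers below are arbitrary:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy setup: one input, one weight, one target.
x, target = 1.5, 1.0
w, learning_rate = 0.1, 0.5

for _ in range(20):
    y = sigmoid(w * x)                       # forward pass
    loss = 0.5 * (y - target) ** 2           # quadratic cost
    grad = (y - target) * y * (1 - y) * x    # chain rule: dLoss/dw
    w -= learning_rate * grad                # descend the gradient
print(w, loss)                               # weight grows, loss shrinks
```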

TensorFlow Playground

Before we go any further, you must visit TensorFlow Playground. It is a magical place and you will learn more in ten minutes spent there than doing almost anything else. If you feel uneasy, do what I do: just start pushing buttons willy-nilly until things start making sense. You will be surprised how fast they do, because your neurons are learning about their neurons and it's peachy keen.


TensorFlow Playground

Types of Neural Networks

CNN - Convolutional Neural Networks

Convolutional Neural Networks are stacks of neurons that can classify spatial features in images. They are useful for recognition problems, such as handwriting recognition and translation. 


Typical CNN
CNN's can also be used to recognize objects in an image such as these that occur in the CIFAR database. In this case the input to the CNN is an image and the output is a word such as "truck", "cat" or "airplane".


CIFAR Database
MNIST is a famous database of carefully curated handwriting samples used to train and subsequently recognize handwriting.

MNIST Training and Recognition
CNN's can also be used to transfer the style of one image to another as in Google's Deep Dream generator.  


Deep Dream Generator

RNN - Recurrent Neural Networks

Just as Convolutional Neural Networks can be used to process and recognize images in novel ways, Recurrent Neural Networks can be used to process signals that vary over time. This can be used to predict prices, crop production, or even make music. Recurrent Neural Networks use feedback and connect their outputs into their inputs in that deeply cosmic Jimi Hendrix sort of way. They can be unwound in time and when this is done they take on the appearance of a digital filter.
Unwinding an RNN in Time
RNN's can be used to predict the next most likely word in a sentence. They can also continue patterns seen in periodic functions.


Predicting Periodic Functions with an RNN


AE - AutoEncoders

Autoencoders are a unique topology in the neural network world because they have as many output nodes as input nodes. They are useful for unsupervised learning. They are designed to reproduce their input at the output layer. They can be used for principal component analysis (PCA) and dimensionality reduction, a form of data compression.
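A minimal sketch of the bottleneck idea using Keras. It assumes TensorFlow 2.x is installed; the layer sizes and the random training data are illustrative only:

```python
# Minimal autoencoder sketch: input -> narrow bottleneck -> reconstruction.
import numpy as np
from tensorflow.keras import layers, models

autoencoder = models.Sequential([
    layers.Dense(8, activation="relu", input_shape=(64,)),  # bottleneck: compress 64 -> 8
    layers.Dense(64, activation="sigmoid"),                 # try to reproduce the 64 inputs
])
autoencoder.compile(optimizer="adam", loss="mse")

data = np.random.rand(1000, 64).astype("float32")
autoencoder.fit(data, data, epochs=5, batch_size=32, verbose=0)  # note: target == input
```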

RL - Reinforcement Learning

With Reinforcement Learning, a neural net is trained subject to rewards, both positive and negative, until the desired behavior is encoded in the net. Training can take a long time, but this technique is very useful for training robots to do adaptive tasks like walking and obstacle avoidance. This style of machine learning is one of the most intuitive and easiest to connect with.
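A minimal sketch of the core reward-driven update in tabular Q-learning, a classic reinforcement-learning algorithm. The tiny "world" here (four states, move right to reach a reward) is invented purely for illustration:

```python
import random

# Tiny made-up world: states 0..3, actions 0..1, reaching state 3 pays a reward.
n_states, n_actions = 4, 2
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration

def step(state, action):
    """Hypothetical environment: action 1 moves right, action 0 stays put."""
    next_state = min(state + action, n_states - 1)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    return next_state, reward

for episode in range(500):
    state = 0
    while state != n_states - 1:
        action = random.randrange(n_actions) if random.random() < epsilon \
                 else max(range(n_actions), key=lambda a: Q[state][a])
        next_state, reward = step(state, action)
        # Reward-driven update: nudge Q toward reward plus discounted future value.
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

print(Q)   # action 1 (move right) should end up preferred in every state
```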


Components of a Reinforcement Learning System

GANS - Generative Adversarial Networks

GANS are useful for unsupervised learning, an echelon above routine categorization tasks. They typically have two parts, a Generator and a Discriminator. The Generator creates an output, often an image, and the Discriminator decides whether the image is plausible according to its training. In the MNIST example below, the gist of the program is, "Draw something that looks like a number". In an interesting limitation, the program does not know the value of the number, only that the image looks like a number. Of course it would be a quick trip to a trained CNN to get the number recognized.
GAN Instructed to "Draw Something That Looks Like A Number"

Conclusion

This short note details several approaches to, and applications of, machine learning. I hope you found it interesting. For more information just follow the links above.









Tuesday, April 03, 2018

#GTC2018 Sipping from the Firehose



After a protracted absence from the conference scene I got the chance to drink from the firehose at #GTC2018, nVIDIA's annual GPU Technology Conference. As drinks go it was a chocolate malt.




GPU stands for "Graphics Processing Unit" and rhymes with CPU, or "Central Processing Unit", the heart of our personal computers. The GPU, a square chunk of silicon shown below, measures 1.5 inches on a side, nestled into its graphics-card home. You can see my fingerprints on the discard below, and honestly it feels like touching a piece of Mars. It is the heart and soul of nVIDIA's latest offering, containing a stunning 9 billion transistors:


nVIDIA has taken 16 of these and glommed them together in its latest offering, the DGX-2 graphics and artificial intelligence supercomputer. It's basically somebody's brain floating in space. At 350+ pounds it's a little heavy to float, but I promised Jensen Huang that if he hired me I would put it on a diet and cut its weight in half. After all, this is the kind of thing that humans like to fling into space.

The nVIDIA DGX-2 Supercomputer - source: nVidia


The conference, in toto, was quite a spectacle, reminiscent of my tours of SIGGRAPH conferences during the heyday of computer graphics innovation. GPUs originally started out as add-on cards to enable 3D computer graphics in personal computers and workstations. Jim Clark deserves mention for getting the ball rolling in this field with Silicon Graphics, but I digress. The SIGGRAPH thrust curve, shown below, is one model for what happens when a disruptive change of technical paradigm comes along. The thirty-year period between 1982 and 2012 shows growth followed by reversion to a baseline of progress. Consider that 15,000 people is an army, 50,000 is a city. It takes a village and all. This year about 8,500 people attended #GTC2018, compared to 6,500 last year, so we are still inflecting upward.


SIGGRAPH Attendance - source: wikipedia


Market income appears as a three-legged stool for nVIDIA. Those three legs are Cryptocurrencies, Gaming, and Artificial Intelligence (AI), in addition to the core business of enabling computer graphics on PCs and workstations. Below is a historical stock chart, and it is clear that things have been going gangbusters since 2016.

nVIDIA Stock Price Vs. Time - source: google finance
Cryptocurrencies use GPUs to search for numbers that validate blockchains. These virtual currencies are not currently backed by any good or service of tangible value, but that doesn't seem to stop nVIDIA GPUs from selling like hotcakes.
 
"Look at my pretty prime numbers."

Commoditization of AI
was, for me, the lasting excitement of #GTC2018. I made my way through a series of lecture sessions, labs and an enormous equipment expo to get a sense of where things are and to refine my own data science and robotics skills. If you are wondering, "What good is AI?", remember that every time you say "Hey Google", "Hey Siri", "Hey Alexa", or "Hey Cortana", you are using AI that is becoming so much a part of our life it is invisible. It is used in your every encounter with Amazon. I don't even go to the store anymore, do you?

Other AI application areas include Autonomous Vehicles, Healthcare, Education, Finance, Robotics, Human Resources, Marketing, Media, Music, News, Customer Service, Personal Assistants, Toys and Games, and Aviation. I expect AI to follow the SIGGRAPH curve, but over a more compressed time scale due to improvements in telecommunication. For me it's fun to surf big technical waves, because that is what I've done all my life.

My first thesis advisor, Art Hale, a visionary mathematician and engineer, emphasized not getting swept up in the river of change for its own sake, but to remain attentive to first principles that dictate productive change and fundamental directions of endeavor. “It's better to stay up on the bank if you can.” he would say. He would also say, "There are millions of people." for some reason.

Thus, as I sat in the sessions I made notes of key principles that emerged invariant or "here to stay". Here they are in somewhat chronological order biased by my own interests:

AI Accountability Equals Explainability:

The first emergent theme is that AIs need to explain themselves. At present these magic algorithms are somewhat opaque and difficult to audit when it comes to answering questions like, "How did you get to that answer?" If you were hired or fired by an AI, this is something you might want to know.

There was an entire session devoted to the explainability of AI, but it was paradoxical since the lecturer was a stakeholder in a company that uses proprietary methods to enable AI to explain itself.  AI that explains itself using secret techniques remains unexplainable by definition.

Fortunately there is work being done that pulls the curtain back, such as heatmaps for neural network weights. In the example below, I drew a blobby dot in each corner to see which digit would pop out of the neural net. It classified my sketch as a 1, but most humans would not read it as a one; they might read it as a four. To be fair, the neural net wasn't trained on things that appear on the face of a die. But it should not confidently return a 1 either. Note that the heatmap of things this neural net classifies as a 1 doesn't look anything like a 1, so that is a problem too. I hope to look at this in more detail this year.


Heatmaps for Explainability - source: LRP

To nVIDIA's eternal credit, the technical sessions are posted here, which is a plus for addressing this.

Hyperparameter Optimization - this is a hotbed of current research. It is also too long a word for a very simple idea. If you go to TensorFlow Playground you can construct your own neural net and run it with various settings like "Learning Rate", "Amount of Noise", or "Test/Train Split", and other configuration details. These are the hyperparameters, and they really should be discovered by a machine rather than a human. By this time next year they will be. In the example below you are watching a computer think. I can't get enough of it.


TensorFlow Playgrounds - Source: Smilkov & Carter

Machine Learning is an Optimal Control Theory Problem. Watching the learning curve evolve over time reminds me of aircraft stability and control. There is a best way to proceed to a given flight configuration and these are knowable sorts of things. Even machine learning (ML) algorithms need autopilots.
Autopilot Design - Source: Open Access Korea
Another idea that emerged in the driverless car world is that marking intersections is really important to let driverless cars know to stop. Since most wrecks occur at intersections this makes sense in the driver-full world also. 
Intersection Risk as Function of Position - Source: Me

So we should expect transportation automation to influence road design and vice versa. Road signs are apparently good reflectors of radar, which causes glint in the sensors due to multipath RF propagation. Road signs made of fiberglass would be better since they reduce RF glint. They would also be better if you happen to run into one. Nature has a funny way of breaking what does not bend, at least that's what Jewel says.