Read Latex

Thursday, September 01, 2022

Dinner with Andrew: The Sequel

So, we're at the gym again. A couple of laps in, I chat with Dr. Chiang, a tenured professor in the Department of Computer Science.

Me: I've found a new 1500-page book on Probability and Machine Learning. Well, it's actually two books. But it comes with code!

Dr. Chiang: I got paid today!

Me: I have to warm up around the track. I go upstairs to the 1/8-mile track. I'm thinking of Gödel again. He had read Russell and Whitehead's Principia Mathematica in its entirety and confronted Russell about whether everything in it was provable. His attention to detail was incredible, and I would mention this to Andrew, who appears later in the conversation. Gödel believed there was a way that the US could be transformed into a dictatorship. Einstein and Morgenstern had great difficulty restraining him from announcing this at his US citizenship hearing. (See page 12 here)

I returned downstairs to the quad machine and managed to get through three sets of eight, increasing the weight in twenty-pound increments as I warmed up. Then I made it to the sitting bench press which I have undertaken to perform at three angles with various seat heights and grip positions. After a set, Andrew appears, carrying a notebook which he is eager to show me.

I would be remiss to provide only my own recollection, and I am fascinated by the Rashomon phenomenon, the same event witnessed by two people being perceived differently. I asked Andrew to provide a quick sketch of his recollection; quick, because as a student he has bigger fish to fry than recounting a random gym conversation. But it is interesting, and he has graciously provided a recap, which I have worked into this thread.

Me: That is excellent. Please show me. But first, you are taking four courses. Tell me how they are going.

Andrew: Electronics course. Ohm's Law. The Linearization of V = IR in I-V space to determine R as the slope of the curve.

Me: Aha! But if you are determining slope then you must make TWO measurements to obtain a result, say if you wanted to do an automated determination of resistance. That is so interesting that one measurement is not sufficient.
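
A minimal sketch of that automated determination, with hypothetical readings: two (I, V) operating points give R as the slope of the V-versus-I line.

# Sketch: R as the slope of the I-V curve, from two measurements.
# The two operating points below are made-up values for illustration.
def resistance_from_two_points(i1, v1, i2, v2):
    """Slope of V with respect to I through two measured points."""
    if i2 == i1:
        raise ValueError("need two distinct current readings")
    return (v2 - v1) / (i2 - i1)

# Example: 10 mA at 1 V and 20 mA at 2 V imply R = 100 ohms.
print(resistance_from_two_points(0.010, 1.0, 0.020, 2.0))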

Andrew: (Presenting the sketchbook.) I have been doing some sketches to hone my ability to express myself [in diagrams that you might find in a book]. In order to practice my arm control for drawing, I drew various straight lines and connected them in arbitrary ways.

Me: They look like a Voronoi diagram. Constructions like these have unexpected utility; I have toyed with them:
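
For the curious, a minimal sketch of the comparison using scipy; the random sites below are arbitrary stand-ins for the sketched line junctions.

# Sketch: a Voronoi diagram over a dozen arbitrary points.
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d

rng = np.random.default_rng(0)
points = rng.random((12, 2))      # 12 arbitrary sites in the unit square
vor = Voronoi(points)             # each cell is one site's nearest territory
voronoi_plot_2d(vor)
plt.title("Voronoi diagram of arbitrary points")
plt.show()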

Me: This is excellent. Nothing is as portable as pencil and paper, but you are going to the next level, to quote reddit. There is also a set of very useful tools that will amplify your efforts: Maxima for symbolic mathematics, Geometry Expressions™ for symbolic geometry. The Shovel vs. the Bulldozer.

Andrew: Behold - the introduction of Euclid's Elements, the foundation of all classical Greek mathematics and the beginning of the exploration of the five postulates:

Then like a magician producing a deck of cards Andrew pulls out a black notebook bound with black wire. The first few pages are various doodles and patterns which then turn to the money shot, a page showing Euclid's Five Postulates on Geometry.

Andrew: Some of these are tasks, some are proofs. The distinction is Q.E.F vs Q.E.D:

QEF: Quod Erat Faciendum (that which was to be done) vs.
QED: Quod Erat Demonstrandum (that which was to be proven)

Covering these basics would yield unexpected fruit.



Andrew: Consider Euclid Postulate 1. A straight line segment can be drawn joining any two points.

Andrew: Given a line segment, construct an equilateral triangle. This was done by drawing two intersecting circles with the given segment as the radius. Thus, the equilateral triangle was constructed based upon the equality of the line lengths rather than the equality of the angles.
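
A quick numerical sketch of that construction, with hypothetical endpoints: the apex is where the two circles of radius |AB| meet, which is equivalent to rotating B about A by 60 degrees.

# Sketch of Elements, Proposition 1: apex of the equilateral triangle on AB.
import math

def equilateral_apex(ax, ay, bx, by):
    """Rotate (B - A) by 60 degrees about A; the image is the apex C."""
    c, s = math.cos(math.pi / 3), math.sin(math.pi / 3)
    dx, dy = bx - ax, by - ay
    return (ax + c * dx - s * dy, ay + s * dx + c * dy)

print(equilateral_apex(0.0, 0.0, 1.0, 0.0))   # -> (0.5, 0.866...)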

Me: The act of drawing the point in the first place induced the existence of a pair of coordinates to locate that point. The same for its partner. Drawing the line created a relationship between the two points called length. Length is a disembodied quantity, an attribute, a scalar whose value depends on the coordinates. If I hand you the length, I have not handed you the points from which the length was determined. Length is a Doppelgänger.

Our goose-bump moment was observing that the assumption that we can connect two points lies in the notion that they are a finite distance apart. We had previously discussed spherical distortions of the function 1/x that made it appear like a baseball, moving points at infinity to the other side of a sphere.


This seemed like a lot of baggage for postulate one, so we moved on while I did another set of reps on the sitting bench press.

Euclid Postulate 2. Any straight line segment can be extended indefinitely in a straight line.

Andrew paraphrased this as, "A ray can be extended from a given point," which I liked.



Andrew: Given a line segment, duplicate it at another point. He drew more lines, extending them as in Postulates 1 and 2, and then called directly upon the result of Proposition 1 to accomplish the task.

Me: Euclid's Elements was the earliest effort at creating a unified system of geometry, the task that Russell and Whitehead attempted symbolically with Principia Mathematica some two-plus millennia later.

We agreed that the level of dedication and attention to detail was worthy of admiration, and that it was a great sadness that Gödel starved himself to death. We later agreed that this presented the notion that there is no preferred basis in Euclidean space.

Me: This makes the assumption that straight lines remain straight, that space is flat except in regions of high gravity, and that the cosmological constant has two values depending on whether it is measured via the cosmic background radiation or Hubble's standard candles.

I adjusted the seat and did another set of sitting bench presses. Am I conflating abstract space with real space? We reviewed dimensionality and how the amount of wiggle room grows as we venture from the number line, to the real plane, to three-dimensional space, finally adding time.

Andrew: Consider Euclid Postulate 3. Given any straight line segment, a circle can be drawn having the segment as radius and one endpoint as center.


We talked about how this is just one way to specify a circle. One can also specify a circle by three points that lie on its edge. This brought up the idea of constraints as a way to present geometry, and the multiple ways of saying the same thing shape-wise. We talked about Phil Todd, the Portland genius who created Geometry Expressions™, used to create some of these figures. We talked about implicit equations of a circle, which require a search, and explicit forms that represent the top or bottom half, and how reexamining assumptions like this can lead to new work. This led to the code for the infinity sign below.
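
The original code is not reproduced here, so what follows is only a stand-in sketch: a parametric infinity sign (a lemniscate of Gerono) traced with matplotlib.

# Sketch: parametric infinity sign.
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 2 * np.pi, 400)
x = np.cos(t)              # left-right sweep
y = np.sin(2 * t) / 2      # double-frequency sine creates the crossing
plt.plot(x, y)
plt.axis("equal")
plt.title("Parametric infinity sign")
plt.show()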


Andrew: Given a line segment and a shorter line segment, cut off a part of the longer segment equal to the shorter one. Euclid cheated a bit because he drew the short segment in connection with the longer one, rather than sticking with a compass-and-straightedge construction.

Me: I am displeased with the discrepant notation of using a single character for the short line segment's length while notating the endpoints of the other line segments as full (x, y) citizens in the sketch.

Andrew (post facto): For all I know, this could be an issue of translation, as the copy of the Elements that I have is based upon Heath's translation. You stated that referring to the short line segment as a magnitude of length only left it without any documentation of its location. Thus, a scalar would be suspended in space.

Me: It was at this point that I used the term Doppelgänger or disembodied scalar to refer to a length floating in space, as compared to an anchored point drawn with specific (x, y) intention.

Euclid Postulate 4. All right angles are congruent (the same).



Me: This one sounds like a tautology, as in, it is what it is, though granted, 'is-ness' is a much larger space than the set of 90° angles.

Me: What about the relationship of Cartesian grids to hexagons?

Andrew: There is proof of the impossibility of trisecting an angle with a compass and straight-edge. The proof is here and uses a cubic polynomial.

Me: This is an example of symbolic representations providing insight into visual representations of geometric relationships.

Andrew: I am interested in the fact that Euclid dared to purport that all right angles were equal before our modern notion of the dot product allowed us to extend the concept of orthogonality into higher dimensions.

Me: Doug Gilmore at the University of Illinois once used a construct wherein two linguistic statements were orthogonal to each other.

Andrew: We agree that this is a fascinating notion to introduce orthogonal statements that neither support nor oppose each other but rotate the discourse by 90° into a higher dimension of understanding. These orthogonal statements could form the "basis" or "basis functions" for a view of the universe.

Me: [the term 'basis functions' is a trigger for me] Very interesting.

We talked about circles as basis functions for a drawing:

  • One circle of fixed radius that couldn't move.
  • Many circles of fixed radius that could move.
  • Many circles of varying radius that could move.

About this time, a young man who was violently upset was being escorted from the gym after an altercation on the basketball court. It was hard to remain focused on our conversation, but we did our best, our limbic systems nonetheless activated by the fight-or-flight situation occurring before us. We remained on task, hoping that he would be back in better spirits.

It was time to do another set of bench presses of the upright and orthogonal variety.

We resumed talking about how ideas, facts, and statements can be orthogonal to each other. Sometimes redundancy of expression cements an idea; for terseness, orthogonal statements cover the most territory in the fewest words.

Me: There is also a symbolic definition of orthogonality via the dot product, specifically the formula:

a · b = a₁b₁ + a₂b₂ + … + aₙbₙ = 0


The question arose as to whether giving someone the symbolic representation of something is the same as giving them the geometric representation. It isn't. Both are needed. Two faces of the same coin.

We talked about the appearance of angle as another disembodied scalar like length, the two comprising the fundamental aspects of dimension. We wondered if the extra dimensions of string theory, including the tiny curled-up ones, aren't conflating rotation with length. The mean streets all sit orthogonal to each other.

Andrew: Consider if in two triangles, two sides and the angle subtended by them are equal, then the triangles are equal. The proof is a joining of the triangles along a common side and folding them upon each other. The proof presented by Euclid used a translational superposition of the triangles.

Me: The process of proof could be encoded within a trio of affine transformation matrices. In this way, truth could be containerized and exploited later. It is not until the truth is instantiated, at the point of the application of knowledge, that value is obtained. No symbolic expression achieves its full value until it is populated by actual numbers that express actual truths.

Andrew: So the truth would not be observed until the transformations were applied to a specific situation, although the true nature of the transformation would still be securely encoded.

Me: It is interesting that we have geometric expressions of truth and symbolic expressions of the same truth.
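
A sketch of the containerization idea, under assumptions of my own: the superposition is encoded once as a product of homogeneous translate/rotate/reflect matrices, and its truth only appears when the product is applied to actual vertex coordinates.

# Sketch: "truth containerized" as a trio of 3x3 affine matrices.
import numpy as np

def translate(tx, ty):
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], float)

def rotate(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], float)

reflect_x = np.diag([1.0, -1.0, 1.0])   # fold across the x-axis

# Compose once; the encoded "proof" is just the matrix product.
proof = translate(2, 1) @ rotate(np.pi / 3) @ reflect_x
vertex = np.array([1.0, 0.0, 1.0])      # a hypothetical triangle vertex
print(proof @ vertex)                   # value appears only on instantiation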


Andrew Postulate 5: Parallel lines never meet; biased lines will eventually intersect. The devil of Euclid, which hyperbolic geometers love to go on about.

Euclid Postulate 5. If two lines are drawn which intersect a third in such a way that the sum of the inner angles on one side is less than two right angles, then the two lines inevitably must intersect each other on that side if extended far enough. This postulate is equivalent to what is known as the parallel postulate.



Andrew then went on to illustrate the theorem of the isosceles triangle with extended legs, which produced a 12-angle complexity pop. I mentioned that if one went through all of Euclid's propositions (of which there appear to be 465) looking for complexity 'pops' (sudden increases), one might be able to find a path of most escalating complexity and possibly the next big thing.

Andrew: If an isosceles triangle has its congruent sides extended, the angles subtended by the base and the congruent sides are equal, as are the angles between the base and the extended lines.

He recounted later that this proof clearly exploded in complexity, or at least sophistication, judging by its length. It is observably longer and more detailed than the four propositions that precede it, managing to occupy an entire page in the book. I had mentioned that, this being the first burst of complexity, it would be prudent to watch for and trace the pathway of similar bursts, and that following such a path might lead to the next node in the path of discovery.

Andrew then showed me a sketch where he had drawn a line from the origin to every point of a grid and made a clever deduction about the null space of linear algebra.



Andrew: The singularity. To further practice drawing lines, I attempted to map an imaginary grid to the center of the page. A straight line does not have any intrinsic directional characteristics, so my intention to map the center point to the grid points could equally be described as a mapping of the grid points to the center point.

Me: You have expressed the relationship between the Cartesian and polar coordinate systems, and how there exist finitely many angles with rational tangents that the grid can represent, and a continuum of angles it cannot.

Andrew: Another curious outcome of my drawing exercise, seemingly more artistic than mathematical, reminded me of the null space in linear algebra, where an entire subspace is mapped to the zero vector, thereby vanishing into nothingness. By coincidence, since I started drawing lines at the center of the page, the center point had the highest accumulation of graphite and was therefore the darkest point on the page, as if it were a dark hole pulling in all of space.

Me: The origin is a singularity. If the lines had zero width, there would be no darkening, but there would still be a singularity as a result of the density of lines at the center.

We talked about the density of the lines in such a graph and how, as one moves closer to the origin, the density increases until at the origin you have a black singularity. I initially said that the density was an artifact of the finite width of drawn lines, but recanted that in favor of a definition using a circular region of fixed radius in which we could count line crossings to obtain a line-density metric. His sketch showed the intrinsic difference between a Cartesian and a radial coordinate system, with an existence proof, obtainable by inspection, that a finite grid with lines drawn from each grid point to the origin can only represent a subset of the possible angles in a continuous space. Further, the discrete set of angles is always smaller than the set of angles that cannot be represented, implying that in the limit the number of grid angles on an infinite grid is smaller than the number of continuous angles, which informally makes the case for different sizes of infinity.
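
A small sketch of the counting side of that argument, first quadrant only and with hypothetical grid sizes: reducing each grid direction (i, j) by its gcd counts the distinct angles a finite grid can realize, a count that grows but stays countable.

# Sketch: how many distinct angles does an n-by-n grid see from the origin?
from math import gcd

def distinct_grid_angles(n):
    """Count reduced direction vectors (i, j), 0 <= i, j <= n, not both 0."""
    directions = set()
    for i in range(n + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            g = gcd(i, j)
            directions.add((i // g, j // g))
    return len(directions)

for n in (1, 2, 4, 8, 16):
    print(n, distinct_grid_angles(n))   # grows with n, but always finite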



Then I mentioned Euler's formula relating a polyhedron's points (vertices), edges, and faces: V - E + F = 2.
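
A quick check of the formula against a few familiar solids:

# V - E + F = 2 for convex polyhedra.
solids = {"tetrahedron": (4, 6, 4), "cube": (8, 12, 6), "octahedron": (6, 12, 8)}
for name, (v, e, f) in solids.items():
    print(name, v - e + f)   # -> 2 every time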



Andrew: I haven't yet gone that far into graph theory...

Me: I think this precedes graph theory, but there is a connection. When I think of graph theory, I think of the Königsberg Bridge Problem:
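
A sketch of the parity argument, with the four land masses and seven bridges entered as a multigraph: an Euler walk requires zero or two odd-degree vertices.

# The Königsberg bridges: land masses A-D joined by seven bridges.
from collections import Counter

bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]
degree = Counter()
for u, v in bridges:
    degree[u] += 1
    degree[v] += 1

odd = [node for node, d in degree.items() if d % 2 == 1]
print(dict(degree))   # degrees 5, 3, 3, 3
print(len(odd))       # 4 odd vertices -> no walk crosses each bridge once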



Euler sure got around.

Collaborations that are highly disciplined can be rewarding. Michael Lewis writes about the collaboration between Amos Tversky and Daniel Kahneman. Ruthlessly revisiting items considered basic can yield new truth and bring old truth to life via exercise. Like a kata, a well-rehearsed prearranged form that the practitioner already knows, its repetition builds mastery.


There was a war on. This did not stop Freeman Dyson from applying himself to the problems of the day and eventually making foundational contributions to quantum electrodynamics. He had talent, but he also got back to basics and mastered them. Read this book when you have time. It is a set of letters through his life that provides a story, the hooks upon which to hang the practices of a physicist/mathematician.




Me: Watch "When the Curtain Falls", by Greta Van Fleet.

Saturday, August 27, 2022

Dinner with Andrew

Okay, so it wasn't dinner, and it wasn't with André, but it was a great conversation. The following is a best-efforts recollection, possibly, make that certainly, out of order.

[We're exercising at the University gym, returning not only from the summer hiatus, but from the pandemic, from isolation, etc.]

Me: So Andrew, what is your major again? Hey Dr. Chiang, meet Andrew, he is the next Kurt Gödel.

Dr. Chiang: Wow! Are you in Computer Science?

Andrew: Physics and Math

[That reminded me of Cédric Villani, who won a Fields Medal; he worked with the Boltzmann equation of a gas, which describes the distribution of molecular velocities and seemed more like chemistry than math. Dr. Villani, besides being very distinguished, is very kind and approachable.]

Me: What are you taking?

Andrew: Discrete Math, Classical Electrodynamics...

Me: Discrete Math?

Andrew: Yes, and I'm reading a book by Knuth on 'Concrete Mathematics', which is a [portmanteau] of the words CONtinuous and disCRETE.

Me: Clever. Time with Knuth is time well spent. What about Quantum Electrodynamics?

[10 sitting bench presses and some banter.]

Andrew: (Mutters something about Dirac delta functions.)

Me: Infinitely Narrow, Infinitely Tall

Andrew: And unit area! (An area of 1)

Me: Those are interesting basis functions. 

In my head I'm seeing a pair of them and wondering how one could create a parametric expression to interpolate between them:
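
One hedged guess at such an expression: model each delta as a narrow unit-area Gaussian and blend the pair with a parameter t in [0, 1], so the area stays 1 the whole way.

# Sketch: interpolating between two delta-like spikes.
import numpy as np
import matplotlib.pyplot as plt

def delta_like(x, center, eps=0.02):
    """Unit-area Gaussian that narrows toward a delta as eps -> 0."""
    return np.exp(-0.5 * ((x - center) / eps) ** 2) / (eps * np.sqrt(2 * np.pi))

x = np.linspace(-1, 1, 2000)
a, b = -0.5, 0.5
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    plt.plot(x, (1 - t) * delta_like(x, a) + t * delta_like(x, b), label=f"t={t}")
plt.legend()
plt.show()

dx = x[1] - x[0]
blend = 0.3 * delta_like(x, a) + 0.7 * delta_like(x, b)
print(blend.sum() * dx)   # total area stays ~1 for any t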



[This raised a sidebar where we discussed explicit, implicit and parametric functions. Visit the link if you care to explore these creatures.]

Andrew: ... and the course gave a particularly nice view of Ohm's law in terms of flux and such.

Me: So that would be a Maxwell's-equations version of R = V/I?

[It was time to mention Falstad's circuit simulator, which is amazingly visual and useful since it animates current flow and signals.]



[Then we went upstairs to play ping pong, called 'table tennis' by those who keep score, but we don't, as that violates the aesthetic of keeping the ball going in the highly echoic racquetball court. We countered the echoes by turning the table 45 degrees, which helped funnel errant shots but did little for the many copies of sound we got during our conversation.]

V = IR reminded me of F = ma, which reminded me of E = mc².

Me: The problem with the last trifecta is that c is squared when maybe it should just be a thing that isn't. Consider half-derivatives, which interpolate between two worlds as well. Maybe Length, Mass, and Time are consequences of things rather than fundamental things. Perhaps we live in a world where velocity is the fundamental value, rather than position or length, where length is the integral of velocity with respect to time, rather than velocity being the derivative of length with respect to time.

[We went on for a bit about making things non-dimensional, like lift and drag coefficients, and unitizing things so that, for example, the speed of light c = 1.

As the ping pong ball (note it isn't called a table tennis ball) was flying back and forth, I mentioned that simultaneity is in the eye of the beholder according to the Special Relativity Lectures by Brian Greene.

Then after a few more rounds ruled surfaces came up. They appear curved, but are built of straight lines, which at first seems paradoxical.]

Me: You know black holes can float on water right?!

Andrew: (incredulous expression)

Me: It's true, I didn't believe it myself until I did the calculation. With mass accretion in supermassive black holes there is a war between surface area and volume, and the upshot is that density drops considerably as the mass of a black hole increases. You can check my math:
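
Here is that arithmetic, sketched: the mean density inside the Schwarzschild radius falls as 1/M², crossing the density of water near 10⁸ solar masses.

# Sketch: mean density of a black hole vs. its mass.
import math

G = 6.674e-11        # m^3 kg^-1 s^-2
c = 2.998e8          # m/s
M_sun = 1.989e30     # kg

def mean_density(mass_kg):
    r_s = 2 * G * mass_kg / c**2                    # Schwarzschild radius
    return mass_kg / ((4 / 3) * math.pi * r_s**3)   # kg/m^3

# Mass at which the mean density equals water (1000 kg/m^3):
M_float = math.sqrt(3 * c**6 / (32 * math.pi * G**3 * 1000.0))
print(M_float / M_sun)            # ~1.4e8 solar masses
print(mean_density(1e9 * M_sun))  # a billion-solar-mass hole is far less dense than water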


The moral of the story was the reminder that I too should consider wearing a floppy Oppy hat when playing ping pong in a highly echoic room, to suppress the back channels.

Other Topics We Talked About, Saved for Later:

  • Pencil and Paper vs. Computer for Mathematics
  • Case in Point:
    • Interval Arithmetic for Heisenberg vs.
    • Maxima - A bulldozer for mathematics
  • The Principle of Least Action
  • The Lagrangian
  • Emmy Noether's Principle and The Unassassin Trailer
  • Ray Tracing Corners: Is the space Covered?

Friday, April 01, 2022

How to Get a Factor of Ten Speedup In Google Colab Jupyter Notebooks and Other Tricks

I don't work for Google. I'm just a Ph.D. student trying to get by. This evening at 1 AM, I was working with an autoencoder example that took a long time to train. Autoencoders are cool, but I don't have all day to wait around.

I use Google Colab for Python Jupyter Notebooks because I don't have to install any software on my machine. This also saves gobs of time and disk space.

Google maintains a host of library versions, ensuring compatibility, which is an enormous convenience since machine learning is a field changing by the day, even by the hour.

Google Colab is technically free, but $10 a month buys you access to GPUs, TPUs, and the promise your job will actually finish. I figure it's my tithe to Google for all the good they do.



At the top of the notebook is the Runtime menu; go to the bottom item (yellow arrow):


When you run a notebook, you have the choice to use a bare CPU, a GPU, or a TPU, and in addition you can request extra RAM using the questionably named 'Runtime shape' menu:


The option dialogs look like this:



but don't use Standard.

I was curious which configuration of devices was the fastest for my autoencoder experiments. Intuition says GPU and extra RAM, but I don't trust my intuition when I can just measure something and know for sure. Here is the data generated by logging all possible combinations. You have to 'Restart and run all' to make sure that runtime configuration changes stick.



To avoid analysis paralysis, these values are thrown into a Python dictionary, with a quick crunch to compute the mean, round the results, and sort them from fastest to slowest:
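
The measured values lived in the figure; the numbers below are placeholders (not my data), but the crunch is the same: average the three runs per configuration, round, and sort fastest-first.

# Sketch of the crunch; runtimes here are hypothetical stand-ins.
times = {                                   # config -> three runs, seconds
    "GPU + high RAM": [98.0, 101.0, 99.0],
    "GPU + standard": [130.0, 128.0, 133.0],
    "TPU + high RAM": [310.0, 305.0, 312.0],
    "CPU + standard": [850.0, 840.0, 860.0],
    "TPU + standard": [990.0, 1005.0, 1010.0],
}
means = {k: round(sum(v) / len(v)) for k, v in times.items()}
for config, seconds in sorted(means.items(), key=lambda kv: kv[1]):
    print(f"{config:16s} {seconds:5d} s")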




Intuition turned out to be right, but there was a surprise: a TPU with a standard memory configuration was the slowest. This is a 'for sure', since each case was run three times to account for process noise. I could easily have convinced myself that the TPU was a reasonable choice and taken TEN times longer to get done.

So GPUs with extra RAM are the fastest by a factor of ten for my particular problem, which is a fairly run-of-the-mill machine learning task.

Other Tricks
1) Running Colab in a Chrome incognito window starts up a lot faster, for reasons I do not understand. The difference is significant. I don't have time to chase it either, so I wrote the billing department at colab-billing@google.com. That is a lot faster than trying to get tech support, for reasons I completely sympathize with. But hey, I'm a paying customer.

2) To have Colab show cell execution times automatically, insert this code at the top of the notebook:
!pip install ipython-autotime
%load_ext autotime

Wednesday, March 30, 2022

A Note on Clarity in Machine Learning

We live in an internet of information and an internet of things.

The trifecta of childhood dreams, rockets, robots and radio has never seen a brighter time. They are the three R's for the digital age.

Oz is giving something to the Tin Man that he didn't already have - an artificial brain.

Adding to the seismic shift we find ourselves in is the explosion of machine learning (ML) techniques. There are two kinds of detonation taking place.

The first is the increase in the number of fundamental algorithms.

The second is in permutations or variants of those algorithms.

Consider the garden of fundamental algorithms:

1. Linear Regression (Supervised)
2. Logistic Regression (Supervised)
3. Decision Tree (Supervised)
4. Random Forest – Ensemble of Decision Trees (Supervised)
5. Support Vector Machines – SVM (Supervised)
6. Naive Bayes (Supervised)
7. Gradient Boosting Algorithms – XGBoost (Supervised)
8. Convolutional Neural Network – CNN (Supervised)
9. Recurrent Neural Network – RNN (Supervised)
10. K-Nearest Neighbors – kNN (Supervised)
11. K-Means (Unsupervised)
12. Dimensionality Reduction: Principal Components Analysis – PCA (Unsupervised)
13. Generative Adversarial Networks – GANs (Both)
14. Reinforcement Learning – RL (Neither)
15. Attention Mechanisms that Prioritize Machine Learning Operations (Self-Supervised)

Consider the garden of variants of a specific ML device, the autoencoder:

1. Denoising Autoencoders
2. Sparse Autoencoders
3. Deep Autoencoders
4. Contractive Autoencoders
5. Undercomplete Autoencoders
6. Convolutional Autoencoders
7. Variational Autoencoders
8. Recurrent Autoencoders
9. SeqToSeq Autoencoders

For each kind of ML algorithm and variant, we require the following representations to understand them:

1. The underlying equations
2. The block diagram showing how the parts fit together
3. An outline of the code that computes them
4. An animation of the progression of the algorithm

For algorithm selection and cost estimation, we would also like to know:

1. The time complexity
2. The space complexity
3. The loss function
4. The hyperparameters: learning rate, activation function, etc.

During the execution of an algorithm, it is useful to watch the loss function decrease, as this portrays the learning that is taking place.
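
A toy illustration with made-up data: one-parameter gradient descent, logging the loss at each step so its decrease can be watched.

# Sketch: watching the loss fall during training.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]   # toy (x, y) pairs, y ~ 2x
w, lr = 0.0, 0.1                               # weight and learning rate
losses = []

for step in range(50):
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad                             # gradient descent step
    losses.append(sum((w * x - y) ** 2 for x, y in data) / len(data))

print(losses[0], losses[-1])                   # the loss should fall steadily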

 

Without these representations, algorithm selection and cost estimation are just shooting in the dark. This is especially so when we are comparing our work with that of others.

So that is my note on clarity.

References

Different Types of Autoencoders

Feature Extraction by Sequence-to-sequence Autoencoder

Wednesday, March 09, 2022

Least Damaged World


As I write these words, with lights blinking in basements, garages, attics, and she-sheds, millions of gaming and mining rigs crunch away on a single new problem: Least Damaged World.
Up till now these rigs were running Grand Theft Auto, Minecraft, PlayerUnknown's, and hundreds of other sick titles, happily warming the oceans.
But these machines and their highly skilled ops now run a sim that seeks to solve but one solicitation:
How do we save New York?
Why all that horsepower on such a short question? Because it's not just New York. It's London, Paris, Munich, everybody talking about pop music. Let me talk about it [short on time]:
We’re 100 seconds to midnight and we can do anything we want with those few precious moments.
The thing we should do is answer the question, like so:
There is some order of nuclear exchanges, starting with the tactical one now aimed at Ramstein AFB, that kills the fewest people, and another that kills the most. Right now, no one on Earth knows the answer to that simple question. No brass, no fruit salad, no one. That question is identical to the first one: how do we save New York, first on the list?
But we could know it. It is knowable.
There is some best order of bartered exchanges, gained only through brute-force simulation, that says what targets must be struck to save the newest New Yorkers. This includes the lovely option of striking no one anywhere. This includes shooting the gun out of their cold dead hand. This even includes shooting them in the face, because no one holds my girlfriend by the neck but me.
Cut to the chase:
It unrolls like this. WormOfData Googles the nuclear inventories of all the card-carrying powers that be. DawnOfThunder puts them in the Unreal and Unity game engines. HiFionWiFi builds a Matrix quality sim good down to the dumpster decal. SereneDipity hacks a pure speed sim that trades away ray tracing in favor of getting the answer before 100 seconds are up.
ZipTie googles the fuzzed-out regions to data-is-beautiful-mine where the BIG ONES live, and which ones have their mouths open, ready to barf up megadeath.
DateScroller divines the submarine inventory and Monte Carlos the favorite positions in an electrostatic LoveBoat episode of pissed off leaders in unhappy places.
SomebodySpecial forks a faster leaner version 12, built in as many days, outed to the world as open source.
You get the idea. Touch big red and watch the world end in a sim. Now change one fateful thing. Fewer people die. In a few million GPU hours you don't just have crypto coinage, you have saved New York, and a ton of all creatures great and small. Why? Because you and everybody else now know something that wasn't known before. What is the order, what is the optimal target list? Is the best thing to do nothing? What would have happened if we did nothing after 9/11? Would it be better than what we did? We don't know till the sims run. We just don't know. And second-order ignorance is an unhappy end, whether you're brass or fruit salad or no one like me.
What is best thing [to do] after the First Strike?
That would make a nice name for Citizen SDI, but I’m sure it is already taken.
Is there even a world without New York? We tasted that one before…
- Van2022

Thursday, December 16, 2021

 

Personal Reflections on the Design of the Webb Space Telescope


L. Van Warren, MS CS, AE, Ph.D. Candidate CS




4.4% Bounce Cost

The Webb Space Telescope has an optical path consisting of a concave primary mirror, a convex secondary mirror, a concave tertiary mirror, a flat steering mirror, and a final focal surface. A ray of light impinging on the primary mirror thus has four reflections before final instrument entry on bounce five. The reflectance of each mirrored surface has been measured to be in the neighborhood of 98.5%. Multiplying the loss at each bounce gives a final effective signal strength of 94.1%.
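
The arithmetic, for checking:

# Four reflections at ~98.5% each.
reflectance = 0.985
throughput = reflectance ** 4
print(round(throughput * 100, 1))                  # -> 94.1 (%)
print(round((reflectance - throughput) * 100, 1))  # -> 4.4 (%) regained with a single bounce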

Contrast the current design with one that stations the instrument entry in place of the secondary mirror. This would enable an improvement of 98.5 - 94.1 = 4.4% in signal strength, corresponding to an equivalent effective surface area increase, or a corresponding weight reduction if the primary mirrors were scaled down. It would also result in a significant reduction in mechanical complexity and cost if the three intervening mirrors were eliminated, at some cost in versatility.

The convexity of the secondary mirror implies that the primary would also have to be reground if it were to immediately feed an instrument focal plane, which itself could be a difficult endeavor. So that is item one.

Solar Panels vs. RTG

The Voyager spacecraft have exhibited longevity exceeding 44 years, due in part to their use of Radioisotope Thermoelectric Generators (RTGs), whose performance does not depend on solar distance. The planned orbit of Webb around the L2 Lagrange point is a million miles from Earth, ~1 percent farther from the Sun than the Earth itself, so the solar irradiance is like Earth's. I wonder if solar panels will have the longevity of RTGs. In any case, fuel to remain on station about L2 and to unload the reaction wheels would seem to be the factor limiting telescope lifetime, rather than the power source. I also wonder if an ion thruster could offset the need for expendable fuel, resulting in increased spacecraft lifetime.




Wednesday, June 24, 2020

Machine Learning is the New Timesharing or Give the Dog a Head




"We write about the past to discover the future"

Despite the ongoing tragedy of the pandemic, we continue to live in a period of remarkable technical advance. Fortunately our society has advanced to the point where even when confined to home, we can continue to innovate. If we let our minds run a little bit, I wonder what we can come up with. 

Any innovation these days is likely to involve both collaboration and the most modern arrays of hardware that can be assembled. A certain proverb says, "Many hands make light work." This is true for both processors and people. Both get viruses, but I digress.

For the sake of argument, a euphemism I use for stimulating discussion, let's assume someone has plunked down in front of us the fastest airplane money can buy. We immediately ask ourselves, "Where could we go, and more importantly, where should we go with it?"

John F. Kennedy said, “For of those to whom much is given much is required”, echoing the writings of Luke the Physician who said “For unto whomsoever much is given, of [them] shall be much required.”

The question is, "How can we bring the most benefit into the world, from the gifts we have been given?". Among these gifts is our ability to reason and communicate nearly instantly worldwide.

Three Observations Motivated by Personal History

1) Time Sharing is Back
I began my computing career with the help of Dr. Carl Sneed, an associate professor at the University of Missouri, one of five universities I attended over the years. I had signed up for an introductory computing course taught on the IBM 360 TSO mainframe in 1975.

IBM 360 with peripherals

Dr. Sneed was kind enough to walk me through the following process:
Dr. Carl Sneed, University of Missouri

a) Write one's Fortran IV program on 80-character paper.
IBM Coding Paper

b) Transfer each line on the paper to a punched card, using a punched card machine that announced each character with a kerchunk, like a sewing machine that has placed a stitch.
The IBM 029 Card Punch

c) Place the deck of cards on the card reader.
IBM System/360 Model 20 Card Reader

d) Press the button which made a Las Vegas card fanning sound as the deck was read.

Card Reader Panel Buttons

e) In those days, the size of one's deck was very much the status symbol, but I digress.
Card Deck


f) I specifically remember two programming assignments I had to get running:
      - The 3,4,5 triangle problem
     - The parabola problem

The parabola problem was the most important to me personally, having grown up in a family where such figures were important. The assignment did not ask for it, but I was compelled, even obsessed by the unassigned task of DRAWING the parabola whose roots were computed by the program. This drawing took place on a Calcomp plotter.
Calcomp 565 Plotter
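
Sketched in modern terms, with arbitrary example coefficients, the task that obsessed me: compute the roots, then draw the parabola.

# Sketch: roots of a parabola, then the drawing the Calcomp never made for me.
import numpy as np
import matplotlib.pyplot as plt

a, b, c = 1.0, -1.0, -2.0                  # y = x^2 - x - 2, roots -1 and 2
disc = b * b - 4 * a * c
roots = ((-b - disc**0.5) / (2 * a), (-b + disc**0.5) / (2 * a))

x = np.linspace(-3, 4, 200)
plt.plot(x, a * x**2 + b * x + c)
plt.scatter(roots, [0.0, 0.0])             # mark the computed roots
plt.axhline(0, linewidth=0.5)
plt.title(f"Roots: {roots[0]:.1f}, {roots[1]:.1f}")
plt.show()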

Despite multiple attempts, I never succeeded in completing this task on the IBM 360, but the drive to do it never left me. It became a central focus of all my future computing, and led me from aerospace engineering to computer science to computer graphics at the University of Utah.
It eventually would result in this, which you can click on if you like animations.



Various Animations

g) After the card deck was read, the next activity I clearly remember was the WAIT.

One had to wait to collect the printout that resulted from the execution of one's program to find out if it had functioned correctly, or even at all.
Wide form computer printout

Like the Tom Petty song, waiting was the hardest part. This would range from 5 minutes on a good day, to 30 minutes or even, "Pick it up tomorrow" on a busy day.

h) In those days, the priority with which one's jobs ran was very much a status symbol, but I digress.

i) On obtaining the tree-consuming, fan-folded printout of nearly poster-size proportions, one would deduce, usually in seconds, any shortcoming the program had, which would lead to a repetition of the steps above.

Now why do I present, in such excruciating detail, the above series of steps? Because if we skip over the personal computing revolution to the current state of machine learning we find we have arrived at the same place again.

Enter Machine Learning
Fast forward 45 years. Besides all the mish-mash of algorithm design and coding, machine learning (ML) consists of three principal steps:
    1) Training the neural network from data
    2) Testing the neural network on data
    3) Deploying the resulting inference engine for general use

The most time-consuming step by far is training the network. The Waiting problem has reappeared, since for most problems of current interest, training networks cannot realistically be done on a user's personal computer in a reasonable amount of time. So it has to be farmed out to a CPU, TPU, GPU, or APU in the cloud via Microsoft Azure, IBM Cloud, Google Cloud, Amazon Web Services, and the like. The machines that execute the jobs sit in racks, and those racks sit in server farms.

They process our jobs and we wait and we pay. An example of a massively parallel job is GPT-3, a language inference engine that has 175 billion weights in its neural network and cost an estimated $12 million to train.

So, to follow Dr. Sneed's kind example, how do we make machine learning as easy as possible to learn and execute? How can we minimize the number of steps and the administrative and emotional overhead necessary to bring ML into our computational lives? ML is already available on demand using services like Google Assistant, Microsoft Cortana, Apple Siri, and Amazon Echo. These enable positively useful C3PO-like conversations with machines, whose only lack is a robotic delivery mechanism.

C3PO - Ready to Answer Questions

Transforming the current generation of personal assistants into more robotically enabled ones would seem to be a natural direction for growth and development. At this writing, one can already purchase a robotic canine from Boston Dynamics for $75,000 USD, while a Google Assistant to use for a head is three orders of magnitude less expensive, at $29 USD. So there is one idea.


FrankenSpot = Spot + Google Assistant


So that would be one interesting project, although I personally would prefer a more anthropomorphic version, since hands come in handy for robotic assistants.