Van's Blog: mathematics

Showing posts with label mathematics. Show all posts

Tuesday, June 30, 2009

Filamentary Rotations

I don't know where to begin, so let me just get a few ideas on the table, to see if a whole will emerge.

A degree of freedom is something like movement in the x-direction. Move over, move back. This degree of freedom is big, because we can see the movement if it is large enough.

So we might call this idea of a freedom of movement in the x-direction a dimension.

A line of this movement, or a filamentary curve is something we can slide a bead along. We assume that we can label subsequent positions of the bead as we move over and move back, and if these labels are consecutive numbers, we can use them to say where we were, where we are, and where we will be.

This filament of possible movement is so large we will call it a large dimension - a large degree of freedom.

But now take the bead and twirl it. The bead can also have another degree of freedom, a rotation.

But assume for a moment that we allow the bead to shrink, ever smaller and smaller, till it is just the size of the filamentary curve itself. We can talk about the rotational station of this bead, but the degree of freedom itself is curled up, too small to see, so we might call this a small dimension - a small degree of freedom.

We can also think about labeling how curled up the bead is by naming the turns or parts of a turn the bead has made. If these names are consecutive numbers we can use them to say where we were, where we are, and where we will be.

Now we are ready to talk about the first idea.

If we have something that is moving in a large dimension along a curve or line, we can resolve this movement as movements along a set of complementary axes. Then we can say that movement along our arbitrary filamentary curve has this much movement in the x-direction, this much movement in the y-direction, and we can use these pairs of labels as perfectly adequate alternative names for the position of the bead along the arbitrary filament.

If you have ever spun a bicycle wheel and held onto the axle in each hand, you will know that the wheel doesn't like to be tilted. It resists this tilting with an inertial force called the gyroscopic force. The gyroscopic force wants to keep the wheel spinning in its original direction and complains by resisting if the wheel is tilted to spin in some other plane of rotation.

All the points on the wheel except the very center, can be represented as translations through space, they aren't rotating at all! But the very center of the wheel (we pretend the axle is rotating too) has an axle, an axis of rotation, and there is an infinitesimally-wide dimension where the movement is pure rotation with no translation. This is a small dimension, because it can never be seen. A more apt name for it might be an invisible dimension.

If the bicycle wheel became like the bead, and became infinitely small, it would have no gyroscopic force. So it could be spinning in one direction, and then that whole assembly could be spun in a second direction and we could resolve spins in one direction into components of spins in multiple other directions as we did with translation above.

Now for the second idea.

If we count up the directions that we live in there are three, and for each of these directions there are three rotational degrees of freedom. Then there is the passage of time. So we really live in a six dimensional sort of space, if we count the invisible dimensions of filamentary rotation, seven if we consider time to be a dimension, but it doesn't have the same freedom as the others. We are stuck in it.

The third idea.

I should stop here, but I am afraid I might lose an important idea, so I will just add something that I find interesting. We rarely talk about the position of a photon. We can talk about where it originated, or where it might be after a time, but the native state of a photon is not really its position, but rather its velocity, which in a vacuum is just c - the speed of light.

But this is a translational velocity and I want to know if there is also some kind of native rotational velocity of a particle, say a photon, or even some other kind of a fundamental building block, perhaps an electron. The spin of an electron is one of its four pieces of state information. My question is, how fast is it spinning?

So putting the last two ideas together we see that it isn't just where something is that is its native state, but rather how fast it is going that characterizes something important about it. How fast something goes is a degree of freedom - a dimension also.

If we add up where things are, and how fast they are going in translation and rotation we come up with 12 degrees of freedom, or 12 dimensions. That plus time makes 13, which just happens to be my lucky number...

Monday, October 23, 2006

Numerical Gold: A Certain Series

Consider the series:

1, 1, 2, 3, 12, 20, 300, 525, 1960, 49392, 1481760, 5821200, 164656800, 336370320, …

Do you, offhandedly recognize the generating function? This is a special series.

These are the inverses of the last solution xn to the Hilbert matrices of order 1, 2, … 14

These matrices were notorious for being ill-conditioned, they are solved here symbolically using Maxima and rational arithmetic.

Maxima notebook is below.

The solution is to the n x n linear algebra problem [H]x = b.

To reproduce these numbers, or larger ones, one can edit the file with a text editor and then drag it into the Maxima window. The last number required 2 minutes to generate on my new dual-core. To change the order of the Hilbert matrix, just change the order variable at the top of the file. It is currently 4. So a point of curiosity really. The numbers become extremely expensive to find, growing at least as the cube of n times a large constant. So to me they are a kind of symbolic gold. The 100^th number for example, is probably not knowable at the current time, but that is speculation on my part. You may notice some functional relationship that allows their simple generation thus “Cracking the Hilbert Code”.

For example

12 = 4 * 4 - 4
20 = 5 * 5 - 5
300 = 20 * 20 - 100
525 = 25 * 25 – 100
and so on.

Ref: The Code

load("eigen");
order : 4;
X : columnvector(makelist(concat(x,i), i, 1, order));
h[i,j] := 1/(i + j -1);
Unity[i,j] := 1;
A : genmatrix(h, order, order);
A . X;
B : genmatrix(Unity, 1, order);
A . X = B;
Ap : triangularize(A);
Ap . X = B;
App : invert(Ap);
App . B;

Friday, May 19, 2006

A Number Theory Problem and the Flexible Base Machine

Summary

A common problem in number theory is addressed using two methods of brute force search. The first search method is implemented as a ‘C’ program. The second search is performed in a spreadsheet. A generalized form of counting is developed to enable generation of the range indices for the second search method. This generalized form of counting takes the form of a flexible base machine.

The flexible base machine is then used to generate the indices for a Cartesian product search. In the process, two observations are made, and an answer to a question posed by James Watson is suggested. The number problem is similar to bin-packing problems. Progressive abbreviation, chunking, and grouping are related ideas.

Preamble

I want to bring a couple of ideas to the front before we embark on our little number problem journey. Try not to let their simplicity distract you. Suppose the following pattern is observed:

{ 101 blah blah blah 101 blah blah blah … }

Let the shorter token A abbreviate 101 in the pattern above.Substitution produces:

{ A blah blah blah A blah blah blah … }

Then let the shorter token B abbreviate blah blah blah in the above pattern to produce:

{ A B A B … }

Continuing this abbreviation progressively with C abbreviating A B yields:

{ C C C … }

and we have progressively abbreviated to saturation. Changing gears a little consider a number in base 10, spoken as "one hundred and one", written as 101, with an optional subscript indicating the base.

This number is the dot product of two vectors, a basis vector and a coefficient vector. Reading right to left the basis vector is:

{ …, 100, 10, 1}

which can also be rewritten:

{ …, 10², 10¹, 10⁰}

and the coefficient vector is:

{ …, 1, 0, 1}

the dot product is:

1 x 10² + 0 x 10¹ + 1 x 10⁰ = 101₁₀

where again, the subscript indicates the base.

We can run the same process again in base two, changing the basis vector to base 2:

{ …, 4, 2, 1}

which can also be rewritten:

{ …, 2², 2¹, 2⁰}

and the coefficient vector is:

{ …, 1, 0, 1}

the dot product is:

1 x 2² + 0 x 2¹ + 1 x 2⁰ = 101₂ = 5₁₀

where the subscript indicates the base and superscripts indicate powers of the basis coefficients.

Now consider the process in base Q:

{ …, QQ, Q, 1}

which can also be rewritten:

{ …, Q², Q¹, Q⁰}

and the coefficient vector is:

{ …, 1, 0, 1}

the dot product is:

1 x Q² + 0 x Q¹ + 1 x Q⁰ = Q² + 1

where the result is independent of base.

What happens when Q is -2 or the imaginary number i?

But first, the problem at hand.

Introduction

Daughter and I are at a high school math fair. A seemingly simple problem is written on a file card. The problem is:

Choosing from {16, 17, 23, 24, 29}

Show how you can make a sum of exactly 100.

There are two parts to this development. An elementary part and a subtle part.

Elementary Part:

There are several interesting things about this problem:

a) To proceed, the solver must infer that multiple instances of a given integer are required. What enables this to be obvious to one and opaque to another? Instinct? Training? Pattern Recognition?

b) How much time is required to find all solutions by hand?

c) If trial and error is used, how many trial solutions exist?

d) How many actual solutions exist?

e) Do rearrangements exist that make the problem easier to solve. Recall the famous incident when Gauss was asked by his teacher to add the first 100 integers. He did so by pairing 1 with 100, 2 with 99, etc. recognizing fifty such pairs. This pairing transformed a long addition into a short multiplication: 50 x 101 = 5050. The insight was not merely numerical. It was pattern recognition and viewpoint.

Development

As a professional mental defective with an ‘f’ and no ‘t’, I imagined solutions of the form:

1) a x + b y + c z + d w + e v = 100

Reversing the list, the specific problem would look something like:

2) a 29 + b 24 + c 23 + d 17 + e 16 = 100

Now to enumerate trial solutions in an orderly way, the coefficients a thru e must be bounded from above and below. To first order, trial values of a would vary from 0 to k such that

3) a 29 <= 100

4) k = floor(100/a) = floor(100/29) = 3

where we agree that floor rounds down to the nearest integer. We might limit a further by considering the likely participation of other integers solutions from our set. That is not done here, but I suspect most mathematicians do this instinctively.

So we replace the coefficients a thru e in equation 2 by the intervals computed using the method of equations 3 and 4. Representing intervals in square braces [0-k_a-e] we obtain:

5) [0-3] 29 + [0-4] 24 + [0-4] 23 + [0-5] 17 + [0-6] 16 = 100

The number of trial solutions is size of the five-dimensional Cartesian product. The first estimate of this number was:

6) 3 × 4 × 4 × 5 × 6 = 1440

but the answer is defective, consistent with the ‘f’ above. To check the interval from [0-k] requires k+1 operations as any ‘C’ programmer will quickly tell you. Adding one to each upper bound makes a threefold difference in the product. The non-defective answer is:

7) 4 × 5 × 5 × 6 × 7 = 4200

and we have answered question c. There are 4200 trial solutions.

The amount of time required to generate all trial solutions is estimated as follows:

Assume these calculations are performed on an infix calculator at the rate of one keystroke per second. Trial solutions of equation 2 require four adds and five multiplies each. Each add requires six keystrokes and each multiply five, because all adds involve pairs of two-digit numbers and all multiplies involve one-digit numbers times two-digit numbers. The ‘+’, ‘x’ and ‘=’ operators require one keystroke each. Adding it
all up:

8) time per add: 6 seconds

9) time per multiply: 5 seconds

10) time per trial solution: 4 × 6 + 5 × 5 = 49 seconds

10) time for trial solutions: 49 × 4200 = 205,800 sec (57 hours 10 minutes)

Fifty-seven hours requires the patience of David Blaine who at this moment has been suspended in a spherical tank of water for nearly a week. My first impulse was to throw this problem into a spreadsheet to see how it looked. David Blaine looks unexpectedly round due to the spherical nature of the tank. Using a spreadsheet also led to unexpected observations, which motivated this busy little development. Idea! How about a quick ‘C’ program to find all the solutions? The program took 5 minutes to write, 5 minutes to debug and 0.03 seconds to run. That is 343 times faster than doing it by hand for the first run and 6.9 million times faster for subsequent runs. A computer makes one at least 343 times more productive than a magician floating in a tank of water, clever though he is. For completeness, the ‘C’ code for finding the solutions follows:

main()

{

int iteration = 0;
int i, j, k, l, m;

int a[5] = { 3, 4, 4, 5, 6};

int x[5] = {29, 24, 23, 17, 16};

for(i = 0; i < a[0]; i++)

for(j = 0; j < a[1]; j++)

for(k = 0; k < a[2]; k++)

for(l = 0; l < a[3];l++)

for(m = 0; m < a[4]; m++)
{

if( (i*x[0] + j*x[1] + k*x[2] + l*x[3] + m*x[4]) == 100 )

printf
(
"%4d: %2d %2d + %2d %2d + %2d %2d + %2d %2d + %2d %2d = 100\n",

iteration,i, x[0], j, x[1], k, x[2], l, x[3], m, x[4]
);

iteration++;

}

}

This answers question d.

The run yielded three solutions, where iterations are numbered from zero:

iteration 26: 0 × 29 + 0 × 24 + 0 × 23 + 4 × 17 + 2 × 16 = 100

iteration 513: 1 × 29 + 0 × 24 + 1 × 23 + 0× 17 + 3 × 16 = 100

iteration 750: 1 × 29 + 2 × 24 + 1 × 23 + 0 × 17 + 0 × 16 = 100

Placing this in tabular form:

Solution	Obtained on Iteration	Value	Equiv. Hand Cranking
0	26	{0, 0, 0, 4, 2}	24 min: 45 s
1	513	{1, 0, 1, 0, 3}	7 hr: 50 min: 15 s
2	750	{1, 2, 1, 0, 0}	11 hr: 27 min: 30 s

Reversing the coefficients finds the first solution more quickly, a useful trick to remember.

Solution	Obtained on Iteration	Value	Equiv. Hand Cranking
0	20	{0, 0, 1, 2, 1}	18 min: 20 s
1	673	{2, 4, 0, 0, 0}	10 hr: 16 min: 55 s
2	734	{3, 0, 1, 0, 1}	11 hr: 12 min: 50 s

These solutions occupy a five-dimensional discrete counting space, from a vector point of view.

We have answered question b. This leaves questions a and e. Question a dealt with how different people solve the same problem and question e dealt with rearrangements of the problem that speed solution. Note that even though one finds all solutions within eleven hours and change, that one must evaluate all the trial solutions to prove that no others exist.

There is a certain irony to note. I spent about as much time with this problem as that required by manual brute force search. However, subsequent problems of this type would solved faster because the pattern is known. This last remark addresses question a. There are two people in this solution, the one before the pattern is recognized and the one after.

Rearrangements

Refreshing our mental image of the problem, we look for rearrangements.

Choosing from {29, 24, 23, 17, 16} Show how you can make a sum of exactly 100.

Question a implies a follow-up. What mental refresh rate is required by mathematicians vs. non-mathematicians?. Are numbers are stored visually, aurally, linguistically, some other way, or all of the above? When James Watson, the co-discoverer of DNA’s double helix was asked by newsman Charlie Rose what question he would like to know the answer to Watson replied, “Where [in the brain] is the number four stored?”. In the meantime, question e asks if there are rearrangements that make the math fair problem more tractable. We have:

11) a × 29 + b × 24 + c × 23 + d × 17 + e × 16 = 100

In this instance of the problem, two of the coefficients are odd and three
are even, a condition that would vary in any generalization. Exploiting
this idea nonetheless, we expand to:

12) a ( 20 + 9) + b (20 + 4) + c (20 + 3) + d (10 + 7) + e (10 + 6) = 100

There is no factoring (obvious to me) that simplifies this equation, so its solution retains its brute force search nature. We live in an age where brute force search is enabled by computers. This has been exploited in symbolic manipulation. It is exploited here also.

Subtle Part:

Finding the solution using a ‘C’ program is simpler and faster than using a spreadsheet. Looping through the Cartesian product makes setting the bounds on the search intervals easy. However it was in building the machinery to handle the spreadsheet version that two interesting observations were made. Is this because the intermediate steps were explicitly visualized?

As developed in the preamble, it is common practice to represent numbers using placeholder notation. Placeholders are just linear combinations of successive powers of a given base. So ordinary counting bears a an subtle and interesting similarity to our math fair problem.

13) In base 10, the number 101 means 1 x 10² + 0 x 10¹ + 1 x 10⁰ = 101

14) In base 2, the number 101 means 1 x 2² + 0 x 2¹ + 1 x 2⁰ = 5.

The same argument works for any base. The original problem was solved by searching linear combinations of integers that satisfied the given equation. Behold, counting numbers can be represented as a linear combinations of a more general set of numbers than successive powers.

Basis and Coefficient Vector Make a Number System

In both the original problem, and in the representation of any counting
number, there are two unstated vectors at play.

One is the basis vector {100, 10, 1} that specifies the decades of the number.

The other is the coefficient vector {1, 0, 1} that defines the values present in each decade.

We could write the number 101 as 1:0:1, but conventional placeholder notation insures that a single digit owns each decade that contributes to the number. To release the restriction we use colons. a:b:c:d to represent each coefficient of a multi-digit number in a flexible base format.

This discussion is more easily demonstrated with a spreadsheet. And to that we must briefly turn. With the spreadsheet is possible to construct working solutions to number theory problems and find the general rules that embody them. Poor man’s induction if you will, that will reach its conclusion in the appendix. Consider the following figure.

Figure 1 - Spreadsheet

The spreadsheet must index through decades of trial values in an orderly fashion, starting with the least significant rightmost trial values and incrementing left values when we complete a trip through a decade and roll over to the next value to the left. This will produce an unexpected reward.

Next, a three decade counter is constructed, which works in any base. For example, base two:

Figure 2 - Spreadsheet

A base 10 version requires only a simple change of basis from {4, 2, 1} to {100, 10, 1}.

Figure 3 spreadsheet

The spreadsheet has been split to account for its size. The split is shown by the horizontal bar. We are almost ready to make our observation. Here are the recurrence formulas the sheet uses:

Figure 4 - Spreadsheet Formula Detail

The second argument in the floor function is the number of significant digits to provide.

The Observation

Returning to our Figure 3 spreadsheet, we are not obligated to provide a basis vector whose numbers are successive powers. We could choose numbers from our original problem:

Figure 5 - Spreadsheet

Linear combinations make the representation of any integer possible with just one restriction.

Using the flexible base, we can write 57 using the basis vector {23, 16, 1} that number is just 2:0:11, which expands to 2 x 23 + 11 x 1 = 46 + 11 = 57. The parity sum at the top proves that the targetnumber and sum are always the same for all target trial solutions examined.

It is strange and beautiful to expand a base represented by a scalar, to a base represented by a vector of numbers that have no relationship except being different in size. In all counting there is the process of progressive abbreviation, we are always using a base that is represented by a vector, whether the base is {100, 10, 1} or the base is {23, 16, 1}. Progressive abbreviation is a powerful and reversible concept. It is linguistic, mathematical, and instinctive. Is progressive abbreviation a mechanism by which concepts are stored and linked in the brain?

To address Watson’s question above in terms of another more testable question:

Is the number four stored as a progressive abbreviation, chosen by each person that links to an expanding neural network of related notions?

If such a progressive abbreviation is removed, are the links to the concept broken with an aphasia of four being the result?

An interesting restriction must be in place for flexible base to work. There must be a 1 in the low-ordered place, or it becomes impossible to represent all numbers. Said another way, absent 1, closure in the counting system is lost. Trading for that loss, we immediately obtain the power to approximate periodic functions. This has interesting properties and applications as suggested below. Absent 1 in the basis set, the sum - target sequence is a generator than undulates with unique patterns that depend on the basis vector. These patterns can be used to approximate discrete functions whose variation undulates in a similar way. Thus we have an observation:

Closure of a counting system can traded for approximation of periodic functions.

Figure 6 - Spreadsheet

Absent 1 in the basis vector there is not full coverage of the linear combination and the representation cannot represent all integers. Here is one interesting pattern of non-closure:

Figure 6a - Spreadsheet Graph – Absent
One Generates Discrete Function

Similar arguments can be made for real numbers.

Figure 7 - Spreadsheet

In base 10 and base 2 number systems, we have a convention for the basis vector - the base itself which gives the rule for creating the basis vector. So “base 10” is the progressive abbreviation for a basis vector of {…, 100, 10, 1} and these in turn progressively abbreviation common counting numbers. We are used to this. Counting numbers are just the coefficient vector written without delimiters and with the assumption of a basis vector. A second observation is that:

A basis and coefficient vector pair implements a number system more general than a conventional number system.

Thus counting is a form of progressive abbreviation. Those possessing the instinctive ability to abbreviate, that is to compress and expand linguistic and symbolic relationships gave us the ability to count and all kinds of mathematical opportunity. This further amplifies the answer to question a above. It may be that concepts of number all originated in the act of abbreviation, historically speaking.

Now we will use the decade counters we generate to solve the original number theory problem in a non-looping construct. This non-looping construct is independent of base.

The original math fair problem is now cast in the non-looping spreadsheet construct. This brings together the naive and subtle developments using the flexible base machine. To accomplish a basis vector must be constructed that is six elements long. The recurrence relationships are similar to above, but complicate with each additional element as shown in the Appendix. Analysis of the recurrence relationships yields the pattern necessary for basis vectors of any length. The recurrence relationships are constructed in a long form and then abbreviated by substitution. This process is shown for the five-element case in Figures 8 and 9. Figure 8 is a transitional construct.

Figure 8 - Spreadsheet

Figure 9 - Spreadsheet Excerpt – Increasing Complexity

A six element version of the flexible base machine will be used to solve the original problem below. We needed a flexible base decade counter and one has been constructed - a bizarre and useful device similar to mixed-base appliances such as analog clocks.

Figure 10 - Spreadsheet Using Flexible Base to Set Bounds

We use a modified form of the decade machine to provide the multipliers that search the space and the same solutions are obtained, but at different iterations:

iteration 30: {1, 2, 1, 0, 0}

iteration 1095: {0, 0, 0, 4, 2}

iteration 1512: {1, 0, 1, 0, 3}

Understanding why the iteration numbers differ between the ‘C’ program and the spreadsheet will be left as an informative exercise for the reader. The answer is given after the Appendix.

Application - Data Compression and Encoding

The patterns that occur absent 1 in the flexible base create unique patterns. When these patterns correspond to the value of a signal, such as intensity or volume, they can be used to encode the pattern of a picture or a sound. The basis vector can be transmitted only as often as a change of basis is necessary to represent the image optimally.

This turns the encoding of an image into a search for the optimal basis and coefficient vector that represents the image. If done in short runs, this need not be overly expensive computationally. One can in fact define the image interpolant by the length of the coefficient and basis vectors. Note that they are always the same length.

One would expect the encoding step to be more expensive than the decoding step provided the basis vectors were published as part of the image data. The basis vectors form the key necessary to unlock the image. This could have applications in selling digital image or sound data. The data is free. You pay for the basis vector key. The technique could be applied recursively so that even the key itself could be represented as a basis vector + coefficient vector + data stream. This might be useful for compartmentalized security where portions of a document are at different levels of classification - one document, many keys. Keys would be distributed to users according to their privilege or classification category. Multi-level security has been a nagging problem which this could solve rather easily.

Application - Encryption

One useful encryption technique is to publish the number (coefficient vector without delimiters except for colons as noted above) without publishing the basis vector.

Breaking the code then becomes a task in searching for the basis vector that creates a recognizable message. Search time grows exponentially with basis vector size.

One tip off to the contents of the basis vector would be the presence of colons. Colons surrounding a number give an upper bound to the size of that entry in the basis vector.

Application – Progressive Abbreviation of DNA

DNA consists of coding and non-coding parts. These parts can be labeled by a simpler basis set, associated with function and the genome can be compacted by the progressive abbreviation technique illustrated above. It is this project which I hope to do next. It is a tribute to Watson. He uncovered the double-helix structure of DNA and then posed a question that led us to the next step.

A spreadsheet that demonstrates all calculations is available from the author. Just leave a comment requesting it.

Appendix – Building the Recurrence Relations for the Six-Fold Case

This will be done without further explanation by induction from the fivefold case which was built from the threefold case.

G8=FLOOR($B8/G$6,1) // Slot 6

I8=FLOOR(($B8-G8*G$6)/I$6,1) // Slot 5

K8=FLOOR((($B8-G8*G$6)-I8*I$6)/K$6,1) // Slot 4

M8=FLOOR(((($B8-G8*G$6)-I8*I$6)-K8*K$6)/M$6,1) // Slot 3

O8=FLOOR((((($B8-G8*G$6)-I8*I$6)-K8*K$6)-M8*M$6)/O$6,1) // Slot 2

Q8=FLOOR(((((($B8- G8*G$6)-I8*I$6)-K8*K$6)-M8*M$6)-O8*O$6)/Q$6,1) // 1

Answer to Question: Loop Order in the ‘C’ program searches right to left.

Van's Blog

Read Latex

Tuesday, June 30, 2009

Filamentary Rotations

Monday, October 23, 2006

Numerical Gold: A Certain Series

Friday, May 19, 2006

A Number Theory Problem and the Flexible Base Machine

Van's Bookshelf