Recursive Sequences as Linear Transformations

Subtitle: Fibonacci Spirals the Drain

This problem is from Otto K. Bretscher posted in the Classical Mathematics Facebook group.

bretscher-sequence

We’re working on a new platform for scientific research called eurAIka that combines The Whiteboard for writing notes, equations, and pasting in graphics, The Librarian for AI-assisted literature search, and The Coder where you can write programs. This will be our first attempt at using eurAIka for a Wild Peaches article.

The Fibonacci Sequence

Let’s start with an easier problem, the Fibonacci sequence:

fibonacci-sequence

The Fibonacci Sequence

where the recursive relationship is xn+2=xn+1+xnx_{n+2} = x_{n+1} + x_n and x0=x1=1x_0 = x_1 = 1​. We could generalize this to include any sequence composed of multiples of the previous two terms,

xn+2=αxn+1+βxn.x_{n+2} = \alpha x_{n+1} + \beta x_{n}.

The coefficients α=β=1\alpha = \beta = 1 for the Fibonacci sequence, and for the problem we’re trying to solve α=4\alpha = 4 and β=7\beta = -7.

fibonacci-spiral

Fibonacci Spiral

It wouldn’t be difficult to write a function in a computer language that calculates all of the elements of the sequence up to the one we’re interested in. To see how the value for x1=bx_1 = b changes the outcome, we’d have to run the code over many test cases to see what happened.

Another way to get to the value of x2024x_{2024} can be done in one step using linear algebra. Let’s call v0\overrightarrow{v_{0}} the vector composed of the first two entries of the sequence, so

v0=[x1x0].\overrightarrow{v_{0}} = \begin{bmatrix} x_{1} \\ x_{0} \end{bmatrix}.

The next step in the sequence puts x2x_2 in the first position and moves x1x_1 down giving

v1=[x2x1]=[αx1+βx0x1].\overrightarrow{v_{1}} = \begin{bmatrix} x_{2} \\ x_{1} \end{bmatrix} = \begin{bmatrix} \alpha x_{1} + \beta{x_{0}} \\ x_{1} \end{bmatrix}.

The entries of this new vector are linear combinations of the previous two entries of the sequence, so we could define a matrix AA that multiplies v0\overrightarrow{v_{0}} to get

v1=Av0=[αβ10][x1x0].\overrightarrow{v_{1}} = A \overrightarrow{v_{0}} = \begin{bmatrix} \alpha & \beta \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{0} \end{bmatrix}.

The recursion step says that v2=Av1\overrightarrow{v_2} = A \overrightarrow{v_1} and so on, or in terms of the original vector, v2=A(Av0)=A2v0.\overrightarrow{v_2} = A(A \overrightarrow{v_0}) = A^2 \overrightarrow{v_0}. For the Fibonacci sequence, A=[1110]A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}. For example, we can calculate the 16th entry in the sequence by calculating A14=[610377377233]A^{14} = \begin{bmatrix} 610 & 377 \\ 377 & 233 \end{bmatrix}, which gives F15F_{15} in the [1,1][1,1] location. A convenient trick to extract just the upper left entry using Julia is to write this as (A^14)[1,1].

If Ax=λxAx = \lambda x then xx is called an eigenvector and the constant λ\lambda is an eigenvalue. For the Fibonacci matrix AA there are two eigenvalues, λ=1±52\lambda = \frac{1 \pm \sqrt{5}}{2} (the positive one is the golden ratio, ϕ\phi) and if you take the ratio of Fn+1F_{n+1} to FnF_n you’ll see that it converges to λ\lambda. The other eigenvalue is the negative inverse of the first one, so λ2=Fn/Fn+1\lambda_2 = -F_n / F_{n+1}.

The matrix AA can be decomposed into A=PDPTA = P D P^T where PP is the 2×22 \times 2 array of eigenvectors and DD is a 2×22 \times 2 diagonal array with λ1\lambda_1 and λ2\lambda_2 on the diagonal. The eigenvectors are unit vectors, so PP is orthogonal meaning that PPT=IP P^T = I, the identity matrix. With this decomposition, powers of AA can be written as

An=PDnPT.A^n = P D^n P^T.

Since DD is diagonal, then DnD^n is also diagonal with λ1n\lambda_1^n and λ2n\lambda_2^n on the diagonal, making matrix powers easy to calculate. Now the 15th iterate of AA can be found from

A14=PD14PT=[610.0377.0377.0233.0].A^{14} = P D^{14} P^T = \begin{bmatrix} 610.0 & 377.0 \\ 377.0 & 233.0 \end{bmatrix}.

This is numerically unstable because DD will quickly overflow or underflow as powers of λ1\lambda_1 and λ2\lambda_2 approach either zero or ±,\pm \infty, but this idea will become theoretically useful for our problem.

As we saw above, the ratios of successive values of the Fibonacci sequence approach the golden ratio, ϕ\phi. If you plotted points (Fn,Fn+1)(F_n, F_{n+1}) you would see that they all lie nearly on the line y=ϕxy = \phi x, and the fit gets better the farther out you go. The points (Fn,Fn+1)(F_n, F_{n+1}) are the vectors which when multiplied by AA give the next point or vector in the sequence.

Geometrically, matrix multiplication transforms vectors through rotation and scaling. What we’re seeing here is that in the decomposition PDnPTvnP D^n P^T v_{n}, the first multiplication PTvnP^T v_n only performs a rotation since PP is unitary. The scaling occurs with Dn(PTvn)D^n (P^T v_n), and multiplication by PP restores the direction.

As an example, let v5=[5,3]Tv_5 = [5, 3]^T so

PTv5=[0.07675.830]DPTv5=[0.04749.434]PDPTv5=[85]\begin{aligned} P^T v_{5} &= \begin{bmatrix} 0.0767 \\ -5.830 \end{bmatrix} \\ DP^T v_{5} &= \begin{bmatrix} -0.0474 \\ -9.434 \end{bmatrix} \\ PDP^T v_{5} &= \begin{bmatrix} 8 \\ 5 \end{bmatrix} \\ \end{aligned}

Fibonacci

Rotation and Scaling Effects

Notice that v5v_5 and PTv5P^T v_5 lie on the same circle as do the points DPTv5DP^Tv_5 and PDPTv5PDP^Tv_5 showing that PP is a rotation, while the scaling between F5F_5 and F6F_6 is multiplication by λ1\lambda_1​.

The Bretscher Sequence

Now let’s apply this same technique to the Bretscher sequence xn+2=4xn+17xnx_{n+2} = 4 x_{n+1} - 7 x_n. Define the transition matrix,

B=[4710]B = \begin{bmatrix} 4 & -7 \\ 1 & 0 \end{bmatrix}

which has the characteristic equation derived from the determinant of BλI|B - \lambda I|

0=(4λ)(λ)+7=λ24λ+7λ=2±3i.\begin{aligned} 0 &= (4 - \lambda)(-\lambda) + 7 \\ &= \lambda^2 - 4 \lambda + 7 \\ \Rightarrow \lambda &= 2 \pm \sqrt{ 3 } i. \end{aligned}

The eigenvectors are found by solving for (BλI)u=0(B - \lambda I) u = 0 for some uu. This gives

[4(2+3i)71(2+3i)][u1u2]=0.\begin{bmatrix} 4 - (2 + \sqrt{3}i) & -7 \\ 1 & -(2 + \sqrt{3}i) \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = 0.

Expanding this gives the system of equations,

(23i)u17u2=0u1(2+3i)u2=0\begin{aligned} (2 - \sqrt{3}i)u_1 - 7 u_2 &= 0 \\ u_1 - (2 + \sqrt{3}i) u_2 &= 0 \end{aligned}

giving eigenvectors v=[2±3i1]v = \begin{bmatrix} 2 \pm \sqrt{3}i & 1 \end{bmatrix}. The inverse of

P=[2+3i23i11]P = \begin{bmatrix} 2+\sqrt{3}i & 2-\sqrt{3}i \\ 1 & 1 \end{bmatrix}

is

P1=[i2312+i3i2312i3].P^{-1} = \begin{bmatrix} -\frac{i}{2 \sqrt{3}} & \frac{1}{2} + \frac{i}{\sqrt{3}} \\ \frac{i}{2 \sqrt{3}} & \frac{1}{2} - \frac{i}{\sqrt{3}} \end{bmatrix}.

Let DD be the diagonal matrix formed from the eigenvalues,

D=[2+3i0023i]D = \begin{bmatrix} 2+\sqrt{3}i & 0 \\ 0 & 2-\sqrt{3}i \end{bmatrix}

and then verify that B=PDP1B = PDP^{-1} and PP1=IP P^{-1} = I, the identity matrix.

For the Fibonacci sequence, the eigenvectors formed an orthonormal basis, so we could quickly calculate powers of the matrix AA because

An=PDP1PDP1PDP1=PDIIDP1=PDnP1.A^n = PDP^{-1} PDP^{-1} \cdots PDP^{-1} = PDI \cdots I D P^{-1} = P D^nP^{-1}.

The Bretscher matrix also has this property, except that the eigenvectors aren’t orthogonal. Still, the eigenvectors form a basis of the space spanned by BB, so we can still calculate Bn=PDnP1B^n = PD^nP^{-1} for any power nn. This wouldn’t be so bad for some small values of nn, but we’ll need another method to answer the question of how to maximize x2024x_{2024}.

Otto Bretscher says in response to his original question,

The roots of the characteristic polynomial are 2 ± i√3 = √7exp[±i*arctan(√3/2)]. We can write x(n) as a linear combination of (2 + i√3)^n and (2 - i√3)^n and bring this expression into real form. While these computations are straightforward, they are somewhat tedious, perhaps best left to a computing device; the answer provided by WolframAlpha is attached. Since the coefficient of b turns out to be negative for n = 2024, we maximize x(2024) by letting b = 0.

wolfram-alpha-solution

Characteristic Polynomial from Wolfram Alpha

Pari/GP can calculate numbers with very high precision, so let’s give it a try. The sequence begins x0=1,x1=bx_0 = 1, x_1 = b, so the next element is found from

B[x1x0]=[4710][b1]=[4b7b].B \begin{bmatrix} x_{1} \\ x_{0} \end{bmatrix} = \begin{bmatrix} 4 & -7 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} b \\ 1 \end{bmatrix} = \begin{bmatrix} 4b - 7 \\ b \end{bmatrix}.

The fourth element is B2[b1]B^2 \begin{bmatrix} b \\ 1 \end{bmatrix} so the 2024th2024^{th} will be multiplied by B2022B^{2022}. Enter the matrix BB in PARI/GP as

B = [4, -7; 1, 0]

B2022

PARI Solution

So, how do you choose b[0,2π]b \in [0,2 \pi] to maximize x2024x_{2024}? Well you don’t need to know the value of the number that Pari just computed, all you need is the sign. Since the value of the entry in the [1,1][1,1] position of this little 2×22 \times 2 matrix is negative, the bb that maximizes x2024x_{2024}​ is zero.

recursive-image-of-p2024

Recursive image of this problem in eurAIka

Image credits

Code for this article

Problem2024.ipynb - JupyterLab notebook in Julia

Bretscher_sequence.ipynb - JupyterLab notebook using Wolfram Language 13.2

Fibonacci.html - Geogebra worksheet to demonstrate rotation and scaling effects of linear transformations (open in br)

Further reading


Software

Julia

The Julia Project as a whole is about bringing usable, scalable technical computing to a greater audience: allowing scientists and researchers to use computation more rapidly and effectively; letting businesses do harder and more interesting analyses more easily and cheaply.

Posts using Julia

JupyterLab

JupyterLab is the latest web-based interactive development environment for notebooks, code, and data. Its flexible interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning.

Posts using JupyterLab

Wolfram Language

Wolfram Language is a symbolic language, deliberately designed with the breadth and unity needed to develop powerful programs quickly. By integrating high-level forms—like Image, GeoPolygon or Molecule—along with advanced superfunctions—such as ImageIdentify or ApplyReaction—Wolfram Language makes it possible to quickly express complex ideas in computational form.

Posts using Wolfram Language

Pari/GP

PARI/GP is a widely used computer algebra system designed for fast computations in number theory (factorizations, algebraic number theory, elliptic curves, modular forms, L functions…), but also contains a large number of other useful functions to compute with mathematical entities such as matrices, polynomials, power series, algebraic numbers etc., and a lot of transcendental functions.

Posts using Pari/GP

See all software used on wildpeaches →