Why does the square of the imaginary unit i become -1?
2018/10/9
Published 2014/6/29
Koji Sugiyama
I explain why the square of the imaginary unit i becomes -1.
Figure 29: A rotation matrix rotates 90 degrees counterclockwise.
Euler introduced the symbol i for the imaginary unit in 1777 and gave the following formula.
i^{2} = -1
This is a very mysterious formula. Why does the square of a number become minus one?
Euler, Gauss, and Hamilton solved the mystery of this formula. This article is for readers who want to know its secret.
The points are as follows.
(1) We interpret the imaginary unit as an abbreviation of a matrix.
(2) We interpret the imaginary unit as a 90-degree turn of a picture.
(3) We interpret minus one as a 180-degree turn of a picture.
(4) When we rotate the picture by 90 degrees twice, the result is a 180-degree turn.
To understand the reason that an imaginary number is interpreted as an abbreviation of a matrix, I would like to explain the geometrical interpretation of a matrix in this article.
Contents
2.1. Matrix as the abbreviation of linear transformation
2.2. Operation rule of a matrix
2.3. Circle transformation diagram of a matrix
3.1. Complex numbers as an abbreviation of the matrices
5.1.1. What is the transposed matrix?
5.1.2. What is the symmetric matrix?
5.1.3. What is the determinant?
5.1.4. What are the eigenvectors and the eigenvalues?
5.1.5. What is the characteristic equation?
5.1.6. What is the singular value decomposition?
5.2.2. What are imaginary numbers?
5.2.3. What are complex numbers?
5.3.1. What is the complex conjugate?
5.3.2. What is the adjoint matrix?
5.3.3. What is the self-adjoint matrix?
6.1.1. What are the quaternions?
6.1.2. What is the quaternion conjugate?
6.1.3. What is the inner product?
6.1.4. What is the cross product?
6.2.1. Geometrical interpretation of complex numbers
6.2.2. The inner product of complex numbers
6.2.3. The cross product of complex numbers
In this article, I explain why the following formula holds.
(Equation of the imaginary unit)
i^{2} = -1
(1.1)
To understand a matrix, I would like to consider transformation of a picture.
We move a point (x, y) to a point (x', y') as follows.
Figure 21: A matrix moves a point (x, y) to a point (x', y').
We express this transformation by the following expressions.
x' = ax + by, \quad y' = cx + dy
(2.1)
This transformation is called a linear transformation.
Written out in full, a linear transformation is cumbersome. To write it more simply, we use the following abbreviation.
\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
(2.2)
We call this abbreviation of the linear transformation a matrix.
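This matrix moves the point (1, 0) and the point (0, 1) as follows.
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix}
(2.3)
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} b \\ d \end{pmatrix}
(2.4)
The destination of the point (1, 0) is the coordinate (a, c), which is formed from the entries a and c in the first column of the matrix.
The destination of the point (0, 1) is the coordinate (b, d), which is formed from the entries b and d in the second column of the matrix.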
Figure 22: A movement destination of a point (1, 0) becomes the component (a, c) of the matrix and a movement destination of a point (0, 1) becomes the component (b, d) of the matrix.
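The column rule above can be checked with a small sketch. The function name apply and the sample entries a = 2, b = 1, c = 0, d = 3 are illustrative assumptions, not values from this article.

```python
def apply(m, p):
    """Apply the 2x2 matrix m = ((a, b), (c, d)) to the point p = (x, y)."""
    (a, b), (c, d) = m
    x, y = p
    # x' = ax + by, y' = cx + dy
    return (a * x + b * y, c * x + d * y)

# Arbitrary sample matrix (assumed values, for illustration only).
M = ((2, 1), (0, 3))

first_column = apply(M, (1, 0))   # destination of (1, 0), expected (a, c)
second_column = apply(M, (0, 1))  # destination of (0, 1), expected (b, d)
print(first_column, second_column)
```

The two destinations reproduce the first and second columns of the matrix, so drawing these two points is enough to see what the matrix does to the whole picture.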
I would like to derive the operation rule of a matrix from the linear transformation.
A linear transformation is as follows.


(2.5) 
We will transform the above point (x', y') once again.


(2.6) 
Next, we substitute the formula (2.5) into the formula (2.6). Then, we obtain the following formula.


(2.7) 
Formulas (2.5), (2.6), and (2.7) are cumbersome. To write them more simply, we use the following abbreviations.


(2.8) 


(2.9) 


(2.10) 
On the other hand, we interpret the left side of equation (2.8) as a single value and substitute it into formula (2.9). Then, we obtain the following formula.


(2.11) 
We compare the formula (2.10) and the formula (2.11). Then, we obtain the following formula.


(2.12) 
We define the operation rule of the matrix by the above equation.
The British mathematician Arthur Cayley defined this operation rule of the matrix in 1858. [1]
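The derivation above says that the product of two matrices is the matrix of "transform once, then transform again." The following sketch checks this numerically. The names apply and matmul, the letters e, f, g, h for the second matrix, and the sample values are assumptions made for illustration.

```python
def apply(m, p):
    """Apply the 2x2 matrix m = ((a, b), (c, d)) to the point p = (x, y)."""
    (a, b), (c, d) = m
    x, y = p
    return (a * x + b * y, c * x + d * y)

def matmul(n, m):
    """Product n*m: the single matrix equal to 'apply m first, then n'."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((e * a + f * c, e * b + f * d),
            (g * a + h * c, g * b + h * d))

A = ((1, 2), (3, 4))    # first transformation (sample values)
B = ((0, 1), (5, -2))   # second transformation (sample values)
p = (7, -3)

two_steps = apply(B, apply(A, p))   # transform twice
one_step = apply(matmul(B, A), p)   # transform once by the product
print(two_steps, one_step)
```

Both routes give the same point, which is exactly the comparison that leads to the operation rule of the matrix.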
It is difficult to draw a matrix directly, because a matrix transforms the whole picture.
Therefore, we observe how the matrix transforms the unit circle.
For example, we consider the following matrix.


(2.13) 
How does this matrix transform the following unit circle?
x^{2} + y^{2} = 1
(2.14)
The result is the following ellipse.
Figure 23: Circle transformation diagram of a matrix
The blue circle mark ● means the movement destination of a point (1, 0). The red triangle mark ▲ means the movement destination of a point (0, 1). The dotted lines are the major axis and the minor axis of the ellipse. In this article, we call a diagram like this a "circle transformation diagram."
Judging from the above figure, the major axis of the ellipse seems to be an eigenvector. However, this is not always so. In general, the major axis differs from an eigenvector, as the following figure shows.
Figure 24: The major axis of ellipse is different from an eigenvector of a matrix
We are able to express a general matrix by the following four kinds of operation, according to the singular value decomposition described later.
Figure 25: We are able to decompose a matrix to right rotation, horizontal expansion, vertical expansion, and left rotation.
We are able to express a matrix by rotation and expansion. Therefore, I explain the rotation and the expansion in the following sections.
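We consider the following matrix.
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
(2.15)
This is called the rotation matrix. We denote the rotation matrix as follows.
R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
(2.16)
The circle transformation diagram is shown below.
Figure 26: Rotational matrix
We express the right rotation as follows.
R(-\theta) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
(2.17)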
We express the matrix that doubles lengths in the x-axis direction as follows.
\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}
(2.18)
In this article, we call this matrix the horizontal expanding matrix.
The circle transformation diagram is shown below.
Figure 27: Horizontal expanding matrix expands the picture horizontally.
We are able to make a matrix that expands the picture vertically by the same method as for the horizontal expanding matrix. In this article, we call this matrix the vertical expanding matrix.
We consider the following matrix.


(2.19) 
We express this matrix as a real multiple of the unit matrix.
In this article, we call this matrix the expanding matrix.


(2.20) 


(2.21) 
In this article, we call the unit matrix the real number unit matrix or the expanding unit matrix.
If the vertical and horizontal expansion rates of a transformation are the same, we express the transformation by the following matrix.


(2.22) 
The circle transformation diagram is shown below.
Figure 28: An expanding matrix expands equally.
We consider the following matrix.
\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
(2.23)
In this article, we call this matrix the imaginary number unit matrix or the rotational unit matrix.
We express the turn of 90 degrees by the following matrix.


(2.24) 
The circle transformation diagram is shown below.
Figure 29: A rotation matrix rotates 90 degrees counterclockwise.
We are able to decompose any matrix to the following transformations.
(1) Right rotation
(2) Horizontal expansion
(3) Vertical expansion
(4) Left rotation
This decomposition is called the singular value decomposition (SVD).
If the horizontal expansion rate is equal to the vertical expansion rate, the rotation and the expansion commute. In that case, we are able to decompose the matrix into the following transformations.
(1) Rotation
(2) Expansion
This commutative property is the origin of the commutative property of complex numbers, which will be described in the following section.
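The commutative property can be observed with a short sketch. It compares "rotate then expand" with "expand then rotate" for an equal expansion and for an unequal one. The helper names, the 30-degree angle, and the expansion rates are assumptions chosen for illustration.

```python
import math

def matmul(n, m):
    """Product n*m of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((e * a + f * c, e * b + f * d),
            (g * a + h * c, g * b + h * d))

def rotation(theta):
    """Counterclockwise rotation matrix for the angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def close(m, n, tol=1e-12):
    """True if every entry of m and n agrees within tol."""
    return all(abs(x - y) <= tol for r, s in zip(m, n) for x, y in zip(r, s))

R = rotation(math.radians(30))
uniform = ((2.0, 0.0), (0.0, 2.0))    # same rate horizontally and vertically
stretched = ((2.0, 0.0), (0.0, 1.0))  # horizontal expansion only

uniform_commutes = close(matmul(R, uniform), matmul(uniform, R))
stretched_commutes = close(matmul(R, stretched), matmul(stretched, R))
print(uniform_commutes, stretched_commutes)
```

Only the equal expansion commutes with the rotation, which is why the order of the factors does not matter for complex numbers but does matter for general matrices.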
In order to understand why complex numbers can be interpreted as an abbreviation of matrices, we consider the following matrix.
I = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
(3.1)
In this article, we call this matrix the imaginary unit matrix, or the rotation unit matrix.
This matrix satisfies the following equation.
I^{2} = -E
(3.2)
E = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
(3.3)
In this article, we call the matrix E the expanding unit matrix.
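The central claim can be verified directly. In the sketch below, the symbol I for the imaginary unit matrix and the helper names are assumptions; the matrix itself is the 90-degree counterclockwise rotation.

```python
def matmul(n, m):
    """Product n*m of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((e * a + f * c, e * b + f * d),
            (g * a + h * c, g * b + h * d))

def apply(m, p):
    """Apply the 2x2 matrix m to the point p = (x, y)."""
    (a, b), (c, d) = m
    x, y = p
    return (a * x + b * y, c * x + d * y)

I = ((0, -1), (1, 0))         # 90-degree counterclockwise rotation
E = ((1, 0), (0, 1))          # expanding unit matrix
minus_E = ((-1, 0), (0, -1))  # 180-degree rotation

I_squared = matmul(I, I)
quarter_turn = apply(I, (1, 0))        # (1, 0) moves to (0, 1)
half_turn = apply(I, quarter_turn)     # (0, 1) moves to (-1, 0)
print(I_squared, quarter_turn, half_turn)
```

Applying the matrix twice sends (1, 0) to (-1, 0), and the squared matrix equals minus the expanding unit matrix.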
Here we define the following matrix.


(3.4) 


(3.5) 
In this article, we call this matrix Z the expanding rotation matrix. We use the following abbreviation for the matrix.


(3.6) 


(3.7) 
We call this abbreviation of the expanding rotation matrix the complex number.
The original matrix is called the "matrix representation of complex numbers."
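As a minimal sketch of this abbreviation, assume that the complex number a + bi stands for the matrix with rows (a, -b) and (b, a), that is, a times the expanding unit matrix plus b times the imaginary unit matrix. The helper names Z and from_complex are hypothetical. The code checks that multiplying the matrices agrees with multiplying the complex numbers.

```python
def matmul(n, m):
    """Product n*m of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((e * a + f * c, e * b + f * d),
            (g * a + h * c, g * b + h * d))

def Z(a, b):
    """Assumed matrix representation of a + bi."""
    return ((a, -b), (b, a))

def from_complex(z):
    """Matrix representation of a Python complex number."""
    return Z(z.real, z.imag)

z1 = complex(1, 2)
z2 = complex(3, -1)

product_of_matrices = matmul(from_complex(z1), from_complex(z2))
matrix_of_product = from_complex(z1 * z2)
reversed_product = matmul(from_complex(z2), from_complex(z1))
print(product_of_matrices, matrix_of_product)
```

The reversed product is the same matrix, which reflects the commutative property of complex numbers.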
Leonhard Euler [2] discovered the following formula in 1748.
(Euler's formula)
e^{i\theta} = \cos\theta + i\sin\theta
(3.8)
Matrix representation of complex numbers is as follows.


(3.9) 
We are able to express the right-hand side by the components as follows.


(3.10) 
The right-hand side is equal to the following rotation matrix.


(3.11) 
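The matrix form of Euler's formula can be checked numerically. The sketch below sums the exponential power series for the matrix "angle times imaginary unit matrix" and compares the result with the rotation matrix. The power-series approach, the helper names, the 60-degree angle, and the 30-term cutoff are assumptions made for illustration.

```python
import math

def matmul(n, m):
    """Product n*m of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((e * a + f * c, e * b + f * d),
            (g * a + h * c, g * b + h * d))

def add(m, n):
    """Entrywise sum of two matrices."""
    return tuple(tuple(x + y for x, y in zip(r, s)) for r, s in zip(m, n))

def scale(k, m):
    """Multiply every entry of m by the real number k."""
    return tuple(tuple(k * x for x in r) for r in m)

def exp_series(m, terms=30):
    """Partial sum E + m + m^2/2! + ... of the exponential series."""
    result = ((1.0, 0.0), (0.0, 1.0))
    term = result
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, m))  # term is now m^k / k!
        result = add(result, term)
    return result

theta = math.radians(60)
I = ((0.0, -1.0), (1.0, 0.0))
lhs = exp_series(scale(theta, I))
rhs = ((math.cos(theta), -math.sin(theta)),
       (math.sin(theta), math.cos(theta)))
error = max(abs(x - y) for r, s in zip(lhs, rhs) for x, y in zip(r, s))
print(error)
```

The difference between the two matrices stays at the level of floating-point rounding.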
In this article, we have explained the following.
 We are able to interpret complex numbers as the abbreviation of expanding rotation matrices.
Therefore, we are able to interpret the imaginary unit as the abbreviation of the imaginary unit matrix, that is, as a 90-degree rotation of the image. Likewise, we are able to interpret -1 as a 180-degree rotation of the image.
Two successive 90-degree rotations are equal to a 180-degree rotation. This is the geometrical reason why the square of the imaginary unit becomes -1.
The transposed matrix is the matrix obtained by exchanging the rows and the columns of a matrix.


(5.1) 


(5.2) 
We may also express the transposed matrix as follows.


(5.3) 
However, this article will adopt the following notation.


(5.4) 
What is the difference between the transposed matrix and the original matrix? We introduce the following matrix in order to consider it.


(5.5) 
We interpret the transposed matrix as follows.
Figure 51: The transposed matrix
In the above figure, the migration rate from x-city to y-city is not equal to the migration rate from y-city to x-city. The operation that swaps these migration rates is transposition.
If these migration rates are the same, the matrix becomes the symmetric matrix that will be described below.
If the transposed matrix is equal to the original matrix, the matrix is called the symmetric matrix.


(5.6) 
The following matrix is the symmetric matrix.


(5.7) 
In the case of the symmetric matrix, the major axis of the circle transformation diagram is equal to the eigenvector.
Figure 52: In the case of the symmetric matrix, the major axis of the circle transformation diagram is equal to the eigenvector.
Augustin-Louis Cauchy [3] proved in 1829 that all eigenvalues of a symmetric matrix are real numbers.
We are able to interpret a determinant as the extension rate of an area by a matrix.
We express the determinant as follows.


(5.8) 
Augustin-Louis Cauchy [4] defined the determinant in 1815.
We are able to interpret the determinant of a two-dimensional matrix as an area expansion rate.
We are able to interpret the determinant of a three-dimensional matrix as a volume expansion rate.
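The area interpretation can be illustrated with a small sketch. It transforms the unit square and compares the area of the image with the determinant ad - bc. The shoelace formula for the area, the helper names, and the sample matrix are assumptions made for illustration.

```python
def apply(m, p):
    """Apply the 2x2 matrix m = ((a, b), (c, d)) to the point p = (x, y)."""
    (a, b), (c, d) = m
    x, y = p
    return (a * x + b * y, c * x + d * y)

def det(m):
    """Determinant ad - bc of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c

def signed_area(points):
    """Signed area of a polygon by the shoelace formula."""
    total = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2

square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # unit square, area 1
M = ((2, 1), (0, 3))                       # sample matrix (assumed values)
image = [apply(M, p) for p in square]      # a parallelogram

area_before = signed_area(square)
area_after = signed_area(image)
print(area_before, area_after, det(M))
```

The area of the image divided by the area of the original square equals the determinant of the matrix.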
Eigenvectors are vectors whose direction does not change by transformation of a matrix. Eigenvalues are expansion rates of the eigenvectors.
In the case of the symmetric matrix, the major axis of the circle transformation diagram is equal to the eigenvector.
Figure 53: In the case of the symmetric matrix, the major axis of the circle transformation diagram is equal to the eigenvector.
A characteristic equation is an equation we use in order to obtain the eigenvalues of a matrix. The characteristic equation is shown below.
\det(A - \lambda E) = 0
(5.9)
We express the characteristic equation by the components as follows.
\begin{vmatrix} a - \lambda & b \\ c & d - \lambda \end{vmatrix} = 0
(5.10)
Why are we able to derive the eigenvalues from the characteristic equation? In order to understand the reason, we observe how the matrix A - λE changes as we vary the variable λ.
Figure 54: A mechanism that we are able to derive the eigenvalues by the characteristic equation.
The black ellipse is the circle that is transformed by the matrix A. The red circle is the circle that is transformed by the matrix λE. The blue ellipse is the circle that is transformed by the matrix A - λE. The red arrows and the green arrows are eigenvectors with the length of the eigenvalues. When the radius of the red circle is equal to the length of one of these arrows, the determinant is equal to zero.
If the area of the blue ellipse is zero, the area expansion rate is zero. Therefore, we are able to interpret the characteristic equation as a search, by varying the variable λ, for the values of λ at which the area expansion rate becomes zero.
In the case of the symmetric matrix, an eigenvector is equal to the major axis of the ellipse. Therefore, the radius of the red circle is equal to the length of one of these arrows when the red circle is in contact with the black ellipse.
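The search for λ can be sketched for a two-dimensional matrix. Expanding the determinant of A - λE gives a quadratic equation in λ with the trace and the determinant of A as coefficients, and the code solves it. The function names and the sample symmetric matrix are assumptions made for illustration.

```python
import math

def det(m):
    """Determinant ad - bc of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c

def eigenvalues(m):
    """Real roots of lambda^2 - (a + d) lambda + (ad - bc) = 0."""
    (a, b), (c, d) = m
    trace = a + d
    disc = trace * trace - 4 * det(m)
    if disc < 0:
        raise ValueError("the eigenvalues are not real")
    root = math.sqrt(disc)
    return ((trace - root) / 2, (trace + root) / 2)

def shifted(m, lam):
    """The matrix A - lambda E."""
    (a, b), (c, d) = m
    return ((a - lam, b), (c, d - lam))

A = ((2, 1), (1, 2))  # sample symmetric matrix (assumed values)
low, high = eigenvalues(A)
# The area expansion rate of A - lambda E vanishes at each eigenvalue.
rates = (det(shifted(A, low)), det(shifted(A, high)))
print(low, high, rates)
```

At each eigenvalue the blue ellipse of the figure collapses to a line segment, so its area expansion rate is zero.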
A singular value decomposition is the following decomposition.
 We are able to decompose any matrix M into a rotation matrix R(θ), a diagonal matrix D, and another rotation matrix.


(5.11) 
We are able to express a general matrix by the following four kinds of operation by the singular value decomposition.
Figure 55: We are able to decompose a matrix to right rotation, horizontal expansion, vertical expansion, and left rotation.
The horizontal expansion rate and vertical expansion rate are called the singular values. They are generally not equal to the eigenvalues. In the case of the symmetric matrix, horizontal expansion rate and vertical expansion rate are eigenvalues.
Eugenio Beltrami [5] and Camille Jordan [6] discovered the singular value decomposition independently, in 1873 and 1874 respectively.
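The difference between singular values and eigenvalues can be seen in a short sketch. It uses the standard identity that the singular values of M are the square roots of the eigenvalues of the transposed matrix times M. This identity, the helper names, and the sample matrices are not taken from this article and are assumptions made for illustration.

```python
import math

def transpose(m):
    """Exchange the rows and the columns of a 2x2 matrix."""
    (a, b), (c, d) = m
    return ((a, c), (b, d))

def matmul(n, m):
    """Product n*m of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((e * a + f * c, e * b + f * d),
            (g * a + h * c, g * b + h * d))

def symmetric_eigenvalues(m):
    """Eigenvalues of a symmetric 2x2 matrix (always real)."""
    (a, b), (c, d) = m
    trace = a + d
    root = math.sqrt(trace * trace - 4 * (a * d - b * c))
    return ((trace - root) / 2, (trace + root) / 2)

def singular_values(m):
    """Square roots of the eigenvalues of transpose(m) * m."""
    low, high = symmetric_eigenvalues(matmul(transpose(m), m))
    return (math.sqrt(max(low, 0.0)), math.sqrt(high))

shear = ((1, 1), (0, 1))      # both eigenvalues are 1
symmetric = ((2, 1), (1, 2))  # eigenvalues are 1 and 3

shear_rates = singular_values(shear)
symmetric_rates = singular_values(symmetric)
print(shear_rates, symmetric_rates)
```

For the shear matrix the expansion rates are about 0.618 and 1.618 although both eigenvalues are 1, while for the symmetric matrix the expansion rates coincide with the eigenvalues.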
A real number is an abbreviation of "a real multiple of the expanding unit matrix."
The expanding unit matrix is as follows.


(5.12) 
We use the following abbreviation 1 to express this matrix.


(5.13) 
We express the matrix of the real numbers as follows.


(5.14) 
We use the following abbreviation to express this matrix of the real numbers.


(5.15) 
An imaginary number is an abbreviation of "a real multiple of the rotation unit matrix."
The rotation unit matrix is as follows.


(5.16) 
We use the following abbreviation i to express this matrix.


(5.17) 
We express the matrix of the imaginary numbers as follows.


(5.18) 
We use the following abbreviation to express this matrix of the imaginary numbers.


(5.19) 
A complex number is an abbreviation of "the expanding rotation matrix."
The expanding rotation matrix is as follows.


(5.20) 
We express the matrix of the complex numbers as follows.


(5.21) 


(5.22) 


(5.23) 
We use the following abbreviation to express this matrix of the complex numbers.


(5.24) 
The complex conjugate is obtained by changing the sign of the imaginary part of a complex number.
Now, we express a complex number as follows.
z = a + bi
(5.25)
In mathematics, the complex conjugate is written as follows.
\bar{z} = a - bi
(5.26)
On the other hand, in physics the complex conjugate is written as follows.
z^{*} = a - bi
(5.27)
This article adopts the following notation of mathematics.
\bar{z}
(5.28)
In the matrix representation of complex numbers, the complex conjugate is expressed by the following transposed matrix.


(5.29) 


(5.30) 
We are able to interpret the complex conjugate as the transposed matrix. Therefore, we need to change any complex number to the complex conjugate when we transpose any matrix.
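A minimal sketch of this correspondence follows, assuming that a + bi is represented by the matrix with rows (a, -b) and (b, a). The helper names Z and transpose and the sample value 3 + 4i are assumptions made for illustration.

```python
def Z(a, b):
    """Assumed matrix representation of the complex number a + bi."""
    return ((a, -b), (b, a))

def transpose(m):
    """Exchange the rows and the columns of a 2x2 matrix."""
    (a, b), (c, d) = m
    return ((a, c), (b, d))

a, b = 3, 4
transposed = transpose(Z(a, b))   # transpose the matrix of 3 + 4i
conjugated = Z(a, -b)             # matrix of the conjugate 3 - 4i
print(transposed, conjugated)
```

Transposing the matrix only swaps the two off-diagonal entries, which is the same as changing the sign of the imaginary part.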
The adjoint matrix is the transposed matrix obtained by taking the complex conjugate of each entry.
This matrix is also known by the following names.
 Hermitian conjugate
 Hermitian transpose
 Hermitian adjoint
 Conjugate transpose
Now, we express a matrix as follows.


(5.31) 
In mathematics, the adjoint matrix is written as follows.


(5.32) 
In physics, the adjoint matrix is written as follows.


(5.33) 
This article adopts the following notation of mathematics.


(5.34) 
We define the following matrix whose entries are the matrix representations of the complex numbers A, B, C, and D.


(5.35) 
In the case of a matrix whose entries are matrix representations of complex numbers, the adjoint matrix is the transposed matrix.


(5.36) 
If the adjoint matrix is equal to the original matrix, the matrix is called a self-adjoint matrix.


(5.37) 
This matrix is also known as a Hermitian matrix.
Charles Hermite [7] showed in 1855 that all the eigenvalues of a Hermitian matrix are real numbers.
We define the following matrix whose entries are the matrix representations of the complex numbers A, B, C, and D.


(5.38) 
In the case of a matrix whose entries are matrix representations of complex numbers, a Hermitian matrix is a symmetric matrix.


(5.39) 
Hamilton introduced quaternions in 1843. We are able to interpret a quaternion as an abbreviation of the following matrix.


(6.1) 
We express the matrix of a quaternion as follows.


(6.2) 


(6.3) 


(6.4) 


(6.5) 


(6.6) 
We use the following abbreviation to express a quaternion.


(6.7) 
The quaternion conjugate is obtained by changing the sign of the imaginary part of the quaternion.


(6.8) 


(6.9) 
Physically, it means space reflection.
In the matrix representation of the quaternions, the quaternion conjugate is the adjoint matrix.


(6.10) 


(6.11) 
The inner product of the three-dimensional vector u and the three-dimensional vector v is defined as follows.


(6.12) 


(6.13) 


(6.14) 
But why is the inner product defined as described above?
In order to understand the reason, I would like to consider the following division of the quaternion q by the quaternion p.


(6.15) 
We define the following rotation by the quaternion R.


(6.16) 
We rotate the quaternion p to p'.


(6.17) 
We rotate the quaternion q to q'.


(6.18) 
The quaternion s is invariant for the rotation R.


(6.19) 
We modify the quaternion s as follows.


(6.20) 
On the other hand, we are able to express the quaternion s as follows.


(6.21) 
The rotation R does not change the absolute values of the quaternions p and q. Therefore, the quaternion r is invariant for the rotation R.
We make the following equation by the formula (6.20) and (6.21).


(6.22) 
We modify the above equation as follows.


(6.23) 
The absolute values of the quaternions p and q are invariant for the rotation R.
The quaternion r is invariant for the rotation R, too.
Therefore, the left-hand side is invariant for the rotation R.
In this article, we call the left-hand side the rotation invariant product. We are able to confirm that the rotation invariant product is invariant for rotation as follows.


(6.24) 


(6.25) 


(6.26) 
The real part and the imaginary part of the rotation invariant product are invariant for the rotation.
We are able to interpret the inner product of three-dimensional vectors as the real part of the rotation invariant product of pure imaginary quaternions.


(6.27) 


(6.28) 


(6.29) 


(6.30) 
The cross product of the three-dimensional vector u and the three-dimensional vector v is defined as follows.


(6.31) 


(6.32) 


(6.33) 
But why is the cross product defined as described above?
In fact, we are able to interpret the cross product of three-dimensional vectors as a sign inversion of the imaginary part of the rotation invariant product of pure imaginary quaternions.


(6.34) 


(6.35) 


(6.36) 


(6.37) 
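The two interpretations above can be checked with a sketch. It assumes that the rotation invariant product of p and q is the conjugate of p multiplied by q, which is an assumption here because the formula is not reproduced in this text. The quaternions are stored as tuples (real, i, j, k), and the helper names and sample vectors are also assumptions.

```python
def qmul(p, q):
    """Hamilton product of two quaternions stored as (real, i, j, k)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qconj(p):
    """Quaternion conjugate: change the sign of the imaginary part."""
    a, b, c, d = p
    return (a, -b, -c, -d)

def dot(u, v):
    """Inner product of three-dimensional vectors."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Cross product of three-dimensional vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u = (1, 2, 3)
v = (4, 5, 6)
p = (0,) + u   # pure imaginary quaternion for u
q = (0,) + v   # pure imaginary quaternion for v

product = qmul(qconj(p), q)               # assumed rotation invariant product
real_part = product[0]
negated_imaginary = tuple(-x for x in product[1:])
print(real_part, negated_imaginary)
```

Under this assumption the real part equals the inner product of u and v, and the sign-inverted imaginary part equals their cross product.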
The Norwegian surveyor Caspar Wessel published the complex plane in 1797. The complex plane is the plane on which complex numbers are expressed geometrically.
Figure 61: The complex number a+bi exists as a point (a, b) in the complex plane.
The French mathematician Jean-Robert Argand also published the complex plane in 1806. The German mathematician Carl Friedrich Gauss used the complex plane in 1831.
Hamilton expressed the complex number a + bi as (a, b), and defined addition and multiplication as follows.
(a, b) + (c, d) = (a + c, b + d)
(6.38)
(a, b)(c, d) = (ac - bd, ad + bc)
(6.39)
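Hamilton's pair notation can be tried out with a few lines of code. The function names add and mul are assumptions; the rules themselves are the usual addition and multiplication of ordered pairs of real numbers.

```python
def add(p, q):
    """Sum of two complex numbers written as pairs."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    """Product of two complex numbers written as pairs."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

one = (1, 0)   # the real number 1
i = (0, 1)     # the imaginary unit

i_squared = mul(i, i)
sample_sum = add((1, 2), (3, -1))
sample_product = mul((1, 2), (3, -1))
print(i_squared, sample_sum, sample_product)
```

In this notation the square of the pair (0, 1) is the pair (-1, 0), so no mysterious symbol is needed to state that the square of the imaginary unit is -1.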
The inner product of the two-dimensional vector u and the two-dimensional vector v is defined as follows.


(6.40) 


(6.41) 


(6.42) 
But why is the inner product defined as described above?
In order to understand the reason, I would like to consider the following division of the complex number q by the complex number p.


(6.43) 
We define the following rotation by the complex number R.


(6.44) 
We rotate the complex number p to p'.


(6.45) 
We rotate the complex number q to q'.


(6.46) 
The complex number s is invariant for the rotation R.


(6.47) 
We modify the complex number s as follows.


(6.48) 
On the other hand, we are able to express the complex number s as follows.


(6.49) 
The rotation R does not change the absolute values of the complex numbers p and q. Therefore, the complex number r is invariant for the rotation R.
We make the following equation from formulas (6.48) and (6.49).


(6.50) 
We modify the above equation as follows.


(6.51) 
The absolute values of the complex numbers p and q are invariant for the rotation R.
The complex number r is invariant for the rotation R, too.
Therefore, the left-hand side is invariant for the rotation R.
In this article, we call the left-hand side the rotation invariant product. We are able to confirm that the rotation invariant product is invariant for rotation as follows.


(6.52) 


(6.53) 


(6.54) 
The real part and the imaginary part of the rotation invariant product are invariant for the rotation.
We are able to interpret the inner product of two-dimensional vectors as the real part of the rotation invariant product of complex numbers.


(6.55) 


(6.56) 


(6.57) 


(6.58) 
The cross product of the two-dimensional vector u and the two-dimensional vector v is defined as follows.


(6.59) 


(6.60) 


(6.61) 
But why is the cross product defined as described above?
In fact, we are able to interpret the cross product of two-dimensional vectors as a sign inversion of the imaginary part of the rotation invariant product of complex numbers.


(6.62) 


(6.63) 


(6.64) 


(6.65) 
I sincerely thank NS, who gave me valuable advice in writing this paper.
[1]
Cayley, Arthur, "A Memoir on the Theory of Matrices," Philosophical Transactions of the Royal Society of London, vol. 148, pp. 17-37, 1858.
[2]
Euler, Leonhard, "Introduction to the Analysis of the Infinite (Introductio in analysin infinitorum)," Opera Omnia, vol. 1, no. 8, pp. 92-105, 1748.
[3]
Cauchy, Augustin-Louis, "On the equation which helps one determine the secular inequalities in the movements of the planets (Sur l'equation a l'aide de laquelle on determine les inegalites seculaires)," Oeuvres Completes, vol. 2, no. 9, pp. 174-195, 1829.
[4]
Cauchy, Augustin-Louis, "Memoire sur les fonctions qui ne peuvent obtenir que deux valeurs egales et des signes contraires par suite des transpositions operees entre les variables qu'elles renferment," J Ecole Polytech, vol. 10, pp. 29-112, 1815.
[5]
Beltrami, Eugenio, "Sulle funzioni bilineari," Giornale di Matematiche ad Uso degli Studenti Delle Universita, vol. 11, pp. 98-106, 1873.
[6]
Jordan, Camille, "Memoire sur les formes bilineaires," Journal de Mathematiques Pures et Appliquees, Deuxieme Serie, vol. 19, pp. 35-54, 1874.
[7]
Hermite, Charles, "Remarque sur un theoreme de M. Cauchy," C.R. Acad. Sci. Paris, vol. 41, pp. 181-183, 1855.
[8]
Descartes, René, "La Géométrie," Discours de la méthode, pp. 376-493, 1637.
[9]
Sylvester, James Joseph, "On the relation between the minor determinants of linearly equivalent quadratic functions," Philos. Magazine, vol. I, no. 4, pp. 295-305, 1851.