9.3 Non-Abelian Yang-Mills theory
After our brief tour through the world of Lie groups and Lie algebras, we are now equipped with the necessary technology to introduce one of the main players of the standard model of particle physics: non-Abelian Yang-Mills theory. This is a non-Abelian generalization, associated to a choice of Lie group \(G\), of the theory of the electromagnetic potential from Chapter 6. To understand better what happens below, please make sure that you review the electromagnetic potential and its gauge symmetries from Section 6.1 before you continue reading the current section. To simplify our presentation, we shall consider only the cases of unitary groups \(\U (n)\) and special unitary groups \(\SU (n)\) in the sense of Example 9.2. So that we do not get lost in confusing notation and case distinctions, we shall uniformly write \(G\) for any of these choices, i.e. in what follows the symbol \(G\) stands either for \(\U (n)\) or \(\SU (n)\) and the symbol \(\g \) stands for the corresponding Lie algebra from Example 9.4.
A Yang-Mills field is defined as a covector field
\begin{equation} A\,:\,\bbR ^d\longrightarrow \bbR ^d\otimes \g ~,~~x\longmapsto A(x) \end{equation}
on the \(d\)-dimensional Minkowski spacetime \((\bbR ^d,\eta )\) that takes values in the Lie algebra \(\g \) of the chosen Lie group \(G\). Using our typical index notation \(A_\mu \) for the components of covectors, this means that each component
\begin{equation} A_\mu (x) = A_\mu ^a(x)\,X_a\in \g \end{equation}
takes values in the Lie algebra, hence it can be expanded as in Remark 9.5 by picking a basis \(\{X_a\in \g \}\). (From this we see that the number of components of a Yang-Mills field depends on the dimension of the underlying Lie algebra \(\g \).) For our examples given by \(\g =\mathfrak {u}(n)\) or \(\g =\mathfrak {su}(n)\) from Example 9.4, this means that \(A_\mu (x)\) is an anti-Hermitian \(n\times n\)-matrix that is also trace-free in the case of \(\g =\mathfrak {su}(n)\).
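As a concrete illustration, consider \(\g =\mathfrak {su}(2)\). The following sketch assumes the basis \(X_a = -\frac {\ii }{2}\,\sigma _a\), for \(a=1,2,3\), built from the Pauli matrices \(\sigma _a\); the sign and normalization are assumptions made here for illustration and may differ from the rescaled Pauli matrices of Remark 9.5. Under this assumption, each component of the Yang-Mills field reads
\begin{flalign*} A_\mu (x) \,=\, A_\mu ^a(x)\,X_a \,=\, -\frac {\ii }{2}\,\begin{pmatrix} A_\mu ^3(x) & A_\mu ^1(x) - \ii \,A_\mu ^2(x)\\ A_\mu ^1(x) + \ii \,A_\mu ^2(x) & -A_\mu ^3(x) \end{pmatrix}\quad , \end{flalign*}
which is manifestly anti-Hermitian and trace-free for real-valued coefficient functions \(A_\mu ^a(x)\). Counting components, a Yang-Mills field on \(\bbR ^d\) consists of \(d\cdot \dim \g \) real-valued functions, with \(\dim \mathfrak {su}(n) = n^2-1\) and \(\dim \mathfrak {u}(n)=n^2\). For \(\mathfrak {su}(2)\) in \(d=4\) this gives \(4\cdot 3 = 12\) functions.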
The gauge transformations of Yang-Mills theory are labeled by functions
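\(\seteqnumber{0}{9.}{34}\)\begin{flalign} U\,:\bbR ^d\longrightarrow G~,~~x\longmapsto U(x) \end{flalign} from the Minkowski spacetime to the Lie group \(G\), and they act on Yang-Mills fields according to
\(\seteqnumber{0}{9.}{35}\)\begin{flalign} (T_U A)_\mu (x) \,=\, U(x)\,A_\mu (x)\,U^{-1}(x) \,+\, U(x)\,\partial _\mu U^{-1}(x)\quad . \end{flalign}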
To simplify notation, we shall often suppress the arguments \(x\in \bbR ^d\) and simply write this as
\(\seteqnumber{0}{9.}{36}\)\begin{flalign} (T_U A)_\mu \,=\, U\,A_\mu \,U^{-1} \,+\, U\,\partial _\mu U^{-1}\quad . \end{flalign} It is, however, important to keep in mind that all quantities in this expression depend on the spacetime point \(x\in \bbR ^d\), and in particular the gauge transformations are local (i.e. \(x\)-dependent) transformations. We recognize that the first term \(U\,A_\mu \,U^{-1}\) is given by the adjoint representation of \(G\) on the Lie algebra \(\g \), see Example 9.7, but the second term is new and it relies on the fact that \(U(x)\) is a function on spacetime. One can check that the transformation law (9.36) is a (non-linear, but affine) group representation, which means that \((T_{(U^\prime U)} A)_\mu = \big (T_{U^\prime }(T_U A)\big )_\mu \) for all gauge transformations \(U\) and \(U^\prime \). For completeness, let us do the relevant calculation:
\(\seteqnumber{0}{9.}{37}\)\begin{flalign} \nn \big (T_{U^\prime }(T_U A)\big )_\mu &= U^\prime \,(T_U A)_\mu \,U^{\prime \,-1} + U^\prime \,\partial _\mu U^{\prime \,-1}= U^\prime \,\Big (U \,A_\mu \,U^{-1} + U\,\partial _\mu U^{-1}\Big )\, U^{\prime \,-1} + U^\prime \,\partial _\mu U^{\prime \,-1}\\[4pt] \nn &=(U^\prime \,U)\,A_\mu \,(U^\prime \,U)^{-1} + (U^\prime \,U)\,\big (\partial _\mu U^{-1}\big )\,U^{\prime \,-1} + (U^\prime \,U)\,U^{-1}\,\partial _\mu U^{\prime \,-1}\\[4pt] &=(U^\prime \,U)\,A_\mu \,(U^\prime \,U)^{-1} + (U^\prime \,U)\,\partial _\mu (U^\prime \,U)^{-1} = (T_{(U^\prime U)} A)_\mu \quad , \end{flalign} where in the third line we used the Leibniz rule \(\partial _\mu (U^\prime \,U)^{-1} = \partial _\mu \big (U^{-1}\,U^{\prime \,-1}\big ) = \big (\partial _\mu U^{-1}\big )\,U^{\prime \,-1} + U^{-1}\,\partial _\mu U^{\prime \,-1}\) for partial derivatives.
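To connect with Section 6.1, it may help to specialize to the Abelian case \(G=\U (1)\). As a sketch, assume the parametrization \(U(x) = e^{\ii \,\epsilon (x)}\) and \(A_\mu (x) = \ii \,a_\mu (x)\) with real-valued functions \(\epsilon \) and \(a_\mu \); these symbols are introduced here for illustration only. All quantities then commute and the transformation law reduces to
\begin{flalign*} (T_U A)_\mu \,=\, A_\mu + e^{\ii \,\epsilon }\,\partial _\mu e^{-\ii \,\epsilon } \,=\, \ii \,\big (a_\mu - \partial _\mu \epsilon \big )\quad , \end{flalign*}
i.e. \(a_\mu \) is shifted by the gradient of a function, as for the electromagnetic potential (possibly up to the sign convention for \(\epsilon \) used in Section 6.1). In the non-Abelian case, a similar structure is visible for gauge transformations close to the identity. Writing \(U = \exp (\xi )\) with a \(\g \)-valued function \(\xi \) (again a symbol chosen here for illustration) and keeping only terms of first order in \(\xi \), one finds
\begin{flalign*} (T_U A)_\mu \,=\, A_\mu - \partial _\mu \xi + [\xi ,A_\mu ] + \mathcal {O}(\xi ^2)\quad , \end{flalign*}
where the commutator term is the infinitesimal version of the adjoint action and has no counterpart in electromagnetism.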
The field strength tensor of non-Abelian Yang-Mills theory is slightly more complicated than its analog in electromagnetism. The naive idea to consider simply the antisymmetrized partial derivative \(\partial _\mu A_\nu - \partial _\nu A_\mu \) does not work for non-Abelian Lie groups \(G\) because it has an unpleasant transformation behavior under gauge transformations. This issue can be resolved by making use of the (in general non-trivial) Lie bracket on \(\g \) to define
\(\seteqnumber{0}{9.}{38}\)\begin{flalign} F_{\mu \nu } \,:=\, \partial _\mu A_\nu - \partial _\nu A_\mu + [A_\mu ,A_\nu ]\quad . \end{flalign}
By construction, the field strength tensor takes values in the Lie algebra \(F_{\mu \nu }(x)\in \g \) and it is antisymmetric in the two covector indices \(F_{\mu \nu }(x) = - F_{\nu \mu }(x)\). Expanding the field strength tensor \(F_{\mu \nu }=F_{\mu \nu }^a\,X_a\) in a basis \(\{X_a\in \g \}\) of the Lie algebra one finds
\(\seteqnumber{0}{9.}{39}\)\begin{flalign} F_{\mu \nu } = \partial _\mu A_\nu ^a\,X_a - \partial _\nu A_\mu ^a\,X_a + A_\mu ^b\,A_\nu ^c\,[X_b,X_c] = \Big (\partial _\mu A_\nu ^a - \partial _\nu A_\mu ^a +f_{bc}^a\, A_\mu ^b\,A_\nu ^c\Big )~X_a \quad , \end{flalign} where \(f_{bc}^a\) are the structure constants from Remark 9.5. The main difference between the Yang-Mills and the electromagnetic field strength tensor is that the former is a non-linear expression in the \(A\)’s, while the latter is linear. For the resulting QFT this implies that non-Abelian Yang-Mills fields have self-interactions, while photons do not. With a short calculation, which I leave for you as an exercise, one shows that \(F_{\mu \nu }\) transforms under gauge transformations according to
\(\seteqnumber{0}{9.}{40}\)\begin{flalign} (T_U F)_{\mu \nu } \,=\, U\,F_{\mu \nu }\,U^{-1}\quad , \end{flalign}
which we recognize as the adjoint representation from Example 9.7.
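If you would like to check your solution of this exercise, here is one possible route, given as a sketch rather than a complete proof. Write \(A^\prime _\mu := (T_U A)_\mu \) and use the identity \(U\,(\partial _\mu U^{-1})\,U = -\partial _\mu U\), which follows from applying \(\partial _\mu \) to \(U\,U^{-1} = 1\). Introducing the shorthand \(R_{\mu \nu } := (\partial _\mu U)\,A_\nu \,U^{-1} + U\,A_\nu \,\partial _\mu U^{-1} + (\partial _\mu U)\,(\partial _\nu U^{-1}) - (\mu \leftrightarrow \nu )\), a symbol used only in this sketch, one finds
\begin{flalign*} \partial _\mu A^\prime _\nu - \partial _\nu A^\prime _\mu \,&=\, U\,\big (\partial _\mu A_\nu - \partial _\nu A_\mu \big )\,U^{-1} + R_{\mu \nu }\quad ,\\[4pt] [A^\prime _\mu ,A^\prime _\nu ] \,&=\, U\,[A_\mu ,A_\nu ]\,U^{-1} - R_{\mu \nu }\quad , \end{flalign*}
where in the first line the second-derivative terms \(U\,\partial _\mu \partial _\nu U^{-1}\) drop out under antisymmetrization. Adding both lines, the terms \(R_{\mu \nu }\) cancel and only \(U\,F_{\mu \nu }\,U^{-1}\) remains. This also makes explicit why \(\partial _\mu A_\nu - \partial _\nu A_\mu \) alone is not sufficient in the non-Abelian case: it picks up the extra term \(R_{\mu \nu }\), which in general does not vanish.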
The construction of a gauge invariant action functional for Yang-Mills theory is now relatively simple. Even though the quantity \(F^{\mu \nu } F_{\mu \nu }\) is valued in matrices (because \(F_{\mu \nu }(x)\in \g \) and \(F^{\mu \nu }(x)\in \g \) are matrices), each component of these matrices transforms as a scalar under Poincaré transformations. We can produce from this a number by taking the trace \(\mathrm {Tr}\big (F^{\mu \nu } F_{\mu \nu }\big )\), which again transforms as a scalar field under Poincaré transformations. Using cyclicity of the trace together with the transformation law (9.41), we compute
\(\seteqnumber{0}{9.}{41}\)\begin{flalign} \nn \mathrm {Tr}\Big ((T_UF)^{\mu \nu } (T_UF)_{\mu \nu }\Big ) &= \mathrm {Tr}\Big (U\,F^{\mu \nu }\,U^{-1} \,U\,F_{\mu \nu }\, U^{-1}\Big ) \\ & = \mathrm {Tr}\Big (U^{-1}\,U\,F^{\mu \nu }\, F_{\mu \nu } \Big ) = \mathrm {Tr}\big (F^{\mu \nu } F_{\mu \nu }\big )\quad , \end{flalign} which means that \(\mathrm {Tr}\big (F^{\mu \nu } F_{\mu \nu }\big )\) is gauge invariant. This allows us to define the gauge invariant and Poincaré invariant action functional
\(\seteqnumber{0}{9.}{42}\)\begin{flalign} S_{\mathrm {YM}}[A] \,:=\, \int _{\bbR ^d}\frac {1}{2\,g^2_{\mathrm {YM}}}\,\mathrm {Tr}\big (F^{\mu \nu }\,F_{\mu \nu }\big )~\dd x\quad , \end{flalign}
where \(g_{\mathrm {YM}}\) is a constant that is called the Yang-Mills coupling constant. The Yang-Mills action in the form of (9.43) is an efficient packaging of a rather rich action functional that includes terms that are quadratic, cubic and quartic in the field \(A_\mu \). In fact, using (9.39), we can write out the Yang-Mills action in terms of the \(A\)’s and find
\(\seteqnumber{0}{9.}{43}\)\begin{flalign} \nn S_{\mathrm {YM}}[A] \,&=\, \int _{\bbR ^d}\frac {1}{2\,g^2_{\mathrm {YM}}}\,\mathrm {Tr}\bigg (\Big (\partial ^\mu A^\nu - \partial ^\nu A^\mu + [A^\mu ,A^\nu ]\Big )\, F_{\mu \nu }\bigg )~\dd x\\[4pt] \nn \, &=\, \int _{\bbR ^d}\frac {1}{2\,g^2_{\mathrm {YM}}}\,\mathrm {Tr}\bigg (\Big (2\, \partial ^\mu A^\nu + [A^\mu ,A^\nu ]\Big )\, F_{\mu \nu }\bigg )~\dd x\\[4pt] \, &=\, \int _{\bbR ^d}\frac {1}{2\,g^2_{\mathrm {YM}}}\,\mathrm {Tr}\bigg (2\, \partial ^\mu A^\nu \,\partial _\mu A_\nu - 2\, \partial ^\mu A^\nu \,\partial _\nu A_\mu + 4\,(\partial ^\mu A^\nu )\,[A_\mu ,A_\nu ] + [A^\mu ,A^\nu ]\,[A_\mu ,A_\nu ] \bigg )~\dd x\quad ,\label {eqn:YMactionexpand} \end{flalign} where in the second line we used the antisymmetry \(F_{\mu \nu } = -F_{\nu \mu }\) and in the last line we used cyclicity of the trace together with antisymmetry of the commutator. Upon quantization, this will lead to a \(3\)-valent and a \(4\)-valent interaction vertex in the Feynman rules, which describe the self-interactions of the Yang-Mills field.
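For readers who want to see where the factor of \(4\) in the cubic term comes from, here is the intermediate step, spelled out as a short sketch. Multiplying out the second line produces two cubic contributions, namely \(2\,(\partial ^\mu A^\nu )\,[A_\mu ,A_\nu ]\) and \([A^\mu ,A^\nu ]\,\big (\partial _\mu A_\nu - \partial _\nu A_\mu \big )\). For the latter, antisymmetry of the commutator and cyclicity of the trace give
\begin{flalign*} \mathrm {Tr}\Big ([A^\mu ,A^\nu ]\,\big (\partial _\mu A_\nu - \partial _\nu A_\mu \big )\Big ) \,=\, 2\,\mathrm {Tr}\Big ([A^\mu ,A^\nu ]\,\partial _\mu A_\nu \Big ) \,=\, 2\,\mathrm {Tr}\Big ((\partial ^\mu A^\nu )\,[A_\mu ,A_\nu ]\Big )\quad , \end{flalign*}
where in the last step the contracted index pairs were also moved up and down with the Minkowski metric. The two contributions therefore add up to \(4\,\mathrm {Tr}\big ((\partial ^\mu A^\nu )\,[A_\mu ,A_\nu ]\big )\).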
Remark 9.12. To avoid confusion, I would like to add the following remarks:
(i) You might be worried by the fact that the coupling constant \(g_{\mathrm {YM}}\) appears in (9.43) in front of the whole Lagrangian, including the free (i.e. quadratic) part. This is simply a convenient convention that we have used to streamline our formulas in this section. Using the field redefinition \(\widetilde {A}_{\mu } := \frac {1}{g_{\mathrm {YM}}}\,A_\mu \) for the Yang-Mills field, the action functional in the form of (9.44) can be rewritten as
\(\seteqnumber{0}{9.}{44}\)\begin{flalign} \nn S_{\mathrm {YM}}[\widetilde {A}] \,&=\, \int _{\bbR ^d}\frac {1}{2}\,\mathrm {Tr}\bigg (2\, \partial ^\mu \widetilde {A}^\nu \,\partial _\mu \widetilde {A}_\nu - 2\, \partial ^\mu \widetilde {A}^\nu \,\partial _\nu \widetilde {A}_\mu \\ &\qquad ~\qquad + 4\,g_{\mathrm {YM}}\,(\partial ^\mu \widetilde {A}^\nu )\,[\widetilde {A}_\mu ,\widetilde {A}_\nu ] +g_{\mathrm {YM}}^2\, [\widetilde {A}^\mu ,\widetilde {A}^\nu ]\,[\widetilde {A}_\mu ,\widetilde {A}_\nu ] \bigg )~\dd x\quad , \end{flalign} where the coupling constant is now at its usual spot in front of the interaction terms. This form of the action is the most convenient one for perturbation theory, because it allows for an obvious and simple counting of powers of the coupling constant.
(ii) You also might ask why the Yang-Mills action (9.43) has a numerical prefactor of \(\frac {1}{2}\), while the related Maxwell action (6.11) has a numerical prefactor of \(-\frac {1}{4}\). The relative minus sign is related to our anti-Hermiticity convention for the Lie algebras \(\mathfrak {u}(n)\) and \(\mathfrak {su}(n)\) from Example 9.4, which implies that the trace \(\mathrm {Tr}(X\,X)\leq 0\) is non-positive for any Lie algebra element \(X\). The relative factor of \(2\) is introduced to match the usual convention that the basis \(\{X_a\in \g \}\) of the Lie algebra \(\g =\mathfrak {u}(n)\) or \(\g =\mathfrak {su}(n)\) is chosen such that it satisfies the orthonormality condition \(\mathrm {Tr}(X_a\,X_b)=-\frac {1}{2}\,\delta _{ab}\). (You can check this explicitly for our rescaled Pauli matrices for \(\mathfrak {su}(2)\) in Remark 9.5.) Hence, writing the field strength \(F_{\mu \nu } = F_{\mu \nu }^a\,X_a\) in terms of such a basis, we find that the action
\(\seteqnumber{0}{9.}{45}\)\begin{flalign} S_{\mathrm {YM}}[A]\,=\,\int _{\bbR ^d}\frac {1}{2\,g^2_{\mathrm {YM}}}\,F_{\mu \nu }^a\,F^{\mu \nu b}~\mathrm {Tr}\big (X_a\,X_b\big )\,\dd x \,=\,\int _{\bbR ^d}-\frac {1}{4\,g^2_{\mathrm {YM}}}\,F_{\mu \nu }^a\,F^{\mu \nu a}\,\dd x \end{flalign} has a prefactor that is compatible with the one of the Maxwell action (6.11), up to the factor of \(\frac {1}{g^{2}_{\mathrm {YM}}}\) that we have explained in item (i) above.
Last but not least, it is worthwhile to mention once more that different sources in the literature use different conventions. For instance, when working with the convention that the Lie algebra elements of \(\mathfrak {u}(n)\) and \(\mathfrak {su}(n)\) are described by Hermitian matrices, there will be factors of \(\ii \) in some of the formulas above. The conventions taken in these lecture notes try to minimize the appearance of unnecessary prefactors.
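As a quick consistency check of the normalization in item (ii), consider \(\g =\mathfrak {su}(2)\) with the basis \(X_a = -\frac {\ii }{2}\,\sigma _a\) built from the Pauli matrices \(\sigma _a\). The sign of this basis is an assumption made here for illustration and may differ from the convention of Remark 9.5. Using \(\sigma _a\,\sigma _b = \delta _{ab}\,1 + \ii \,\epsilon _{abc}\,\sigma _c\), one finds
\begin{flalign*} \mathrm {Tr}\big (X_a\,X_b\big ) \,=\, -\frac {1}{4}\,\mathrm {Tr}\big (\sigma _a\,\sigma _b\big ) \,=\, -\frac {1}{2}\,\delta _{ab}\quad ,\qquad [X_a,X_b] \,=\, -\frac {1}{4}\,[\sigma _a,\sigma _b] \,=\, \epsilon _{abc}\,X_c\quad . \end{flalign*}
Under this assumption the structure constants are \(f_{bc}^a = \epsilon _{abc}\), so the components of the field strength tensor read \(F_{\mu \nu }^a = \partial _\mu A_\nu ^a - \partial _\nu A_\mu ^a + \epsilon _{abc}\,A_\mu ^b\,A_\nu ^c\). With the opposite sign choice \(X_a = +\frac {\ii }{2}\,\sigma _a\), the normalization of the trace is unchanged while the structure constants change sign.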