$$ \newcommand{\R}{\mathbb{R}} \newcommand{\C}{\mathbb{C}} \newcommand{\N}{\mathbb{N}} \newcommand{\norm}[1]{\left\lVert #1 \right\rVert} \newcommand{\abs}[1]{\left\lvert #1 \right\rvert} \newcommand{\weakto}{\rightharpoonup} \DeclareMathOperator{\supp}{supp} $$

Maximum Angle Condition

We prove a result on interpolation error estimates for piecewise linear functions under the maximum angle condition.

$$ \DeclareMathOperator{\supp}{supp} \DeclareMathOperator{\diam}{diam} \DeclareMathOperator{\I}{\mathcal{I}} $$

The Problem

Let $\mathcal{T}$ be a triangulation of a polygonal domain $\Omega \subseteq \R^2$, and $V_h=\left\{v \in C^0(\bar{\Omega}):v|_T \text{ is linear for all } T \in \mathcal{T}\right\}$.

Let $u \in H^2(\Omega)$ and $\I u \in V_h$ be the linear interpolant of $u$. Show that

$$ \norm{u-\I u}_{L^2(\Omega)}+h\abs{u-\I u}_{H^1(\Omega)} \leq \zeta(\theta) h^2 \abs{u}_{H^2(\Omega)} $$

where $h=\max _{T \in \mathcal{T}} \diam(T)$, $\theta$ is the maximum angle in $\mathcal{T}$, and $\zeta$ is an increasing positive function defined on $[\pi / 3, \pi)$.

This problem is taken from Brenner & Scott, The Mathematical Theory of Finite Element Methods, p. 126, exercise 4.x.12.

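Remark. A maximum angle condition requires that the maximum angle in the triangulation stays bounded away from $\pi$. This differs from the minimum angle condition, which requires that the minimum angle stays bounded away from $0$. The minimum angle condition implies the maximum angle condition: if every angle is at least $\alpha > 0$, then, since the angles of a triangle sum to $\pi$, every angle is at most $\pi - 2\alpha$. The converse fails, as a thin right triangle shows.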

Step 1

We first treat the $L^2$ term. An interesting (and perhaps counter-intuitive) fact in finite element theory is that the constant in the $L^2$ interpolation estimate depends neither on shape regularity nor on the maximum angle. The bound involves the element geometry only through its diameter $h_T$.

The dependence on the maximum angle arises solely from the transformation of the gradient (the $H^1$ seminorm) back to the physical element, which we will treat in Step 2.

Geometric Setup and Coordinate Transformation

Let $T \in \mathcal{T}$ be an arbitrary triangle with vertices $P_0, P_1, P_2$. Assume without loss of generality that the angle at $P_0$, denoted by $\gamma$, is the maximum angle of $T$. Thus $\gamma \in [\pi/3, \pi)$.

Let $\hat{K}$ be the reference triangle with vertices $\hat{z}_0=(0,0)$, $\hat{z}_1=(1,0)$, and $\hat{z}_2=(0,1)$. We define the affine mapping $F: \hat{K} \to T$ by:

$$ x = F(\hat{x}) = P_0 + B\hat{x}, \quad \text{where } B = [v_1, v_2] \in \mathbb{R}^{2\times 2} $$

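Here, $v_1 = P_1 - P_0$ and $v_2 = P_2 - P_0$ are the edge vectors emanating from $P_0$. Let $h_1 = |v_1|$ and $h_2 = |v_2|$. Since the longest edge lies opposite the largest angle, the diameter is $h_T = |P_1 - P_2|$, and $\max(h_1, h_2) \leq h_T \leq h_1 + h_2 \leq 2\max(h_1, h_2)$.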

The Jacobian determinant satisfies $|J| = |\det(B)| = h_1 h_2 \sin \gamma$; its sign depends on the orientation of the vertices.
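The setup above can be checked numerically. The following is a minimal sketch, assuming a hypothetical obtuse triangle with its largest angle at $P_0 = (0, 0.1)$; the function name `affine_map_data` is illustrative and not part of the original argument.

```python
import math

def affine_map_data(P0, P1, P2):
    """Return h1, h2, the angle gamma at P0, and det B for B = [v1, v2]."""
    v1 = (P1[0] - P0[0], P1[1] - P0[1])
    v2 = (P2[0] - P0[0], P2[1] - P0[1])
    h1, h2 = math.hypot(*v1), math.hypot(*v2)
    det_B = v1[0] * v2[1] - v1[1] * v2[0]      # columns of B are v1 and v2
    cos_gamma = (v1[0] * v2[0] + v1[1] * v2[1]) / (h1 * h2)
    gamma = math.acos(max(-1.0, min(1.0, cos_gamma)))
    return h1, h2, gamma, det_B

# Hypothetical test triangle (an assumption made for illustration only).
h1, h2, gamma, det_B = affine_map_data((0.0, 0.1), (-1.0, 0.0), (1.0, 0.0))
jacobian_from_angle = h1 * h2 * math.sin(gamma)   # should match abs(det_B)
```

Swapping $P_1$ and $P_2$ flips the sign of $\det B$ but leaves $h_1 h_2 \sin\gamma$ unchanged, which is why the absolute value appears in the change of variables.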

Let $u \in H^2(T)$ and $\hat{u} = u \circ F \in H^2(\hat{K})$. Let $\mathcal{I}$ and $\hat{\mathcal{I}}$ be the linear Lagrange interpolation operators on $T$ and $\hat{K}$ respectively. By affine invariance, $\widehat{\mathcal{I}u} = \hat{\mathcal{I}}\hat{u}$. Let $e = u - \mathcal{I}u$ and $\hat{e} = \hat{u} - \hat{\mathcal{I}}\hat{u}$.

Transformation to Reference Element

We transform the $L^2$ integral from the physical triangle $T$ to the reference triangle $\hat{K}$ using the change of variables $x = F(\hat{x})$ and $dx = |\det B| d\hat{x}$.

$$ \|e\|_{L^2(T)}^2 = \int_T |e(x)|^2 dx = |\det B| \int_{\hat{K}} |\hat{e}(\hat{x})|^2 d\hat{x} = |\det B| \|\hat{e}\|_{L^2(\hat{K})}^2 \tag{1} $$

On the fixed reference domain $\hat{K}$, the shape is regular. We can apply the standard interpolation error estimates.

$$ \|\hat{e}\|_{L^2(\hat{K})} = \|\hat{u} - \hat{\mathcal{I}}\hat{u}\|_{L^2(\hat{K})} \leq C_{ref} |\hat{u}|_{H^2(\hat{K})} \tag{2} $$

Note that $C_{ref}$ is a universal constant depending only on the reference triangle $\hat{K}$.

Mapping the Seminorm Forward

We need to bound $|\hat{u}|_{H^2(\hat{K})}$ in terms of $|u|_{H^2(T)}$. By the chain rule, for any $i, j \in \{1, 2\}$:

$$ \frac{\partial^2 \hat{u}}{\partial \hat{x}_i \partial \hat{x}_j}(\hat{x}) = \sum_{k,l=1}^2 \frac{\partial x_k}{\partial \hat{x}_i} \frac{\partial x_l}{\partial \hat{x}_j} \frac{\partial^2 u}{\partial x_k \partial x_l}(F(\hat{x})) $$

Recall that $\frac{\partial x}{\partial \hat{x}_i}$ is simply the $i$-th column of the matrix $B$, which corresponds to the edge vector $v_i$. In vector notation:

$$ \partial_{\hat{x}_i \hat{x}_j} \hat{u} = v_i^T (\nabla^2 u) v_j $$

By the Cauchy-Schwarz inequality, with the Frobenius norm on the Hessian matrix:

$$ |\partial_{\hat{x}_i \hat{x}_j} \hat{u}| \leq |v_i| |v_j| \norm{\nabla^2 u}_F \leq h^2 \norm{\nabla^2 u}_F $$

where $h = \max(|v_1|, |v_2|) \leq h_T$ is the length of the longer of the two edges at $P_0$.
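As a sanity check of the identity $\partial_{\hat{x}_i \hat{x}_j} \hat{u} = v_i^T (\nabla^2 u) v_j$ and of the entrywise bound, here is a small sketch for a quadratic $u$, whose Hessian is constant. The matrix `H` and the edge vectors are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math

def reference_hessian(H, v1, v2):
    """Entries v_i^T H v_j of the pulled-back Hessian B^T H B, with B = [v1, v2]."""
    def bilinear(a, b):
        return (a[0] * (H[0][0] * b[0] + H[0][1] * b[1])
                + a[1] * (H[1][0] * b[0] + H[1][1] * b[1]))
    vs = (v1, v2)
    return [[bilinear(vs[i], vs[j]) for j in range(2)] for i in range(2)]

def entrywise_bound_holds(H, v1, v2):
    """Check |v_i^T H v_j| <= |v_i| |v_j| ||H||_F for all i, j."""
    frob = math.sqrt(sum(H[i][j] ** 2 for i in range(2) for j in range(2)))
    lengths = (math.hypot(*v1), math.hypot(*v2))
    H_hat = reference_hessian(H, v1, v2)
    return all(abs(H_hat[i][j]) <= lengths[i] * lengths[j] * frob
               for i in range(2) for j in range(2))

# Illustrative data (assumptions): a symmetric Hessian and two edge vectors.
H = [[2.0, -1.0], [-1.0, 3.0]]
v1, v2 = (-1.0, -0.1), (1.0, -0.1)
ok = entrywise_bound_holds(H, v1, v2)
```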

Now, integrate over $\hat{K}$:

$$ \begin{aligned} |\hat{u}|_{H^2(\hat{K})}^2 &= \sum_{i,j} \int_{\hat{K}} |\partial_{\hat{x}_i \hat{x}_j} \hat{u}|^2 d\hat{x} \\ &\leq \sum_{i,j} \int_{\hat{K}} h^4 \norm{\nabla^2 u(F(\hat{x}))}_F^2 d\hat{x} \\ &= 4 h^4 \int_{\hat{K}} \norm{\nabla^2 u(F(\hat{x}))}_F^2 d\hat{x} \end{aligned} $$

Change variables back to $T$ (using $d\hat{x} = |\det B|^{-1} dx$):

$$ |\hat{u}|_{H^2(\hat{K})}^2 \leq 4 h^4 |\det B|^{-1} |u|_{H^2(T)}^2 \tag{3} $$

Combining the Bounds

Substitute (3) into (2), and then the result into (1):

$$ \begin{aligned} \|e\|_{L^2(T)}^2 &= |\det B| \|\hat{e}\|_{L^2(\hat{K})}^2 \\ &\leq |\det B| C_{ref}^2 |\hat{u}|_{H^2(\hat{K})}^2 \\ &\leq |\det B| C_{ref}^2 \left( 4 h^4 |\det B|^{-1} |u|_{H^2(T)}^2 \right) \end{aligned} $$

Taking the square root and using $h \leq h_T$:

$$ \|u - \mathcal{I}u\|_{L^2(T)} \leq C h_T^2 |u|_{H^2(T)}, \quad C = 2 C_{ref}. $$
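That the $L^2$ constant does not see the angle can be illustrated with $u(x, y) = x^2$. This is a sketch under stated assumptions: the two thin triangles are hypothetical test cases. For quadratic $u$ the error is $u - \mathcal{I}u = \sum_{i<j} c_{ij} \lambda_i \lambda_j$ with $c_{ij} = -\tfrac{1}{2} d_{ij}^T (\nabla^2 u) d_{ij}$, where $d_{ij} = P_i - P_j$ and $\lambda_i$ are the barycentric coordinates. Its $L^2$ norm follows from the exact integrals $\int_T \lambda_i^2 \lambda_j^2 = |T|/90$ and $\int_T \lambda_i^2 \lambda_j \lambda_k = |T|/180$.

```python
import math

def l2_error_ratio(P0, P1, P2):
    """||u - Iu||_{L^2(T)} / (h_T^2 |u|_{H^2(T)}) for u(x, y) = x**2."""
    pts = (P0, P1, P2)
    area = abs((P1[0] - P0[0]) * (P2[1] - P0[1])
               - (P1[1] - P0[1]) * (P2[0] - P0[0])) / 2.0
    pairs = ((0, 1), (0, 2), (1, 2))
    # Hessian of x**2 is diag(2, 0), so c_ij = -(x_i - x_j)**2.
    c = [-(pts[i][0] - pts[j][0]) ** 2 for i, j in pairs]
    # Integrate the square of sum c_ij lam_i lam_j with the exact formulas above.
    err2 = area / 90.0 * (sum(ck * ck for ck in c)
                          + c[0] * c[1] + c[0] * c[2] + c[1] * c[2])
    h_T = max(math.hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1])
              for i, j in pairs)
    h2_semi2 = 4.0 * area                  # u_xx = 2, u_xy = u_yy = 0
    return math.sqrt(err2 / (h_T ** 4 * h2_semi2))

eps = 1e-3   # thickness of the hypothetical thin triangles
obtuse_ratio = l2_error_ratio((-1.0, 0.0), (1.0, 0.0), (0.0, eps))  # angle near pi
right_ratio = l2_error_ratio((0.0, 0.0), (1.0, 0.0), (0.0, eps))    # angle pi/2
```

Both ratios stay of moderate size as `eps` shrinks, even though the first triangle has a maximum angle approaching $\pi$.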

Step 2

We will prove that for each triangle $T \in \mathcal{T}$, we have

$$ \abs{u - \I u}_{H^1(T)} \leq \frac{C}{\sin \gamma} h_T \abs{u}_{H^2(T)} $$

where $h_T = \diam(T)$ and $\gamma$ is the maximum angle in $T$.

Gradient Decomposition

We need to bound $\|\nabla e\|_{L^2(T)}$. The gradient $\nabla e$ is a vector in $\mathbb{R}^2$. It is convenient to decompose it along the directions of the edges $v_1$ and $v_2$, even though they are not orthogonal.

The directional derivative along $v_i$ is given by $\partial_{v_i} e = (\nabla e) \cdot v_i$. By the chain rule, this corresponds exactly to the partial derivatives on the reference element:

$$ \partial_{v_1} e(x) = \frac{\partial \hat{e}}{\partial \hat{x}_1}(\hat{x}), \quad \partial_{v_2} e(x) = \frac{\partial \hat{e}}{\partial \hat{x}_2}(\hat{x}). $$

Lemma

For any vector $\mathbf{w} \in \mathbb{R}^2$ and linearly independent vectors $v_1, v_2$ with angle $\gamma$ between them and lengths $h_1, h_2$:

$$ |\mathbf{w}|^2 \leq \frac{C}{\sin^2 \gamma} \left( \frac{|\mathbf{w} \cdot v_1|^2}{h_1^2} + \frac{|\mathbf{w} \cdot v_2|^2}{h_2^2} \right). $$

Proof

Write $t_i = v_i / h_i$ for the unit edge directions, so that $t_1 \cdot t_2 = \cos \gamma$, and set $a_i = \mathbf{w} \cdot t_i$. Expanding $\mathbf{w}$ in the dual basis of $\{t_1, t_2\}$ gives

$$ |\mathbf{w}|^2 = \frac{a_1^2 + a_2^2 - 2 a_1 a_2 \cos \gamma}{\sin^2 \gamma} \leq \frac{(1 + |\cos \gamma|)(a_1^2 + a_2^2)}{\sin^2 \gamma}, $$

so the lemma holds with $C = 2$. The factor $1/\sin^2 \gamma$ arises from the transformation between the orthogonal frame and the skew frame defined by $v_1, v_2$.
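The lemma can be probed numerically. The sketch below assumes the constant $C = 2$ (a value consistent with a short dual-basis computation, but an assumption as far as this snippet is concerned) and scans a grid of angles $\gamma$ and directions of $\mathbf{w}$; the edge lengths 3 and 0.5 are arbitrary.

```python
import math

def lemma_ratio(w, v1, v2):
    """|w|^2 divided by the right-hand side of the lemma with C = 2."""
    h1, h2 = math.hypot(*v1), math.hypot(*v2)
    det = v1[0] * v2[1] - v1[1] * v2[0]
    sin2 = (det / (h1 * h2)) ** 2               # sin^2(gamma)
    a = (w[0] * v1[0] + w[1] * v1[1]) / h1      # w . v1 / h1
    b = (w[0] * v2[0] + w[1] * v2[1]) / h2      # w . v2 / h2
    return (w[0] ** 2 + w[1] ** 2) * sin2 / (2.0 * (a * a + b * b))

worst = 0.0
for k in range(1, 180):                         # gamma = 1, 2, ..., 179 degrees
    gamma = math.radians(k)
    v1 = (3.0, 0.0)
    v2 = (0.5 * math.cos(gamma), 0.5 * math.sin(gamma))
    for m in range(360):                        # unit vectors w on a 1-degree grid
        t = math.radians(m)
        worst = max(worst, lemma_ratio((math.cos(t), math.sin(t)), v1, v2))
```

A value of `worst` not exceeding 1 is consistent with the lemma for $C = 2$ on this grid; it is not a proof.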

Applying this to $\mathbf{w} = \nabla e$:

$$ |\nabla e(x)|^2 \leq \frac{C}{\sin^2 \gamma} \left( \frac{1}{h_1^2} |\partial_{v_1} e|^2 + \frac{1}{h_2^2} |\partial_{v_2} e|^2 \right). $$

Transformation to Reference Element

Integrate over $T$:

$$ \|\nabla e\|_{L^2(T)}^2 \leq \frac{C}{\sin^2 \gamma} \left( \frac{1}{h_1^2} \|\partial_{v_1} e\|_{L^2(T)}^2 + \frac{1}{h_2^2} \|\partial_{v_2} e\|_{L^2(T)}^2 \right). $$

Transform the integrals to $\hat{K}$ using $dx = |\det B| d\hat{x}$:

$$ \|\partial_{v_1} e\|_{L^2(T)}^2 = |\det B| \, \|\partial_{\hat{x}_1} \hat{e}\|_{L^2(\hat{K})}^2 $$

$$ \|\partial_{v_2} e\|_{L^2(T)}^2 = |\det B| \, \|\partial_{\hat{x}_2} \hat{e}\|_{L^2(\hat{K})}^2 $$

Substituting back:

$$ \|\nabla e\|_{L^2(T)}^2 \leq \frac{C |\det B|}{\sin^2 \gamma} \left( \frac{1}{h_1^2} \|\partial_{\hat{x}_1} \hat{e}\|_{L^2(\hat{K})}^2 + \frac{1}{h_2^2} \|\partial_{\hat{x}_2} \hat{e}\|_{L^2(\hat{K})}^2 \right) \tag{4} $$

Anisotropic Bounds

Theorem

Let $\hat{K}$ be the reference triangle with vertices $(0,0)$, $(1,0)$ and $(0,1)$, and let $\zeta \in W^m_p(\hat{K})$ for $m \geq 2$ and $1 \leq p \leq \infty$. Then

$$ \| \partial_{x_j}(\zeta - \mathcal{I}\zeta) \|_{L^p(\hat{K})} \leq C | \partial_{x_j} \zeta |_{W^{m-1}_p(\hat{K})} $$

where $\mathcal{I}\zeta$ is the interpolant of $\zeta$ in the $\mathcal{P}_{m-1}$ Lagrange finite element.

Proof

Without loss of generality, let $j=1$ and denote $x_j = x$. Let $\partial_x = \partial/\partial x$. Define $u = \partial_x \zeta$. Note that $u \in W^{m-1}_p(\hat{K})$.

The term $\partial_x (\mathcal{I}\zeta)$ is a polynomial in $\mathcal{P}_{m-2}$. We want to relate this term directly to $u = \partial_x \zeta$. To do this, we must establish that the operation $\zeta \mapsto \partial_x (\mathcal{I}\zeta)$ factors through $\partial_x \zeta$.

One can verify that if $\partial_x \zeta = 0$ (i.e., $\zeta$ is independent of $x$), then $\partial_x (\mathcal{I}\zeta) = 0$.

Given $u \in W^{m-1}_p(\hat{K})$, let $\zeta$ be any primitive such that $\partial_x \zeta = u$. Define:

$$ \Pi u := \frac{\partial}{\partial x} (\mathcal{I}\zeta) $$

This definition is consistent. If $\zeta_1$ and $\zeta_2$ are two primitives of $u$, then $\partial_x(\zeta_1 - \zeta_2) = 0$. By the observation above (if $\partial_x \zeta = 0$, then $\partial_x (\mathcal{I}\zeta) = 0$), $\partial_x \mathcal{I}(\zeta_1 - \zeta_2) = 0$, implying $\partial_x \mathcal{I}\zeta_1 = \partial_x \mathcal{I}\zeta_2$.

We now check how $\Pi$ acts on polynomials. This is crucial for applying the Bramble-Hilbert lemma.

One can then verify the following: if $q \in \mathcal{P}_{m-2}$, then $\Pi q = q$.

We can now rewrite the LHS of the inequality using $u = \partial_x \zeta$:

$$ \frac{\partial}{\partial x}(\zeta - \mathcal{I}\zeta) = \partial_x \zeta - \partial_x (\mathcal{I}\zeta) = u - \Pi u = (I - \Pi)u $$

We need to bound $\|(I - \Pi)u\|_{L^p(\hat{K})}$. Consider the linear operator $T = I - \Pi$.

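  1. Boundedness: Since $\partial_x (\mathcal{I}\zeta)$ vanishes whenever $\zeta$ is independent of $x$, it depends on $\zeta$ only through differences of nodal values along lines parallel to the $x$-axis, and each such difference is the integral of $u = \partial_x \zeta$ over a horizontal segment in $\hat{K}$. For $m = 2$ this reads $\Pi u = \zeta(1,0) - \zeta(0,0) = \int_0^1 u(s, 0)\, ds$. By the trace theorem, these line integrals are bounded on $W^{m-1}_p(\hat{K})$. Since $\Pi$ maps into the finite-dimensional space $\mathcal{P}_{m-2}$, where all norms are equivalent, $\Pi$ is bounded from $W^{m-1}_p(\hat{K})$ to $L^p(\hat{K})$. Thus, $T$ is bounded from $W^{m-1}_p(\hat{K})$ to $L^p(\hat{K})$.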

  2. Vanishing on Polynomials: By the reproduction property above, for any $q \in \mathcal{P}_{m-2}$, $\Pi q = q$, so $T(q) = q - q = 0$. The operator vanishes on $\mathcal{P}_{m-2}$.

  3. The Estimate: By the Bramble-Hilbert lemma, if a bounded linear operator $T: W^{k}_p \to L^p$ vanishes on $\mathcal{P}_{k-1}$, then:

    $$ \|Tu\|_{L^p} \le C |u|_{W^{k}_p} $$

    Here, we set $k = m-1$. The operator vanishes on $\mathcal{P}_{m-2}$. Therefore:

    $$ \|(I - \Pi)u\|_{L^p(\hat{K})} \le C |u|_{W^{m-1}_p(\hat{K})}. $$
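For $m = 2$ the operator $\Pi$ is easy to experiment with, because the linear interpolant on the reference triangle is $\mathcal{I}\zeta = \zeta(0,0)(1 - x - y) + \zeta(1,0)\, x + \zeta(0,1)\, y$, so that $\partial_x (\mathcal{I}\zeta) = \zeta(1,0) - \zeta(0,0)$. The sketch below checks, on a few hand-picked functions (all of them illustrative assumptions), that this quantity vanishes when $\zeta$ is independent of $x$, and that two different primitives of the constant $u = 3$ give the same $\Pi u = 3$.

```python
import math

def dx_interpolant(zeta):
    """x-derivative of the P1 interpolant of zeta on the reference triangle."""
    return zeta(1.0, 0.0) - zeta(0.0, 0.0)

# zeta independent of x: the derivative of the interpolant vanishes.
only_y = dx_interpolant(lambda x, y: math.exp(y))

# Two primitives (in x) of u = 3 that differ by a function of y alone.
pi_u_first = dx_interpolant(lambda x, y: 3.0 * x + math.sin(y))
pi_u_second = dx_interpolant(lambda x, y: 3.0 * x + y ** 2 - 7.0)
```

Agreement of `pi_u_first` and `pi_u_second` reflects the consistency of the definition of $\Pi$, and their common value 3 reflects reproduction of $\mathcal{P}_0$.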

We apply the theorem with $m=2$:

$$ \|\partial_{\hat{x}_j} \hat{e}\|_{L^2(\hat{K})} \leq C |\partial_{\hat{x}_j} \hat{u}|_{H^1(\hat{K})} $$

This is an anisotropic estimate because the bound for $\partial_1 \hat{e}$ depends only on derivatives of $\partial_1 \hat{u}$ (i.e., $\partial_{11} \hat{u}$ and $\partial_{12} \hat{u}$), ignoring $\partial_{22} \hat{u}$.

For $j=1$:

$$ \|\partial_{\hat{x}_1} \hat{e}\|_{L^2(\hat{K})}^2 \leq C \left( \|\partial_{11} \hat{u}\|_{L^2(\hat{K})}^2 + \|\partial_{12} \hat{u}\|_{L^2(\hat{K})}^2 \right) \tag{5} $$

For $j=2$:

$$ \|\partial_{\hat{x}_2} \hat{e}\|_{L^2(\hat{K})}^2 \leq C \left( \|\partial_{22} \hat{u}\|_{L^2(\hat{K})}^2 + \|\partial_{21} \hat{u}\|_{L^2(\hat{K})}^2 \right) \tag{6} $$

Scaling Back to Physical Derivatives

As before, we use the chain rule to relate the reference Hessian to the physical Hessian:

$$ |\partial_{ij} \hat{u}| \leq |v_i| |v_j| \norm{\nabla^2 u}_F = h_i h_j \norm{\nabla^2 u}_F $$

Integrating over $\hat{K}$ and mapping back to $T$:

$$ \|\partial_{ij} \hat{u}\|_{L^2(\hat{K})}^2 \leq \int_{\hat{K}} h_i^2 h_j^2 \norm{\nabla^2 u(F(\hat{x}))}_F^2 d\hat{x} = h_i^2 h_j^2 |\det B|^{-1} |u|_{H^2(T)}^2 $$

Now, substitute these into our estimates (5) and (6).

For the first term ($j=1$, divided by $h_1^2$):

$$ \begin{aligned} \frac{1}{h_1^2} \|\partial_{\hat{x}_1} \hat{e}\|^2 &\leq \frac{C}{h_1^2} \left( \|\partial_{11} \hat{u}\|^2 + \|\partial_{12} \hat{u}\|^2 \right) \\ &\leq \frac{C}{h_1^2} |\det B|^{-1} |u|_{H^2(T)}^2 \left( h_1^4 + h_1^2 h_2^2 \right) \\ &= C |\det B|^{-1} |u|_{H^2(T)}^2 \left( h_1^2 + h_2^2 \right) \end{aligned} $$

For the second term ($j=2$, divided by $h_2^2$):

$$ \begin{aligned} \frac{1}{h_2^2} \|\partial_{\hat{x}_2} \hat{e}\|^2 &\leq \frac{C}{h_2^2} \left( \|\partial_{22} \hat{u}\|^2 + \|\partial_{21} \hat{u}\|^2 \right) \\ &\leq \frac{C}{h_2^2} |\det B|^{-1} |u|_{H^2(T)}^2 \left( h_2^4 + h_2^2 h_1^2 \right) \\ &= C |\det B|^{-1} |u|_{H^2(T)}^2 \left( h_2^2 + h_1^2 \right) \end{aligned} $$

Final Assembly

Substitute these bounds back into equation (4):

$$ \begin{aligned} \|\nabla e\|_{L^2(T)}^2 &\leq \frac{C |\det B|}{\sin^2 \gamma} \left[ C |\det B|^{-1} |u|_{H^2(T)}^2 (h_1^2 + h_2^2) + C |\det B|^{-1} |u|_{H^2(T)}^2 (h_2^2 + h_1^2) \right] \\ &= \frac{C}{\sin^2 \gamma} (h_1^2 + h_2^2) |u|_{H^2(T)}^2 \end{aligned} $$

Since $h_1^2 + h_2^2 \leq 2 h_T^2$, we have

$$ \|\nabla e\|_{L^2(T)} \leq \frac{C}{\sin \gamma} h_T |u|_{H^2(T)}. $$
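The role of $\sin \gamma$ can be seen in a small experiment with $u(x, y) = x^2$ on two thin triangles of height `eps`: one whose largest angle tends to $\pi$, and one with a right angle. This is a sketch under stated assumptions: the triangles and the function name `h1_error_ratio` are illustrative. Since $\nabla(u - \mathcal{I}u)$ is linear for quadratic $u$, the edge-midpoint quadrature rule integrates its square exactly.

```python
import math

def h1_error_ratio(P0, P1, P2):
    """For u(x, y) = x**2 return (ratio, gamma), where
    ratio = |u - Iu|_{H^1(T)} / (h_T |u|_{H^2(T)}) and gamma is the largest angle."""
    pts = (P0, P1, P2)
    v1 = (P1[0] - P0[0], P1[1] - P0[1])
    v2 = (P2[0] - P0[0], P2[1] - P0[1])
    det = v1[0] * v2[1] - v1[1] * v2[0]
    area = abs(det) / 2.0
    # Constant gradient g of the interpolant solves g . v_i = u(P_i) - u(P0).
    d1 = P1[0] ** 2 - P0[0] ** 2
    d2 = P2[0] ** 2 - P0[0] ** 2
    g = ((d1 * v2[1] - d2 * v1[1]) / det, (v1[0] * d2 - v2[0] * d1) / det)
    # Edge-midpoint rule (weights |T|/3), exact for quadratic integrands.
    pairs = ((0, 1), (0, 2), (1, 2))
    err2 = 0.0
    for i, j in pairs:
        mx = 0.5 * (pts[i][0] + pts[j][0])      # grad u = (2x, 0) at the midpoint
        err2 += (2.0 * mx - g[0]) ** 2 + g[1] ** 2
    err2 *= area / 3.0
    h2_semi2 = 4.0 * area                       # u_xx = 2, u_xy = u_yy = 0
    a, b, c = sorted(math.hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1])
                     for i, j in pairs)
    # Largest angle lies opposite the longest edge c (law of cosines).
    cos_gamma = (a * a + b * b - c * c) / (2.0 * a * b)
    gamma = math.acos(max(-1.0, min(1.0, cos_gamma)))
    return math.sqrt(err2 / (c * c * h2_semi2)), gamma

eps = 1e-3
flat_ratio, flat_gamma = h1_error_ratio((-1.0, 0.0), (1.0, 0.0), (0.0, eps))
right_ratio, right_gamma = h1_error_ratio((0.0, 0.0), (1.0, 0.0), (0.0, eps))
scaled = flat_ratio * math.sin(flat_gamma)      # stays bounded as eps -> 0
```

On the obtuse triangle the ratio grows like $1/\sin\gamma$ as `eps` shrinks, while on the equally thin right triangle it stays bounded, in line with the estimate above.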

Step 3

Now we can put everything together by summing the local estimates over all triangles $T \in \mathcal{T}$. Let $\theta = \max_{T \in \mathcal{T}} \gamma_T$ be the maximum angle in the triangulation.

For the $L^2$ norm:

$$ \norm{u-\I u}_{L^2(\Omega)}^2 = \sum_{T \in \mathcal{T}} \norm{u-\I u}_{L^2(T)}^2 \leq \sum_{T \in \mathcal{T}} C h_T^4 \abs{u}_{H^2(T)}^2 \leq C h^4 \abs{u}_{H^2(\Omega)}^2 $$

For the $H^1$ seminorm:

$$ \abs{u-\I u}_{H^1(\Omega)}^2 = \sum_{T \in \mathcal{T}} \abs{u-\I u}_{H^1(T)}^2 \leq \sum_{T \in \mathcal{T}} \frac{C}{\sin^2 \gamma_T} h_T^2 \abs{u}_{H^2(T)}^2 \leq \frac{C}{\sin^2 \theta} h^2 \abs{u}_{H^2(\Omega)}^2 $$

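The last inequality uses $\sin \gamma_T \geq \frac{\sqrt{3}}{2} \sin \theta$, which holds because $\pi/3 \leq \gamma_T \leq \theta < \pi$; the factor is absorbed into $C$. Taking square roots yields: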

$$ \norm{u-\I u}_{L^2(\Omega)} \leq C h^2 \abs{u}_{H^2(\Omega)} \quad \text{and} \quad \abs{u-\I u}_{H^1(\Omega)} \leq \frac{C}{\sin \theta} h \abs{u}_{H^2(\Omega)} $$

Multiplying the $H^1$ estimate by $h$ and adding them together:

$$ \norm{u-\I u}_{L^2(\Omega)} + h \abs{u-\I u}_{H^1(\Omega)} \leq C \left( 1 + \frac{1}{\sin \theta} \right) h^2 \abs{u}_{H^2(\Omega)} $$

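This completes the proof with $\zeta(\theta) = C(1 + (\sin \theta)^{-1})$, except for monotonicity: $(\sin \theta)^{-1}$ is decreasing on $[\pi/3, \pi/2]$, so this choice is not increasing. Since $\sin \theta = 2 \sin(\theta/2) \cos(\theta/2) \geq \cos(\theta/2)$ for $\theta \in [\pi/3, \pi)$, we may instead take $\zeta(\theta) = C(1 + (\cos(\theta/2))^{-1})$, which is positive and increasing on $[\pi/3, \pi)$.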

Conclusion

The maximum angle condition is well known precisely because the standard finite element theory (Ciarlet-Raviart) assumes the “Minimum Angle Condition” (shape regularity) in order to bound the inverse Jacobian $B^{-1}$.

  • $L^2$ error: Only involves $B$. Never blows up for flat triangles.
  • $H^1$ error: Involves $B^{-T} \nabla_{\hat{x}}$.
    • Standard theory bounds $\|B^{-1}\| \le h/\rho$ (where $\rho$ is the inscribed circle radius). This blows up if the triangle is flat (anisotropic).
    • Maximum angle theory notices that if the gradients align with the long edges, the derivatives don’t actually see the “short direction” (the altitude), allowing us to replace $1/\sin(\gamma_{min})$ with $1/\sin(\gamma_{max})$.