6.4: The Chain Rule. The Cauchy Invariant Rule

Elias Zakon
University of Windsor via The Trillia Group (support by Saylor Foundation)

To generalize the chain rule (Chapter 5, §1), we consider the composite \(h=g \circ f\) of two functions, \(f : E^{\prime} \rightarrow E^{\prime \prime}\) and \(g : E^{\prime \prime} \rightarrow E,\) with \(E^{\prime}, E^{\prime \prime},\) and \(E\) as before.

Theorem 1 (chain rule)

If

\[f : E^{\prime} \rightarrow E^{\prime \prime} \text { and } g : E^{\prime \prime} \rightarrow E\]

are differentiable at \(\vec{p}\) and \(\vec{q}=f(\vec{p}),\) respectively, then

\[h=g \circ f\]

is differentiable at \(\vec{p},\) and

\[d h(\vec{p} ; \cdot)=d g(\vec{q} ; \cdot) \circ d f(\vec{p} ; \cdot). \tag{1}\]

Briefly: “The differential of the composite is the composite of differentials.”

Proof

Let \(U=d f(\vec{p} ; \cdot), V=d g(\vec{q} ; \cdot),\) and \(\phi=V \circ U\).

As \(U\) and \(V\) are linear continuous maps, so is \(\phi.\) We must show that \(\phi= d h(\vec{p} ; \cdot).\)

Here it is more convenient to write \(\Delta \vec{x}\) or \(\vec{x}-\vec{p}\) for the "\(\vec{t}\)" of Definition 1 in §3. For brevity, we set (with \(\vec{q}=f(\vec{p})\))

\[w(\vec{x})=\Delta h-\phi(\Delta \vec{x})=h(\vec{x})-h(\vec{p})-\phi(\vec{x}-\vec{p}), \quad \vec{x} \in E^{\prime}, \tag{2}\]

\[u(\vec{x})=\Delta f-U(\Delta \vec{x})=f(\vec{x})-f(\vec{p})-U(\vec{x}-\vec{p}), \quad \vec{x} \in E^{\prime}, \tag{3}\]

\[v(\vec{y})=\Delta g-V(\Delta \vec{y})=g(\vec{y})-g(\vec{q})-V(\vec{y}-\vec{q}), \quad \vec{y} \in E^{\prime \prime}. \tag{4}\]

Then what we have to prove (see Definition 1 in §3) reduces to

\[\lim _{\vec{x} \rightarrow \vec{p}} \frac{w(\vec{x})}{|\vec{x}-\vec{p}|}=0, \tag{5}\]

while the assumed existence of \(d f(\vec{p};\cdot)=U\) and \(d g(\vec{q};\cdot)=V\) can be expressed as

\[\lim _{\vec{x} \rightarrow \vec{p}} \frac{u(\vec{x})}{|\vec{x}-\vec{p}|}=0 \tag{5'}\]

and

\[\lim _{\vec{y} \rightarrow \vec{q}} \frac{v(\vec{y})}{|\vec{y}-\vec{q}|}=0, \quad \vec{q}=f(\vec{p}). \tag{5''}\]

From (2) and (3), recalling that \(h=g \circ f\) and \(\phi=V \circ U,\) we obtain

\[\begin{aligned} w(\vec{x}) &=g(f(\vec{x}))-g(\vec{q})-V(U(\vec{x}-\vec{p})) \\ &=g(f(\vec{x}))-g(\vec{q})-V(f(\vec{x})-f(\vec{p})-u(\vec{x})). \end{aligned} \tag{6}\]

Using (4), with \(\vec{y}=f(\vec{x}),\) and the linearity of \(V,\) we rewrite (6) as

\[\begin{aligned} w(\vec{x}) &=g(f(\vec{x}))-g(\vec{q})-V(f(\vec{x})-f(\vec{p}))+V(u(\vec{x})) \\ &=v(f(\vec{x}))+V(u(\vec{x})). \end{aligned}\]

(Verify!) Thus the desired formula (5) will be proved if we show that

\[\lim _{\vec{x} \rightarrow \vec{p}} \frac{V(u(\vec{x}))}{|\vec{x}-\vec{p}|}=0 \tag{6'}\]

and

\[\lim _{\vec{x} \rightarrow \vec{p}} \frac{v(f(\vec{x}))}{|\vec{x}-\vec{p}|}=0. \tag{6''}\]

Now, as \(V\) is linear and continuous, formula (5') yields (6'). Indeed,

\[\lim _{\vec{x} \rightarrow \vec{p}} \frac{V(u(\vec{x}))}{|\vec{x}-\vec{p}|}=\lim _{\vec{x} \rightarrow \vec{p}} V\left(\frac{u(\vec{x})}{|\vec{x}-\vec{p}|}\right)=V(0)=0\]

by Corollary 2 in Chapter 4, §2. (Why?)

Similarly, (5'') implies (6'') by substituting \(\vec{y}=f(\vec{x}),\) since

\[|f(\vec{x})-f(\vec{p})| \leq K|\vec{x}-\vec{p}|\]

by Problem 3(iii) in §3. (Explain!) Thus all is proved. \(\quad \square\)
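
The decomposition \(w=v \circ f+V \circ u\) used in the proof can be checked numerically. The following is only an illustrative sketch: the maps `f` and `g`, the point `p`, and the approach direction are arbitrary choices made here (they are not from the text), and `U` and `V` are coded from partial derivatives computed by hand for these particular maps.

```python
import math

def f(x):                      # hypothetical f : E^2 -> E^2
    return (x[0] * x[1], x[0] + x[1] ** 2)

def g(y):                      # hypothetical g : E^2 -> E^1
    return math.sin(y[0]) + y[0] * y[1]

def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))

p = (1.0, 2.0)
q = f(p)

def U(t):                      # df(p; t), from hand-computed partials of f
    return (p[1] * t[0] + p[0] * t[1], t[0] + 2.0 * p[1] * t[1])

def V(s):                      # dg(q; s), from hand-computed partials of g
    return (math.cos(q[0]) + q[1]) * s[0] + q[0] * s[1]

def u(x):                      # remainder of f at p
    return sub(sub(f(x), q), U(sub(x, p)))

def v(y):                      # remainder of g at q
    return g(y) - g(q) - V(sub(y, q))

def w(x):                      # remainder of h = g o f at p, with phi = V o U
    return g(f(x)) - g(q) - V(U(sub(x, p)))

gaps, ratios = [], []
for k in range(1, 6):
    eps = 10.0 ** (-k)
    x = (p[0] + eps, p[1] + eps)           # x approaches p along (1, 1)
    gaps.append(abs(w(x) - (v(f(x)) + V(u(x)))))
    ratios.append(abs(w(x)) / math.hypot(*sub(x, p)))
    print(f"eps={eps:.0e}  |w|/|x-p|={ratios[-1]:.3e}  gap={gaps[-1]:.1e}")
```

The `gap` column should stay at rounding level, while the ratio \(|w(\vec{x})| /|\vec{x}-\vec{p}|\) shrinks as \(\vec{x} \rightarrow \vec{p}\). This illustrates the limit along a single direction only and is not a proof.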

Note 1 (Cauchy invariant rule). Under the same assumptions, we also have

\[d h(\vec{p} ; \vec{t})=d g(\vec{q} ; \vec{s}) \tag{7}\]

if \(\vec{s}=d f(\vec{p} ; \vec{t}), \vec{t} \in E^{\prime}\).

Indeed, with \(U\) and \(V\) as above,

\[d h(\vec{p} ; \cdot)=\phi=V \circ U.\]

Thus if

\[\vec{s}=d f(\vec{p} ; \vec{t})=U(\vec{t}),\]

we have

\[d h(\vec{p} ; \vec{t})=\phi(\vec{t})=V(U(\vec{t}))=V(\vec{s})=d g(\vec{q} ; \vec{s}),\]

proving (7).
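
As a rough numerical illustration of the invariant rule, one can estimate each differential by a central difference along the given direction. The maps `f`, `g`, the point `p`, the vector `t`, and the helper `directional` are all hypothetical names and choices introduced here for the sketch.

```python
import math

def f(x):                      # hypothetical f : E^2 -> E^3
    return (x[0] * x[1], math.sin(x[0]), x[1] ** 2)

def g(y):                      # hypothetical g : E^3 -> E^1
    return y[0] * y[2] + math.exp(y[1])

def h(x):
    return g(f(x))

def directional(F, point, direction, eps=1e-6):
    """Central-difference estimate of dF(point; direction)."""
    plus = F(tuple(a + eps * d for a, d in zip(point, direction)))
    minus = F(tuple(a - eps * d for a, d in zip(point, direction)))
    if isinstance(plus, tuple):            # vector-valued F
        return tuple((a - b) / (2 * eps) for a, b in zip(plus, minus))
    return (plus - minus) / (2 * eps)      # scalar-valued F

p = (0.5, -1.2)
t = (2.0, 3.0)
q = f(p)

s = directional(f, p, t)       # s = df(p; t)
lhs = directional(h, p, t)     # dh(p; t)
rhs = directional(g, q, s)     # dg(q; s)
print(lhs, rhs, abs(lhs - rhs))
```

The two values should agree up to finite-difference error. Note that `t` lives in \(E^{\prime}\) while `s` lives in \(E^{\prime \prime}\).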

Note 2. If

\[E^{\prime}=E^{n}\left(C^{n}\right), E^{\prime \prime}=E^{m}\left(C^{m}\right), \text { and } E=E^{r}\left(C^{r}\right)\]

then by Theorem 3 of §2 and Definition 2 in §3, we can write (1) in matrix form,

\[\left[h^{\prime}(\vec{p})\right]=\left[g^{\prime}(\vec{q})\right]\left[f^{\prime}(\vec{p})\right],\]

resembling Theorem 3 in Chapter 5, §1 (with \(f\) and \(g\) interchanged). Moreover, we have the following theorem.
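
The matrix form can be illustrated with finite-difference Jacobians. This is a sketch under assumptions: the maps `f` (from \(E^{2}\) to \(E^{3}\)) and `g` (from \(E^{3}\) to \(E^{2}\)), the point `p`, and the helpers `jacobian` and `matmul` are made up for the example, and the Jacobians are only approximated numerically.

```python
import math

def f(x):                      # hypothetical f : E^2 -> E^3
    return [x[0] * x[1], x[0] + x[1], math.sin(x[1])]

def g(y):                      # hypothetical g : E^3 -> E^2
    return [y[0] * y[1] * y[2], y[0] ** 2 + math.cos(y[2])]

def h(x):
    return g(f(x))

def jacobian(F, point, eps=1e-6):
    """Central-difference Jacobian; entry [i][k] approximates D_k F_i(point)."""
    cols = []
    for k in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[k] += eps
        minus[k] -= eps
        cols.append([(a - b) / (2 * eps) for a, b in zip(F(plus), F(minus))])
    # cols[k][i] holds D_k F_i; transpose so that rows index the components.
    return [list(row) for row in zip(*cols)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

p = [0.7, 1.3]
q = f(p)

Jh = jacobian(h, p)            # 2 x 2
Jg = jacobian(g, q)            # 2 x 3
Jf = jacobian(f, p)            # 3 x 2
product = matmul(Jg, Jf)       # [g'(q)][f'(p)], also 2 x 2

max_diff = max(abs(a - b)
               for row_h, row_p in zip(Jh, product)
               for a, b in zip(row_h, row_p))
print(max_diff)
```

The order of the factors matters: \(\left[g^{\prime}(\vec{q})\right]\) is \(2 \times 3\) and \(\left[f^{\prime}(\vec{p})\right]\) is \(3 \times 2\) here, so only the product in this order has the \(2 \times 2\) shape of \(\left[h^{\prime}(\vec{p})\right]\).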

Theorem 2

With all as in Theorem 1, let

\[E^{\prime}=E^{n}\left(C^{n}\right), E^{\prime \prime}=E^{m}\left(C^{m}\right),\]

and

\[f=\left(f_{1}, \ldots, f_{m}\right).\]

Then

\[D_{k} h(\vec{p})=\sum_{i=1}^{m} D_{i} g(\vec{q}) D_{k} f_{i}(\vec{p});\]

or, in classical notation,

\[\frac{\partial}{\partial x_{k}} h(\vec{p})=\sum_{i=1}^{m} \frac{\partial}{\partial y_{i}} g(\vec{q}) \cdot \frac{\partial}{\partial x_{k}} f_{i}(\vec{p}), \quad k=1,2, \ldots, n. \tag{8}\]

Proof

Fix any basic vector \(\vec{e}_{k}\) in \(E^{\prime}\) and set

\[\vec{s}=d f\left(\vec{p} ; \vec{e}_{k}\right), \quad \vec{s}=\left(s_{1}, \ldots, s_{m}\right) \in E^{\prime \prime}.\]

As \(f\) is differentiable at \(\vec{p},\) so are its components \(f_{i}\) (Problem 9 in §3), and

\[s_{i}=d f_{i}\left(\vec{p} ; \vec{e}_{k}\right)=D_{k} f_{i}(\vec{p})\]

by Theorem 2(ii) in §3. Using also Corollary 1 in §3, we get

\[d g(\vec{q} ; \vec{s})=\sum_{i=1}^{m} s_{i} D_{i} g(\vec{q})=\sum_{i=1}^{m} D_{k} f_{i}(\vec{p}) D_{i} g(\vec{q}).\]

But as \(\vec{s}=d f\left(\vec{p} ; \vec{e}_{k}\right),\) formula (7) yields

\[d g(\vec{q} ; \vec{s})=d h\left(\vec{p} ; \vec{e}_{k}\right)=D_{k} h(\vec{p})\]

by Theorem 2(ii) in §3. Thus the result follows. \(\quad \square\)
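
To see the sum \(\sum_{i} D_{i} g(\vec{q}) D_{k} f_{i}(\vec{p})\) at work, the sketch below takes \(n=3\) and \(m=2\), evaluates the sum from partial derivatives worked out by hand, and compares it with a central-difference estimate of \(D_{k} h(\vec{p})\). The maps, the point `p`, and the helper names `grad_f`, `grad_g`, `partial_h` are assumptions made for this illustration.

```python
import math

def f(x):                      # hypothetical f = (f_1, f_2) : E^3 -> E^2
    return (x[0] * x[1] + x[2], x[0] * math.sin(x[2]))

def g(y):                      # hypothetical g : E^2 -> E^1
    return y[0] ** 2 * y[1] + y[1]

def grad_f(x):
    """Row i holds (D_1 f_i, D_2 f_i, D_3 f_i), derived by hand."""
    return ((x[1], x[0], 1.0),
            (math.sin(x[2]), 0.0, x[0] * math.cos(x[2])))

def grad_g(y):
    """(D_1 g, D_2 g), derived by hand."""
    return (2.0 * y[0] * y[1], y[0] ** 2 + 1.0)

p = (1.5, -0.4, 0.9)
q = f(p)
Dg = grad_g(q)                 # partials of g are taken at q = f(p)
Df = grad_f(p)                 # partials of the f_i are taken at p

# Chain-rule sum for k = 1, 2, 3 (indices 0, 1, 2 in the code).
chain = [sum(Dg[i] * Df[i][k] for i in range(2)) for k in range(3)]

def partial_h(k, eps=1e-6):
    """Central-difference estimate of D_k h(p) for h = g o f."""
    plus = list(p)
    minus = list(p)
    plus[k] += eps
    minus[k] -= eps
    return (g(f(plus)) - g(f(minus))) / (2 * eps)

numeric = [partial_h(k) for k in range(3)]
errors = [abs(c - n) for c, n in zip(chain, numeric)]
print(chain, numeric, errors)
```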

Note 3. Theorem 2 is often called the chain rule for functions of several variables. It yields Theorem 3 in Chapter 5, §1, if \(m=n=1\).

In classical calculus one often speaks of derivatives and differentials of variables \(y=f\left(x_{1}, \ldots, x_{n}\right)\) rather than those of mappings. Thus Theorem 2 is stated as follows.

Let \(u=g\left(y_{1}, \ldots, y_{m}\right)\) be differentiable. If, in turn, each

\[y_{i}=f_{i}\left(x_{1}, \dots, x_{n}\right)\]

is differentiable for \(i=1, \ldots, m,\) then \(u\) is also differentiable as a composite function of the \(n\) variables \(x_{k},\) and (“simplifying” formula (8)) we have

\[\frac{\partial u}{\partial x_{k}}=\sum_{i=1}^{m} \frac{\partial u}{\partial y_{i}} \frac{\partial y_{i}}{\partial x_{k}}, \quad k=1,2, \ldots, n. \tag{8'}\]

It is understood that the partials

\[\frac{\partial u}{\partial x_{k}} \text { and } \frac{\partial y_{i}}{\partial x_{k}} \text { are taken at some } \vec{p} \in E^{\prime},\]

while the \(\partial u / \partial y_{i}\) are at \(\vec{q}=f(\vec{p}),\) where \(f=\left(f_{1}, \ldots, f_{m}\right).\) This “variable” notation is convenient in computations, but may cause ambiguities (see the next example).

Example

Let \(u=g(x, y, z),\) where \(z\) depends on \(x\) and \(y:\)

\[z=f_{3}(x, y).\]

Set \(f_{1}(x, y)=x, f_{2}(x, y)=y, f=\left(f_{1}, f_{2}, f_{3}\right),\) and \(h=g \circ f;\) so

\[h(x, y)=g(x, y, z).\]

By (8'),

\[\frac{\partial u}{\partial x}=\frac{\partial u}{\partial x} \frac{\partial x}{\partial x}+\frac{\partial u}{\partial y} \frac{\partial y}{\partial x}+\frac{\partial u}{\partial z} \frac{\partial z}{\partial x}.\]

Here

\[\frac{\partial x}{\partial x}=\frac{\partial f_{1}}{\partial x}=1 \text { and } \frac{\partial y}{\partial x}=0,\]

for \(f_{2}\) does not depend on \(x.\) Thus we obtain

\[\frac{\partial u}{\partial x}=\frac{\partial u}{\partial x}+\frac{\partial u}{\partial z} \frac{\partial z}{\partial x}. \tag{9}\]

(Question: Is \((\partial u / \partial z)(\partial z / \partial x)=0?\))

The trouble with (9) is that the variable \(u\) “poses” as both \(g\) and \(h.\) On the left, it is \(h;\) on the right, it is \(g.\)

To avoid this, our method is to differentiate well-defined mappings, not “variables.” Thus in (9), we have the maps

\[g : E^{3} \rightarrow E \text { and } f : E^{2} \rightarrow E^{3},\]

with \(f_{1}, f_{2}, f_{3}\) as indicated. Then if \(h=g \circ f,\) Theorem 2 states (9) unambiguously as

\[D_{1} h(\vec{p})=D_{1} g(\vec{q})+D_{3} g(\vec{q}) \cdot D_{1} f_{3}(\vec{p}),\]

where \(\vec{p} \in E^{2}\) and

\[\vec{q}=f(\vec{p})=\left(p_{1}, p_{2}, f_{3}(\vec{p})\right).\]

(Why?) In classical notation,

\[\frac{\partial h}{\partial x}=\frac{\partial g}{\partial x}+\frac{\partial g}{\partial z} \frac{\partial f_{3}}{\partial x}\]

(avoiding the “paradox” of (9)).

Nonetheless, with due caution, one may use the “variable” notation where convenient. The reader should practice both (see the Problems).
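
A concrete instance may make the two meanings of “\(\partial u / \partial x\)” visible. In the sketch below, the particular choices \(g(x, y, z)=x z+y\) and \(f_{3}(x, y)=x^{2}+y\) and the point \((2,1)\) are assumptions made only for illustration; the helper `central` is a plain central-difference quotient.

```python
def g(x, y, z):                # hypothetical g : E^3 -> E^1
    return x * z + y

def f3(x, y):                  # hypothetical dependence z = f_3(x, y)
    return x ** 2 + y

def h(x, y):                   # h = g o f with f = (f_1, f_2, f_3)
    return g(x, y, f3(x, y))

def central(F, a, eps=1e-6):
    """Central-difference estimate of F'(a) for a one-variable F."""
    return (F(a + eps) - F(a - eps)) / (2 * eps)

px, py = 2.0, 1.0
qz = f3(px, py)                # third coordinate of q = f(p)

D1_h = central(lambda x: h(x, py), px)       # left-hand "du/dx": u read as h
D1_g = central(lambda x: g(x, py, qz), px)   # right-hand "du/dx": u read as g, z held fixed
D3_g = central(lambda z: g(px, py, z), qz)
D1_f3 = central(lambda x: f3(x, py), px)

print(D1_h, D1_g, D3_g * D1_f3)
```

For these choices \(h(x, y)=x^{3}+x y+y,\) so analytically \(D_{1} h=3 x^{2}+y=13\) at \((2,1),\) while \(D_{1} g(\vec{q})=z=5\) and \(D_{3} g(\vec{q}) \cdot D_{1} f_{3}(\vec{p})=2 \cdot 4=8.\) The two readings of \(\partial u / \partial x\) differ, and their difference is exactly the product term.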

Note 4. The Cauchy rule (7), in “variable” notation, turns into

\[d u=\sum_{i=1}^{m} \frac{\partial u}{\partial y_{i}} d y_{i}=\sum_{k=1}^{n} \frac{\partial u}{\partial x_{k}} d x_{k}, \tag{10}\]

where \(d x_{k}=t_{k}\) and \(d y_{i}=d f_{i}(\vec{p} ; \vec{t})\).

Indeed, by Corollary 1 in §3,

\[d h(\vec{p} ; \vec{t})=\sum_{k=1}^{n} D_{k} h(\vec{p}) \cdot t_{k} \text { and } d g(\vec{q} ; \vec{s})=\sum_{i=1}^{m} D_{i} g(\vec{q}) \cdot s_{i}.\]

Now, in (7),

\[\vec{s}=\left(s_{1}, \ldots, s_{m}\right)=d f(\vec{p} ; \vec{t});\]

so by Problem 9 in §3,

\[d f_{i}(\vec{p} ; \vec{t})=s_{i}, \quad i=1, \ldots, m.\]

Rewriting all in the “variable” notation, we obtain (10).

The “advantage” of (10) is that \(d u\) has the same form, independently of whether \(u\) is treated as a function of the \(x_{k}\) or of the \(y_{i}\) (hence the name “invariant” rule). However, one must remember the meaning of \(d x_{k}\) and \(d y_{i},\) which are quite different.

The “invariance” also fails completely for differentials of higher order (§5).

The advantages of the “variable” notation vanish unless one is able to “translate” it into precise formulas.