Cross-Entropy Loss Function
```python
import torch
import numpy as np
import torch.nn as nn

# Network output (logits): 3 samples, 5 classes
x = torch.randn(3, 5, requires_grad=True)
"""
tensor([[ 1.3962, -0.2568, -0.7142,  1.1941,  0.5695],
        [-0.7136, -1.0663,  1.7642,  0.5170, -0.1858],
        [ 0.0424, -0.3354, -0.9049,  0.6952,  1.3032]], requires_grad=True)
"""
# Class labels: one integer in [0, 5) per sample
y = torch.empty(3, dtype=torch.long).random_(5)
"""
tensor([4, 3, 0])
"""
```
Softmax
Softmax is the first operation applied to the network output. Its formula can be written as:
\[
\operatorname{softmax}(v)_i = \frac{e^{v_i}}{\sum_{m=1}^K e^{v_m}}, \quad i = 1, \dots, K
\]
Since the raw network outputs can be positive or negative and vary widely in magnitude, Softmax normalizes them into probabilities in \([0,1]\) that sum to 1 across the classes.
```python
# Manual softmax: exponentiate, then normalize each row by its sum
a = torch.exp(x)
a /= torch.unsqueeze(torch.sum(a, dim=1), dim=1)

# Built-in softmax over the class dimension
b = torch.softmax(x, dim=1)
print(f"Manual result: {a}, \nsoftmax result: {b}")
"""
Manual result: tensor([[0.3911, 0.3240, 0.0895, 0.1785, 0.0168],
        [0.3382, 0.1688, 0.3726, 0.0407, 0.0798],
        [0.2213, 0.1558, 0.2684, 0.3061, 0.0484]], grad_fn=<DivBackward0>),
softmax result: tensor([[0.3911, 0.3240, 0.0895, 0.1785, 0.0168],
        [0.3382, 0.1688, 0.3726, 0.0407, 0.0798],
        [0.2213, 0.1558, 0.2684, 0.3061, 0.0484]], grad_fn=<SoftmaxBackward0>)
"""
```
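As a quick sanity check (a minimal sketch, assuming the tensors `a` and `b` above are still in scope), each row of the softmax output is a probability distribution, so every row sums to 1 and the manual and built-in results agree:

```python
print(torch.sum(b, dim=1))   # tensor([1.0000, 1.0000, 1.0000], grad_fn=...)
print(torch.allclose(a, b))  # True, up to floating-point error
```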
Log_Softmax
Log_Softmax applies the natural logarithm (base e) after Softmax. Its formula can be written as:
\[
\log\left(\frac{e^{v_i}}{\sum_{m=1}^K e^{v_m}}\right)
\]
```python
# Manual log of the softmax result vs. the built-in log_softmax
c = torch.log(a)
d = torch.log_softmax(x, dim=1)
print(f"Manual result: {c}, \nlog_softmax result: {d}")
# Both print the same matrix of (negative) log-probabilities,
# i.e. the elementwise log of the softmax values above.
```
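In practice, torch.log_softmax is preferred over composing torch.log with torch.softmax because it is computed in a numerically more stable way. A small illustrative sketch (the input values here are made up for the demonstration):

```python
t = torch.tensor([[1000.0, 0.0]])
print(torch.log(torch.softmax(t, dim=1)))  # tensor([[0., -inf]])   -> underflow
print(torch.log_softmax(t, dim=1))         # tensor([[0., -1000.]]) -> stable
```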
NLLLoss
NLLLoss (negative log-likelihood loss) operates on the Log_Softmax output: for each sample, pick the value at the position given by its label, sum these values, divide by the number of samples, and finally negate. The log-probabilities are negative, so negating turns the loss into a positive value.
For example, given the two rows of log-probabilities \([-1.5425, -1.4425, -1.3425, -1.2425]\) and \([-1.3863, -1.3863, -1.3863, -1.3863]\) with \(target = [2, 3]\), sum the values at the target positions, divide by the number of samples, and negate:
\[
-\frac{(-1.3425) + (-1.3863)}{2} = 1.3644
\]
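The same numbers can be checked directly with nn.NLLLoss (a minimal sketch; the log-probability tensor below is simply the example values written out by hand):

```python
import torch
import torch.nn as nn

log_probs = torch.tensor([[-1.5425, -1.4425, -1.3425, -1.2425],
                          [-1.3863, -1.3863, -1.3863, -1.3863]])
target = torch.tensor([2, 3])
print(nn.NLLLoss()(log_probs, target))  # tensor(1.3644)
```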
Written with one-hot targets, the loss is:
\[
\operatorname{NLL}(\log(\operatorname{softmax}(\text{input})), \text{target}) = -\sum_{i=1}^{n} \operatorname{OneHot}(\text{target})_i \times \log\left(\operatorname{softmax}(\text{input})_i\right), \quad \text{input} \in \mathbf{R}^{m \times n}
\]
```python
# Manual NLL: pick the log-probability at each sample's target position,
# average over the samples, and negate
e = -torch.sum(c[np.arange(len(y)), y]) / len(y)

nll_loss = torch.nn.NLLLoss()
f = nll_loss(c, y)
print(f"Manual result: {e}, NLLLoss result: {f}")
```
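By default NLLLoss averages over the batch (reduction='mean'); passing reduction='none' instead returns the per-sample losses, which makes the pick-the-target-position step explicit (a small sketch, assuming c and y from above are still in scope):

```python
# Per-sample losses: the negated log-probabilities at each target position
per_sample = torch.nn.NLLLoss(reduction='none')(c, y)
print(per_sample)         # three positive values, one per sample
print(per_sample.mean())  # equals the NLLLoss result f above
```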
CrossEntropyLoss
CrossEntropyLoss combines all of the steps above into a single loss:
\(CrossEntropy\_Loss = Softmax + Log + NLLLoss = Log\_Softmax + NLLLoss\)
\[
-\frac{1}{N} \sum_{n=1}^N \log \left(\frac{e^{v_{y_n}}}{\sum_{m=1}^K
e^{v_m}}\right)
\]
```python
# CrossEntropyLoss works directly on the raw logits x:
# it applies log_softmax and NLLLoss internally
cross_loss = torch.nn.CrossEntropyLoss()
g = cross_loss(x, y)
print(f"CrossEntropyLoss: {g}, NLLLoss result: {f}")
```
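Since CrossEntropyLoss is just Log_Softmax followed by NLLLoss, the three losses computed above should coincide (a minimal cross-check, assuming e, f, and g from the snippets above are still in scope):

```python
# Manual pipeline, NLLLoss on log_softmax, and CrossEntropyLoss on raw logits
# all give the same value up to floating-point error
print(torch.allclose(e, f), torch.allclose(f, g))  # True True
```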