This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed. Find sources: "LogSumExp" – news · newspapers · books · scholar · JSTOR(August 2015) (Learn how and when to remove this message)
The LogSumExp (LSE) (also called RealSoftMax[1] or multivariable softplus) function is a smooth maximum – a smooth approximation to the maximum function, mainly used by machine learning algorithms.[2] It is defined as the logarithm of the sum of the exponentials of the arguments:
^Zhang, Aston; Lipton, Zack; Li, Mu; Smola, Alex. "Dive into Deep Learning, Chapter 3 Exercises". www.d2l.ai. Retrieved 27 June 2020.
^Nielsen, Frank; Sun, Ke (2016). "Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities". Entropy. 18 (12): 442. arXiv:1606.05850. Bibcode:2016Entrp..18..442N. doi:10.3390/e18120442. S2CID 17259055.
{\displaystyle \mathrm {LSE} (x_{1},\dots ,x_{n})=\log \left(\exp(x_{1})+\cdots +\exp(x_{n})\right).} The LogSumExp function domain is R n {\displaystyle \mathbb...
Boltzmann distribution. Another smooth maximum is LogSumExp: L S E α ( x 1 , … , x n ) = 1 α log ∑ i = 1 n exp α x i {\displaystyle \mathrm {LSE} _{\alpha...
\mathbb {R} ^{K},} where the LogSumExp function is defined as LSE ( z 1 , … , z n ) = log ( exp ( z 1 ) + ⋯ + exp ( z n ) ) {\displaystyle \operatorname...
(log multiplication), and takes addition to log addition (LogSumExp), giving an isomorphism of semirings between the probability semiring and the log semiring...
operation, logadd (for multiple terms, LogSumExp) can be viewed as a deformation of maximum or minimum. The log semiring has applications in mathematical...
function is a mathematical function denoted by f ( x ) = exp ( x ) {\displaystyle f(x)=\exp(x)} or e x {\displaystyle e^{x}} (where the argument x is...
function, in physics interpreted as free entropy) is the convex conjugate of LogSumExp (in physics interpreted as the free energy). In 1953, Léon Brillouin derived...
of max ( x 0 , x 1 ) {\displaystyle \max(x_{0},x_{1})} , specifically LogSumExp. Softplus thus generalizes as (note the 0 and the corresponding 1 for...
system Logarithmic scale Logarithmic spiral Logarithmic timeline Logit LogSumExp Mantissa is a disambiguation page; see common logarithm for the traditional...
n x ) exp ( x log ( p 1 − p ) + n log ( 1 − p ) ) , {\displaystyle f(x)={n \choose x}\exp \left(x\log \left({\frac {p}{1-p}}\right)+n\log(1-p)\right)...
{\displaystyle x=0.} LogSumExp function, also called softmax function, is a convex function. The function − log det ( X ) {\displaystyle -\log \det(X)} on the...
instances of log(x) without a subscript base should be interpreted as a natural logarithm, also commonly written as ln(x) or loge(x). The sum of the reciprocals...
with the meaning of expm1(x) = exp(x) − 1. An identity in terms of the inverse hyperbolic tangent, l o g 1 p ( x ) = log ( 1 + x ) = 2 a r t a n h...
then log(X) ~ Exp(λ). If X ~ SkewLogistic(θ), then log ( 1 + e − X ) ∼ Exp ( θ ) {\displaystyle \log \left(1+e^{-X}\right)\sim \operatorname {Exp} (\theta...
(\sigma +it)|=\exp \Re \sum _{p^{n}}{\frac {p^{-n(\sigma +it)}}{n}}=\exp \sum _{p^{n}}{\frac {p^{-n\sigma }\cos(t\log p^{n})}{n}},} where the sum is over all...
p − 1 exp ( 2 π i n 2 q p ) = e π i / 4 2 q ∑ n = 0 2 q − 1 exp ( − π i n 2 p 2 q ) {\displaystyle \displaystyle {\dfrac {1}{\sqrt {p}}}\sum _{n=0}^{p-1}\exp...
m k = E [ X k ] ≤ ( k log ( k / λ + 1 ) ) k ≤ λ k exp ( k 2 2 λ ) . {\displaystyle m_{k}=E[X^{k}]\leq \left({\frac {k}{\log(k/\lambda +1)}}\right)^{k}\leq...
above for the sum of differences from the mean): p ( X ∣ μ , τ ) = ∏ i = 1 n τ 2 π exp ( − 1 2 τ ( x i − μ ) 2 ) = ( τ 2 π ) n / 2 exp ( − 1 2 τ ∑...
as the arithmetic mean in logscale: exp ( 1 n ∑ i = 1 n ln a i ) {\displaystyle \exp {\left({{\frac {1}{n}}\sum \limits _{i=1}^{n}\ln a_{i}}\right)}}...
θ), then log X {\textstyle \log X} follows an exponential-gamma (abbreviated exp-gamma) distribution. It is sometimes referred to as the log-gamma distribution...