For the formalism used to approximate the influence of an extracellular electrical field on neurons, see activating function. For a linear system’s transfer function, see transfer function.
The activation function of a node in an artificial neural network is a function that calculates the node's output from its individual inputs and their weights. If the activation function is nonlinear, even a network with only a few nodes can solve nontrivial problems.[1] Modern activation functions include the logistic (sigmoid) function, used in the 2012 speech recognition model developed by Hinton et al.;[3] the ReLU, used in the 2012 AlexNet computer vision model[4][5] and in the 2015 ResNet model; and the GELU, a smooth version of the ReLU, used in the 2018 BERT model.[2]
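The three activation functions named above can be sketched with the standard library alone; this is a minimal illustration of their standard mathematical definitions, not any particular framework's implementation:

```python
import math

def sigmoid(x):
    # Logistic (sigmoid) activation: squashes any real input into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # Rectified linear unit: passes positive inputs through, zeroes out negatives.
    return max(0.0, x)

def gelu(x):
    # Gaussian Error Linear Unit (exact form): x * Phi(x), where Phi is the
    # standard normal CDF. A smooth counterpart of the ReLU.
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Why nonlinearity matters: without it, stacked layers collapse into one
# linear map, so depth adds no expressive power. For scalar "layers"
# f(x) = 2x + 1 and g(x) = -0.5x + 3, the composition g(f(x)) = -x + 2.5
# is itself linear.
f = lambda x: 2.0 * x + 1.0
g = lambda x: -0.5 * x + 3.0
```

A network interleaves such nonlinear functions between its weighted sums; applying, say, `relu` after each linear layer is what prevents the collapse shown in the comment above.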
^ Hinkelmann, Knut. "Neural Networks, p. 7" (PDF). University of Applied Sciences Northwestern Switzerland. Archived from the original (PDF) on 2018-10-06. Retrieved 2018-10-06.
^ Hendrycks, Dan; Gimpel, Kevin (2016). "Gaussian Error Linear Units (GELUs)". arXiv:1606.08415 [cs.LG].
^ Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey E. (2017-05-24). "ImageNet classification with deep convolutional neural networks". Communications of the ACM. 60 (6): 84–90. doi:10.1145/3065386. ISSN 0001-0782.
^ Al-johania, Norah; Elrefaei, Lamiaa (2019-06-30). "Dorsal Hand Vein Recognition by Convolutional Neural Networks: Feature Learning and Transfer Learning Approaches" (PDF). International Journal of Intelligent Engineering and Systems. 12 (3): 178–191. doi:10.22266/ijies2019.0630.19.