深度学习常用术语表 (Glossary of Common Deep Learning Terms)
中文 (Chinese) | 英文 (English) | 缩写 (Abbreviation) |
---|---|---|
深度学习 | deep learning | |
机器学习 | machine learning | |
机器学习模型 | machine learning model | |
逻辑回归 | logistic regression | |
回归 | regression | |
人工智能 | artificial intelligence | |
朴素贝叶斯 | naive Bayes | |
表示 | representation | |
表示学习 | representation learning | |
自编码器 | autoencoder | |
编码器 | encoder | |
解码器 | decoder | |
多层感知机 | multilayer perceptron | |
人工神经网络 | artificial neural network | |
神经网络 | neural network | |
随机梯度下降 | stochastic gradient descent | SGD |
线性模型 | linear model | |
线性回归 | linear regression | |
整流线性单元 | rectified linear unit | ReLU |
分布式表示 | distributed representation | |
非分布式表示 | nondistributed representation | |
非分布式 | nondistributed | |
隐藏单元 | hidden unit | |
长短期记忆 | long short-term memory | LSTM |
深度信念网络 | deep belief network | DBN |
循环神经网络 | recurrent neural network | RNN |
循环 | recurrence | |
强化学习 | reinforcement learning | |
推断 | inference | |
上溢 | overflow | |
下溢 | underflow | |
softmax函数 | softmax function | |
softmax | softmax | |
欠估计 | underestimation | |
过估计 | overestimation | |
病态条件 | poor conditioning | |
目标函数 | objective function | |
目标 | objective | |
准则 | criterion | |
代价函数 | cost function | |
代价 | cost | |
损失函数 | loss function | |
PR曲线 | PR curve | |
F值 | F-score | |
损失 | loss | |
误差函数 | error function | |
梯度下降 | gradient descent | |
导数 | derivative | |
临界点 | critical point | |
驻点 | stationary point | |
局部极小点 | local minimum | |
极小点 | minimum | |
局部极小值 | local minima | |
极小值 | minima | |
全局极小值 | global minima | |
局部极大值 | local maxima | |
极大值 | maxima | |
局部极大点 | local maximum | |
鞍点 | saddle point | |
全局最小点 | global minimum | |
偏导数 | partial derivative | |
梯度 | gradient | |
样本 | example | |
二阶导数 | second derivative | |
曲率 | curvature | |
凸优化 | Convex optimization | |
非凸 | nonconvex | |
数值优化 | numerical optimization | |
约束优化 | constrained optimization | |
可行 | feasible | |
等式约束 | equality constraint | |
不等式约束 | inequality constraint | |
正则化 | regularization | |
正则化项 | regularizer | |
正则化 | regularize | |
泛化 | generalization | |
泛化 | generalize | |
欠拟合 | underfitting | |
过拟合 | overfitting | |
偏差 | bias | |
方差 | variance | |
集成 | ensemble | |
估计 | estimator | |
权重衰减 | weight decay | |
协方差 | covariance | |
稀疏 | sparse | |
特征选择 | feature selection | |
特征提取器 | feature extractor | |
最大后验 | Maximum A Posteriori | MAP |
池化 | pooling | |
Dropout | Dropout | |
蒙特卡罗 | Monte Carlo | |
提前终止 | early stopping | |
卷积神经网络 | convolutional neural network | CNN |
小批量 | minibatch | |
重要采样 | Importance Sampling | |
变分自编码器 | variational auto-encoder | VAE |
计算机视觉 | Computer Vision | CV |
语音识别 | Speech Recognition | |
自然语言处理 | Natural Language Processing | NLP |
有向模型 | Directed Model | |
原始采样 | Ancestral Sampling | |
随机矩阵 | Stochastic Matrix | |
平稳分布 | Stationary Distribution | |
均衡分布 | Equilibrium Distribution | |
索引 | index of matrix | |
磨合 | Burning-in | |
混合时间 | Mixing Time | |
混合 | Mixing | |
Gibbs采样 | Gibbs Sampling | |
吉布斯步数 | Gibbs steps | |
Bagging | bootstrap aggregating | |
掩码 | mask | |
批标准化 | batch normalization | |
参数共享 | parameter sharing | |
KL散度 | KL divergence | |
温度 | temperature | |
临界温度 | critical temperatures | |
并行回火 | parallel tempering | |
自动语音识别 | Automatic Speech Recognition | ASR |
级联 | coalesced | |
数据并行 | data parallelism | |
模型并行 | model parallelism | |
异步随机梯度下降 | Asynchronous Stochastic Gradient Descent | |
参数服务器 | parameter server | |
模型压缩 | model compression | |
动态结构 | dynamic structure | |
隐马尔可夫模型 | Hidden Markov Model | HMM |
高斯混合模型 | Gaussian Mixture Model | GMM |
转录 | transcribe | |
主成分分析 | principal components analysis | PCA |
因子分析 | factor analysis | |
独立成分分析 | independent component analysis | ICA |
稀疏编码 | sparse coding | |
定点运算 | fixed-point arithmetic | |
浮点运算 | floating-point arithmetic | |
生成模型 | generative model | |
生成式建模 | generative modeling | |
数据集增强 | dataset augmentation | |
白化 | whitening | |
深度神经网络 | deep neural network | DNN |
端到端的 | end-to-end | |
图模型 | graphical model | |
有向图模型 | directed graphical model | |
依赖 | dependency | |
贝叶斯网络 | Bayesian network | |
模型平均 | model averaging | |
声明 | statement | |
量子力学 | quantum mechanics | |
亚原子 | subatomic | |
逼真度 | fidelity | |
信任度 | degree of belief | |
频率派概率 | frequentist probability | |
贝叶斯概率 | Bayesian probability | |
似然 | likelihood | |
随机变量 | random variable | |
概率分布 | probability distribution | |
联合概率分布 | joint probability distribution | |
归一化的 | normalized | |
均匀分布 | uniform distribution | |
概率密度函数 | probability density function | |
累积函数 | cumulative function | |
边缘概率分布 | marginal probability distribution | |
求和法则 | sum rule | |
条件概率 | conditional probability | |
干预查询 | intervention query | |
因果模型 | causal modeling | |
因果因子 | causal factor | |
链式法则 | chain rule | |
乘法法则 | product rule | |
相互独立的 | independent | |
条件独立的 | conditionally independent | |
期望 | expectation | |
期望值 | expected value | |
特征 | feature | |
准确率 | accuracy | |
错误率 | error rate | |
训练集 | training set | |
解释因子 | explanatory factor | |
潜在 | underlying | |
潜在成因 | underlying cause | |
测试集 | test set | |
性能度量 | performance measures | |
经验 | experience | |
无监督 | unsupervised | |
有监督 | supervised | |
半监督 | semi-supervised | |
监督学习 | supervised learning | |
无监督学习 | unsupervised learning | |
数据集 | dataset | |
数据点 | data point | |
标签 | label | |
标注 | labeled | |
未标注 | unlabeled | |
目标 | target | |
设计矩阵 | design matrix | |
参数 | parameter | |
权重 | weight | |
均方误差 | mean squared error | MSE |
正规方程 | normal equation | |
训练误差 | training error | |
泛化误差 | generalization error | |
测试误差 | test error | |
假设空间 | hypothesis space | |
容量 | capacity | |
表示容量 | representational capacity | |
有效容量 | effective capacity | |
线性阈值单元 | linear threshold units | |
非参数 | non-parametric | |
最近邻回归 | nearest neighbor regression | |
最近邻 | nearest neighbor | |
验证集 | validation set | |
基准 | benchmark | |
基准 | baseline | |
点估计 | point estimator | |
估计量 | estimator | |
统计量 | statistics | |
无偏 | unbiased | |
有偏 | biased | |
异步 | asynchronous | |
渐近无偏 | asymptotically unbiased | |
标准差 | standard error | |
一致性 | consistency | |
统计效率 | statistical efficiency | |
有参情况 | parametric case | |
贝叶斯统计 | Bayesian statistics | |
先验概率分布 | prior probability distribution | |
最大后验 | maximum a posteriori | |
最大似然估计 | maximum likelihood estimation | |
最大似然 | maximum likelihood | |
核技巧 | kernel trick | |
核函数 | kernel function | |
高斯核 | Gaussian kernel | |
核机器 | kernel machine | |
核方法 | kernel method | |
支持向量 | support vector | |
支持向量机 | support vector machine | SVM |
音素 | phoneme | |
声学 | acoustic | |
语音 | phonetic | |
专家混合体 | mixture of experts | |
高斯混合体 | Gaussian mixtures | |
选通器 | gater | |
专家网络 | expert network | |
注意力机制 | attention mechanism | |
对抗样本 | adversarial example | |
对抗 | adversarial | |
对抗训练 | adversarial training | |
切面距离 | tangent distance | |
正切传播 | tangent prop | |
正切传播 | tangent propagation | |
双反向传播 | double backprop | |
期望最大化 | expectation maximization | EM |
均值场 | mean-field | |
变分推断 | variational inference | |
二值稀疏编码 | binary sparse coding | |
前馈网络 | feedforward network | |
转移 | transition | |
重构 | reconstruction | |
生成随机网络 | generative stochastic network | |
得分匹配 | score matching | |
因子 | factorial | |
分解的 | factorized | |
均匀场 | meanfield | |
概率PCA | probabilistic PCA | |
随机梯度上升 | Stochastic Gradient Ascent | |
团 | clique | |
Dirac分布 | Dirac distribution | |
不动点方程 | fixed point equation | |
变分法 | calculus of variations | |
信念网络 | belief network | |
马尔可夫随机场 | Markov random field | |
马尔可夫网络 | Markov network | |
对数线性模型 | log-linear model | |
自由能 | free energy | |
局部条件概率分布 | local conditional probability distribution | |
条件概率分布 | conditional probability distribution | |
玻尔兹曼分布 | Boltzmann distribution | |
吉布斯分布 | Gibbs distribution | |
能量函数 | energy function | |
标准差 | standard deviation | |
相关系数 | correlation | |
标准正态分布 | standard normal distribution | |
协方差矩阵 | covariance matrix | |
Bernoulli分布 | Bernoulli distribution | |
Bernoulli输出分布 | Bernoulli output distribution | |
Multinoulli分布 | multinoulli distribution | |
Multinoulli输出分布 | multinoulli output distribution | |
范畴分布 | categorical distribution | |
多项式分布 | multinomial distribution | |
正态分布 | normal distribution | |
高斯分布 | Gaussian distribution | |
精度 | precision | |
多维正态分布 | multivariate normal distribution | |
精度矩阵 | precision matrix | |
各向同性 | isotropic | |
指数分布 | exponential distribution | |
指示函数 | indicator function | |
广义函数 | generalized function | |
经验分布 | empirical distribution | |
经验频率 | empirical frequency | |
混合分布 | mixture distribution | |
潜变量 | latent variable | |
隐藏变量 | hidden variable | |
先验概率 | prior probability | |
后验概率 | posterior probability | |
万能近似器 | universal approximator | |
饱和 | saturate | |
分对数 | logit | |
正部函数 | positive part function | |
负部函数 | negative part function | |
贝叶斯规则 | Bayes' rule | |
测度论 | measure theory | |
零测度 | measure zero | |
Jacobian矩阵 | Jacobian matrix | |
自信息 | self-information | |
奈特 | nats | |
比特 | bit | |
香农 | shannons | |
香农熵 | Shannon entropy | |
微分熵 | differential entropy | |
微分方程 | differential equation | |
KL散度 | Kullback-Leibler (KL) divergence | |
交叉熵 | cross-entropy | |
熵 | entropy | |
分解 | factorization | |
结构化概率模型 | structured probabilistic model | |
回退 | back-off | |
有向 | directed | |
无向 | undirected | |
无向图模型 | undirected graphical model | |
成比例 | proportional | |
描述 | description | |
决策树 | decision tree | |
因子图 | factor graph | |
结构学习 | structure learning | |
环状信念传播 | loopy belief propagation | |
卷积网络 | convolutional network | |
卷积网络 | convolutional net | |
主对角线 | main diagonal | |
转置 | transpose | |
广播 | broadcasting | |
矩阵乘积 | matrix product | |
AdaGrad | AdaGrad | |
逐元素乘积 | element-wise product | |
Hadamard乘积 | Hadamard product | |
团势能 | clique potential | |
因子 | factor | |
未归一化概率函数 | unnormalized probability function | |
循环网络 | recurrent network | |
梯度消失与爆炸问题 | vanishing and exploding gradient problem | |
梯度消失 | vanishing gradient | |
梯度爆炸 | exploding gradient | |
计算图 | computational graph | |
展开 | unfolding | |
求逆 | invert | |
时间步 | time step | |
维数灾难 | curse of dimensionality | |
平滑先验 | smoothness prior | |
局部不变性先验 | local constancy prior | |
局部核 | local kernel | |
流形 | manifold | |
流形正切分类器 | manifold tangent classifier | |
流形学习 | manifold learning | |
流形假设 | manifold hypothesis | |
环 | loop | |
弦 | chord | |
弦图 | chordal graph | |
三角形化图 | triangulated graph | |
三角形化 | triangulate | |
风险 | risk | |
经验风险 | empirical risk | |
经验风险最小化 | empirical risk minimization | |
代理损失函数 | surrogate loss function | |
批量 | batch | |
确定性 | deterministic | |
随机 | stochastic | |
在线 | online | |
流 | stream | |
梯度截断 | gradient clipping | |
幂方法 | power method | |
前向传播 | forward propagation | |
反向传播 | backward propagation | |
展开图 | unfolded graph | |
深度前馈网络 | deep feedforward network | |
前馈神经网络 | feedforward neural network | |
前向 | feedforward | |
反馈 | feedback | |
网络 | network | |
深度 | depth | |
输出层 | output layer | |
隐藏层 | hidden layer | |
宽度 | width | |
单元 | unit | |
激活函数 | activation function | |
反向传播 | back propagation | backprop |
泛函 | functional | |
平均绝对误差 | mean absolute error | |
赢者通吃 | winner-take-all | |
异方差 | heteroscedastic | |
混合密度网络 | mixture density network | |
梯度截断 | clip gradient | |
绝对值整流 | absolute value rectification | |
渗漏整流线性单元 | Leaky ReLU | |
参数化整流线性单元 | parametric ReLU | PReLU |
maxout单元 | maxout unit | |
硬双曲正切函数 | hard tanh | |
架构 | architecture | |
操作 | operation | |
符号 | symbol | |
数值 | numeric value | |
动态规划 | dynamic programming | |
自动微分 | automatic differentiation | |
并行分布式处理 | Parallel Distributed Processing | |
稀疏激活 | sparse activation | |
衰减 | damping | |
学成 | learned | |
信息传输 | message passing | |
泛函导数 | functional derivative | |
变分导数 | variational derivative | |
额外误差 | excess error | |
动量 | momentum | |
混沌 | chaos | |
稀疏初始化 | sparse initialization | |
共轭方向 | conjugate directions | |
共轭 | conjugate | |
条件独立 | conditionally independent | |
集成学习 | ensemble learning | |
独立子空间分析 | independent subspace analysis | |
慢特征分析 | slow feature analysis | SFA |
慢性原则 | slowness principle | |
整流线性 | rectified linear | |
整流网络 | rectifier network | |
坐标下降 | coordinate descent | |
坐标上升 | coordinate ascent | |
预训练 | pretraining | |
无监督预训练 | unsupervised pretraining | |
逐层的 | layer-wise | |
贪心算法 | greedy algorithm | |
贪心 | greedy | |
精调 | fine-tuning | |
课程学习 | curriculum learning | |
召回率 | recall | |
覆盖 | coverage | |
超参数优化 | hyperparameter optimization | |
超参数 | hyperparameter | |
网格搜索 | grid search | |
有限差分 | finite difference | |
中心差分 | centered difference | |
储层计算 | reservoir computing | |
谱半径 | spectral radius | |
收缩 | contractive | |
长期依赖 | long-term dependency | |
跳跃连接 | skip connection | |
门控RNN | gated RNN | |
门控 | gated | |
卷积 | convolution | |
输入 | input | |
输入分布 | input distribution | |
输出 | output | |
特征映射 | feature map | |
翻转 | flip | |
稀疏交互 | sparse interactions | |
等变表示 | equivariant representations | |
稀疏连接 | sparse connectivity | |
稀疏权重 | sparse weights | |
接受域 | receptive field | |
绑定的权重 | tied weights | |
等变 | equivariance | |
探测级 | detector stage | |
符号表示 | symbolic representation | |
池化函数 | pooling function | |
最大池化 | max pooling | |
池 | pool | |
不变 | invariant | |
步幅 | stride | |
降采样 | downsampling | |
全 | full | |
非共享卷积 | unshared convolution | |
平铺卷积 | tiled convolution | |
循环卷积网络 | recurrent convolutional network | |
傅立叶变换 | Fourier transform | |
可分离的 | separable | |
初级视觉皮层 | primary visual cortex | |
简单细胞 | simple cell | |
复杂细胞 | complex cell | |
象限对 | quadrature pair | |
门控循环单元 | gated recurrent unit | GRU |
门控循环网络 | gated recurrent net | |
遗忘门 | forget gate | |
截断梯度 | clipping the gradient | |
记忆网络 | memory network | |
神经网络图灵机 | neural Turing machine | NTM |
精调 | fine-tune | |
共因 | common cause | |
编码 | code | |
再循环 | recirculation | |
欠完备 | undercomplete | |
完全图 | complete graph | |
欠定的 | underdetermined | |
过完备 | overcomplete | |
去噪 | denoising | |
去噪 | denoise | |
重构误差 | reconstruction error | |
梯度场 | gradient field | |
得分 | score | |
切平面 | tangent plane | |
最近邻图 | nearest neighbor graph | |
嵌入 | embedding | |
近似推断 | approximate inference | |
信息检索 | information retrieval | |
语义哈希 | semantic hashing | |
降维 | dimensionality reduction | |
对比散度 | contrastive divergence | |
语言模型 | language model | |
标记 | token | |
一元语法 | unigram | |
二元语法 | bigram | |
三元语法 | trigram | |
平滑 | smoothing | |
级联 | cascade | |
模型 | model | |
层 | layer | |
半监督学习 | semi-supervised learning | |
监督模型 | supervised model | |
词嵌入 | word embedding | |
one-hot | one-hot | |
监督预训练 | supervised pretraining | |
迁移学习 | transfer learning | |
学习器 | learner | |
多任务学习 | multitask learning | |
领域自适应 | domain adaptation | |
一次学习 | one-shot learning | |
零次学习 | zero-shot learning | |
零数据学习 | zero-data learning | |
多模态学习 | multimodal learning | |
生成式对抗网络 | generative adversarial network | GAN |
前馈分类器 | feedforward classifier | |
线性分类器 | linear classifier | |
正相 | positive phase | |
负相 | negative phase | |
随机最大似然 | stochastic maximum likelihood | |
噪声对比估计 | noise-contrastive estimation | NCE |
噪声分布 | noise distribution | |
噪声 | noise | |
独立同分布 | independent identically distributed | |
专用集成电路 | application-specific integrated circuit | ASIC |
现场可编程门阵列 | field-programmable gate array | FPGA |
标量 | scalar | |
向量 | vector | |
矩阵 | matrix | |
张量 | tensor | |
点积 | dot product | |
内积 | inner product | |
方阵 | square matrix | |
奇异的 | singular | |
范数 | norm | |
三角不等式 | triangle inequality | |
欧几里得范数 | Euclidean norm | |
最大范数 | max norm | |
对角矩阵 | diagonal matrix | |
对称 | symmetric | |
单位向量 | unit vector | |
单位范数 | unit norm | |
正交 | orthogonal | |
正交矩阵 | orthogonal matrix | |
标准正交 | orthonormal | |
特征分解 | eigendecomposition | |
特征向量 | eigenvector | |
特征值 | eigenvalue | |
分解 | decompose | |
正定 | positive definite | |
负定 | negative definite | |
半负定 | negative semidefinite | |
半正定 | positive semidefinite | |
奇异值分解 | singular value decomposition | SVD |
奇异值 | singular value | |
奇异向量 | singular vector | |
单位矩阵 | identity matrix | |
矩阵逆 | matrix inversion | |
原点 | origin | |
线性组合 | linear combination | |
列空间 | column space | |
值域 | range | |
线性相关 | linear dependency | |
线性无关 | linearly independent | |
列 | column | |
行 | row | |
同分布的 | identically distributed | |
机器翻译 | machine translation | |
推荐系统 | recommender system | |
词袋 | bag of words | |
协同过滤 | collaborative filtering | |
探索 | exploration | |
策略 | policy | |
关系 | relation | |
属性 | attribute | |
词义消歧 | word-sense disambiguation | |
误差度量 | error metric | |
性能度量 | performance metrics | |
共轭梯度 | conjugate gradient | |
在线学习 | online learning | |
逐层预训练 | layer-wise pretraining | |
自回归网络 | auto-regressive network | |
生成器网络 | generator network | |
判别器网络 | discriminator network | |
矩 | moment | |
可见层 | visible layer | |
无限 | infinite | |
容差 | tolerance | |
学习率 | learning rate | |
轮数 | epochs | |
轮 | epoch | |
对数尺度 | logarithmic scale | |
随机搜索 | random search | |
分段 | piecewise | |
汉明距离 | Hamming distance | |
可见变量 | visible variable | |
精确推断 | exact inference | |
潜层 | latent layer | |
知识图谱 | knowledge graph | |