Mathematical Foundations of Reinforcement Learning - notes
P1 Basic Concepts State: the status of the agent with respect to the environment. Grid-world example: the location of the agent ($s_1, s_2, s_3, \ldots, s_9$). State space: the set of all states, $S = \{s_i\}_{i=1}^{9}$. Action: for each state there are several possible actions $a_1, \ldots, a_n$. Action space of a state: the set of all possible actions of that state, $A(s_i) = \{a_j\}_{j=1}^{5}$. State transition: when taking an action, the agent may move from one state to another. It defines the ...
convex_set
source: web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf Convex sets 1. Affine sets Definition of an affine set: a set $C$ is affine if $$\forall x_1, x_2 \in C,\ \forall \theta \in \mathbb{R}: \quad \theta x_1 + (1 - \theta) x_2 \in C$$
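The affine-set definition can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the solution set of a linear system $Ax = b$ as the affine set (the matrix `A`, the vector `b`, the points `x1`, `x2`, and the helper `in_C` are all hypothetical names chosen here, not taken from the notes) and verifies that $\theta x_1 + (1 - \theta) x_2$ stays in the set for real $\theta$, including values outside $[0, 1]$.

```python
import numpy as np

# Assumed example of an affine set: C = {x : A x = b} (hypothetical A and b).
A = np.array([[1.0, 2.0, -1.0]])
b = np.array([3.0])


def in_C(x, tol=1e-9):
    """Return True if x satisfies A x = b up to a small tolerance."""
    return bool(np.allclose(A @ x, b, atol=tol))


# Two points that lie in C.
x1 = np.array([3.0, 0.0, 0.0])  # 1*3 + 2*0 - 0 = 3
x2 = np.array([1.0, 1.0, 0.0])  # 1*1 + 2*1 - 0 = 3

# theta ranges over real numbers, not only the interval [0, 1].
thetas = [-2.0, 0.0, 0.3, 1.0, 5.0]
points = [t * x1 + (1.0 - t) * x2 for t in thetas]
all_inside = all(in_C(p) for p in points)
print(all_inside)
```

Restricting `thetas` to $[0, 1]$ would only test the weaker convexity condition; allowing any real value is what distinguishes an affine set from a merely convex one.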
25 IJCAI Image-Enhanced Hybrid Encoding with Reinforced Contrastive Learning for Spatial Domain Identification in Spatial Transcriptomics
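code link: wdyi701/IE-HERCL paper link: Concept Frobenius inner product: multiply two matrices element-wise and sum the products; it can be used to measure the similarity between matrices. $$\langle A, B \rangle_F = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij} b_{ij}$$ Optimal transport: transform one distribution into another at minimum cost. Earth Mover’s Distance (EMD) [1]: given $P_r$, move it to $P_\theta$ by optimal transport, viewed as an earth-moving task that minimizes the average work. The general form of EMD is: $$\mathrm{EMD}(P_r, P_\theta) = \inf_{\gamma \in \Pi} \sum_{x,y} \|x - y\| \, \gamma(x, y)$$ where $\gamma(x,y)$ is the moving plan, i.e. the mass moved from a given $x$ to $y$, $\|x - y\|$ is the Euclidean distance between the positions $x$ and $y$, $\inf$ is the greatest lower bound (infimum), and $\Pi$ is the set of all possible moving plans. ...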
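Both concepts can be sketched in a few lines of NumPy. This is an illustration only, not the paper's implementation: the function names `frobenius_inner` and `emd_1d` are hypothetical, and the EMD routine handles only the special case of discrete distributions on a sorted one-dimensional support, where the optimal cost is known to equal the area between the two cumulative distribution functions, so no search over moving plans is needed.

```python
import numpy as np


def frobenius_inner(A, B):
    """Frobenius inner product: sum of element-wise products of A and B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.shape != B.shape:
        raise ValueError("A and B must have the same shape")
    return float(np.sum(A * B))


def emd_1d(positions, p_r, p_theta):
    """EMD between two discrete distributions on the same 1-D support.

    Assumes `positions` is sorted in ascending order and that both
    weight vectors sum to the same total mass. In one dimension the
    optimal transport cost equals the area between the two CDFs.
    """
    positions = np.asarray(positions, dtype=float)
    p_r = np.asarray(p_r, dtype=float)
    p_theta = np.asarray(p_theta, dtype=float)
    cdf_gap = np.abs(np.cumsum(p_r) - np.cumsum(p_theta))
    # Mass imbalance to the left of each gap, times the width of that gap.
    return float(np.sum(cdf_gap[:-1] * np.diff(positions)))


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(frobenius_inner(A, B))                       # 1*5 + 2*6 + 3*7 + 4*8
print(emd_1d([0, 1, 2], [1, 0, 0], [0, 0, 1]))     # all mass moves distance 2
```

The Frobenius inner product also equals $\mathrm{tr}(A^\top B)$, which gives a quick consistency check. For distributions in more than one dimension, the infimum over moving plans generally has to be solved as a linear program or with a dedicated optimal-transport solver.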
22 WWW Towards Unsupervised Deep Graph Structure Learning
paper link: Towards Unsupervised Deep Graph Structure Learning code link: TrustAGI-Lab/SUBLIME: [WWW’22] Towards Unsupervised Deep Graph Structure Learning Concepts Metric Learning: learn a transformation that maps data from the original vector space to a new vector space in which similar points are closer together. Overview The authors argue that most existing graph structure learning (GSL) methods are trained under the supervision of a node classification task, and therefore suffer from a reliance on label information and a bias in the learned edge distribution. Concretely, for the latter: node classification usually uses a semi-supervised setting, so nodes under supervision receive more guidance, while far from...