Paper Notes (1): Re-ranking by Multi-feature Fusion with Diffusion for Image Retrieval

0x00 Preliminaries \(\DeclareMathOperator{\vol}{vol}\)

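Random walks on undirected graphs

Let \(G=(V,E)\) be an undirected graph with edge-weight function \(w\colon V\times V \to \mathbb{R}_{\ge 0}\).

If \((u,v) \notin E\) then \(w(u,v) = w(v,u) = 0\); otherwise \(w(u,v) = w(v,u) > 0\).

The degree of a node is \(d(u) = \sum_{v\in V} w(v, u)\), and the volume of the graph is \(\vol V = \sum_{v\in V} d(v)\).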

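Setting aside the details of graph construction for now (for example, how the edge weights, or edge strengths, of \(G_m\) are determined), let us first work through the random walk on \(G_m\).

A random walk on \(G_m\) is simply a random walk on a connected undirected graph.
Once the transition matrix \(\mathbf{P}\) is given, the stationary distribution can be computed.

We use a right stochastic matrix (each row sums to 1) together with column vectors, so that the entry \(p_{ij}\) reads naturally as the probability of moving from state \(i\) to state \(j\).

Note: using column vectors is the usual convention in mathematics.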

Definition 1

A probability distribution \(\pi\) satisfying
\begin{equation}
\pi^{T} = \pi^{T}\mathbf{P} \label{E:1}
\end{equation}
is called a stationary distribution of the transition matrix \(\mathbf{P}\), or of the corresponding HMC (homogeneous Markov chain).

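The transition matrix on \(G_m\) is defined by

\(p_m(v_j | v_i) = e_m(v_i, v_j) / d_m(v_i)\)

where \(e_m\) plays the role of the edge weight \(w\) on \(G_m\) and \(d_m\) is the corresponding degree. Clearly the matrix \(\mathbf{P}_m\) defined this way is right stochastic (row \(i\) sums to \(\sum_j e_m(v_i, v_j) / d_m(v_i) = 1\)), and one can show that the stationary distribution of \(\mathbf{P}_m\) is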
\begin{equation}
\pi_m(v_i) = d_m(v_i) / \vol_m V
\end{equation}

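Proof: All edge weights are positive and the graph is connected, so \(d(v) > 0\) for every \(v\in V\) and the transition probabilities are well defined. Write \(\pi^T \mathbf{P} = (p_1, p_2, \dots, p_n)\), where \(n\) is the number of nodes. Then
\[
\begin{aligned}
p_i &= \sum_{j = 1} ^n \pi_j p_{ji} \\
&= \sum_{j = 1} ^n \frac{d_j}{\vol V} \frac{w(j,i)}{d_j} \\
&= \frac{1}{\vol V} \sum_{j = 1} ^n w(j,i) \\
&= \frac{d_i}{\vol V} \\
&= \pi_i
\end{aligned}
\]
QED.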
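The result can also be checked numerically. The following is a minimal sketch, not code from the paper: the names `Matrix`, `degree_distribution`, `walk_step`, and `stationarity_gap` are made up for illustration, and the graph is passed in as a dense symmetric weight matrix. It builds \(\pi_i = d_i / \vol V\), applies one step of the walk, and measures how far \(\pi^T \mathbf{P}\) is from \(\pi^T\).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type: dense symmetric weight matrix, w[i][j] = w(i, j).
using Matrix = std::vector<std::vector<double>>;

// pi(i) = d(i) / vol(V), where d(i) is the i-th row sum of w.
std::vector<double> degree_distribution(const Matrix& w) {
    const std::size_t n = w.size();
    std::vector<double> pi(n, 0.0);
    double vol = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        assert(w[i].size() == n);  // the matrix must be square
        for (std::size_t j = 0; j < n; ++j) pi[i] += w[i][j];
        vol += pi[i];
    }
    for (double& x : pi) x /= vol;
    return pi;
}

// Returns pi^T P with p_ij = w(i, j) / d(i), so each row of P sums to 1.
std::vector<double> walk_step(const Matrix& w, const std::vector<double>& pi) {
    const std::size_t n = w.size();
    std::vector<double> next(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        double d = 0.0;
        for (std::size_t j = 0; j < n; ++j) d += w[i][j];
        for (std::size_t j = 0; j < n; ++j) next[j] += pi[i] * w[i][j] / d;
    }
    return next;
}

// Largest entry-wise difference between pi and pi^T P; near 0 if pi is stationary.
double stationarity_gap(const Matrix& w) {
    const std::vector<double> pi = degree_distribution(w);
    const std::vector<double> next = walk_step(w, pi);
    double gap = 0.0;
    for (std::size_t i = 0; i < pi.size(); ++i)
        gap = std::max(gap, std::fabs(pi[i] - next[i]));
    return gap;
}
```

For a connected graph with symmetric positive weights, `stationarity_gap` should return a value at floating-point noise level, matching the proof.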

Now consider the random walk on the fused graph \(G\). For this process, the model we propose is

\begin{equation}
p(v_j | v_i) = \sum_m p_m(v_i) p_m(v_j | v_i) \label{E:3}
\end{equation}

where \(p_m(v_i)\) is the probability that a walker at node \(v_i\) switches to graph \(G_m\) for its next step. Intuitively we want \(p_m(v_i) \propto \pi_m(v_i)\), so we assume

\begin{equation}
p_m(v_i) = k_m(v_i) \pi_m(v_i) \label{E:4}
\end{equation}

where the coefficient \(k_m(v_i)\) is unknown. We rely on the following theorem:

Theorem 1

Let \(\mathbf{P}\) be a transition matrix on the countable state space \(E\), and
let \(\pi\) be some probability distribution on \(E\). If for all \(i, j \in E\), the detailed balance
equations (6.8) are satisfied, then \(\pi\) is a stationary distribution of \(\mathbf{P}\).
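As a worked check, the single-graph walk on \(G_m\) already satisfies this condition. The check assumes that (6.8) denotes detailed balance in its standard form, \(\pi(i)\,p_{ij} = \pi(j)\,p_{ji}\), and that the edge strength is symmetric, \(e_m(v_i, v_j) = e_m(v_j, v_i)\), as it must be on an undirected graph:

\[
\begin{aligned}
\pi_m(v_i)\, p_m(v_j \mid v_i)
  &= \frac{d_m(v_i)}{\vol_m V} \cdot \frac{e_m(v_i, v_j)}{d_m(v_i)}
   = \frac{e_m(v_i, v_j)}{\vol_m V} \\
  &= \frac{e_m(v_j, v_i)}{\vol_m V}
   = \frac{d_m(v_j)}{\vol_m V} \cdot \frac{e_m(v_j, v_i)}{d_m(v_j)}
   = \pi_m(v_j)\, p_m(v_i \mid v_j)
\end{aligned}
\]

So the theorem gives a second route to the stationary distribution \(\pi_m(v_i) = d_m(v_i) / \vol_m V\).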


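Dataset

First, we need to be able to extract the features as described in the paper.

2 features built from local descriptors:

  • BOW
  • VLAD

2 global features:

  • GIST
  • HSV

Images are processed with OpenCV.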

OpenCV

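Regardless of an image's (cv::Mat) color model, as long as it is a color image (cv::Mat::channels returns 3), cv::imshow assumes the 3 channels are B, G, R in that order. (That is, BGR in alphabetical order :XD)
Reference 1
Reference 2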

HSV

Below, "color space" and "color model" are used interchangeably and mean the same thing.

Trouble 1: How to detect the color model of an image in OpenCV?

Info: one claim I came across:

When OpenCV loads colored images (i.e. 3 channel) from the disk, camera, or a video file, the image data will be stored in the BGR format.

A similar claim points out that RGB and BGR are two different color models, though they differ only in channel order.

OpenCV has a BGR color space which is used by default. This is similar to the RGB color space except that the B and R channels are physically switched in the image. If the physical channel ordering is important to you, you will need to convert your image with this function: cvCvtColor(defaultBGR, imageRGB, CV_BGR2RGB).

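Problem 1: After converting img to HSV with cvtColor(img, img, CV_BGR2HSV);, the picture shown by imshow differs from the original. My expectation was that a picture's appearance should not depend on the color model (except in cases such as converting a color image to grayscale).
A: Resolved. imshow interprets the three channels as B, G, R whatever they actually hold, so the H, S, V values are displayed as if they were blue, green, and red.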

Some details of how cv::Mat stores image data:

The color-space conversions all use the following conventions: 8-bit images are in the range 0 to 255, 16-bit images are in the range 0 to 65,535, and floating-point numbers are in the range \(0.0\) to \(1.0\). When grayscale images are converted to color images, all components of the resulting image are taken to be equal; but for the reverse transformation (e.g., RGB or BGR to grayscale), the gray value is computed through the perceptually weighted formula:

\[Y = (0.299)R + (0.587)G + (0.114)B\]

In the case of HSV or HLS representations, hue is normally represented as a value from 0 to 360 (excluding 360, of course). This can cause trouble in 8-bit representations and so when you are converting to HSV, the hue is divided by 2 when the output image is an 8-bit image.
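To make the 8-bit convention concrete, here is a minimal per-pixel sketch in plain C++ with no OpenCV dependency. The function name `bgr_to_hsv_8u` is made up for illustration, and the formula is the usual max/min HSV definition. OpenCV's own 8-bit implementation may round slightly differently, so treat this as an approximation of what cvtColor produces, not a replacement for it.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: converts one 8-bit BGR pixel to 8-bit HSV.
// Hue in [0, 360) degrees is halved to fit in [0, 180); S and V are scaled to [0, 255].
std::array<std::uint8_t, 3> bgr_to_hsv_8u(std::uint8_t b, std::uint8_t g, std::uint8_t r) {
    const double bf = b / 255.0, gf = g / 255.0, rf = r / 255.0;
    const double v = std::max({bf, gf, rf});
    const double delta = v - std::min({bf, gf, rf});
    const double s = (v > 0.0) ? delta / v : 0.0;

    double h = 0.0;  // hue in degrees; stays 0 for grey pixels (delta == 0)
    if (delta > 0.0) {
        if (v == rf)      h = 60.0 * (gf - bf) / delta;
        else if (v == gf) h = 120.0 + 60.0 * (bf - rf) / delta;
        else              h = 240.0 + 60.0 * (rf - gf) / delta;
        if (h < 0.0) h += 360.0;
    }
    assert(h >= 0.0 && h < 360.0);

    // Halve the hue for 8-bit storage; wrap 180 back to 0 if rounding pushes it over.
    const long h8 = std::lround(h / 2.0) % 180;
    return {static_cast<std::uint8_t>(h8),
            static_cast<std::uint8_t>(std::lround(s * 255.0)),
            static_cast<std::uint8_t>(std::lround(v * 255.0))};
}
```

Under these assumptions a pure blue pixel (B=255, G=0, R=0) has hue 240 degrees and is stored as H=120. If such an HSV image is handed to imshow, that 120 is drawn as a blue intensity, which is why the converted image looks different from the original.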

Trouble 2: I do not understand the cv::Mat::at method ("method" here is synonymous with member function).
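One way to build intuition: for a 3-channel 8-bit image, `img.at<cv::Vec3b>(row, col)[k]` refers to channel `k` of the pixel at row `row`, column `col`. The row index comes first, and the pixel bytes are stored row by row. The sketch below models that addressing in plain C++. `TinyImage` is a hypothetical stand-in, not the real cv::Mat, and it assumes a tightly packed buffer with no row padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of an interleaved 8-bit image buffer (not cv::Mat).
struct TinyImage {
    int rows, cols, channels;
    std::size_t step;                // bytes per row (no padding in this sketch)
    std::vector<std::uint8_t> data;  // row-major, channels interleaved per pixel

    TinyImage(int r, int c, int ch)
        : rows(r), cols(c), channels(ch),
          step(static_cast<std::size_t>(c) * ch),
          data(static_cast<std::size_t>(r) * step, 0) {}

    // Channel k of the pixel at (row, col): skip whole rows, then whole pixels, then k bytes.
    std::uint8_t& at(int row, int col, int k) {
        assert(row >= 0 && row < rows && col >= 0 && col < cols);
        assert(k >= 0 && k < channels);
        return data[row * step + static_cast<std::size_t>(col) * channels + k];
    }
};
```

With a 2x3 image of 3 channels, `at(1, 2, 0)` lands on byte `1 * 9 + 2 * 3 + 0 = 15`. In a BGR image that byte is the blue value of the last pixel in the second row.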

posted @ 2018-05-03 19:54  Pat