Adding a self-attention module in Keras

Source: 爱玩科技网

Code walkthrough

Without further ado, here is the code.

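from keras import backend as K
from keras.layers import Layer  # with tf.keras, import both from tensorflow.keras instead

class Self_Attention(Layer):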

	def __init__(self, output_dim, **kwargs):
		self.output_dim = output_dim
		super(Self_Attention, self).__init__(**kwargs)

	def build(self, input_shape):
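		# Create the layer's trainable weights: the Q, K and V projection
		# matrices, stacked along the first axis of a single kernel.
		# inputs.shape = (batch_size, time_steps, seq_len); the last axis is the per-step feature size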
		self.kernel = self.add_weight(name='kernel',
									  shape=(3, input_shape[2], self.output_dim),
									  initializer='uniform',
									  trainable=True)

		super(Self_Attention, self).build(input_shape)  # must be called at the end of build()

	def call(self, x):
		WQ = K.dot(x, self.kernel[0])
		WK = K.dot(x, self.kernel[1])
		WV = K.dot(x, self.kernel[2])

		print("WQ.shape", WQ.shape)

		print("K.permute_dimensions(WK, [0, 2, 1]).shape", K.permute_dimensions(WK, [0, 2, 1]).shape)

		QK = K.batch_dot(WQ, K.permute_dimensions(WK, [0, 2, 1]))

		QK = QK / (self.output_dim ** 0.5)

		QK = K.softmax(QK)

		print("QK.shape", QK.shape)

		V = K.batch_dot(QK, WV)

		return V

	def compute_output_shape(self, input_shape):
		return (input_shape[0], input_shape[1], self.output_dim)
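
To make the arithmetic inside call easier to follow, here is a minimal sketch of the same scaled dot-product steps in plain Python, for a single sequence with no batch axis. It is only an illustration, not part of the Keras layer: the helper names (matmul, softmax, self_attention) and the toy input values are assumptions made for this example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention for one sequence.

    x: (time_steps, in_dim); wq, wk, wv: (in_dim, out_dim).
    Returns the attended values and the attention weights.
    """
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    out_dim = len(wq[0])
    k_t = [list(col) for col in zip(*k)]        # transpose K: (out_dim, time_steps)
    scores = matmul(q, k_t)                     # (time_steps, time_steps)
    scaled = [[s / math.sqrt(out_dim) for s in row] for row in scores]
    weights = [softmax(row) for row in scaled]  # each row sums to 1
    return matmul(weights, v), weights

# Toy input (hypothetical values): 3 time steps, 2 features, identity projections.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
print(len(output), len(output[0]))  # time_steps, out_dim
```

The output keeps the number of time steps and takes the projection's output size as its last dimension, which matches what compute_output_shape reports for the layer.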

To use the layer, call it directly on an input tensor:

	intention_token_embedding = Self_Attention(768)(all_token_embedding)

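Debugging run

Set a breakpoint at intention_token_embedding = Self_Attention(768)(all_token_embedding) and choose "step into my code".

The debugger first enters the constructor. Once __init__ finishes, it jumps into the __call__ method of the Layer base class; the debugger's lower-left panel shows the functions executed inside __call__.

Stepping further through __call__, execution reaches the call method invoked inside it. This runs the overridden version: the call method of Self_Attention overrides the call method of Layer.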
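
The order seen in the debugger (constructor, then __call__, then the overridden call) can be reproduced with a toy base class. This is a simplified sketch of the dispatch pattern, not Keras's actual implementation; ToyLayer, Scale and the trace list are hypothetical names introduced only for this illustration.

```python
class ToyLayer:
    """Simplified stand-in for a Keras Layer: __call__ builds once, then delegates to call()."""

    def __init__(self):
        self.built = False
        self.trace = []  # records the order in which methods run

    def build(self, input_shape):
        self.trace.append("build")
        self.built = True

    def call(self, x):
        raise NotImplementedError

    def __call__(self, x):
        self.trace.append("__call__")
        if not self.built:
            # Weights are created lazily, on the first call only.
            self.build((len(x),))
        return self.call(x)


class Scale(ToyLayer):
    """Subclass that overrides call(), as Self_Attention does."""

    def __init__(self, factor):
        super().__init__()
        self.trace.append("__init__")
        self.factor = factor

    def call(self, x):
        self.trace.append("call")
        return [self.factor * v for v in x]


layer = Scale(2)
result = layer([1, 2, 3])  # triggers __call__, then build, then call
print(layer.trace)
```

Calling the same layer object a second time skips build and goes straight from __call__ to call, which is why the weights are created only once.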

That is everything needed to create a self-attention layer.

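You may occasionally run into errors. The most typical one is AttributeError: 'NoneType' object has no attribute '_inbound_nodes'. It usually appears when a backend operation is applied directly to a tensor while building a functional model, instead of being wrapped in a Layer (such as the one above) or a Lambda layer.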
