I had a hard time understanding what Keras tensors really are. They show up in many of the more advanced uses of Keras, but I couldn’t find a simple explanation of what they mean inside Keras. I am writing this as a way to clarify my understanding.
I could not find a description of Keras tensors; however, Keras is implemented on top of TensorFlow and shares the same concepts. This paragraph on the TensorFlow website is what made tensors clearer:
A tensor is a generalization of vectors and matrices to potentially higher dimensions
That one was clear from the beginning. Tensors are matrices of many dimensions. All right. Then I read:
A tf.Tensor object represents a partially defined computation that will eventually produce a value.
That was the part I had missed. The tf.Tensor object is the result of a function that is not yet evaluated.
I feel that it is like f(x) in the mathematical function f(x) = x². If we provide a value for the input x, then f(x) evaluates to something. As long as there is no value, it is just f(x), a partially defined computation (a function).
TensorFlow programs work by first building a graph of tf.Tensor objects, detailing how each tensor is computed based on the other available tensors and then by running parts of this graph to achieve the desired results.
The tf.Tensor object is not just a matrix of many dimensions; it is also linked to other tensors by the way it is computed. The way a tf.Tensor is computed is a function that transforms a tensor A into a tensor B. I suppose we recurse from the output tensor until we reach all the necessary inputs, then we evaluate everything forward.
All of this is about TensorFlow, but I feel that it holds for Keras as well.
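To make the "recurse back to the inputs, then evaluate forward" idea concrete, here is a minimal toy sketch in plain Python. It is only an illustration of the concept: the Node class and the placeholder, square, mean and evaluate helpers are hypothetical names made up for this example, and this is not how Keras or TensorFlow implement tensors internally.

```python
# Toy illustration only: hypothetical classes and helpers,
# not the actual Keras or TensorFlow implementation.

class Node:
    """A symbolic value: it remembers how it is computed but holds no data."""
    def __init__(self, op=None, parents=(), name=None):
        self.op = op            # function that combines the parents' values
        self.parents = parents  # nodes this node depends on
        self.name = name        # only used by input nodes


def placeholder(name):
    # An input node: no op and no parents; its value is supplied at run time.
    return Node(name=name)


def square(node):
    # Element-wise square of a list of numbers.
    return Node(op=lambda v: [x * x for x in v], parents=(node,))


def mean(node):
    # Arithmetic mean of a list of numbers.
    return Node(op=lambda v: sum(v) / len(v), parents=(node,))


def evaluate(node, feed):
    # Recurse from the requested output back to the inputs,
    # then compute the values on the way back up.
    if node.op is None:
        return feed[node.name]
    args = [evaluate(parent, feed) for parent in node.parents]
    return node.op(*args)


i = placeholder("input")
mean_of_square = mean(square(i))  # builds the graph; nothing is computed yet

result = evaluate(mean_of_square, {"input": [2.0, 2.0, 2.0, 2.0]})
print(result)  # 4.0
```

Until evaluate is called with a value for "input", mean_of_square is only a description of a computation, which is how I read "partially defined computation" in the quote above.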
Examples
Identity function
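import numpy as np
from keras import backend as K
# Symbolic input tensor of 4 values; it holds no data yet.
i = K.placeholder(shape=(4,), name="input")
# Function mapping the input tensor to the output tensor (the same tensor here).
f = K.function([i], [i])
ival = np.ones((4,))
print(f([ival]))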
> [array([ 1., 1., 1., 1.], dtype=float32)]
A useless function that takes an input and returns it. i is a tensor and f is a function that takes tensors as its inputs and outputs. When we evaluate it with f([ival]), the tensor graph is walked from the output i back to the input i. Quite easy here :). i is evaluated with its value, then the output is returned by the function.
Square function
import numpy as np
from keras import backend as K
i = K.placeholder(shape=(4,), name="input")
# New tensor defined as the element-wise square of i.
square = K.square(i)
f = K.function([i], [square])
ival = np.ones((4,)) * 2
print(f([ival]))
> [array([ 4., 4., 4., 4.], dtype=float32)]
A function that returns the square of each value of the input. It is the same as before, but we walk the graph through the square function before getting to the input values.
Multiple outputs function
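import numpy as np
from keras import backend as K
i = K.placeholder(shape=(4,), name="input")
square = K.square(i)
mean = K.mean(i)
mean_of_square = K.mean(K.square(i))
# One function can return several output tensors at once.
f = K.function([i], [i, square, mean, mean_of_square])
# Input of 2s, so that it matches the output shown below.
ival = np.ones((4,)) * 2
print(f([ival]))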
> [array([ 2., 2., 2., 2.], dtype=float32), array([ 4., 4., 4., 4.], dtype=float32), 2.0, 4.0]
A function that returns the input, the square, the mean and the mean of the square. We can compose functions.
Gradient function
import numpy as np
from keras import backend as K
i = K.placeholder(shape=(4,), name="input")
square = K.square(i)
# K.gradients returns a list of tensors, one per variable.
grad = K.gradients([square], [i])
f = K.function([i], [i, square] + grad)
ival = np.ones((4,)) * 3
print(f([ival]))
> [array([ 3., 3., 3., 3.], dtype=float32), array([ 9., 9., 9., 9.], dtype=float32), array([ 6., 6., 6., 6.], dtype=float32)]
A function that computes the gradient of square with respect to the variable i. The gradient is itself a tensor, built from the chain of all the functions between square and i. square(x) = x², so d square(x)/dx = 2x. For the input 3, the derivative is 6. We can compute the derivative between two related tensors.
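As a sanity check on the value 6, the derivative can also be approximated numerically without Keras. This is only a sketch in plain Python: numerical_derivative is a hypothetical helper written for this example, and it uses a central finite difference rather than the symbolic differentiation that K.gradients performs.

```python
def square(x):
    return x * x


def numerical_derivative(fn, x, h=1e-5):
    # Central difference approximation: (f(x + h) - f(x - h)) / (2h)
    return (fn(x + h) - fn(x - h)) / (2 * h)


inputs = [3.0, 3.0, 3.0, 3.0]
analytic = [2 * x for x in inputs]  # d(x^2)/dx = 2x
numeric = [numerical_derivative(square, x) for x in inputs]

# The two should agree up to a small numerical error.
for a, n in zip(analytic, numeric):
    assert abs(a - n) < 1e-6

print(analytic)  # [6.0, 6.0, 6.0, 6.0]
```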
Conclusion
I think a few of these examples might be useful in the Keras documentation. I’ll propose them once my understanding has improved a bit. At least this article will help me avoid the curse of knowledge later and remind me of the difficulties along the way.