Tensorflow_note_2

TensorFlow notes based on Stanford's cs20si course (course homepage)
Main topics of this section: Basic Operations, Constants, Variables, Control Dependencies, Feeding inputs, TensorBoard

TensorBoard

The computations you’ll use TensorFlow for - like training a massive deep neural network - can be complex and confusing. To make it easier to understand, debug, and optimize TensorFlow programs, we’ve included a suite of visualization tools called TensorBoard.

import tensorflow as tf
a = tf.constant(2)
b = tf.constant(3)
x = tf.add(a, b)
with tf.Session() as sess:
    # add this line to use TensorBoard. It creates a writer object that writes operations to the event file, stored in the folder ./graphs
    writer = tf.summary.FileWriter('./graphs', sess.graph)
    print sess.run(x)
writer.close() # close the writer when you’re done using it

_note : Create the summary writer after graph definition and before running your session_

Run it

Go to terminal, run:

$ python [yourprogram].py
$ tensorboard --logdir="./graphs" --port 6006

Then open your browser and go to: http://localhost:6006/
[TensorBoard graph visualization]
“Const” and “Const_1” correspond to a and b, and the node “Add” corresponds to x. The names we give them (a, b, and x) are just for us to access them when we need them; they mean nothing to TensorFlow internally. To make TensorBoard display the names of your ops, you have to explicitly name them.

a = tf.constant(2, name="a")
b = tf.constant(3, name="b")
x = tf.add(a, b, name="add")

[TensorBoard graph visualization with named ops]
The graph itself defines the ops and dependencies, but does not display the values. It only cares about the values when we run the session with some values to fetch.
_note : If you’ve run your code several times, there will be multiple event files in ‘~/dev/cs20si/graphs/lecture01’, TF will show only the latest graph and display the warning of multiple event files. To get rid of the warning, delete all the event files you no longer need._


Summary : Learn to use TensorBoard well and often. It will help a lot when you build complicated models.

Constant Types

Link to documentation : https://www.tensorflow.org/api_docs/python/constant_op/

tf.constant(value, dtype=None, shape=None, name='Const', verify_shape=False)
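When shape is specified, the value is used to fill the tensor; as far as I recall the TF 1.x behaviour, a scalar (or the last element of a list) is repeated to fill any remaining entries, while verify_shape=True would instead raise an error on a shape mismatch. A small sketch:

t = tf.constant(2, shape=[2, 2], name="filled")
with tf.Session() as sess:
    print(sess.run(t))  # [[2 2]
                        #  [2 2]]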
# constant of 1d tensor (vector)
a = tf.constant([2, 2], name="vector")
# constant of 2x2 tensor (matrix)
b = tf.constant([[0, 1], [2, 3]], name="b")

tf.zeros(shape, dtype=tf.float32, name=None)
# create a tensor of the given shape with all elements set to zero
tf.zeros([2, 3], tf.int32) ==> [[0, 0, 0], [0, 0, 0]]
# more compact than other constants in the graph def → faster startup (esp. in distributed mode)

tf.ones(shape, dtype=tf.float32, name=None)
# create a tensor of the given shape with all elements set to one
tf.ones([2, 3], tf.int32) ==> [[1, 1, 1], [1, 1, 1]]

tf.zeros_like(input_tensor, dtype=None, name=None, optimize=True)

tf.ones_like(input_tensor, dtype=None, name=None, optimize=True)
# create a tensor with the same shape and type (unless a type is specified) as input_tensor, but all elements are ones
# input_tensor is [[0, 1], [2, 3], [4, 5]]
tf.ones_like(input_tensor) ==> [[1, 1], [1, 1], [1, 1]]

tf.fill(dims, value, name=None)
# create a tensor filled with a scalar value
tf.fill([2, 3], 8) ==> [[8, 8, 8], [8, 8, 8]]
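Note that the ==> annotations above show the values you would get after evaluating the op; the ops themselves only build the graph. A quick check (a minimal sketch):

with tf.Session() as sess:
    print(sess.run(tf.zeros([2, 3], tf.int32)))
    # [[0 0 0]
    #  [0 0 0]]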

Constants as Sequences

tf.linspace(start, stop, num, name=None)
# returns a 1-D Tensor with the same type as start
# creates a sequence of num evenly-spaced values beginning at start. If num > 1, the values in the sequence increase by (stop - start) / (num - 1), so that the last one is exactly stop
# start and stop must be float32 or float64 and have the same type; num must be int32 or int64. All must be scalars
# comparable to, but slightly different from, numpy.linspace
# numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)
tf.linspace(10.0, 13.0, 4, name="linspace") ==> [10.0 11.0 12.0 13.0]

tf.range(start, limit=None, delta=1, dtype=None, name='range')
# creates a sequence of numbers that begins at start and extends by increments of delta (the step size) up to, but not including, limit
# slightly different from range in Python
# 'start' is 3, 'limit' is 18, 'delta' is 3
tf.range(start, limit, delta) ==> [3, 6, 9, 12, 15]
# 'start' is 3, 'limit' is 1, 'delta' is -0.5
tf.range(start, limit, delta) ==> [3, 2.5, 2, 1.5]
# 'limit' is 5
tf.range(limit) ==> [0, 1, 2, 3, 4]

_Note that unlike NumPy or Python sequences, TensorFlow sequences are not iterable._

for _ in np.linspace(0, 10, 4): # OK
for _ in tf.linspace(0, 10, 4): # TypeError("'Tensor' object is not iterable.")
for _ in range(4): # OK
for _ in tf.range(4): # TypeError("'Tensor' object is not iterable.")

Randomly Generated Constants

tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)
tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None,name=None)
tf.random_uniform(shape, minval=0, maxval=None, dtype=tf.float32, seed=None, name=None)
tf.random_shuffle(value, seed=None, name=None)
tf.random_crop(value, size, seed=None, name=None)
tf.multinomial(logits, num_samples, seed=None, name=None)
tf.random_gamma(shape, alpha, beta=None, dtype=tf.float32, seed=None, name=None)

tf.set_random_seed(seed)

To generate different sequences across sessions, set neither graph-level nor op-level seeds:

a = tf.random_uniform([1])
b = tf.random_normal([1])

print("Session 1")
with tf.Session() as sess1:
    print(sess1.run(a)) # generates 'A1'
    print(sess1.run(a)) # generates 'A2'
    print(sess1.run(b)) # generates 'B1'
    print(sess1.run(b)) # generates 'B2'

print("Session 2")
with tf.Session() as sess2:
    print(sess2.run(a)) # generates 'A3'
    print(sess2.run(a)) # generates 'A4'
    print(sess2.run(b)) # generates 'B3'
    print(sess2.run(b)) # generates 'B4'


To generate the same repeatable sequence for an op across sessions, set the seed for the op:

a = tf.random_uniform([1], seed=1)
b = tf.random_normal([1])

# Repeatedly running this block with the same graph will generate the same
# sequence of values for 'a', but different sequences of values for 'b'.
print("Session 1")
with tf.Session() as sess1:
    print(sess1.run(a)) # generates 'A1'
    print(sess1.run(a)) # generates 'A2'
    print(sess1.run(b)) # generates 'B1'
    print(sess1.run(b)) # generates 'B2'

print("Session 2")
with tf.Session() as sess2:
    print(sess2.run(a)) # generates 'A1'
    print(sess2.run(a)) # generates 'A2'
    print(sess2.run(b)) # generates 'B3'
    print(sess2.run(b)) # generates 'B4'


To make the random sequences generated by all ops be repeatable across sessions, set a graph-level seed:

tf.set_random_seed(1234)
a = tf.random_uniform([1])
b = tf.random_normal([1])

# Repeatedly running this block with the same graph will generate the same
# sequences of 'a' and 'b'.
print("Session 1")
with tf.Session() as sess1:
    print(sess1.run(a)) # generates 'A1'
    print(sess1.run(a)) # generates 'A2'
    print(sess1.run(b)) # generates 'B1'
    print(sess1.run(b)) # generates 'B2'

print("Session 2")
with tf.Session() as sess2:
    print(sess2.run(a)) # generates 'A1'
    print(sess2.run(a)) # generates 'A2'
    print(sess2.run(b)) # generates 'B1'
    print(sess2.run(b)) # generates 'B2'

_More details in https://www.tensorflow.org/api_docs/python/tf/set_random_seed_

Math Operations

a = tf.constant([3, 6])
b = tf.constant([2, 2])
tf.add(a, b) # >> [5 8]
tf.add_n([a, b, b]) # >> [7 10]. Equivalent to a + b + b
tf.multiply(a, b) # >> [6 12] because multiply is element-wise
tf.matmul(a, b) # >> ValueError
tf.matmul(tf.reshape(a, [1, 2]), tf.reshape(b, [2, 1])) # >> [[18]]
tf.div(a, b) # >> [1 3]
tf.mod(a, b) # >> [1 0]

_More details in https://www.tensorflow.org/api_guides/python/math_ops_

Data Types

Python Native Types

TensorFlow takes in Python native types such as Python boolean values, numeric values (integers, floats), and strings. Single values will be converted to 0-d tensors (or scalars), lists of
values will be converted to 1-d tensors (vectors), lists of lists of values will be converted to 2-d tensors (matrices), and so on. Example below is adapted and modified from the book
“TensorFlow for Machine Intelligence”.

t_0 = 19  # Treated as a 0-d tensor, or "scalar"
tf.zeros_like(t_0) # ==> 0
tf.ones_like(t_0) # ==> 1
t_1 = [b"apple", b"peach", b"grape"] # treated as a 1-d tensor, or "vector"
tf.zeros_like(t_1) # ==> ['' '' '']
tf.ones_like(t_1) # ==> TypeError: Expected string, got 1 of type 'int' instead.
t_2 = [[True, False, False],
[False, False, True],
[False, True, False]
] # treated as a 2-d tensor, or "matrix"
tf.zeros_like(t_2) # ==> 3x3 tensor, all elements are False
tf.ones_like(t_2) # ==> 3x3 tensor, all elements are True

TensorFlow Native Types

Like NumPy, TensorFlow also has its own data types, as you’ve seen: tf.int32, tf.float32, etc. Below is a list of current TensorFlow data types, taken from TensorFlow’s official documentation.

[Table: TensorFlow data types]

NumPy Data Types

TensorFlow was designed to integrate seamlessly with NumPy, the package that has become the lingua franca of data science.
TensorFlow’s data types are based on those of NumPy; in fact, np.int32 == tf.int32 returns True. You can pass NumPy types to TensorFlow ops.

tf.ones([2, 2], np.float32) ==> [[1.0 1.0], [1.0 1.0]]
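For instance, a NumPy array can be handed to a TensorFlow op directly (a minimal sketch; the array values are just an example):

import numpy as np
import tensorflow as tf

arr = np.array([[1, 2], [3, 4]], dtype=np.int32)
t = tf.constant(arr)        # a NumPy array is accepted directly
s = tf.reduce_sum(t)        # the np.int32 dtype maps to tf.int32
with tf.Session() as sess:
    print(sess.run(s))      # 10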

Summary

  • Do not use Python native types for tensors, because TensorFlow has to infer the Python type.
  • Beware when using NumPy arrays, because NumPy and TensorFlow might become less compatible in the future!
  • Both TensorFlow and NumPy are n-d array libraries. NumPy supports ndarray, but doesn’t offer methods to create tensor functions, automatically compute derivatives, or use the GPU. So TensorFlow still wins!
  • It’s possible to convert the data into the appropriate type when you pass it into TensorFlow, but certain data types may still be difficult to declare correctly, such as complex numbers. Because of this, it is common to create hand-defined Tensor objects as NumPy arrays. However, always use TensorFlow types when possible, because both TensorFlow and NumPy can evolve to a point where such compatibility no longer exists.

Variables

The difference between a constant and a variable:

  1. A constant is constant. A variable can be assigned to; its value can be changed.
  2. A constant’s value is stored in the graph and replicated wherever the graph is loaded. A variable is stored separately, and may live on a parameter server.

Point 2 basically means that constants are stored in the graph definition. When constants are memory-expensive, loading the graph becomes slow every time you have to do it.
To see what’s stored in the graph’s definition, simply print out the graph’s protobuf. Protobuf stands for protocol buffer:

“Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.”

import tensorflow as tf
my_const = tf.constant([1.0, 2.0], name="my_const")
print tf.get_default_graph().as_graph_def()

Output:
node {
  name: "my_const"
  op: "Const"
  attr {
    key: "dtype"
    value {
      type: DT_FLOAT
    }
  }
  attr {
    key: "value"
    value {
      tensor {
        dtype: DT_FLOAT
        tensor_shape {
          dim {
            size: 2
          }
        }
        tensor_content: "\000\000\200?\000\000\000@"
      }
    }
  }
}
versions {
  producer: 21
}

_Only use constants for primitive types. Use variables or readers for data that requires more memory._

Declare Variables

To declare a variable, you create an instance of the class tf.Variable. Note that it’s tf.constant (lowercase c) but tf.Variable (uppercase V), because tf.constant is an op, while tf.Variable is a class.
When building a machine learning model it is often convenient to distinguish between variables holding the trainable model parameters and other variables such as a global step variable used to count training steps. To make this easier, the variable constructor supports a trainable= parameter. If True, the new variable is also added to the graph collection GraphKeys.TRAINABLE_VARIABLES. The convenience function trainable_variables() returns the contents of this collection. The various Optimizer classes use this collection as the default list of variables to optimize.

# create variable a with scalar value
a = tf.Variable(2, name="scalar")
# create variable b as a vector
b = tf.Variable([2, 3], name="vector")
# create variable c as a 2x2 matrix
c = tf.Variable([[0, 1], [2, 3]], name="matrix")
# create variable W as 784 x 10 tensor, filled with zeros
W = tf.Variable(tf.zeros([784, 10]))
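Following up on the trainable= parameter described above, here is a small illustration (the variable names are just examples):

# global_step is bookkeeping, not a model parameter, so it is excluded from training
global_step = tf.Variable(0, trainable=False, name="global_step")
W = tf.Variable(tf.zeros([784, 10]), name="weights")  # trainable=True by default
print(tf.trainable_variables())  # includes 'weights' but not 'global_step'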

Initialize Variables

You have to initialize variables before using them.
If you try to evaluate a variable before initializing it, you’ll run into FailedPreconditionError: Attempting to use uninitialized value tensor.

# The easiest way is initializing all variables at once
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init) # Note that you use sess.run() to run the initializer, not fetching any value

To initialize only a subset of variables, you use tf.variables_initializer() with a list of variables you want to initialize :

init_ab = tf.variables_initializer([a, b], name="init_ab")
with tf.Session() as sess:
    sess.run(init_ab)

You can also initialize each variable separately using tf.Variable.initializer :

W = tf.Variable(tf.zeros([784, 10]))
with tf.Session() as sess:
    sess.run(W.initializer)

Evaluate Values of Variables

To get the value of a variable, we need to evaluate it using eval() :

# W is a random 2 x 3 variable object
W = tf.Variable(tf.truncated_normal([2, 3]))
with tf.Session() as sess:
    sess.run(W.initializer)
    print W # <tf.Variable 'Variable:0' shape=(2, 3) dtype=float32_ref>
    print W.eval() # [[-0.15255323 -0.55641884 1.33864951]
                   #  [ 0.09549548 0.63010901 0.84027511]]
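Inside the with block the session is the default session, so eval() is just a shorthand for running the variable in that session; the following is equivalent (a small sketch):

with tf.Session() as sess:
    sess.run(W.initializer)
    print(sess.run(W))  # same values as W.eval() inside this block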

Assign Values to Variables

We can assign a value to a variable using tf.Variable.assign() :

W = tf.Variable(10)
W.assign(100)
with tf.Session() as sess:
    sess.run(W.initializer)
    print W.eval() # 10

Why 10 and not 100? W.assign(100) doesn’t assign the value 100 to W; it creates an assign op to do that. For this op to take effect, we have to run it in a session.

W = tf.Variable(10)
assign_op = W.assign(100)
with tf.Session() as sess:
    sess.run(assign_op)
    print W.eval() # 100

Note that we don’t have to initialize W in this case, because assign() does it for us. In fact, the initializer op is just an assign op that assigns the variable’s initial value to the variable itself.

a = tf.Variable(2, name="scalar")
# assign a * 2 to a and call that op a_times_two
a_times_two = a.assign(a * 2)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    print sess.run(init) # None
    # have to initialize a, because a_times_two op depends on the value of a
    print sess.run(a_times_two) # 4
    print sess.run(a_times_two) # 8
    print sess.run(a_times_two) # 16

Unlike tf.Variable.assign(), tf.Variable.assign_add() and tf.Variable.assign_sub() don’t initialize your variables for you, because these ops depend on the initial values of the variable.

my_var = tf.Variable(10)
with tf.Session() as sess:
    sess.run(my_var.initializer)
    # increment by 10
    sess.run(my_var.assign_add(10)) # 20
    # decrement by 2
    sess.run(my_var.assign_sub(2)) # 18

Because TensorFlow sessions maintain values separately, each Session can have its own current value for a variable defined in a graph.

W = tf.Variable(10)
sess1 = tf.Session()
sess2 = tf.Session()
sess1.run(W.initializer)
sess2.run(W.initializer)
print sess1.run(W.assign_add(10)) # 20
print sess2.run(W.assign_sub(2)) # 8
print sess1.run(W.assign_add(100)) # 120
print sess2.run(W.assign_sub(50)) # -42
sess1.close()
sess2.close()

Use a variable to initialize another variable :

# not safe
W = tf.Variable(2)
U = tf.Variable(W * 2)
with tf.Session() as sess:
    sess.run(W.initializer)
    sess.run(U.initializer)
    print U.eval()

In this case, you should use initialized_value() to make sure that W is initialized before its value is used to initialize U.

W = tf.Variable(2)
U = tf.Variable(W.initialized_value() * 2)
# ensure that W is initialized before its value is used to initialize U
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print U.eval()

_More details in https://www.tensorflow.org/api_docs/python/tf/Variable._

InteractiveSession

The only difference is that an InteractiveSession makes itself the default session, so you can call run() or eval() without explicitly passing the session. This is convenient in interactive shells and IPython notebooks, as it avoids having to pass an explicit Session object to run ops. However, it gets complicated when you have multiple sessions to run.

sess = tf.InteractiveSession()
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b
# We can just use 'c.eval()' without passing 'sess'
print(c.eval())
sess.close()

Control Dependencies

Sometimes we have two independent ops but want to specify which op should run first; in that case, use tf.Graph.control_dependencies(control_inputs).

# your graph g has 5 ops: a, b, c, d, e
with g.control_dependencies([a, b, c]):
    # 'd' and 'e' will only run after 'a', 'b', and 'c' have executed
    d = ...
    e = ...
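As a concrete, runnable sketch (using tf.control_dependencies, which applies to the default graph; the op names here are just illustrative):

a = tf.constant(1, name="a")
b = tf.constant(2, name="b")
c = tf.constant(3, name="c")
with tf.control_dependencies([a, b, c]):
    # d and e get control inputs on a, b, c: when they are run,
    # a, b, and c are guaranteed to have executed first
    d = tf.add(a, b, name="d")
    e = tf.multiply(b, c, name="e")

with tf.Session() as sess:
    print(sess.run([d, e]))  # [3, 6]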

Placeholders

tf.placeholder(dtype, shape=None, name=None) :

  • dtype is a required parameter that specifies the data type of the value of the placeholder.
  • shape specifies the shape of the tensor that can be accepted as the actual value for the placeholder. shape=None means that tensors of any shape will be accepted. Using shape=None makes it easy to construct graphs, but nightmarish for debugging. You should always define the shape of your placeholders in as much detail as possible.
    # create a placeholder of type float 32-bit, shape is a vector of 3 elements
    a = tf.placeholder(tf.float32, shape=[3])
    # create a constant of type float 32-bit, shape is a vector of 3 elements
    b = tf.constant([5, 5, 5], tf.float32)
    # use the placeholder as you would a constant or a variable
    c = a + b # short for tf.add(a, b)
    with tf.Session() as sess:
        print sess.run(c) # Error because a doesn’t have any value

Feed the values to placeholders using a dictionary :

# create a placeholder of type float 32-bit, shape is a vector of 3 elements
a = tf.placeholder(tf.float32, shape=[3])
# create a constant of type float 32-bit, shape is a vector of 3 elements
b = tf.constant([5, 5, 5], tf.float32)
# use the placeholder as you would a constant or a variable
c = a + b # short for tf.add(a, b)
with tf.Session() as sess:
    # feed [1, 2, 3] to placeholder a via the dict {a: [1, 2, 3]}
    # fetch value of c
    print sess.run(c, {a: [1, 2, 3]}) # the tensor a is the key, not the string ‘a’
    # >> [6, 7, 8]

Here, shape=[3] means the placeholder accepts a vector of 3 elements.
We can feed as many data points to the placeholder as we want by iterating through the data set and feeding in one value at a time.

with tf.Session() as sess:
    for a_value in list_of_values_for_a:
        print sess.run(c, {a: a_value})
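A concrete version of this loop, reusing the placeholder a and op c defined above (list_of_values_for_a is just an illustrative name and these values are made up), might look like this:

list_of_values_for_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
with tf.Session() as sess:
    for a_value in list_of_values_for_a:
        print(sess.run(c, feed_dict={a: a_value}))
# >> [6. 7. 8.]
# >> [9. 10. 11.]
# >> [12. 13. 14.]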

You can feed_dict any feedable tensor. A placeholder is just a way to indicate that something must be fed.
You can feed values to tensors that aren’t placeholders: any tensor that is feedable can be fed. To check whether a tensor is feedable, use:

tf.Graph.is_feedable(tensor)
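For example (a quick sketch; the tensor here is just illustrative):

g = tf.get_default_graph()
a = tf.constant(3, name="a")
print(g.is_feedable(a))  # True, so 'a' could be overridden via feed_dict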

Feeding Values to TF Ops

# create operations, tensors, etc (using the default graph)
a = tf.add(2, 5)
b = tf.multiply(a, 3)
with tf.Session() as sess:
    # define a dictionary that says to replace the value of 'a' with 15
    replace_dict = {a: 15}
    # Run the session, passing in 'replace_dict' as the value to 'feed_dict'
    sess.run(b, feed_dict=replace_dict) # returns 45

feed_dict can be extremely useful to test your model. When you have a large graph and just want to test out certain parts, you can provide dummy values so TensorFlow won’t waste time doing unnecessary computations.
_More details in https://www.tensorflow.org/api_docs/python/tf/placeholder._

The Trap of Lazy Loading

Lazy loading is a term that refers to a programming pattern where you defer declaring/initializing an object until it is loaded. In the context of TensorFlow, it means you defer creating an op until you need to compute it.

Normal Loading

x = tf.Variable(10, name='x')
y = tf.Variable(20, name='y')
z = tf.add(x, y) # you create the add node before executing the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter('./my_graph/l2', sess.graph)
    for _ in range(10):
        sess.run(z)
    writer.close()

Lazy Loading

x = tf.Variable(10, name='x')
y = tf.Variable(20, name='y')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter('./my_graph/l2', sess.graph)
    for _ in range(10):
        sess.run(tf.add(x, y)) # someone decides to be clever and save one line of code
    writer.close()

Both give the same value of z. What’s the problem?
Let’s look at their graphs on TensorBoard. The normal loading graph looks just like we expected.

[TensorBoard graph: normal loading]
[TensorBoard graph: lazy loading]
Well, the node “Add” is missing, which is understandable since we added the node “Add” after we’d written the graph to FileWriter. This makes it harder to read the graph, but it’s not a bug.
Let’s look at the graph definition. The protobuf for the graph in normal loading has only one node “Add”. On the other hand, the protobuf for the graph in lazy loading has 10 copies of the node “Add”: it adds a new node “Add” every time you want to compute z!

# normal loading
node {
  name: "Add"
  op: "Add"
  input: "x/read"
  input: "y/read"
  attr {
    key: "T"
    value {
      type: DT_INT32
    }
  }
}

# lazy loading
node {
  name: "Add"
  op: "Add"
  ...
}
...
node {
  name: "Add_9"
  op: "Add"
  ...
}

You probably think: “This is stupid. Why would I want to compute the same value more than once?” and assume it’s a bug nobody would ever commit. It happens more often than you think. For example, you might want to compute the same loss function or make some prediction after a certain number of training samples. Before you know it, you’ve computed it thousands of times, and added thousands of unnecessary nodes to your graph. Your graph definition becomes bloated, slow to load, and expensive to pass around.

There are two ways to avoid this bug :

  • Always separate the definition of ops from their execution when you can.
  • When that is not possible because you want to group related ops into classes, you can use a Python property to ensure that the op is created only once, the first time it is needed (see the sketch below).

_More details in https://danijar.com/structuring-your-tensorflow-models/_
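A minimal sketch of the property-based approach (class and attribute names here are illustrative, following the pattern from the linked article):

class Model(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self._add = None

    @property
    def add(self):
        # the op is created only the first time the property is accessed,
        # so repeated calls reuse the same node instead of adding new ones
        if self._add is None:
            self._add = tf.add(self.x, self.y)
        return self._add

x = tf.Variable(10, name='x')
y = tf.Variable(20, name='y')
model = Model(x, y)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(10):
        sess.run(model.add)  # only one "Add" node ends up in the graph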