TensorFlow notes based on Stanford's CS20SI course; see the course homepage.
Main topics of this section: Basic Operations, Constants, Variables, Control Dependencies, Feeding Inputs, TensorBoard.
TensorBoard
The computations you’ll use TensorFlow for - like training a massive deep neural network - can be complex and confusing. To make it easier to understand, debug, and optimize TensorFlow programs, we’ve included a suite of visualization tools called TensorBoard.
```python
import tensorflow as tf
```
_note : Create the summary writer after graph definition and before running your session_
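A minimal sketch of how that fits together; the constant values and the `./graphs` log directory are assumptions for illustration (chosen to match the `tensorboard` command below), not taken from the original example:

```python
import tensorflow as tf

# define the graph first
a = tf.constant(2)
b = tf.constant(3)
x = tf.add(a, b)

# create the writer after the graph is defined and before running the session
writer = tf.summary.FileWriter('./graphs', tf.get_default_graph())

with tf.Session() as sess:
    print(sess.run(x))  # >> 5

writer.close()  # close the writer when you are done with it
```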
Run it
Go to the terminal and run:

```
$ python [yourprogram].py
$ tensorboard --logdir="./graphs" --port 6006
```
Then open your browser and go to: http://localhost:6006/
“Const” and “Const_1” correspond to a and b, and the node “Add” corresponds to x. The names we give them (a, b, and x) are only for us to access them when we need them; they mean nothing to TensorFlow internally. To make TensorBoard display the names of your ops, you have to name them explicitly:

```python
a = tf.constant(2, name="a")
b = tf.constant(3, name="b")
x = tf.add(a, b, name="add")
```
The graph itself defines the ops and their dependencies, but does not display the values. It only cares about the values when we run the session with some values to fetch in mind.
_note : If you’ve run your code several times, there will be multiple event files in ‘~/dev/cs20si/graphs/lecture01’. TF will show only the latest graph and display a warning about multiple event files. To get rid of the warning, delete all the event files you no longer need._
Summary: Learn to use TensorBoard well and often. It will help a lot when you build complicated models.
Constant Types
Link to documentation: https://www.tensorflow.org/api_docs/python/constant_op/
```python
tf.constant(value, dtype=None, shape=None, name='Const', verify_shape=False)
```

```python
# constant of 1d tensor (vector)
a = tf.constant([2, 2], name="vector")
```
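The rest of the original example was lost in formatting; for completeness, here is a small sketch of creating and evaluating a constant (the specific name and values are illustrative assumptions):

```python
import tensorflow as tf

# constant of 2x2 tensor (matrix)
b = tf.constant([[0, 1], [2, 3]], name="matrix")

with tf.Session() as sess:
    print(sess.run(b))  # >> [[0 1]
                        #     [2 3]]
```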
Constants as Sequences
```python
tf.linspace(start, stop, num, name=None)
tf.range(start, limit=None, delta=1, dtype=None, name='range')
```
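As a quick illustration, a hedged sketch of both ops (the specific arguments are assumptions; the commented outputs are what these calls produce in TensorFlow 1.x):

```python
import tensorflow as tf

lin = tf.linspace(10.0, 13.0, 4)  # start and stop must be floats
rng = tf.range(3, 18, 3)          # like Python's range, but returns a tensor

with tf.Session() as sess:
    print(sess.run(lin))  # >> [10. 11. 12. 13.]
    print(sess.run(rng))  # >> [ 3  6  9 12 15]
```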
_Note that unlike NumPy or Python sequences, TensorFlow sequences are not iterable:_

```python
for _ in np.linspace(0, 10, 4):  # OK
for _ in tf.linspace(0.0, 10.0, 4):  # TypeError: 'Tensor' object is not iterable.
for _ in range(4):  # OK
for _ in tf.range(4):  # TypeError: 'Tensor' object is not iterable.
```
Randomly Generated Constants
```python
tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)
tf.set_random_seed(seed)
```
To generate different sequences across sessions, set neither graph-level nor op-level seeds:

```python
a = tf.random_uniform([1])
b = tf.random_normal([1])
print("Session 1")
with tf.Session() as sess1:
    print(sess1.run(a))  # generates 'A1'
    print(sess1.run(a))  # generates 'A2'
    print(sess1.run(b))  # generates 'B1'
    print(sess1.run(b))  # generates 'B2'
print("Session 2")
with tf.Session() as sess2:
    print(sess2.run(a))  # generates 'A3'
    print(sess2.run(a))  # generates 'A4'
    print(sess2.run(b))  # generates 'B3'
    print(sess2.run(b))  # generates 'B4'
```
To generate the same repeatable sequence for an op across sessions, set the seed for the op:

```python
a = tf.random_uniform([1], seed=1)
b = tf.random_normal([1])
# Repeatedly running this block with the same graph will generate the same
# sequence of values for 'a', but different sequences of values for 'b'.
print("Session 1")
with tf.Session() as sess1:
    print(sess1.run(a))  # generates 'A1'
    print(sess1.run(a))  # generates 'A2'
    print(sess1.run(b))  # generates 'B1'
    print(sess1.run(b))  # generates 'B2'
print("Session 2")
with tf.Session() as sess2:
    print(sess2.run(a))  # generates 'A1'
    print(sess2.run(a))  # generates 'A2'
    print(sess2.run(b))  # generates 'B3'
    print(sess2.run(b))  # generates 'B4'
```
To make the random sequences generated by all ops repeatable across sessions, set a graph-level seed:

```python
tf.set_random_seed(1234)
a = tf.random_uniform([1])
b = tf.random_normal([1])
# Repeatedly running this block with the same graph will generate the same
# sequences of 'a' and 'b'.
print("Session 1")
with tf.Session() as sess1:
    print(sess1.run(a))  # generates 'A1'
    print(sess1.run(a))  # generates 'A2'
    print(sess1.run(b))  # generates 'B1'
    print(sess1.run(b))  # generates 'B2'
print("Session 2")
with tf.Session() as sess2:
    print(sess2.run(a))  # generates 'A1'
    print(sess2.run(a))  # generates 'A2'
    print(sess2.run(b))  # generates 'B1'
    print(sess2.run(b))  # generates 'B2'
```
_More details in https://www.tensorflow.org/api_docs/python/tf/set_random_seed_
Math Operations
```python
a = tf.constant([3, 6])
```
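The rest of the original example was lost in formatting; below is a minimal sketch of common math ops (the second constant b and the commented outputs are illustrative assumptions):

```python
import tensorflow as tf

a = tf.constant([3, 6])
b = tf.constant([2, 2])

with tf.Session() as sess:
    print(sess.run(tf.add(a, b)))         # >> [5 8]
    print(sess.run(tf.add_n([a, b, b])))  # a + b + b  >> [ 7 10]
    print(sess.run(tf.multiply(a, b)))    # element-wise  >> [ 6 12]
    # tf.matmul(a, b) would raise an error: both operands are rank-1
    print(sess.run(tf.matmul(tf.reshape(a, [1, 2]),
                             tf.reshape(b, [2, 1]))))  # >> [[18]]
```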
_More details in https://www.tensorflow.org/api_guides/python/math_ops_
Data Types
Python Native Types
TensorFlow takes in Python native types such as Python boolean values, numeric values (integers, floats), and strings. Single values will be converted to 0-d tensors (scalars), lists of values will be converted to 1-d tensors (vectors), lists of lists of values will be converted to 2-d tensors (matrices), and so on. The example below is adapted and modified from the book “TensorFlow for Machine Intelligence”.
```python
t_0 = 19  # treated as a 0-d tensor, or "scalar"
tf.zeros_like(t_0)  # ==> 0
tf.ones_like(t_0)   # ==> 1

t_1 = [b"apple", b"peach", b"grape"]  # treated as a 1-d tensor, or "vector"
tf.zeros_like(t_1)  # ==> ['' '' '']
tf.ones_like(t_1)   # ==> TypeError: Expected string, got 1 of type 'int' instead.

t_2 = [[True, False, False],
       [False, False, True],
       [False, True, False]]  # treated as a 2-d tensor, or "matrix"
tf.zeros_like(t_2)  # ==> 3x3 tensor, all elements are False
tf.ones_like(t_2)   # ==> 3x3 tensor, all elements are True
```
TensorFlow Native Types
Like NumPy, TensorFlow also has its own data types, as you’ve already seen: tf.int32, tf.float32, and so on. Below is a list of current TensorFlow data types, taken from TensorFlow’s official documentation.
NumPy Data Types
TensorFlow was designed to integrate seamlessly with NumPy, the package that has become the lingua franca of data science.
TensorFlow’s data types are based on those of NumPy; in fact, np.int32 == tf.int32 returns True. You can pass NumPy types to TensorFlow ops:

```python
tf.ones([2, 2], np.float32)  # ==> [[1.0 1.0], [1.0 1.0]]
```
Summary
- Do not use Python native types for tensors because TensorFlow has to infer the Python type.
- Beware when using NumPy arrays because NumPy and TensorFlow might become less compatible in the future!
- Both TensorFlow and NumPy are n-d array libraries. NumPy supports ndarray, but doesn’t offer methods to create tensor functions, automatically compute derivatives, or run on GPUs. So TensorFlow still wins!
- It’s possible to convert the data into the appropriate type when you pass it into TensorFlow, but certain data types may still be difficult to declare correctly, such as complex numbers. Because of this, it is common to create hand-defined Tensor objects as NumPy arrays. However, always use TensorFlow types when possible, because both TensorFlow and NumPy can evolve to a point where such compatibility no longer exists.
Variables
The difference between a constant and a variable:
- A constant is constant; a variable can be assigned to, and its value can be changed.
- A constant’s value is stored in the graph and is replicated wherever the graph is loaded. A variable is stored separately, and may live on a parameter server.

Point 2 basically means that constants are stored in the graph definition. When constants are memory-expensive, loading the graph will be slow each time.
To see what’s stored in the graph’s definition, simply print out the graph’s protobuf. Protobuf stands for protocol buffer:
“Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.”
```python
import tensorflow as tf
```
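The original snippet was truncated; a minimal sketch of printing the graph’s protobuf might look like this (the constant my_const is an illustrative name, not from the original):

```python
import tensorflow as tf

my_const = tf.constant([1.0, 2.0], name="my_const")

with tf.Session() as sess:
    # print the serialized graph definition (the protobuf),
    # which includes the constant's value
    print(sess.graph.as_graph_def())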
_Only use constants for primitive types. Use variables or readers for data that requires more memory._
Declare Variables
To declare a variable, you create an instance of the class tf.Variable. Note that it’s tf.constant with a lowercase c but tf.Variable with an uppercase V, not tf.variable, because tf.constant is an op, while tf.Variable is a class.
When building a machine learning model it is often convenient to distinguish between variables holding the trainable model parameters and other variables such as a global step variable used to count training steps. To make this easier, the variable constructor supports a trainable=<bool> parameter; if it is True (the default), the variable is also added to the graph collection GraphKeys.TRAINABLE_VARIABLES.

```python
# create variable a with scalar value
a = tf.Variable(2, name="scalar")
# create variable b as a vector
b = tf.Variable([2, 3], name="vector")
# create variable c as a 2x2 matrix
c = tf.Variable([[0, 1], [2, 3]], name="matrix")
# create variable W as 784 x 10 tensor, filled with zeros
W = tf.Variable(tf.zeros([784, 10]))
```
Initialize Variables
You have to initialize variables before using them.
If you try to evaluate the variables before initializing them you’ll run into FailedPreconditionError: Attempting to use uninitialized value tensor.
```python
# The easiest way is initializing all variables at once:
init = tf.global_variables_initializer()
```
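The original example was cut off; a minimal sketch of actually running the initializer, assuming some variables have already been declared as above:

```python
init = tf.global_variables_initializer()
with tf.Session() as sess:
    # run the init op to initialize all declared variables
    sess.run(init)
```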
To initialize only a subset of variables, use tf.variables_initializer() with a list of the variables you want to initialize:

```python
init_ab = tf.variables_initializer([a, b], name="init_ab")
with tf.Session() as sess:
    sess.run(init_ab)
```
You can also initialize each variable separately using tf.Variable.initializer:

```python
W = tf.Variable(tf.zeros([784, 10]))
with tf.Session() as sess:
    sess.run(W.initializer)
```
Evaluate Values of Variables
To get the value of a variable, we need to evaluate it using eval():

```python
# W is a random 2 x 3 variable object
W = tf.Variable(tf.truncated_normal([2, 3]))
with tf.Session() as sess:
    sess.run(W.initializer)
    print(W)         # <tf.Variable 'Variable:0' shape=(2, 3) dtype=float32_ref>
    print(W.eval())  # [[-0.15255323 -0.55641884  1.33864951]
                     #  [ 0.09549548  0.63010901  0.84027511]]
```
Assign Values to Variables
We can assign a value to a variable using tf.Variable.assign():

```python
W = tf.Variable(10)
W.assign(100)
with tf.Session() as sess:
    sess.run(W.initializer)
    print(W.eval())  # 10
```
Why 10 and not 100? W.assign(100) doesn’t assign the value 100 to W; instead, it creates an assign op. For this op to take effect, we have to run it in a session:

```python
W = tf.Variable(10)
assign_op = W.assign(100)
with tf.Session() as sess:
    sess.run(assign_op)
    print(W.eval())  # 100
```
Note that we don’t have to initialize W in this case, because assign() does it for us. In fact, the initializer op is just an assign op that assigns the variable’s initial value to the variable itself.

```python
a = tf.Variable(2, name="scalar")
# assign a * 2 to a and call that op a_times_two
a_times_two = a.assign(a * 2)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    print(sess.run(init))  # None
    # have to initialize a, because a_times_two op depends on the value of a
    print(sess.run(a_times_two))  # 4
    print(sess.run(a_times_two))  # 8
    print(sess.run(a_times_two))  # 16
```
Unlike tf.Variable.assign(), tf.Variable.assign_add() and tf.Variable.assign_sub() don’t initialize your variables for you, because these ops depend on the initial values of the variable.

```python
my_var = tf.Variable(10)
with tf.Session() as sess:
    sess.run(my_var.initializer)
    # increment by 10
    sess.run(my_var.assign_add(10))  # 20
    # decrement by 2
    sess.run(my_var.assign_sub(2))   # 18
```
Because TensorFlow sessions maintain values separately, each session can have its own current value for a variable defined in a graph.

```python
W = tf.Variable(10)
sess1 = tf.Session()
sess2 = tf.Session()
sess1.run(W.initializer)
sess2.run(W.initializer)
print(sess1.run(W.assign_add(10)))   # 20
print(sess2.run(W.assign_sub(2)))    # 8
print(sess1.run(W.assign_add(100)))  # 120
print(sess2.run(W.assign_sub(50)))   # -42
sess1.close()
sess2.close()
```
Use a variable to initialize another variable:

```python
# not safe
W = tf.Variable(2)
U = tf.Variable(W * 2)
with tf.Session() as sess:
    sess.run(W.initializer)
    sess.run(U.initializer)
    print(U.eval())
```
In this case, you should use initialized_value() to make sure that W is initialized before its value is used to initialize U:

```python
W = tf.Variable(2)
U = tf.Variable(W.initialized_value() * 2)
# ensure that W is initialized before its value is used to initialize U
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(U.eval())
```
_More details in https://www.tensorflow.org/api_docs/python/tf/Variable._
InteractiveSession
The only difference is that an InteractiveSession makes itself the default session, so you can call run() or eval() without explicitly passing the session. This is convenient in interactive shells and IPython notebooks, as it avoids having to pass an explicit Session object to run ops. However, it gets complicated when you have multiple sessions to run.

```python
sess = tf.InteractiveSession()
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b
# We can just use 'c.eval()' without passing 'sess'
print(c.eval())
sess.close()
```
Control Dependencies
Sometimes you will have two independent ops but you’d like to specify which op should be run first; in that case, you use tf.Graph.control_dependencies(control_inputs):

```python
# your graph g has 5 ops: a, b, c, d, e
with g.control_dependencies([a, b, c]):
    # 'd' and 'e' will only run after 'a', 'b', and 'c' have executed
    d = ...
    e = ...
```
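A concrete, hedged sketch of what this might look like with real ops (the constants, the tf.Assert check, and the shown output are illustrative assumptions, not from the original):

```python
import tensorflow as tf

a = tf.constant(5.0)
b = tf.constant(6.0)

# an op with a side effect that we want to run before 'prod' is computed
check = tf.Assert(tf.less(a, b), ["a must be less than b"])

with tf.get_default_graph().control_dependencies([check]):
    # 'prod' will only run after 'check' has executed
    prod = tf.multiply(a, b)

with tf.Session() as sess:
    print(sess.run(prod))  # >> 30.0
```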
Placeholders
tf.placeholder(dtype, shape=None, name=None):
- dtype is the required parameter that specifies the data type of the value of the placeholder.
- shape specifies the shape of the tensor that can be accepted as the actual value of the placeholder. shape=None means that a tensor of any shape will be accepted. Using shape=None makes it easy to construct graphs, but it is nightmarish for debugging. You should always define the shape of your placeholders in as much detail as possible.
```python
# create a placeholder of type float 32-bit, shape is a vector of 3 elements
a = tf.placeholder(tf.float32, shape=[3])
# create a constant of type float 32-bit, shape is a vector of 3 elements
b = tf.constant([5, 5, 5], tf.float32)
# use the placeholder as you would a constant or a variable
c = a + b  # short for tf.add(a, b)
with tf.Session() as sess:
    print(sess.run(c))  # Error because a doesn't have any value
```
Feed the values to placeholders using a dictionary:

```python
# create a placeholder of type float 32-bit, shape is a vector of 3 elements
a = tf.placeholder(tf.float32, shape=[3])
# create a constant of type float 32-bit, shape is a vector of 3 elements
b = tf.constant([5, 5, 5], tf.float32)
# use the placeholder as you would a constant or a variable
c = a + b  # short for tf.add(a, b)
with tf.Session() as sess:
    # feed [1, 2, 3] to placeholder a via the dict {a: [1, 2, 3]}
    # and fetch the value of c
    print(sess.run(c, {a: [1, 2, 3]}))  # the tensor a is the key, not the string 'a'
    # >> [6. 7. 8.]
```
We can feed as many data points to the placeholder as we want by iterating through the data set and feeding in the values one at a time:

```python
with tf.Session() as sess:
    for a_value in list_of_values_for_a:
        print(sess.run(c, {a: a_value}))
```
You can feed_dict any feedable tensor. A placeholder is just a way to indicate that something must be fed.
You can also feed values to tensors that aren’t placeholders: any tensor that is feedable can be fed. To check whether a tensor is feedable, use:

```python
tf.Graph.is_feedable(tensor)
```
Feeding Values to TF Ops
```python
# create operations, tensors, etc (using the default graph)
```
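The original example was truncated; below is a minimal sketch of feeding a value to an intermediate op’s output (the specific ops and values are illustrative assumptions):

```python
import tensorflow as tf

# create operations, tensors, etc (using the default graph)
a = tf.add(2, 5)
b = tf.multiply(a, 3)

with tf.Session() as sess:
    # compute b normally: (2 + 5) * 3
    print(sess.run(b))                     # >> 21
    # feed a replacement value for the intermediate tensor a
    print(sess.run(b, feed_dict={a: 15}))  # >> 45
```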
feed_dict can be extremely useful to test your model. When you have a large graph and just want to test out certain parts, you can provide dummy values so TensorFlow won’t waste time doing unnecessary computations.
_More details in https://www.tensorflow.org/api_docs/python/tf/placeholder._
The Trap of Lazy Loading
Lazy loading is a term that refers to a programming pattern in which you defer declaring/initializing an object until it is needed. In the context of TensorFlow, it means you defer creating an op until you need to compute it.
Normal Loading
```python
x = tf.Variable(10, name='x')
```
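The normal-loading snippet above was truncated; a hedged sketch of what it presumably contains, mirroring the lazy-loading version below (the './my_graph/l2' path and the loop come from that version):

```python
x = tf.Variable(10, name='x')
y = tf.Variable(20, name='y')
z = tf.add(x, y)  # the op is created once, when the graph is built

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter('./my_graph/l2', sess.graph)
    for _ in range(10):
        sess.run(z)
    writer.close()
```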
Lazy Loading

```python
x = tf.Variable(10, name='x')
y = tf.Variable(20, name='y')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter('./my_graph/l2', sess.graph)
    for _ in range(10):
        sess.run(tf.add(x, y))  # someone decides to be clever and save one line of code
    writer.close()
```
Both give the same value of z. What’s the problem?
Let’s look at their graphs on TensorBoard. The normal-loading graph looks just as we expected.
In the lazy-loading graph, however, the node “Add” is missing, which is understandable since we added the node “Add” after we had written the graph to FileWriter. This makes the graph harder to read, but it’s not a bug.
Let’s look at the graph definition. The protobuf for the graph in normal loading has only one node “Add”. On the other hand, the protobuf for the graph in lazy loading has 10 copies of the node “Add”: it adds a new node “Add” every time you want to compute z!

```
# normal loading
node {
  name: "Add"
  op: "Add"
  input: "x/read"
  input: "y/read"
  attr {
    key: "T"
    value {
      type: DT_INT32
    }
  }
}

# lazy loading
node {
  name: "Add"
  op: "Add"
  ...
}
...
node {
  name: "Add_9"
  op: "Add"
  ...
}
```
You probably think: “This is stupid. Why would I want to compute the same value more than once?” and think that it’s a bug nobody would ever commit. It happens more often than you’d think. For example, you might want to compute the same loss function or make some prediction after every certain number of training samples. Before you know it, you’ve computed it thousands of times, and added thousands of unnecessary nodes to your graph. Your graph definition becomes bloated, slow to load, and expensive to pass around.
There are two ways to avoid this bug:
- Always separate the definition of ops and their execution when you can.
- When that is not possible because you want to group related ops into classes, you can use a Python property to ensure that each op is only created once, the first time it is accessed (see the sketch below).
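A hedged sketch of that property-based pattern; the decorator name lazy_property and the Model class are illustrative assumptions, loosely following the approach described at the link below:

```python
import functools
import tensorflow as tf

def lazy_property(function):
    """Create the wrapped op the first time it is accessed, then cache it."""
    attribute = '_cache_' + function.__name__

    @property
    @functools.wraps(function)
    def wrapper(self):
        if not hasattr(self, attribute):
            setattr(self, attribute, function(self))
        return getattr(self, attribute)
    return wrapper

class Model:
    def __init__(self):
        self.x = tf.Variable(10, name='x')
        self.y = tf.Variable(20, name='y')

    @lazy_property
    def add(self):
        # this tf.add node is created only once, on first access
        return tf.add(self.x, self.y)
```

With this pattern, model.add returns the same graph node no matter how many times you access it inside a training loop, so the graph definition stays small.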
_More details in https://danijar.com/structuring-your-tensorflow-models/_