### 目标

``````* 理解算法分析的重要性
* 能使用"Big-O"描述算法的执行时间
* 理解常用的Python数组和字典的"Big-O"执行时间
* 理解Python数据的实现是如何影响到算法分析的
* 理解如何对简单的Python程序进行性能基准测试
``````

### 算法分析

#### 程序对比

``````def sumOfN(n):
theSum = 0
for i in range(1,n+1):
theSum = theSum + i

return theSum

print(sumOfN(10))
``````

``````def foo(tom):
fred = 0
for bill in range(1,tom+1):
barney = bill
fred = fred + barney

return fred

print(foo(10))
``````

#### BenchMark

BenchMark求和程序:

``````import time

def sumOfN2(n):
start = time.time()

theSum = 0
for i in range(1,n+1):
theSum = theSum + i

end = time.time()

return theSum,end-start

for i in range(5):
print("Sum is %d required %10.7f seconds"%sumOfN2(10000))
``````

``````\$ python2 bench.py
Sum is 50005000 required  0.0014350 seconds
Sum is 50005000 required  0.0013280 seconds
Sum is 50005000 required  0.0012429 seconds
Sum is 50005000 required  0.0012109 seconds
Sum is 50005000 required  0.0012770 seconds
``````

``````import time

def sumOfN3(n):
start = time.time()

theSum = 0
theSum = (n*(n+1))/2

end = time.time()

return theSum,end-start

print("Sum is %d required %10.7f seconds"%sumOfN3(10))
print("Sum is %d required %10.7f seconds"%sumOfN3(10000))
print("Sum is %d required %10.7f seconds"%sumOfN3(100000))
print("Sum is %d required %10.7f seconds"%sumOfN3(1000000))
print("Sum is %d required %10.7f seconds"%sumOfN3(10000000))
print("Sum is %d required %10.7f seconds"%sumOfN3(100000000))
``````

``````\$ python2 bench2.py
Sum is 55 required  0.0000019 seconds
Sum is 50005000 required  0.0000000 seconds
Sum is 5000050000 required  0.0000010 seconds
Sum is 500000500000 required  0.0000012 seconds
Sum is 50000005000000 required  0.0000010 seconds
Sum is 5000000050000000 required  0.0000000 seconds
``````

BenchMark告诉我们的：直观的来看，使用迭代干的活更多，因而花费的时间更长。 随着N增大，迭代所需要花费的时间也随之增长。然而，这可能会导致问题，`sumOfN3` 在老旧架构的电脑上运行时，可能会花费更长的时间。

BenchMark提供给我们观察运行时间的方法，然而它取决于实际机器的运行情况、编程 语言、编译器等等诸多要素。而对于算法本身优劣而言，我们需要一种脱离于以上这些 制约要素的衡量方法。这种衡量方法应该仅仅针对于算法本身，在不同算法的实现之间 找寻出其优劣性。

### Big-O表示法

`sumOfN`的实现为:

``````theSum = 0
theSum = theSum + i
``````

T(n)=1+n中，当n增大时，1可以被忽略，因而执行时间就是O(n). 1也是很重要的，然而随着 n的增大，没有1计算结果也会近似于n.

n不大时，各种算法优劣相差不大，然而n增大时，很容易看到各自之间的差别。

### 字谜检测例子

``````def anagramSolution1(s1,s2):
alist = list(s2)

pos1 = 0
stillOK = True

while pos1 < len(s1) and stillOK:
pos2 = 0
found = False
if s1[pos1] == alist[pos2]:
found = True
else:
pos2 = pos2 + 1

if found:
alist[pos2] = None
else:
stillOK = False

pos1 = pos1 + 1

return stillOK

print(anagramSolution1('abced','dcbae'))
``````

``````def anagramSolution2(s1,s2):
alist1 = list(s1)
alist2 = list(s2)

alist1.sort()
alist2.sort()

pos = 0
matches = True

while pos < len(s1) and matches:
if alist1[pos]==alist2[pos]:
pos = pos + 1
else:
matches = False

return matches

print(anagramSolution2('abcde','edcba'))
``````

n!远大于2^n，譬如如果s1有20个字符，那么有20！种可能结果，用普通家用计算机，不太现实。 因而穷举在这里不符合。

``````def anagramSolution4(s1,s2):
c1 = [0]*26
c2 = [0]*26

for i in range(len(s1)):
pos = ord(s1[i])-ord('a')
c1[pos] = c1[pos] + 1

for i in range(len(s2)):
pos = ord(s2[i])-ord('a')
c2[pos] = c2[pos] + 1

j = 0
stillOK = True
while j<26 and stillOK:
if c1[j]==c2[j]:
j = j + 1
else:
stillOK = False

return stillOK

print(anagramSolution4('apple','pleap'))
``````

### 列表

python中列表的算法复杂度:

``````import timeit
from timeit import Timer
def test1():
l = []
for i in range(1000):
l = l + [i]

def test2():
l = []
for i in range(1000):
l.append(i)

def test3():
l = [i for i in range(1000)]

def test4():
l = list(range(1000))

t1 = Timer("test1()", "from __main__ import test1")
print("concat ",t1.timeit(number=1000), "milliseconds")
t2 = Timer("test2()", "from __main__ import test2")
print("append ",t2.timeit(number=1000), "milliseconds")
t3 = Timer("test3()", "from __main__ import test3")
print("comprehension ",t3.timeit(number=1000), "milliseconds")
t4 = Timer("test4()", "from __main__ import test4")
print("list range ",t4.timeit(number=1000), "milliseconds")
``````

``````('concat ', 1.8388030529022217, 'milliseconds')
('append ', 0.09388995170593262, 'milliseconds')
('comprehension ', 0.042823076248168945, 'milliseconds')
('list range ', 0.01230001449584961, 'milliseconds')
``````

concate的结果为O(k)，k为数组的大小。

``````Operation 	Big-O Efficiency
index [] 	O(1)
index assignment 	O(1)
append 	O(1)
pop() 	O(1)
pop(i) 	O(n)
insert(i,item) 	O(n)
del operator 	O(n)
iteration 	O(n)
contains (in) 	O(n)
get slice [x:y] 	O(k)
del slice 	O(n)
set slice 	O(n+k)
reverse 	O(n)
concatenate 	O(k)
sort 	O(n log n)
multiply 	O(nk)
``````

pop()与pop(n)的例子:

``````from timeit import Timer
import timeit
popzero = Timer("x.pop(0)",
"from __main__ import x")
popend = Timer("x.pop()",
"from __main__ import x")
print("pop(0)   pop()")
for i in range(1000000,100000001,1000000):
x = list(range(i))
pt = popend.timeit(number=1000)
x = list(range(i))
pz = popzero.timeit(number=1000)
print("%15.5f, %15.5f" %(pz,pt))
``````

``````\$ python2 timer.py
pop(0)   pop()
1.39908,         0.00019
2.87106,         0.00018
4.65139,         0.00018
6.42484,         0.00018
7.80698,         0.00018
9.34029,         0.00016
10.97498,         0.00017
12.16070,         0.00018
``````

### 字典

``````operation 	Big-O Efficiency
copy 	O(n)
get item 	O(1)
set item 	O(1)
delete item 	O(1)
contains (in) 	O(1)
iteration 	O(n)
``````

``````import timeit
import random

for i in range(10000,1000001,20000):
t = timeit.Timer("random.randrange(%d) in x"%i,
"from __main__ import random,x")
x = list(range(i))
lst_time = t.timeit(number=1000)
x = {j:None for j in range(i)}
d_time = t.timeit(number=1000)
print("%d,%10.3f,%10.3f" % (i, lst_time, d_time))
``````

``````\$ python2 dic.py
10000,     0.160,     0.001
30000,     0.339,     0.001
50000,     0.570,     0.001
70000,     0.801,     0.001
90000,     1.037,     0.001
110000,     1.273,     0.001
130000,     1.470,     0.001
150000,     1.685,     0.001
170000,     1.961,     0.001
190000,     2.214,     0.001
210000,     2.574,     0.001
230000,     2.781,     0.001
250000,     3.094,     0.001
``````

https://wiki.python.org/moin/TimeComplexity