Algorithm

Algorithmic Thinking

Peak finding

1-Dimensional array

1. Def: in an array [..., a, b, c, ...], b is a peak iff b ≥ a and b ≥ c. An element at either end only has to be ≥ its single neighbor.
2. Straightforward algorithm: start from the left and walk across the array until a peak is found.
Complexity:
i. The worst-case complexity is Θ(n): the running time is bounded above and below by a constant times n, i.e. it is of order n.
ii. Asymptotic complexity: the asymptotic complexity is linear, because the work grows in proportion to n as n grows.
To lower the asymptotic complexity:
3. Binary search: look at position n/2. If a[n/2-1] > a[n/2], only search the left half; if a[n/2+1] > a[n/2], only search the right half.
If neither a[n/2-1] nor a[n/2+1] is greater than a[n/2], then a[n/2] is a peak.
Complexity:
Worst case: each step discards half of the array, so only about log2(n) positions are examined (not n/2).
Because the algorithm is recursive, its cost can be described by a recurrence.
T(n) is the work the algorithm does on an input of size n:
T(n) = T(n/2) + Θ(1)
The Θ(1) term covers the (at most two) comparisons with the left and right neighbors. Two is a constant, so this step costs Θ(1).
Unrolling the recurrence adds Θ(1) once per halving, log2(n) times in total:
T(n) = Θ(log2 n)
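A minimal Python sketch of the 1-D binary search described above. The function name find_peak_1d and the choice to return an index are assumptions made for illustration; the loop plays the role of the recursion T(n) = T(n/2) + Θ(1).

```python
def find_peak_1d(a):
    """Return an index i such that a[i] is >= each neighbor it has."""
    if not a:
        raise ValueError("an empty array has no peak")
    lo, hi = 0, len(a) - 1  # inclusive range that still contains a peak
    while True:
        mid = (lo + hi) // 2
        if mid > lo and a[mid - 1] > a[mid]:
            # Left neighbor is larger: a peak exists in a[lo..mid-1].
            hi = mid - 1
        elif mid < hi and a[mid + 1] > a[mid]:
            # Right neighbor is larger: a peak exists in a[mid+1..hi].
            lo = mid + 1
        else:
            # Neither neighbor inside the range is larger: a[mid] is a peak.
            return mid


print(find_peak_1d([1, 3, 2, 4, 1]))  # index of one peak, here 1
```

Each iteration does at most two comparisons and discards at least half of the range, which matches the Θ(log2 n) bound.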

Summary: the key to improving the algorithm is divide and conquer; the same idea carries over to the 2-D case.

2D Version

1. Greedy ascent algorithm: pick a direction and keep moving to a larger neighbor until a peak is reached.
Worst-case complexity: Θ(nm); if m = n, then Θ(n^2).

2. Binary search (first attempt): start from column m/2 and use 1-D binary search to find a peak (i, m/2) within that column;
then stay in row i and use 1-D binary search to find a peak at column j within that row. Report (i, j) as the peak.
Caution: this algorithm is incorrect.
The row peak and the column peak may not agree: (i, j) is a peak within row i, but it need not be a peak within column j.

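3. The improved way:
Assume the matrix is n × m (n rows, m columns). Take the middle column j = m/2 and find the global maximum of that column by scanning all n rows; call its position (i, j). If the left neighbor (i, j-1) is greater than (i, j), recurse on the columns to the left of j.
If the right neighbor (i, j+1) is greater, recurse on the columns to the right instead. If neither is greater, (i, j) is a 2-D peak.

T(n,m) = T(n,m/2) + Θ(n)
A problem of size (n, m) reduces to one of size (n, m/2) plus a Θ(n) scan for the column maximum.
Hence T(n,m) = Θ(n log2 m), since the Θ(n) scan is repeated log2(m) times.

My first guess for the recurrence was T(n,m) = T(n,m/2) + Θ(log2 n) + Θ(1), i.e. a 1-D binary search in the column. That only finds a local peak of the column, and attempt 2 shows this is not enough; the algorithm needs the column's global maximum, which costs Θ(n).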
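The improved 2-D algorithm can be sketched in Python as follows. This is an illustration under assumptions: the name find_peak_2d is made up here, the matrix is a non-empty list of equal-length rows, and the recursion on half the columns is written as a loop.

```python
def find_peak_2d(grid):
    """Return (i, j) such that grid[i][j] is >= each neighbor it has."""
    n_rows = len(grid)
    lo, hi = 0, len(grid[0]) - 1  # inclusive range of candidate columns
    while True:
        j = (lo + hi) // 2
        # Theta(n) scan: row index of the global maximum of column j.
        i = max(range(n_rows), key=lambda r: grid[r][j])
        if j > lo and grid[i][j - 1] > grid[i][j]:
            hi = j - 1  # a peak exists among the columns to the left
        elif j < hi and grid[i][j + 1] > grid[i][j]:
            lo = j + 1  # a peak exists among the columns to the right
        else:
            return i, j


grid = [
    [10, 8, 10, 10],
    [14, 13, 12, 11],
    [15, 9, 11, 21],
    [16, 17, 19, 20],
]
print(find_peak_2d(grid))  # (2, 3), the cell holding 21
```

The column range halves on every iteration and each iteration costs one Θ(n) column scan, which gives the Θ(n log2 m) total.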

Lesson 1.1 uses two notions: the function T and Θ notation (asymptotic complexity).
T(n) describes the work that an algorithm does on an input of size n;
Θ describes how that work grows in the worst case, ignoring constant factors.
At first it seemed that every Θ was a linear function of n, but the examples above also include Θ(1), Θ(log2 n), and Θ(n log2 m).

Q:

i. What does it mean that Θ(n) is essentially a constant times n?
ii. Why is the sum of log2(n) terms of Θ(1) equal to Θ(log2 n)?
iii. Why is the recurrence in 2-D not T(n,m) = T(n,m/2) + Θ(log2 n) + Θ(1)?
iv. Why does log2(m) times Θ(n) equal Θ(n log2 m)?
v. What is asymptotic complexity? Is it the average, or the trend of Θ(n) as the input size or the algorithm changes?

Models of Computation, Document Distance

A model of computation specifies:
- what operations an algorithm can perform
- the time cost of each operation

Two models of computation:

1. Random access machine (RAM; the abbreviation is shared with random access memory)

(1) The two are closely related but not the same thing: the random access machine is the mathematical analog of random access memory, the hardware that programs run on.
(2) In constant Θ(1) time, an algorithm can:
read in (load) Θ(1) words
do Θ(1) computations
write out (store) Θ(1) words
The machine has Θ(1) registers.

Word: w bits. w should be at least lg(size of memory),
because a word must be able to hold an index into the memory array.

2. Pointer machine

Dynamically allocated objects.
Each object has a Θ(1) number of fields; a field can be either a word or a pointer.
A pointer is something that points to another object (or is null).
Pointers are also called references.
Linked list: a list built in the pointer machine model.

3. Models in Python

i. list = array
L[j] = L[i] + 5: Θ(1)
ii. Object with Θ(1) attributes
x = x.next: Θ(1)
iii. L.append(x)
Uses table doubling: Θ(1) time (amortized).
iv. L = L1 + L2 is equivalent to:
L = []; for every x in L1, L.append(x): Θ(|L1|)
for every x in L2, L.append(x): Θ(|L2|)
Total: Θ(1 + |L1| + |L2|). Each append costs Θ(1); the leading 1 accounts for creating the empty list.
v. x in L
Linear time: it scans through the entire list.
vi. len(L): Θ(1)
vii. L.sort(): O(|L| lg |L|)
It uses a comparison sort algorithm.
viii. dict: D[key] = val
Hash table, with Θ(1) time.
ix. Long integers: x + y: O(|x| + |y|)
x * y: Θ((|x| + |y|)^(lg 3)), where lg 3 = log2(3) ≈ 1.585
x. heapq
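To see why table doubling makes append cheap on average, here is a small sketch of a growable array. The class DynamicArray and its copies counter are hypothetical, written only for illustration; CPython's real list uses a different growth policy in its details.

```python
class DynamicArray:
    """Hypothetical growable array that doubles its capacity when full."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._slots = [None] * self._capacity
        self.copies = 0  # element copies caused by resizing

    def append(self, x):
        if self._size == self._capacity:
            self._resize(2 * self._capacity)  # table doubling
        self._slots[self._size] = x
        self._size += 1

    def _resize(self, new_capacity):
        new_slots = [None] * new_capacity
        for k in range(self._size):
            new_slots[k] = self._slots[k]
            self.copies += 1
        self._slots = new_slots
        self._capacity = new_capacity

    def __len__(self):
        return self._size

    def __getitem__(self, k):
        if not 0 <= k < self._size:
            raise IndexError(k)
        return self._slots[k]


arr = DynamicArray()
for value in range(1000):
    arr.append(value)
print(len(arr), arr.copies)  # 1000 1023
```

The copies made while resizing add up to 1 + 2 + 4 + ... + 512 = 1023, fewer than 2 per append, which is the sense in which append costs Θ(1) amortized.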

Document Distance

Denoted as: d(D1, D2)
The distance describes how similar two documents are.
document = a sequence of words
word = a string

Idea: use the words the documents share to define document distance.
Think of a document as a vector:
D[w] = number of occurrences of w in D

D1 = "the cat", D2 = "the dog"
Dot product: d'(D1, D2) = D1·D2 = Σ_w D1[w] D2[w]
Drawback: the dot product is not scale invariant. Two long documents with many words in common may score 1000, while two short documents that are nearly identical may score only 100. Hence the raw dot product is not a good measure of document distance.
A better way to describe similarity is the angle between the vectors:
d(D1, D2) = angle(D1, D2) = arccos( D1·D2 / (|D1| |D2|) )

Procedure for computing document distance
1. Split each document into words.
2. Compute the frequency of each word.
3. Compute the dot product (and from it the angle).

Mechanism:
for word in doc: count[word] += 1. Θ(|doc|)
Splitting can be done with the re module:
re.findall(r'\w+', doc)
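The three steps can be put together in a short Python sketch. The function names are chosen here for illustration, and lowercasing the text before splitting is an added assumption that the notes do not state.

```python
import math
import re
from collections import Counter


def word_counts(doc):
    """Steps 1 and 2: split into words and count occurrences."""
    # Lowercasing is an assumption so that "The" and "the" match.
    return Counter(re.findall(r"\w+", doc.lower()))


def dot(c1, c2):
    """Step 3: sum of c1[w] * c2[w] over the words of the smaller vector."""
    if len(c1) > len(c2):
        c1, c2 = c2, c1
    return sum(count * c2[w] for w, count in c1.items())


def document_distance(doc1, doc2):
    """Angle in radians between the two word-frequency vectors."""
    c1, c2 = word_counts(doc1), word_counts(doc2)
    norm = math.sqrt(dot(c1, c1) * dot(c2, c2))
    if norm == 0:
        raise ValueError("both documents need at least one word")
    # Clamp to 1.0 to guard against floating-point rounding above 1.
    return math.acos(min(1.0, dot(c1, c2) / norm))


print(document_distance("the cat", "the dog"))  # about 1.047, i.e. pi/3
```

For "the cat" and "the dog" the dot product is 1 and both vectors have length sqrt(2), so the cosine is 1/2 and the angle is π/3. Identical documents give an angle of 0.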

Q

1. What is the time cost of each operation in each model, and why?
2. Why should the word size w be at least the log of the memory size?

Insertion Sort, Merge Sort

Why sorting
Insertion sort
Merge sort (Divide & Conquer)
Recurrence solving

Why sorting?

1) Applications
2) Problems become easy once the items are sorted, e.g. finding the median:
array A[0:n] -> sort -> B[0:n]
Check whether n is odd or even. If n is odd, the median is B[(n-1)/2]; if n is even, the two middle elements are B[n/2-1] and B[n/2].
3) Binary search: looking for a specific item.
In unsorted A[0:n], scanning through the whole array costs linear time,
while in sorted B[0:n] the search takes logarithmic time.
Why?
Mechanism: let k be the target item. The algorithm first compares k to B[n/2].
If k is smaller, it continues with B[n/4], and so on, halving the search range each time. After about log2(n) halvings one element is left, so the search takes logarithmic time.
Binary search is the simplest and most direct illustration of divide and conquer, which here turns a linear search into a logarithmic one.
4) Data compression
5) Computer graphics
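The binary search mechanism from item 3 can be sketched in Python like this. The function name and the convention of returning None for a missing item are assumptions made for this example.

```python
def binary_search(b, k):
    """Return an index of k in the sorted list b, or None if k is absent."""
    lo, hi = 0, len(b) - 1  # inclusive range that may still contain k
    while lo <= hi:
        mid = (lo + hi) // 2
        if b[mid] == k:
            return mid
        if k < b[mid]:
            hi = mid - 1  # k can only be in the left half
        else:
            lo = mid + 1  # k can only be in the right half
    return None


b = sorted([5, 2, 4, 6, 1, 3])
print(binary_search(b, 4))  # 3
print(binary_search(b, 7))  # None
```

Each comparison halves the remaining range, so a sorted array of n items needs about log2(n) comparisons instead of a linear scan.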

Insertion Sort

For i = 1, 2, ..., n-1: insert A[i] into the sorted array A[0:i-1] by pairwise swaps, moving the number that was initially in A[i] down to its correct position.
5 2 4 6 1 3
⬆️ key = 2
We start from the second element, because a single first element is sorted by definition.
1) Pairwise swap of 5 and 2:
2 5 4 6 1 3
2) Now the key is 4; swap it with 5 in the same way:
2 4 5 6 1 3
⬆️ key = 6
3) The key 6 is already in order with its left neighbor 5, so no swap is needed and the key moves on to 1.
Four swaps then bring 1 to the front:
2 4 5 6 1 3 -> 1 2 4 5 6 3
⬆️ key = 3
4) Now the key is 3, which gets swapped 3 times:
1 2 3 4 5 6

All in all: Θ(n) steps (key positions),
and for each key position the worst case is Θ(n) swaps or compares.
Hence it is a Θ(n^2) algorithm, because there are Θ(n) key positions and each one needs Θ(n) swaps in the worst case.
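A minimal Python sketch of insertion sort by pairwise swaps. Returning the number of swaps is an addition made here so the walkthrough above can be checked; it is not part of the algorithm.

```python
def insertion_sort(a):
    """Sort the list a in place by pairwise swaps; return the swap count."""
    swaps = 0
    for i in range(1, len(a)):  # a[0] alone is already sorted
        j = i
        # Move the key left while its left neighbor is larger.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            swaps += 1
            j -= 1
    return swaps


data = [5, 2, 4, 6, 1, 3]
print(insertion_sort(data), data)  # 9 [1, 2, 3, 4, 5, 6]
```

The 9 swaps are the 1 + 1 + 0 + 4 + 3 swaps of the example: one each for keys 2 and 4, none for 6, four for 1, and three for 3.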

  • When compares are much more expensive than swaps (compares >> swaps): what is the simple fix for the Θ(n^2) comparisons?
    my_ans: use binary search, i.e. first compare A[i] only with the middle element of the sorted part A[0:i-1],
    then with the middle of the left or right half depending on the result, and so on, until the insert position is found.

    Correct! The simplest fix is to replace the pairwise comparisons with binary search, which works because A[0:i-1] is already sorted.
  • Binary search on the already sorted A[0:i-1] takes Θ(lg i) time per key. Thus Θ(n lg n) comparisons in total.

But this does not help the overall running time when the data structure is an array: after the position is found, inserting A[i] still shifts up to Θ(i) elements, so the sort remains Θ(n^2) in swaps.
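A sketch of this binary insertion sort in Python, using the standard bisect module for the binary search. The function name is made up for this example, and bisect_right is one reasonable choice (it keeps equal keys in their original order).

```python
import bisect


def binary_insertion_sort(a):
    """Sort the list a in place: binary search for the slot, then shift."""
    for i in range(1, len(a)):
        key = a[i]
        # Theta(lg i) comparisons to find the insert position in a[0:i].
        pos = bisect.bisect_right(a, key, 0, i)
        # Shifting a[pos:i] one slot to the right still costs up to Theta(i).
        a[pos + 1:i + 1] = a[pos:i]
        a[pos] = key


data = [5, 2, 4, 6, 1, 3]
binary_insertion_sort(data)
print(data)  # [1, 2, 3, 4, 5, 6]
```

The comparisons drop to Θ(n lg n), but the slice assignment that shifts elements is what keeps the total at Θ(n^2) on an array.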

Merge Sort

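We split array A into a left half L and a right half R, and keep splitting until single elements remain at the bottom. Single elements are sorted by definition; the sorted halves are then merged back together.

Merge: takes two sorted arrays as input.
20 13 7 2
12 11 9 1
Two-finger algorithm:
Start with a finger on the smallest element of each array (2 and 1). Compare 2 and 1: 1 is smaller, so output 1, cross it out, and move that finger up to 9. Repeat until both arrays are used up.
Merging has complexity Θ(n).
The overall complexity of merge sort is Θ(n lg n).

T(n) = c1 + 2T(n/2) + cn
where c1 is the cost of dividing, 2T(n/2) the two recursive calls, and cn the merge.
The recursion tree has 1 + lg n levels and each level costs cn in total, so T(n) = Θ(n lg n).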
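The split and the two-finger merge can be sketched in Python as follows. The names merge and merge_sort are chosen for illustration, and this version returns a new list instead of sorting in place, which is a simplification.

```python
def merge(left, right):
    """Two-finger merge of two sorted lists into one sorted list."""
    merged = []
    i = j = 0  # one finger per input list
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is used up; the rest of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(a):
    """Return a sorted copy of a: T(n) = 2T(n/2) + Theta(n)."""
    if len(a) <= 1:
        return list(a)  # a single element is sorted by definition
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))


print(merge([2, 7, 13, 20], [1, 9, 11, 12]))
# [1, 2, 7, 9, 11, 12, 13, 20]
```

The usage line merges the two arrays from the example, written here in increasing order so that each finger starts at the smallest element.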
