DNA
Cystic Fibrosis
CF
CF ATP
CF CFTR
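For coins 1, 3, 7:

$$\mathrm{minnumcoins}(m) = \min \begin{cases} \mathrm{minnumcoins}(m-1) + 1 \\ \mathrm{minnumcoins}(m-3) + 1 \\ \mathrm{minnumcoins}(m-7) + 1 \end{cases}$$

For coins c_1, c_2, ..., c_d:

$$\mathrm{minnumcoins}(m) = \min \begin{cases} \mathrm{minnumcoins}(m-c_1) + 1 \\ \mathrm{minnumcoins}(m-c_2) + 1 \\ \quad\vdots \\ \mathrm{minnumcoins}(m-c_d) + 1 \end{cases}$$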
RecursiveChange(M, c, d)
    if M = 0
        return 0
    bestnumcoins ← infinity
    for i ← 1 to d
        if M ≥ c_i
            numcoins ← RecursiveChange(M − c_i, c, d)
            if numcoins + 1 < bestnumcoins
                bestnumcoins ← numcoins + 1
    return bestnumcoins
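A minimal Python sketch of the RecursiveChange idea, for illustration only. The function name `recursive_change` and the tuple of denominations are assumptions made here; the number of calls grows exponentially with M, so it is only practical for small amounts.

```python
import math

def recursive_change(M, coins):
    """Minimum number of coins that sum to M, by brute-force recursion."""
    if M == 0:
        return 0
    best = math.inf
    for c in coins:
        if M >= c:
            # Solve the smaller amount M - c, then add the coin c itself.
            num = recursive_change(M - c, coins)
            if num + 1 < best:
                best = num + 1
    return best

print(recursive_change(9, (1, 3, 7)))  # 7 + 1 + 1
```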
RecursiveChange
[Figure: recursion tree for M = 77 with c = (1, 3, 7). The root 77 branches to 76, 74, 70, and so on; the subproblem 70 appears again and again.]
Value:            1  2  3  4  5  6  7  8  9  10
Min # of coins:   1  -  1  -  -  -  1  -  -  -

Value:            1  2  3  4  5  6  7  8  9  10
Min # of coins:   1  2  1  2  -  2  1  2  -  2

Value:            1  2  3  4  5  6  7  8  9  10
Min # of coins:   1  2  1  2  3  2  1  2  3  2
DPChange(M, c, d)
    bestnumcoins_0 ← 0
    for m ← 1 to M
        bestnumcoins_m ← infinity
        for i ← 1 to d
            if m ≥ c_i
                if bestnumcoins_{m − c_i} + 1 < bestnumcoins_m
                    bestnumcoins_m ← bestnumcoins_{m − c_i} + 1
    return bestnumcoins_M
DPChange, c = (1, 3, 7), M = 9. The table grows by one entry per step; the final table is:

m:              0  1  2  3  4  5  6  7  8  9
bestnumcoins:   0  1  2  1  2  3  2  1  2  3
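A short Python sketch of DPChange, under the assumption that returning the whole table is useful for inspection. The name `dp_change` is illustrative, not from the slides.

```python
import math

def dp_change(M, coins):
    """Return the table best[0..M] of minimum coin counts."""
    best = [0] + [math.inf] * M
    for m in range(1, M + 1):
        for c in coins:
            # Reuse the already-computed answer for the amount m - c.
            if m >= c and best[m - c] + 1 < best[m]:
                best[m] = best[m - c] + 1
    return best

print(dp_change(9, (1, 3, 7)))
```

Each amount is solved once, so the running time is proportional to M times the number of denominations.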
[Figure: Manhattan Tourist grid with edge weights. The greedy path makes a promising start, but leads to bad choices!]
[Figure: filling the grid, first antidiagonal] S_{0,1} = 1, S_{1,0} = 5
[Figure: filling the grid, second antidiagonal] S_{0,2} = 3, S_{1,1} = 4, S_{2,0} = 8
MT(n, m)
    if n = 0 or m = 0
        return the length of the straight path along the border from (0, 0) to (n, m)
    x ← MT(n−1, m) + length of the edge from (n−1, m) to (n, m)
    y ← MT(n, m−1) + length of the edge from (n, m−1) to (n, m)
    return max{x, y}
[Figure: filling the grid, continued] S_{3,0} = 8, S_{1,2} = 13, S_{2,1} = 9
[Figure: filling the grid, continued] S_{1,3} = 8, S_{2,2} = 12, S_{3,1} = 9 (greedy alg. fails!)
[Figure: filling the grid, continued] S_{2,3} = 15, S_{3,2} = 9
[Figure: filling the grid, done] S_{3,3} = 16
$$s_{i,j} = \max \begin{cases} s_{i-1,j} + \text{weight of the edge between } (i-1,j) \text{ and } (i,j) \\ s_{i,j-1} + \text{weight of the edge between } (i,j-1) \text{ and } (i,j) \end{cases}$$
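[Figure: vertices A_1, A_2, A_3 with edges into vertex B]

$$s_B = \max \begin{cases} s_{A_1} + \text{weight of the edge } (A_1, B) \\ s_{A_2} + \text{weight of the edge } (A_2, B) \\ s_{A_3} + \text{weight of the edge } (A_3, B) \end{cases}$$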
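DAG

$$s_v = \max \begin{cases} s_{u_1} + \text{weight of the edge } (u_1, v) \\ s_{u_2} + \text{weight of the edge } (u_2, v) \\ s_{u_3} + \text{weight of the edge } (u_3, v) \end{cases}$$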
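The DAG recurrence can be sketched in Python by visiting vertices in topological order, so that every predecessor score is ready before it is needed. This sketch assumes Python 3.9+ for the standard-library `graphlib` module; the edge dictionary and vertex names are hypothetical.

```python
import math
from graphlib import TopologicalSorter

def longest_path(edges, source, sink):
    """edges maps (u, v) -> weight; returns the longest source-to-sink path length."""
    preds = {}
    for (u, v), w in edges.items():
        preds.setdefault(v, {})[u] = w
        preds.setdefault(u, {})
    # static_order() yields each vertex after all of its predecessors.
    order = TopologicalSorter({v: set(p) for v, p in preds.items()}).static_order()
    s = {}
    for v in order:
        if v == source:
            s[v] = 0
        else:
            # -inf marks vertices that cannot be reached from the source.
            s[v] = max((s[u] + w for u, w in preds[v].items()), default=-math.inf)
    return s[sink]

edges = {("S", "A1"): 2, ("S", "A2"): 1, ("S", "A3"): 4,
         ("A1", "B"): 5, ("A2", "B"): 3, ("A3", "B"): 1}
print(longest_path(edges, "S", "B"))
```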
v : A T A T A T A T
w : T A T A T A T A

v : A  T A T A T A T --
w : -- T A T A T A T A
Edit Distance
Hamming distance always compares the i-th letter of v with the i-th letter of w:
V = ATATATAT
W = TATATATA
Hamming distance: d(v, w) = 8
Computing Hamming distance is a trivial task.

vs

Edit distance may compare the i-th letter of v with the j-th letter of w. Just one shift makes it all line up:
V = -ATATATAT
W = TATATATA-
Edit distance: d(v, w) = 2 (one insertion and one deletion)
Computing edit distance is a non-trivial task: how do we find which j goes with which i?
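To make the contrast concrete, here is a small Python sketch of both distances. The edit distance uses the standard dynamic-programming table with unit cost for insertion, deletion, and substitution; the function names are illustrative.

```python
def hamming_distance(v, w):
    """Number of positions where two equal-length strings differ."""
    if len(v) != len(w):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(a != b for a, b in zip(v, w))

def edit_distance(v, w):
    """Minimum number of insertions, deletions, and substitutions turning v into w."""
    n, m = len(v), len(w)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i                      # delete all i letters of v
    for j in range(m + 1):
        d[0][j] = j                      # insert all j letters of w
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j] + 1,                           # deletion
                          d[i][j - 1] + 1,                           # insertion
                          d[i - 1][j - 1] + (v[i - 1] != w[j - 1]))  # match or substitution
    return d[n][m]

print(hamming_distance("ATATATAT", "TATATATA"), edit_distance("ATATATAT", "TATATATA"))
```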
TGCATAT → TGCATA → TGCAT → ATGCAT → ATCCAT → ATCCGAT
TGCATAT → ATCCGAT in 4 steps
TGCATAT → (insert A at front)
ATGCATAT → (delete the T at position 6)
ATGCAAT → (substitute G for the A at position 5)
ATGCGAT → (substitute C for the G at position 3)
ATCCGAT (Done)
Can it be done in 3 steps???
Alignment
V = ATCTGATG (n = 8)
W = TGCATAC (m = 7)
[Figure: alignment of V and W with 4 matches, 1 mismatch, 2 insertions, 3 deletions]
v : ATCTGAT (m = 7)
w : TGCATA (n = 6)

v : A T -- G T T  A T  --
w : A T C  G T -- A -- C

4 matches, 2 insertions, 2 deletions
     0 1 2 2 3 4 5 6 7 7
v =   A T _ G T T A T _
w =   A T C G T _ A _ C
     0 1 2 3 4 5 5 6 6 7

(0,0), (1,1), (2,2), (2,3), (3,4), (4,5), (5,5), (6,6), (7,6), (7,7)
Alignment 1:
     0 1 2 2 3 4 5 6 7 7
v =   A T _ G T T A T _
w =   A T C G T _ A _ C
     0 1 2 3 4 5 5 6 6 7

Alignment 2:
     0 1 2 2 3 4 5 6 7 7
v =   A T _ G T T A T _
w =   A T C G _ T A _ C
     0 1 2 3 4 4 5 6 6 7
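The correspondence between an alignment and a path in the grid can be sketched in a few lines of Python. The helper name `alignment_path` and the use of `_` as the gap symbol are assumptions for this example.

```python
def alignment_path(v_row, w_row, gap="_"):
    """Turn the two rows of an alignment into the list of grid coordinates (i, j)."""
    i = j = 0
    path = [(0, 0)]
    for a, b in zip(v_row, w_row):
        i += a != gap    # advance in v unless this column has a gap in v
        j += b != gap    # advance in w unless this column has a gap in w
        path.append((i, j))
    return path

print(alignment_path("AT_GTTAT_", "ATCGT_A_C"))
```

Two different alignments of the same strings give two different paths that share the same start (0, 0) and end (n, m).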
LCS

i coords:        0 1 2  2 3  3 4 5  6 7  8
elements of v:     A T  -- C -- T G  A T  C
elements of w:     -- T G  C A  T -- A -- C
j coords:        0 0 1  2 3  4 5 5  6 6  7

(0,0) → (1,0) → (2,1) → (2,2) → (3,3) → (3,4) → (4,5) → (5,5) → (6,6) → (7,6) → (8,7)

LCS: TCTAC
positions in v: 2 < 3 < 4 < 6 < 8
positions in w: 1 < 3 < 5 < 6 < 7
LCS
[Figure: LCS edit graph. Columns j = 0..8 are labeled with v = A T C T G A T C; rows i = 0..7 are labeled with w = T G C A T A C.]
LCS

$$s_{i,j} = \max \begin{cases} s_{i-1,j} + 0 & \text{value from North (top)} \\ s_{i,j-1} + 0 & \text{value from West (left)} \\ s_{i-1,j-1} + 1, \text{ if } v_i = w_j & \text{value from NW} \end{cases}$$
LCS
When v_i = w_j, s_{i,j} = s_{i-1,j-1} + 1:
s_{2,2} = [s_{1,1} = 1] + 1
s_{2,5} = [s_{1,4} = 1] + 1
s_{4,2} = [s_{3,1} = 1] + 1
s_{5,2} = [s_{4,1} = 1] + 1
s_{7,2} = [s_{6,1} = 1] + 1
LCS
LCS

LCS(v, w)
    for i ← 0 to n
        s_{i,0} ← 0
    for j ← 1 to m
        s_{0,j} ← 0
    for i ← 1 to n
        for j ← 1 to m
            s_{i,j} ← max { s_{i-1,j};  s_{i,j-1};  s_{i-1,j-1} + 1, if v_i = w_j }
            b_{i,j} ← "↑" if s_{i,j} = s_{i-1,j}
                      "←" if s_{i,j} = s_{i,j-1}
                      "↖" if s_{i,j} = s_{i-1,j-1} + 1
    return (s_{n,m}, b)
Printing LCS:

PrintLCS(b, v, i, j)
    if i = 0 or j = 0
        return
    if b_{i,j} = "↖"
        PrintLCS(b, v, i−1, j−1)
        print v_i
    else
        if b_{i,j} = "↑"
            PrintLCS(b, v, i−1, j)
        else
            PrintLCS(b, v, i, j−1)
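The LCS and PrintLCS pseudocode can be combined into one Python sketch. The backtracking pointers are stored as the strings "up", "left", and "diag" (an illustrative choice), and the backtrack is written as a loop rather than a recursion. When several longest common subsequences exist, which one is returned depends on tie-breaking.

```python
def lcs(v, w):
    """Return (length, one longest common subsequence) of v and w."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    b = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best, move = s[i - 1][j], "up"
            if s[i][j - 1] > best:
                best, move = s[i][j - 1], "left"
            if v[i - 1] == w[j - 1] and s[i - 1][j - 1] + 1 > best:
                best, move = s[i - 1][j - 1] + 1, "diag"
            s[i][j], b[i][j] = best, move
    # Follow the pointers back from (n, m), collecting matched letters.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if b[i][j] == "diag":
            out.append(v[i - 1])
            i, j = i - 1, j - 1
        elif b[i][j] == "up":
            i -= 1
        else:
            j -= 1
    return s[n][m], "".join(reversed(out))

print(lcs("ATCTGATC", "TGCATAC"))
```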
A C C T G A G A G A C G T G G C A G 70%
Scoring Matrix
      A    R    N    K
A     5   -2   -1   -1
R     -    7   -1    3
N     -    -    7    0
K     -    -    -    6
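Random model R and match model M:

$$P(x, y \mid R) = \prod_i q_{x_i} \prod_j q_{y_j}, \qquad P(x, y \mid M) = \prod_i p_{x_i y_i}$$

$$\frac{P(x, y \mid M)}{P(x, y \mid R)} = \frac{\prod_i p_{x_i y_i}}{\prod_i q_{x_i} \prod_i q_{y_i}} = \prod_i \frac{p_{x_i y_i}}{q_{x_i} q_{y_i}}$$

$$S = \sum_i s(x_i, y_i), \qquad s(a, b) = \log\left(\frac{p_{ab}}{q_a q_b}\right)$$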
PAM
PAM X

         Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys ...
          A   R   N   D   C   Q   E   G   H   I   L   K  ...
Ala A    13   6   9   9   5   8   9  12   6   8   6   7  ...
Arg R     3  17   4   3   2   5   3   2   6   3   2   9
Asn N     4   4   6   7   2   5   6   4   6   3   2   5
Asp D     5   4   8  11   1   7  10   5   6   3   2   5
Cys C     2   1   1   1  52   1   1   2   2   2   1   1
Gln Q     3   5   5   6   1  10   7   3   7   2   3   5
...
Trp W     0   2   0   0   0   0   0   0   1   0   1   0
Tyr Y     1   1   2   1   3   1   1   1   3   2   2   1
Val V     7   4   4   4   4   4   4   4   5   4  15  10
BLOSUM
BLOSUM
Global alignment:
--T -CC-C-AGT -TATGT-CAGGGGACACG A-GCATGCAGA-GAC
AATTGCCGCC-GTCGT-T-TTCAG----CA-GTTATG T-CAGAT--C

Local alignment:
tcccagttatgtcaggggacacgagcatgcagagac
aattgccgccgtcgttttcagcagttatgtcagatc
Local alignment Global alignment
Local alignment recurrence (Smith-Waterman):

$$s_{i,j} = \max \begin{cases} 0 \\ s_{i-1,j-1} + \delta(v_i, w_j) \\ s_{i-1,j} + \delta(v_i, -) \\ s_{i,j-1} + \delta(-, w_j) \end{cases}$$
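A minimal Python sketch of the local alignment recurrence. The scoring values (match +1, mismatch −1, indel −1) are assumptions chosen for the example, standing in for a real scoring matrix δ; only the best score is returned, not the alignment itself.

```python
def local_alignment_score(v, w, match=1, mismatch=-1, indel=-1):
    """Best local alignment score of v and w under a simple scoring scheme."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            delta = match if v[i - 1] == w[j - 1] else mismatch
            s[i][j] = max(0,                          # start a new local alignment here
                          s[i - 1][j - 1] + delta,    # align v_i with w_j
                          s[i - 1][j] + indel,        # v_i against a gap
                          s[i][j - 1] + indel)        # w_j against a gap
            best = max(best, s[i][j])                 # the alignment may end anywhere
    return best

print(local_alignment_score("TTACGTT", "GGACGGG"))
```

The extra 0 in the maximum is what lets the alignment start at any cell, and taking the maximum over the whole table lets it end at any cell.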
$$s(a, b) = \log\frac{p_{ab}}{q_a q_b}, \qquad H = \sum_{a,b} p_{ab}\, s(a, b) = \sum_{a,b} p_{ab} \log\frac{p_{ab}}{q_a q_b}$$

$$H(P \,\|\, Q) = \sum_i P(x_i) \log\frac{P(x_i)}{Q(x_i)}$$
Affine Gap Penalties

Score for a gap of length x: −ρ − x·σ
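$$s^{\downarrow}_{i,j} = \max \begin{cases} s^{\downarrow}_{i-1,j} - \sigma \\ s_{i-1,j} - (\rho + \sigma) \end{cases} \qquad s^{\rightarrow}_{i,j} = \max \begin{cases} s^{\rightarrow}_{i,j-1} - \sigma \\ s_{i,j-1} - (\rho + \sigma) \end{cases}$$

$$s_{i,j} = \max \begin{cases} s_{i-1,j-1} + \delta(v_i, w_j) \\ s^{\downarrow}_{i,j} \\ s^{\rightarrow}_{i,j} \end{cases}$$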
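A sketch of the three-table affine gap recurrence for global alignment, in Python. The parameter values (match +1, mismatch −1, ρ = 2, σ = 1) are assumptions for the example; `lower` holds the scores of alignments ending in a gap in w, `upper` those ending in a gap in v, and `middle` is the main table.

```python
import math

def affine_gap_score(v, w, match=1, mismatch=-1, rho=2, sigma=1):
    """Global alignment score where a gap of length x costs rho + sigma * x."""
    n, m = len(v), len(w)
    neg = -math.inf
    lower = [[neg] * (m + 1) for _ in range(n + 1)]   # ends with v_i against a gap
    upper = [[neg] * (m + 1) for _ in range(n + 1)]   # ends with w_j against a gap
    middle = [[neg] * (m + 1) for _ in range(n + 1)]
    middle[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0:   # extend a vertical gap, or open a new one
                lower[i][j] = max(lower[i - 1][j] - sigma,
                                  middle[i - 1][j] - (rho + sigma))
            if j > 0:   # extend a horizontal gap, or open a new one
                upper[i][j] = max(upper[i][j - 1] - sigma,
                                  middle[i][j - 1] - (rho + sigma))
            diag = neg
            if i > 0 and j > 0:
                delta = match if v[i - 1] == w[j - 1] else mismatch
                diag = middle[i - 1][j - 1] + delta
            middle[i][j] = max(diag, lower[i][j], upper[i][j])
    return middle[n][m]

print(affine_gap_score("AAAA", "AA"))
```

With these parameters one gap of length 2 costs 4, while two separate gaps of length 1 cost 6, so the recurrence prefers to keep indels together.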
[Figure: three-level alignment graph with edge weights δ, −ρ, −σ, and 0]
$$P(M \mid x, y) = \frac{P(x, y \mid M)\,P(M)}{P(x, y)} = \frac{P(x, y \mid M)\,P(M)}{P(x, y \mid M)\,P(M) + P(x, y \mid R)\,P(R)}$$

$$P(M \mid x, y) = \frac{P(x, y \mid M)P(M) \,/\, P(x, y \mid R)P(R)}{1 + P(x, y \mid M)P(M) \,/\, P(x, y \mid R)P(R)} = \frac{e^{S'}}{1 + e^{S'}} = \sigma(S')$$

$$S' = \log\left(\frac{P(x, y \mid M)}{P(x, y \mid R)}\right) + \log\left(\frac{P(M)}{P(R)}\right)$$
$$P(M_N \le x) \approx \exp\left(-K N e^{-\lambda (x - \mu)}\right)$$

$$E(S) = K m n\, e^{-\lambda S}, \qquad P(x > S) = 1 - e^{-E(S)}$$

$$S > T + \log mn$$
Alignment of three sequences:

     0  1  1  2  3  4
v:      A  -- T  G  C
     0  1  2  3  3  4
w:      A  A  T  -- C
     0  0  1  2  3  4
u:      -- A  T  G  C

(0,0,0) → (1,1,0) → (1,2,1) → (2,3,2) → (3,3,3) → (4,4,4)
[Figure: 2-D alignment cell (sequences V, W) versus 3-D alignment cell]
Vertices of the 3-D alignment cell ending at (i,j,k):
(i-1,j-1,k-1), (i-1,j,k-1), (i-1,j-1,k), (i-1,j,k), (i,j-1,k-1), (i,j,k-1), (i,j-1,k), (i,j,k)
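$$s_{i,j,k} = \max \begin{cases} s_{i-1,j-1,k-1} + \delta(v_i, w_j, u_k) & \text{cube diagonal: no indels} \\ s_{i-1,j-1,k} + \delta(v_i, w_j, \_) & \text{face diagonal: one indel} \\ s_{i-1,j,k-1} + \delta(v_i, \_, u_k) & \text{face diagonal: one indel} \\ s_{i,j-1,k-1} + \delta(\_, w_j, u_k) & \text{face diagonal: one indel} \\ s_{i-1,j,k} + \delta(v_i, \_, \_) & \text{edge diagonal: two indels} \\ s_{i,j-1,k} + \delta(\_, w_j, \_) & \text{edge diagonal: two indels} \\ s_{i,j,k-1} + \delta(\_, \_, u_k) & \text{edge diagonal: two indels} \end{cases}$$

δ(x, y, z) is an entry in the 3-D scoring matrix.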
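The seven-way recurrence can be sketched in Python by enumerating the seven predecessor moves with `itertools.product`. The scoring function `delta` used in the example (1 for a column of three identical letters, 0 otherwise) is a hypothetical stand-in for a 3-D scoring matrix; with it, the score equals the length of a longest subsequence common to all three strings.

```python
import math
from itertools import product

def align3_score(v, w, u, delta):
    """Best score of a global alignment of three strings under delta(x, y, z)."""
    n, m, l = len(v), len(w), len(u)
    s = {(0, 0, 0): 0}
    # product() visits cells in lexicographic order, so predecessors come first.
    for i, j, k in product(range(n + 1), range(m + 1), range(l + 1)):
        if (i, j, k) == (0, 0, 0):
            continue
        best = -math.inf
        for di, dj, dk in product((0, 1), repeat=3):
            if (di, dj, dk) == (0, 0, 0):
                continue                      # a column must use at least one letter
            pi, pj, pk = i - di, j - dj, k - dk
            if min(pi, pj, pk) < 0:
                continue                      # predecessor outside the cube
            x = v[i - 1] if di else "_"
            y = w[j - 1] if dj else "_"
            z = u[k - 1] if dk else "_"
            best = max(best, s[pi, pj, pk] + delta(x, y, z))
        s[i, j, k] = best
    return s[n, m, l]

def all_equal(x, y, z):
    """Hypothetical scoring: 1 if the column holds three identical letters, else 0."""
    return 1 if x == y == z != "_" else 0

print(align3_score("ATGC", "AATC", "ATGC", all_equal))
```

The table has (n+1)(m+1)(l+1) cells with seven candidates each, which is why exact alignment of many sequences becomes expensive quickly.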
ClustalW
Pairwise alignment scores (.17 = 17% identical):

       v1    v2    v3    v4
v1     -
v2    .17    -
v3    .87   .28    -
v4    .59   .33   .62    -

Guide tree order: v1, v3, v4, v2
v_{1,3} = alignment(v_1, v_3)
v_{1,3,4} = alignment(v_{1,3}, v_4)
v_{1,2,3,4} = alignment(v_{1,3,4}, v_2)
FOS_RAT      PEEMSVTS-LDLTGGLPEATTPESEEAFTLPLLNDPEPK-PSLEPVKNISNMELKAEPFD
FOS_MOUSE    PEEMSVAS-LDLTGGLPEASTPESEEAFTLPLLNDPEPK-PSLEPVKSISNVELKAEPFD
FOS_CHICK    SEELAAATALDLG----APSPAAAEEAFALPLMTEAPPAVPPKEPSG--SGLELKAEPFD
FOSB_MOUSE   PGPGPLAEVRDLPG-----STSAKEDGFGWLLPPPPPPP-----------------LPFQ
FOSB_HUMAN   PGPGPLAEVRDLPG-----SAPAKEDGFSWLLPPPPPPP-----------------LPFQ
Conservation line (column positions lost in extraction): .. : **. :.. *:.* *. * **:
[Figure panels: simultaneous correlation, time-delayed correlation, inverted correlation]
Local alignment between pairs of expression profiles

$$M(x, y)_{ij} = x_i\, y_j$$

$$E_{i,j} = \max(E_{i-1,j-1} + M_{ij},\ 0), \qquad D_{i,j} = \max(D_{i-1,j-1} - M_{ij},\ 0)$$
Results