4 8 007 8 Application Research of Computers Vol. 4 No. 8 Aug. 007 * XM L,,,,, (., 604;., 604) XML XML XML, XML XML, XML, ; XML ; XML,,, XML ; ; ; ; TP3 A 00-3695( 007) 08-000- 06 Query optimization based on complicated scheme indexes YU Hong,, WANG Xiu-kun, GAO Yan-ping, ZHANG Jian-ying, YANG Nan-hai (. School of Information Engineering, Dalian Fisheries University, Dalian Liaoning 604, China;. School of Electronic & Infomation Engineering, Dalian University of Techology, Dalian Liaoning 604, China) Abstract The paper analyzed XML query and the relationship between XML scheme and XML documents. Proposed a complicated scheme index-based XML query optimization method. Indexed parent/ child node and ancestor/ descendant node, took the XML scheme with loop into account. Firstly pretreated the query tree with repetitive labels, then decomposed the query tree into main path and branch path. When executed the query, applied the indexes to accelerate the query calculation. This method would reduce greatly number of join, improve the efficiency of query. It could process the complicated scheme. The result of the experiments indicate that the performance of this method is excellent. Key words complicated scheme; index; XML; query optimization; path expression 0 XML [ ], XML, c) XML,, XPath XQuery XML-QL, [ XML 4] [ XML DataGuides 4],, a) XML,, [ 6 3], n, n -,,,,, DataGuides XML, d) XML XML [, 5] [, DTD XML 5] b), DTD XML DTD XML, XML, DTD XML ; / DTD,,, /, DTD, DTD, 006-06- 7; 006-08- 4 973 ( 00CCA00700 ) ; ( 05L090 ) ; ( 005JJH038) ( 968- ),,,,, ( yuhong-@ hotmail. com) ; ( 945- ),,,,, ; ( 964- ),,,, ; ( 973 - ),,,,, ; ( 970- ),,,,,.
8, XML 0 XML XML books, section sections ), % / XML XML / /, books, XML XML, books book family,, XML, XML / books/book/ / family, books, XML XML, book family,, XML XML / books/book% / family, books [ 6 8] Tae-Sun Chung [ 6] DTD book books family,, e,, [ e 8] DBXI( DTD based XML ; *? index) DTD XML, DTD,! ELEMENT person ( name, e-mail *, ( school [ 9] ; XML company) { name}, [ 0], XML { e-mail*, school, company } DTD DBXI DTD XML person, DTD DTD / XML person ; DTD / ;, person DTD / XML person DTD ; XML 0 3 { email, school} { email, company} { sc hool} { company}, DTD DTD path ;, ( book ; DTD XML / ; XML /, nodeinfo, mergenodeinfonodeinfo, XML,, DTD,, XML DTD,,, XML ; DTD,,, DTD, XML MergeNodeInfo, XML DTD XML DBXI, DTD, N /,, XML mergenodeinfo 0, DBXI, DTD,! ELEMENT manager( name, ( manager, NodeInfo /, department employee) ) ; / MergeNodeInfo /, DTD,,, ;, Chiyoung Seo [ 7] DTD, path( path, pathid), XML DTD, pathindex( pathid, docid), XML, DTD term( term, termid),, XML termindex( termid, docid, pathid, position), XML RDBMS, / / % / SQL like, path XML XML pathid, pathindex ; / ; / ;, /, XML ( XML pathid, term termid, ), XML termindex pathid termid XML, XML, XML
0 007 ; XML, no,,,,, no ; effectnode XML, XML, XML e, e Department( n) ( 4) 4, n,, {, }, e XML e, e, e ;, manager department manager,, name url email employee + )! ELEMENT manager( name, email?, ( manager dept employee )! ELEMENT department ( name, email?, url?, employee +, department* )! ELEMENT employee( name, email? )! ELEMENT name( PCDATA)! ELEMENT email( PCDATA)! ELEMENT url( PCDATA). XML 4 XML, ST = ( V, E, R) V = V s V n V h, vv, v V s ; V n. ; V h E, E RV XML V Q ( ) ; E Q ( / ) ; R Q V Q n XML ) XML XML,, XML, XML,, XML ( preorder, type, QT, no, effectnode) preorder,, ; type ( ;, QT n ; h ; s ) ;, L S, QT Manager( n ) ( ) Name( ) email( 3) Name( 5 ) url( 6) Email( 7), n,, { }, h, null, { } 3, h, null, { } 5, h, null, {, } 6, h, null, {, } 7, h, null, {, } Employee( 8 ) Name( 9) Email( 0) Employee( ) Name( ) Email( 3) 8, h, null, {, } 9, h, null, {, } 0, h, null, {, }, h, null, { }, h, null, { } 3, h, null, { } ) XML XML 3 XML e /, e, e ( e e) name() email(3) manager(n)() department(n)(4) employee() name(5) url(6) email(7) employee(8) name() email(3) name(9) email(0) ; / (department,4)n (employee,) (manager,)n (department,4)n (employee,) (manager,)n 渊 label,preoder) (email,7) (email,0) (name,9) (name,5) (url,6) (email,7) (email,0) (name,5) (name,9) (url,6) 渊 a 冤子节点索引 渊 label,preoder) (email,0) (name,9) 渊 b 冤后代节点索引 渊 label,preoder) (email,3) (name,) 渊 label,preoder) (email,3) (name,) XML (department,4) (email,3) (employee,) (name,) (department,4) (email,3) (email,7) (email,0) (email,3) (employee,) (name,) (name,5) (name,9) (name,) 5, QT = ( V Q, E Q, R Q ), XML, 6 7
8, XML 03 PN4 [ ] return, E j, E j XML XML, num j,, ; QT = / manager/ manager/ manager/ / department/ / department/ employee[ email = abc@ 63. com ] / name, =,, ; P M = manager/ / department/employee / name,, L 3 L, P = employee[ email], email = abc@ 63. com, L 3 L P M P i ( i =,,, n, n. 3 XML ) XML, QT, ;,, EStack QT R = ( V R, R R ) QT R V R ; L( i) ( i =,, n), XML i ; S( i) ( i =,, n) i ; P[ j] j ; v, element / ; symbol / / P PathQuery( P) ( / ) ; leftson ; rightson STree( QT) QT R M Sp i / / ST[ ] EStack, L( i) 0 ( i =,, n), S( i) " = " ( i =,, n), j = 0 ST[ ] vqt. R ST3[ v] v ProcessNode( v) ST4[ ] P M " " while EStack do NodeEStack. pop( ) P M Node. symbol & Node. Element & P M L, S P M ST5[ ] P M, P i ( i =,, m) ProcessNode( v) v PN [ v] v. element, v L( i) + v i v no,, XML v / / / ), v PN [ v ] v, j +, P[ j] v P[ j] P[ j] L S P[ j] PN3 [ v ] v, Process- Node( v. rightson) m) n} ) R M PQ [ ] found, R M PQ[ ] P Sp i ( i =,,, PQ3 [ ] for( i =, i < = m, i + + ) found, PQ4 R M = PQ4 [ ] found, SchemeQuery( P M, P) P M P( P = { Pi i =,,, QT R ( V R, R R ) SQ[ ] V R, R R, found SQ[ ] P M / /, found, SQ4, R R PathQuery( P M ), found SQ4, V R V R R M SQ3[ ] for( i =, i < = n, i + + ) { PathQuery( P i ), found, SQ4, V R V R R i } SQ4[ ] found,, QT R ( { manager/ / department, department/ employee, employee / name} employee[ email] XML, ; / /, XML, ;, ( R M
04 007 )XML, 3 XML, 4, { manager/ /department, department/employee, employee/ name, employee[ email] }, / manager department, ( b) manager department, 4;, ( department, 4) employee, ( a), ( department, 4) employee, 8;, ( em- ployee, 8) email, ( a), ( employee, 8) email, 0;, ( employee, 8) name, ( a), ( employee, 8) name, 9, T R = ( man- ager, ) / / ( department, 4 ) / ( employee, 8 ) [ ( email, 0 ) ] / ( name, 9) R R ( manager, ), xml-preorder V R { [ ( name, 9), [ L 3 and L ] ], [ ( email, 0), email = abc@ 63. com [ L 3 andl ] ] }. 4 XML XML, ) XML [ 0] O M, O M O, XML e) ;, ancestor, XML O M R M R M xml-preorder R R preorder, b) P i XML ancestor) ( doc-id, xml-s-preorder, order, size, level, text, ( doc-id, xml-preorder, order, size, level, L, text, ancestor) ( doc-id, xml-preorder, order, size, level, ancestor) ( doc-id, xml-s-preorder, order, size, ; order XML O ; size v ; level M O e), O v XML ; L XML ; text ; ancestor, an- cestor[ i] v i xml-preorder order xml-preorder XML,, XML. Pentium 4 3. GHz CPU, 768 MB RAM 80 GB ; ) XML XML, XML XML XML, ; ; order ; ) XML shakespeare.. 00 XML XML, 37, 7. 56 MB, order KB [ 7] 4-Relation XML, ( XSBI ) [ 7] 4-, Relation 3 5 4 3 XML XML 3) a) P M XML XML, O i ( i =,,, n, n ) ancestor O i R i R i xml-preorder R R preorder c) R M R i ( i =,,, ) n, level, L, ancestor) R = R M ( n R i ) i = doc-id ; xml-preorder XML 3 左 9 左 n 3 3 d) R O M /, 3 order size level L ancestor 300 null 3 00 36 50 3 3 4 80 order size level L L text ancestor 95 5 5 amy 55 5 6 peter 9 5 4 mary Windows 000 server, Java 8 4 54 3 54 4 34 4
8, XML 05 XML, XML query Q Q Q3 Q4 Q5 Q6 3 path expression SCA NE / SPEECH[ SPEA KER = " poet" ] / / STA GEDIR A CT[ EPILOGUE] / / SPEECH /STAGEDIR PLA Y [ INDUCT] / ACT / SCA NE / /SPEECH [ SPEA KER = " TR ANIO" ] / LINE PLA Y / / PERSONA PLA Y [ PROLOGUE] / / SCNDESCR LINE / STA GEDIR,, 4-Re- XML lation ;, 4-Relation Q4, XML PLAY PERSONA, ) XML,, XML TOXgene [ ], 80 MB, 3 KB [ 7], [ 6] mergenodeinfo [ 6] mergenodeinfo NodeInfo, NodeInfo 4 University of Toronto and IBM TOXgene 6 6, [ ] RAY T, PAOLI J, SPERBERG-McQUEEN C M, ed al. extensible mergenodeinfo markup language ( XML). 0 3rd ed [ EB/OL]. ( 004-0 - 04 ) 4 [ 006-06- 05]. http / /www. w3. org /TR /004 /REC-xml. query Q Q Q3 Q4 Q5 5 path expression /manager/ manager / / department/ department /employee[ email = bob @ 63. com ] / name /manager/ department [ email = " rsk @ dlhx. c om. cn " ] / / employee [ email] / name department/ employee/ email /manager[ name = " Mary" ] / /manager/ department/name employee / name XSBI 4-Relation 6 XSBI mergenodeinfo 3), [ 9] CHIEN S Y, VAGENA Z, ZHANG Dong-hui, et al. Efficient struc- tural joins on indexed XML documents[ C] / /Proc of VLDB00. San 80 00 0 40 60 MB, 4 Q 7 7, [ 0] LI Quan-zhong, MOON B. Indexing and querying XML data for regular path expressions[ C] / / Proc of VLDB00. San Francisco Morgan, Kaufmann, 00 36-370., [ ] WANG Wei, JIANG Hai-feng, LU Hong-jun, et al. PbiTree coding 4 XML, XML /, ; XML ;, XML,,,, ( ), 7 [ ] BERGLUND A, BOAG S, CHAMBERLIN D. XML path language ( XPath). 0 [ EB/ OL]. ( 005-04- 04 ) [ 006-06 - 08 ]. http / / www. w3. org/ TR /005 /WD-xpath0. [ 3] BOAG S, CHAMBERLIN D, FERNNDE M F. XQuery. 0 an XML query language[ EB/ OL]. ( 005-04- 04 ) [ 006-06 - 08]. http / /www. w3. org/ TR /005 /WD-xquery /. [ 4 ] DEUTSCH A, FERNANDEZ M, FLORESCU D, XML-QL a query language for XML[ EB/OL]. http / /www. w3. org /TR /NOTE-xml-ql. [ 5] McHUGH J, WIDOM J. Query optimization for XML[ C] / /Proc of the 5th VLDB Conference. Edinburgh [ s. n. ], 999 35-36. [ 6] AL-KHALIFA S, JAGADISH H V, KOUDAS N, et al. Structural joins a primitive for efficient XML query pattern matching[ C] / / Proc of ICDE 00. 4-5. [ 7] JIANG Hai-feng, LU Hong-jun, WANG Wei, et al. XR-Tree indexing XML data for efficient structural joins[ C] / / Proc of ICDE003. Bangalore IEEE Computer Society, 003 53-64. [ 8 ],,. XML [ J]., 004, 5 ( 5) 70-79. Francisco Morgan Kaufmann, 00 63-74. and efficient processing of containment joins [ C ] / / Proc of IC- DE003. Los Alamitos IEEE Press, 003 39-40. ( 08 )
08 007 ( S t, t ) V t ;, v i, v i ; v i 6, B, P P 4 v i, 7, S 0 door ( a) S t - v i > d D ( d D ), ( b) S t - v i < d D,,,,,, v i,, S t - v i > d D, ( c) S t - v i < d D, 6 7 S 0 door, c), t i, [ ] USSELL S, NERVIG P. Artificial intelligence a modern approach, d [ M]. New Jersey Prentice Hall, 00. t i d S, S 0 A, [ ] LEONARD J, DURRANT-WHITE H. Mobile robot localization by t 4, tracking geometric beacons[ J]. IE EE Journ al of Rob otics and Auto Q, S 0 F S 0 A, F mation, 99, 6 ( 7 ) 89-97. [ 3] BETKE M, GURVITS L. FN Q ; Mobile robot localization using landmarks [ J]. IEE E Journal of Robotics an d A utomation, 997, 3 ( ) D, S 0 G S 0 A, 5-63. G GH D [ 4] TALLURI R, AGGARWAL J K. Mobile robot self-location using, 4, S 0 A model-image feature correspondence[ J]. IEEE Journal o f Ro botics and Auto mation, 996, ( ) 63-77. ( 5) [ 5] GUIVANT J, NEBOT E, DURRANT-WHYTE H F. Simultaneous lo- Z t S 0 v calization and map building using natural features in outdoor environ-, ments[ C] / / Proc of Intelligent Autonomous System. Venice,, [ s. n. ], 000 58-588.,, [ 6] MURPHY R R. [ M].,. ( ) ;, H P 3 P D P A N L t, 004. R B P 4 C S 0 (x s,y s ) A B H I DCJ K Italy ( 05 ) [ ] HANG C, NAUGHTON J, DeWITT D, et al. On supporting containment queries in relational database management systems [ C] / / Proc of the 00 ACM SIGMOD Int l Conf on Management of Data. [ 7] SEO C Y, LEE S W, KIM H J. An efficient inverted technique for XML documents using RDBMS [ J]. Information and Soft wa re Te chnology, 003, 45( ) -. New York ACM Press, 00 45-436. [ 3],,,. XML [ 8 ],,,. DTD XML [ J]., 005, 4 ( ) 30-37. [ J]., 005, 8 ( ) 3-7. [ 4] GOLDMAN R, WIDOM J. DataGuides enabling query formulation and optimization in semistructure databases [ C] / / Proc of the 3 rd VLDB Conference. Athens, Greece [ s. n. ], 997. [ 5] THOMPSON H S, BEECH D, MALONEY M. XML schema part structures nd ed [ EB/OL]. ( 004-0- 8) [ 006-06- 05 ]. http / / www. w3. org /TR /004 / REC-xmlschema-. [ 9 ] DIETZ P F. Maintaining order in a linked list[ C] / / Proc of the 4 th Annual ACM Symp on Theory of Computing. New York ACM Press, 98-7. [ 0] SCHLIEDER T. ApproXQL design and implementation of an approximate pattern matching language for XML, B0-0 [ D]. Berlin Freie University, 00. [ ] BARBOSA D, MENDELZON A, KEENLEYSIDE J. ToXgene the [ 6] CHUNG T S, KIM H J. Extracting indexing information from XML DTDs[ J]. Information Processing Letters, 00, 8 ( ) 97-03. ToX XML data generator [ EB /OL]. tox/ toxgene. http / /www. cs. toronto. edu/