2005/3/3 <k@bioruby.org>
CGI, DB
  REST --- URI
    REST: REpresentational State Transfer
  SOAP --- RPC, XML
    XML-RPC: XML Remote Procedure Call
    SOAP: Simple Object Access Protocol (also expanded as Service Oriented Architecture Protocol)
  SOAP + WSDL
    WSDL: Web Services Description Language
  SOAP + WSDL + UDDI
    UDDI: Universal Description, Discovery, and Integration
DBGET, dbfetch (SRS), BioFetch --- REST
Entrez E-Utilities (NCBI) --- REST
BioDAS (WormBase, Ensembl) --- REST
XML Central of DDBJ --- SOAP/WSDL
KEGG API --- SOAP/WSDL
EBI Web Services --- SOAP/WSDL
ESOAP (SOAP E-Utils) --- SOAP/WSDL
REST examples

BioFetch
  EMBL entries:
    http://www.ebi.ac.uk/cgi-bin/dbfetch?db=embl&id=j00231,bum
  The same entries in FASTA format:
    http://www.ebi.ac.uk/cgi-bin/dbfetch?db=embl&id=j00231,bum&format=fasta
BioDAS
  WormBase, DNA sequence of a segment:
    http://www.wormbase.org/db/das/elegans/dna?segment=i:1,20000
  WormBase, features on the same segment:
    http://www.wormbase.org/db/das/elegans/features?segment=i:1,20000
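The URLs above follow a fixed pattern, so a client can assemble them from a few parameters. The following is a minimal sketch, not part of the original slides: the helper names `dbfetch_url` and `das_url` are made up for illustration, and the query string is joined by hand so that the commas and colons stay literal, as they appear on the slide.

```ruby
require "uri"

# Build a dbfetch-style URL from a database name, a list of IDs and an
# optional output format. The endpoint is the one shown on the slide.
def dbfetch_url(db, ids, format = nil)
  params = { "db" => db, "id" => ids.join(",") }
  params["format"] = format if format
  query = params.map { |key, value| "#{key}=#{value}" }.join("&")
  "http://www.ebi.ac.uk/cgi-bin/dbfetch?#{query}"
end

# Build a DAS URL for a command ("dna" or "features") and a segment.
def das_url(base, dsn, command, ref, start, stop)
  "#{base}/das/#{dsn}/#{command}?segment=#{ref}:#{start},#{stop}"
end

fasta_url    = dbfetch_url("embl", ["j00231", "bum"], "fasta")
features_url = das_url("http://www.wormbase.org/db", "elegans", "features", "i", 1, 20000)

puts fasta_url
puts features_url

# Both strings are ordinary HTTP URIs and can be handed to any HTTP client.
puts URI.parse(features_url).host
```

No network access is needed to run the sketch; fetching the documents is left to the HTTP library of your choice.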
WormBase DAS response (features)

    <?xml version="1.0" standalone="yes"?>
    <!DOCTYPE DASGFF SYSTEM "http://www.biodas.org/dtd/dasgff.dtd">
    <DASGFF>
      <GFF version="1.01" href="http://www.wormbase.org/db/das/elegans/features?segment=i%3a1%2c20000">
        <SEGMENT id="i" start="1" stop="20000" version="1.0">
          <FEATURE id="sequence:yk582g6.5/241592" label="yk582g6.5">
            <TYPE id="est_match:blat_est_other" category="miscellaneous">est_match:blat_est_other</TYPE>
            <METHOD id="est_match">est_match</METHOD>
            <START>1</START>
            <END>22</END>
            <SCORE>14.2</SCORE>
            <ORIENTATION>+</ORIENTATION>
            <PHASE>0</PHASE>
            <LINK href="http://www.wormbase.org/db/get?name=yk582g6.5;class=sequence">yk582g6.5</LINK>
            <TARGET id="yk582g6.5" start="284" stop="305" />
            <GROUP id="sequence:yk582g6.5" type="sequence" />
          </FEATURE>
          <FEATURE id="sequence:yk585b5.5/722458" label="yk585b5.5">
            <TYPE id="est_match:blat_est_other" category="miscellaneous">est_match:blat_est_other</TYPE>
            <METHOD id="est_match">est_match</METHOD>
            <START>1</START>
            <END>50</END>
            <SCORE>12.8</SCORE>
            <ORIENTATION>-</ORIENTATION>
            <PHASE>0</PHASE>
            <LINK href="http://www.wormbase.org/db/get?name=yk585b5.5;class=sequence">yk585b5.5</LINK>
            <TARGET id="yk585b5.5" start="119" stop="168" />
            <GROUP id="sequence:yk585b5.5" type="sequence" />
          </FEATURE>
          <FEATURE id="161762" label="inverted_repeat:inverted">
            <TYPE id="inverted_repeat:inverted" category="miscellaneous">inverted_repeat:inverted</TYPE>
            <METHOD id="inverted_repeat">inverted_repeat</METHOD>
DAS: Distributed Annotation System
  REST: requests are URIs
  Responses are XML (defined by a DTD)
  Servers: Ensembl, UCSC, WormBase, FlyBase, KEGG DAS, etc.
KEGG DAS - GBrowse
  http://das.hgc.jp/
  GMOD/GBrowse, DAS
  KEGG 237 KEGG
KEGG DAS

  GBrowse view:
    http://das.hgc.jp/cgi-bin/gbrowse/eco?name=eco:205563..255562
  DAS request for the same region:
    http://das.hgc.jp/cgi-bin/das/eco/features?segment=eco:205563,255562

    <?xml version="1.0" standalone="yes"?>
    <!DOCTYPE DASGFF SYSTEM "http://www.biodas.org/dtd/dasgff.dtd">
    <DASGFF>
      <GFF version="1.01" href="http://das.hgc.jp/cgi-bin/das/eco/features?segment=eco%3a205563%2c255562">
        <SEGMENT id="eco" start="205563" stop="255562" version="1.0">
          <FEATURE id="ec:1.1.1.-/649" label="1.1.1.-">
            <TYPE id="enzyme:kegg" category="enzyme">enzyme:kegg</TYPE>
            <METHOD id="enzyme">enzyme</METHOD>
            <START>229167</START>
            <END>229970</END>
            <SCORE>-</SCORE>
            <ORIENTATION>+</ORIENTATION>
            <PHASE>0</PHASE>
            <LINK href="http://www.genome.jp/dbget-bin/www_bget?ec:1.1.1.-">1.1.1.-</LINK>
            <GROUP id="ec:1.1.1.-" type="ec" />
          </FEATURE>
          <FEATURE id="ec:1.3.99.-/700" label="1.3.99.-">
            <TYPE id="enzyme:kegg" category="enzyme">enzyme:kegg</TYPE>
            <METHOD id="enzyme">enzyme</METHOD>
            <START>240859</START>
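A client that does not use a DAS library has to parse the DASGFF document itself. The sketch below shows one way to do that with REXML; the sample document is a shortened copy of the KEGG DAS response above, and the `Feature` struct and `parse_features` helper are illustrative names, not part of any DAS library.

```ruby
require "rexml/document"

# A shortened DASGFF document modelled on the KEGG DAS response.
DASGFF_SAMPLE = <<~XML
  <?xml version="1.0" standalone="yes"?>
  <DASGFF>
    <GFF version="1.01" href="http://das.hgc.jp/cgi-bin/das/eco/features">
      <SEGMENT id="eco" start="205563" stop="255562" version="1.0">
        <FEATURE id="ec:1.1.1.-/649" label="1.1.1.-">
          <TYPE id="enzyme:kegg" category="enzyme">enzyme:kegg</TYPE>
          <METHOD id="enzyme">enzyme</METHOD>
          <START>229167</START>
          <END>229970</END>
          <ORIENTATION>+</ORIENTATION>
        </FEATURE>
      </SEGMENT>
    </GFF>
  </DASGFF>
XML

# Plain value object for the fields we care about.
Feature = Struct.new(:id, :label, :type, :start, :stop, :strand)

# Walk every FEATURE element and copy its attributes and child text.
def parse_features(xml)
  doc = REXML::Document.new(xml)
  features = []
  doc.elements.each("DASGFF/GFF/SEGMENT/FEATURE") do |node|
    features << Feature.new(
      node.attributes["id"],
      node.attributes["label"],
      node.elements["TYPE"].text,
      node.elements["START"].text.to_i,
      node.elements["END"].text.to_i,
      node.elements["ORIENTATION"].text
    )
  end
  features
end

features = parse_features(DASGFF_SAMPLE)
features.each do |f|
  puts "#{f.label}\t#{f.start}..#{f.stop}\t#{f.strand}"
end
```

This is the work that the XML format pushes onto every REST client, and it is what a library such as BioRuby's Bio::DAS hides.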
Using DAS from BioRuby

    #!/usr/bin/env ruby

    require 'bio'

    serv = Bio::DAS.new("http://das.hgc.jp/cgi-bin/")

    # E. coli (eco) genome, positions 205563 to 255562
    segment = Bio::DAS::SEGMENT.region("eco", 205563, 255562)

    # Fetch the DNA sequence
    results = serv.get_dna("eco", segment)
    results.each do |dna|
      puts dna.sequence
    end

    # Fetch the features
    results = serv.get_features("eco", segment)
    results.segments.each do |segment|
      segment.features.each do |feature|
        puts feature.entry_id
        puts feature.start
      end
    end
REST URI
REST XML
SOAP services

  KEGG API
    http://www.genome.jp/kegg/soap/
    DBGET, GENES
  XML Central of DDBJ
    http://xml.ddbj.nig.ac.jp/
    DDBJ, GIB, GTOP, PML, etc.
  EBI Web Services
    http://www.ebi.ac.uk/tools/webservices/
    DBFetch, WU-BLAST, FASTA, InterProScan
  NCBI ESOAP
    http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html
    Entrez (EFetch, ESearch)
SOAP: method calls and results are exchanged as XML

  Call:
    get_genes_by_pathway("path:eco00010")
  Result:
    "eco:b0114", "eco:b0115", "eco:b0116", ...
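To make the "method call as XML" idea concrete, here is a minimal sketch that turns a method name and its arguments into a SOAP request envelope by hand. The helper `soap_request` is hypothetical, and the namespace "SOAP/KEGG" is an assumption taken from the SOAPAction header that appears in the wire dump later in these slides. A real client would let a SOAP library generate this document.

```ruby
# Build a minimal SOAP 1.1 request envelope for an RPC-style call.
# params maps argument names to values; values are XML-escaped.
def soap_request(namespace, method, params)
  lines = []
  lines << '<?xml version="1.0" encoding="utf-8"?>'
  lines << '<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">'
  lines << '  <env:Body>'
  lines << %(    <n1:#{method} xmlns:n1="#{namespace}">)
  params.each do |name, value|
    lines << "      <#{name}>#{value.to_s.encode(xml: :text)}</#{name}>"
  end
  lines << "    </n1:#{method}>"
  lines << '  </env:Body>'
  lines << '</env:Envelope>'
  lines.join("\n")
end

request = soap_request("SOAP/KEGG", "get_genes_by_pathway",
                       { "pathway_id" => "path:eco00010" })
puts request
```

The envelope would then be sent with an HTTP POST; the response comes back as another envelope that wraps the return value.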
WSDL: the service description, written in XML

    <!-- color_pathway_by_objects -->
    <message name="color_pathway_by_objectsRequest">
      <part name="pathway_id" type="xsd:string"/>
      <part name="object_id_list" type="typens:ArrayOfstring"/>
      <part name="fg_color_list" type="typens:ArrayOfstring"/>
      <part name="bg_color_list" type="typens:ArrayOfstring"/>
    </message>
    <message name="color_pathway_by_objectsResponse">
      <part name="return" type="xsd:string"/>
    </message>

    <!-- Objects on the pathway -->

    <!-- get_genes_by_pathway -->
    <message name="get_genes_by_pathwayRequest">
      <part name="pathway_id" type="xsd:string"/>
    </message>
    <message name="get_genes_by_pathwayResponse">
      <part name="return" type="typens:ArrayOfstring"/>
    </message>

    <!-- get_enzymes_by_pathway -->
    <message name="get_enzymes_by_pathwayRequest">
      <part name="pathway_id" type="xsd:string"/>
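A WSDL-aware client reads these message definitions to learn each method's argument names and types. The sketch below illustrates the idea on a simplified fragment: the `definitions` root element is written without the WSDL default namespace to keep the XPath short, and `message_parts` is an illustrative helper, not an API of any SOAP library.

```ruby
require "rexml/document"

# Simplified WSDL fragment modelled on the KEGG API messages.
WSDL_FRAGMENT = <<~XML
  <definitions>
    <message name="get_genes_by_pathwayRequest">
      <part name="pathway_id" type="xsd:string"/>
    </message>
    <message name="get_genes_by_pathwayResponse">
      <part name="return" type="typens:ArrayOfstring"/>
    </message>
  </definitions>
XML

# Return a hash: message name => list of [part name, part type] pairs.
def message_parts(wsdl_xml)
  doc = REXML::Document.new(wsdl_xml)
  signatures = {}
  doc.elements.each("definitions/message") do |message|
    parts = []
    message.elements.each("part") do |part|
      parts << [part.attributes["name"], part.attributes["type"]]
    end
    signatures[message.attributes["name"]] = parts
  end
  signatures
end

signatures = message_parts(WSDL_FRAGMENT)
signatures.each do |name, parts|
  args = parts.map { |part_name, type| "#{part_name}: #{type}" }.join(", ")
  puts "#{name}(#{args})"
end
```

With this information a library can generate the method stubs automatically, which is why the client scripts in these slides never mention XML.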
SOAP + WSDL (1)
  A WSDL file can be consumed from many languages: Perl, Python, Ruby, PHP, Java, C#
Ruby

    #!/usr/bin/env ruby

    require "soap/wsdlDriver"

    wsdl = "http://soap.genome.jp/kegg.wsdl"
    serv = SOAP::WSDLDriverFactory.new(wsdl).create_driver

    # List the genes on an E. coli (eco) pathway
    puts serv.get_genes_by_pathway("path:eco00010")

    # List the E. coli (eco) pathways
    list = serv.list_pathways("eco")
    list.each do |path|
      puts "#{path.entry_id}\t#{path.definition}\n"
    end
Perl

    #!/usr/bin/env perl

    use SOAP::Lite;

    $wsdl = "http://soap.genome.jp/kegg.wsdl";
    $serv = SOAP::Lite -> service($wsdl);

    # List the E. coli (eco) pathways
    $list = $serv -> list_pathways("eco");
    foreach $path (@{$list}) {
        print "$path->{entry_id}\t$path->{definition}\n";
    }
Output

    # Genes on an E. coli (eco) pathway
    get_genes_by_pathway("path:eco00010")

    eco:b0114
    eco:b0115
    eco:b0116
    eco:b0356
    eco:b0688
    :

    # E. coli (eco) pathways
    list_pathways("eco")

    path:eco00010   Glycolysis / Gluconeogenesis - Escherichia coli K-12 MG1655
    path:eco00020   Citrate cycle (TCA cycle) - Escherichia coli K-12 MG1655
    path:eco00030   Pentose phosphate pathway - Escherichia coli K-12 MG1655
    path:eco00040   Pentose and glucuronate interconversions - Escherichia coli K-12 MG1655
    path:eco00051   Fructose and mannose metabolism - Escherichia coli K-12 MG1655
    path:eco00052   Galactose metabolism - Escherichia coli K-12 MG1655
    path:eco00053   Ascorbate and aldarate metabolism - Escherichia coli K-12 MG1655
    path:eco00061   Fatty acid biosynthesis (path 1) - Escherichia coli K-12 MG1655
    :
SOAP libraries
  Ruby: the SOAP library ships with Ruby 1.8 (Ruby 1.8.2)
  Perl: install SOAP::Lite from CPAN
Java
  With Apache Axis, generate the client classes from the WSDL using org.apache.axis.wsdl.WSDL2Java, package them as a jar, and add the jar to the CLASSPATH
Java

    import keggapi.*;

    class GetGenesByPathway {
        public static void main(String[] args) throws Exception {
            // Locate the service and obtain the port (the client stub)
            KEGGLocator locator = new KEGGLocator();
            KEGGPortType serv = locator.getKEGGPort();

            // The pathway ID is given on the command line
            String query = args[0];
            String[] results = serv.get_genes_by_pathway(query);

            for (int i = 0; i < results.length; i++) {
                System.out.println(results[i]);
            }
        }
    }

Compile and run (the Axis jars must also be on the CLASSPATH):

    % javac -classpath keggapi.jar GetGenesByPathway.java
    % java -classpath keggapi.jar:. GetGenesByPathway path:eco00010
    eco:b0114
    eco:b0115
    eco:b0116
    :
SOAP + WSDL (2) XML XML
Looking at the XML

    #!/usr/bin/env ruby

    require 'soap/wsdlDriver'

    wsdl = "http://soap.genome.jp/kegg.wsdl"
    serv = SOAP::WSDLDriverFactory.new(wsdl).create_driver

    # Print the SOAP messages on the wire to STDERR
    serv.wiredump_dev = STDERR

    # List the genes on an E. coli (eco) pathway
    puts serv.get_genes_by_pathway("path:eco00010")
SOAP XML on the wire

    Wire dump:

    opening connection to soap.genome.jp...
    opened
    <- "POST /keggapi/request_v3.2.cgi HTTP/1.1\r\nAccept: */*\r\ncontent-type: text/xml; charset=utf-8\r\nuser-agent: SOAP4R/1.5.3-ruby1.8.2\r\nSoapaction: \"SOAP/KEGG#get_genes_by_pathway\"\r\nContent-Length: 336\r\nHost: soap.genome.jp\r\n\r\n"
    <- "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<env:Envelope xmlns:env=\"http://schemas.xmlsoap.org/soap/envelope/\"\n xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\n <env:Body>\n <n1:get_genes_by_pathway xmlns:n1=\"SOAP/KEGG\">\n <pathway_id>path:eco00010</pathway_id>\n </n1:get_genes_by_pathway>\n </env:Body>\n</env:Envelope>"
    -> "HTTP/1.1 200 OK\r\n"
    -> "Date: Thu, 03 Mar 2005 02:02:19 GMT\r\n"
    -> "Server: Apache/1.3.26 (Unix)\r\n"
    -> "SOAPServer: SOAP::Lite/Perl/0.55\r\n"
    -> "Content-Length: 2422\r\n"
    -> "Content-Type: text/xml; charset=utf-8\r\n"
    -> "\r\n"
    reading 2422 bytes...
    -> "<?xml version=\"1.0\" encoding=\"utf-8\"?><SOAP-ENV:Envelope
        xmlns:SOAP-ENC=\"http://schemas.xmlsoap.org/soap/encoding/\"
        SOAP-ENV:encodingStyle=\"http://schemas.xmlsoap.org/soap/encoding/\"
        xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\"
        xmlns:xsi=\"http://www.w3.org/1999/XMLSchema-instance\"
        xmlns:xsd=\"http://www.w3.org/1999/XMLSchema\"><SOAP-ENV:Body><namesp1:get_genes_by_pathwayResponse
        xmlns:namesp1=\"SOAP/KEGG\"><return SOAP-ENC:arrayType=\"xsd:string[42]\"
        xsi:type=\"SOAP-ENC:Array\"><item xsi:type=\"xsd:string\">eco:b0114</item><item
        xsi:type=\"xsd:string\">eco:b0115</item><item
        xsi:type=\"xsd:string\">eco:b0116</item><item
        xsi:type=\"xsd:string\">eco:b0356</item><item
        xsi:type=\"xsd:string\">eco:b0688</item><item
        xsi:type=\"xsd:string\">eco:b0755</item><item
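The response in the wire dump is an array of strings wrapped in a SOAP envelope. As a rough illustration of what the SOAP library does on the client's behalf, the sketch below extracts the array items with REXML. The sample response is shortened to three items, and `soap_array_items` is an illustrative helper name.

```ruby
require "rexml/document"

# Shortened response modelled on the wire dump (three items instead of 42).
SOAP_RESPONSE = <<~XML
  <?xml version="1.0" encoding="utf-8"?>
  <SOAP-ENV:Envelope xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
                     xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
                     xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
                     xmlns:xsd="http://www.w3.org/1999/XMLSchema">
    <SOAP-ENV:Body>
      <namesp1:get_genes_by_pathwayResponse xmlns:namesp1="SOAP/KEGG">
        <return SOAP-ENC:arrayType="xsd:string[3]" xsi:type="SOAP-ENC:Array">
          <item xsi:type="xsd:string">eco:b0114</item>
          <item xsi:type="xsd:string">eco:b0115</item>
          <item xsi:type="xsd:string">eco:b0116</item>
        </return>
      </namesp1:get_genes_by_pathwayResponse>
    </SOAP-ENV:Body>
  </SOAP-ENV:Envelope>
XML

# Collect the text of every <item> under <return>.
def soap_array_items(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//return/item").map { |item| item.text }
end

genes = soap_array_items(SOAP_RESPONSE)
puts genes
```

The sketch ignores the `xsi:type` attributes; a full SOAP library uses them to convert each item to the matching native type.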
SOAP + WSDL (3) Java Ruby SOAP
    # List the E. coli (eco) pathways
    list = serv.list_pathways("eco")      # => ArrayOfDefinition
    list.each do |path|                   # => Definition
      puts "#{path.entry_id}\t#{path.definition}\n"
    end

Definition type

    entry_id     ID (string)
    definition   (string)

SOAP response

    <SOAP-ENV:Body><namesp1:list_pathwaysResponse xmlns:namesp1=\"SOAP/KEGG\">
    <return SOAP-ENC:arrayType=\"namesp2:SOAPStruct[111]\" xsi:type=\"SOAP-ENC:Array\">
    <item xsi:type=\"namesp2:SOAPStruct\">
      <definition xsi:type=\"xsd:string\">
        Glycolysis / Gluconeogenesis - Escherichia coli K-12 MG1655
      </definition>
      <entry_id xsi:type=\"xsd:string\">path:eco00010</entry_id>
    </item>
    <item xsi:type=\"namesp2:SOAPStruct\">
      <definition xsi:type=\"xsd:string\">
        Citrate cycle (TCA cycle) - Escherichia coli K-12 MG1655
      </definition>
      <entry_id xsi:type=\"xsd:string\">path:eco00020</entry_id>
    </item>
    :

Output

    path:eco00010   Glycolysis / Gluconeogenesis - Escherichia coli K-12 MG1655
    path:eco00020   Citrate cycle (TCA cycle) - Escherichia coli K-12 MG1655
    path:eco00030   Pentose phosphate pathway - Escherichia coli K-12 MG1655
    :
SSDBRelation type

    genes_id1         genes_id of the query (string)
    genes_id2         genes_id of the target (string)
    sw_score          Smith-Waterman score between genes_id1 and genes_id2 (int)
    bit_score         bit score between genes_id1 and genes_id2 (float)
    identity          identity between genes_id1 and genes_id2 (float)
    overlap           overlap length between genes_id1 and genes_id2 (int)
    start_position1   start position of the alignment in genes_id1 (int)
    end_position1     end position of the alignment in genes_id1 (int)
    start_position2   start position of the alignment in genes_id2 (int)
    end_position2     end position of the alignment in genes_id2 (int)
    best_flag_1to2    best-hit flag from genes_id1 to genes_id2 (boolean)
    best_flag_2to1    best-hit flag from genes_id2 to genes_id1 (boolean)
    definition1       definition of genes_id1 (string)
    definition2       definition of genes_id2 (string)
    length1           length of genes_id1 (int)
    length2           length of genes_id2 (int)

    # Bidirectional best hits of the gene b0002
    list = serv.get_best_best_neighbors_by_gene("eco:b0002", 1, 100)
    list.each do |hit|
      puts hit.genes_id1    # => eco:b0002   eco:b0002
      puts hit.genes_id2    # => ecj:jw0001  bsu:bg10350
      puts hit.sw_score     # => 5283        561
    end
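Because the SOAP library maps each SSDBRelation to an object with plain accessors, ordinary Ruby collection methods work on the results. The sketch below imitates such objects with a local `Struct` so that it runs without a network connection. The struct carries only a subset of the fields, the best-hit flags and the third hit are made-up sample values, and `strong_mutual_hits` is an illustrative helper.

```ruby
# Local stand-in for the objects returned by the SOAP driver
# (subset of the SSDBRelation fields).
Hit = Struct.new(:genes_id1, :genes_id2, :sw_score,
                 :best_flag_1to2, :best_flag_2to1, keyword_init: true)

# Sample data: IDs and scores of the first two hits follow the slide,
# the flags and the third hit are invented for illustration.
hits = [
  Hit.new(genes_id1: "eco:b0002", genes_id2: "ecj:jw0001", sw_score: 5283,
          best_flag_1to2: true, best_flag_2to1: true),
  Hit.new(genes_id1: "eco:b0002", genes_id2: "bsu:bg10350", sw_score: 561,
          best_flag_1to2: true, best_flag_2to1: true),
  Hit.new(genes_id1: "eco:b0002", genes_id2: "xxx:example", sw_score: 150,
          best_flag_1to2: true, best_flag_2to1: false)
]

# Keep hits that are best in both directions and reach a minimum score,
# strongest first.
def strong_mutual_hits(hits, min_score)
  hits.select { |hit| hit.best_flag_1to2 && hit.best_flag_2to1 }
      .select { |hit| hit.sw_score >= min_score }
      .sort_by { |hit| -hit.sw_score }
end

selected = strong_mutual_hits(hits, 500)
selected.each do |hit|
  puts "#{hit.genes_id1}\t#{hit.genes_id2}\t#{hit.sw_score}"
end
```

The same `select` and `sort_by` calls should apply unchanged to the list returned by the real service, since only the accessor names are used.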
SOAP WSDL SAX REST
(1) Timeouts

    # Lengthen the timeouts (in seconds)
    serv.options["protocol.http.connect_timeout"] = 60
    serv.options["protocol.http.receive_timeout"] = 600

    # Retry when the request times out
    begin
      results = serv.send(*arg)
    rescue Timeout::Error
      retry
    end
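The bare `retry` above loops forever if the server stays unreachable. One possible refinement, sketched here under the assumption that a handful of attempts is enough, is to cap the number of attempts and wait a little longer before each one. The helper name `with_retries` and its parameters are illustrative, not part of SOAP4R.

```ruby
require "timeout"

# Run the block, retrying on Timeout::Error up to max_attempts times.
# The wait doubles after every failed attempt (base_delay, 2x, 4x, ...).
def with_retries(max_attempts: 3, base_delay: 1.0)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue Timeout::Error
    raise if attempts >= max_attempts
    sleep(base_delay * (2**(attempts - 1)))
    retry
  end
end

# Simulated service call that times out twice before succeeding.
calls = 0
result = with_retries(max_attempts: 3, base_delay: 0) do
  calls += 1
  raise Timeout::Error, "simulated timeout" if calls < 3
  "ok after #{calls} calls"
end
puts result
```

With a real driver the block body would be the service call itself, for example `with_retries { serv.send(*arg) }`.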
(2) Proxy
  Set the http_proxy environment variables

    # Ruby (SOAP4R)
    setenv SOAP_USE_PROXY on
    setenv HTTP_PROXY my.proxy.server:8080

    #!/usr/bin/env perl

    use strict;
    use SOAP::Lite;

    my $wsdl = "http://soap.genome.ad.jp/kegg.wsdl";

    my $results = SOAP::Lite
        -> proxy("$wsdl", proxy => "http://my.proxy.server/")
        -> get_pathways_by_enzymes(SOAP::Data->name(data => ['ec:1.3.99.1']));

    foreach (@{$results}) {
        print $_, "\n";
    }
Example: Pfam motifs in the homologs of a gene

    #!/usr/bin/env ruby

    require 'bio'

    serv = Bio::KEGG::API.new

    # Homologs (best neighbors) of the gene hsa:7368
    homologs = serv.get_all_best_neighbors_by_gene("hsa:7368")

    homologs.each do |hit|
      gene = hit.genes_id2
      # If the homolog has Pfam motifs, print them
      if motifs = serv.get_motifs_by_gene(gene, "pfam")
        motifs.each do |motif|
          name = motif.motif_id
          desc = motif.definition
          puts "#{gene}: #{name} #{desc}"
        end
      end
    end
Example: coloring genes on a pathway

    serv = Bio::KEGG::API.new

    list = serv.get_genes_by_pathway("path:bsu00020")

    fg_colors = Array.new
    bg_colors = Array.new

    list.each do |gene|
      fg_colors << "black"
      bg_colors << ratio2rgb(gene)    # mapping from gene name to color
    end

    url = serv.color_pathway_by_objects("path:bsu00020", list, fg_colors, bg_colors)
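The slide calls `ratio2rgb` without defining it. The following is one possible implementation, offered only as a sketch: it assumes the expression data are log2 ratios kept in a hash keyed by gene ID, that up-regulated genes should be red and down-regulated genes green, and that the service accepts colors written as `#rrggbb`. The gene IDs and ratios in `RATIOS` are invented.

```ruby
# Hypothetical expression data: gene ID => log2(expression ratio).
RATIOS = {
  "bsu:example_up"   => 2.0,
  "bsu:example_down" => -1.0,
  "bsu:example_flat" => 0.0
}

# Map a gene's ratio to a color: white for no change, fading to pure red
# at +limit and to pure green at -limit. Unknown genes stay white.
def ratio2rgb(gene, ratios = RATIOS, limit = 2.0)
  ratio = ratios.fetch(gene, 0.0)
  level = [[ratio / limit, -1.0].max, 1.0].min   # clamp to -1.0..1.0
  fade  = (255 * (1.0 - level.abs)).round        # 255 = white, 0 = saturated
  if level >= 0
    format("#ff%02x%02x", fade, fade)
  else
    format("#%02xff%02x", fade, fade)
  end
end

RATIOS.each_key do |gene|
  puts "#{gene}\t#{ratio2rgb(gene)}"
end
```

Clamping keeps a single extreme ratio from washing out the rest of the map; the `limit` of 2.0 is an arbitrary choice and should be tuned to the data.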
Example: genes with PDB links on a pathway

    #!/usr/bin/env ruby

    require 'bio'

    serv = Bio::KEGG::API.new

    # Pathway to search (default: path:eco00010)
    path = ARGV.shift || "path:eco00010"
    genes = serv.get_genes_by_pathway(path)

    # Collect the genes that have links to PDB
    results = Hash.new
    genes.each do |gene|
      if pdb_links = serv.get_all_linkdb_by_entry(gene, "pdb")
        pdb_links.each do |link|
          results[gene] = true
        end
      end
    end

    # Mark those genes on the pathway map
    url = serv.mark_pathway_by_objects(path, results.keys)

    # Save the image
    serv.save_image(url, "pdb.gif")
    #!/usr/bin/env ruby

    require 'bio'

    ### KEGG API: collect the paralogs of a gene and fetch their sequences
    kegg = Bio::KEGG::API.new

    list = kegg.get_all_paralogs_by_gene("eco:b0002")

    genes = Array.new
    list.each do |hit|
      genes << hit.genes_id2
    end

    seqs = kegg.get_aaseqs(genes)

    ### DDBJ XML: align the sequences with ClustalW
    ddbj = Bio::DDBJ::XML::ClustalW.new
    puts ddbj.analyzeSimple(seqs)
ID DDBJ KEGG LSID
BioMOBY UDDI http://www.biomoby.org/
HTML XML
KEGG API KEGG HTML Oracle SQL HTML/CGI Perl SOAP::Lite
KEGG API PATHWAY GENES/SSDB pre-calc Pfam DBGET
KEGG API HTML RDB KEGG