How to use field search with ARSA
The searchByXMLPath method in ARSA provides field search.
The items to search can be specified by XML tag (XML path).
searchByXMLPath takes the following four parameters.
1. queryPath: The XML path query. (More than one can be specified.)
2. returnPath: The XML path to return. (Results are separated by tabs.)
3. offset: The start position of the results to retrieve.
4. count: The number of results to retrieve. (Max 1000)
In the REST service, specify the queryPath, returnPath, offset and count parameters.
Click here to see a sample program that uses the REST service.
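As an illustration of how the four parameters fit together, the following is a minimal Perl sketch that assembles a form-encoded request body without sending it. The helper names form_encode and build_body are hypothetical and are not part of ARSA; the service and method names are taken from the search example on this page, and the sketch assumes ASCII-only parameter values.

```perl
use strict;
use warnings;

# Percent-encode every character except word characters and spaces,
# then replace spaces with '+' (form encoding). Assumes ASCII input.
sub form_encode {
    my ($text) = @_;
    $text =~ s/([^\w ])/sprintf('%%%02X', ord($1))/eg;
    $text =~ tr/ /+/;
    return $text;
}

# Assemble the request body from the four searchByXMLPath parameters.
# (Hypothetical helper, shown only to illustrate the parameter layout.)
sub build_body {
    my (%p) = @_;
    my @pairs = (
        [ service    => 'ARSA' ],
        [ method     => 'searchByXMLPath' ],
        [ queryPath  => $p{queryPath} ],
        [ returnPath => $p{returnPath} ],
        [ offset     => $p{offset} ],
        [ count      => $p{count} ],
    );
    return join '&', map { $_->[0] . '=' . form_encode($_->[1]) } @pairs;
}

print build_body(
    queryPath  => "/ENTRY/DDBJ/definition = '16S ribosomal RNA'",
    returnPath => '/ENTRY/DDBJ/primary-accession',
    offset     => 1,
    count      => 10,
), "\n";
```

The resulting string could be passed as the content of a POST request with the content type application/x-www-form-urlencoded.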
The WSDL URL for SOAP is as follows.
http://xml.nig.ac.jp/wsdl/ARSA.wsdl
By using an XML path in your retrieval query, you can perform a "detailed retrieval" that specifies various items.
Moreover, by using XML paths in your return query, you can get results that include various items (not only the accession number but also the definition, organism and so on).
Items that can be specified for each database
See the following links for the items that can be specified for each database. Each page describes the items and gives examples of XML paths.
| Category | Databases |
|---|---|
| Sequence Libraries | DDBJ, DAD, PRF, UniProt/Swiss-Prot, UniProt/TrEMBL, IMGT/LIGM-DB |
| Sequence Related | PROSITE, PROSITEDOC, BLOCKS, PRINTS, PFAMA, PFAMB, PRODOM, ENZYME |
| Protein 3D Structures | PDB, HSSP, FSSP |
| Metabolic Pathways | KEGG PATHWAY, LENZYME, LCOMPOUND |
You can also see XML path examples by using the Advanced Search or Cross Search of the ARSA System.
e.g.,
/ENTRY/DDBJ/definition = 'Nostoc sp\. \'Peltigera canina\' sample 13 trnL gene, intron sequence'
e.g.,
/ENTRY/DDBJ/definition = 'Nostoc sp. \'Peltigera canina\' sample 13 trnL gene, intron sequence'
QueryPath
A query path is specified as follows.
[XPath] [Relational Operators] [Query] [Logical Operators] [XPath] [Relational Operators] [Query]...
Relational Operators
For word searches, there are partial matches and perfect matches.
| Type | Relational Operators | Description |
|---|---|---|
| Partial match | = | Return entries when the keyword exists in the content of XPath elements |
| | != | Return entries when the keyword does not exist in the content of XPath elements |
| Perfect match | == | Return entries when the keyword and the content of XPath elements match perfectly |
| | !== | Return entries when the keyword and the content of XPath elements do not match perfectly |
For value searches, there are matches and magnitude comparisons.
| Type | Relational Operators | Description |
|---|---|---|
| Match | = | Return entries when the entered numerical value and the content of XPath elements match |
| | != | Return entries when the entered numerical value and the content of XPath elements do not match |
| Magnitude comparison | <, <=, >, >= | Compare the content of XPath elements with the entered numerical value |
Logical Operators
Logical operators are used to connect keywords when specifying multiple keywords for a search.
| Type | Logical operators | Description |
|---|---|---|
| Conjunction | AND | Keywords can be connected using the word "AND". Capital letters and a space on each side are required, as in " AND ". |
| Disjunction | OR | Keywords can be connected using the word "OR". Capital letters and a space on each side are required, as in " OR ". |
Query
You can specify any keyword. For a word search, such as a search on DEFINITION, enclose the search query in single quotation marks (').
For example, to get DDBJ entries that include '16S ribosomal RNA' in DEFINITION:
/ENTRY/DDBJ/definition = '16S ribosomal RNA'
For example, to get DDBJ entries with a sequence length between 10000 bp and 20000 bp:
/ENTRY/DDBJ/length > 10000 AND /ENTRY/DDBJ/length < 20000
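Queries like the two above can also be composed programmatically. The following is a minimal Perl sketch; the helper names word_condition, value_condition, join_and and join_or are hypothetical and are not part of ARSA. It does not escape single quotation marks inside keywords.

```perl
use strict;
use warnings;

# Word search: the keyword is enclosed in single quotes.
# (Quotes inside the keyword are not escaped in this sketch.)
sub word_condition {
    my ($xpath, $op, $keyword) = @_;
    die "unsupported word operator: $op\n"
        unless grep { $_ eq $op } qw(= != == !==);
    return "$xpath $op '$keyword'";
}

# Value search: the number is written without quotes.
sub value_condition {
    my ($xpath, $op, $number) = @_;
    die "unsupported value operator: $op\n"
        unless grep { $_ eq $op } qw(= != < <= > >=);
    return "$xpath $op $number";
}

# Logical operators need capital letters and a space on each side.
sub join_and { return join ' AND ', @_ }
sub join_or  { return join ' OR ',  @_ }

print word_condition('/ENTRY/DDBJ/definition', '=', '16S ribosomal RNA'), "\n";
print join_and(
    value_condition('/ENTRY/DDBJ/length', '>', 10000),
    value_condition('/ENTRY/DDBJ/length', '<', 20000),
), "\n";
```

Keeping word and value conditions in separate helpers makes it harder to quote a number by accident or to use a perfect-match operator in a value search.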
There are three restrictions.
1. Query word length: Specify two or more characters for a query word.
2. Upper limit of query number: Up to 20 queries can be specified at a time.
3. Escape sequence: Add an escape sequence to every single quotation mark in your query. For example, if your query contains single quotation marks, as in "'Peltigera canina'", escape them as "\'Peltigera canina\'".
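The restrictions can be checked on the client side before a request is sent. The following is a minimal Perl sketch; quote_keyword and check_condition_count are hypothetical helper names, and reading the limit of 20 as a limit on the number of conditions in one queryPath is an assumption.

```perl
use strict;
use warnings;

# Escape single quotes inside a keyword and enclose it in single quotes.
# Dies if the keyword is shorter than two characters.
sub quote_keyword {
    my ($keyword) = @_;
    die "query word needs two or more characters\n" if length($keyword) < 2;
    (my $escaped = $keyword) =~ s/'/\\'/g;
    return "'$escaped'";
}

# Assumption: the limit of 20 applies to the conditions in one queryPath.
sub check_condition_count {
    my (@conditions) = @_;
    die "at most 20 queries can be specified at a time\n" if @conditions > 20;
    return scalar @conditions;
}

my $keyword = quote_keyword("Nostoc sp. 'Peltigera canina' sample 13");
print "/ENTRY/DDBJ/definition = $keyword\n";
```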
ReturnPath
You can get multiple items by separating the paths with ','.
e.g., /ENTRY/DDBJ/primary-accession,/ENTRY/DDBJ/moltype
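Because the returned items are separated by tabs, a response can be split back into named fields. The following is a minimal Perl sketch, assuming one entry per line, fields in the same order as the returnPath, and no header line; the exact response layout is not described here. The helper name parse_result and the sample text are made up for illustration.

```perl
use strict;
use warnings;

# Split a tab-separated response into one hash per line, keyed by return path.
# Assumes one entry per line and fields in returnPath order.
sub parse_result {
    my ($return_path, $content) = @_;
    my @paths = split /,/, $return_path;
    my @rows;
    for my $line (split /\r?\n/, $content) {
        next if $line eq '';
        my @values = split /\t/, $line, -1;
        my %row;
        @row{@paths} = @values;
        push @rows, \%row;
    }
    return @rows;
}

# Made-up response text, for illustration only.
my $sample = "XX000001\tDNA\nXX000002\tRNA\n";
my @rows = parse_result('/ENTRY/DDBJ/primary-accession,/ENTRY/DDBJ/moltype', $sample);
print "$_->{'/ENTRY/DDBJ/primary-accession'}: $_->{'/ENTRY/DDBJ/moltype'}\n" for @rows;
```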
Search example
See details of XMLPathAccess from Perl
To get entries whose feature key is 'rRNA', whose qualifier name is 'product' and whose qualifier value contains '16S ribosomal RNA', specify as follows.
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;

# Make the request.
my $req = HTTP::Request->new(POST => 'http://xml.nig.ac.jp/rest/Invoke');
$req->content_type('application/x-www-form-urlencoded');

# Set the parameters.
# The query and return paths must be URL-encoded.
my $query = "/ENTRY/DDBJ/feature-table/feature/f_key=='rRNA' AND ";
$query .= "(/ENTRY/DDBJ/feature-table/feature{/f_key=='rRNA' AND ";
$query .= "/f_quals/qualifier{/q_name=='product' AND /q_value='16S ribosomal RNA'}})";
$query =~ s/([^\w ])/'%'.unpack('H2', $1)/eg;
$query =~ tr/ /+/;
my $return = "/ENTRY/DDBJ/primary-accession,/ENTRY/DDBJ/definition";
$return =~ s/([^\w ])/'%'.unpack('H2', $1)/eg;
$return =~ tr/ /+/;
my $offset = 1;
my $count  = 100;
$req->content("service=ARSA&method=searchByXMLPath&queryPath=$query&returnPath=$return&offset=$offset&count=$count");

# Send the request and get the response.
my $res = $ua->request($req);
# For a large result, it is better to write the response directly to a file:
# my $res = $ua->request($req, 'file_name.txt');

# Stop with the HTTP status line if the request failed.
die $res->status_line, "\n" unless $res->is_success;

# Show the response.
print $res->content;
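Since count is limited to 1000, a larger result has to be fetched in several requests. The following is a minimal Perl sketch that only computes the offset and count for each request; it assumes that offset is the 1-based position of the first result, as suggested by the offset of 1 in the example above. The helper name page_windows is hypothetical.

```perl
use strict;
use warnings;

# Split a request for $total results into [offset, count] windows of at
# most $page_size results each. Assumes offset counts from 1.
sub page_windows {
    my ($total, $page_size) = @_;
    $page_size = 1000 if !defined $page_size || $page_size > 1000;
    die "page size must be at least 1\n" if $page_size < 1;
    my @windows;
    for (my $offset = 1; $offset <= $total; $offset += $page_size) {
        my $remaining = $total - $offset + 1;
        my $count = $remaining < $page_size ? $remaining : $page_size;
        push @windows, [ $offset, $count ];
    }
    return @windows;
}

printf "offset=%d count=%d\n", @$_ for page_windows(2500);
```

Each window can then be used as the offset and count of one searchByXMLPath request.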


