Calling SRSWWW From Outside
This chapter gives detailed information on how to invoke SRS from your own web pages. You can insert hypertext links on, e.g., your home page to the EMBL entries representing the gene or set of genes that you are currently working on.
The WGETZ program
The WGETZ program is the interface of the SRS system to the World Wide Web. All the pages that the user sees in the browser during data retrieval are in fact generated by calls to this program. Technically it is a so-called CGI script: it is executed on the server when the user clicks on a link on a page. The wgetz program is very complex: it contains the query engine, accesses all databank descriptions and contains the interpreter to read and reformat the databank data. It must also read all the information entered by users in the forms displayed in the browser.
An HTML link to WGETZ is composed of two parts, a visible part and a hidden part which contains the actual call to the wgetz program. For example, the link
<A HREF=wgetz?+-fun+Pagelibinfo+-info+SWISSPROT>SWISSPROT</A>
is displayed as: SWISSPROT. This link calls WGETZ to generate and display the information page of the SWISSPROT databank.
The actual call is:
wgetz?+-fun+Pagelibinfo+-info+SWISSPROT
This obscure syntax is needed because a space character is treated as the end of the command line and must be avoided. The '+' character is interpreted as a space, which separates the arguments. The '?' after the command 'wgetz' is important: it marks the beginning of the command options. With the '+' and '?' characters translated to spaces the command is
wgetz -fun Pagelibinfo -info SWISSPROT
and looks very much like a normal UNIX command line. This substitution rule will be used throughout the rest of the chapter.
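The substitution rule can be sketched in a few lines of Python. This is only an illustration of the rule described above. The function names are hypothetical and are not part of SRS, and arguments that themselves contain a space are not handled here.

```python
import re


def call_to_command_line(call):
    """Translate a wgetz call into the equivalent UNIX-style command line.

    Every '?' and '+' acts as a separator, so both become spaces.
    """
    # Split on '?' and '+', and drop the empty token produced by "?+".
    tokens = [token for token in re.split(r"[?+]", call) if token]
    return " ".join(tokens)


def command_line_to_call(command_line):
    """Build a wgetz call from a command line (the reverse translation)."""
    program, *arguments = command_line.split()
    return program + "?" + "+".join(arguments)


print(call_to_command_line("wgetz?+-fun+Pagelibinfo+-info+SWISSPROT"))
print(command_line_to_call("wgetz -fun Pagelibinfo -info SWISSPROT"))
```

The reverse translation writes the first option directly after the '?', which is the form used by most of the calls in this chapter.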
Displaying single entries
A simple call to retrieve the EMBL entry 'RNELAS' can be written as
wgetz?-e+[embl-id:rnelas]
displayed as: RNELAS entry in EMBL. Here the option '-e' is a flag that requests 'wgetz' to print full entries, and the last argument '[embl-id:rnelas]' is an SRS query that retrieves the single entry with id=rnelas from the databank EMBL.
In the example above a link is created to a single entry in the EMBL databank. It is assumed that this entry can be found in the EMBL release. Most SRS servers offer both the EMBL release and a cumulative list of daily updates since the last release, which can be named, e.g., EMBLNEW. These daily updates contain new entries but also updated versions of entries that are in the release. It is possible to search for an entry in both parts, EMBL and EMBLNEW, as in
wgetz?-e+[{embl%20emblnew}-acc:X012345]
displayed as: RNELAS new and old entries. The two databank names are enclosed in a list starting with '{' and ending with '}'. In that list the databank names need to be separated by a space. Unfortunately we cannot use the space itself, since it is interpreted as the end of the command line. Even worse, the '+' character cannot be used either, since it is interpreted as a separator between command-line arguments. The only way to specify the space is to encode it as the rather obscure sequence '%20', which gives the space character by its hexadecimal code.
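As a small illustration, the encoding can be left to a standard library routine instead of being typed by hand. The sketch below uses Python's urllib.parse.quote. The helper names databank_list and field_query are hypothetical and only serve this example.

```python
from urllib.parse import quote


def databank_list(names):
    """Return a '{...}' databank list with the separating spaces encoded."""
    # Join the names with a space, then percent-encode it as %20.
    return "{" + quote(" ".join(names), safe="") + "}"


def field_query(names, field, value):
    """Build an SRS query of the form [{bank1%20bank2}-field:value]."""
    return "[" + databank_list(names) + "-" + field + ":" + value + "]"


print("wgetz?-e+" + field_query(["embl", "emblnew"], "acc", "X012345"))
```

This prints the same call as the one shown above.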
In cases where an update of an entry in EMBL is present in EMBLNEW, the above query would return two entries. It would be much better to get only the updated version. This can be achieved by removing the overlap of EMBL and EMBLNEW from the query result. Since the two copies of the same entry in EMBL and EMBLNEW have the same accession number, they can be linked using an SRS index. Thus the overlap can be defined as "all entries in EMBL that have links to EMBLNEW" and it can be removed as in
wgetz?-e+[{embl%20emblnew}-acc:X012345]!EMBL<EMBLNEW
displayed as: RNELAS latest entry
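The effect of removing the overlap can be pictured with ordinary sets. The following sketch does not call SRS at all. It works on a made-up list of (databank, accession) pairs, and the second accession number is invented purely to show an entry that has no update.

```python
def latest_entries(entries):
    """Drop every EMBL entry that also has a copy in EMBLNEW.

    'entries' is a list of (databank, accession) pairs, standing in for
    the result of a query over both databanks.
    """
    updated = {accession for bank, accession in entries if bank == "EMBLNEW"}
    # The overlap is "all entries in EMBL that have links to EMBLNEW".
    return [
        (bank, accession)
        for bank, accession in entries
        if not (bank == "EMBL" and accession in updated)
    ]


# Hypothetical query result: X012345 exists in both parts, X099999 only in EMBL.
matched = [("EMBL", "X012345"), ("EMBLNEW", "X012345"), ("EMBL", "X099999")]
print(latest_entries(matched))
```

Only the EMBLNEW copy of X012345 survives, while the entry without an update is kept from the release.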
Displaying sets of entries
If the query you specified matches more than one entry, the result will be a set of entries. To retrieve the list of entries as hypertext links to the entry contents, instead of the whole entries, simply omit the "-e" option:
wgetz?[embl-all:elastase]
displayed as: all entries containing the word "elastase". The following query,
wgetz?swissprot
retrieves all entries in SWISS-PROT and prints a list of all names. Obviously this list will take a long time to download. One possibility is to restrict the size of the entry list using '-lv', the number of entries to list, and '-bv', the number of the first entry.
wgetz?swissprot+-lv+30+-bv+31
displayed as: some SWISSPROT entries. This prints only 30 entries of SWISS-PROT, starting with the 31st. To get more information displayed than the entry names you can use the predefined views, such as "SequenceSimple", which is available for all sequence databanks. The command
wgetz?swissprot+-lv+30+-bv+31+-view+SequenceSimple
displayed as: some SWISSPROT entries, gives you an HTML table with the name, accession number, description line and sequence length of each entry. You can have the same table in ASCII ('-ascii') with column ('-cs') and record separators ('-rs') of your choice using
wgetz?swissprot+-lv+30+-bv+31+-view+SequenceSimple+-ascii+-rs+||+-cs+@@
displayed as: some SWISSPROT entries (though it is unlikely that any human being would want to read the entries in this format).
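The ASCII format is meant for programs rather than people. As a rough sketch, the Python below generates the calls for successive chunks of a list and splits an ASCII table back into rows and columns. The function names are hypothetical. The sample text is invented, and the parser assumes that records are simply joined by the record separator and columns by the column separator, a layout not stated in this chapter.

```python
def paged_calls(databank, page_size, pages, view=None):
    """Return one wgetz call per chunk of 'page_size' entries."""
    calls = []
    for page in range(pages):
        first = page * page_size + 1  # '-bv' counts entries from 1
        call = f"wgetz?{databank}+-lv+{page_size}+-bv+{first}"
        if view:
            call += f"+-view+{view}"
        calls.append(call)
    return calls


def parse_ascii_table(text, record_separator="||", column_separator="@@"):
    """Split ASCII output into a list of rows, each a list of columns."""
    records = [r for r in text.split(record_separator) if r.strip()]
    return [[c.strip() for c in r.split(column_separator)] for r in records]


for call in paged_calls("swissprot", 30, 2, view="SequenceSimple"):
    print(call)

# Made-up sample: name, accession number, description, sequence length.
sample = "NAME_A@@P00001@@First description@@120||NAME_B@@P00002@@Second description@@245||"
for row in parse_ascii_table(sample):
    print(row)
```

The second generated call is the one shown above for the 31st to 60th entries.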
Links to Databanks
You might want a link that leads directly to the search form of a databank. This saves you the trouble of going to the top page, selecting the databank(s) you are interested in and clicking the continue button. You can do that with a command like
wgetz?-fun+pagequeryform+-l+swissprot%20swissnew
The command now includes the '-l' option, specifying that a link has to be performed between the two following databanks. The '-fun' parameter redirects the call to a specific function within WGETZ. This command also creates a new session for you.
As shown before, another possibility is to link to the databank information page
wgetz?-fun+pagelibinfo+-info+genbank
The page comes with a button at the top for searching the databank; clicking it gives you a new session.
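If you maintain many such links, they can be generated instead of written by hand. The sketch below builds both kinds of call and wraps them in an HTML anchor. The function names and link labels are hypothetical, and the relative address assumes, as in the examples of this chapter, that the page is served from the same location as wgetz.

```python
from html import escape
from urllib.parse import quote


def query_form_call(databanks):
    """Call that opens the query form for the given databanks."""
    return "wgetz?-fun+pagequeryform+-l+" + quote(" ".join(databanks), safe="")


def info_page_call(databank):
    """Call that opens the information page of one databank."""
    return "wgetz?-fun+pagelibinfo+-info+" + databank


def html_link(call, label):
    """Wrap a wgetz call in an HTML anchor with a visible label."""
    return '<A HREF="' + escape(call, quote=True) + '">' + escape(label) + "</A>"


print(html_link(query_form_call(["swissprot", "swissnew"]), "Search SWISS-PROT"))
print(html_link(info_page_call("genbank"), "About GenBank"))
```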
User Contexts
Every time a user clicks on the "START" button on the SRS front page, SRS creates a new session for the user. The session consists of a temporary directory where the system can store user-specific information, such as the list of all queries performed, the results of applications launched, and the views which have been defined. This is useful when the user wants to carry out a very complex analysis with successive query refinements, and is probably not necessary for a simple link on a home page.
The system also assigns to every session a unique identifier, which is inserted in all WGETZ calls in all pages generated during the session. In this way the system can keep the association between a user and the session. Note that if the original query did NOT have an id, then the generated page will also not contain session ids, unless the command was a session-generating command.
For instance, suppose that at the start the user session received the id QWERty12345ZX. The wgetz query
wgetz?-id+QWERty12345ZX+[embl-id:rnelas]
results in the list of found entries. Each of these is displayed as a hypertext link that leads to the actual entry. Such a link can look like
wgetz?-id+QWERty12345ZX+-e+[embl-id:rnelas]
However, a wgetz command specified from outside a session must never include a user ID. A user context has only a temporary existence, since the user's directory is deleted after a certain amount of time and the user ID then becomes invalid. A hypertext link to 'wgetz' inserted in an HTML page can have a much longer lifetime than a user ID.
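A link copied from a page inside a session therefore has to be cleaned before it is pasted into your own page. A minimal sketch of such a cleanup follows. The function name is hypothetical, and it assumes that the ID is always passed as '-id' followed by one argument, as in the examples above.

```python
def strip_session_id(call):
    """Remove the '-id <value>' pair from a wgetz call."""
    program, _, options = call.partition("?")
    kept = []
    skip_next = False
    for argument in options.split("+"):
        if skip_next:
            # This is the ID value that follows '-id'; drop it.
            skip_next = False
            continue
        if argument == "-id":
            skip_next = True
            continue
        kept.append(argument)
    return program + "?" + "+".join(kept)


print(strip_session_id("wgetz?-id+QWERty12345ZX+-e+[embl-id:rnelas]"))
```

The result is the plain call from the section on displaying single entries, which stays valid after the session has expired.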
Some calls can create new user contexts: it is possible to create a new SRS session through the 'back door'!
It is possible to create a link to an SRS query that also creates a new user context. This allows you to continue with linking entries to other databanks or to launch applications.
wgetz?swissprot+-newId+-view+SequenceSimple
The '-newId' option forces the generation of a session. Note that in that case a chunk size of 30 is predefined.
Retrieving entries in formats different from HTML
By default all calls to wgetz generate output with the MIME type 'text/html', which means that the output will be displayed by Netscape or Internet Explorer. In some cases it is desirable to have the page passed on to another program. An example is the display of PDB entries with 3D protein structures by the program RASMOL, which is associated with the MIME type 'chemical/x-pdb'
wgetz?-e+-mime+chemical/x-pdb+[pdb-id:3apr]
Another example is the MIME type 'binary', which prompts the Web browser to present you with a file menu to save the page on your disk. These are just two examples; any other MIME type may be specified.
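As a final illustration, the '-mime' option can be added by a small helper that leaves it out when the default 'text/html' is wanted. The function name entry_call is hypothetical. The option order follows the example above.

```python
def entry_call(query, mime_type=None):
    """Build a full-entry call, optionally with an explicit MIME type."""
    arguments = ["-e"]
    if mime_type is not None:
        arguments += ["-mime", mime_type]
    arguments.append(query)
    return "wgetz?" + "+".join(arguments)


print(entry_call("[pdb-id:3apr]", mime_type="chemical/x-pdb"))
print(entry_call("[embl-id:rnelas]"))
```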
