
Using C for CGI Programming (4)

2010-09-18 12:19  shuisheng
You can speed up complex Web tasks while retaining the simplicity of CGI. With many useful libraries available, the jump from a scripting language to C isn't as big as you might think.
Debugging CGI Programs

One distinct disadvantage of debugging C is that errors tend to cause a segmentation fault with no diagnostic message about the source of the error. Debuggers are fine for most other types of programs, but CGI programs present a special challenge because of the way they acquire input.

To help with this challenge, the cgic library includes a CGI program called capture. This program saves to a file any CGI input sent to it. You need to set this filename in capture's source code. When your CGI program needs debugging, add a call to cgiReadEnvironment(char*) to the top of your cgiMain() function. Be sure to set the filename parameter to match the filename set in capture. Then, send the problematic data to capture, making it either the action of the form or the script in your request. You now can use GDB or your favorite debugger to see what sort of trouble your code has generated.
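To make the capture-and-replay idea concrete, here is a minimal, self-contained sketch of the same pattern. It is not cgic's implementation and does not use the cgic API: the function names save_cgi_env() and load_cgi_env(), the file format, and the short list of variables are all assumptions made for illustration. It also ignores the request body, which a real capture tool would need to save.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list of CGI variables to record; a real capture tool
 * would save more of the environment and the request body as well. */
static const char *const cgi_vars[] = {
    "REQUEST_METHOD", "QUERY_STRING", "CONTENT_TYPE", "CONTENT_LENGTH"
};
#define CGI_VAR_COUNT (sizeof cgi_vars / sizeof cgi_vars[0])

/* Write each variable that is set as a NAME=VALUE line.
 * Returns 0 on success, -1 on failure. */
int save_cgi_env(const char *path)
{
    FILE *fp = fopen(path, "w");
    size_t i;

    if (fp == NULL)
        return -1;
    for (i = 0; i < CGI_VAR_COUNT; i++) {
        const char *value = getenv(cgi_vars[i]);
        if (value != NULL)
            fprintf(fp, "%s=%s\n", cgi_vars[i], value);
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Read NAME=VALUE lines back into the environment so the program can be
 * run under a debugger with the recorded input.
 * Returns 0 on success, -1 on failure. */
int load_cgi_env(const char *path)
{
    char line[4096];
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        char *eq;

        line[strcspn(line, "\n")] = '\0';  /* strip the trailing newline */
        eq = strchr(line, '=');            /* first '=' ends the name */
        if (eq == NULL)
            continue;                      /* skip malformed lines */
        *eq = '\0';
        if (setenv(line, eq + 1, 1) != 0) {
            fclose(fp);
            return -1;
        }
    }
    fclose(fp);
    return 0;
}
```

In this sketch, the capturing program would call save_cgi_env() when the Web server runs it, and the program being debugged would call load_cgi_env() with the same path before reading its input. Values longer than the line buffer or containing newlines are not handled.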

You can take some steps to simplify later debugging and development. Although these apply to all programming, they pay off particularly well in CGI programming. Remember that a function should do one thing and one thing only, and test early and test often.

It's a good idea to test each function you write as soon as possible to make sure it performs as expected. And, it's not a bad idea to see how it responds to erroneous data as well. It's highly likely that at some point the function will be given bad data. Catching this behavior ahead of time can save unpleasant calls during your off hours.
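As a small illustration of testing a function against bad data, consider a helper that turns a form parameter into a numeric ID. The function parse_event_id() and its return convention are hypothetical, invented for this sketch; they are not part of the event scheduler discussed in this article.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a positive event ID from a CGI parameter.
 * Returns 0 and stores the value in *out on success; returns -1 for
 * missing, malformed, non-positive, or out-of-range input. */
int parse_event_id(const char *text, long *out)
{
    char *end;
    long value;

    if (text == NULL || out == NULL || *text == '\0')
        return -1;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno == ERANGE || *end != '\0' || value <= 0)
        return -1;

    *out = value;
    return 0;
}

/* Exercise the helper with good input and with the kinds of bad input
 * a Web form can deliver. */
void test_parse_event_id(void)
{
    long id = 0;

    assert(parse_event_id("42", &id) == 0 && id == 42);
    assert(parse_event_id(NULL, &id) == -1);     /* parameter missing */
    assert(parse_event_id("", &id) == -1);       /* empty field */
    assert(parse_event_id("abc", &id) == -1);    /* not a number */
    assert(parse_event_id("12abc", &id) == -1);  /* trailing junk */
    assert(parse_event_id("-7", &id) == -1);     /* non-positive */
    assert(parse_event_id("99999999999999999999999", &id) == -1); /* overflow */
}
```

Calling test_parse_event_id() from a small test program each time the helper changes keeps the bad-input cases from regressing.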

Deployment

In most situations, your development machine and your deployment machine are not going to be the same. As much as possible, try to make your development system match the production system. For instance, my software tends to be developed on Linux or OpenBSD and nearly always is deployed on FreeBSD.

When you're preparing to build or install on the deployment machine, it is particularly important to be aware of differences in library versions. You can see which dynamic libraries your code uses with ldd. It's a good idea to check this information, because you often may be surprised by what additional dependencies your libraries bring.

If the library versions are close, usually reflected in the same major number, there probably isn't a big problem. However, it's not uncommon for deployment and development machines to have incompatible versions if you're deploying to an externally hosted Web site.

The solution I use is to compile my own local version of the library. Remove the shared version of that local build, and link against the local version rather than the system version, so the library code is built into your program. It bulks up your binary, but it removes your dependency on libraries you don't control.

Once you have built your binary on the deployment system, run ldd again to make sure that all of the dynamic libraries have been found. Especially when you are linking against a local copy of a library, it's easy to forget to remove the dynamic version, which won't be found at runtime (or by ldd). Keep tweaking the build process; build and recheck until there are no unfound libraries.

Speed: CGI vs. PHP

Conventional wisdom holds that a program using the CGI interface is slower than a program using a language provided by a server module, such as mod_php or mod_perl. Because I started writing Web applications with PHP, I use it here as my basis for comparison with a CGI program written in C. I make no assertions about the relative speed of C vs. Perl.

The comparison that I used was the external interface to the database (events.cgi and events.php), because both used the same method for providing interface separation. The internal interface was not tested, as calls to the external interface should dwarf calls to the internal.

Apache Benchmark was used to hit each version with 10,000 queries, as fast as the server could take them. The C version had a mean transaction time of 581ms, and the PHP version had a mean transaction time of 601ms. With times so close, I suspected that if the tests were repeated, some variation in time would be seen. This proved correct, although the C version was slightly faster than the PHP version more often than not.

My normal development uses a more complex interface separation library, libtemplate (see Resources). I have PHP and C versions of the library. When I compared versions of the event scheduler using libtemplate, I found that C had a much more favorable response time. The mean transaction time for the C version was 625ms, not much more than it was for the simpler version. The PHP version had a mean transaction time of 1,957ms. It also was notable that the load number while the PHP version was running generally was twice what was seen while the C version was running. No users were on the system, and no other significant applications were running when this test was done.
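To put the four mean times side by side, the short sketch below computes the PHP-to-C ratio for each interface and the extra time each language spent once libtemplate was added. The function names slowdown() and print_comparison() are made up for this illustration; the numbers are the mean transaction times quoted above.

```c
#include <stdio.h>

/* Ratio of one mean transaction time to another. */
double slowdown(double slower_ms, double faster_ms)
{
    return slower_ms / faster_ms;
}

/* Print the comparison using the mean times reported in the text. */
void print_comparison(void)
{
    const double c_simple = 581.0, php_simple = 601.0;  /* simple interface */
    const double c_tmpl = 625.0, php_tmpl = 1957.0;     /* libtemplate */

    printf("simple interface: PHP/C = %.2f\n",
           slowdown(php_simple, c_simple));
    printf("libtemplate:      PHP/C = %.2f\n",
           slowdown(php_tmpl, c_tmpl));
    printf("libtemplate cost: C +%.0f ms, PHP +%.0f ms\n",
           c_tmpl - c_simple, php_tmpl - php_simple);
}
```

With these figures, PHP took about 1.03 times as long as C with the simple interface and about 3.13 times as long with libtemplate. Moving to libtemplate added 44ms to the C version and 1,356ms to the PHP version.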

The fairly close times of the two C versions tell us that most of the execution time is spent loading the program. Once the program is loaded, the program executes quite quickly. PHP, on the other hand, executes relatively slowly. Of course, PHP doesn't escape the problem of having to be loaded into memory. It also must be compiled, a step that the C program has been through already.