Chapter 5 Using Perl for CGI Programming

• This chapter introduces the Common
Gateway Interface (CGI) and discusses
how Perl can be used as a CGI
programming language. It begins with an
overview of CGI, how CGI programs are
linked to Web documents, and how results
are returned to clients from CGI programs.
• You’ll also learn in this chapter about the CGI.pm module, which provides more
efficient ways of using Perl for CGI programming.

5.1 The Common Gateway Interface
• HTML is a markup language and, as such,
cannot by itself describe computations,
allow interaction with the user, or provide
access to a database.
• The interface that was developed between a
browser and software on the server is called
the Common Gateway Interface (CGI).
• When a server receives a request for a
CGI program, it does not return the file; it
executes it.

5.1 The Common Gateway Interface
[Figure: the server and the CGI application]

5.1 The Common Gateway Interface
• When a new document is generated, it
consists of two parts, an HTTP header and
a body. The header must be complete if
the response is going directly to the client.
• The contents of the form are encoded and
transmitted to the server. The server must
use some program to decode the
transmitted form contents.

5.2 CGI Linkage
• In this section, we describe how the
connection between an HTML document
displayed by a browser and a CGI
program on the server is established.

5.2 CGI Linkage
• If a CGI program has been compiled and
is in machine code, the server can invoke
it directly. However, the compiled versions
of Perl programs are not saved and are
not in machine code anyway, so the Perl
system must be invoked on every Perl CGI
program.

5.2 CGI Linkage
• #!/usr/local/bin/perl -w
• The #! specifies that the program whose
location follows must be executed on the rest of
the file.
• The -w flag on this line specifies that the Perl
compiler should produce warning messages
when it finds things that could potentially be errors.
We recommend that it always be included,
whether a program is run using the perl
command or by including the line described
above.

5.2 CGI Linkage
• An HTML document specifies a call to a
CGI program using an anchor tag (<a>),
which must include the hypertext reference
attribute (href).
• Example page 134 Sample reply.html

5.2 CGI Linkage
• In Perl, this means that the print function is the
way to generate a response to be sent to the
client.
• The first line of the header specifies the form of
the response, as a MIME (Multipurpose Internet
Mail Extensions) content type.
• The line following the one-line header must
always be blank. The blank line indicates the
end of the HTTP header.
print "Content-type: text/html\n\n";
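As a minimal sketch of the header-then-body layout described above, the following Perl program builds a complete response as a string and prints it. The helper name build_response and the page text are illustrative assumptions, not part of the chapter's examples.

```perl
#!/usr/local/bin/perl -w
use strict;

# Hypothetical helper: build a complete CGI response.
# The header is a single Content-type line; the blank line
# that follows it marks the end of the HTTP header.
sub build_response {
    my ($title, $message) = @_;
    my $header = "Content-type: text/html\n\n";
    my $body   = "<html><head><title>$title</title></head>\n"
               . "<body><p>$message</p></body></html>\n";
    return $header . $body;
}

# A CGI program sends its response by printing to standard output.
print build_response("Reply", "Hello from a CGI program.");
```

Keeping the header and body in one routine makes it harder to forget the blank line, which is the most common reason a server rejects a CGI program's output.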

5.3 HTML for Forms
• The most common way for a user to
communicate information from a Web
browser to the server is through a form.
• Forms, which are modeled on the paper
forms that people continually are required
to fill out, are described in HTML, which
provides tags for the commonly used
objects on a screen form.
• These objects are called widgets.

5.3 HTML for Forms
• There are widgets for:
– Single-line text collection
– Multiple-line text collection
– Checkbox
– Radio button
– Menu
– Submit
– Reset

5.3 HTML for Forms
• Most widgets are used to gather information
from the user, in the form of either text or button
selections.
• Each widget has a value, either by default or
from user input.
• Together, the values of all of the widgets in a
form are called the form data.
• Each form requires a Submit button. When the
user presses the Submit button, the form data is
sent to the server, where a CGI program will
do the processing. The Reset button resets all the
values of the widgets of the form.

5.3.1 The <form> TAG
• All of the components of a form appear as
the content of a <form> tag. <form> can
have several different attributes:
– action: specifies the URL of the application
that is to be called when the user presses the
Submit button.
– method: specifies one of the two techniques,
GET or POST, used to pass the form data to
the server.

5.3.1 The <form> TAG
• GET is the default, so if no method
attribute is given in the <form> tag, GET
will be used. The alternative technique is
POST.
• In both techniques, the form data is coded
into a text string when the user presses
the Submit button. This text string is called
a query string.

5.3.1 The <form> TAG
• When the GET method is used, the browser attaches the
query string to the URL of the CGI program, so the data
is transmitted to the server with the URL. The browser
inserts a question mark at the end of the actual URL just
before the first character of the query string so that the
server can easily find the beginning of the query string.
The server removes the query string from the URL and
places it in the environment variable QUERY_STRING,
where it can be accessed by the CGI program.
• The GET method can also be used to pass parameters
to the server when forms are not involved (this cannot
be done with POST).
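The way a server separates the query string from a GET request URL can be sketched as below. This only illustrates the split at the first question mark; the URL and the helper name split_url are hypothetical, and a real server does this work before the CGI program starts.

```perl
use strict;
use warnings;

# Hypothetical helper: split a requested URL at the first question mark.
# Returns the program's URL and the query string (empty if there is none).
sub split_url {
    my ($url) = @_;
    my ($program, $query) = split /\?/, $url, 2;
    $query = "" unless defined $query;
    return ($program, $query);
}

my ($program, $query) =
    split_url("http://www.example.com/cgi-bin/order.pl?payment=visa&size=large");

# A server would place the query part in the QUERY_STRING environment
# variable before running the program; here it is set by hand.
$ENV{QUERY_STRING} = $query;
print "program: $program\n";
print "QUERY_STRING: $ENV{QUERY_STRING}\n";
```

Because the query string is simply text after the question mark, the same URL can be written by hand in an anchor tag, which is how GET passes parameters when no form is involved.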

5.3.1 The <form> TAG
• The main disadvantage of the GET
method is that some servers place a limit
on the length of the URL string and
truncate any characters past the limit.
• Another potential problem with GET is that
the query string is vulnerable to illegal or
inappropriate access: it appears as part of
the URL, and URLs are easy for network
sniffers to find.

5.3.1 The <form> TAG
• When the POST method is used, the
query string is passed through standard
input of the CGI program, so the CGI
program can simply read the string. The
length of the query string is passed
through the environment variable
CONTENT_LENGTH. There is no length
limitation for the query string with the
POST method, so it is obviously the
choice when there are more than a few
widgets in the form.

5.3.2 WIDGETS
• This section briefly describes the widgets
that can be placed in a form and how they
are specified in HTML.
• Many of the popular widgets are specified
with the <input> tag. These are for text,
checkboxes, radio buttons, and the special
buttons, Submit and Reset.

5.3.2 WIDGETS
• The one attribute of <input> that is required for
all of the widgets discussed in this section is
type, which specifies the particular kind of widget.
• All widgets except Reset and Submit also
require a name attribute, which becomes the
name of the value of the widget within the query
string.
• The widgets for check boxes and radio buttons
require the value attribute, which initializes the
value of the widget.

5.3.2 WIDGETS
• Text widget
– The default size of the box created by a text
widget is 20 characters.
– The size attribute can be used to give a
different size.
– The maxlength attribute specifies the
maximum number of characters that the
browser will accept in the box.
• Example page 137
Sample widget.html

5.3.2 WIDGETS
• Checkboxes
– Are used to collect multiple-choice input from the user.
– Every checkbox button requires a value attribute in its
<input> tag.
– The attribute checked, which is assigned the value
checked, indicates that the checkbox button is initially
on.
– In many cases, checkboxes appear in lists, with every
one having the same name. The text that follows the
<input> tag is displayed next to the checkbox button,
providing a label.
Example page 139
Sample check.html

5.3.2 WIDGETS
• Radio buttons
– Are closely related to checkbox buttons. The
difference between a group of radio buttons and a
group of checkboxes is that only one radio button can
be on or pressed at any time.
– All radio buttons in a group must have the name
attribute set in the <input> tag, and all radio buttons
in a group have the same name.
– The checked attribute, set to the value checked in the
<input> tag of the button’s definition, specifies that
the button is initially on.
Example page 141
Sample radio.html

5.3.2 WIDGETS
• Checkboxes and radio buttons are effective
methods for collecting multiple-choice data
from a user. However, if the number of
possible choices is large, the displayed
form becomes too complex and long. In
these cases, a menu should be used.

5.3.2 WIDGETS
• Menu
• A menu is described in HTML using the <select>
tag.
• There are two kinds of menus:
– Only one menu item can be selected at a time
– Multiple menu items can be selected
• The default is the first kind, which is related to radio
buttons. The other kind can be specified by
adding the multiple attribute, which gets the
value multiple, to the <select> tag.
Sample menu.html

5.3.2 WIDGETS
• Menu
– When multiple menu items have been selected, the
value for the menu in the query string includes all
selected menu items.
– The size attribute can be included in the <select> tag.
It specifies the number of menu items that are to
be displayed for the user.
– Each of the items in a menu is specified with an
<option> tag. The content of an <option> tag is the
value of the menu item, which is just text.
– The <option> tag can include the selected attribute,
which specifies that the item is preselected.

5.3.2 WIDGETS
• Textarea
• The <textarea> tag is used to create a widget for
multiple-line text collection; the user types text
into the area it creates.
• The default size of the visible part of the
text in a text area is often quite small, so
the rows and cols attributes should usually
be included and set to reasonable sizes.
Sample texarea.html

5.3.2 WIDGETS
• Reset button
– Clears all of the widgets in the form to their
initial states.
• Submit button
– The form data is coded into a query string,
which is sent to the server.
– The server is requested to execute the CGI
program specified in the action attribute of the
<form> tag.
Sample rest.html
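To tie the widgets together, here is a hedged sketch of a Perl CGI program that emits a small form containing one widget of each kind discussed above. The action URL, the widget names and values, and the helper name order_form are all hypothetical; they are not the chapter's sample files.

```perl
use strict;
use warnings;

# Hypothetical helper: return the HTML for a small order form.
# The action URL and all widget names and values are illustrative only.
sub order_form {
    my ($action) = @_;
    return <<"END_FORM";
<form action="$action" method="post">
  <p>Name: <input type="text" name="customer" size="30" maxlength="50" /></p>
  <p>
    <input type="checkbox" name="extras" value="butter" checked="checked" /> Butter
    <input type="checkbox" name="extras" value="salt" /> Salt
  </p>
  <p>
    <input type="radio" name="payment" value="visa" checked="checked" /> Visa
    <input type="radio" name="payment" value="check" /> Check
  </p>
  <p>
    <select name="size">
      <option selected="selected">small</option>
      <option>large</option>
    </select>
  </p>
  <p><textarea name="comments" rows="3" cols="40"></textarea></p>
  <p>
    <input type="submit" value="Submit order" />
    <input type="reset" value="Reset" />
  </p>
</form>
END_FORM
}

print "Content-type: text/html\n\n";
print order_form("/cgi-bin/order.pl");
```

Note that both checkboxes share one name and both radio buttons share another, and that only the Submit and Reset buttons lack a name attribute.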

• Page 145-147
• You now have the basis for a discussion of
the coding of form data (into a query
string) and a CGI-Perl program that
decodes that data, processes it, and
sends a response back to the user
concerning the order.

5.4 Query String Format
• This section describes the encoding
format of a query string. When the Submit
button in a form is clicked, the form’s data
is coded and sent to the server.
• For each widget in the form that has a
value, the widget’s name and that value
are coded as an assignment statement
and included in the query string.
• Each such pair is coded in the query string in the form name=value.

5.4 Query String Format
• If the form has more than one widget, the
assignments that code their values in the query
string are separated by ampersands (&).
• If there are special characters in the value of a
widget, they are coded as a percent sign (%)
followed by a two-character hexadecimal
number that is the ASCII code for the character.
For example:
payment=visa&saying=Eat%20your%20fruit%21
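The encoding rules above can be sketched in Perl as follows. This is an illustration only, since in practice the browser does the encoding: the helper names encode_component and build_query_string are hypothetical, the input is assumed to be ASCII, and a space is coded as %20 to match the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode one name or value.
# Letters, digits, and a few safe punctuation characters are kept;
# every other character becomes % followed by its two-digit hex code.
sub encode_component {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf("%%%02X", ord($1))/eg;
    return $text;
}

# Build a query string from a list of [name, value] pairs,
# joining the name=value assignments with ampersands.
sub build_query_string {
    my (@pairs) = @_;
    return join "&",
        map { encode_component($_->[0]) . "=" . encode_component($_->[1]) } @pairs;
}

print build_query_string(
    [ payment => "visa" ],
    [ saying  => "Eat your fruit!" ],
), "\n";
# prints: payment=visa&saying=Eat%20your%20fruit%21
```

Encoding the ampersand and equals sign inside values matters, because otherwise a value containing them would be mistaken for the start of a new assignment.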

5.4 Query String Format
• A CGI program that processes a query string
must first get the string into a scalar variable.
CGI programs should be able to handle the GET
and the POST methods, so the Web document
designer can change his or her mind about
which one to use at any time in the lifetime of
the CGI program, without needing to change that
program.
• The environment variable REQUEST_METHOD
has either GET or POST as its value. So, the
CGI program checks this variable and decides
how to get the query string.

5.5 Decoding the Query String
• If GET was used, the server removes the
query string from the URL and places it in
the environment variable QUERY_STRING.
• If POST was used, the read function is
used to read the query string from STDIN,
using the environment variable
CONTENT_LENGTH as the number of
bytes to be read.

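5.5 Decoding the Query String
$request_method = $ENV{"REQUEST_METHOD"};
if ($request_method eq "GET") {
    $query_string = $ENV{"QUERY_STRING"};
} elsif ($request_method eq "POST") {
    read(STDIN, $query_string, $ENV{"CONTENT_LENGTH"});
} else {
    print "Error - the request method is illegal\n";
}
1st step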
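The first step can also be packaged as a routine that is easy to try outside a Web server. This is a sketch under stated assumptions: the name get_query_string is hypothetical, and the environment and input filehandle are passed in as parameters (rather than using %ENV and STDIN directly) only so that a request can be simulated.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch the raw query string.
# $env is a hash reference standing in for %ENV and $in is the input
# filehandle (STDIN in a real CGI program).
sub get_query_string {
    my ($env, $in) = @_;
    my $method = $env->{REQUEST_METHOD} || "";
    if ($method eq "GET") {
        # The server has already moved the query string out of the URL.
        return defined $env->{QUERY_STRING} ? $env->{QUERY_STRING} : "";
    }
    elsif ($method eq "POST") {
        # Read exactly CONTENT_LENGTH bytes from the input stream.
        my $query_string = "";
        read($in, $query_string, $env->{CONTENT_LENGTH} || 0);
        return $query_string;
    }
    die "Error - the request method is illegal\n";
}

# Simulate a POST request with an in-memory filehandle.
my $posted = "payment=visa&size=large";
open(my $fake_stdin, "<", \$posted) or die "cannot open in-memory file: $!";
my %fake_env = (REQUEST_METHOD => "POST", CONTENT_LENGTH => length $posted);
print get_query_string(\%fake_env, $fake_stdin), "\n";
```

In a real CGI program the call would be get_query_string(\%ENV, \*STDIN), so the same code serves both GET and POST forms.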

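5.5 Decoding the Query String
@name_value_pairs = split(/&/, $query_string);
foreach $name_value (@name_value_pairs) {
    ($name, $value) = split(/=/, $name_value);
    $value =~ tr/+/ /;    # plus signs stand for spaces
    $value =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack("C", hex($1))/eg;    # decode %XX codes
}
2nd step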
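The second step can be sketched as a self-contained routine. The name parse_query_string is hypothetical, and returning a list of pairs instead of a hash is a design choice made here, not something the chapter prescribes.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a query string into a list of
# [name, value] pairs, keeping the order in which they were sent.
sub parse_query_string {
    my ($query_string) = @_;
    my @pairs;
    foreach my $name_value (split /&/, $query_string) {
        next if $name_value eq "";          # skip empty segments such as "a=1&&b=2"
        my ($name, $value) = split /=/, $name_value, 2;
        $value = "" unless defined $value;  # a name with no "=value" part
        foreach my $text ($name, $value) {
            $text =~ tr/+/ /;               # plus signs stand for spaces
            $text =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack("C", hex($1))/eg;  # decode %XX codes
        }
        push @pairs, [$name, $value];
    }
    return @pairs;
}

foreach my $pair (parse_query_string("payment=visa&saying=Eat%20your+fruit%21")) {
    print "$pair->[0]: $pair->[1]\n";
}
```

A list of pairs keeps every value when several checkboxes share one name; storing the pairs in a hash keyed by name would keep only the last one.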

5.6 An Example of Form
• You now can consider the CGI program
that will not just read and display the
names and values in the query string, but
also process the data from the popcorn
• page 153-157

5.7 The CGI.pm Module
• Much of what a CGI program must do is
routine; that is, it is nearly the same for
all CGI programs. Therefore, it is natural to
have standard routines to do these things.
• CGI.pm, which was developed by Lincoln
Stein, is a Perl module of functions for
these common tasks.

5.7 The CGI.pm Module
• A Perl program specifies that it needs
access to a particular module with the use
declaration. In the case of CGI.pm, only a
part of the module is usually needed. This
can be requested in the use declaration
with a list of module part names, each
preceded by a colon (:). The part you
need here, which is the most often used
part, is named standard.
use CGI qw(:standard);

5.7 The CGI.pm Module
• Many of the functions in CGI.pm produce
HTML tags. In these cases, the functions
have the names of their associated HTML
tags, except that they usually use only
lowercase letters. These functions are
called shortcuts.
• p;
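The shortcut functions described above can be sketched as follows. This assumes CGI.pm is installed; it is no longer bundled with recent Perl releases, so it may need to be installed separately. The page title and text are illustrative only.

```perl
use strict;
use warnings;
use CGI qw(:standard);   # assumes CGI.pm is available on this system

# header() produces the Content-type line and the blank line after it.
print header();

# start_html, h1, p, and end_html are shortcuts named after HTML tags.
print start_html("Order received"),
      h1("Thank you"),
      p("Your order has been recorded."),
      end_html(), "\n";
```

Each shortcut returns a string rather than printing it, so the results can be combined in a single print statement or stored for later use.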