DESCRIPTION

Documents the HTTP server start options and some administrative functions, and specifies the Erlang Web server callback API.

COMMON DATA TYPES

Type definitions that are used more than once in this module:

boolean() = true | false

string() = list of ASCII characters

path() = string() - representing a file or directory path.

ip_address() = {N1,N2,N3,N4} % IPv4 | {K1,K2,K3,K4,K5,K6,K7,K8} % IPv6

hostname() = string() - representing a host, for example "foo.bar.com"

property() = atom()

ERLANG HTTP SERVER SERVICE START/STOP

A web server can be configured to start when the inets application starts, or it can be started dynamically at runtime by calling the Inets application API inets:start(httpd, ServiceConfig) or inets:start(httpd, ServiceConfig, How); see inets(3erl). Below follows a description of the available configuration options, also called properties.

File properties

When the web server is started at application start time, the properties should be fetched from a configuration file. This file can consist of a regular Erlang property list, that is, [{Option, Value}] where Option = property() and Value = term(), followed by a full stop, or, for backwards compatibility, it can be an Apache-like configuration file. If the web server is started dynamically at runtime, you may still specify a file, but you could also just specify the complete property list.

{proplist_file, path()}: If this property is defined inets will expect to find all other properties defined in this file. Note that the file must include all properties listed under mandatory properties.
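As an illustration, a property list file might look like the sketch below. The file name, paths, and values are made up for this example; only the property names come from this page.

```erlang
%% Hypothetical contents of /usr/local/www/conf/httpd.config.
%% The file holds a single Erlang term (a property list)
%% terminated by a full stop.
[{port, 8080},
 {server_name, "www.example.com"},
 {server_root, "/usr/local/www"},
 {document_root, "/usr/local/www/htdocs"}].
```

The server would then be pointed at that file with {proplist_file, "/usr/local/www/conf/httpd.config"}.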

{file, path()}: If this property is defined inets will expect to find all other properties defined in this file, that uses Apache like syntax. Note that the file must include all properties listed under mandatory properties. The Apache like syntax is the property, written as one word where each new word begins with a capital, followed by a white-space followed by the value followed by a new line. Ex:

{server_root, "/usr/local/www"} -> ServerRoot /usr/local/www

There are a few exceptions, which are documented for each property that behaves differently. In addition, the special cases {directory, {path(), PropertyList}} and {security_directory, {Dir, PropertyList}} are represented as:

<Directory Dir>
 <Properties handled as described above>
</Directory>


Note:

The properties proplist_file and file are mutually exclusive.

Mandatory properties

{port, integer()} : The port that the HTTP server shall listen on. If zero is specified as port, an arbitrary available port will be picked and you can use the httpd:info/2 function to find out which port was picked.

{server_name, string()} : The name of your server, normally a fully qualified domain name.

{server_root, path()} : Defines the server's home directory where log files etc can be stored. Relative paths specified in other properties refer to this directory.

{document_root, path()}: Defines the top directory for the documents that are available on the HTTP server.
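A minimal sketch of starting a server dynamically with only the mandatory properties is shown here. The module name, the use of "/tmp" for both directories, and "localhost" as the server name are assumptions made for illustration and are not recommended settings.

```erlang
-module(httpd_start_sketch).
-export([start/0]).

%% Starts inets (if it is not already running) and one httpd instance
%% configured with only the mandatory properties. The directories and
%% server name are placeholders chosen for this sketch.
start() ->
    case inets:start() of
        ok -> ok;
        {error, {already_started, inets}} -> ok
    end,
    Config = [{port, 0},                  % 0 = let the system pick a free port
              {server_name, "localhost"},
              {server_root, "/tmp"},
              {document_root, "/tmp"}],
    {ok, Pid} = inets:start(httpd, Config),
    %% Ask the running server which port it was actually given.
    [{port, Port}] = httpd:info(Pid, [port]),
    {ok, Pid, Port}.
```

The returned Pid can later be passed to inets:stop(httpd, Pid) to shut the server down.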

Communication properties

{bind_address, ip_address() | hostname() | any} : Defaults to any. Note that any is denoted * in the Apache-like configuration file.

{socket_type, ip_comm | {essl, Config::proplist()}}: For ssl configuration options see ssl:listen/2

Defaults to ip_comm.

{ipfamily, inet | inet6 | inet6fb4}: Defaults to inet6fb4.

Note that this option is only used when the option socket_type has the value ip_comm.

{minimum_bytes_per_second, integer()}: If given, sets a minimum bytes per second value for connections.

If the value is not reached, the socket will close for that connection.

The option is good for reducing the risk of "slow dos" attacks.
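The communication properties could be combined as in the following fragment of a ServiceConfig list. All values are illustrative; in particular the threshold of 100 bytes per second is an arbitrary choice, not a recommendation.

```erlang
%% Fragment of a ServiceConfig property list (illustrative values).
[{bind_address, {127,0,0,1}},        % listen on the loopback address only
 {socket_type, ip_comm},             % plain TCP, which is the default
 {ipfamily, inet},                   % IPv4 only; applies to ip_comm sockets
 {minimum_bytes_per_second, 100}]    % close connections slower than this

%% For TLS, socket_type would instead carry ssl options, for example:
%% {socket_type, {essl, [{certfile, "/path/to/cert.pem"},
%%                       {keyfile,  "/path/to/key.pem"}]}}
```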

Erlang Web server API modules

{modules, [atom()]} : Defines which modules the HTTP server will use to handle requests. Defaults to: [mod_alias, mod_auth, mod_esi, mod_actions, mod_cgi, mod_dir, mod_get, mod_head, mod_log, mod_disk_log] Note that some mod-modules are dependent on others, so the order can not be entirely arbitrary. See the Inets Web server Modules in the Users guide for more information.
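As a sketch, a server that only delivers static files might use a subset of the default list, kept in the same relative order as the default. Whether this subset is sufficient for a particular site is an assumption of the example.

```erlang
%% A reduced module list: aliasing, directory listing, GET/HEAD
%% handling, and logging, in the same order as in the default list.
{modules, [mod_alias, mod_dir, mod_get, mod_head, mod_log]}
```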

Limit properties

{disable_chunked_transfer_encoding_send, boolean()}: This property allows you to disable chunked transfer-encoding when sending a response to an HTTP/1.1 client. By default this is false.

{keep_alive, boolean()}: Instructs the server whether or not to use persistent connections when the client claims to be HTTP/1.1 compliant, default is true.

{keep_alive_timeout, integer()}: The number of seconds the server will wait for a subsequent request from the client before closing the connection. Default is 150.

{max_body_size, integer()}: Limits the size of the message body of an HTTP request. By default there is no limit.

{max_clients, integer()}: Limits the number of simultaneous requests that can be supported. Defaults to 150.

{max_header_size, integer()}: Limits the size of the message header of HTTP request. Defaults to 10240.

{max_uri_size, integer()}: Limits the size of the HTTP request URI. By default there is no limit.

{max_keep_alive_request, integer()}: The number of requests that a client can make on one connection. When the server has responded to the number of requests defined by max_keep_alive_request, it closes the connection, even if there are queued requests. Defaults to no limit.
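The limit properties might be set together as in this fragment. The numbers for max_body_size and max_uri_size are arbitrary example values, and the assumption that sizes are counted in bytes is not stated on this page.

```erlang
%% Fragment of a ServiceConfig property list (illustrative values).
[{keep_alive, true},
 {keep_alive_timeout, 30},        % seconds to wait for the next request
 {max_keep_alive_request, 100},   % requests served per connection
 {max_clients, 150},              % same as the documented default
 {max_header_size, 10240},        % same as the documented default
 {max_body_size, 1048576},        % arbitrary example value
 {max_uri_size, 2048}]            % arbitrary example value
```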

Administrative properties

{mime_types, [{MimeType, Extension}] | path()}: Where MimeType = string() and Extension = string(). Files delivered to the client are MIME typed according to RFC 1590. File suffixes are mapped to MIME types before file delivery. The mapping between file suffixes and MIME types can be specified as an Apache like file as well as directly in the property list. Such a file may look like:

# MIME type	Extension
text/html	html htm
text/plain	asc txt

Defaults to [{"html","text/html"},{"htm","text/html"}]

{mime_type, string()}: When the server is asked to provide a document type which cannot be determined by the MIME Type Settings, the server will use this default type.

{server_admin, string()}: ServerAdmin defines the email-address of the server administrator, to be included in any error messages returned by the server.

{server_tokens, prod|major|minor|minimal|os|full|{private, string()}}: ServerTokens defines how the value of the server header should look.

Example: Assuming the version of inets is 5.8.1, here is what the server header string could look like for the different values of server_tokens:

prod                  "inets"
major                 "inets/5"
minor                 "inets/5.8"
minimal               "inets/5.8.1"
os                    "inets/5.8.1 (unix)"
full                  "inets/5.8.1 (unix/linux) OTP/R15B"
{private, "foo/bar"}  "foo/bar"

By default, the value is as before, which is minimal.

{log_format, common | combined}: Defines if access logs should be written according to the common log format or to the extended common log format. The common format is one line that looks like this: remotehost rfc931 authuser [date] "request" status bytes

remotehost
	Remote hostname.
rfc931
	The client's remote username (RFC 931).
authuser
	The username with which the user authenticated
        himself.
[date]
	Date and time of the request (RFC 1123).
"request"
	The request line exactly as it came from the client
        (RFC 1945).
status
	The HTTP status code returned to the client
        (RFC 1945).
bytes
	The content-length of the document transferred.

The combined format is one line that looks like this: remotehost rfc931 authuser [date] "request" status bytes "referer" "user_agent"

"referer"
	The url the client was on before
	requesting your url. (If it could not be determined
	a minus sign will be placed in this field)
"user_agent"
	The software the client claims to be using. (If it
	could not be determined a minus sign will be placed in
	this field)

This affects the access logs written by mod_log and mod_disk_log.

{error_log_format, pretty | compact}: Defaults to pretty. If the error log is meant to be read directly by a human pretty will be the best option. pretty has the format corresponding to:

io:format("[~s] ~s, reason: ~n ~p ~n~n", [Date, Msg, Reason]).

compact has the format corresponding to:

io:format("[~s] ~s, reason: ~w ~n", [Date, Msg, Reason]).

This affects the error logs written by mod_log and mod_disk_log.
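To see the difference between the two layouts, the format strings quoted above can be applied with io_lib:format/2. The module and function names below are hypothetical helpers for experimentation; they are not part of httpd.

```erlang
-module(error_log_format_sketch).
-export([pretty/3, compact/3]).

%% Render one entry using the "pretty" format string quoted above.
pretty(Date, Msg, Reason) ->
    lists:flatten(
      io_lib:format("[~s] ~s, reason: ~n ~p ~n~n", [Date, Msg, Reason])).

%% Render one entry using the "compact" format string quoted above.
compact(Date, Msg, Reason) ->
    lists:flatten(
      io_lib:format("[~s] ~s, reason: ~w ~n", [Date, Msg, Reason])).
```

The compact layout keeps each entry on a single line because ~w never inserts line breaks, whereas ~p may wrap and indent large terms.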

URL aliasing properties - requires mod_alias

{alias, {Alias, RealName}}: Where Alias = string() and RealName = string(). The Alias property allows documents to be stored in the local file system instead of the document_root location. URLs with a path that begins with url-path are mapped to local files that begin with directory-filename, for example:

{alias, {"/image", "/ftp/pub/image"}}

{re_write, {Re, Replacement}}: Where Re = string() and Replacement = string(). The ReWrite property allows documents to be stored in the local file system instead of the document_root location. URLs are rewritten by re:replace/3 to produce a path in the local filesystem. For example:

{re_write, {"^/[~]([^/]+)(.*)$", "/home/\\1/public\\2"}}
ReWrite ^/[~]([^/]+)(.*)$ /home/\1/public\2
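The effect of the example rule can be tried out directly with the re module. This sketch uses re:replace/4 with {return, list} only to get a flat string back; which options httpd itself passes is not stated here, and the module name is hypothetical.

```erlang
-module(re_write_sketch).
-export([rewrite/1]).

%% Apply the example re_write rule to a request path.
%% Paths that do not match the pattern are returned unchanged.
rewrite(Path) ->
    re:replace(Path,
               "^/[~]([^/]+)(.*)$",
               "/home/\\1/public\\2",
               [{return, list}]).
```

With this rule, a request path such as "/~joe/index.html" maps to "/home/joe/public/index.html".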

{directory_index, [string()]}: DirectoryIndex specifies a list of resources to look for if a client requests a directory using a / at the end of the directory name. file depicts the name of a file in the directory. Several files may be given, in which case the server will return the first it finds, for example:

{directory_index, ["index.html", "welcome.html"]}

CGI properties - requires mod_cgi

{script_alias, {Alias, RealName}}: Where Alias = string() and RealName = string(). Has the same behavior as the Alias property, except that it also marks the target directory as containing CGI scripts. URLs with a path beginning with url-path are mapped to scripts beginning with directory-filename, for example:

{script_alias, {"/cgi-bin/", "/web/cgi-bin/"}}

{script_re_write, {Re, Replacement}}: Where Re = string() and Replacement = string(). Has the same behavior as the ReWrite property, except that it also marks the target directory as containing CGI scripts. URLs with a path beginning with url-path are mapped to scripts beginning with directory-filename, for example:

{script_re_write, {"^/cgi-bin/(\\d+)/", "/web/\\1/cgi-bin/"}}

{script_nocache, boolean()}: If ScriptNoCache is set to true the HTTP server will by default add the header fields necessary to prevent proxies from caching the page. Generally this is something you want. Defaults to false.

{script_timeout, integer()}: The time in seconds the web server will wait between each chunk of data from the script. If the CGI script does not deliver any data before the timeout, the connection to the client will be closed. Defaults to 15.

{action, {MimeType, CgiScript}} - requires mod_actions: Where MimeType = string() and CgiScript = string(). Action adds an action, which will activate a CGI script whenever a file of a certain MIME type is requested. It propagates the URL and file path of the requested document using the standard CGI PATH_INFO and PATH_TRANSLATED environment variables.

{action, {"text/plain", "/cgi-bin/log_and_deliver_text"}}

{script, {Method, CgiScript}} - requires mod_actions: Where Method = string() and CgiScript = string(). Script adds an action, which will activate a CGI script whenever a file is requested using a certain HTTP method. The method is either GET or POST as defined in RFC 1945. It propagates the URL and file path of the requested document using the standard CGI PATH_INFO and PATH_TRANSLATED environment variables.

{script, {"POST", "/cgi-bin/post"}}
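Put together, a CGI setup might look like the fragment below. The timeout value is an arbitrary example, and the modules list of the server is assumed to include mod_cgi and mod_actions.

```erlang
%% Fragment of a ServiceConfig property list (illustrative values).
[{script_alias, {"/cgi-bin/", "/web/cgi-bin/"}},
 {script_nocache, true},     % add headers that stop proxies from caching
 {script_timeout, 30},       % seconds to wait between chunks of script output
 {action, {"text/plain", "/cgi-bin/log_and_deliver_text"}}]
```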

ESI properties - requires mod_esi

{erl_script_alias, {URLPath, [AllowedModule]}}: Where URLPath = string() and AllowedModule = atom(). erl_script_alias marks all URLs matching url-path as erl scheme scripts. A matching URL is mapped into a specific module and function. For example:

{erl_script_alias, {"/cgi-bin/example", [httpd_example]}}

{erl_script_nocache, boolean()}: If erl_script_nocache is set to true, the server will add HTTP header fields that prevent proxies from caching the page. This is generally a good idea for dynamic content, since the content often varies between requests. Defaults to false.

{erl_script_timeout, integer()}: Sets the time in seconds the server will wait between each chunk of data to be delivered through mod_esi:deliver/2. Defaults to 15. This is only relevant for scripts that use the erl scheme.

{eval_script_alias, {URLPath, [AllowedModule]}}: Where URLPath = string() and AllowedModule = atom(). Same as erl_script_alias but for scripts using the eval scheme. Note that this is only supported for backwards compatibility. The eval scheme is deprecated.
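An ESI setup using the erl scheme might be configured as in this fragment. The timeout is an arbitrary example value, and httpd_example is simply the module name used in the example above.

```erlang
%% Fragment of a ServiceConfig property list (illustrative values).
[{erl_script_alias, {"/cgi-bin/example", [httpd_example]}},
 {erl_script_nocache, true},   % dynamic content should not be cached by proxies
 {erl_script_timeout, 30}]     % seconds to wait between mod_esi:deliver/2 calls
```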

Log properties - requires mod_log

{error_log, path()}: Defines the filename of the error log file to be used to log server errors. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.

{security_log, path()}: Defines the filename of the access log file to be used to log security events. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.

{transfer_log, path()}: Defines the filename of the access log file to be used to log incoming requests. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.

Disk Log properties - requires mod_disk_log

{disk_log_format, internal | external}: Defines the file format of the log files; see disk_log(3erl) for more information. If the internal file format is used, the log file will be repaired after a crash. When a log file is repaired, data might get lost. When the external file format is used, httpd will not start if the log file is broken. Defaults to external.

{error_disk_log, path()}: Defines the filename of the (disk_log(3erl)) error log file to be used to log server errors. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.

{error_disk_log_size, {MaxBytes, MaxFiles}}: Where MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the (disk_log(3erl)) error log file. The disk_log(3erl) error log file is of type wrap log and max-bytes will be written to each file and max-files will be used before the first file is truncated and reused.

{security_disk_log, path()}: Defines the filename of the (disk_log(3erl)) access log file which logs incoming security events, that is, authenticated requests. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.

{security_disk_log_size, {MaxBytes, MaxFiles}}: Where MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the disk_log(3erl) access log file. The disk_log(3erl) access log file is of type wrap log and max-bytes will be written to each file and max-files will be used before the first file is truncated and reused.

{transfer_disk_log, path()}: Defines the filename of the (disk_log(3erl)) access log file which logs incoming requests. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.

{transfer_disk_log_size, {MaxBytes, MaxFiles}}: Where MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the disk_log(3erl) access log file. The disk_log(3erl) access log file is of type wrap log and max-bytes will be written to each file and max-files will be used before the first file is truncated and reused.
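A wrap log configuration might look like the fragment below. File names and sizes are made up for illustration. With {200000, 10}, each log keeps at most 200000 * 10 = 2000000 bytes on disk before the oldest file is truncated and reused.

```erlang
%% Fragment of a ServiceConfig property list (illustrative values).
[{disk_log_format, external},
 {error_disk_log, "logs/error_disk_log"},        % relative to server_root
 {error_disk_log_size, {200000, 10}},            % {MaxBytes, MaxFiles}
 {transfer_disk_log, "logs/access_disk_log"},    % relative to server_root
 {transfer_disk_log_size, {200000, 10}}]
```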

Authentication properties - requires mod_auth

{directory, {path(), [{property(), term()}]}}

Here follows the valid properties for directories

{allow_from, all | [RegxpHostString]}: Defines a set of hosts which should be granted access to a given directory. For example:

{allow_from, ["123.34.56.11", "150.100.23"]}

{deny_from, all | [RegxpHostString]}: Defines a set of hosts which should be denied access to a given directory. For example:

{deny_from, ["123.34.56.11", "150.100.23"]}

{auth_type, plain | dets | mnesia}: Sets the type of authentication database that is used for the directory. The key difference between the different methods is that dynamic data can be saved when Mnesia and Dets are used. This property is called AuthDbType in the Apache-like configuration files.

{auth_user_file, path()}: Sets the name of a file which contains the list of users and passwords for user authentication. filename can be either absolute or relative to the server_root. If using the plain storage method, this file is a plain text file, where each line contains a user name followed by a colon, followed by the non-encrypted password. If user names are duplicated, the behavior is undefined. For example:

 ragnar:s7Xxv7
 edward:wwjau8

{auth_group_file, path()}: Sets the name of a file which contains the list of user groups for user authentication. Filename can be either absolute or relative to the server_root. If you use the plain storage method, the group file is a plain text file, where each line contains a group name followed by a colon, followed by the member user names separated by spaces. For example:

group1: bob joe ante

{auth_name, string()}: Sets the name of the authorization realm (auth-domain) for a directory. This string informs the client about which user name and password to use.

{auth_access_password, string()}: If set to other than "NoPassword" the password is required for all API calls. If the password is set to "DummyPassword" the password must be changed before any other API calls. To secure the authenticating data the password must be changed after the web server is started since it otherwise is written in clear text in the configuration file.

{require_user, [string()]}: Defines users which should be granted access to a given directory using a secret password.

{require_group, [string()]}: Defines the groups whose members should be granted access to a given directory using a secret password.
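The directory properties nest inside the directory tuple, as in this sketch. The directory path, file names, and realm name are hypothetical; group1 refers to the group file example above.

```erlang
%% One protected directory using the plain storage method
%% (illustrative paths and names).
{directory, {"/usr/local/www/htdocs/secret",
             [{auth_type, plain},
              {auth_user_file, "auth/passwd"},   % relative to server_root
              {auth_group_file, "auth/group"},   % relative to server_root
              {auth_name, "Secret area"},
              {require_group, ["group1"]},
              {allow_from, all}]}}
```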

Htaccess authentication properties - requires mod_htaccess

{access_files, [path()]}: Specifies the filenames that are used for access files. When a request comes in, every directory in the path to the requested asset is searched for files with the names specified by this parameter. If such a file is found, it is parsed and the restrictions specified in it are applied to the request.

Security properties - requires mod_security

{security_directory, {path(), [{property(), term()}]}}

Here follows the valid properties for security directories

{data_file, path()}: Name of the security data file. The filename can be either absolute or relative to the server_root. This file is used to store persistent data for the mod_security module.

{max_retries, integer()}: Specifies the maximum number of attempts a user has to authenticate before the user is blocked. If a user successfully authenticates while blocked, the user will receive a 403 (Forbidden) response from the server. If the user makes a failed attempt while blocked, the server will return 401 (Unauthorized), for security reasons. Defaults to 3; may also be set to infinity.

{block_time, integer()}: Specifies the number of minutes a user is blocked. After this amount of time, he automatically regains access. Defaults to 60.

{fail_expire_time, integer()}: Specifies the number of minutes a failed user authentication is remembered. If a user authenticates after this amount of time, his previous failed authentications are forgotten. Defaults to 30.

{auth_timeout, integer()}: Specifies the number of seconds a successful user authentication is remembered. After this time has passed, the authentication will no longer be reported. Defaults to 30.
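A security directory might be declared as in the following sketch. The directory path and data file name are hypothetical; the numeric values are the documented defaults, written out to show the units.

```erlang
%% One security directory (illustrative path and file name).
{security_directory, {"/usr/local/www/htdocs/secret",
                      [{data_file, "auth/security_data"},  % relative to server_root
                       {max_retries, 3},        % failed attempts before blocking
                       {block_time, 60},        % minutes
                       {fail_expire_time, 30},  % minutes
                       {auth_timeout, 30}]}}    % seconds
```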

EXPORTS

info(Pid) ->

info(Pid, Properties) -> [{Option, Value}]

Types:

Properties = [property()]

Option = property()

Value = term()

Fetches information about the HTTP server. When called with only the pid all properties are fetched, when called with a list of specific properties they are fetched. Available properties are the same as the server's start options.

Note:

Pid is the pid returned from inets:start/[2,3]. It can also be retrieved from inets:services/0 or inets:services_info/0; see inets(3erl).

info(Address, Port) ->

info(Address, Port, Properties) -> [{Option, Value}]

Types:

Address = ip_address()

Port = integer()

Properties = [property()]

Option = property()

Value = term()

Fetches information about the HTTP server. When called with only the Address and Port all properties are fetched, when called with a list of specific properties they are fetched. Available properties are the same as the server's start options.

Note:

Address has to be the ip-address and can not be the hostname.

reload_config(Config, Mode) -> ok | {error, Reason}

Types:

Config = path() | [{Option, Value}]

Option = property()

Value = term()

Mode = non_disturbing | disturbing

Reloads the HTTP server configuration without restarting the server. Incoming requests will be answered with a temporary down message during the time it takes to reload.

Note:

Available properties are the same as the server's start options, although the properties bind_address and port can not be changed.

If Mode is disturbing, the server is blocked forcefully, all ongoing requests are terminated, and the reload starts immediately. If Mode is non_disturbing, no new connections are accepted, but the ongoing requests are allowed to complete before the reload is done.

ERLANG WEB SERVER API DATA TYPES

      ModData = #mod{}

      -record(mod, {
		data = [],
		socket_type = ip_comm,
		socket,
		config_db,
		method,
		absolute_uri,
		request_uri,
		http_version,
		request_line,
		parsed_header = [],
		entity_body,
		connection
	}).

To access the record in your callback module, use

 -include_lib("inets/include/httpd.hrl").

The fields of the mod record have the following meaning:

data: Type [{InteractionKey,InteractionValue}] is used to propagate data between modules. Depicted interaction_data() in function type declarations.

socket_type: socket_type(), Indicates whether it is an ip socket or a ssl socket.

socket: The actual socket in ip_comm or ssl format depending on the socket_type.

config_db: The config file directives stored as key-value tuples in an ETS-table. Depicted config_db() in function type declarations.

method: Type "GET" | "POST" | "HEAD" | "TRACE", that is the HTTP method.

absolute_uri: If the request is an HTTP/1.1 request, the URI might be in the absolute URI format. In that case httpd will save the absolute URI in this field. An example of an absolute URI could be "http://ServerName:Port/cgi-bin/find.pl?person=jocke"

request_uri: The Request-URI as defined in RFC 1945, for example "/cgi-bin/find.pl?person=jocke"

http_version: The HTTP version of the request, that is "HTTP/0.9", "HTTP/1.0", or "HTTP/1.1".

request_line: The Request-Line as defined in RFC 1945, for example "GET /cgi-bin/find.pl?person=jocke HTTP/1.0".

parsed_header: Type [{HeaderKey,HeaderValue}], parsed_header contains all HTTP header fields from the HTTP-request stored in a list as key-value tuples. See RFC 2616 for a listing of all header fields. For example the date field would be stored as: {"date","Wed, 15 Oct 1997 14:35:17 GMT"} . RFC 2616 defines that HTTP is a case insensitive protocol and the header fields may be in lower case or upper case. Httpd will ensure that all header field names are in lower case.

entity_body: The Entity-Body as defined in RFC 2616, for example data sent from a CGI-script using the POST method.

connection: true | false If set to true the connection to the client is a persistent connection and will not be closed when the request is served.

ERLANG WEB SERVER API CALLBACK FUNCTIONS

EXPORTS

Module:do(ModData)-> {proceed, OldData} | {proceed, NewData} | {break, NewData} | done

Types:

OldData = list()

NewData = [{response,{StatusCode,Body}}] | [{response,{response,Head,Body}}] | [{response,{already_sent,StatusCode,Size}}]

StatusCode = integer()

Body = iolist() | nobody | {Fun, Arg}

Head = [HeaderOption]

HeaderOption = {Option, Value} | {code, StatusCode}

Option = accept_ranges | allow | cache_control | content_MD5 | content_encoding | content_language | content_length | content_location | content_range | content_type | date | etag | expires | last_modified | location | pragma | retry_after | server | trailer | transfer_encoding

Value = string()

Fun = fun( Arg ) -> sent| close | Body

Arg = [term()]

When a valid request reaches httpd, it calls do/1 in each module defined by the modules configuration option. The function may generate data for other modules or a response that can be sent back to the client.

The field data in ModData is a list. This list will be the list returned from the last call to do/1.

Body is the body of the HTTP response that will be sent back to the client; an appropriate header will be added to the message. StatusCode is the status code of the response; see RFC 2616 for the appropriate values.

Head is a key value list of HTTP header fields. The server will construct a HTTP header from this data. See RFC 2616 for the appropriate value for each header field. If the client is a HTTP/1.0 client then the server will filter the list so that only HTTP/1.0 header fields will be sent back to the client.

If Body is returned and equal to {Fun, Arg}, the web server will try apply/2 on Fun with Arg as argument and expect the fun to return either a list (Body) that is an HTTP response, or the atom sent if the HTTP response has already been sent back to the client. If close is returned from the fun, something has gone wrong and the server will signal this to the client by closing the connection.
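A minimal sketch of a callback module implementing do/1 is given below. The module name hello_mod and the helper respond/1 are hypothetical, and keeping the existing data entries when adding a response is a choice made for this sketch rather than a requirement stated above.

```erlang
-module(hello_mod).
-export([do/1, respond/1]).

%% Brings in the #mod{} record described above.
-include_lib("inets/include/httpd.hrl").

%% Callback invoked by httpd for each valid request.
do(ModData) ->
    respond(ModData#mod.data).

%% Pure helper: if an earlier module has already produced a response,
%% pass the data on untouched; otherwise add a 200 response.
respond(Data) ->
    case proplists:is_defined(response, Data) of
        true ->
            {proceed, Data};
        false ->
            {proceed, [{response, {200, "Hello from hello_mod\n"}} | Data]}
    end.
```

Such a module would be enabled by adding hello_mod to the modules property. Splitting out respond/1 makes the decision logic easy to exercise without a running server.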

Module:load(Line, AccIn)-> eof | ok | {ok, AccOut} | {ok, AccOut, {Option, Value}} | {ok, AccOut, [{Option, Value}]} | {error, Reason}

Types:

Line = string()

AccIn = [{Option, Value}]

AccOut = [{Option, Value}]

Option = property()

Value = term()

Reason = term()

load is used to convert a line in an Apache-like configuration file to an {Option, Value} tuple. Some more complex configuration options, such as directory and security_directory, will create an accumulator. This function only needs clauses for the options implemented by this particular callback module.

Module:store({Option, Value}, Config)-> {ok, {Option, NewValue}} | {error, Reason}

Types:

NewValue = term()

Option = property()

Config = [{Option, Value}]

Value = term()

Reason = term()

This function is used to check the validity of the configuration options before saving them in the internal database. It may also have a side effect, that is, setting up extra resources implied by the configuration option. It can also resolve possible dependencies among configuration options by changing the value of the option. This function only needs clauses for the options implemented by this particular callback module.

Module:remove(ConfigDB) -> ok | {error, Reason}

Types:

ConfigDB = ets_table()

Reason = term()

When httpd is shut down, it will try to execute remove/1 in each Erlang web server callback module. The programmer may use this function to clean up resources that may have been created in the store function.
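The store/2 and remove/1 callbacks could be sketched as follows. The module name hello_conf, the option hello_greeting, and the error term are all hypothetical and exist only for this illustration.

```erlang
-module(hello_conf).
-export([store/2, remove/1]).

%% Accept the hypothetical option hello_greeting only when its value
%% is a list (string); reject any other value with an error.
store({hello_greeting, Value} = Option, _Config) when is_list(Value) ->
    {ok, Option};
store({hello_greeting, Value}, _Config) ->
    {error, {wrong_type, {hello_greeting, Value}}}.

%% store/2 above allocates no extra resources, so there is
%% nothing to clean up at shutdown.
remove(_ConfigDB) ->
    ok.
```

Only a clause for the option owned by this module is needed; options belonging to other modules are handled by their own callbacks.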

ERLANG WEB SERVER API HELP FUNCTIONS

EXPORTS

parse_query(QueryString) -> [{Key,Value}]

Types:

QueryString = string()

Key = string()

Value = string()

parse_query/1 parses incoming data to erl and eval scripts (see mod_esi(3erl)) as defined in the standard URL format, that is, '+' becomes a space and hexadecimal escapes (%xx) are decoded.

SEE ALSO

RFC 2616, inets(3erl), ssl(3erl)