HTTP Working Group Koen Holtman, TUE
Internet-Draft
Expires: September 18, 1997 March 18, 1997
Wildcards in the Accept-Charset Header
draft-holtman-http-wildcards-00.txt
STATUS OF THIS MEMO
This document is an Internet-Draft. Internet-Drafts are
working documents of the Internet Engineering Task Force
(IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by
other documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other
than as "work in progress".
To learn the current status of any Internet-Draft, please
check the "1id-abstracts.txt" listing contained in the
Internet-Drafts Shadow Directories on ftp.is.co.za
(Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific
Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US
West Coast).
Distribution of this document is unlimited. Please send
comments to the HTTP working group at
. Discussions of the working
group are archived at
. General
discussions about HTTP and the applications which use HTTP
should take place on the mailing list.
ABSTRACT
The HTTP/1.1 specification (RFC 2068) defines an Accept-Charset
header, but fails to define a wildcard "*" which could be used in
this header to match all character sets. This proposal corrects
this omission.
1 Introduction
The HTTP/1.1 specification (RFC 2068) defines an Accept-Charset
header, but fails to define a wildcard "*" which could be used in
this header to match all character sets. This proposal corrects this
omission.
A wildcard in the Accept-Charset header is considered important
because, used in combination with q values, it allows a client to
better specify its acceptance of many character sets. Support for
many different character sets is one possible route (or transition
path) for web internationalization. The existence of this path, and
the desirability of enabling it, were not properly recognized when
the HTTP/1.1 specification [1] was written.
Under HTTP/1.x-based server-driven negotiation [1], a wildcard can
give only an inaccurate specification of the support levels for many
character sets, and this inaccuracy may lead to problems. When used
in HTTP transparent content negotiation [2], however, the wildcard
does not cause inaccurate end results, and can in fact serve as a
bandwidth-saving device (see section 4.2.1 of [3]).
2 Proposed edits
It is proposed to change the following text in section 14.2 of [1]:
The ISO-8859-1 character set can be assumed to be acceptable to all
user agents.
Accept-Charset = "Accept-Charset" ":"
1#( charset [ ";" "q" "=" qvalue ] )
Character set values are described in section 3.4. Each charset may
be given an associated quality value which represents the user's
preference for that charset. The default value is q=1. An example is
Accept-Charset: iso-8859-5, unicode-1-1;q=0.8
If no Accept-Charset header is present, the default is that any
character set is acceptable.
to the text below:
The ISO-8859-1 character set can be assumed to be acceptable to all
user agents.
Accept-Charset = "Accept-Charset" ":"
| 1#( ( charset | "*" ) [ ";" "q" "=" qvalue ] )
Character set values are described in section 3.4. Each charset may
be given an associated quality value which represents the user's
preference for that charset. The default value is q=1. An example is
Accept-Charset: iso-8859-5, unicode-1-1;q=0.8
| The special value "*", if present in the Accept-Charset field,
| matches every character set (including ISO-8859-1) which is not
| mentioned elsewhere in the Accept-Charset field. If no "*" is
| present in an Accept-Charset field, then all character sets not
| explicitly mentioned get a quality value of 0, except for
| ISO-8859-1, which gets a quality value of 1 if not explicitly
| mentioned.
If no Accept-Charset header is present, the default is that any character
set is acceptable.
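The matching rules in the proposed text can be illustrated with a
short Python sketch. It is not part of the proposed specification
text. The function names parse_accept_charset and charset_quality are
hypothetical, the parser is deliberately simplified (it does not
validate tokens or qvalue ranges), and the value 1.0 returned when no
header is present is an assumption: the text only says that any
character set is then acceptable.

```python
from typing import Dict, Optional


def parse_accept_charset(value: str) -> Dict[str, float]:
    """Parse an Accept-Charset field value into {charset: qvalue}.

    Simplified sketch: no token or qvalue range validation.
    """
    prefs: Dict[str, float] = {}
    for element in value.split(","):
        element = element.strip()
        if not element:
            continue  # the #rule list syntax allows null elements
        name, _, param = element.partition(";")
        q = 1.0  # the default value is q=1
        key, _, val = param.partition("=")
        if key.strip().lower() == "q" and val.strip():
            q = float(val.strip())
        # charset names are case-insensitive tokens
        prefs[name.strip().lower()] = q
    return prefs


def charset_quality(header: Optional[str], charset: str) -> float:
    """Quality value of `charset` under the proposed wildcard rules."""
    if header is None:
        # No Accept-Charset header: any character set is acceptable.
        # Returning 1.0 here is an assumption of this sketch.
        return 1.0
    prefs = parse_accept_charset(header)
    charset = charset.lower()
    if charset in prefs:
        return prefs[charset]  # explicitly mentioned
    if "*" in prefs:
        return prefs["*"]  # wildcard covers every unmentioned charset
    # No wildcard: unmentioned charsets get 0, except ISO-8859-1.
    return 1.0 if charset == "iso-8859-1" else 0.0
```

With the field value "iso-8859-5;q=0.8, *;q=0.2", this sketch assigns
0.8 to iso-8859-5 and 0.2 to every other character set, including
ISO-8859-1. With "iso-8859-5, unicode-1-1;q=0.8" (no wildcard),
ISO-8859-1 gets 1 and any other unmentioned character set gets 0.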
3 Compatibility considerations
The syntax rules in the current version of the HTTP/1.1 specification
[1] allow a charset value of "*" to be present in the Accept-Charset
header. Thus, servers which implement [1] will have no trouble
parsing a header like
Accept-Charset: iso-8859-5;q=0.8, *;q=0.2
According to [1], the "*" value should be interpreted as an unknown
(unregistered) character set designator. Thus, servers which
implement [1] will simply ignore the wildcard if present.
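The parsing claim above rests on "*" being a legal token under the
basic rules of [1], since a charset value is a token. A minimal Python
sketch of that check follows. The helper name is_token is hypothetical,
and the tspecials set is transcribed from section 2.2 of [1] as an
assumption of this sketch.

```python
# tspecials as listed in section 2.2 of RFC 2068, including SP and HT.
TSPECIALS = set('()<>@,;:\\"/[]?={} \t')


def is_token(s: str) -> bool:
    """True if s is one or more US-ASCII CHARs, none a CTL or tspecial."""
    return bool(s) and all(
        32 < ord(c) < 127 and c not in TSPECIALS for c in s
    )


# "*" is a token, so a parser written for [1] accepts it as a charset
# name; it simply never matches a registered character set.
print(is_token("*"))           # True
print(is_token("iso-8859-5"))  # True
print(is_token("a;b"))         # False, ";" is a tspecial
```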
4 Security considerations
This proposal adds no new HTTP security considerations.
5 References
[1] R. Fielding, J. Gettys, J. C. Mogul, H. Frystyk, and
T. Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1. RFC
2068, HTTP Working Group, January, 1997.
[2] K. Holtman, A. Mutz. Transparent Content Negotiation in HTTP.
Internet-Draft draft-ietf-http-negotiation-01.txt, HTTP Working
Group.
[3] K. Holtman, A. Mutz. HTTP Remote Variant Selection Algorithm
-- RVSA/1.0. Internet-Draft draft-ietf-http-rvsa-v10-00.txt,
HTTP Working Group.
6 Author's address
Koen Holtman
Technische Universiteit Eindhoven
Postbus 513
Kamer HG 6.57
5600 MB Eindhoven (The Netherlands)
Email: koen@win.tue.nl