HTTP Working Group                                   Koen Holtman, TUE
Internet-Draft
Expires: September 18, 1997                             March 18, 1997


              Wildcards in the Accept-Charset Header

                draft-holtman-http-wildcards-00.txt


STATUS OF THIS MEMO

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress".

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

Distribution of this document is unlimited. Please send comments to
the HTTP working group at . Discussions of the working group are
archived at . General discussions about HTTP and the applications
which use HTTP should take place on the mailing list.


ABSTRACT

The HTTP/1.1 specification (RFC 2068) defines an Accept-Charset
header, but fails to define a wildcard "*" which could be used in
this header to match all character sets. This proposal corrects
this omission.


1 Introduction

The HTTP/1.1 specification (RFC 2068) defines an Accept-Charset
header, but fails to define a wildcard "*" which could be used in
this header to match all character sets. This proposal corrects
this omission.

A wildcard in the Accept-Charset header is considered important,
because it allows a better specification of the acceptance of many
character sets if it is used in combination with q values. The
support for many different character sets is one possible route (or
transition path) for web internationalization.
The existence of this path, and the desirability of enabling it, was
not properly recognized when the HTTP/1.1 specification [1] was
written.

Under HTTP/1.x-based server-driven negotiation [1], a wildcard can
only give an inaccurate specification of the support levels for many
character sets, and this inaccuracy may lead to problems. When used
in HTTP transparent content negotiation [2], however, the wildcard
does not cause inaccurate end results, and in fact can be used as a
bandwidth-saving device (see section 4.2.1 of [3]).


2 Proposed edits

It is proposed to change the following text in section 14.2 of [1]:

     The ISO-8859-1 character set can be assumed to be acceptable to
     all user agents.

          Accept-Charset = "Accept-Charset" ":"
                    1#( charset [ ";" "q" "=" qvalue ] )

     Character set values are described in section 3.4. Each charset
     may be given an associated quality value which represents the
     user's preference for that charset. The default value is q=1.
     An example is

          Accept-Charset: iso-8859-5, unicode-1-1;q=0.8

     If no Accept-Charset header is present, the default is that any
     character set is acceptable.

to the text below:

     The ISO-8859-1 character set can be assumed to be acceptable to
     all user agents.

          Accept-Charset = "Accept-Charset" ":"
|                   1#( ( charset | "*" ) [ ";" "q" "=" qvalue ] )

     Character set values are described in section 3.4. Each charset
     may be given an associated quality value which represents the
     user's preference for that charset. The default value is q=1.
     An example is

          Accept-Charset: iso-8859-5, unicode-1-1;q=0.8

|    The special value "*", if present in the Accept-Charset field,
|    matches every character set (including ISO-8859-1) which is not
|    mentioned elsewhere in the Accept-Charset field. If no "*" is
|    present in an Accept-Charset field, then all character sets not
|    explicitly mentioned get a quality value of 0, except for
|    ISO-8859-1, which gets a quality value of 1 if not explicitly
|    mentioned.

     If no Accept-Charset header is present, the default is that any
     character set is acceptable.


3 Compatibility considerations

The syntax rules in the current version of the HTTP/1.1
specification [1] allow a charset value of "*" to be present in the
Accept-Charset header. Thus, servers which implement [1] will have
no trouble parsing a header like

     Accept-Charset: iso-8859-5;q=0.8, *;q=0.2

According to [1], the "*" value should be interpreted as an unknown
(unregistered) character set designator. Thus, servers which
implement [1] will simply ignore the wildcard if present.


4 Security considerations

This proposal adds no new HTTP security considerations.


5 References

[1] R. Fielding, J. Gettys, J. C. Mogul, H. Frystyk, and
    T. Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1.
    RFC 2068, HTTP Working Group, January, 1997.

[2] K. Holtman, A. Mutz. Transparent Content Negotiation in HTTP.
    Internet-Draft draft-ietf-http-negotiation-01.txt, HTTP Working
    Group.

[3] K. Holtman, A. Mutz. HTTP Remote Variant Selection Algorithm --
    RVSA/1.0. Internet-Draft draft-ietf-http-rvsa-v10-00.txt, HTTP
    Working Group.


6 Author's address

Koen Holtman
Technische Universiteit Eindhoven
Postbus 513
Kamer HG 6.57
5600 MB Eindhoven (The Netherlands)
Email: koen@win.tue.nl


Expires: September 18, 1997
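
The matching rule proposed in section 2 can be illustrated with a
short Python sketch. This is not part of the proposed specification
text and is only one possible reading of it. The function names
parse_accept_charset and charset_quality are hypothetical, the input
is assumed to be a well-formed field value, and the quality value of
1 returned when no header is present is an assumption (the text only
says that any character set is then acceptable).

```python
from typing import Dict, Optional


def parse_accept_charset(field_value: str) -> Dict[str, float]:
    """Parse an Accept-Charset field value into {charset: qvalue}.

    Charset names are lowercased; the wildcard is stored under "*".
    Elements without a q parameter get the default q=1.
    Assumes well-formed input; if a name repeats, the last one wins.
    """
    prefs: Dict[str, float] = {}
    for element in field_value.split(","):
        element = element.strip()
        if not element:
            continue
        name, _, params = element.partition(";")
        q = 1.0  # default quality value
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                q = float(value.strip())
        prefs[name.strip().lower()] = q
    return prefs


def charset_quality(field_value: Optional[str], charset: str) -> float:
    """Return the quality value of `charset` under the proposed rule."""
    if field_value is None:
        # No Accept-Charset header: any character set is acceptable.
        # Returning 1.0 here is an assumption of this sketch.
        return 1.0
    prefs = parse_accept_charset(field_value)
    charset = charset.lower()
    if charset in prefs:
        # Explicitly mentioned charsets use their own quality value.
        return prefs[charset]
    if "*" in prefs:
        # "*" matches every charset not mentioned elsewhere,
        # including ISO-8859-1.
        return prefs["*"]
    # No wildcard: ISO-8859-1 gets 1, everything else gets 0.
    return 1.0 if charset == "iso-8859-1" else 0.0


header = "iso-8859-5;q=0.8, *;q=0.2"
print(charset_quality(header, "iso-8859-5"))  # 0.8
print(charset_quality(header, "iso-8859-1"))  # 0.2
```

With the example header from section 3, an explicitly listed
character set keeps its own quality value, while every other
character set, ISO-8859-1 included, falls back to the quality value
given for "*".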