User Agents
...suck. How simple is that? A godawful mishmash of crap that somehow manages (most of the time) to identify a browser... They've been with us for years, since the first Mozilla browser I guess, and are likely to be here for a good while longer. There is some documentation to be found explaining the format here and more recently, here. But look at this...
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)
Ick. What a load of pants.
So far be it from me to go out on a limb and suggest a replacement, but that's what I'm gonna do.
User-Agent2:
Enter the User-Agent2: header. My version of the abomination. Let me explain. The people currently "making up" the PEAR rules seemingly insist on calling the packages "Foo2", where Foo is the package, which has undergone some significant and/or miraculous BC-breaking change. Whether this is a good thing or a bad thing I won't go into (bad bad bad). However, I will mercilessly steal the idea and use it for my all new User-Agent2: header plan.
So what's the format? Well, it's dead easy. Most text-based protocols that have the concept of headers (e.g. HTTP, MIME etc.) also have the concept of parameters, which are bits added on to the end of the value, separated by semi-colons, which add extra meaning or information. I intend to steal (notice the theme...) this idea, and use it here. There will be no value to the User-Agent2: header, only parameters. Here's an example (which also happens to take advantage of header folding):
User-Agent2: a="Internet Explorer";
 v="6.0";
 p="Windows";
 pv="XP";
 s="N";
 l="en-GB"
See. How bloody obvious is that? Ok perhaps not. Some explanations:
- a - Application, required. Gives the REAL name of the application, that people want to read.
- v - Version, required. Gives the version of the application.
- p - Platform, optional. Gives the platform that the application runs on. Potential values include: Windows, Mac, X11.
- pv - Platform Version, optional. Gives the version of the platform that the application runs on. Given the myriad of potential platforms, this is an arbitrary string, but should retain instant readability. E.g. for the Windows platform, this could be "3.11", "NT4.0", "2000" or "XP".
- s - Security, optional. Gives the security of the application. Not sure why I include this. Pointless IMO.
- l - Localisation, optional. I guess some people might use this. I don't though. Btw, did you see the s? No zees here el-Yankee-Doodles.
See. How bloody obvious is that?
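For what it's worth, here's a rough sketch of going the other way: building the header from its parts. The function name buildUserAgent2 is made up for illustration. It also assumes that values never contain a double quote (nothing above defines an escaping rule) and that folded lines are joined with CRLF plus a single space.

```php
<?php
// Hypothetical helper, not part of any real library: builds a User-Agent2
// header from an associative array of parameter name => value.
function buildUserAgent2(array $fields): string
{
    // 'a' (application) and 'v' (version) are the two required parameters.
    foreach (['a', 'v'] as $required) {
        if (!isset($fields[$required]) || $fields[$required] === '') {
            throw new InvalidArgumentException("Missing required parameter: $required");
        }
    }

    $params = [];
    foreach ($fields as $name => $value) {
        // Assumption: no escaping rule exists, so quotes in values are rejected.
        if (strpos((string) $value, '"') !== false) {
            throw new InvalidArgumentException("Value for $name contains a double quote");
        }
        $params[] = $name . '="' . $value . '"';
    }

    // Fold after each semi-colon; every continuation line starts with a space.
    return 'User-Agent2: ' . implode(";\r\n ", $params);
}

echo buildUserAgent2([
    'a'  => 'Internet Explorer',
    'v'  => '6.0',
    'p'  => 'Windows',
    'pv' => 'XP',
]), "\n";
```

The optional parameters are simply left out when you don't care about them, which is rather the point.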
You might be asking, "What advantages does this offer?". Well stop asking that. Not only is it a heck of a lot easier to parse, but the introduction of a new header means you can use readable values for each of the fields. So instead of having mounds of code to do something that should be very simple (identifying the browser...), you could do it with one regex. For example...
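<?php
// Sample header, folded across lines (continuation lines start with whitespace).
$str = <<<END
User-Agent2: a="Internet Explorer";
 v="6.0";
 p="Windows";
 pv="XP";
 s="N";
 l="en-GB"
END;

// Group 1 captures the parameter name, group 2 the quoted value.
// \b stops the pattern from matching the tail end of some longer name.
preg_match_all('#\b(a|v|p|pv|s|l)="([^"]*)"#i', $str, $matches);

// Pair each name with its value, e.g. 'a' => 'Internet Explorer'.
print_r(array_combine($matches[1], $matches[2]));
?>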
Not only this, but this new format allows for significant expandability. Need a colour (note the u...) identifier? Just add it in: c="Bright Pink".
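To make the expandability point a bit more concrete, here's a minimal sketch of a parser that doesn't hard-code the parameter names, so something like c="Bright Pink" just falls out the other end. The name parseUserAgent2 is hypothetical, and so are the rules it applies: names are purely alphabetic and treated case-insensitively, values contain no double quotes, and a repeated name overwrites the earlier one.

```php
<?php
// Hypothetical helper, not part of any real library: parses a User-Agent2
// header into an array of lower-cased parameter name => value.
function parseUserAgent2(string $header): array
{
    // Undo header folding: a line break followed by whitespace becomes one space.
    $unfolded = preg_replace('/\r?\n[ \t]+/', ' ', $header);

    // Drop the header name, if it is present.
    $value = preg_replace('/^User-Agent2:\s*/i', '', $unfolded);

    // Any alphabetic name followed by a quoted value counts as a parameter.
    preg_match_all('/\b([a-z]+)="([^"]*)"/i', $value, $m);
    $params = array_combine(array_map('strtolower', $m[1]), $m[2]);

    // 'a' and 'v' are required; everything else is optional.
    foreach (['a', 'v'] as $required) {
        if (!isset($params[$required])) {
            throw new UnexpectedValueException("Missing required parameter: $required");
        }
    }

    return $params;
}

print_r(parseUserAgent2(
    "User-Agent2: a=\"Internet Explorer\";\r\n v=\"6.0\";\r\n c=\"Bright Pink\""
));
```

Unknown parameters are passed through untouched, so whoever reads the result can ignore the ones they don't understand.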
So there you go. Food for thought. Tasty food too. Yum.