blueshoes php application framework and cms            core_net

Class: Bs_Browscap

Source Location: /core/net/http/Bs_Browscap.class.php

Class Overview

Bs_Object
   |
   --Bs_Browscap

Browser Capture Class.


Author(s):

  • Andrej Arn <at blueshoes dot org>

Version:

  • 4.3.$Revision: 1.5 $ $Date: 2003/10/29 17:48:42 $

Copyright:

  • blueshoes.org

Inherited Methods

Class: Bs_Object

Bs_Object::Bs_Object()
Bs_Object::getErrors()
Basic error handling: Get *all* errors as string array from the global Bs_Error-error stack.
Bs_Object::getLastError()
Basic error handling: Get last error string from the global Bs_Error-error stack.
Bs_Object::getLastErrors()
Basic error handling: Get the last errors as a string array from the global Bs_Error-error stack since the last call of getLastErrors().
Bs_Object::persist()
Persists this object by serializing it and saving it to a file with unique name.
Bs_Object::setError()
Basic error handling: Push an error string on the global Bs_Error-error stack.
Bs_Object::toHtml()
Dumps the content of this object to a string using PHP's var_dump().
Bs_Object::toString()
Dumps the content of this object to a string using PHP's var_dump().
Bs_Object::unpersist()
Fetches an object that was persisted with persist()

Class Details

[line 31]
Browser Capture Class.

Detects a lot. Can even use a temporary redirect page to gather additional information using JavaScript.

wishlist:

  • Detect junkbuster/muffin etc. It is annoying when clients send a different, wrong user-agent and http-referer on each request. The user-agent in particular could probably be detected.
  • Re-detect previous client machines using a hash.

dependencies: Bs_Url, and BsDb for detectUserType (with the kb).




Tags:

pattern:  singleton: (pseudostatic)
todo:  add HTTP_ACCEPT_ENCODING HTTP_ACCEPT
access:  public
version:  4.3.$Revision: 1.5 $ $Date: 2003/10/29 17:48:42 $
copyright:  blueshoes.org
author:  Andrej Arn <at blueshoes dot org>




Class Variables

$bsDb =

[line 40]

a db instance, used in detectUserType to access the kb.



Tags:

see:  $this->detectUserType()
access:  public

Type:   object



$data = array()

[line 420]

Data hash holding the key/value pairs with the information. If something is not set or null it means 'not known', while false means 'no'.

Normally we would write var names like 'hasFrames' instead of just 'frames' and 'isJavaScriptEnabled' instead of 'javaScriptEnabled' but to save characters we don't do that here.
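
The three-way convention described above (unset or null means 'not known', false means 'no') is easy to get wrong with a plain truthiness check. The following is a minimal sketch of how a caller might read such a hash; the helper name capabilityState() is hypothetical and not part of Bs_Browscap.

```php
<?php
// Hypothetical helper (not part of Bs_Browscap): classify a value from a
// data hash where unset/null means "not known" and false means "no".
function capabilityState(array $data, string $key): string
{
    // isset() is false both for missing keys and for keys holding null.
    if (!isset($data[$key])) {
        return 'unknown';
    }
    return $data[$key] ? 'yes' : 'no';
}

// Illustrative values only, using key names from the list below.
$data = ['frames' => true, 'png' => null, 'javaScriptEnabled' => false];

echo capabilityState($data, 'frames'), "\n";            // yes
echo capabilityState($data, 'png'), "\n";               // unknown
echo capabilityState($data, 'javaScriptEnabled'), "\n"; // no
echo capabilityState($data, 'iFrames'), "\n";           // unknown (key not set)
```

A check such as `if (!$data['png'])` would treat 'not known' and 'no' alike, which may or may not be what the caller wants.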

-------------------------------------------------------------------------------------------
BROWSER, BASIC
-------------------------------------------------------------------------------------------

KEY: userAgent (string): The browser string as we get it from the client.
  NOTE: The client can send *anything*. Some users even have anonymizers and banner killers installed which send a random client string on each request. It is also common for less common browsers to identify themselves as well-known ones.
  Example: 'Mozilla/4.5 [en] (X11; U; Linux 2.2.9 i586)'
  Supported: all, as far as I know (sent in the request header).
  Note: sending a user-agent is recommended by the RFC but not required, so it is possible that we don't have it at all. Reading it by javascript will help in most cases (as long as js is supported).

KEY: browser (string): 'ie', 'ns' etc. Supported: depends on userAgent. See also isGecko.

KEY: isGecko (bool): Tells if the browser uses the Gecko engine. That should be TRUE for things like Mozilla, Netscape etc. Check browserBuild for the version.

KEY: browserMajorVersion (int): For "internet explorer 6.0b" this would be '6'. Supported: depends on userAgent

KEY: browserMinorVersion (int): For "internet explorer 6.0b" this would be '0'. Supported: depends on userAgent

KEY: browserMinorVerlet (mixed): For "internet explorer 6.0b" this would be 'b'. Supported: depends on userAgent

KEY: browserFullVersion (string): For "internet explorer 6.0b" this would be '6.0b'. Supported: depends on userAgent
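
As an illustration of how the four version keys relate to each other, here is a minimal sketch that splits a full version string such as '6.0b' into its parts. The function splitBrowserVersion() and its regular expression are assumptions made for this example; the class's real parsing is not shown in this documentation.

```php
<?php
// Hypothetical helper (not part of Bs_Browscap): split a version string
// such as "6.0b" into the four documented keys.
function splitBrowserVersion(string $fullVersion): ?array
{
    // Major digits, optional ".minor" digits, optional suffix starting with a letter.
    if (!preg_match('/^(\d+)(?:\.(\d+))?([a-z]\w*)?$/i', $fullVersion, $m)) {
        return null; // not parseable: the caller should leave the keys unset
    }
    return [
        'browserMajorVersion' => (int) $m[1],
        'browserMinorVersion' => (isset($m[2]) && $m[2] !== '') ? (int) $m[2] : null,
        'browserMinorVerlet'  => (isset($m[3]) && $m[3] !== '') ? $m[3] : null,
        'browserFullVersion'  => $fullVersion,
    ];
}

$v = splitBrowserVersion('6.0b');
echo $v['browserMajorVersion'], "\n"; // 6
echo $v['browserMinorVersion'], "\n"; // 0
echo $v['browserMinorVerlet'], "\n";  // b
```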

KEY: browserBuild (string): 2do... For Gecko, taken from the browser string: "Gecko/20020530" => "20020530", so it's a date.
  20020530 is the release version of moz 1.0. moz 1.0 identifies as 5.0, revision 1.0.0, see here:
    Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.0) Gecko/20020530
  20020826 is the release version of moz 1.1. moz 1.1 identifies as 5.0, revision 1.1, see here:
    Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.1) Gecko/20020826
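
The Gecko build date and the revision can both be pulled out of the browser string with a regular expression. This is a minimal sketch under the assumption that the build is always eight digits after "Gecko/"; the function names are hypothetical and not part of Bs_Browscap.

```php
<?php
// Hypothetical helpers (not part of Bs_Browscap).

// Returns the 8-digit build date after "Gecko/", or null if there is none.
function geckoBuild(string $userAgent): ?string
{
    return preg_match('~Gecko/(\d{8})~', $userAgent, $m) ? $m[1] : null;
}

// Returns the revision after "rv:", or null if there is none.
function geckoRevision(string $userAgent): ?string
{
    return preg_match('~rv:([0-9.]+)~', $userAgent, $m) ? $m[1] : null;
}

$ua = 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.0) Gecko/20020530';
echo geckoBuild($ua), "\n";    // 20020530
echo geckoRevision($ua), "\n"; // 1.0.0
```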

KEY: browserCodeName (string): The nickname of the browser. Invented by Netscape because their browser was called 'mozilla' (from "mosaic killer", "mosaic killa"). IE identifies itself as 'mozilla' as well, and most clients probably do, which makes this var pretty useless(?). Supported: needs javascript. It could probably be read from the userAgent as well, because 'mozilla' is often in there. Untested.

KEY: browserLanguages (hash): key is the language as ISO code like "en-uk" or "en". value is a number from 0 to 1, for example 0.5. the user prefers the language with the highest value. for details see the rfc at: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4

this comes from HTTP_ACCEPT_LANGUAGE. Supported: all afaik (sent in the header request). history: used to be a plain string like "en-uk" or "en-us,en;q=0.5" until bs-4.4.
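
To illustrate the hash format, here is a minimal sketch that turns an Accept-Language header value into a language => quality hash, sorted by preference. The function parseAcceptLanguage() is hypothetical; it follows the q-value rule from RFC 2616 section 14.4 (a missing q defaults to 1) but is not the code Bs_Browscap uses.

```php
<?php
// Hypothetical helper (not part of Bs_Browscap): parse an Accept-Language
// value such as "en-us,en;q=0.5" into a hash of language => quality.
function parseAcceptLanguage(string $header): array
{
    $languages = [];
    foreach (explode(',', $header) as $part) {
        $part = trim($part);
        if ($part === '') {
            continue; // skip empty list elements
        }
        $pieces = explode(';', $part);
        $lang   = strtolower(trim($pieces[0]));
        $q      = 1.0; // a missing q parameter defaults to 1
        for ($i = 1; $i < count($pieces); $i++) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $pieces[$i], $m)) {
                $q = (float) $m[1];
            }
        }
        $languages[$lang] = $q;
    }
    arsort($languages); // highest preference first, keys are preserved
    return $languages;
}

print_r(parseAcceptLanguage('en-us,en;q=0.5'));
```

The most preferred language is then the first key of the returned hash.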

KEY: webCrawler (bool): 2do... If the client is a webcrawler. For example search engines crawl the web to update their indexes. So if we detect a crawler, one can send a crawler-optimized webpage, for example. Supported: depends on userAgent, maybe ip/hostname.

KEY: emailCrawler (bool): 2do... If the client is an email crawler. Email crawlers spider the web to collect email addresses, so that's something we don't want to have. I don't see how we can detect these, because they usually identify themselves as a standard browser. Maybe using ranges of known ip addresses? Supported: depends on userAgent, maybe ip/hostname.

KEY: wap (bool): 2do... If the client is a wap mobile phone. => use wml Supported: depends on userAgent

KEY: pda (bool): 2do... If the client is a portable device, like a Palm or Handspring. Dunno what markup language they need. Supported: depends on userAgent

KEY: os (string): One of 'win', 'mac', 'linux', 'beos', 'freebsd', 'solaris', ... Supported: depends on userAgent

KEY: osVersion (string): Something like '95', '98', '2000', 'xp', 'ppc', ... Supported: depends on userAgent see: _getOsInfo()

KEY: cpuClass (string): Something like 'x86'. Supported: only ie afaik, not netscape, needs javascript.

-------------------------------------------------------------------------------------------
CLIENT
-------------------------------------------------------------------------------------------

KEY: ip (string, up to 15 chars for IPv4; an IPv6 address in text form can need up to 39): Example: 217.162.141.109
  Sometimes ISP cache servers/proxies replace the REMOTE_ADDR with their own ip address. If so, they usually add the originator's ip to the HTTP_X_FORWARDED_FOR header. But there are 2 problems:
    1) Anyone can add such a header, and thus we cannot use this as the real ip address.
    2) Sometimes this header value is a comma-separated list because there are several proxies in a row, and each one adds the previous host to the list.
  You can still log this information, though. I once read that, using Netscape and a java applet, it would be possible to get the real ip address of a client inside a company using a proxy server. Unverified.
  Supported: always
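
Because the forwarded header is client-controlled and may hold a list, it is best kept for logging only. The following is a minimal sketch of splitting such a value into its syntactically valid addresses; forwardedChain() is a hypothetical helper, and the fallback values are made up so the example runs on the command line.

```php
<?php
// Hypothetical helper (not part of Bs_Browscap): split an X-Forwarded-For
// value into its syntactically valid addresses. For logging only; the header
// can be forged by anyone, so never treat it as the authoritative client ip.
function forwardedChain(?string $header): array
{
    if ($header === null || trim($header) === '') {
        return [];
    }
    $chain = [];
    foreach (explode(',', $header) as $entry) {
        $entry = trim($entry);
        if (filter_var($entry, FILTER_VALIDATE_IP) !== false) {
            $chain[] = $entry; // keep IPv4 and IPv6 literals, drop garbage
        }
    }
    return $chain;
}

// The fallbacks below are illustrative values for running outside a web server.
$remoteAddr = $_SERVER['REMOTE_ADDR'] ?? '217.162.141.109';
$forwarded  = forwardedChain($_SERVER['HTTP_X_FORWARDED_FOR'] ?? '10.0.0.5, 192.168.1.1');

echo 'ip=', $remoteAddr, ' forwarded=', implode(' > ', $forwarded), "\n";
```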

KEY: ipResolved (string): (also known as 'host') The resolved ip address. Example: dclient217-162-141-109.hispeed.ch Supported: If resolving is enabled in the web server (apache) and the host can be resolved to a name.

KEY: country (string): Todo: use geo-data, and document this var. Currently we use the resolved host if available, or the language code (e.g. "de-ch" => ch) as fallback. Hacky.
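
The fallback described for country can be sketched as follows: take a two-letter top-level domain from the resolved host if there is one, otherwise take the region part of a language code. The function guessCountry() is an assumption made for illustration; the actual rules inside Bs_Browscap are not documented here.

```php
<?php
// Hypothetical helper (not part of Bs_Browscap): guess a country code from
// the resolved host name, falling back to the language codes.
function guessCountry(?string $host, array $languages): ?string
{
    // 1) A two-letter top-level domain, e.g. "...hispeed.ch" => "ch".
    if ($host !== null && preg_match('/\.([a-z]{2})$/i', $host, $m)) {
        return strtolower($m[1]);
    }
    // 2) The region part of a language code, e.g. "de-ch" => "ch".
    foreach (array_keys($languages) as $lang) {
        if (preg_match('/^[a-z]{2,3}-([a-z]{2})$/i', (string) $lang, $m)) {
            return strtolower($m[1]);
        }
    }
    return null; // not known
}

echo guessCountry('dclient217-162-141-109.hispeed.ch', []), "\n"; // ch
echo guessCountry(null, ['de-ch' => 1.0, 'de' => 0.5]), "\n";     // ch
```

Both sources are weak signals: generic domains such as .com carry no country, and a language region says little about where the client actually is.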

KEY: referrer (string): The address of the page (if any) which referred the browser to the current page. This is set by the user's browser; not all browsers will set it. As with userAgent, anonymizer programs often fake this one.

I have seen this word written as both 'referer' and 'referrer' (2r) many times, so both spellings seem to be ok in prose. But when it comes to variables it matters: the HTTP header uses the single-r spelling, and for php (apache) it is HTTP_REFERER. My translator only knows 2r.

It is not really used here. the information is lost with the redirect, but we could fix that easily. Should we? The browser check is only done once; when the session starts. So after that you have to read out the current http referer yourself anyway. The same applies for proxy and via. I kinda think that we should carry on the original referrer before our redirect. The same applies to anything, like post-, get- and cookie vars.

KEY: proxy (string): 2do... see referrer.

KEY: via (string): 2do... see referrer.

 KEY: userType (array):
        type => webcrawler|emailcrawler|user|proxy|virus|translator|anonymizer
                webcrawler[s]   come from public search engines to update their indexes. they're mostly welcome.
                emailcrawler[s] come to find email addresses for spamming, we never want them.
                user[s]         are ppl visiting the web site, they're mostly welcome.
                prox[ies]       come to our site to fetch pages for ppl. they're fine. some ppl and services may
                                abuse them to be anonymous.
                virus[es]       are programs (or users) trying to attack our server. often the owners of these
                                machines don't know what happens from their box, are using dial-up accounts, and
                                thus we really should not block them permanently.
                translator      like babelfish from altavista, google has that as well.
                anonymizer      like a proxy (or translator) to be anonymous.
        nice => true|false
                webcrawlers are nice until known otherwise or until they don't respect our robots.txt.
                emailcrawlers are never nice.
                users and proxies are nice until they abuse our bandwidth or so.
                viruses are never nice.
        name => if a crawler is coming from google, the name will be "Google" (not googlebot). it's the name
                of the service. of course a user has no such 'name'.
        url  => again for crawlers, google would have the value http://www.google.com/ here.
                if you use cloaking and know the request comes from altavista, you may want to add a back-link
                to them, the spider may like it. :-)
   supported: needs detectUserType() to be executed.
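
For orientation, this is what a userType value shaped like the structure above might look like, together with one possible way to act on it. The values are illustrative, and shouldRunBrowserTest() is a hypothetical policy helper, not part of Bs_Browscap; it only encodes the idea that crawlers should not be sent through a sniffing redirect.

```php
<?php
// Example value shaped like the documented structure (illustrative values).
$userType = [
    'type' => 'webcrawler',
    'nice' => true,
    'name' => 'Google',
    'url'  => 'http://www.google.com/',
];

// Hypothetical policy helper (not part of Bs_Browscap): crawlers should not
// get the javascript/cookie redirect, everything else may.
function shouldRunBrowserTest(?array $userType): bool
{
    if ($userType === null) {
        return true; // type not known: treat the client like a regular user
    }
    return !in_array($userType['type'], ['webcrawler', 'emailcrawler'], true);
}

var_dump(shouldRunBrowserTest($userType)); // bool(false)
```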
 

-------------------------------------------------------------------------------------------
PROPERTIES
-------------------------------------------------------------------------------------------

KEY: dom (string): One of 'dom' || 'ie4' || 'ns4' || 'basic' || ''
    ie4:   document.all, style; window.event
    ns4:   document.layers; array push, pop
    basic: dunno
  Supported: needs javascript
  Todo: maybe detect it from the browser string if js is disabled, at least for ie5+ and ns6+. BUT: spiders could fake it and that way get dom-pages. Do we want that? Don't think so.
  New implementation: if we get to compute the values with a js=0 param (js disabled, but redirect worked), this var will be computed from the browser string. Spiders should not fall into that category, hopefully. I need that feature now.

KEY: height (int): The full screen height of the client computer. At 1024x768 that is 768. Supported: needs javascript

KEY: width (int): The full screen width of the client computer. At 1024x768 that is 1024. Supported: needs javascript

KEY: heightAvailable (int): The available height for the page in the browser window. That is the height without taskbar, office shortcut bar, browser bars etc. Supported: needs javascript

KEY: widthAvailable (int): The available width for the page in the browser window. That is the width without taskbar, office shortcut bar, browser bars etc. Supported: needs javascript

KEY: colorDepth (int): If set it is one of 8, 16, 24, 32 or 36. It is the number of bits. 24-bit means "high", 16.77 million colors. 16-bit means "medium", 65,536 colors. 8-bit means 256 colors. Dunno the others. Supported: needs javascript

KEY: langUser (string): Something like 'en-uk' Supported: needs javascript. ie only i think. no ns/op/mo on win2k. see also: langSystem, browserLanguages

KEY: langSystem (string): Something like 'en-uk' Supported: needs javascript. ie only i think. no ns/op/mo on win2k. see also: langUser, browserLanguages

KEY: timeZoneDiffGmt (double) (stored as string): The timezone difference from the client to gmt in hours. It can be something like '3', '0', '-1.5'. Supported: needs javascript
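
Since timeZoneDiffGmt is stored as a string holding fractional hours, it usually needs converting before use. Here is a minimal sketch; gmtOffset() is a hypothetical helper, and it assumes that a positive value means the client is ahead of GMT, which this documentation does not state.

```php
<?php
// Hypothetical helper (not part of Bs_Browscap): turn a value such as '-1.5'
// into an offset in seconds and a "+hh:mm" label.
// Assumption: positive means the client clock is ahead of GMT.
function gmtOffset(string $diffHours): array
{
    $seconds = (int) round(((float) $diffHours) * 3600);
    $abs     = abs($seconds);
    $label   = sprintf(
        '%s%02d:%02d',
        $seconds < 0 ? '-' : '+',
        intdiv($abs, 3600),       // whole hours
        intdiv($abs % 3600, 60)   // remaining minutes
    );
    return ['seconds' => $seconds, 'label' => $label];
}

echo gmtOffset('-1.5')['label'], "\n"; // -01:30
echo gmtOffset('3')['label'], "\n";    // +03:00
```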

KEY: timeZoneDiffServer (double): 2do... see timeZoneDiffGmt. The timezone difference from the client to the server in hours. It can be something like '3', '0', '-1.5'. Supported: needs javascript

KEY: browserDateTime (string): 2do... Supported: needs javascript

KEY: connectionType (): One of the following values:

  • lan User is connected through a network.
  • modem User is connected through a modem.
  • offline User is working offline.
Supported: ie5.5+ only, and needs javascript

KEY: connectionSpeed (): 2do...

KEY: isFramed (bool): Tells if the part in which we should load a page is inside a frameset. Supported: needs javascript

-------------------------------------------------------------------------------------------
FEATURES
-------------------------------------------------------------------------------------------

KEY: cookiesSession (bool): If the browser supports session cookies and they are enabled. Session cookies are cookies that are deleted once the browser (all browser instances) is closed. Supported: all (it is tested by setting a real cookie).

KEY: cookiesPermanent (bool): If the browser supports permanent cookies and they are enabled. Permanent cookies are cookies that keep on living (hours, days) until the given timeout, even if the browser is closed. Supported: needs javascript. Also works without, but then this var may be true even though the client only treats persistent cookies as session cookies.

KEY: javaScript (bool): If the client supports javascript. 2do: I'm not able to detect if javascript would be available but is disabled. but i don't really need that. i guess we would need to check the browser string.

KEY: javaScriptEnabled (bool): If javascript is enabled. It still could be that a firewall or a junkbuster filters parts of it out of the html file. Usually they would allow image effects and such, but not more. So if any of the javascript checks fails, this var should be set to false. Supported: needs javascript :)

KEY: javaScriptBuild (): 2do do we need this at all?

KEY: javaScriptVersion (string) (stored as string not double): The javascript version, eg '1.3'. Supported: needs javascript

KEY: javaApplets(bool): If the browser supports java (applets). 2do: I'm not able to detect if java would be available but is disabled. But I don't really need that. I guess we would need to check the browser string. I think this is better here than in the plugins list.

KEY: javaAppletsEnabled(bool): If java applets are enabled in the browser. It still could be that a firewall filters it out from the html file. I think this is better here than in the plugins list.

KEY: metaRefresh (bool): If we can use http redirect using meta refresh or not. In ie6 you can set that in your preferences. Default is true, and i don't think many ppl will change that. 2do dunno how to detect it (without doing a metarefresh).

KEY: frames (bool): If the client supports frames.

KEY: iFrames (bool): If the client has and supports iframes. (in ie6/op5 you can disable that.)

KEY: tables (bool):

KEY: styleSheets (bool): If the client supports style sheets.

KEY: fileUpload (bool): If the client supports file uploading.

KEY: png (bool): If the client supports png images. NULL means unknown. you better treat that as FALSE and use gif/jpg or you risk that ppl don't see your images. For further information visit

  • http://www.libpng.org/pub/png/
  • http://www.libpng.org/pub/png/pngapbr.html

KEY: tableBgImage (bool): If the client supports having background images inside tables.

-------------------------------------------------------------------------------------------
PLUGINS
-------------------------------------------------------------------------------------------

//Navigator.plugins.refresh() makes newly installed plugins available.
Todo: better use an array, and add some functionality to be able to better query that list. By mime type would not be bad, and better handling of plugin versions would be nice. Sometimes it's needed to know if xy (pdf) can be handled, and sometimes one really needs to know the version (flash).

KEY: pluginFlash (bool): If the client supports the flash plugin.

KEY: pluginFlashVersion (string): todo: all

KEY: pluginShockwaveDirector (bool): if 'application/x-director' can be handled.

KEY: pluginAcrobat (bool): If the client supports the acrobat plugin and thus can view .pdf files ('application/pdf').

KEY: pluginSvg (bool): if 'image/svg-xml' can be handled (usually the adobe svg plugin)

KEY: pluginRealPlayer (bool): if 'audio/x-pn-realaudio-plugin' can be handled.

KEY: pluginQuickTime (bool): if 'video/quicktime' can be handled.

KEY: pluginWindowsMedia (bool): if 'application/x-mplayer2' can be handled (Windows Media Player).




Tags:

var:  (hash)

Type:   array



$runTestTemplate = NULL

[line 440]

if set then the html template at this location (absolute path) will be used in runTest().

the special placeholders __HEAD__ and __BODY__ in the template will be replaced with all the needed javascript and stuff.

your body tag needs to do onLoad="bcCheck();".
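
A template for $runTestTemplate could look roughly like the one embedded below. The substitution step is only a sketch of what runTest() is described as doing; the two stand-in snippets and the use of strtr() are assumptions for illustration, since the real generated javascript is not shown in this documentation.

```php
<?php
// Example template with the two documented placeholders and the required
// onLoad handler. In practice this would live in its own file, and
// $runTestTemplate would hold its absolute path.
$template = <<<'HTML'
<html>
<head>
<title>One moment...</title>
__HEAD__
</head>
<body onLoad="bcCheck();">
__BODY__
</body>
</html>
HTML;

// Stand-in snippets (assumptions, not the real generated code).
$head = '<script type="text/javascript">function bcCheck() { /* ... */ }</script>';
$body = '<noscript>JavaScript is disabled.</noscript>';

// Replace both placeholders in one pass.
$page = strtr($template, ['__HEAD__' => $head, '__BODY__' => $body]);

echo $page, "\n";
```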




Tags:

access:  public

Type:   string



$runTestTimeout = NULL

[line 447]

you may want to use the browsertest page as intro page. if so, a redirect after 1.5 seconds is no fun. you want more. set a timeout here in seconds.


Type:   mixed



$_getVars = NULL

[line 426]

reference to the HTTP_GET_VARS.



Tags:

var:  (hash)

Type:   array





Class Methods


constructor Bs_Browscap [line 453]

Bs_Browscap Bs_Browscap( )

Constructor.




method compute [line 625]

void compute( [string $userAgent = NULL])

runTest() has been done (or is not desired), now let's compute the data.



Tags:

access:  public


Parameters:

string   $userAgent   (only pass this if you want to overwrite the useragent value we got from the users browser.)
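
The call shape of compute() can be tried out with a stand-in. The class below is NOT Bs_Browscap; it is a hypothetical stub that mirrors only the documented $data member and the compute([$userAgent]) signature, so that the example runs without the BlueShoes framework. With the real class, the same calls would be made on a Bs_Browscap instance.

```php
<?php
// Hypothetical stand-in mirroring only the documented public surface.
// The real class lives in /core/net/http/Bs_Browscap.class.php and fills
// many more keys of $data.
class BrowscapStandIn
{
    public $data = array();

    public function compute($userAgent = null)
    {
        if ($userAgent === null) {
            // No override given: fall back to what the client sent, if anything.
            $userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : null;
        }
        $this->data['userAgent'] = $userAgent;
    }
}

$bc = new BrowscapStandIn();

// Pass a user agent only to overwrite the value received from the browser.
$bc->compute('Mozilla/4.5 [en] (X11; U; Linux 2.2.9 i586)');

echo $bc->data['userAgent'], "\n"; // Mozilla/4.5 [en] (X11; U; Linux 2.2.9 i586)
```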


method detectUserType [line 473]

void detectUserType( )

Tells what kind of 'user' the user is, based on:

  • ip and ip-range
  • hostname
  • user-agent (browser string)

NOTE: In order to be able to query the kb for crawler-data, we need a ref to a db instance. So set $this->bsDb first. The BsKb.CloakList table is used here.




Tags:

see:  var $this->bsDb
access:  public



method isCrawler [line 613]

void isCrawler( mixed $userAgent, [mixed $ip = NULL], [mixed $hostName = NULL])

better use detectUserType()! this is for testing only.



Tags:

deprecated:  



method isEmailCrawler [line 605]

void isEmailCrawler( mixed $userAgent, [mixed $ip = NULL], [mixed $hostName = NULL])

better use detectUserType()! this is for testing only.



Tags:

deprecated:  



method isWebCrawler [line 575]

bool isWebCrawler( string $userAgent, [string $ip = NULL], [string $hostName = NULL])

better use detectUserType()! this is for testing only.

if it's a web crawler we might want to send different pages, and we certainly don't want a redirect for browser sniffing.




Tags:

access:  public


Parameters:

string   $userAgent  
string   $ip  
string   $hostName  


method runTest [line 948]

void runTest( )

Make cookie and javascript tests. For this we need to send an html file to the client and later redirect to the same page.

NOTE: this method does EXIT at the end.




Tags:

access:  public




Documentation generated on Mon, 29 Dec 2003 21:08:06 +0100 by phpDocumentor 1.2.3