blueshoes php application framework and cms            core_net

Class: Bs_HttpClient

Source Location: /core/net/http/Bs_HttpClient.class.php

Class Overview

Bs_Object
   |
   --Bs_NetApplication
      |
      --Bs_HttpClient

Can grab a website from the internet.


Version:

  • 4.3.$Revision: 1.4 $ $Date: 2003/11/21 17:06:29 $

Copyright:

  • blueshoes.org

Inherited Methods

Class: Bs_NetApplication

Bs_NetApplication::Bs_NetApplication()
Bs_NetApplication::connect()
Make the connection.
Bs_NetApplication::disconnect()
Close the connection if it was open.
Bs_NetApplication::_raiseError()
You have to override this method.

Class: Bs_Object

Bs_Object::Bs_Object()
Bs_Object::getErrors()
Basic error handling: Get *all* errors as string array from the global Bs_Error-error stack.
Bs_Object::getLastError()
Basic error handling: Get last error string from the global Bs_Error-error stack.
Bs_Object::getLastErrors()
Basic error handling: Get the last errors as a string array from the global Bs_Error-error stack since the last call of getLastErrors().
Bs_Object::persist()
Persists this object by serializing it and saving it to a file with unique name.
Bs_Object::setError()
Basic error handling: Push an error string on the global Bs_Error-error stack.
Bs_Object::toHtml()
Dumps the content of this object to a string using PHP's var_dump().
Bs_Object::toString()
Dumps the content of this object to a string using PHP's var_dump().
Bs_Object::unpersist()
Fetches an object that was persisted with persist()

Class Details

[line 158]
Can grab a website from the internet.

features:

  • submit http requests using GET, POST or HEAD to a webserver
  • POST with userdefined post vars
  • userdefined port, useragent, [protocol version]
  • userdefined request headers
  • can follow redirects
  • basic user authentication (not tested yet)
  • connect/disconnect for batch requests, or grab one by one
  • automatically try to reconnect/refetch once if the connection is broken

missing:

  • https ?
  • submit via proxy server (with authentication)
  • http file upload
  • support for servers with name-based virtual hosts (needs the Host request header, which is mandatory in HTTP/1.1)
  • remember/send session and persistent cookies




Tags:

copyright:  blueshoes.org
access:  public
pattern:  singleton: (pseudostatic)
author:  Andrej Arn <at blueshoes dot org>; some ideas, maybe snippets, from Manuel Lemos' http class
version:  4.3.$Revision: 1.4 $ $Date: 2003/11/21 17:06:29 $
todo:  

this is a multiline description:

  • problem: if you connect to a host, then grab a few files, then disconnect, only the 1st fetch will succeed. for the 2nd, 3rd ... you can send your request without errors, but there's no reply (eof or error) from the server => the connection is lost somehow?
  • reason: probably this is a windows-only problem. the manual for fsockopen() says the param "double timeout" is not available on all systems, but doesn't mention on which ones. if it's ignored, that would be the reason for my problem.
  • solution: the param $tryReconnect in fetchPage() helps. it reconnects once each time a connection is broken or lost, as in this problem.
  • comments: of course this is not satisfactory, because if you connect, fetch a lot, then close, the connection has to be reopened for each fetch request. but at least the coder doesn't need to worry.

hint: maybe you are just looking for: $file = fopen("http://www.php.net/", "r");
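
The reconnect-once workaround described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the class's actual code: fetchWithReconnect() and its two callables are hypothetical names, and the sketch uses current PHP syntax rather than the PHP 4 the class was written for.

```php
<?php
// Hypothetical sketch: retry a fetch once after reopening the connection.
// $fetch returns the page content, or NULL if the connection was lost.
// $reconnect reopens the connection and returns TRUE on success.
function fetchWithReconnect(callable $fetch, callable $reconnect, bool $tryReconnect = true): ?string
{
    $content = $fetch();
    if ($content !== null) {
        return $content;            // first attempt succeeded
    }
    if (!$tryReconnect || !$reconnect()) {
        return null;                // retry disabled, or reconnect failed
    }
    return $fetch();                // exactly one retry per call
}
```

Because the retry happens at most once per call, a permanently dead host still fails quickly instead of looping.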

if you're looking for information about the http protocol, try these, have fun :)

  • Hypertext Transfer Protocol HTTP/1.1: ftp://ftp.isi.edu/in-notes/rfc2616.txt (obsoletes 2068)
  • Hypertext Transfer Protocol HTTP/1.1: ftp://ftp.isi.edu/in-notes/rfc2068.txt
  • Hypertext Transfer Protocol HTTP/1.0: ftp://ftp.isi.edu/in-notes/rfc1945.txt
  • Upgrading to TLS Within HTTP/1.1: ftp://ftp.isi.edu/in-notes/rfc2817.txt (updates 2616)

dependencies: Net/Bs_NetApplication (which uses Net/Bs_SocketClient and Net/Bs_Url)



[ Top ]


Class Variables

$acceptCookies =  3

[line 301]

which cookies we do accept.

  • 0 = none (bool false)
  • 1 = session only
  • 2 = persistent only
  • 3 = all (session & persistent)
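
Assuming the values are 0 (none), 1 (session only), 2 (persistent only) and 3 (all), they can be read as two bit flags. The sketch below shows that reading; the constant names and isCookieAccepted() are hypothetical, and since the feature is still marked as a todo, the class itself may end up handling this differently.

```php
<?php
// Assumed flag values; the constant names are hypothetical.
const COOKIES_NONE       = 0;
const COOKIES_SESSION    = 1;
const COOKIES_PERSISTENT = 2;
const COOKIES_ALL        = 3; // COOKIES_SESSION | COOKIES_PERSISTENT

// A cookie with an expiry date is persistent; one without is a session cookie.
function isCookieAccepted(int $acceptCookies, bool $hasExpiry): bool
{
    $needed = $hasExpiry ? COOKIES_PERSISTENT : COOKIES_SESSION;
    return ($acceptCookies & $needed) !== 0;
}
```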




Tags:

todo:  implement this, add option to send user-defined cookies
see:  var $receivedCookies
access:  public

Type:   int


[ Top ]

$acceptType = array('*/*', 'image/gif', 'image/x-xbitmap', 'image/jpeg')

[line 226]

the accepted types. by default, the following types are accepted: '*/*', 'image/gif', 'image/x-xbitmap', 'image/jpeg'.
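
As a rough illustration, a list of accepted types is typically joined into a single Accept request header. The helper name buildAcceptHeader() is hypothetical, and whether the class builds the header exactly this way is an assumption. Note that in HTTP '*/*' matches every media type, so the remaining default entries do not narrow the set.

```php
<?php
// Join the accepted types into a single Accept request header line.
function buildAcceptHeader(array $acceptType): string
{
    return 'Accept: ' . implode(', ', $acceptType);
}
```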



Tags:

todo:  i think the first element here (*) makes it accept everything anyway...
access:  public

Type:   array


[ Top ]

$addHeaders =  NULL

[line 236]

Additional header array.

note: when using POST, you do not need to set 'Content-type' and 'Content-length' yourself.
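
A minimal sketch of how an associative header array can be turned into request header lines. It assumes the array maps header names to values; buildHeaderLines() is a hypothetical helper, not a method of this class.

```php
<?php
// Turn an associative header array (name => value) into raw request header lines.
function buildHeaderLines(array $addHeaders): string
{
    $lines = '';
    foreach ($addHeaders as $name => $value) {
        $lines .= $name . ': ' . $value . "\r\n";   // each header line ends with CRLF
    }
    return $lines;
}
```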




Tags:

var:  associative array.
see:  var $acceptType, var $postData
access:  public

Type:   array


[ Top ]

$closeConnection =  TRUE

[line 186]

by default most webservers keep the connection open when using 'HTTP/1.1'. with blocking sockets this makes php hang until the connection times out; for apache the default timeout is 15 seconds afaik.

thus we close the connection by default. this setting is only used if the http protocol version (see var $this->sendProtocolVersion) is higher than 'HTTP/1.0'.
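
In HTTP/1.1 a client asks the server to close the connection after the response by sending a "Connection: close" header. The sketch below shows the decision described above; connectionHeader() is a hypothetical helper, and that the class sends exactly this header is an assumption.

```php
<?php
// Decide whether to add "Connection: close" to a request.
// Assumption: the header is only added for protocol versions above HTTP/1.0.
function connectionHeader(string $sendProtocolVersion, bool $closeConnection): string
{
    // 'HTTP/1.1' -> '1.1'
    $version = substr($sendProtocolVersion, strlen('HTTP/'));
    if ($closeConnection && version_compare($version, '1.0', '>')) {
        return "Connection: close\r\n";
    }
    return '';
}
```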




Tags:

since:  bs4.3
access:  public

Type:   bool


[ Top ]

$followRedirect =  5

[line 268]

should we follow an http redirect response?

don't set this higher than 5, because more redirects than that usually mean an infinite loop (rfc recommendation).

note: this is only valid for GET and HEAD requests. for POST we never do an auto-redirect, the user/coder has to do it on demand (rfc).
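
The bounded redirect handling can be sketched like this. It is a stand-alone simulation under stated assumptions: followRedirects() is a hypothetical function, the $redirects map stands in for real server responses, and the history is assumed to hold the urls that were redirected away from.

```php
<?php
// Hypothetical sketch: resolve redirects with an upper bound.
// $redirects maps a url to the url it redirects to (stand-in for real responses).
// Returns array(final url, number of redirects followed, history of urls left behind).
function followRedirects(string $url, array $redirects, string $method = 'GET', int $followRedirect = 5): array
{
    $numFollowed = 0;
    $history     = array();
    // POST is never redirected automatically.
    $mayFollow = ($method === 'GET' || $method === 'HEAD');
    while ($mayFollow && isset($redirects[$url]) && $numFollowed < $followRedirect) {
        $history[] = $url;
        $url       = $redirects[$url];
        $numFollowed++;
    }
    return array($url, $numFollowed, $history);
}
```

With a limit of 5, two urls that redirect to each other stop after five hops instead of looping forever.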




Tags:


Type:   int


[ Top ]

$headerParsed =  NULL

[line 402]

The parsed header information.



Tags:

var:  (associative)
see:  vars $parseHeader, Bs_HttpClient::$headerRaw

Type:   array


[ Top ]

$headerRaw =  NULL

[line 393]

The raw header information.



Tags:


Type:   array


[ Top ]

$method =  'GET'

[line 204]

The request method, one of 'GET' (default), 'POST' or 'HEAD'.



Tags:

access:  public

Type:   string


[ Top ]

$numFollowed =  0

[line 278]

the number of times an http redirect has been followed for the current request.

read only.




Tags:


Type:   int


[ Top ]

$parseHeader =  FALSE

[line 384]

Whether the header should be parsed on a call to fetchPage().



Tags:

see:  vars $headerParsed, Bs_HttpClient::$headerRaw
access:  public

Type:   bool


[ Top ]

$port =  80

[line 163]

overrides the default value inherited from the parent class.


Type:   mixed


[ Top ]

$postData =  NULL

[line 247]

the vars we send to the server when using POST.

a string like 'key=value&key2=value2' or an associative array like array('firstName'=>'Mike', 'lastName'=>'Smith').

note: this is ignored if the method is not POST.
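
A minimal sketch of how either form of post data can be turned into a request body plus the two headers a POST needs. buildPostRequestParts() is a hypothetical helper, and the content type application/x-www-form-urlencoded is an assumption about what the class sends for form data.

```php
<?php
// Build the headers and the body for a POST request.
// $postData is either a ready-made string or an associative array.
function buildPostRequestParts($postData): array
{
    // http_build_query() url-encodes the keys and values and joins them with '&'.
    $body = is_array($postData) ? http_build_query($postData) : (string) $postData;
    $headers = "Content-type: application/x-www-form-urlencoded\r\n"
             . 'Content-length: ' . strlen($body) . "\r\n";
    return array($headers, $body);
}
```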




Tags:

var:  a string or an associative array
see:  Bs_HttpClient::$addHeaders
access:  public

Type:   mixed


[ Top ]

$receivedCookies =  NULL

[line 311]

the cookies we got from the server.



Tags:

todo:  implement this, add option to send them again, and to send user-defined cookies
see:  var $acceptCookies
access:  public

Type:   array


[ Top ]

$receivedProtocolVersion =  ''

[line 196]

The http protocol version we received from the server.

it's common to send 1.0 and receive 1.1.
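
The protocol version and the response code both come from the first line of the reply, the status line. A rough sketch of splitting it, with parseStatusLine() and its array keys as hypothetical names:

```php
<?php
// Split a status line such as "HTTP/1.1 200 OK" into version, code and reason.
// Returns NULL if the line is not a valid status line.
function parseStatusLine(string $line): ?array
{
    if (!preg_match('#^(HTTP/\d+\.\d+)\s+(\d{3})(?:\s+(.*))?$#', trim($line), $m)) {
        return null;
    }
    return array(
        'protocolVersion' => $m[1],
        'responseCode'    => (int) $m[2],
        'reason'          => isset($m[3]) ? $m[3] : '',
    );
}
```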




Tags:


Type:   string


[ Top ]

$redirectHistory =  NULL

[line 287]

if we had to follow redirects, this zerobased array contains the urls.



Tags:


Type:   array


[ Top ]

$responseCode =  NULL

[line 375]

the http response code we got from the server. read only.

 Status Code Definitions
 +----------------------------------------+-------------------------------------+
 |HTTP/1.0                                | HTTP/1.1                            |
 +----------------------------------------+-------------------------------------+
 | Informational 1xx                                                            |
 |   should never be received from a      | 100 Continue                        |
 |   HTTP/1.0 reply.                      | 101 Switching Protocols             |
 | Successful 2xx                                                               |
 |   200 OK                               | 200 OK                              |
 |   201 Created                          | 201 Created                         |
 |   202 Accepted                         | 202 Accepted                        |
 |                                        | 203 Non-Authoritative Information   |
 |   204 No Content                       | 204 No Content                      |
 |                                        | 205 Reset Content                   |
 |                                        | 206 Partial Content                 |
 | Redirection 3xx                                                              |
 |   300 Multiple Choices                 | 300 Multiple Choices                |
 |   301 Moved Permanently                | 301 Moved Permanently               |
 |   302 Moved Temporarily                | 302 Found                           | <= !!! meaning changed !!!
 |                                        | 303 See Other                       |
 |   304 Not Modified                     | 304 Not Modified                    |
 |                                        | 305 Use Proxy                       |
 |                                        | 306 (Unused)                        |
 |                                        | 307 Temporary Redirect              |
 | Client Error 4xx                                                             |
 |   400 Bad Request                      | 400 Bad Request                     |
 |   401 Unauthorized                     | 401 Unauthorized                    |
 |                                        | 402 Payment Required                |
 |   403 Forbidden                        | 403 Forbidden                       |
 |   404 Not Found                        | 404 Not Found                       |
 |                                        | 405 Method Not Allowed              |
 |                                        | 406 Not Acceptable                  |
 |                                        | 407 Proxy Authentication Required   |
 |                                        | 408 Request Timeout                 |
 |                                        | 409 Conflict                        |
 |                                        | 410 Gone                            |
 |                                        | 411 Length Required                 |
 |                                        | 412 Precondition Failed             |
 |                                        | 413 Request Entity Too Large        |
 |                                        | 414 Request-URI Too Long            |
 |                                        | 415 Unsupported Media Type          |
 |                                        | 416 Requested Range Not Satisfiable |
 |                                        | 417 Expectation Failed              |
 | Server Error 5xx                                                             |
 |   500 Internal Server Error            | 500 Internal Server Error           |
 |   501 Not Implemented                  | 501 Not Implemented                 |
 |   502 Bad Gateway                      | 502 Bad Gateway                     |
 |   503 Service Unavailable              | 503 Service Unavailable             |
 |                                        | 504 Gateway Timeout                 |
 |                                        | 505 HTTP Version Not Supported      |
 +----------------------------------------+-------------------------------------+
 




Tags:

see:  $this->responseCodeInfo()
access:  public

Type:   int


[ Top ]

$sendProtocolVersion =  'HTTP/1.0'

[line 173]

The http protocol version that gets sent to the webserver.

note: when you use 'HTTP/1.1', have a look at the var $this->closeConnection.




Tags:


Type:   string


[ Top ]

$stopWatch =

[line 411]

instance of Bs_StopWatch.

if set then this class will take the time at some points.




Tags:

since:  bs4.3
access:  public

Type:   object


[ Top ]

$userAgent =  'BlueShoes Walker 4.5'

[line 214]

The user agent that gets sent to the webserver.

it can be important to set this correctly, because more and more sites send different content to different browsers, based on the language for example.




Tags:

access:  public

Type:   string


[ Top ]



Class Methods


constructor Bs_HttpClient [line 417]

Bs_HttpClient Bs_HttpClient( )

constructor.



[ Top ]

method fetchPage [line 455]

string fetchPage( string $path, [string $host = NULL], [int $port = NULL], [mixed $postData = NULL], [string $method = NULL], [bool $tryReconnect = TRUE])

Grab a webpage.

example I: grab the yahoo frontpage

    $content = $myHttpClient->fetchPage('/', 'www.yahoo.com');

example II: submit a form (the third param $port is passed as NULL so that the object var is used)

    $postData = array('firstName'=>'Mike', 'lastName'=>'Smith');
    $content  = $myHttpClient->fetchPage('/cgi-bin/form.cgi', 'signup.yourdomain.com', NULL, $postData, 'POST', TRUE);

if you don't specify one of the optional params or set it to NULL, the corresponding var of this object is used. the object vars are not updated when calling this method. when using connect() they are.

special case: for GET and HEAD requests, you can give a fully qualified url as the first param $path. the params $host and any port setting will be ignored then. if the url doesn't have a port, 80 is assumed; the object var is ignored!

example III:

    $content = $myHttpClient->fetchPage('http://www.yahoo.com/');
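
The special case of a fully qualified url can be sketched with PHP's parse_url(). splitUrl() is a hypothetical helper shown only to illustrate the defaulting rule (port 80 when the url has none); the class may parse urls differently, for example through Bs_Url.

```php
<?php
// Split a fully qualified url into host, port and path for a request.
// The port defaults to 80 and the path to '/' when the url omits them.
function splitUrl(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        throw new InvalidArgumentException('not a fully qualified url: ' . $url);
    }
    $path = isset($parts['path']) ? $parts['path'] : '/';
    if (isset($parts['query'])) {
        $path .= '?' . $parts['query'];     // keep the query string in the request path
    }
    return array(
        'host' => $parts['host'],
        'port' => isset($parts['port']) ? $parts['port'] : 80,
        'path' => $path,
    );
}
```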

if you're on a previously opened connection and it timed out, or something else went wrong, this method will try to reconnect if the param $tryReconnect is true. this is only done once per call.




Tags:

return:  the content of the page
throws:  bs_exception
access:  public


Parameters:

string   $path   like '/dir/file.html' or a fully qualified url, see examples.
string   $host   like 'your.server.com'.
int   $port   the tcp port number.
mixed   $postData   a string like 'key=value&key2=value2' or a hash.
string   $method   'GET', 'POST' or 'HEAD'
bool   $tryReconnect   if we should reconnect once on a comm error. default is true. this is used internally, but feel free to set it.

[ Top ]

method getHeaderValue [line 657]

string getHeaderValue( string $key)

Returns the received header value for a given key.



Tags:

return:  NULL if the key/value pair was not sent by the server, or if no header was sent at all.
access:  public


Parameters:

string   $key   the key for which you want the value. comparison is made lowercase since we store all keys in lowercase.
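
The lowercase key comparison can be sketched as follows. Both functions are hypothetical stand-ins, and the sketch assumes the raw header is available as an array with one header line per element.

```php
<?php
// Parse raw response header lines into an associative array with lowercase keys.
function parseHeaderLines(array $headerRaw): array
{
    $parsed = array();
    foreach ($headerRaw as $line) {
        $pos = strpos($line, ':');
        if ($pos === false) {
            continue;               // no "name: value" pair, e.g. the status line
        }
        $key          = strtolower(trim(substr($line, 0, $pos)));
        $parsed[$key] = trim(substr($line, $pos + 1));
    }
    return $parsed;
}

// Look up a header value; the key is lowercased before the comparison.
function headerValue(array $headerParsed, string $key): ?string
{
    $key = strtolower($key);
    return isset($headerParsed[$key]) ? $headerParsed[$key] : null;
}
```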

[ Top ]

method getUserAgent [line 968]

void getUserAgent( [mixed $os = 'win'], [mixed $client = 'ie'], [mixed $ver = '5.5'], [mixed $lang = 'en'])



Tags:

todo:  all


[ Top ]

method randomUserAgent [line 988]

void randomUserAgent( )



Tags:

todo:  all


[ Top ]

method responseCodeInfo [line 693]

array responseCodeInfo( [int $code = NULL], [string $protocol = 'HTTP/1.1'])

if you specify the param $code, the method can be used statically. otherwise the current response code is used together with the received protocol version.

the long text descriptions are always taken from HTTP/1.1.




Tags:

return:  (vector) with 3 elements, 0 = code, 1 = caption, 2 = description.
todo:  finish the descriptions for all codes, or decide to kick them out.


Parameters:

int   $code   a response code aka status code you got from the server.
string   $protocol   one of 'HTTP/1.1' (default), 'HTTP/1.0'
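
A rough sketch of a lookup that returns the documented three-element vector. responseCodeInfoSketch() is a hypothetical stand-in that lists only a few codes and leaves the descriptions empty; it is not the class's implementation.

```php
<?php
// Hypothetical lookup returning array(code, caption, description).
// Only a few codes are listed; descriptions are left empty in this sketch.
function responseCodeInfoSketch(int $code, string $protocol = 'HTTP/1.1'): array
{
    $captions = array(
        200 => 'OK',
        301 => 'Moved Permanently',
        302 => 'Found',
        404 => 'Not Found',
        500 => 'Internal Server Error',
    );
    // The caption of 302 differs between the two protocol versions.
    if ($code === 302 && $protocol === 'HTTP/1.0') {
        return array($code, 'Moved Temporarily', '');
    }
    $caption = isset($captions[$code]) ? $captions[$code] : 'Unknown';
    return array($code, $caption, '');
}
```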

[ Top ]

method setAuthenticationBasic [line 674]

void setAuthenticationBasic( [string $user = ''], [string $pass = ''])

Set user/pass for a basic authentication with HTTP/1.0.

if one or both params $user/$pass are empty '', a currently used basic authentication is dropped.
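
For reference, HTTP basic authentication sends the user and password as a base64-encoded "user:pass" pair in an Authorization header. The sketch below shows that encoding; basicAuthHeader() is a hypothetical helper, and returning an empty string for the "dropped" case is an assumption made for illustration.

```php
<?php
// Build the Authorization header for HTTP basic authentication.
// An empty user or password yields no header, mirroring the "dropped" case.
function basicAuthHeader(string $user = '', string $pass = ''): string
{
    if ($user === '' || $pass === '') {
        return '';
    }
    return 'Authorization: Basic ' . base64_encode($user . ':' . $pass) . "\r\n";
}
```

Base64 is an encoding, not encryption, so the credentials are readable by anyone who can see the request.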




Tags:

access:  public


Parameters:

string   $user  
string   $pass  

[ Top ]

method _readCookie [line 961]

void _readCookie( )



Tags:

todo:  all


[ Top ]


Documentation generated on Mon, 29 Dec 2003 21:11:12 +0100 by phpDocumentor 1.2.3