Fetching an external web page [message #171231] Wed, 29 December 2010 19:09
Mike
I want to grab a web page, it's response headers and parse the whole
thing. So, I want the response headers, meta tags, and some of the
content extracted. The way I'm doing it now takes three calls to the
remote server. How can I do it with just one call?

Here's my current code:

$htmlSource = file_get_contents($this->url);
$responseHeaders = get_headers($url, 1); // Possible to combine with get_contents?
$metaTags = get_meta_tags($url); // How could I use this with $htmlSource?

Thanks for the help!

Mike
Re: Fetching an external web page [message #171251 is a reply to message #171231] Wed, 29 December 2010 21:08
Captain Paralytic
On Dec 29, 7:09 pm, Mike <mpea...@gmail.com> wrote:
> I want to grab a web page, it's response headers and parse the whole
> thing.  So, I want the response headers, meta tags, and some of the
> content extracted.  The way I'm doing it now takes three calls to the
> remote server.  How can I do it with just one call?
>
> Here's my current code:
>
>                 $htmlSource = file_get_contents($this->url);
>                 $responseHeaders = get_headers($url, 1); // Possible to combine with
> get_contents?
>                 $metaTags = get_meta_tags($url); // How could I use this with
> $htmlSource?
>
> Thanks for the help!
>
> Mike

use curl (and it's "its" not "it's")
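
A minimal sketch of what the cURL route could look like, fetching the response headers and the body in a single request. The function names fetch_page() and split_response() are made up for illustration; the sketch assumes the cURL extension is installed and does not follow redirects.

```php
<?php
// Hypothetical helper: cut a raw cURL response into its header block and body.
// $headerSize is the byte length of the header block as reported by cURL.
function split_response(string $raw, int $headerSize): array
{
    return [
        'headers' => substr($raw, 0, $headerSize),
        'body'    => (string) substr($raw, $headerSize),
    ];
}

// Hypothetical helper: one request to the remote server, headers and body together.
function fetch_page(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
    curl_setopt($ch, CURLOPT_HEADER, true);         // prepend the response headers to the output

    $raw = curl_exec($ch);
    if ($raw === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    // cURL reports how many bytes of the output belong to the headers.
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);

    return split_response($raw, $headerSize);
}
```

Calling fetch_page($url) once would then give both $result['headers'] and $result['body'], replacing the separate file_get_contents() and get_headers() calls.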
Re: Fetching an external web page [message #171296 is a reply to message #171251] Thu, 30 December 2010 03:35
richard
On Wed, 29 Dec 2010 13:08:35 -0800 (PST), Captain Paralytic wrote:

> On Dec 29, 7:09 pm, Mike <mpea...@gmail.com> wrote:
>> I want to grab a web page, it's response headers and parse the whole
>> thing.  So, I want the response headers, meta tags, and some of the
>> content extracted.  The way I'm doing it now takes three calls to the
>> remote server.  How can I do it with just one call?
>>
>> Here's my current code:
>>
>>                 $htmlSource = file_get_contents($this->url);
>>                 $responseHeaders = get_headers($url, 1); // Possible to combine with
>> get_contents?
>>                 $metaTags = get_meta_tags($url); // How could I use this with
>> $htmlSource?
>>
>> Thanks for the help!
>>
>> Mike
>
> use curl (and it's "its" not "it's")

See what I mean? Bashed for a minor spelling mistake.
Re: Fetching an external web page [message #171326 is a reply to message #171296] Thu, 30 December 2010 11:25
Captain Paralytic
On Dec 30, 3:35 am, richard <mem...@newsguy.com> wrote:
> On Wed, 29 Dec 2010 13:08:35 -0800 (PST), Captain Paralytic wrote:
>> On Dec 29, 7:09 pm, Mike <mpea...@gmail.com> wrote:
>>> I want to grab a web page, it's response headers and parse the whole
>>> thing.  So, I want the response headers, meta tags, and some of the
>>> content extracted.  The way I'm doing it now takes three calls to the
>>> remote server.  How can I do it with just one call?
>
>>> Here's my current code:
>
>>>                 $htmlSource = file_get_contents($this->url);
>>>                 $responseHeaders = get_headers($url, 1); // Possible to combine with
>>> get_contents?
>>>                 $metaTags = get_meta_tags($url); // How could I use this with
>>> $htmlSource?
>
>>> Thanks for the help!
>
>>> Mike
>
>> use curl (and it's "its" not "it's")
>
> See what I mean? Bashed for a minor spelling mistake.

Bashed?
In programming, syntax and semantics are extremely important. I merely
pointed out an error to help him get things right in the future.

Most people learn by their mistakes, but they can only do so if they
are informed that a mistake was made in the first place.

You are an exception, you just keep on making mistakes...
Re: Fetching an external web page [message #171333 is a reply to message #171296] Thu, 30 December 2010 13:08
Jerry Stuckle
On 12/29/2010 10:35 PM, richard wrote:
> On Wed, 29 Dec 2010 13:08:35 -0800 (PST), Captain Paralytic wrote:
>
>> On Dec 29, 7:09 pm, Mike<mpea...@gmail.com> wrote:
>>> I want to grab a web page, it's response headers and parse the whole
>>> thing. So, I want the response headers, meta tags, and some of the
>>> content extracted. The way I'm doing it now takes three calls to the
>>> remote server. How can I do it with just one call?
>>>
>>> Here's my current code:
>>>
>>> $htmlSource = file_get_contents($this->url);
>>> $responseHeaders = get_headers($url, 1); // Possible to combine with
>>> get_contents?
>>> $metaTags = get_meta_tags($url); // How could I use this with
>>> $htmlSource?
>>>
>>> Thanks for the help!
>>>
>>> Mike
>>
>> use curl (and it's "its" not "it's")
>
> See what I mean? Bashed for a minor spelling mistake.

Get over it, Richard. Paul has corrected me on the very same mistake
and I know better. But I didn't take it personally.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex(at)attglobal(dot)net
==================
Re: Fetching an external web page [message #171372 is a reply to message #171326] Thu, 30 December 2010 23:10
Mike
Captain, thanks for the advice. I checked out cURL and agree, it
looks like a better (though more complicated) approach.

Thanks, also, for pointing out the grammatical error ("it's" vs
"its"). I do know better.

BTW, isn't "a mistake was made" passive voice?

:)

All the best,

Mike
Re: Fetching an external web page [message #171407 is a reply to message #171231] Sat, 01 January 2011 23:59
dog_cow
Mike wrote:
>
> How can I do it with just one call?

Open a socket with fsockopen() to port 80. Then use fputs() to send an HTTP
request. Use fgets() to read the response. Finally, use fclose() to destroy
the socket. This response will include everything that you have asked about,
which you must then parse on your own.

Here is a little example:


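<?php

// Open a TCP connection to the web server (10 second timeout); stop if it fails.
$sock = fsockopen('php.net', 80, $errno, $errstr, 10);
if ($sock === false)
{
    die("Connection failed: $errstr ($errno)\n");
}

// Send the request as one string. Each header must sit on its own CRLF-terminated line.
fputs($sock, "GET / HTTP/1.1\r\nHost: php.net\r\nUser-Agent: Firefox\r\nConnection: close\r\n\r\n");

// Read the response (status line, headers, blank line, body) until the server closes.
while (($buf = fgets($sock, 4096)) !== false)
{
    echo $buf;
}

fclose($sock);

?>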
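
The "parse on your own" step could be sketched roughly as follows, assuming the whole response has been collected into one string instead of echoed. parse_http_response() is a hypothetical name, not a built-in. The sketch keeps only the last value of a repeated header and does not decode a chunked body, which an HTTP/1.1 server may send.

```php
<?php
// Hypothetical helper: split a raw HTTP response into status line, headers, and body.
function parse_http_response(string $raw): array
{
    // The header block ends at the first blank line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);

    $headerLines = explode("\r\n", $parts[0]);
    $statusLine  = array_shift($headerLines); // e.g. "HTTP/1.1 200 OK"

    $headers = [];
    foreach ($headerLines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip anything that is not a "Name: value" line
        }
        list($name, $value) = explode(':', $line, 2);
        // Header names are case-insensitive, so normalise them to lower case.
        $headers[strtolower(trim($name))] = trim($value);
    }

    return [
        'status'  => $statusLine,
        'headers' => $headers,
        'body'    => isset($parts[1]) ? $parts[1] : '',
    ];
}
```

To use it with the example above, append each $buf to a string inside the loop rather than echoing it, then pass that string to the function after fclose().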

--
Mac GUI Vault - A source for retro Apple II and
Macintosh computing.
http://macgui.com/vault/
Re: Fetching an external web page [message #171410 is a reply to message #171372] Sun, 02 January 2011 13:09
Captain Paralytic
On Dec 30 2010, 11:10 pm, Mike <mpea...@gmail.com> wrote:
> Captain, thanks for the advice.  I checked out cURL and agree, it
> looks like a better (though more complicated) approach.
Slightly more complicated maybe, but it does enable you to interact in
a more logical way with web sites.

Also, when you start to have more complicated interactions with web
sites, the availability of all the functionality of curl makes things
easier (IMO).
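
Once the page source is in a string from that single request, the meta tags from the original question can be read from it locally, with no further call to the remote server. A rough sketch, assuming the DOM extension is available; meta_tags_from_html() is a made-up name, and unlike get_meta_tags() it only lower-cases the tag names.

```php
<?php
// Hypothetical helper: collect <meta name="..." content="..."> pairs from an HTML string.
function meta_tags_from_html(string $html): array
{
    if ($html === '') {
        return []; // DOMDocument::loadHTML() rejects an empty string
    }

    // Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $tags = [];
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $name = strtolower($meta->getAttribute('name'));
        if ($name !== '') {
            $tags[$name] = $meta->getAttribute('content');
        }
    }

    return $tags;
}
```

Passing it the body of the fetched page would give an array keyed by tag name, for example $tags['description'].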
Re: Fetching an external web page [message #171482 is a reply to message #171407] Wed, 05 January 2011 14:41
Mike
Finnigan,

Thanks for the reply and code sample... very helpful.

Mike
Re: Fetching an external web page [message #171483 is a reply to message #171410] Wed, 05 January 2011 14:46
Mike
Do you know offhand... is there a performance hit (or advantage) when
using curl (vs. fsock)?

Mike
Re: Fetching an external web page [message #171484 is a reply to message #171483] Wed, 05 January 2011 15:01
Captain Paralytic
On Jan 5, 2:46 pm, Mike <mpea...@gmail.com> wrote:
> Do you know offhand... is there a performance hit (or advantage) when
> using curl (vs. fsock)?
>
> Mike

No, I'm afraid I don't. I only start worrying about that if there is a
performance problem.

Let's face it, if you are really worried about performance, why are
you using an interpreted language like PHP? You should be using C.
Re: Fetching an external web page [message #171486 is a reply to message #171483] Wed, 05 January 2011 19:48
The Natural Philosoph
Mike wrote:
> Do you know offhand... is there a performance hit (or advantage) when
> using curl (vs. fsock)?
>
> Mike
Not that you are likely to notice.

Curl is fsock in a wrapper, probably, anyway.