fgetcsv -- No error reporting? [message #172810] Thu, 03 March 2011 16:30
matt[1]
Hi all,

I'm trying to process a fairly large CSV file, and it's bombing out early on me. This quick test script demonstrates the problem:

# cat test.php
<?php

ini_set("display_errors", true);
error_reporting(E_ALL | E_STRICT);

$file = "my.csv";
$fHandle = fopen($file, "r");

$rowNum = 0;
while (fgetcsv($fHandle)) ++$rowNum;

printf ("Lines: %d\nLastRow: %d\n", count(file($file)), $rowNum);

# php test.php
Lines: 329360
LastRow: 328141


There are no multi-line entries in the file, so it seems to be legitimately returning false for some reason about 1200 lines early. A visual inspection of the file around line 328,141 doesn't reveal any errors, and no errors are being triggered from PHP/fgetcsv.

Any ideas on how to diagnose what's going on here?

Thanks,
Matt
Re: fgetcsv -- No error reporting? [message #172811 is a reply to message #172810] Thu, 03 March 2011 17:09
alvaro.NOSPAMTHANX
On 03/03/2011 17:30, matt wrote:
> Any ideas on how to diagnose what's going on here?


Since a record can legitimately span more than one line, you can't
just load the file into an editor and go to line X. I'm not sure
exactly how fgetcsv() works, but it's possible that calling
ftell($handle) lets you keep track of the file position where each
loop iteration starts reading. You can then fseek() and fread() to
print the file fragment for manual inspection.

(I suppose you've already thought of using var_dump() to print/log
the output of successful calls and identify the first broken record.)
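
A minimal sketch of the ftell()/fseek()/fread() idea, under a few assumptions: the function names recordSpans() and readFragment() are made up for illustration, and the delimiter, enclosure and escape arguments are simply fgetcsv()'s usual defaults written out explicitly. It records the byte offset and length of every record so that a suspicious one can be printed raw.

```php
<?php
// Hypothetical helper (not from the thread): for each record that fgetcsv()
// returns, remember where it started in the file and how many bytes it used.
function recordSpans(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $spans = [];
    while (true) {
        $start = ftell($handle);   // position before the record is read
        $row = fgetcsv($handle, 0, ',', '"', '\\');
        if ($row === false) {
            break;                 // end of file (or read error)
        }
        $spans[] = [$start, ftell($handle) - $start];
    }
    fclose($handle);
    return $spans;
}

// Hypothetical helper: return the raw bytes of one record for inspection.
function readFragment(string $path, int $offset, int $length): string
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    fseek($handle, $offset);
    $chunk = fread($handle, max(1, $length));
    fclose($handle);
    return $chunk === false ? '' : $chunk;
}
```

A record whose length is much larger than its neighbours is a good candidate to pass to readFragment() and look at by eye.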




--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My website of glossy humor: http://www.demogracia.com
--
Re: fgetcsv -- No error reporting? [message #172812 is a reply to message #172811] Thu, 03 March 2011 17:44
matt[1]
On Thursday, March 3, 2011 12:09:15 PM UTC-5, Álvaro G. Vicario wrote:
> Since a record can legitimately expand over more than one line, you
> can't just load the file into an editor and go to line X. I'm not sure
> about how fgetcsv() works but it's possible that calling ftell($handle)
> allows you to keep track of the file position where each loop starts
> reading from. You can then fseek() and fread() to print the file
> fragment for manual inspection.

No, I understand that. I made a faulty assumption that I had no multi-line data (more on that later). The last field of each record is a year, and a regex test showed that every line of the file did indeed end with /\,\d{4}/.
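
For reference, a line-ending check of that kind could look roughly like the sketch below. The function name linesNotEndingInYear() is hypothetical, and the pattern is anchored with `$` on the assumption that the test was meant to match the end of each physical line. Note that it only looks at how lines end, so it says nothing about quoting inside a line.

```php
<?php
// Hypothetical helper: list the 1-based numbers of physical lines that do
// NOT end with a comma followed by a four-digit year.
function linesNotEndingInYear(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read $path");
    }
    $bad = [];
    foreach ($lines as $index => $line) {
        // rtrim() drops a stray carriage return from CRLF files.
        if (preg_match('/,\d{4}$/', rtrim($line, "\r")) !== 1) {
            $bad[] = $index + 1;
        }
    }
    return $bad;
}
```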

> (I suppose that you already thought about using var_dump() to print/log
> the output of successful calls and identify the first broken record.)

Yes, I did--and got the data from the last line of the file as the last successful record!

Finally, I thought of stepping through with two file handles, one being read by fgets and one by fgetcsv, and doing a line-by-line comparison. The culprit turned out to be a number of unmatched double quotes throughout the file, causing fgetcsv to pull several records into single fields mid-document.
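
One possible way to do that two-handle comparison is sketched below. This is not necessarily the exact script used here: the function name firstDivergentLine() is made up, and it compares file positions rather than field contents, assuming every record is supposed to occupy exactly one physical line.

```php
<?php
// Hypothetical helper: advance one handle with fgets() and another with
// fgetcsv(), and return the 1-based physical line number at which the CSV
// reader stops lining up with the line reader. Returns null if every
// record occupies exactly one line.
function firstDivergentLine(string $path): ?int
{
    $lines = fopen($path, 'r');
    $records = fopen($path, 'r');
    if ($lines === false || $records === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $lineNo = 0;
    $result = null;
    while (fgets($lines) !== false) {
        ++$lineNo;
        $row = fgetcsv($records, 0, ',', '"', '\\');
        // If fgetcsv() swallowed more than this one line (for example
        // because of an unmatched double quote), the offsets differ.
        if ($row === false || ftell($lines) !== ftell($records)) {
            $result = $lineNo;
            break;
        }
    }
    fclose($lines);
    fclose($records);
    return $result;
}
```

Calling firstDivergentLine("my.csv") and opening the file at the reported line should point at the first record with a stray quote.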

I've forwarded the RFC to the guy who is sending me the CSV files :)

Thanks for your suggestions.
Re: fgetcsv -- No error reporting? [message #172815 is a reply to message #172812] Thu, 03 March 2011 19:44
Peter H. Coffin
On Thu, 3 Mar 2011 09:44:46 -0800 (PST), matt wrote:

> Finally, I thought of stepping through with two file handles, one
> being read by fgets and one by fgetcsv and doing a line-by-line
> comparison. Culprit turned out to be a number of unmatched double
> quotes through the file, causing fgetcsv to pull several records into
> single fields mid-document.
>
> I've forwarded the RFC to the guy who is sending me the CSV files :)

Best of luck with that. 99.5% of the CSV files I've ever dealt with were
created with stuff that was completely out of the control of the user.
Hell, 80% of them were from Excel alone.

--
40. I will be neither chivalrous nor sporting. If I have an unstoppable
superweapon, I will use it as early and as often as possible instead
of keeping it in reserve.
--Peter Anspach's list of things to do as an Evil Overlord
Re: fgetcsv -- No error reporting? [message #172816 is a reply to message #172815] Thu, 03 March 2011 20:18
matt[1]
On Thursday, March 3, 2011 2:44:56 PM UTC-5, Peter H. Coffin wrote:
> Best of luck with that. 99.5% of the CSV files I've ever dealt with were
> created with stuff that was completely out of the control of the user.
> Hell, 80% of them were from Excel alone.

I'm dealing with PeopleSoft over here...