Saturday, February 21, 2009

PHP: how to make amendments to a CSV file

in my third article for today i want to write about a problem i had with a massive csv file the other night:

i got this big CSV file which contains over 3300 lines of entries. those entries include a date, but i can't use that date for sorting. first of all it's written among a string of other details; it's written as 20 February 2000 or February 20, 2000 or February 20 2000. that's 3 different ways. i decided that i needed another column in my CSV file where i'd store the date as a sortable value. i found that a leading id-number was followed by 3 whitespaces, and that those 3 whitespaces were otherwise not used in the document. so i decided to open my CSV file with WordPad and change the 3 whitespaces to 2 pipes thus: ||. inbetween those pipes i would store the date value i want to use for sorting. i spent about 2 nights entering the date by copying and pasting it right there when on the second night my screen got stuck again and i had to re-start my computer. i was just paging down the document when it happened; and i thought that i had just saved the document.

after re-starting my laptop i found out that the document had been in the process of saving when the screen got stuck. only a third of this document was still available to me. of course i had some kind of a backup copy, but that was the copy without the pipes. now i had to decide: do i start from scratch or am i going to look around for a script that could maybe do it with PHP which i wanted to do all the while anyway?.

i was a bit upset about all this since it was such a boring job to do and i knew i would be glad once i'd reached the end of the file.

alright, let me search the internet for some kind of a solution that may come close to what i want to achieve. now, what am i going to search for? how about php amend csv file?

in a german forum i found someone who tried to search for results in a CSV document: http://forum.de.selfhtml.org/archiv/2008/5/t171322/.

i started playing with that one. the person couldn't get his script to work.

here's my version of the script:

<pre style="font-family:'Courier New', Courier; font-size:11px; color:blue;">
<?php
define('SEARCH', '1961');
$s = (isset($_GET['s'])) ? $_GET['s'] : SEARCH;
$csv_file_open = fopen("worksheet.csv", "r");
$row = 1;
while ($data = fgetcsv ($csv_file_open, 1000, "|")) {
if (stristr ( $data[2], $s )) {
echo $data[0] . "|" . $s . "-";
echo $data[1] . "|";
echo $data[2] . "\n";
}
$row++;
}
fclose($csv_file_open);
?>
</pre>


this code snippet lets me search my document for the date 1961 (as a default value) and squeezes it between the id number and the other string devided by the pipe symbol, provided it finds the date value i've given. if i want another search result i could add ?s=1980, for instance, as a url parameter after the file name. the problem is, if i search for New York City, it would enter New York City, where found, between those 2 pipe symbols ... hm ... not exactly what i had in mind.

i gave the idea some more thought and came up with a solution that uses $y as a variable for the year, $m as a variable for the month and $d as a variable for the day. hm, that looks pretty good for starters.

as usual, i made a copy of my first example for keepsake and decided to work with a new php file i named csv-add-date.php.

i kept some of the old code, but for the moment i wouldn't need the $s variable.

here's my code in the csv-add-date.php file:

<pre style="font-family:'Courier New', Courier; font-size:11px; color:blue;">
<?php
define('SEARCH', '1961');
$s = (isset($_GET['s'])) ? $_GET['s'] : SEARCH;
$csv_file_open = fopen("clean_sheet.csv", "r");
$row = 1;
while ($data = fgetcsv ($csv_file_open, 1000, "|")) {
include ( 'csv-calculate-year.php' );
include ( 'csv-calculate-01-jan-day.php' );
include ( 'csv-calculate-02-feb-day.php' );
include ( 'csv-calculate-03-mar-day.php' );
include ( 'csv-calculate-04-apr-day.php' );
include ( 'csv-calculate-05-may-day.php' );
include ( 'csv-calculate-06-jun-day.php' );
include ( 'csv-calculate-07-jul-day.php' );
include ( 'csv-calculate-08-aug-day.php' );
include ( 'csv-calculate-09-sep-day.php' );
include ( 'csv-calculate-10-oct-day.php' );
include ( 'csv-calculate-11-nov-day.php' );
include ( 'csv-calculate-12-dec-day.php' );
echo $data[0] . "|" . $y . "-" . $m . "-" . $d;
echo $data[1] . "|";
echo $data[2] . "\n";
$row++;
}
fclose($csv_file_open);
?>
</pre>


i'm also using a new CSV file which i named clean_sheet.csv.

... and here is an excerpt from my csv-calculate-year.php:

<?php
if (stristr ( $data[2], '1958' )) { $y = '1958'; }
elseif (stristr ( $data[2], '1960' )) { $y = '1960'; }
elseif (stristr ( $data[2], '1961' )) { $y = '1961'; }
elseif (stristr ( $data[2], '1962' )) { $y = '1962'; }
elseif (stristr ( $data[2], '1963' )) { $y = '1963'; }
elseif (stristr ( $data[2], '1964' )) { $y = '1964'; }
elseif (stristr ( $data[2], '1965' )) { $y = '1965'; }
elseif (stristr ( $data[2], '1966' )) { $y = '1966'; }
elseif (stristr ( $data[2], '1967' )) { $y = '1967'; }
elseif (stristr ( $data[2], '1968' )) { $y = '1968'; }
elseif (stristr ( $data[2], '1969' )) { $y = '1969'; }
elseif (stristr ( $data[2], '1970' )) { $y = '1970'; }
...
?>


another example from one of the files that calculate the month and day:

<?php
if ( (stristr ( $data[2], ' 1 January ' )) || (stristr ( $data[2], ' January 1 ' )) || (stristr ( $data[2], ' January 1,' )) ) { $m = '01'; $d = '01'; }
elseif ( (stristr ( $data[2], ' 2 January ' )) || (stristr ( $data[2], ' January 2 ' )) || (stristr ( $data[2], ' January 2,' )) ) { $m = '01'; $d = '02'; }
elseif ( (stristr ( $data[2], ' 3 January ' )) || (stristr ( $data[2], ' January 3 ' )) || (stristr ( $data[2], ' January 3,' )) ) { $m = '01'; $d = '03'; }
elseif ( (stristr ( $data[2], ' 4 January ' )) || (stristr ( $data[2], ' January 4 ' )) || (stristr ( $data[2], ' January 4,' )) ) { $m = '01'; $d = '04'; }
elseif ( (stristr ( $data[2], ' 5 January ' )) || (stristr ( $data[2], ' January 5 ' )) || (stristr ( $data[2], ' January 5,' )) ) { $m = '01'; $d = '05'; }
elseif ( (stristr ( $data[2], ' 6 January ' )) || (stristr ( $data[2], ' January 6 ' )) || (stristr ( $data[2], ' January 6,' )) ) { $m = '01'; $d = '06'; }
elseif ( (stristr ( $data[2], ' 7 January ' )) || (stristr ( $data[2], ' January 7 ' )) || (stristr ( $data[2], ' January 7,' )) ) { $m = '01'; $d = '07'; }
elseif ( (stristr ( $data[2], ' 8 January ' )) || (stristr ( $data[2], ' January 8 ' )) || (stristr ( $data[2], ' January 8,' )) ) { $m = '01'; $d = '08'; }
elseif ( (stristr ( $data[2], ' 9 January ' )) || (stristr ( $data[2], ' January 9 ' )) || (stristr ( $data[2], ' January 9,' )) ) { $m = '01'; $d = '09'; }
elseif ( (stristr ( $data[2], ' 10 January ' )) || (stristr ( $data[2], ' January 10 ' )) || (stristr ( $data[2], ' January 10,' )) ) { $m = '01'; $d = '10'; }
...
?>


that last code is included in 12 files one for each month. in each one i had to replace the name of the month, say January with March for instance; and $m = '01 ought to be replaced with $m = '02 in the case of February.

those year and month/day files are included in the csv-add-date.php file using PHP's include function.

this csv-add-date.php file basically loads the clean_sheet.csv file and adds the year, month and day before it displays the content on the screen. it does not write the content to the clean_sheet.csv file. i think i'm going to get that done in another step to another new CSV file ... in case something should go wrong again.

now i can save myself a lot of time ... and boredom ... and repeat the process if necessary. this exercise has also taught me how to search through CSV files and display the results. that may be the topic of another article ...

i hope that this has been of some help to PHP coders.

Good Luck!

tom.paine


PHP: how to delete the last line of a file?

i'm a little friend of MySQL, but a big friend of writing content to files, text files or csv (comma-separated-value) files. one reason is that i haven't really worked that much with MySQL. i just know the basics and have usually worked with them when setting up a forum or webshop. those scripts are readily available all across the internet.

i was going to set up a table using either XML or PHP's array function with a script that would write either the XML file, or a PHP fie that i could include in another script which in turn could be used to create an XML file or simply HTML.

i looked at various examples and tutorials on the internet on how to manipulate XML files with PHP, but i wasn't too happy about where the new elements would be placed in the XML document eventually.

of course one of the problems would always be: how would i squeeze a new set of elements before the closing root element; or how would i get rid of the ?> at the end of the php file i wish to include somewhere else.

the basic idea i found here: http://bytes.com/groups/php/11847-how-delete-last-line-file.

below is the code snippet from that page that got me going:

$inp = file('yourfile.name');
$out = fopen('yourfile.name','w');
for ($i=0;$i<count($inp)-1;$i++)
fwrite($out,$inp[$i]);
fclose($out);


in this particular case the information of one file is written to another file. (not 100% what i was looking for).

in the end i started using this example:

$my_input = '$my_'.$_GET['my_number'].' = array ( $my_date => \''.$_GET['my_date'].'\', $my_city => \''.$_GET['my_city'].'\', $my_links => \''.$_GET['my_links'].'\', \'$my_update => \''.$_GET['my_update'].'\', );'."\n";
// load the data and delete the line from the array
$lines = file('#_MyIncludeFile.inc.php');
$last = sizeof($lines)-1;
unset($lines[$last]);

// write the new data to the file
$input = fopen('#_MyIncludeFile.inc.php','w');
fwrite($input, implode('', $lines));
fwrite($input, $my_input);
fwrite($input, "?>");
fclose($input);


the $_GET variables are submitted from a form:
<form method="GET" action="WriteToFile.php">
<table>
<tr><td>my my number:</td><td><input type="text" name="my_number" /></td></tr>
<tr><td>my my date:</td><td><input type="text" name="my_date" /></td></tr>
<tr><td>my my city:</td><td><input type="text" name="my_city" /></td></tr>
<tr><td>my my links:</td><td><input type="text" name="my_links" /></td></tr>
<tr><td><input type="hidden" name="my_update" value="<?=time();?>" /><br />
<br /><input type="submit" value="send" /></td></tr>
</table>
</form>


the WriteToFile.php file also contains this form for further input.

this script writes CSV data to a PHP file which can then be used elsewhere for further analyzes.

i think that says it all ... or are there any questions?

Good Luck again!

tom.paine


"Class 'XSLTProcessor' not found"

the other day i was trying an example from an xsl tutorial and couldn't get it to work, instead i got the error message "Class 'XSLTProcessor' not found".

here is how i solved it:

first i thought my php versions were not up to date. i'm running php4 and php5 with Apache 1.3 under OS Windows XP. i downloaded the Windows Binaries ZIP folders, unpacked them and placed the contents into my c:\php4 and c:\php5 directories respectively and re-started Apache. trying the example again i still got the same error message.

i thought that maybe something wasn't right in the php.ini files. therefore i made a copy of the php.ini-recommended and renamed it php.ini. i opened the php.ini file with WordPad and searched the document for xsl. the search went to the line that reads ;extension=php_xsl.dll. i uncommented that line by removing the semi-colon. now the line reads: extension=php_xsl.dll. After re-starting the server again i tried the script example once more and still got the same error message: "Class 'XSLTProcessor' not found".

alright ... i was wondering which php.ini file php was looking at and saved the following code snippet in a file named test.php:


<?php
phpinfo();
?>


after i ran this test.php file php displayed the current settings and i found that it was not looking for the php.ini file in the c:\php4 or c:\php5 directories at all, but in the c:\WINDOWS directory.

okay ... i looked for that php.ini file in the c:\WINDOWS directory, opend the file, searched for xsl again and made the change as described above.

after re-starting the server and running the script example again, believe it or not, it still gave me that error message. things like that can become quite frustrating, especially if one tests a script in the localhost environment where it won't and at the webhosting server, where it may work.

i'm not giving up so easily. i knew that the xsl extension under windows is a .dll file that can be found in the ext dirctory. and i also knew that this was a feature only introduced with php5. i checked to see if the php_xsl.dll file existed in the ext folder, which it did; then i went back to the php documentation to see how i would have to set up php under windows and came across this comment: http://de2.php.net/manual/de/book.xsl.php#65277.

alright, so i had to change the path to this ext directory.

i had the php.ini file still open (WordPad) and searched for extension_dir. the line read:

extension_dir = "./"

which i changed to read:

extension_dir = "./ext"

re-starting the Apache Server one more time and running the example script once more at last i saw the expected result on the screen and could go to rest happily after all.

so before someone else has to battle with this issue:
first: check to see if the php_xsl.dll file exists;
second: check to see which php.ini file you gotta change;
third: uncomment ;extension=php_xsl.dll and change the extension directory to extension_dir = "./ext" in the relevant php.ini file.

... and don't forget to re-start the Apache Server!

Good Luck!

tom.paine