[nycphp-talk] Search for NYPHP List Archives available (alpha ver)

Chris Bielanski Cbielanski at inta.org
Thu Apr 22 11:28:23 EDT 2004


Hey, cool. I've already found some Apache testing errata that I was
dreading having to google for.

So far, so good. But if this is an RFI for features, I strongly suggest
these three:
a date-sort option (maybe also a date constraint?), highlighting of matched
keywords (a la Google), and possibly a scoring (match-strength/confidence)
function.
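
To make the highlighting and scoring suggestions concrete, here is a minimal
sketch in Python (not the PHP the search page actually uses). The function
names build_pattern, highlight and score are hypothetical, and the score is
just a naive count of keyword hits, not MySQL's own fulltext relevance value.

```python
import html
import re


def build_pattern(keywords):
    """Compile one case-insensitive regex matching any keyword (hypothetical helper)."""
    # Longest keywords first so "xml-rpc" wins over "xml" when both are given.
    words = sorted({k for k in keywords if k}, key=lambda k: (-len(k), k))
    if not words:
        return None
    return re.compile("(" + "|".join(re.escape(w) for w in words) + ")", re.IGNORECASE)


def highlight(text, keywords):
    """Return HTML-escaped text with every keyword match wrapped in <b> tags."""
    pattern = build_pattern(keywords)
    if pattern is None:
        return html.escape(text)
    # With a capturing group, split() alternates non-match, match, non-match, ...
    pieces = pattern.split(text)
    out = []
    for i, piece in enumerate(pieces):
        escaped = html.escape(piece)
        out.append("<b>" + escaped + "</b>" if i % 2 else escaped)
    return "".join(out)


def score(text, keywords):
    """Naive match strength: the number of keyword occurrences in text."""
    pattern = build_pattern(keywords)
    return 0 if pattern is None else len(pattern.findall(text))
```

Escaping each piece separately keeps message text such as angle brackets from
being interpreted as markup, while the <b> tags themselves stay intact.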

But definitely a good start!

PS: I realized that the date-constraint suggestion is part of the
boolean-search logic you haven't implemented. If you go through with it,
don't forget date range constraints :)
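
As a rough illustration of the date sort and date range constraint, here is a
small Python sketch. It assumes each search result is a dict with a "date" key
holding a datetime.date; that shape, and the name filter_and_sort, are
assumptions made for the example and not how the actual search stores results.

```python
from datetime import date


def filter_and_sort(messages, start=None, end=None, newest_first=True):
    """Keep messages dated within [start, end] (either bound optional), sorted by date."""
    kept = [
        m for m in messages
        if (start is None or m["date"] >= start)
        and (end is None or m["date"] <= end)
    ]
    return sorted(kept, key=lambda m: m["date"], reverse=newest_first)


# Example data (made up for illustration).
results = [
    {"subject": "a", "date": date(2003, 5, 1)},
    {"subject": "b", "date": date(2004, 4, 22)},
    {"subject": "c", "date": date(2002, 11, 3)},
]
since_2003 = filter_and_sort(results, start=date(2003, 1, 1))
```

In a database-backed search the same constraint would more likely be pushed
into the query itself, so only matching rows are fetched.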

PPS: Notetab is *great*. Any word on whether Fookes is going to add
context-highlighting to it?

Thanks,
Chris Bielanski
Web Programmer, 
International Trademark Association,
1133 Avenue of the Americas, 33rd Floor
New York, NY 10036
+1 (212) 642-1745, f: +1 (212) 768-7796
mailto:cbielanski at inta.org, www.inta.org  
INTA -- 125 Years of Excellence





> -----Original Message-----
> From: Jayesh Sheth [mailto:jayeshsh at ceruleansky.com]
> Sent: Thursday, April 22, 2004 1:17 AM
> To: talk at lists.nyphp.org
> Subject: [nycphp-talk] Search for NYPHP List Archives available (alpha ver)
> 
> 
> Hello all,
> 
> I have been wanting to search through the treasure trove of information
> contained in the NYPHP archives from 2002 to date. Thankfully, the guys
> at nyphp.org made the whole archive available as a (Unix mail) mbox file.
> 
> I used a utility from Fookes Software (fookes.com) called Mailbag to
> extract the messages into one big CSV file. (Originally I thought I would
> use Mailbag's built-in search feature, but that proved too cumbersome to
> use. Plus, I wanted to customize the output of the search.)
> 
> I then imported that file into a MySQL database table, and enabled
> MySQL fulltext search for that table. (I had to do a manual search and
> replace with another Fookes program, Notetab, to get certain characters
> escaped, e.g. quotes and dollar signs.)
> 
> I then added on a PHP search interface to it.
> 
> With a lot of luck I seem to have got a basic version working. Having 
> this large set of information publicly searchable will no doubt be 
> useful to others.
> 
> For example:
> http://www.ceruleansky.com/nyphp_mail/index.php?q=xml-rpc
> 
> I am not sure if it works perfectly yet (since it is just a day or two's
> work). (I built on some functions which I had written before, so it went
> relatively fast ...) It currently returns a set of up to 25 matches. There
> is no built-in support for displaying messages by thread, or for
> fine-grained searching (by field or with boolean expressions).
> 
> Please try out this alpha version and let me know if it works and if it
> is of use.
> 
> Best Regards,
> - Jay Sheth
> 
> _______________________________________________
> talk mailing list
> talk at lists.nyphp.org
> http://lists.nyphp.org/mailman/listinfo/talk
> 
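
For anyone wanting to repeat the mbox-to-CSV step from the quoted message
without Mailbag, here is a rough sketch using Python's standard mailbox and
csv modules. It is a stand-in technique, not the process Jay used: the column
choice (FIELDS) and the function names are assumptions, and only the first
text/plain part of each message is kept.

```python
import csv
import mailbox

# Assumed column layout for the CSV; not taken from the original tool.
FIELDS = ["date", "from", "subject", "body"]


def body_text(msg):
    """Return the first text/plain part of a message as a string, or ''."""
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset label in the message; fall back to UTF-8.
            return payload.decode("utf-8", errors="replace")
    return ""


def mbox_to_csv(mbox_path, csv_path):
    """Write one CSV row per message in the mbox file; return the row count."""
    box = mailbox.mbox(mbox_path, create=False)
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(FIELDS)
        for msg in box:
            writer.writerow([
                str(msg.get("Date", "")),
                str(msg.get("From", "")),
                str(msg.get("Subject", "")),
                body_text(msg),
            ])
            count += 1
    box.close()
    return count
```

The csv writer quotes fields that contain commas, quotes or newlines on its
own, so the CSV itself needs no manual search-and-replace; whether the import
into the database needs further escaping depends on how the rows are loaded.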


