[nycphp-talk] Plagiarism Checker in PHP

Justin Dearing zippy1981 at gmail.com
Sun May 2 16:43:19 EDT 2010


On Sun, May 2, 2010 at 1:58 PM, Hans Zaunere <lists at zaunere.com> wrote:

> > Hi Friends,
>
> Hi,
>
> > Could you please tell / suggest me how to develop Plagiarism Checker
> > feature or send some useful articles / free APIs and so on.
>
>
> Seriously though, unless I'm missing something, I can't see how this would
> be possible.  I suppose you could use techniques such as comparing the
> number of similar words between articles, but that's not really exact, and
> likely to have incorrect results.  Plus, you're looking to do this
> plagiarism check across the whole Internet?
>

My understanding is that many CS professors do this for programming homework.
They are looking for exact matches. Apparently that catches a lot of people.

I think chopping up an article into an array of sentences and throwing a
few at Google would be a good approach. Submit 25% of the sentences to
Google as exact-phrase searches. For each one, put the first 10 result URLs
into an array. Then sort the URLs and see how many show up more than once.
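
A rough sketch of that idea, written in Python for brevity. Everything here is an assumption for illustration: the function names are made up, the sentence splitter is deliberately naive, and `search` is a hypothetical callable that you would have to supply yourself (it takes a quoted query string and returns a list of result URLs). No real search API is used or implied.

```python
import math
import random
import re
from collections import Counter


def split_sentences(text):
    """Naively split text into sentences on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sample_sentences(sentences, fraction=0.25, seed=None):
    """Pick roughly `fraction` of the sentences (at least one) at random."""
    if not sentences:
        return []
    count = max(1, math.ceil(len(sentences) * fraction))
    return random.Random(seed).sample(sentences, count)


def rank_urls(text, search, fraction=0.25, top_n=10, seed=None):
    """Return (url, hits) pairs, most frequently matched URL first.

    `search` is a hypothetical callable supplied by the caller: it takes a
    quoted exact-phrase query and returns a list of result URLs.
    """
    counts = Counter()
    for sentence in sample_sentences(split_sentences(text), fraction, seed):
        urls = search('"%s"' % sentence)[:top_n]
        counts.update(set(urls))  # count each URL once per sentence
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

A URL that comes back for several of the sampled sentences is a likely source worth checking by hand. A single hit proves little, since short or common sentences will match unrelated pages.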

Justin