The hack basically amounts to forcing the upload module to index a PDF file as it is being uploaded. The text is extracted from the PDF with pdftotext(1) and used for indexing. The change goes into includes/SpecialUpload.php, in the function processUpload(). That function, by the way, seems too long and convoluted for my taste and needs a redesign, but let's not dwell on that for now.
So here is the hack (emphasis on added code):
. . .
if( $this->saveUploadedFile( $this->mUploadSaveName,
                             $this->mUploadTempName,
                             $hasBeenMunged ) ) {
    /**
     * Update the upload log and create the description page
     * if it's a new file.
     */
    $img = Image::newFromName( $this->mUploadSaveName );
    /*
     * Parsing the file if it is a PDF, by MHART
     */
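    if (strtolower($finalExt) == "pdf") {
        // Open an HTML comment so the extracted text is stored with the
        // description (and indexed) without being displayed on the page.
        $NewDesc = $this->mUploadDescription . "\r\n" . "<!-- ";
        // NOTE: the listing is damaged at this point. The pdftotext call and
        // the opening of the loop over its output lines ($DocLine) did not
        // survive; the line below is presumably the tail of the loop body,
        // stripping "-->" so the text cannot close the comment early.
        . . .
            $NewDesc .= str_replace("-->","",$DocLine);
        }
        $NewDesc .= "\r\n" . " -->";
        $this->mUploadDescription = $NewDesc;
    }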
    $success = $img->recordUpload( $this->mUploadOldVersion,
                                   $this->mUploadDescription,
                                   $this->mLicense,
                                   $this->mUploadCopyStatus,
                                   $this->mUploadSource,
                                   $this->mWatchthis );
...
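For readers who want to try the idea outside MediaWiki, here is a minimal standalone sketch of the same technique: run pdftotext on the file, then append its output to the description inside an HTML comment. The helper names extractPdfLines() and appendHiddenText() are hypothetical and not part of MediaWiki or of the hack above. The sketch assumes pdftotext is on the PATH and that the extracted lines are joined with "\r\n", which the listing above does not confirm.

```php
<?php
// Hypothetical helper: run pdftotext(1) on a PDF and return its text as an
// array of lines. Returns an empty array if the command fails.
function extractPdfLines( $pdfPath ) {
    $lines = array();
    $status = 0;
    // Passing "-" as the output file makes pdftotext write to stdout.
    exec( 'pdftotext ' . escapeshellarg( $pdfPath ) . ' -', $lines, $status );
    return $status === 0 ? $lines : array();
}

// Hypothetical helper: append the extracted lines to an upload description,
// wrapped in an HTML comment so they are stored but not rendered.
function appendHiddenText( $description, array $docLines ) {
    $newDesc = $description . "\r\n" . "<!-- ";
    foreach ( $docLines as $docLine ) {
        // Remove any "-->" so the extracted text cannot end the comment early.
        $newDesc .= str_replace( "-->", "", $docLine ) . "\r\n";
    }
    return $newDesc . "\r\n" . " -->";
}
```

Inside processUpload() this would be used roughly as `$this->mUploadDescription = appendHiddenText( $this->mUploadDescription, extractPdfLines( $path ) );`, where `$path` stands for wherever the uploaded file ended up on disk.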