I am using Tesseract for English and Tamil. It works perfectly in Swift for English. In the tessdata folder there are a lot of files for English, such as eng.cube.nn, eng.cube.lm, etc., but I could not find files like that for Tamil. All I have is the tam.traineddata file. I downloaded all the files from Google Code, and they are all up to date. There are some applications in the App Store that extract Tamil text from images, and I have no idea how people are doing this.
When I pass an image containing Tamil text to Tesseract, I get errors saying that files such as tam.cube.lm, tam.cube.size, etc. do not exist. I searched a lot on the internet, but I could not find these files for Tamil.
Please help me out here: where can I find these files?
The code is given below:
import UIKit

protocol ValueFromTesseractProtocol
{
    func textRecognizedFromImage(text : String, booleanValue : Bool)
}

class TesseractModel: NSObject
{
    var delegate : ValueFromTesseractProtocol!

    //MARK: - Creating sharedInstance
    class var sharedInstance: TesseractModel {
        struct Static {
            static var sharedInstance: TesseractModel?
            static var token: dispatch_once_t = 0
        }
        dispatch_once(&Static.token) {
            Static.sharedInstance = TesseractModel()
        }
        return Static.sharedInstance!
    }

    //MARK: - imageRecognition
    func imageRecognition(image : UIImage)
    {
        let tesseract = G8Tesseract()
        tesseract.language = "eng+tam"
        tesseract.engineMode = G8OCREngineMode.CubeOnly
        tesseract.maximumRecognitionTime = 60.0
        tesseract.pageSegmentationMode = G8PageSegmentationMode.Auto
        tesseract.image = image.g8_blackAndWhite()
        tesseract.recognize()
        if let recognizedText = tesseract.recognizedText
        {
            // Call delegate - Pass value
            self.delegate.textRecognizedFromImage(recognizedText, booleanValue: true)
        }
        else
        {
            // Call delegate - Nil Value
            self.delegate.textRecognizedFromImage("", booleanValue: false)
        }
    }
}
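The errors about tam.cube.lm and tam.cube.size most likely come from the CubeOnly engine mode used above, since the Cube engine expects `<lang>.cube.*` files for every language in the language string. A minimal sketch of how one could check for those files before picking an engine mode is given here. It is written in current Swift syntax rather than the older syntax of the code above, and `languagesMissingCubeData` and the `tessdataPath` parameter are hypothetical names introduced only for illustration, not part of the Tesseract library.

```swift
import Foundation

/// Hypothetical helper (not part of Tesseract): returns the languages from an
/// "eng+tam"-style string that have no `<lang>.cube.*` file in the tessdata folder.
func languagesMissingCubeData(languages: String, tessdataPath: String) -> [String] {
    // An unreadable or missing folder is treated as containing no files.
    let fileNames = (try? FileManager.default.contentsOfDirectory(atPath: tessdataPath)) ?? []
    return languages
        .split(separator: "+")
        .map(String.init)
        .filter { lang in
            // Keep the language only if no file name starts with "<lang>.cube."
            !fileNames.contains { $0.hasPrefix("\(lang).cube.") }
        }
}
```

If the result is non-empty (for instance, it contains "tam"), one option is to set engineMode to G8OCREngineMode.TesseractOnly instead of CubeOnly, since that mode should only need the .traineddata files.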