3 files changed: +11 -3 lines changed
       - name: Setup environment
         run: sudo apt-get -y install tesseract-ocr

-      - name: Setup environment
+      - name: Setup Java
         uses: actions/setup-java@v3
         with:
           distribution: 'temurin'
@@ -304,6 +304,7 @@ There are a few samples to test against:

 There are some issues found during tests, not related with this library:

+* Apache Tika 1.17 and lower can't extract text from OCR as described in [TIKA-2509](https://issues.apache.org/jira/browse/TIKA-2509)
 * Tesseract slows down document parsing as described in [TIKA-2359](https://issues.apache.org/jira/browse/TIKA-2359)

 ## Integrations
@@ -267,9 +267,16 @@ public function testImageMetadataHeight(string $file): void
      */
     public function testImageOCR(string $file): void
     {
-        $text = self::$client->getText($file);
+        if (version_compare(self::$version, '1.18') >= 0)
+        {
+            $text = self::$client->getText($file);

-        $this->assertMatchesRegularExpression('/voluptate/i', $text);
+            $this->assertMatchesRegularExpression('/voluptate/i', $text);
+        }
+        else
+        {
+            $this->markTestSkipped('Apache Tika 1.17 and lower can\'t find Tesseract binaries');
+        }
     }

     /**
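
Note on the version gate in `testImageOCR`: the check relies on PHP's built-in `version_compare()`, which compares dot-separated parts numerically rather than as plain strings. The sketch below illustrates that behaviour in isolation. The helper name `supportsOcr` and the sample version strings are assumptions made for illustration and are not part of this change.

```php
<?php
// Hypothetical helper, not part of the library or of this diff.
// Mirrors the condition used in testImageOCR: OCR text extraction is
// only expected to work on Apache Tika 1.18 and higher.
function supportsOcr(string $tikaVersion): bool
{
    // version_compare() compares each dot-separated part as a number,
    // so '1.9' sorts below '1.18' (9 < 18).
    return version_compare($tikaVersion, '1.18', '>=');
}

// Sample versions chosen for illustration only.
foreach (['1.9', '1.17', '1.18', '2.0.0'] as $version) {
    printf(
        "%s: %s\n",
        $version,
        supportsOcr($version) ? 'run OCR test' : 'skip OCR test'
    );
}
```

A plain comparison such as `'1.9' >= '1.18'` would treat both operands as the numbers 1.9 and 1.18 and evaluate to true, which is presumably why the test uses `version_compare()` instead.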