Description
The unit test for org.apache.tika.detect.FileCommandDetector fails on some platforms because the operating system's "file --mime" command does not report the same MIME type everywhere. For example, on Fedora 32 and Mint 19, the detector returns "text/xml". On Oracle Enterprise Linux 6, however, it returns "application/xml", which is equally valid but causes the unit test to fail. When the unit test fails, the Tika build fails with it.
The unit test should be fixed to accept either answer so that it passes with every variant of the file command. A proposed patch is included below:
diff --git a/tika-1.25/tika-core/src/test/java/org/apache/tika/detect/FileCommandDetectorTest.java b/tika-1.25/tika-core/src/test/java/org/apache/tika/detect/FileCommandDetectorTest.java
index 21a24ab..1911e05 100644
--- a/tika-1.25/tika-core/src/test/java/org/apache/tika/detect/FileCommandDetectorTest.java
+++ b/tika-1.25/tika-core/src/test/java/org/apache/tika/detect/FileCommandDetectorTest.java
@@ -44,9 +44,11 @@ public class FileCommandDetectorTest {
         assumeTrue(FileCommandDetector.checkHasFile());
 
         try (InputStream is = getClass().getResourceAsStream("/test-documents/basic_embedded.xml")) {
-            assertEquals(MediaType.text("xml"), DETECTOR.detect(is, new Metadata()));
+            MediaType answer = DETECTOR.detect(is, new Metadata());
+            assert(MediaType.text("xml").equals(answer) || MediaType.application("xml").equals(answer));
             //make sure that the detector is resetting the stream
-            assertEquals(MediaType.text("xml"), DETECTOR.detect(is, new Metadata()));
+            answer = DETECTOR.detect(is, new Metadata());
+            assert(MediaType.text("xml").equals(answer) || MediaType.application("xml").equals(answer));
         }
 
         //now try with TikaInputStream
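As an alternative way to express the same check, the "accept either answer" rule could be pulled into a small helper so both assertions share one definition of what counts as an acceptable XML type. The following is only a minimal sketch: XmlMimeCheck and isAcceptedXmlType are hypothetical names that do not exist in Tika, and the stripping of a trailing "; charset=..." parameter assumes the raw output format of "file --mime" rather than anything the detector is known to pass through.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Tika.
final class XmlMimeCheck {

    // Both types are treated as valid answers for an XML document.
    private static final Set<String> ACCEPTED_XML_TYPES =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("text/xml", "application/xml")));

    private XmlMimeCheck() {
    }

    // Returns true when the detected type is one of the accepted XML types.
    // Any parameters after ';' (assumed form: "text/xml; charset=us-ascii") are ignored.
    static boolean isAcceptedXmlType(String detected) {
        if (detected == null) {
            return false;
        }
        int semicolon = detected.indexOf(';');
        String base = semicolon >= 0 ? detected.substring(0, semicolon) : detected;
        return ACCEPTED_XML_TYPES.contains(base.trim().toLowerCase(Locale.ROOT));
    }
}
```

In the test, each assertion could then read something like assertTrue(XmlMimeCheck.isAcceptedXmlType(answer.toString())), assuming assertTrue is statically imported from JUnit. Unlike the bare Java assert keyword, this does not depend on the JVM running with assertions enabled.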