Java 7 comes with the method java.nio.file.Files#probeContentType(path) to determine the content type of a file at the given path. It returns a MIME type identifier, or null if the type cannot be determined. How the type is determined is implementation specific: a detector may simply map the filename extension, or it may look at the file content and inspect so-called “magic” byte sequences, which is more reliable than just trusting filename extensions.
However, the default implementation included in Java 7 seems to be platform dependent and not very complete. For example, on my machine it did not even recognize an mp3 file as audio/mpeg. Fortunately, the open source library Apache Tika provides more comprehensive MIME type detection and seems to be platform independent.
As shown below, you can register a simple Tika-based FileTypeDetector implementation with the Java Service Provider Interface (SPI) to transparently enhance the behaviour of java.nio.file.Files#probeContentType(path). As soon as the resulting jar is on your classpath, the SPI mechanism will pick up our implementation class and Files.probeContentType(..) will automatically use it behind the scenes.
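Calling code stays the same whether or not the Tika-based detector is installed. The following is only a minimal sketch of such a caller: the class name ProbeExample, the helper contentTypeOf and the application/octet-stream fallback are assumptions made for illustration. The value printed depends on the platform and on which detectors are on the classpath.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ProbeExample {

    /**
     * Returns the detected content type, or a generic fallback
     * (an assumption of this sketch) when no detector recognizes the file.
     */
    static String contentTypeOf(Path path) throws IOException {
        // Files.probeContentType asks the installed FileTypeDetectors first,
        // then the platform default; it returns null if none can tell.
        String type = Files.probeContentType(path);
        return type != null ? type : "application/octet-stream";
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample file, created only for this demonstration
        Path file = Files.createTempFile("probe-example", ".txt");
        try {
            System.out.println(contentTypeOf(file));
        } finally {
            Files.delete(file);
        }
    }
}
```

Handling the null case explicitly matters most with the default implementation, which may not recognize a file at all.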
Maven dependency
<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-core</artifactId>
    <version>1.4</version>
</dependency>
FileTypeDetector.java
package net.doepner.file;

import java.io.IOException;
import java.nio.file.Path;

import org.apache.tika.Tika;

/**
 * Detects the mime type of files (ideally based on marker in file content)
 */
public class FileTypeDetector extends java.nio.file.spi.FileTypeDetector {

    private final Tika tika = new Tika();

    @Override
    public String probeContentType(Path path) throws IOException {
        return tika.detect(path.toFile());
    }
}
Service Provider registration
To register the implementation with the Java Service Provider Interface (SPI), you need to have a plaintext file /META-INF/services/java.nio.file.spi.FileTypeDetector in the same jar that contains the class net.doepner.file.FileTypeDetector. The text file contains just one line with the fully qualified name of the implementing class:
net.doepner.file.FileTypeDetector
With Maven, you simply create the file src/main/resources/META-INF/services/java.nio.file.spi.FileTypeDetector containing the line shown above.
See the ServiceLoader documentation for details about Java SPI.
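One way to check whether the registration took effect is to ask the ServiceLoader directly which FileTypeDetector providers it can see. This is a sketch rather than part of the original setup: the class name ListDetectors and the helper installedDetectorNames are hypothetical. It loads the providers through the system class loader, which is where the Files documentation says installed detectors are looked up.

```java
import java.nio.file.spi.FileTypeDetector;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ListDetectors {

    /** Collects the class names of all FileTypeDetector providers found via SPI. */
    static List<String> installedDetectorNames() {
        List<String> names = new ArrayList<>();
        ServiceLoader<FileTypeDetector> loader =
                ServiceLoader.load(FileTypeDetector.class, ClassLoader.getSystemClassLoader());
        // Iterating the loader instantiates each provider listed under META-INF/services
        for (FileTypeDetector detector : loader) {
            names.add(detector.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // With the jar described above on the classpath, the list should
        // include net.doepner.file.FileTypeDetector
        System.out.println(installedDetectorNames());
    }
}
```

If the class name is missing from the output, the usual suspects are a misspelled file name under META-INF/services or a resources directory that did not end up in the jar.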
Nice post, was looking for a solution as probeContentType doesn’t appear to be implemented on OS X yet. Any idea how you would register the service provider in Play! Framework which uses sbt, not Maven?
Have you seen the bug report and the comment about ~/.mime-types?
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8008345
To register the implementation with the Java Service Provider Interface (SPI), you need to have a plaintext file /META-INF/services/java.nio.file.spi.FileTypeDetector in the same jar that contains the class net.doepner.file.FileTypeDetector. The text file contains just one line with the fully qualified name of the implementing class.
To use Tika without Maven you just have to make sure that tika-core and all its dependencies are in the classpath.