Package mimetype
Package mimetype uses magic number signatures to detect the MIME type of a file.
File formats are stored in a hierarchy with application/octet-stream at its root.
For example, the hierarchy for HTML format is application/octet-stream ->
text/plain -> text/html.
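The hierarchy idea can be sketched with a toy tree built only from the standard library. This is an illustration under stated assumptions, not the package's implementation: the node type, the match method, newTree, and the crude isText and isHTML checks are all hypothetical names invented for this sketch.

```go
package main

import (
	"bytes"
	"fmt"
)

// node is a hypothetical stand-in for one entry in the format hierarchy.
type node struct {
	mime     string
	detect   func(raw []byte) bool
	children []*node
}

// match descends the tree and returns the MIME type of the deepest node
// reached by following children whose detectors accept raw.
func (n *node) match(raw []byte) string {
	for _, c := range n.children {
		if c.detect(raw) {
			return c.match(raw)
		}
	}
	return n.mime
}

// isText is a crude placeholder check: input without NUL bytes counts as text.
func isText(raw []byte) bool { return !bytes.Contains(raw, []byte{0}) }

// isHTML is a crude placeholder check for an HTML doctype prefix.
func isHTML(raw []byte) bool {
	return bytes.HasPrefix(bytes.ToLower(raw), []byte("<!doctype html"))
}

// newTree builds application/octet-stream -> text/plain -> text/html.
func newTree() *node {
	html := &node{mime: "text/html", detect: isHTML}
	text := &node{mime: "text/plain", detect: isText, children: []*node{html}}
	return &node{
		mime:     "application/octet-stream",
		detect:   func([]byte) bool { return true },
		children: []*node{text},
	}
}

func main() {
	root := newTree()
	fmt.Println(root.match([]byte("<!DOCTYPE html><html></html>"))) // text/html
	fmt.Println(root.match([]byte("just some words")))              // text/plain
	fmt.Println(root.match([]byte{0x00, 0x01, 0x02}))               // application/octet-stream
}
```

In this sketch a child is only consulted after its parent has matched, which is why text/html is never reported for input that fails the text/plain check.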
Example (Detect)
Code:
testBytes := []byte("This random text has a MIME type of text/plain; charset=utf-8.")
mtype := mimetype.Detect(testBytes)
fmt.Println(mtype.Is("text/plain"), mtype.String(), mtype.Extension())
mtype, err := mimetype.DetectReader(bytes.NewReader(testBytes))
fmt.Println(mtype.Is("text/plain"), mtype.String(), mtype.Extension(), err)
mtype, err = mimetype.DetectFile("a nonexistent file")
fmt.Println(mtype.Is("application/octet-stream"), mtype.String(), os.IsNotExist(err))
Output:
true text/plain; charset=utf-8 .txt
true text/plain; charset=utf-8 .txt <nil>
true application/octet-stream true
Example (DetectReader)
Pure io.Readers (meaning those without a Seek method) cannot be read twice.
This means that once DetectReader has been called on an io.Reader, that reader
is missing the bytes representing the header of the file.
To detect the MIME type and then reuse the input, use a buffer, io.TeeReader,
and io.MultiReader to create a new reader containing the original, unaltered data.
If the input is an io.ReadSeeker instead, call input.Seek(0, io.SeekStart)
before reusing it.
Code:
package mimetype_test

import (
	"bytes"
	"fmt"
	"io"

	"github.com/gabriel-vasile/mimetype"
)

func Example_detectReader() {
	testBytes := []byte("This random text has a MIME type of text/plain; charset=utf-8.")
	input := bytes.NewReader(testBytes)

	mtype, recycledInput, err := recycleReader(input)

	// Verify that recycledInput contains the original input.
	text, _ := io.ReadAll(recycledInput)
	fmt.Println(mtype, bytes.Equal(testBytes, text), err)
}

// recycleReader returns the MIME type of input and a new reader
// containing all the data from input.
func recycleReader(input io.Reader) (mimeType string, recycled io.Reader, err error) {
	// header collects the bytes that DetectReader consumes from input.
	header := bytes.NewBuffer(nil)

	mtype, err := mimetype.DetectReader(io.TeeReader(input, header))
	if err != nil {
		return
	}

	// Join the consumed header with the unread remainder of input.
	recycled = io.MultiReader(header, input)

	return mtype.String(), recycled, err
}
Example (Extend)
Use Extend to add support for a file format which is not detected by mimetype.
https://www.garykessler.net/library/file_sigs.html and
https://github.com/file/file/tree/master/magic/Magdir
have signatures for a multitude of file formats.
Code:
foobarDetector := func(raw []byte, limit uint32) bool {
	return bytes.HasPrefix(raw, []byte("foobar"))
}
mimetype.Lookup("text/plain").Extend(foobarDetector, "text/foobar", ".fb")
mtype := mimetype.Detect([]byte("foobar file content"))
fmt.Println(mtype.String(), mtype.Extension())
Output:
text/foobar .fb
Example (TextVsBinary)
If a binary file is defined as "a computer file that is not a text file",
the two can be differentiated by searching for the text/plain MIME type
in the detected type's MIME hierarchy.
Code:
testBytes := []byte("This random text has a MIME type of text/plain; charset=utf-8.")
detectedMIME := mimetype.Detect(testBytes)
isBinary := true
for mtype := detectedMIME; mtype != nil; mtype = mtype.Parent() {
	if mtype.Is("text/plain") {
		isBinary = false
	}
}
fmt.Println(isBinary, detectedMIME)
Output:
false text/plain; charset=utf-8
Example (Whitelist)
Code:
testBytes := []byte("This random text has a MIME type of text/plain; charset=utf-8.")
allowed := []string{"text/plain", "application/zip", "application/pdf"}
mtype := mimetype.Detect(testBytes)
if mimetype.EqualsAny(mtype.String(), allowed...) {
	fmt.Printf("%s is allowed\n", mtype)
} else {
	fmt.Printf("%s is not allowed\n", mtype)
}
Output:
text/plain; charset=utf-8 is allowed
- func EqualsAny(s string, mimes ...string) bool
- func Extend(detector func(raw []byte, limit uint32) bool, mime, extension string, aliases ...string)
- func SetLimit(limit uint32)
- type MIME
- func Detect(in []byte) *MIME
- func DetectFile(path string) (*MIME, error)
- func DetectReader(r io.Reader) (*MIME, error)
- func Lookup(mime string) *MIME
- func (m *MIME) Extend(detector func(raw []byte, limit uint32) bool, mime, extension string, aliases ...string)
- func (m *MIME) Extension() string
- func (m *MIME) Is(expectedMIME string) bool
- func (m *MIME) Parent() *MIME
- func (m *MIME) String() string
Package files
mime.go
mimetype.go
tree.go
func EqualsAny(s string, mimes ...string) bool
EqualsAny reports whether s MIME type is equal to any MIME type in mimes.
MIME type equality test is done on the "type/subtype" section, ignores
any optional MIME parameters, ignores any leading and trailing whitespace,
and is case insensitive.
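The comparison rules described above can be approximated with the standard library alone. The following is a minimal sketch, not the package's code: baseType and equalsAnySketch are hypothetical names, and the sketch assumes that everything after the first ";" is a parameter.

```go
package main

import (
	"fmt"
	"strings"
)

// baseType keeps only the "type/subtype" part of a MIME string,
// trimmed of surrounding whitespace and lowercased.
func baseType(s string) string {
	base, _, _ := strings.Cut(s, ";")
	return strings.ToLower(strings.TrimSpace(base))
}

// equalsAnySketch reports whether s matches any entry in mimes
// once parameters, whitespace, and letter case are ignored.
func equalsAnySketch(s string, mimes ...string) bool {
	want := baseType(s)
	for _, m := range mimes {
		if baseType(m) == want {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(equalsAnySketch("text/plain; charset=utf-8", "application/zip", " TEXT/Plain ")) // true
	fmt.Println(equalsAnySketch("image/png", "text/plain"))                                      // false
}
```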
func Extend(detector func(raw []byte, limit uint32) bool, mime, extension string, aliases ...string)
Extend adds detection for other file formats.
It is equivalent to calling Extend() on the root MIME type "application/octet-stream".
func SetLimit(limit uint32)
SetLimit sets the maximum number of bytes read from input when detecting the MIME type.
Increasing the limit provides better detection for file formats which store
their magic numbers towards the end of the file: docx, pptx, xlsx, etc.
A limit of 0 means the whole input file will be used.
MIME struct holds information about a file format: the string representation
of the MIME type, the extension and the parent file format.
type MIME struct {
}
func Detect(in []byte) *MIME
Detect returns the MIME type found from the provided byte slice.
The result is always a valid MIME type, with application/octet-stream
returned when identification failed.
func DetectFile(path string) (*MIME, error)
DetectFile returns the MIME type of the provided file.
The result is always a valid MIME type, with application/octet-stream
returned when identification failed with or without an error.
Any error returned is related to opening or reading the input file.
func DetectReader(r io.Reader) (*MIME, error)
DetectReader returns the MIME type of the provided reader.
The result is always a valid MIME type, with application/octet-stream
returned when identification failed with or without an error.
Any error returned is related to reading from the input reader.
DetectReader assumes the reader offset is at the start. If the input is an
io.ReadSeeker you previously read from, it should be rewound before detection:
reader.Seek(0, io.SeekStart)
func Lookup(mime string) *MIME
Lookup finds a MIME object by its string representation.
The representation can be the main MIME type or any of its aliases.
func (m *MIME) Extend(detector func(raw []byte, limit uint32) bool, mime, extension string, aliases ...string)
Extend adds detection for a sub-format. The detector is a function
returning true when the raw input file satisfies a signature.
The sub-format will be detected if all the detectors in the parent chain return true.
The extension should include the leading dot, as in ".html".
func (m *MIME) Extension() string
Extension returns the file extension associated with the MIME type.
It includes the leading dot, as in ".html". When the file format does not
have an extension, the empty string is returned.
func (m *MIME) Is(expectedMIME string) bool
Is checks whether this MIME type, or any of its aliases, is equal to the
expected MIME type. MIME type equality test is done on the "type/subtype"
section, ignores any optional MIME parameters, ignores any leading and
trailing whitespace, and is case insensitive.
func (m *MIME) Parent() *MIME
Parent returns the parent MIME type from the hierarchy.
Each MIME type has a non-nil parent, except for the root MIME type.
For example, the application/json and text/html MIME types have text/plain as
their parent because they are text files that happen to contain JSON or HTML.
Another example is the ZIP format, which is used as container
for Microsoft Office files, EPUB files, JAR files, and others.
func (m *MIME) String() string
String returns the string representation of the MIME type, e.g., "application/zip".