PDFFileObjectType (PDF File Object Schema)

The PDFFileObjectType type is intended to characterize the structural makeup of PDF files.


Field Name Type Description
@object_reference optional QName

The object_reference field specifies a unique ID reference to an Object defined elsewhere. This construct allows for the re-use of the defined Properties of one Object within another, without the need to embed the full Object in the location from which it is being referenced. Thus, this ID reference is intended to resolve to the Properties of the Object that it points to.

Custom_Properties 0..1 CustomPropertiesType

The Custom_Properties construct is optional and enables the specification of a set of custom Object Properties that may not be defined in existing Properties schemas.

@is_packed optional boolean

The is_packed field is used to indicate whether the file is packed or not.

@is_masqueraded optional boolean

The is_masqueraded field specifies whether the file is masqueraded as another type of file; e.g., a PDF file that has had its extension changed to TXT to masquerade itself as a text file.

File_Name 0..1 StringObjectPropertyType

The File_Name field specifies the base name of the file (including an extension, if present).

File_Path 0..1 FilePathType

The File_Path field specifies the relative or fully-qualified path to the file, not including the path to the device where the file system containing the file resides. Whether the path is relative or fully-qualified can be specified via the 'fully_qualified' attribute of this field. The File_Path field may include the name of the file; if so, it must not conflict with the File_Name field. If not, the File_Path field should contain the path of the directory containing the file, and should end with a terminating path separator ("\" or "/").

Device_Path 0..1 StringObjectPropertyType

The Device_Path field specifies the path to the physical device where the file system containing the file resides.

Full_Path 0..1 StringObjectPropertyType

The Full_Path field specifies the complete path to the file, including the device path. It should contain the contents that would otherwise be in the Device_Path and File_Path fields, and can be used when the producer is unable or does not wish to separate the Device_Path and File_Path fields. If the Full_Path field is specified along with the File_Path and/or Device_Path fields, it must not conflict with either. The Full_Path field may include the name of the file; if so, it must not conflict with the File_Name field. If not, the Full_Path field should contain the path of the directory containing the file, and should end with a terminating path separator ("\" or "/").

File_Extension 0..1 StringObjectPropertyType

The File_Extension field specifies the extension of the name of the file. The File_Extension field must not conflict with the ending of the File_Name field. The File_Extension field should not begin with a "." character, but may contain a "." character in the case of a compound file extension, such as "tar.gz".
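The consistency rules among File_Name, File_Path, and File_Extension can be illustrated with a small checker. This is only a sketch: the function name `check_file_fields` and its plain-string arguments are hypothetical and not part of the schema, and the comparison is assumed to be case-sensitive, which the schema does not state.

```python
def check_file_fields(file_name=None, file_path=None, file_extension=None):
    """Return a list of problems; an empty list means the fields agree.

    Hypothetical helper, not part of the schema. Comparisons are
    case-sensitive, which is an assumption.
    """
    problems = []

    if file_extension is not None:
        # The extension should be given without a leading dot.
        if file_extension.startswith("."):
            problems.append("File_Extension should not begin with '.'")
        # The extension must match the ending of File_Name, if both are given.
        bare = file_extension.lstrip(".")
        if file_name is not None and not file_name.endswith("." + bare):
            problems.append("File_Extension conflicts with the ending of File_Name")

    if file_path is not None and file_name is not None:
        # A path ending in a separator names only the containing directory.
        if not file_path.endswith(("/", "\\")):
            # Otherwise the last component is taken to be the file name.
            last = file_path.replace("\\", "/").rsplit("/", 1)[-1]
            if last != file_name:
                problems.append("File_Path conflicts with File_Name")

    return problems


print(check_file_fields("report.pdf", "C:\\docs\\report.pdf", "pdf"))
print(check_file_fields("a.tar.gz", "/tmp/", "tar.gz"))
print(check_file_fields("report.pdf", None, ".txt"))
```

The same last-component check could be applied to Full_Path, since it carries the same file-name rule.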

Size_In_Bytes 0..1 UnsignedLongObjectPropertyType

The Size_In_Bytes field specifies the size of the file, in bytes.

Magic_Number 0..1 HexBinaryObjectPropertyType

The Magic_Number field specifies the particular magic number (typically a hexadecimal constant used to identify a file format) corresponding to the file, if applicable.
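For a PDF file, the leading bytes are the ASCII characters "%PDF-". The sketch below shows one way a producer might derive a hex value for this field. The helper names and the choice of a four-byte magic number are assumptions, not something the schema prescribes.

```python
PDF_MAGIC = b"%PDF-"  # ASCII bytes at the start of a PDF header


def magic_number_hex(data: bytes, length: int = 4) -> str:
    """Return the first `length` bytes as an uppercase hex string.

    The default of four bytes is an assumption made for illustration.
    """
    return data[:length].hex().upper()


def looks_like_pdf(data: bytes) -> bool:
    """Report whether the data begins with the PDF header marker."""
    return data.startswith(PDF_MAGIC)


sample = b"%PDF-1.4\n"
print(magic_number_hex(sample))   # 25504446
print(looks_like_pdf(sample))     # True
```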

File_Format 0..1 StringObjectPropertyType

The File_Format field specifies the particular file format of the file, most typically specified by a tool such as the UNIX file command.

Hashes 0..1 HashListType

The Hashes field specifies any hashes of the file.
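As an illustration, the digests that might populate this field can be computed with Python's standard `hashlib` module. The function name `file_hashes` and the default set of algorithms are assumptions; the schema does not mandate any particular hash type.

```python
import hashlib


def file_hashes(data: bytes, algorithms=("md5", "sha1", "sha256")) -> dict:
    """Map each algorithm name to the hex digest of the file contents.

    The default algorithm list is illustrative only.
    """
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


for name, digest in file_hashes(b"%PDF-1.4\n").items():
    print(name, digest)
```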

Digital_Signatures 0..1 DigitalSignaturesType

The Digital_Signatures field is optional and captures one or more digital signatures for the file.

Modified_Time 0..1 DateTimeObjectPropertyType

The Modified_Time field specifies the date/time the file was last modified.

Accessed_Time 0..1 DateTimeObjectPropertyType

The Accessed_Time field specifies the date/time the file was last accessed.

Created_Time 0..1 DateTimeObjectPropertyType

The Created_Time field specifies the date/time the file was created.

File_Attributes_List 0..1 FileAttributeType

The File_Attributes_List field specifies the particular special attributes set for the file. Since this is a platform-specific Object property, it is defined here as an abstract type and then implemented in any platform-specific derived file objects.

Permissions 0..1 FilePermissionsType

The Permissions field specifies the particular permissions that the file may have. Since this is a platform-specific Object property, it is defined here as an abstract type and then implemented in any platform-specific derived file objects.

User_Owner 0..1 StringObjectPropertyType

The User_Owner field specifies the name of the user that owns the file.

Packer_List 0..1 PackerListType

The Packer_List field specifies any packers that the file may be packed with. The term 'packer' here refers to packers, as well as things like archivers and installers.

Peak_Entropy 0..1 DoubleObjectPropertyType

The Peak_Entropy field specifies the calculated peak entropy of the file.
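The schema does not define how peak entropy is calculated. One plausible reading, used here purely as an assumption, is the maximum Shannon entropy (in bits per byte) over fixed-size windows of the file. The function names and the 256-byte window size below are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def peak_entropy(data: bytes, window: int = 256) -> float:
    """Maximum entropy over consecutive, non-overlapping windows.

    Assumption: "peak" means the highest per-window value; the window
    size is arbitrary.
    """
    if not data:
        return 0.0
    return max(
        shannon_entropy(data[i:i + window])
        for i in range(0, len(data), window)
    )


sample = b"\x00" * 256 + bytes(range(256))
print(peak_entropy(sample))
```

Under this reading, a file with one high-entropy region (such as an embedded compressed stream) scores high even when most of its content is low-entropy text.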

Sym_Links 0..1 SymLinksListType

The Sym_Links field specifies any symbolic links that may exist for the file.

Byte_Runs 0..1 ByteRunsType

The Byte_Runs field contains a list of byte runs from the raw file or its storage medium.

Extracted_Features 0..1 ExtractedFeaturesType

The Extracted_Features field contains a description of features extracted from the file.

Encryption_Algorithm 0..1 CipherType

The Encryption_Algorithm field specifies the algorithm used to encrypt the file.

Decryption_Key 0..1 StringObjectPropertyType

The Decryption_Key field specifies the key used to decrypt the file.

Compression_Method 0..1 StringObjectPropertyType

The Compression_Method field specifies the method used to compress the file.

Compression_Version 0..1 StringObjectPropertyType

The Compression_Version field specifies the version of the compression method used to compress the file.

Compression_Comment 0..1 StringObjectPropertyType

The Compression_Comment field specifies the comment string associated with the compressed file.

Metadata 0..1 PDFFileMetadataType

The Metadata field captures metadata associated with the PDF file.

Version 0..1 DoubleObjectPropertyType

The Version field specifies the decimal version number portion of the string from the PDF Header that specifies the version of the PDF specification to which the PDF file conforms, e.g. '1.4'.
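A minimal sketch of extracting that version number from the PDF header follows. It assumes the header sits at the very start of the data in the form "%PDF-M.m"; the function name `pdf_header_version` is hypothetical.

```python
import re

# Matches a header such as b"%PDF-1.4" at the start of the data.
_HEADER_RE = re.compile(rb"%PDF-(\d+\.\d+)")


def pdf_header_version(data: bytes):
    """Return the version string from the PDF header, or None if absent."""
    match = _HEADER_RE.match(data)
    if match is None:
        return None
    return match.group(1).decode("ascii")


version = pdf_header_version(b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n")
print(version)          # 1.4
print(float(version))   # 1.4, as a value for a double-typed field
```

The string is converted with `float` here only because the field is typed as a double.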

Indirect_Objects 0..1 PDFIndirectObjectListType

The Indirect_Objects field captures the indirect objects included in the PDF file, representing the contents of a document.

Cross_Reference_Tables 0..1 PDFXRefTableListType

The Cross_Reference_Tables field captures the cross-reference tables included in the PDF file, used for facilitating random access of indirect PDF objects.

Trailers 0..1 PDFTrailerListType

The Trailers field captures the trailers included in the PDF file, used for capturing offsets to the cross-reference table and important objects.
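To illustrate the offsets a trailer carries: a PDF file conventionally ends with the keyword `startxref`, the byte offset of the last cross-reference section, and an `%%EOF` marker. The sketch below locates that offset with a regular expression. It is a simplified illustration rather than a full PDF parser, and the function name is hypothetical.

```python
import re

# "startxref", then a decimal byte offset, then the end-of-file marker.
_STARTXREF_RE = re.compile(rb"startxref\s+(\d+)\s+%%EOF")


def last_startxref_offset(data: bytes):
    """Return the offset from the final startxref entry, or None.

    A file saved with incremental updates can contain several such
    entries; the last one is the most recent.
    """
    matches = _STARTXREF_RE.findall(data)
    if not matches:
        return None
    return int(matches[-1])


# Tiny hand-built sample: the 9-byte header puts "xref" at offset 9.
sample = (
    b"%PDF-1.4\n"
    b"xref\n0 1\n0000000000 65535 f \n"
    b"trailer\n<< /Size 1 >>\n"
    b"startxref\n9\n%%EOF\n"
)
offset = last_startxref_offset(sample)
print(offset)                     # 9
print(sample[offset:offset + 4])  # b'xref'
```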