LDIF Parsing

LDAP Directory Interchange Format (LDIF) files are the standard medium for describing directory data in a flat-file format. The most common uses of this format are information transfer and archival. However, the standard also defines a way to describe modifications to stored data in a flat-file format. LDIFs of this latter type are typically referred to as changetype or modify LDIFs.

The org.springframework.ldap.ldif package provides the classes needed to parse LDIF files and deserialize them into tangible objects. The LdifParser is the main class of the org.springframework.ldap.ldif package and is capable of parsing files that comply with RFC 2849. This class reads lines from a resource and assembles them into an LdapAttributes object.

The LdifParser currently ignores changetype LDIF entries, as their usefulness in the context of an application has yet to be determined.

Object Representation

Two classes in the org.springframework.ldap.core package provide the means to represent an LDIF in code:

  • LdapAttribute: Extends javax.naming.directory.BasicAttribute, adding support for LDIF options as defined in RFC 2849.

  • LdapAttributes: Extends javax.naming.directory.BasicAttributes, adding specialized support for DNs.

LdapAttribute objects represent options as a Set<String>. The DN support added to the LdapAttributes object employs the javax.naming.ldap.LdapName class.
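
To make the representation more concrete, the following sketch uses only the JDK classes that the two types above extend or employ (BasicAttribute, BasicAttributes, and LdapName). The EntrySketch class, its method names, and the way it splits options from an attribute descriptor are assumptions made for illustration and are not part of Spring LDAP.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import javax.naming.InvalidNameException;
import javax.naming.directory.Attribute;
import javax.naming.directory.BasicAttribute;
import javax.naming.directory.BasicAttributes;
import javax.naming.ldap.LdapName;

// Hypothetical illustration only; not a Spring LDAP class.
class EntrySketch {

    // Parses a DN string with the JDK's LdapName, which checks its syntax.
    static LdapName parseDn(String dn) {
        try {
            return new LdapName(dn);
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("Not a valid DN: " + dn, e);
        }
    }

    // Splits a descriptor such as "userCertificate;binary" into its options,
    // collected as a Set<String> (everything after the first ';').
    static Set<String> optionsOf(String descriptor) {
        Set<String> options = new LinkedHashSet<>();
        String[] parts = descriptor.split(";");
        for (int i = 1; i < parts.length; i++) {
            options.add(parts[i]);
        }
        return options;
    }

    // Builds an attribute set with a multi-valued and a single-valued attribute.
    static BasicAttributes sampleAttributes() {
        BasicAttributes attrs = new BasicAttributes(true); // true = ignore case of attribute IDs
        Attribute objectClass = new BasicAttribute("objectClass");
        objectClass.add("top");
        objectClass.add("person");
        attrs.put(objectClass);
        attrs.put("cn", "Jane Doe");
        return attrs;
    }
}
```

Calling EntrySketch.parseDn("cn=Jane Doe,ou=people,dc=example,dc=com") yields an LdapName with four components, and EntrySketch.optionsOf("userCertificate;binary") yields a set containing only "binary".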

The Parser

The Parser interface provides the foundation for operation and employs three supporting policy definitions:

  • SeparatorPolicy: Establishes the mechanism by which lines are assembled into attributes.

  • AttributeValidationPolicy: Ensures that attributes are correctly structured prior to parsing.

  • Specification: Provides a mechanism by which object structure can be validated after assembly.

The default implementations of these interfaces are as follows:

  • org.springframework.ldap.ldif.parser.LdifParser

  • org.springframework.ldap.ldif.support.SeparatorPolicy

  • org.springframework.ldap.ldif.support.DefaultAttributeValidationPolicy

  • org.springframework.ldap.schema.DefaultSchemaSpecification

Together, these four classes parse a resource line by line and translate the data into LdapAttributes objects.

The SeparatorPolicy determines how individual lines read from the source file should be interpreted, as the LDIF specification lets attributes span multiple lines. The default policy assesses each line in the context of the order in which the lines were read to determine the nature of the line under consideration. Control attributes and changetype records are ignored.
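
The multi-line rule comes from RFC 2849: a line that begins with a single space continues the previous line, and a line that begins with '#' is a comment. The sketch below shows that unfolding step in isolation. UnfoldSketch is a hypothetical class written for this illustration; it is not the Spring LDAP separator policy, and it simply drops blank lines rather than treating them as record separators.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Spring LDAP separator policy.
class UnfoldSketch {

    // Joins RFC 2849 continuation lines (those starting with a space) onto
    // the preceding line and drops comments, including folded comments.
    static List<String> unfold(List<String> rawLines) {
        List<String> logical = new ArrayList<>();
        StringBuilder current = null;
        boolean inComment = false;
        for (String line : rawLines) {
            if (line.startsWith(" ")) {
                // Continuation: append everything after the leading space.
                if (!inComment && current != null) {
                    current.append(line.substring(1));
                }
                continue;
            }
            // A line that does not start with a space ends the previous logical line.
            if (current != null) {
                logical.add(current.toString());
                current = null;
            }
            inComment = line.startsWith("#");
            if (!inComment && !line.isEmpty()) {
                current = new StringBuilder(line);
            }
        }
        if (current != null) {
            logical.add(current.toString());
        }
        return logical;
    }
}
```

Given the raw lines "dn: cn=Jane Doe,ou=peo" and " ple,dc=example,dc=com", the method returns the single logical line "dn: cn=Jane Doe,ou=people,dc=example,dc=com".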

The DefaultAttributeValidationPolicy uses regular expressions to ensure that each attribute conforms to a valid attribute format (according to RFC 2849) once parsed. If an attribute fails validation, an InvalidAttributeFormatException is logged, and the record is skipped (the parser returns null).
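
As an illustration of regex-based validation, the sketch below checks a single unfolded line against a simplified approximation of the RFC 2849 attrval-spec grammar: an attribute type, optional ;options, and then a plain value (":"), a base64 value ("::"), or a URL reference (":<"). The AttributeLineSketch class and its pattern are assumptions made for this example; they are not the expressions that DefaultAttributeValidationPolicy uses.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not the patterns used by Spring LDAP.
class AttributeLineSketch {

    // Simplified approximation of the RFC 2849 attrval-spec grammar.
    private static final Pattern ATTRIBUTE_LINE = Pattern.compile(
        "(?:[A-Za-z][A-Za-z0-9-]*|\\d+(?:\\.\\d+)*)"  // attribute type: name or numeric OID
        + "(?:;[A-Za-z0-9-]+)*"                       // zero or more ;options
        + ":(?:"
        + " *(?:[\\x01-\\x7F&&[^ :<\\r\\n]][\\x01-\\x7F&&[^\\r\\n]]*)?" // plain value, may be empty
        + "|: *[A-Za-z0-9+/]+={0,2}"                  // "::" introduces a base64 value
        + "|< *\\S+"                                  // ":<" introduces a URL reference
        + ")");

    // Returns true when the whole line matches the simplified grammar.
    static boolean isValid(String line) {
        return line != null && ATTRIBUTE_LINE.matcher(line).matches();
    }
}
```

With this pattern, "cn: Jane Doe", "description;lang-en: Hello", and "cn:: SmFuZSBEb2U=" are accepted, while "cn Jane Doe" (no colon) is rejected.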

Schema Validation

A mechanism for validating parsed objects against a schema is available through the Specification interface in the org.springframework.ldap.schema package. The DefaultSchemaSpecification does not do any validation and is available for instances where records are known to be valid and need not be checked. This option avoids the performance penalty that validation imposes. The BasicSchemaSpecification applies basic checks, such as ensuring that a DN and object class declarations have been provided. Currently, validation against an actual schema requires you to implement the Specification interface yourself.
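
The kind of basic check described above (a DN is present and at least one object class is declared) can be sketched with JDK types alone. BasicCheckSketch and hasDnAndObjectClass are hypothetical names chosen for this example; the sketch does not implement the Specification interface and is not the code of BasicSchemaSpecification.

```java
import javax.naming.InvalidNameException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.ldap.LdapName;

// Hypothetical illustration only; not a Spring LDAP class.
class BasicCheckSketch {

    // Returns true when the record has a syntactically valid, non-empty DN
    // and at least one objectClass value.
    static boolean hasDnAndObjectClass(String dn, Attributes attrs) {
        if (dn == null || dn.trim().isEmpty() || attrs == null) {
            return false;
        }
        try {
            new LdapName(dn); // throws if the DN is not syntactically valid
        } catch (InvalidNameException e) {
            return false;
        }
        Attribute objectClass = attrs.get("objectClass");
        return objectClass != null && objectClass.size() > 0;
    }
}
```

A check against a real schema would go further, for example by verifying that every attribute is permitted by the declared object classes.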

Spring Batch Integration

While any application that needs to parse LDIF files can use the LdifParser, Spring also provides a batch processing framework, Spring Batch, with many file-processing utilities for parsing delimited files such as CSV. The org.springframework.ldap.ldif.batch package provides the classes needed to use the LdifParser as a valid configuration option in the Spring Batch framework. There are five classes in this package. Together, they support three basic use cases:

  • Reading LDIF records from a file and returning an LdapAttributes object.

  • Reading LDIF records from a file and mapping records to Java objects (POJOs).

  • Writing LDIF records to a file.

The first use case is accomplished with LdifReader. This class extends Spring Batch’s AbstractItemCountingItemStreamItemReader and implements its ResourceAwareItemReaderItemStream interface. It fits naturally into the framework, and you can use it to read LdapAttributes objects from a file.

You can use MappingLdifReader to map LDIF objects directly to any POJO. This class requires you to provide an implementation of the RecordMapper interface. Your implementation supplies the logic for mapping each record to a POJO.
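
The mapping logic itself is usually small. The following sketch shows the general idea with JDK types only: it reads the first value of selected attributes and copies them into a POJO. PersonMappingSketch, Person, and mapRecord are hypothetical names for this example; the sketch does not implement the RecordMapper interface, whose exact signature should be taken from the Spring LDAP API documentation.

```java
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;

// Hypothetical illustration only; not a RecordMapper implementation.
class PersonMappingSketch {

    // Simple immutable POJO that receives the mapped values.
    static final class Person {
        final String dn;
        final String commonName;
        final String surname;

        Person(String dn, String commonName, String surname) {
            this.dn = dn;
            this.commonName = commonName;
            this.surname = surname;
        }
    }

    // Copies the DN plus the first "cn" and "sn" values into a Person.
    static Person mapRecord(String dn, Attributes attrs) {
        return new Person(dn, firstValue(attrs, "cn"), firstValue(attrs, "sn"));
    }

    // Returns the first value of the attribute as a String, or null if absent.
    private static String firstValue(Attributes attrs, String id) {
        Attribute attr = attrs.get(id);
        if (attr == null || attr.size() == 0) {
            return null;
        }
        try {
            Object value = attr.get();
            return value == null ? null : value.toString();
        } catch (NamingException e) {
            throw new IllegalStateException("Could not read attribute " + id, e);
        }
    }
}
```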

You can implement RecordCallbackHandler and provide the implementation to either reader. You can use this handler to operate on skipped records. See the Spring Batch API documentation for more information.

The last member of this package, the LdifAggregator, can be used to write LDIF records to a file. This class invokes the toString() method of the LdapAttributes object.
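
To show what turning a record back into LDIF text involves, the sketch below serializes a DN and a JDK Attributes object into "id: value" lines, switching to the "::" base64 form when a value is not a plain safe string. LdifWriteSketch and toLdif are hypothetical names for this example. The sketch is not the output format of LdapAttributes.toString(), it does not fold long lines, and it does not special-case values that end with a space.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;

// Hypothetical illustration only; not the Spring LDAP LdifAggregator.
class LdifWriteSketch {

    // Writes the DN first, then one line per attribute value.
    static String toLdif(String dn, Attributes attrs) {
        StringBuilder out = new StringBuilder();
        appendLine(out, "dn", dn);
        try {
            for (Attribute attr : Collections.list(attrs.getAll())) {
                for (Object value : Collections.list(attr.getAll())) {
                    appendLine(out, attr.getID(), String.valueOf(value));
                }
            }
        } catch (NamingException e) {
            throw new IllegalStateException("Could not read attribute values", e);
        }
        return out.toString();
    }

    // Uses "id: value" for safe strings and "id:: base64" for everything else.
    private static void appendLine(StringBuilder out, String id, String value) {
        if (isSafe(value)) {
            out.append(id).append(": ").append(value);
        } else {
            String encoded = Base64.getEncoder()
                    .encodeToString(value.getBytes(StandardCharsets.UTF_8));
            out.append(id).append(":: ").append(encoded);
        }
        out.append('\n');
    }

    // ASCII only, no line breaks, and no leading space, colon, or '<'.
    private static boolean isSafe(String value) {
        return value.matches("(?:[\\x01-\\x7F&&[^ :<\\r\\n]][\\x01-\\x7F&&[^\\r\\n]]*)?");
    }
}
```

The order in which attributes appear depends on the Attributes implementation, so callers should not rely on it.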