Patent attributes
The disclosed embodiments provide a system that models a semi-structured document for the purpose of acquiring a set of data elements from the semi-structured document. During operation, the system obtains a physics model of the semi-structured document, wherein the physics model includes a set of relationships, represented by physical objects, that describe the relative positions of the data elements in the semi-structured document. Next, the system applies the physics model to a representation of the semi-structured document to automatically extract the data elements from the representation. The system then provides the extracted data elements for use with one or more applications without requiring manual input of the data into the one or more applications.
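The following is a minimal illustrative sketch of how such a physics model might be applied, not the disclosed implementation. It assumes that the representation of the document is a list of text tokens with page coordinates, and that each relationship is modeled as a spring whose rest offset encodes the expected position of a data element relative to an anchor label. The names Token, Spring, and extract, as well as the choice of a spring and all coordinates, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """A piece of text and its position in the document representation (assumed format)."""
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class Spring:
    """Hypothetical physical object: a spring linking an anchor label to a data element.

    The rest offset (dx, dy) is the expected position of the data element
    relative to the anchor label.
    """
    field: str
    anchor_text: str
    dx: float
    dy: float
    stiffness: float = 1.0

    def energy(self, anchor: Token, candidate: Token) -> float:
        # Elastic potential energy: zero when the candidate sits exactly
        # at the expected offset from the anchor, growing with distance.
        ex = (candidate.x - anchor.x) - self.dx
        ey = (candidate.y - anchor.y) - self.dy
        return 0.5 * self.stiffness * (ex * ex + ey * ey)


def extract(tokens: list[Token], springs: list[Spring]) -> dict[str, str]:
    """For each spring, pick the token that minimizes the spring's energy."""
    result: dict[str, str] = {}
    for spring in springs:
        anchors = [t for t in tokens if t.text == spring.anchor_text]
        if not anchors:
            continue  # anchor label not found; leave the field unextracted
        anchor = anchors[0]
        candidates = [t for t in tokens if t is not anchor]
        if not candidates:
            continue
        best = min(candidates, key=lambda t: spring.energy(anchor, t))
        result[spring.field] = best.text
    return result


# Usage with made-up tokens: each value lies 50 units to the right of its label.
tokens = [
    Token("Wages", 10, 100),
    Token("52000.00", 60, 100),
    Token("Employer", 10, 20),
    Token("Acme Corp", 60, 20),
]
springs = [
    Spring(field="wages", anchor_text="Wages", dx=50, dy=0),
    Spring(field="employer", anchor_text="Employer", dx=50, dy=0),
]
extracted = extract(tokens, springs)
print(extracted)
```

In this sketch the extracted dictionary stands in for the data provided to downstream applications. A spring is only one possible physical object; the energy function could be replaced by any other relationship that scores how well a candidate position matches the expected layout.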