US Patent 11301440 Fuzzy search using field-level deletion neighborhoods

The disclosure provides dataset search and/or deduplication techniques that improve the speed and efficiency of record search and/or deduplication over traditional methods. Certain implementations apply field-level deletion neighborhood processing to ordered field permutations of dataset records encoded with hash values. A method includes determining a field-level deletion neighborhood for two or more field combinations of a record by determining field hash values, creating field permutations, determining a combined record hash value for each permutation, and associating each record hash value with a unique entity identifier. The method includes searching other entity representation records for matching combined record hash values, and assigning one or more of a unique entity identifier and a duplicate entity identifier to the other entity representation records having the matching combined record hash values. Certain implementations can include removing, from the database, at least one of the other entity representation records having a duplicate record identifier.
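The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything in it is an assumption made for illustration: the function names (`field_hash`, `deletion_neighborhood`, `deduplicate`), the use of truncated SHA-256 as the hash, the `*` placeholder for a deleted field, and the reading of "field permutations" as order-preserving field combinations with up to `max_deletions` fields removed.

```python
import hashlib
from itertools import combinations

def field_hash(value: str) -> str:
    # Assumption: normalize, then hash each field value independently.
    return hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()[:16]

def deletion_neighborhood(record: dict, fields: tuple, max_deletions: int = 1) -> set:
    """Return combined record hashes for every variant of the record
    in which up to max_deletions fields are replaced by a placeholder."""
    hashes = [field_hash(record[f]) for f in fields]
    keys = set()
    for k in range(max_deletions + 1):
        for dropped in combinations(range(len(fields)), k):
            # Keep field order; mark deleted positions with "*".
            parts = ["*" if i in dropped else h for i, h in enumerate(hashes)]
            combined = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
            keys.add(combined)
    return keys

def deduplicate(records: dict, fields: tuple, max_deletions: int = 1) -> dict:
    """Map each record id to (entity_id, is_duplicate)."""
    index = {}        # combined record hash -> entity id
    assignments = {}
    for rec_id, rec in records.items():
        keys = sorted(deletion_neighborhood(rec, fields, max_deletions))
        # A shared combined hash means the records agree on all
        # non-deleted fields, so they are treated as the same entity.
        match = next((index[k] for k in keys if k in index), None)
        entity_id = match if match is not None else rec_id
        assignments[rec_id] = (entity_id, match is not None)
        for k in keys:
            index.setdefault(k, entity_id)
    return assignments

records = {
    "r1": {"first": "John", "last": "Smith", "dob": "1980-01-01"},
    "r2": {"first": "Jon",  "last": "Smith", "dob": "1980-01-01"},
    "r3": {"first": "Jane", "last": "Doe",   "dob": "1975-05-05"},
}
result = deduplicate(records, ("first", "last", "dob"))
```

In this sketch, `r2` differs from `r1` only in the first name, so the two records share the combined hash in which the first-name position is deleted, and `r2` is flagged as a duplicate of `r1`. Records flagged this way could then be removed from the dataset, as the abstract describes. Matching happens through exact hash lookups in a dictionary rather than pairwise fuzzy comparison of records.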

