One embodiment of the invention provides a method for entity extraction, comprising determining a set of part-of-speech (POS) tags based on one or more documents, determining a concept in the one or more documents based on the set of POS tags, and extracting one or more phrases from the one or more documents based on the concept. The method further comprises generating a first set of rules corresponding to the concept based on the one or more phrases, generating a second set of rules specific to a domain based on the first set of rules, and learning, via an adapter grammar, a structure of one or more named entities in the one or more documents based on the second set of rules.
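By way of illustration only, the first five steps of the method described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. The toy lexicon, the choice of adjective and noun tags as the concept, the `MEDICAL` domain label, and all function names are hypothetical and are introduced here for illustration. The final step, learning entity structure via the adapter grammar, is not implemented; the sketch only produces the weighted rules that such a learner might take as input.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for a trained POS tagger (assumption).
TOY_LEXICON = {
    "the": "DT", "patient": "NN", "received": "VBD", "treatment": "NN",
    "for": "IN", "acute": "JJ", "renal": "JJ", "failure": "NN",
    "and": "CC", "chronic": "JJ", "heart": "NN", "disease": "NN",
}

def pos_tag(tokens):
    """Step 1: assign a POS tag to each token (unknown words default to NN)."""
    return [(tok, TOY_LEXICON.get(tok.lower(), "NN")) for tok in tokens]

def extract_phrases(tagged, concept_tags=("JJ", "NN")):
    """Steps 2-3: treat a set of tags as the concept (assumed here to be a
    noun-phrase-like concept) and collect maximal multi-token runs of it."""
    phrases, current = [], []
    for token, tag in tagged:
        if tag in concept_tags:
            current.append((token, tag))
        else:
            if len(current) > 1:
                phrases.append(current)
            current = []
    if len(current) > 1:
        phrases.append(current)
    return phrases

def concept_rules(phrases, concept="NP"):
    """Step 4: first rule set, one production per observed tag sequence,
    counted by how often it occurs."""
    return Counter((concept, tuple(tag for _, tag in p)) for p in phrases)

def domain_rules(rules, domain="MEDICAL"):
    """Step 5: second rule set, derived from the first by specializing the
    left-hand side to a domain-specific nonterminal (illustrative only)."""
    return Counter({(f"{domain}_{lhs}", rhs): n for (lhs, rhs), n in rules.items()})

sentence = ("The patient received treatment for acute renal failure "
            "and chronic heart disease")
phrases = extract_phrases(pos_tag(sentence.split()))
first_rules = concept_rules(phrases)
second_rules = domain_rules(first_rules)

for (lhs, rhs), count in second_rules.items():
    print(f"{lhs} -> {' '.join(rhs)}  (count={count})")
```

In this sketch the rule counts serve only as a stand-in for the statistics that a grammar learner would estimate; the domain-specific rules are where the adapter grammar of the described method would begin.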