Patent attributes
The present invention discloses a visual relationship detection method based on a region-aware learning mechanism, comprising: acquiring a triplet graph structure, combining the features obtained after aggregation with neighboring nodes, using the combined features as nodes of a second graph structure, and connecting those nodes with equiprobable edges to form the second graph structure; combining the node features of the second graph structure with the features of the corresponding entity object nodes in the triplet, using the combined features as a visual attention mechanism to merge the internal region visual features extracted from the two entity objects, and using the merged region visual features as the visual features that the corresponding entity object nodes in the triplet use in the next message propagation; and, after a certain number of message propagations, combining the output triplet node features with the node features of the second graph structure to infer the predicates between object sets.
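The flow described above can be illustrated with a minimal NumPy sketch. It is not the claimed implementation. The abstract does not specify the aggregation, combination, attention, or classification functions, so every choice here is an assumption made for illustration: mean aggregation inside each triplet, a uniform adjacency matrix standing in for the equiprobable edges, summation as the way features are combined, scaled dot-product attention over region features, and a random linear layer as the predicate classifier. All function and variable names (`propagate`, `region_attention`, `infer_predicates`, and so on) are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def region_attention(query, regions):
    """Merge region features (R, D) using a query (D,) as attention.

    Assumption: scaled dot-product attention; the abstract names no formula.
    """
    weights = softmax(regions @ query / np.sqrt(query.shape[0]))
    return weights @ regions


def propagate(subj_regions, obj_regions, pred_feats, steps=2):
    """Run `steps` rounds of message propagation over N triplets.

    subj_regions, obj_regions: (N, R, D) internal region features per entity.
    pred_feats: (N, D) initial predicate node features.
    Returns triplet features (N, 3*D) and second-graph features (N, D).
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    n = subj_regions.shape[0]
    subj = subj_regions.mean(axis=1)  # assumed initial entity features
    obj = obj_regions.mean(axis=1)
    pred = pred_feats.copy()
    # Equiprobable edges: every second-graph node weighs all nodes equally.
    uniform_adj = np.full((n, n), 1.0 / n)
    for _ in range(steps):
        # Aggregate each triplet node with its neighbors (assumed: mean).
        agg_subj = (subj + pred) / 2.0
        agg_pred = (subj + pred + obj) / 3.0
        agg_obj = (obj + pred) / 2.0
        # Combined triplet features become the nodes of the second graph.
        node = (agg_subj + agg_pred + agg_obj) / 3.0
        graph = uniform_adj @ node
        # Second-graph features plus entity features act as attention queries
        # that merge each entity's internal region features.
        subj = np.stack([region_attention(graph[i] + agg_subj[i], subj_regions[i])
                         for i in range(n)])
        obj = np.stack([region_attention(graph[i] + agg_obj[i], obj_regions[i])
                        for i in range(n)])
        pred = agg_pred
    triplet = np.concatenate([subj, pred, obj], axis=1)
    return triplet, graph


def infer_predicates(triplet, graph, weight):
    """Combine both feature sets and pick a predicate class per triplet."""
    logits = np.concatenate([triplet, graph], axis=1) @ weight
    return logits.argmax(axis=1)


# Toy data: 3 triplets, 4 regions per entity, 8-dim features, 5 predicates.
rng = np.random.default_rng(0)
N, R, D, P = 3, 4, 8, 5
subj_regions = rng.normal(size=(N, R, D))
obj_regions = rng.normal(size=(N, R, D))
pred_feats = rng.normal(size=(N, D))
weight = rng.normal(size=(4 * D, P))  # untrained stand-in classifier

triplet, graph = propagate(subj_regions, obj_regions, pred_feats, steps=2)
print(infer_predicates(triplet, graph, weight))
```

Because the edges in this sketch are uniform, every second-graph node receives the same aggregated feature, so the second graph acts as a shared global context for all triplets. A trained system would learn the projection and classifier weights rather than draw them at random.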