Is a
Patent attributes
Patent Applicant
Current Assignee
Patent Jurisdiction
Patent Number
Patent Inventor Names
Kazuma Hashimoto
Richard Socher
Govardana Sachithanandam Ramachandran
Caiming Xiong
Date of Patent
March 5, 2024
Patent Application Number
17105262
Date Filed
November 25, 2020
Patent Citations Received
Patent Primary Examiner
CPC Code
Patent abstract
Embodiments described herein provide safe policy improvement (SPI) in a batch reinforcement learning framework for a task-oriented dialogue. Specifically, a batch reinforcement learning framework for dialogue policy learning is provided, which improves the performance of the dialogue and learns to shape a reward that reasons about the intention behind the human response rather than just imitating the human demonstration.

