A conversational agent capability assessment method, system, and computer program product include obtaining data to create at least one scenario for testing a conversational agent, performing a set of tests using a scenario of the at least one scenario to assess a capability of the conversational agent, and comparing a result of the capability from the set of tests with an expected result of the scenario.
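The three steps described above (creating scenarios from obtained data, running a set of tests, and comparing results with expected results) can be illustrated with a minimal sketch. It is not the claimed implementation: the `Scenario` class, the functions `create_scenarios`, `run_tests`, and `assess`, the toy `echo_agent`, and the exact-match scoring rule are all hypothetical names and assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Scenario:
    """Hypothetical scenario: test utterances paired with expected replies."""
    name: str
    utterances: List[str]
    expected: List[str]


def create_scenarios(data: List[Dict]) -> List[Scenario]:
    # Step 1: build at least one scenario from the obtained data.
    return [Scenario(d["name"], d["utterances"], d["expected"]) for d in data]


def run_tests(agent: Callable[[str], str], scenario: Scenario) -> List[str]:
    # Step 2: perform the set of tests by sending each utterance to the agent.
    return [agent(utterance) for utterance in scenario.utterances]


def assess(agent: Callable[[str], str], scenario: Scenario) -> float:
    # Step 3: compare the results with the scenario's expected results.
    # Assumption: capability is scored as the fraction of exact matches.
    if not scenario.expected:
        return 0.0
    results = run_tests(agent, scenario)
    matches = sum(r == e for r, e in zip(results, scenario.expected))
    return matches / len(scenario.expected)


def echo_agent(utterance: str) -> str:
    # Toy stand-in for a conversational agent (assumption, not a real system).
    replies = {"hello": "Hi there!", "bye": "Goodbye!"}
    return replies.get(utterance.lower(), "Sorry, I did not understand.")
```

As a usage example, a scenario built from `{"name": "greeting", "utterances": ["hello", "bye", "weather?"], "expected": ["Hi there!", "Goodbye!", "It is sunny."]}` and passed to `assess(echo_agent, ...)` scores two matches out of three. A practical system would likely replace exact string matching with a more tolerant comparison.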