Patent attributes
Techniques for synthetic data generation in computer-based reasoning systems are discussed. The techniques include receiving a request to generate synthetic training data based on a set of training data cases, and determining one or more focal training data cases. For each undetermined feature (either every feature, or only those features that are not subject to conditions), a value is determined based on the focal cases. In some embodiments, the validity of the generated value may be checked against feature information. In some embodiments, the generated synthetic data may be checked against all or a portion of the training data to ensure that it is not overly similar to that data.
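
The flow described above can be illustrated with a minimal Python sketch. It is not the claimed method: every name in it (`distance`, `pick_focal_cases`, `generate_case`, `bounds`, `min_distance`) is hypothetical, and several choices are assumptions made only for illustration. Focal cases are taken to be the `k` nearest training cases under Euclidean distance, a value is drawn uniformly between the smallest and largest focal values, "feature information" is reduced to numeric bounds, and "overly similar" means closer than `min_distance` to some training case.

```python
import math
import random


def distance(a, b, keys):
    """Euclidean distance between two cases over the given feature keys."""
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in keys))


def pick_focal_cases(training, features, conditions, k, rng):
    """Assumed rule: the k training cases nearest to the conditions,
    or nearest to a randomly chosen training case when there are none."""
    if conditions:
        ref, keys = conditions, [f for f in features if f in conditions]
    else:
        ref, keys = rng.choice(training), features
    return sorted(training, key=lambda c: distance(c, ref, keys))[:k]


def generate_case(training, features, bounds, conditions=None, k=3,
                  min_distance=0.01, max_attempts=100, rng=None):
    """Return one synthetic case, or None if no attempt passes the checks."""
    rng = rng or random.Random()
    conditions = dict(conditions or {})
    for _ in range(max_attempts):
        focal = pick_focal_cases(training, features, conditions, k, rng)
        case = dict(conditions)  # conditioned features are fixed by the request
        valid = True
        for f in features:
            if f in case:
                continue  # only undetermined features get generated values
            values = [c[f] for c in focal]
            case[f] = rng.uniform(min(values), max(values))
            lo, hi = bounds[f]
            if not lo <= case[f] <= hi:  # validity check (assumed: bounds)
                valid = False
                break
        if not valid:
            continue
        # Similarity check: reject a case that nearly copies a training case.
        if all(distance(case, t, features) >= min_distance for t in training):
            return case
    return None


training = [
    {"x": 1.0, "y": 2.0}, {"x": 1.5, "y": 2.5}, {"x": 2.0, "y": 3.5},
    {"x": 8.0, "y": 9.0}, {"x": 9.0, "y": 7.0},
]
bounds = {"x": (0.0, 10.0), "y": (0.0, 10.0)}
synthetic = generate_case(training, ["x", "y"], bounds,
                          conditions={"x": 1.2}, rng=random.Random(0))
print(synthetic)
```

In this usage, `x` is conditioned and therefore copied from the request, while `y` is the only undetermined feature and is drawn from the range spanned by the three focal cases. Returning `None` after `max_attempts` is one possible way to handle a request that cannot be satisfied; the abstract does not specify that behavior.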